HENRY E. KYBURG
ACTS AND CONDITIONAL PROBABILITIES
1
The budget of problems and considerations raised by Gibbard and Harper [1]
is a rich and entertaining one. Since I believe neither in the usefulness of
counterfactuals, nor in conditionalization, but rather in the deep importance
of a distinction that Gibbard and Harper flatly reject - that between
epistemic and stochastic independence - it is both interesting and curious that
my framework leads to many of the same conclusions they are led to. But in
addition, it seems to me, I can throw light on matters that remain shadowed
(or, what amounts to the same thing, a matter of 'intuition') for them. And I
do this in a classical, simple, naive, simple-minded framework [2].
I construe a rational corpus K to consist of a set of statements in a language
L. Since rational corpora are just sets of statements, we may consider arbitrary
ones without indulging in counterfactuality; if K is Smith's corpus at a certain
time, we may consider the corpus consisting of the deductive consequences of
$K \cup \{S\}$; this set of statements 'exists' just as much as K exists. For the sake
of simplicity, I shall suppose here that rational corpora are deductively closed,
though I would not always make this assumption. Thus I shall represent the
corpus consisting of the deductive closure of the set of statements consisting
of K and S by 'K and S': $K \text{ and } S = \{x : x \in \mathrm{Cn}(K \cup \{S\})\}$, provided S is
consistent with K.
What Harper and Gibbard represent as $A \mathrel{\Box\!\!\rightarrow} B$, I represent as
'$B \in K \text{ and } A$'. Their axioms, of course, do not hold for my notion, though there is a
trivial analogue of Axiom 1 (If $A \in K$, and $B \in K \text{ and } A$, then $B \in K$) and a
partial analogue of Axiom 2 (If $\neg S \in K \text{ and } A$, then it is not the case that
$S \in K \text{ and } A$). The converse of this analogue does not hold - nor is it very plausible as a principle governing counterfactuals.
The crucial distinction for me is that between stochastic and epistemic
conditional probability. In order to explain this distinction, I must say
Theory and Decision 12 (1980) 149-171. 0040-5833/80/0122-0149$02.30. Copyright © 1980 by D. Reidel Publishing Co., Dordrecht, Holland, and Boston, U.S.A.
something about probability in general. We say that the probability that the next toss of this coin will land heads is a half; for me, such probabilities are
based on statistical knowledge - in this case, knowledge that half the tosses
of coins yield heads. It is not based on any knowledge I have about the fre-
quency with which this coin lands heads; and it need not be based on fre-
quencies concerning coin tosses at all. For example, I might assign the prob-
ability 0.9 to heads on the next toss of this coin on the grounds that Swami
X predicts heads, and he is right 90% of the time. The probability is based on
a frequency, but it is not the frequency in any class mentioned in the prob-
ability assertion. The probability that I will go to the movies next Saturday
may be a half, not because I go to the movies half the time on Saturdays, but
because I have decided to toss a coin Saturday night, and to go to the movies
if and only if it lands heads.
Now consider the probability that a toss of a coin will land heads. I regard
such probabilities as equally epistemic; talk about their being unknown, or
changing, or being different, is to be construed as talk about various rational
corpora subject to various conditions. But they are different from the pre-
vious statements in a significant way. The use of the indefinite article directs
our attention to a specific reference class: tosses of coins, in the example.
Furthermore, if we suppose that the indefinite article in 'a toss' has the sense
of 'a described toss which is random relative to K', we can show that the
probability of 'an A is a B' is the interval (p, q) relative to K if and only if K
contains a statement to the effect that the measure of B's among A's lies
between p and q, and contains no stronger statement. So we might be tempted
to call these statements 'stochastic' as opposed to 'properly epistemic'.
2. AN ALTERNATIVE STRUCTURE
Let K be a set of statements in a language L. We construe K as a rational corpus and assume that it is consistent and deductively closed:
A-1 $\mathrm{Cn}(K) \subseteq K$;
A-2 $\ulcorner 0 = 1 \urcorner \notin K$.
We define the expansion of K by the statement S, K and S, to be the deductive
closure of $K \cup \{S\}$, if S is consistent with K, and the empty set otherwise:
D-1 $K \text{ and } S = \mathrm{Cn}(K \cup \{S\})$ if $\ulcorner 0 = 1 \urcorner \notin \mathrm{Cn}(K \cup \{S\})$;
$K \text{ and } S = \emptyset$ otherwise.
L does not contain a counterfactual connective. It may contain intensional
expressions in the form of statistical laws, but it need not. It does contain
frequency statements, of course, but most of these can be construed as
perfectly extensional. Intensional or not, statistical statements will have the form
$\ulcorner S(A, B, p, q) \urcorner$, and be interpreted as 'the frequency with which (propensity
with which) A's are B's lies between p and q'. We want to focus on the
strongest such statements in K: we write:
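D-2 $\ulcorner S(A, B, p, q) \urcorner^{*} \in K$ if and only if $\ulcorner S(A, B, p, q) \urcorner \in K$ and, if $\ulcorner S(A, B, r, s) \urcorner \in K$, then $\ulcorner (p, q) \subseteq (r, s) \urcorner$ is a theorem.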
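As a rough illustration of D-2, the following sketch treats the statistical statements about one fixed pair of terms as a finite list of intervals and picks out the one contained in all the others. The function name `strongest_interval` and the list representation are assumptions made for the example; a deductively closed corpus is of course not a finite list.

```python
from typing import Iterable, Optional, Tuple

Interval = Tuple[float, float]

def strongest_interval(known: Iterable[Interval]) -> Optional[Interval]:
    """Return the interval (p, q) that lies inside every known interval, if any."""
    intervals = list(known)
    for p, q in intervals:
        # (p, q) is strongest when it is contained in every (r, s) in the list.
        if all(r <= p and q <= s for r, s in intervals):
            return (p, q)
    # No single listed interval is contained in all the others.
    return None

# Three hypothetical statements about the frequency of B's among A's.
example = [(0.2, 0.8), (0.4, 0.6), (0.0, 1.0)]
print(strongest_interval(example))
```

In this toy setting two overlapping intervals, neither inside the other, yield no strongest statement; in a deductively closed corpus their intersection would itself be a consequence.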
For reasons that by now should be well understood, the terms that may
appear in the place of A and B in $\ulcorner S(A, B, p, q) \urcorner$ are limited to a certain recursively defined class: we don't want to bother our heads about the
frequency of tails in the union of the set of tosses that yield tails with the set of tosses with prime ordinal numbers.
Following the example of Harper and Gibbard, I shall simply appeal to the
reader's 'ordinary understanding' of the relation: a is a random member of B
with respect to C relative to the corpus K, written RAN(a, B, C, K). As a
guide to that understanding, note that the relation obtains just when B is an
appropriate reference class for assessing the probability of '$a \in C$' given the
background knowledge K.
Probability is defined for equivalence classes of statements:
D-3 Prob(K, S) = (p, q) if and only if there exist terms a, B and C of
L such that:
(1) $\ulcorner a \in C \leftrightarrow S \urcorner \in K$;
(2) $\ulcorner S(B, C, p, q) \urcorner^{*} \in K$;
(3) RAN(a, B, C, K).
We introduce into L a special operator $\alpha$. Combined with terms appropriate to the position of A in $\ulcorner S(A, B, p, q) \urcorner$ - i.e., terms that may plausibly be
taken to denote reference sets - it forms terms that are taken to be random
members of those sets, with respect to membership in any set denoted by a
term appropriate to the place of B in $\ulcorner S(A, B, p, q) \urcorner$. More informally, it corresponds to one use of the indefinite article in English - that employed in the sentence 'the probability that a coin toss will land heads is a half', as
well as in 'a tiger is a carnivore', or at least one interpretation of the latter
sentence. There are two axioms characterizing the operator $\alpha$:
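A-3 If B and C are terms appropriate to $\ulcorner S(B, C, p, q) \urcorner$, then
$\mathrm{RAN}(\alpha B, B, C, K)$.
A-4 If B, $\ulcorner A \cap B \urcorner$, and C are appropriate terms, and $\ulcorner \alpha A \in B \urcorner \in K$, then $\mathrm{RAN}(\alpha \ulcorner A \cap B \urcorner, \ulcorner A \cap B \urcorner, C, K)$.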
Conditional probability is defined in the obvious way: the probability of S
given T relative to K is just the probability of S relative to K and T:
D-4 Prob(K, T,S) = Prob(K and T,S).
The distinction between stochastic and epistemic probabilities relative to
K is now a simple syntactic one: if S has the form $\ulcorner \alpha A \in B \urcorner$ then Prob(K, S) is stochastic; otherwise it is epistemic. (Note that both are relativized to a
body of knowledge K.) If S has the form $\ulcorner \alpha A \in B \urcorner$ and T has the form
$\ulcorner \alpha A \in C \urcorner$, then Prob(K, S, T) is a stochastic conditional probability; if the
operator $\alpha$ does not occur in either S or T it is epistemic. If the operator $\alpha$
occurs, but not in the way first described, we shall suppose that Prob(K, S, T) is neither epistemic nor stochastic; this case will not arise in what follows.
Two simple theorems, whose proof would depend on a more detailed
characterization of randomness, will help to clarify matters:
T-1 $\mathrm{Prob}(K, \ulcorner \alpha A \in B \urcorner) = (p, q)$ if and only if $\ulcorner S(A, B, p, q) \urcorner^{*} \in K$.
T-2 $\mathrm{Prob}(K, \ulcorner \alpha A \in B \urcorner, \ulcorner \alpha A \in C \urcorner) = (p, q)$ if and only if $\ulcorner S(A \cap B, C, p, q) \urcorner^{*} \in K$.
This theorem follows from T-1 with the help of D-4 and A-4.
It will be argued that in deliberation stochastic conditional probability
plays the role that conditional probability plays for Harper and Gibbard, and that epistemic probability plays the role of 'the probability of the conditional'.
But there are differences as well.
We say that S is independent of T, given K, if the probability of S is unaffected by the addition of T to K:
D-5 S Ind(K, T) if and only if Prob(K, S) = Prob(K, T, S).
It should be observed that independence is not generally symmetrical: 'this
coin is biassed 0.6 for heads' is independent of 'the tenth toss of this coin
yielded heads', since a single toss does not provide enough evidence to alter
the probability of the statement concerning bias. But equally clearly, to add 'this coin is biassed 0.6 in favor of heads' to your body of knowledge will alter the probability of 'the tenth toss of this coin yielded heads' from 0.5 to 0.6.
But we may distinguish between proper epistemic independence and stochastic independence. If S and T have the forms $\ulcorner \alpha A \in B \urcorner$ and $\ulcorner \alpha A \in C \urcorner$, respectively, then we may interpret S Ind(K, T) as asserting stochastic independence. On the basis of T-2 we may prove that stochastic independence is symmetrical:
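T-3 $\ulcorner \alpha A \in B \urcorner \ \mathrm{Ind}(K, \ulcorner \alpha A \in C \urcorner)$ if and only if $\ulcorner \alpha A \in C \urcorner \ \mathrm{Ind}(K, \ulcorner \alpha A \in B \urcorner)$.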
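The symmetry asserted in T-3 can be checked on a small finite population, under the simplifying assumption that the corpus knows every relevant frequency exactly, so that each stochastic probability is just a relative frequency in the reference class. The sets and the helper names `freq` and `independent` below are illustrative assumptions, not part of the formal system.

```python
from fractions import Fraction

def freq(reference: set, target: set) -> Fraction:
    """Exact relative frequency of target among reference (reference must be non-empty)."""
    return Fraction(len(reference & target), len(reference))

def independent(a: set, b: set, c: set) -> bool:
    """True when restricting a to c leaves the frequency of b unchanged."""
    return freq(a & c, b) == freq(a, b)

# A hypothetical reference class A with two subsets B and C.
A = set(range(12))
B = {0, 1, 2, 3, 4, 5}
C = {0, 1, 2, 6, 7, 8}        # half of A, and half of A & B
C_DEP = {0, 1, 2, 3, 6, 7}    # half of A, but two thirds of A & B

print(independent(A, B, C), independent(A, C, B))
print(independent(A, B, C_DEP), independent(A, C_DEP, B))
```

Swapping the roles of B and C never changes the verdict in this setting, which is the finite-frequency counterpart of T-3; nothing comparable is claimed for properly epistemic independence.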
Neither sort of independence has anything to do with 'causal' independence; but it will turn out that we have no need to introduce the notion of causality, except as it appears in the guise of statistical knowledge in K.
3. UTILITY
Since probabilities are interval valued, so will expected utilities be interval
valued. This fact will play a role in a general decision theory, but is of relatively minor importance here. For notational convenience, we suppose Prob is vector valued and Des (desirability) is scalar valued. Utility will thus
be vector valued. Suppose that $A = \{A_j\}$ represents the alternative types of actions and $O = \{O_i\}$ the alternative types of outcomes.
Relative to A and O, we may define expected utility as follows:
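D-6 $U_{A,O}(\ulcorner d \in A_j \urcorner, K) = \sum_i \mathrm{Des}(\ulcorner d \in O_i \cap A_j \urcorner)\, \mathrm{Prob}(K, \ulcorner d \in A_j \urcorner, \ulcorner d \in O_i \urcorner)$.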
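A minimal sketch of D-6 under the notational convention just stated: each conditional probability is a pair (p, q), each desirability a scalar, and the utility is the componentwise sum. The outcome labels and numbers are hypothetical, and `expected_utility` is a name introduced only for this example.

```python
from typing import Dict, Tuple

Interval = Tuple[float, float]

def expected_utility(des: Dict[str, float], prob: Dict[str, Interval]) -> Interval:
    """Componentwise sum of Des(outcome) * Prob(outcome) over the outcome types."""
    # With negative desirabilities the first component need not be the smaller one;
    # this follows the vector-valued convention rather than computing tight bounds.
    first = sum(des[o] * prob[o][0] for o in prob)
    second = sum(des[o] * prob[o][1] for o in prob)
    return (first, second)

# Hypothetical desirabilities and interval-valued conditional probabilities.
des = {"revolt": 0.0, "no_revolt": 10.0}
prob = {"revolt": (0.25, 0.5), "no_revolt": (0.5, 0.75)}
print(expected_utility(des, prob))
```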
Again we may distinguish between proper epistemic and stochastic utility.
$U_{A,O}(\ulcorner d \in A_j \urcorner, K)$ is stochastic if d has the form $\ulcorner \alpha P \urcorner$; it is properly
epistemic otherwise.
The basic decision rule is to act in such a way as to maximize expected
utility. Clearly what is intended here is the proper epistemic utility of the
agent. Since utility is interval-valued, this rule will not solve all problems -
but the ways of choosing actions when this rule does not provide guidance
are not our concern here.
The sure-thing principle will be a consequence of the basic decision rule
under certain circumstances which we will characterize in due course.
4. DAVID AND BATHSHEBA
Harper and Gibbard do not seem to distinguish between specific actions and
types or classes of actions, or between specific outcomes and types or classes
of outcomes. I shall suppose that the actions open to the agent in a situation
d can be represented as making true statements of the form $\ulcorner d \in A_j \urcorner$, where
$A_j$ represents a type or class of action. Similarly, the entities on which utility
functions will be defined are statements of the form $\ulcorner d \in O_j \urcorner$, where $O_j$
represents a type of outcome. We may also consider instances of types of situations, represented by
terms of the form $\ulcorner \alpha S \urcorner$.
Let d be David's particular situation, P the set of similar political situations,
$A = \{B, \bar{B}\}$ the set of actions open to David (sending or not sending for the
woman at the well), and $O = \{R, \bar{R}\}$ the set of outcomes to be contemplated
(revolution or not). We suppress A and O in the notation, since they do not
play a role in the analysis of this case. $K_D$ we suppose to be David's corpus
of knowledge. Relative to David's corpus, anyone, including David, can
compute the stochastic utilities of the two courses of action:
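$U(\ulcorner \alpha P \in B \urcorner, K_D) = \mathrm{Des}(\ulcorner \alpha P \in B \cap R \urcorner)\, \mathrm{Prob}(K_D, \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in R \urcorner) + \mathrm{Des}(\ulcorner \alpha P \in B \cap \bar{R} \urcorner)\, \mathrm{Prob}(K_D, \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in \bar{R} \urcorner)$.
$U(\ulcorner \alpha P \in \bar{B} \urcorner, K_D) = \mathrm{Des}(\ulcorner \alpha P \in \bar{B} \cap R \urcorner)\, \mathrm{Prob}(K_D, \ulcorner \alpha P \in \bar{B} \urcorner, \ulcorner \alpha P \in R \urcorner) + \mathrm{Des}(\ulcorner \alpha P \in \bar{B} \cap \bar{R} \urcorner)\, \mathrm{Prob}(K_D, \ulcorner \alpha P \in \bar{B} \urcorner, \ulcorner \alpha P \in \bar{R} \urcorner)$.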
Since $\alpha$ is an operator that ensures randomness, the probabilities are just
the conditional measures taken to be known in KD. A political situation in which the woman is sent for is more desirable than one in which she is not,
given the assignment of values made by Harper and Gibbard, just in case the
frequency with which revolution then ensues, plus the frequency with which
revolution fails to ensue following abstention, is less than 10/9. These are
straightforward conditional measures, assumed to be known by David.
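The 10/9 threshold can be reproduced with a short computation. The desirabilities used below (1 and 10 for sending with and without revolution, 0 and 9 for abstaining with and without revolution) are hypothetical values chosen only because they yield exactly that threshold; the text does not list Harper and Gibbard's own figures, and the function names are likewise assumptions.

```python
from fractions import Fraction

# Hypothetical desirabilities, chosen only because they reproduce the 10/9 threshold.
DES = {("B", "R"): 1, ("B", "notR"): 10, ("notB", "R"): 0, ("notB", "notR"): 9}

def stochastic_utilities(p_rev_send, p_rev_abstain):
    """Return (U(send), U(abstain)) from the two conditional frequencies of revolt."""
    u_send = DES[("B", "R")] * p_rev_send + DES[("B", "notR")] * (1 - p_rev_send)
    u_abstain = (DES[("notB", "R")] * p_rev_abstain
                 + DES[("notB", "notR")] * (1 - p_rev_abstain))
    return u_send, u_abstain

def sending_is_better(p_rev_send, p_rev_abstain):
    """True when the stochastic utility of sending exceeds that of abstaining."""
    u_send, u_abstain = stochastic_utilities(p_rev_send, p_rev_abstain)
    return u_send > u_abstain

# Revolt equally frequent either way: the sum of the two frequencies is 1 < 10/9.
print(sending_is_better(Fraction(1, 2), Fraction(1, 2)))
```

With these values the difference between the two utilities is 1 - 9(p - q), where p and q are the frequencies of revolt after sending and after abstaining, so sending is better exactly when p + (1 - q) < 10/9.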
What David wants to know, however, is not the utility of an act of sending for the woman at the well, but the utility of his act of sending for the woman at the well. He wants to know $U(\ulcorner d \in B \urcorner, K_D)$ and $U(\ulcorner d \in \bar{B} \urcorner, K_D)$. These
will be equal to $U(\ulcorner \alpha P \in B \urcorner, K_D)$ and $U(\ulcorner \alpha P \in \bar{B} \urcorner, K_D)$ respectively, just in case two conditions are met:
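(i) $\mathrm{RAN}(d, \ulcorner P \cap B \urcorner, R, K_D \text{ and } \ulcorner d \in B \urcorner)$, from which it follows that $\mathrm{RAN}(d, \ulcorner P \cap B \urcorner, \bar{R}, K_D \text{ and } \ulcorner d \in B \urcorner)$.
(ii) $\mathrm{RAN}(d, \ulcorner P \cap \bar{B} \urcorner, R, K_D \text{ and } \ulcorner d \in \bar{B} \urcorner)$, from which it follows that $\mathrm{RAN}(d, \ulcorner P \cap \bar{B} \urcorner, \bar{R}, K_D \text{ and } \ulcorner d \in \bar{B} \urcorner)$.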
(In general (i) neither entails nor is entailed by (ii); usually both will be true
or both false, however.)
Are these two conditions met? In the story as told by Harper and Gibbard
it seems that they are. This can ordinarily be ensured by choosing P in such a
way that they are met. Stochastic and proper epistemic utility can thus be made to coincide.
5. SOLOMON
Let s be Solomon's particular situation, P, B and R as before. C is the subset
of P in which the leader is charismatic, $\bar{C}$ its complement in P. $K_S$ is
Solomon's rational corpus, except that it also includes the knowledge that
revolts depend largely on the lack of charisma of the leader, and not on the
performance of unjust acts, such as B, as well as the knowledge that charismatic kings tend to act justly and uncharismatic kings tend to act unjustly.
(Note parenthetically that this latter information does not entail, as Gibbard and Harper seem to suppose, that justice is evidence of charisma.
Suppose that 20% of kings are charismatic and just; 10% charismatic and unjust; 40% uncharismatic and unjust; and 30% uncharismatic and just. Then
charismatic kings tend to act justly (0.2/(0.2 + 0.1) > 0.1/(0.2 + 0.1)) and uncharismatic kings tend to act unjustly (0.4/(0.3 + 0.4) > 0.3/(0.3 + 0.4)), but justice is not evidence of charisma, but of the lack of it (0.2/(0.2 + 0.3) < 0.3/(0.2 + 0.3))! This same blunder, as committed by another author, was pointed out in Levi [3].)
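The three comparisons in the parenthetical example can be recomputed directly from the four stated percentages. The table layout and the helper name `conditional` are assumptions made for this sketch.

```python
from fractions import Fraction

# The four percentages given in the example, as exact fractions.
TABLE = {
    ("charismatic", "just"): Fraction(20, 100),
    ("charismatic", "unjust"): Fraction(10, 100),
    ("uncharismatic", "unjust"): Fraction(40, 100),
    ("uncharismatic", "just"): Fraction(30, 100),
}

def conditional(target: str, given: str) -> Fraction:
    """Frequency of the target trait among kings that have the given trait."""
    # `label in cell` is tuple membership, so "charismatic" never matches "uncharismatic".
    denominator = sum(f for cell, f in TABLE.items() if given in cell)
    numerator = sum(f for cell, f in TABLE.items() if given in cell and target in cell)
    return numerator / denominator

print(conditional("just", "charismatic"), conditional("unjust", "charismatic"))
print(conditional("unjust", "uncharismatic"), conditional("just", "uncharismatic"))
print(conditional("charismatic", "just"), conditional("uncharismatic", "just"))
```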
Let us add the constraint clearly intended, that the frequency of charisma among Bathsheba-abstaining kings is greater than that among non-abstaining kings, and that the frequency of non-charismaticity among Bathsheba-prone kings is greater than that among abstaining kings.
Furthermore, the relation between justness and sending for Bathsheba is not clear. One unjust act does not unjustice make. Even if unjustness is a sign of lack of charisma, a single instance of unjustness might well not be regarded as significant evidence regarding lack of charisma. For example, if unjustness is a tendency to act unjustly on at least 50% of the occasions on which one is presented with the choice between justness and unjustness, a single instance of unjustness will not make that hypothesis probable, much less acceptable. This is an instance of the epistemic asymmetry of independence mentioned earlier.
Since in this case having charisma and having a revolution are independent of the act of sending for Bathsheba (but not necessarily conversely - kings lacking charisma might be known to be prone to covet other men's wives), both the stochastic expectation of the generic act of sending for another man's wife, and the particular act of Solomon's sending for Bathsheba, are given by
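$U(\ulcorner \alpha P \in B \urcorner, K_S) = \mathrm{Des}(\ulcorner \alpha P \in B \cap R \urcorner)\, \mathrm{Prob}(K_S, \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in R \urcorner) + \mathrm{Des}(\ulcorner \alpha P \in B \cap \bar{R} \urcorner)\, \mathrm{Prob}(K_S, \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in \bar{R} \urcorner)$.
$U(\ulcorner \alpha P \in \bar{B} \urcorner, K_S)$ may be computed similarly.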
Note that charisma doesn't enter into the computation at all, even in this general case. What Solomon is interested in, of course, is the particular case: he is interested in $U(\ulcorner s \in B \urcorner, K_S)$ and $U(\ulcorner s \in \bar{B} \urcorner, K_S)$, and if P is chosen wisely so that $\mathrm{RAN}(s, \ulcorner P \cap B \urcorner, R, K_S \text{ and } \ulcorner s \in B \urcorner)$ and $\mathrm{RAN}(s, \ulcorner P \cap \bar{B} \urcorner, R, K_S \text{ and } \ulcorner s \in \bar{B} \urcorner)$ hold, these will be the same as the stochastic utilities just
computed.
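Under the assumption that $\mathrm{Prob}(K_S \text{ and } \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in R \urcorner) = \mathrm{Prob}(K_S \text{ and } \ulcorner \alpha P \in \bar{B} \urcorner, \ulcorner \alpha P \in R \urcorner) = \mathrm{Prob}(K_S, \ulcorner \alpha P \in R \urcorner)$, and that $\mathrm{RAN}(s, \ulcorner P \cap B \urcorner, R, K_S \text{ and } \ulcorner s \in B \urcorner)$, we also have $\mathrm{Prob}(K_S, \ulcorner s \in B \urcorner, \ulcorner s \in R \urcorner) = \mathrm{Prob}(K_S, \ulcorner s \in R \urcorner)$: the eventuality $\ulcorner s \in R \urcorner$ is properly epistemically independent of the decision to make $\ulcorner s \in B \urcorner$ true.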
To obtain an example that will illustrate the point at issue, we may suppose that it is given to each king to make a Bathsheba decision once in his reign. B denotes the set of instances in which the other man's wife is sent for, and not merely the general character of unjustness. Let us represent the statistical knowledge of the class of situations P by the following table:
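$B \cap C \cap R$: $p_1$
$\bar{B} \cap C \cap R$: $p_2$
$B \cap \bar{C} \cap R$: $p_3$
$\bar{B} \cap \bar{C} \cap R$: $p_4$
$B \cap C \cap \bar{R}$: $p_5$
$\bar{B} \cap C \cap \bar{R}$: $p_6$
$B \cap \bar{C} \cap \bar{R}$: $p_7$
$\bar{B} \cap \bar{C} \cap \bar{R}$: $p_8$.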
Keeping in mind that these frequencies can, in reality, only be known approxi-
mately, the assumed body of knowledge incorporates the following constraints:
(i) $(p_3 + p_4)/(p_3 + p_4 + p_7 + p_8)$ is high (uncharismatic kings are frequently revolted against)
(this entails that $p_7 + p_8$ is small relative to $p_3 + p_4$).
(ii) $(p_2 + p_6)/(p_2 + p_6 + p_1 + p_5) > p_2 + p_4 + p_6 + p_8$
(charismatic kings are prone to make the just Bathsheba decision).
(iii) $(p_3 + p_7)/(p_3 + p_7 + p_4 + p_8) > p_1 + p_3 + p_5 + p_7$
(uncharismatic kings are prone to make the unjust Bathsheba decision).
(iv) $(p_2 + p_6)/(p_2 + p_6 + p_4 + p_8) > p_1 + p_2 + p_5 + p_6$
(kings who make the just Bathsheba decision tend to be charismatic).
(v) $(p_3 + p_7)/(p_1 + p_3 + p_5 + p_7) > p_3 + p_4 + p_7 + p_8$
(kings who make the unjust Bathsheba decision tend to be uncharismatic).
It follows that
(vi) $(p_1 + p_3)/(p_1 + p_3 + p_5 + p_7) > p_1 + p_2 + p_3 + p_4$
(kings who fail the Bathsheba test are more often revolted
against than others).
(vii) $(p_2 + p_4)/(p_2 + p_4 + p_6 + p_8) < p_1 + p_2 + p_3 + p_4$
(kings who pass the Bathsheba test are less often revolted against than others).
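To make the constraints concrete, the sketch below checks them against one table of frequencies. The eight numbers are hypothetical, not taken from the text, and the cell indices assume the reading p1, p3, p5, p7 for sending (B), p1, p2, p5, p6 for charisma (C) and p1 to p4 for revolt (R), which is the reading the constraints themselves require. A single table is of course only an illustration, not a proof that (vi) and (vii) follow in general.

```python
from fractions import Fraction

# Hypothetical frequencies for cells p1..p8, in hundredths; not taken from the text.
P = {i: Fraction(n, 100) for i, n in enumerate([2, 8, 32, 8, 8, 32, 8, 2], start=1)}

def total(*cells):
    """Overall frequency of the union of the listed cells."""
    return sum(P[i] for i in cells)

def share(numerator, denominator):
    """Frequency of the numerator cells within the denominator cells."""
    return total(*numerator) / total(*denominator)

# Constraint (i) only says "high", so its value is reported rather than tested.
revolt_rate_uncharismatic = share((3, 4), (3, 4, 7, 8))

checks = {
    "ii": share((2, 6), (1, 2, 5, 6)) > total(2, 4, 6, 8),
    "iii": share((3, 7), (3, 4, 7, 8)) > total(1, 3, 5, 7),
    "iv": share((2, 6), (2, 4, 6, 8)) > total(1, 2, 5, 6),
    "v": share((3, 7), (1, 3, 5, 7)) > total(3, 4, 7, 8),
    "vi": share((1, 3), (1, 3, 5, 7)) > total(1, 2, 3, 4),
    "vii": share((2, 4), (2, 4, 6, 8)) < total(1, 2, 3, 4),
}
print(revolt_rate_uncharismatic, checks)
```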
Now Harper and Gibbard are talking about subjective probabilities and not
about frequencies. Nevertheless, every assignment of subjective probability must be consistent with some set of frequencies, on pain of incoherence. (We leave to one side the probabilities of the counterfactuals, attending only to the indicative sentences.) A natural assumption, given the claim that 'Unjust acts themselves ... do not cause successful revolts', is that the outcome of the Bathsheba test is irrelevant to the frequency of revolt; put in other words,
$p_1 = p_2$, $p_3 = p_4$, $p_5 = p_6$, and $p_7 = p_8$. Since this is flatly inconsistent with the claim that 'charismatic kings tend to be just' (ii) together with 'uncharismatic kings tend to be unjust' (iii); and with the (implied) claim that 'just kings tend to be charismatic' (iv) and 'unjust kings tend to be uncharismatic' (v), we must assume that Harper and Gibbard do not intend that these probabilities ($p_1$ and $p_2$, $p_3$ and $p_4$, etc.) are equal. And there is no reason why they should: their probabilities are subjective, so there is no reason why one frequency cannot be known to be equal to another (revolts under charismatic kings who fail the Bathsheba test and revolts under charismatic kings who pass the Bathsheba test, for example), and at the same time a higher probability be assigned to one (revolt under a specific king who fails the test) than to the other (revolt under a king who passes the test). This seems to me to be patently irrational, but that may merely reflect my prejudices about probability and rational belief.
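The inconsistency with (ii) can be spelled out in two lines, assuming only that the eight frequencies sum to one and that the sum of the second and sixth is positive; the other three constraints fail in the same way.

```latex
\begin{align*}
\frac{p_2 + p_6}{p_1 + p_2 + p_5 + p_6}
  &= \frac{p_2 + p_6}{2(p_2 + p_6)} = \tfrac{1}{2}
  && \text{(using } p_1 = p_2,\ p_5 = p_6\text{)} \\
p_2 + p_4 + p_6 + p_8
  &= \tfrac{1}{2}\sum_{i=1}^{8} p_i = \tfrac{1}{2}
  && \text{(using all four equalities and } \textstyle\sum_i p_i = 1\text{)}
\end{align*}
```

Constraint (ii) would then demand that 1/2 be strictly greater than 1/2.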
So far so good, until we are given the distinction between an 'indication'
and a 'cause'. We are told that abstention 'for this reason' (i.e., to bring about
an indication of charisma) would be useless in avoiding a revolt. But in the
earlier discussion there was no mention of motivation one way or the other.
If it is significant, it should be taken account of. It is quite true that if we are thinking of frequencies, to bring about an indication of an outcome is not 'in
any way' to bring about the outcome itself. But surely if this is the case we cannot take the probability of the outcome given the indication to be differ-
ent from the probability of the outcome. In Bayesian terms this contradicts the previous assertion that the probability of the indication given the outcome
is greater than the prior probability of the indication. Relevance, for the
Bayesian, must be symmetrical: if A is positively relevant to B then B is
positively relevant to A. Rather than try to sort out what the authors have in mind, let us consider both cases.
Case (a): Solomon knows that the frequency of revolt is independent of
the frequency of positive results on the Bathsheba test: $p_1 = p_2$, $p_3 = p_4$,
$p_5 = p_6$, and $p_7 = p_8$. This appears to be the situation that Harper and Gibbard have in mind when they talk of causal independence. But then we
may easily compute that the frequency of revolution in B, $(p_1 + p_3)/(p_1 + p_3 + p_5 + p_7)$, is just the same as that in $\bar{B}$, $(p_2 + p_4)/(p_2 + p_4 + p_6 + p_8)$, and indeed in P in general, $(p_1 + p_2 + p_3 + p_4)$:
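$U(\ulcorner \alpha P \in B \urcorner, K_S) = \mathrm{Des}(\ulcorner \alpha P \in B \cap R \urcorner)\, \mathrm{Prob}(K_S \text{ and } \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in R \urcorner) + \mathrm{Des}(\ulcorner \alpha P \in B \cap \bar{R} \urcorner)\, \mathrm{Prob}(K_S \text{ and } \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in \bar{R} \urcorner)$
$= \mathrm{Des}(\ulcorner \alpha P \in B \cap R \urcorner)\, \mathrm{Prob}(K_S, \ulcorner \alpha P \in R \urcorner) + \mathrm{Des}(\ulcorner \alpha P \in B \cap \bar{R} \urcorner)\, \mathrm{Prob}(K_S, \ulcorner \alpha P \in \bar{R} \urcorner)$.
$U(\ulcorner \alpha P \in \bar{B} \urcorner, K_S) = \mathrm{Des}(\ulcorner \alpha P \in \bar{B} \cap R \urcorner)\, \mathrm{Prob}(K_S, \ulcorner \alpha P \in R \urcorner) + \mathrm{Des}(\ulcorner \alpha P \in \bar{B} \cap \bar{R} \urcorner)\, \mathrm{Prob}(K_S, \ulcorner \alpha P \in \bar{R} \urcorner)$.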
Since in both cases the same probability is involved, and whether or not there is a revolution Solomon prefers having Bathsheba to not having her, he should send for her forthwith provided $\mathrm{RAN}(s, \ulcorner P \cap B \urcorner, R, K_S \text{ and } \ulcorner s \in B \urcorner)$
and $\mathrm{RAN}(s, \ulcorner P \cap \bar{B} \urcorner, R, K_S \text{ and } \ulcorner s \in \bar{B} \urcorner)$. The analysis boils down to that for David, so long as revolution is independent of Bathsheba both in the presence of charisma and in the absence of charisma.
Case (b): $p_1 \neq p_2$ or $p_3 \neq p_4$ or $p_5 \neq p_6$ or $p_7 \neq p_8$. This is to say that
we abandon the assumption that injustice is known to be irrelevant to revolt
and replace it with knowledge of relevance. But that does not make the
case uninteresting. Consider first the stochastic expectation of the generic
decision $\ulcorner \alpha P \in B \urcorner$. Under the new circumstances we have
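$U(\ulcorner \alpha P \in B \urcorner, K_S) = \mathrm{Des}(\ulcorner \alpha P \in B \cap R \urcorner)\, \mathrm{Prob}(K_S, \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in R \urcorner) + \mathrm{Des}(\ulcorner \alpha P \in B \cap \bar{R} \urcorner)\, \mathrm{Prob}(K_S, \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in \bar{R} \urcorner)$.
$U(\ulcorner \alpha P \in \bar{B} \urcorner, K_S) = \mathrm{Des}(\ulcorner \alpha P \in \bar{B} \cap R \urcorner)\, \mathrm{Prob}(K_S, \ulcorner \alpha P \in \bar{B} \urcorner, \ulcorner \alpha P \in R \urcorner) + \mathrm{Des}(\ulcorner \alpha P \in \bar{B} \cap \bar{R} \urcorner)\, \mathrm{Prob}(K_S, \ulcorner \alpha P \in \bar{B} \urcorner, \ulcorner \alpha P \in \bar{R} \urcorner)$.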
Since we are now assuming that
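$\mathrm{Prob}(K_S, \ulcorner \alpha P \in B \urcorner, \ulcorner \alpha P \in R \urcorner) > \mathrm{Prob}(K_S, \ulcorner \alpha P \in R \urcorner)$, and
$\mathrm{Prob}(K_S, \ulcorner \alpha P \in \bar{B} \urcorner, \ulcorner \alpha P \in R \urcorner) < \mathrm{Prob}(K_S, \ulcorner \alpha P \in R \urcorner)$,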
it is clear that the new utilities may not be the same as the previous utilities.
That is, under the new assumptions the relative stochastic expectations of
$\ulcorner \alpha P \in B \urcorner$ and $\ulcorner \alpha P \in \bar{B} \urcorner$ may be reversed. Remember, though, that we are
talking of the expected utility of a king who fails (or passes) the Bathsheba
test, and the probabilities that enter into these computations are simply
the counterparts of the general relative frequencies or propensities known in
Solomon's corpus $K_S$.
Next let us look at the utilities of $\ulcorner s \in B \urcorner$ and $\ulcorner s \in \bar{B} \urcorner$ evaluated from an
outsider's point of view. By this I mean that we consider Solomon's corpus
$K_S$ and add to it merely $\ulcorner s \in B \urcorner$ or $\ulcorner s \in \bar{B} \urcorner$. Solomon naturally knows much more than this: he will know, for example, whether it would be lust that would tempt him to make $\ulcorner s \in B \urcorner$ true, or whether it would be selfless
consideration for Uriah whom he knows to be unhappy with Bathsheba.
(David knows much more than $K_D$ and $\ulcorner d \in B \urcorner$, too, but in his case we
assumed that knowledge to be irrelevant to his deliberations.) From the
outsider's point of view, we may well have $\mathrm{RAN}(s, \ulcorner P \cap B \urcorner, R, K_S \text{ and } \ulcorner s \in B \urcorner)$
and $\mathrm{RAN}(s, \ulcorner P \cap \bar{B} \urcorner, R, K_S \text{ and } \ulcorner s \in \bar{B} \urcorner)$. That is, relative to an
outsider's point of view, Solomon's situation s may well be a random member
of $\ulcorner P \cap B \urcorner$ or $\ulcorner P \cap \bar{B} \urcorner$ as the case may be with respect to R. For the outsider,
then, the stochastic utilities may correspond to the proper epistemic utilities.
Next, suppose that Solomon himself truncates his corpus - i.e., he
brackets the motives, noble and ignoble, that incline him one way or the
other. When he does this, he is adopting the point of view of the outsider:
the randomness conditions are still met; the utilities of $\ulcorner s \in B \urcorner$ and $\ulcorner s \in \bar{B} \urcorner$ are still the same as those of the general events $\ulcorner \alpha P \in B \urcorner$ and $\ulcorner \alpha P \in \bar{B} \urcorner$. It
seems quite appropriate to regard these utilities, as Harper and Gibbard do,
as the values of $\ulcorner s \in \bar{B} \urcorner$ and $\ulcorner s \in B \urcorner$ 'as news'; the deliberative element has
been wiped out. These would be the appropriate epistemic utilities for
Solomon if he were to be told what he was going to decide to do. But he is
not then deciding what to do: the utilities are not relevant to his decision,
since they represent utilities based on the hypothesis that he has made one
or another of the decisions.
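Now let us allow Solomon to deliberate. We need not follow the course
of deliberations, but may abbreviate what he learns in the course of those
deliberations by 'cc'. Thus what Solomon really wants to evaluate are
$U(\ulcorner s \in B \urcorner, K_S \text{ and } \mathit{cc})$ and
$U(\ulcorner s \in \bar{B} \urcorner, K_S \text{ and } \mathit{cc})$.
It is unlikely that s is a random member of either $\ulcorner P \cap B \urcorner$ or $\ulcorner P \cap \bar{B} \urcorner$ with
respect to R relative to $K_S$ and cc and $\ulcorner s \in B \urcorner$ or relative to $K_S$ and cc and
$\ulcorner s \in \bar{B} \urcorner$. In fact with respect to determining the probability of $\ulcorner s \in R \urcorner$
relative to $K_S$ and cc and $\ulcorner s \in B \urcorner$, it might turn out that the relevant
randomness relation obtained between three entirely new objects: X, Y, and Z. That
is, we might have:
$\ulcorner X \in Z \leftrightarrow s \in R \urcorner \in K_S \text{ and } \mathit{cc} \text{ and } \ulcorner s \in B \urcorner$;
$\mathrm{RAN}(X, Y, Z, K_S \text{ and } \mathit{cc} \text{ and } \ulcorner s \in B \urcorner)$;
$\ulcorner S(Y, Z, p, q) \urcorner^{*} \in K_S \text{ and } \mathit{cc} \text{ and } \ulcorner s \in B \urcorner$;
and therefore
$\mathrm{Prob}(K_S \text{ and } \mathit{cc} \text{ and } \ulcorner s \in B \urcorner, \ulcorner s \in R \urcorner) = (p, q)$.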
There is yet another way of dealing with this problem. Suppose that Solomon's conscious motives in making his decision are a mixture of the
noble and the ignoble, so that whichever decision he makes, s will belong
both to a class in which revolution is rare and to a class in which it is frequent;
and it may well be that Solomon does not have the information to determine
the frequency of revolution in the intersection of these classes. There is,
nevertheless, a course of action open to him which will leave his expectations unchanged: Suppose he declines to make the decision between $\ulcorner s \in B \urcorner$ and
$\ulcorner s \in \bar{B} \urcorner$, and instead employs a chance mechanism with frequency of 'Go'
equal to $p_1 + p_3 + p_5 + p_7$. Since this is the overall frequency of B in the
reference class P, it will not indicate anything positive or negative about the
likelihood of revolution. The outcome, by definition, is independent of
whether or not he has charisma, and thus of the occurrence of revolution.
This may explain the propensity of many people to flip a coin to decide
between two courses of action, one of which would be more fun, and one
of which would reveal strength of character (to go to the movies or to study).
6. DELIBERATION AND CHOICE
Finally, let us suppose that $K_S$ contains just Solomon's knowledge that he
is deliberating. Let us further take this as implying that he believes that he
has freedom of choice, and that therefore there is no connection between
how he decides and whether or not he has charisma. Now he cannot assume
that this is true of all the kings involved in the reference class P - since in
that class there is revealed a connection between passing the Bathsheba test
and being revolted against. But he is perfectly free to believe that there is a
subset of P, say FP, in which the frequency of revolt is the same for $\ulcorner \mathit{FP} \cap B \urcorner$ and for $\ulcorner \mathit{FP} \cap \bar{B} \urcorner$. Furthermore, since his knowledge of the statistical
relations between C and R is perfectly general, we may suppose that it applies
to FP as well as to P. (We could justify this assumption through more detailed
considerations of the weight of inductive evidence, but that would lead us
astray from the main point.) We are then supposing that Solomon's corpus
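$K_S$ contains
$\ulcorner S(\mathit{FP}, R, q, q) \urcorner$,
$\ulcorner S(\mathit{FP} \cap B, R, q, q) \urcorner$,
$\ulcorner S(\mathit{FP} \cap \bar{B}, R, q, q) \urcorner$,
where $q = p_1 + p_2 + p_3 + p_4$. Relative to this corpus, it may well be the case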
that
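$\mathrm{RAN}(s, \ulcorner \mathit{FP} \cap B \urcorner, R, K_S \text{ and } \ulcorner s \in B \urcorner)$, and
$\mathrm{RAN}(s, \ulcorner \mathit{FP} \cap \bar{B} \urcorner, R, K_S \text{ and } \ulcorner s \in \bar{B} \urcorner)$,
and, therefore, that $\ulcorner s \in R \urcorner$ is properly epistemically independent of $\ulcorner s \in B \urcorner$.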
If we compute the utilities in this case it is clear that we will find that the
utility of sending for Bathsheba is greater than the utility of abstaining.
7. REBOAM
Let r be Reboam's special situation, P the set of like situations, C those in
which the king is charismatic, S those in which the king is severe, and D
those in which the king is deposed. We suppose enough data so that the following frequencies may be accepted as practically certain by Reboam.
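$C \cap S \cap D$: 0.16
$C \cap S \cap \bar{D}$: 0.24
$C \cap \bar{S} \cap D$: 0.02
$C \cap \bar{S} \cap \bar{D}$: 0.08
$\bar{C} \cap S \cap D$: 0.08
$\bar{C} \cap S \cap \bar{D}$: 0.02
$\bar{C} \cap \bar{S} \cap D$: 0.22
$\bar{C} \cap \bar{S} \cap \bar{D}$: 0.18.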
Consider first the general case:
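$U(\ulcorner \alpha P \in S \urcorner, K_R) = \mathrm{Des}(\ulcorner \alpha P \in D \cap S \urcorner)\, \mathrm{Prob}(K_R \text{ and } \ulcorner \alpha P \in S \urcorner, \ulcorner \alpha P \in D \urcorner) + \mathrm{Des}(\ulcorner \alpha P \in \bar{D} \cap S \urcorner)\, \mathrm{Prob}(K_R \text{ and } \ulcorner \alpha P \in S \urcorner, \ulcorner \alpha P \in \bar{D} \urcorner)$
$= 4.8 + 52 = 56.8$.
$U(\ulcorner \alpha P \in \bar{S} \urcorner, K_R) = \mathrm{Des}(\ulcorner \alpha P \in D \cap \bar{S} \urcorner)\, \mathrm{Prob}(K_R \text{ and } \ulcorner \alpha P \in \bar{S} \urcorner, \ulcorner \alpha P \in D \urcorner) + \mathrm{Des}(\ulcorner \alpha P \in \bar{D} \cap \bar{S} \urcorner)\, \mathrm{Prob}(K_R \text{ and } \ulcorner \alpha P \in \bar{S} \urcorner, \ulcorner \alpha P \in \bar{D} \urcorner)$
$= 0 + 41.6 = 41.6$.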
This is as it should be, and corresponds to the application of the stochastic sure-thing principle described by Harper and Gibbard.
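The two stochastic utilities can be recomputed from the frequency table. In this sketch each key is (charismatic, severe, deposed), with the complements assigned to the eight cells in the only way that yields 80% charisma among severe kings and the figures 4.8, 52 and 41.6. The desirabilities 10, 100, 0 and 80 are read off the arithmetic in the text rather than stated there as a table, and the function names are assumptions.

```python
from fractions import Fraction

# Frequencies keyed by (charismatic, severe, deposed).
FREQ = {
    (True, True, True): Fraction(16, 100),
    (True, True, False): Fraction(24, 100),
    (True, False, True): Fraction(2, 100),
    (True, False, False): Fraction(8, 100),
    (False, True, True): Fraction(8, 100),
    (False, True, False): Fraction(2, 100),
    (False, False, True): Fraction(22, 100),
    (False, False, False): Fraction(18, 100),
}
# Desirabilities keyed by (severe, deposed), inferred from the arithmetic in the text.
DES = {(True, True): 10, (True, False): 100, (False, True): 0, (False, False): 80}

def prob_deposed_given(severe: bool) -> Fraction:
    """Frequency of deposition among kings who are (or are not) severe."""
    in_class = {k: v for k, v in FREQ.items() if k[1] == severe}
    return sum(v for k, v in in_class.items() if k[2]) / sum(in_class.values())

def stochastic_utility(severe: bool) -> Fraction:
    """Expected desirability for a random king who makes the given choice."""
    p = prob_deposed_given(severe)
    return DES[(severe, True)] * p + DES[(severe, False)] * (1 - p)

print(float(stochastic_utility(True)), float(stochastic_utility(False)))
```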
But it does not apply to Reboam, because r is not a random member of $\ulcorner P \cap S \urcorner$ with respect to D, relative to Reboam's own corpus. This is (as in
Solomon's case) a result of the fact that he is deliberating and choosing. Let
the corpus that reflects the course of his deliberations be $K_R$ and C. The
probability that Reboam is charismatic is 1/2, since before he chooses
whether or not to be severe, he is just another king, and half of them are
charismatic. The same is true for us.
Let Reboam now choose to be severe. For us, he is a random member of the kings who thus choose, of which 80% are charismatic. For Reboam this is
not so: that he chooses to be severe or not to be severe has no effect on his
degree of belief in his charisma. He knows (given the story that Gibbard and
Harper tell) that his choice cannot affect his degree of charisma. We can express this by saying that r belongs to a subclass of $\ulcorner P \cap S \urcorner$ (or of $\ulcorner P \cap \bar{S} \urcorner$),
even when $\ulcorner r \in S \urcorner$ is known (when $\ulcorner r \in \bar{S} \urcorner$ is known) in which the frequency
of C is the same as it is in P in general: namely, the class in which S is freely chosen. Surely Reboam regards his choice as free, even if we do not (else he
would not be deliberating). Furthermore, that it is freely chosen implies that
it cannot be taken as evidence of a prior state: that Reboam freely chooses $\ulcorner r \in S \urcorner$ implies that the epistemic probability that he is charismatic is unchanged - that $\ulcorner r \in C \urcorner$ is epistemically independent of $\ulcorner r \in S \urcorner$. Of course we don't know what proportion of $\ulcorner P \cap S \urcorner$ represents the free choice of
S - but we do know that however small this class may be, it must contain the same proportion of C as P itself, else epistemic independence must fail.
Let the class of instances in which S or $\bar{S}$ is freely chosen be PF; then Reboam will regard his situation r as a random member of PF, where he knows that the frequency of C among both $\ulcorner \mathit{PF} \cap S \urcorner$ and $\ulcorner \mathit{PF} \cap \bar{S} \urcorner$ is
1/2. He also knows that the frequency of deposition given C and S, given $\bar{C}$ and S, given C and $\bar{S}$, and given $\bar{C}$ and $\bar{S}$ is as stipulated earlier: F has no known bearing on these frequencies. Therefore, we may compute:
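$$\mathrm{Prob}(K_R \text{ and } \ulcorner r \in S \urcorner, \ulcorner r \in D \urcorner) = \mathrm{Prob}(K_R \text{ and } \ulcorner r \in S \urcorner, \ulcorner r \in (D \cap C) \cup (D \cap \bar{C}) \urcorner).$$
Since $\mathrm{RAN}(r, \ulcorner \mathit{PF} \cap S \urcorner, D, K_R \text{ and } \ulcorner r \in S \urcorner)$, we may compute: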
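$$S(\mathit{PF} \cap S, (D \cap C) \cup (D \cap \bar{C}), 0.6, 0.6);$$
$$S(\mathit{PF} \cap S, (\bar{D} \cap C) \cup (\bar{D} \cap \bar{C}), 0.4, 0.4);$$
$$S(\mathit{PF} \cap \bar{S}, (D \cap C) \cup (D \cap \bar{C}), 0.375, 0.375);$$
$$S(\mathit{PF} \cap \bar{S}, (\bar{D} \cap C) \cup (\bar{D} \cap \bar{C}), 0.625, 0.625).$$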
Thus:
$$U(\ulcorner r \in S \urcorner, K_R) = 0.6 \times 10 + 0.4 \times 100 = 46;$$
$$U(\ulcorner r \in \bar{S} \urcorner, K_R) = 0.375 \times 0 + 0.625 \times 80 = 50.$$
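The two pairs of figures for Reboam can be checked with a short Python sketch. The desirabilities (10, 100, 0, 80) are read off the arithmetic in the text. The stochastic deposition probability of 0.48 is an assumption, back-computed from the terms 4.8, 52 and 41.6 rather than stated here. The names `DES` and `expected_utility` are illustrative only.

```python
# Desirabilities of (act, outcome) pairs, read off the arithmetic in the text.
DES = {
    ("severe", "deposed"): 10,
    ("severe", "not deposed"): 100,
    ("lenient", "deposed"): 0,
    ("lenient", "not deposed"): 80,
}


def expected_utility(act, p_deposed):
    """Desirability-weighted sum over the two outcomes of an act."""
    return (DES[(act, "deposed")] * p_deposed
            + DES[(act, "not deposed")] * (1 - p_deposed))


# Epistemic probabilities of deposition (the statistical statements about PF).
EPISTEMIC = {"severe": 0.6, "lenient": 0.375}
# Stochastic probabilities: an assumption back-computed from 4.8 + 52 and 0 + 41.6.
STOCHASTIC = {"severe": 0.48, "lenient": 0.48}

for act in ("severe", "lenient"):
    print(act,
          "epistemic:", round(expected_utility(act, EPISTEMIC[act]), 1),
          "stochastic:", round(expected_utility(act, STOCHASTIC[act]), 1))
```

On these numbers severity has the higher stochastic utility, while not being severe has the higher epistemic utility, which is the contrast the section draws.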
In the former case we could apply a sure-thing principle because the prob-
abilities of the outcomes of the acts were independent of the acts. Here
matters are a little more complicated: there is a circumstance (C or $\bar{C}$) independent of the act which serves as a basis for the application of the sure-thing principle. Note that C is not independent of the act under the stochastic
analysis, but that D is; under the epistemic analysis C is independent of the
act but D is not. The distinction between purely epistemic utility and sto-
chastic utility, which reflects the difference between purely epistemic prob-
ability and (indefinite) stochastic probability, provides for precisely the dis-
tinction that Harper and Gibbard want to draw attention to, but requires
neither a peculiar counterfactual, nor a variety of sure-thing principles.
8. NEWCOMB
Let s be the subject's situation, N the set of Newcomb situations, M the set in
which there is a million dollars in the right-hand box, and R the set in which
the right-hand box alone is taken. We had best forget about the demon or the
psychologist, because that introduces a competitive element which might
distort our intuitions. We merely suppose that we and the subject know that all (almost all) R's are M's and that no (almost no) $\bar{R}$'s are M's. The
stochastic expectations are easily computed:
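$$U(\ulcorner \alpha_N \in R \urcorner, K_n) = \$10^6 \times \mathrm{Prob}(K_n \text{ and } \ulcorner \alpha_N \in R \urcorner, \ulcorner \alpha_N \in M \urcorner) \approx \$10^6;$$
$$U(\ulcorner \alpha_N \in \bar{R} \urcorner, K_n) = \$10^6 \times \mathrm{Prob}(K_n \text{ and } \ulcorner \alpha_N \in \bar{R} \urcorner, \ulcorner \alpha_N \in M \urcorner) + \$1000 \approx \$1000.$$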
But the subject of the experiment is not concerned with stochastic
probability, but with proper epistemic probability:
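$$U(\ulcorner s \in R \urcorner, K_n) = \$10^6 \times \mathrm{Prob}(K_n \text{ and } \ulcorner s \in R \urcorner, \ulcorner s \in M \urcorner);$$
$$U(\ulcorner s \in \bar{R} \urcorner, K_n) = \$10^6 \times \mathrm{Prob}(K_n \text{ and } \ulcorner s \in \bar{R} \urcorner, \ulcorner s \in M \urcorner) + \$1000.$$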
We need to evaluate these probabilities. Now since to make $\ulcorner s \in R \urcorner$ or
$\ulcorner s \in \bar{R} \urcorner$ true is in the power of the subject (and if we want to talk about a
demon or a psychologist, it is not within her power to alter the truth of
$\ulcorner s \in M \urcorner$ at the time of the subject's decision), he must regard his situation s as falling in a subset FN of N in which M is statistically independent of R; he
may know relatively little about this frequency, say merely that it lies
between 0.1 and 0.5. This is to say that in $K_n$ are the statements:
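$$\ulcorner S(N, M, 0.1, 0.5) \urcorner;$$
$$\ulcorner S(\mathit{FN} \cap R, M, 0.1, 0.5) \urcorner;$$
$$\ulcorner S(\mathit{FN} \cap \bar{R}, M, 0.1, 0.5) \urcorner.$$
We may suppose that $\mathrm{RAN}(s, \ulcorner \mathit{FN} \cap R \urcorner, M, K_n \text{ and } \ulcorner s \in R \urcorner)$ and $\mathrm{RAN}(s, \ulcorner \mathit{FN} \cap \bar{R} \urcorner, M, K_n \text{ and } \ulcorner s \in \bar{R} \urcorner)$. Therefore $\mathrm{Prob}(K_n \text{ and } \ulcorner s \in R \urcorner, \ulcorner s \in M \urcorner) = \mathrm{Prob}(K_n \text{ and } \ulcorner s \in \bar{R} \urcorner, \ulcorner s \in M \urcorner) = (0.1, 0.5)$, and $U(\ulcorner s \in R \urcorner, K_n) = (100\,000, 500\,000)$, $U(\ulcorner s \in \bar{R} \urcorner, K_n) = (101\,000, 501\,000)$.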
Note that we obtain the same result if the subject is totally ignorant of the
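frequency with which a million dollars is put in the right-hand box; let $K^*$
contain:
$$\ulcorner S(N, M, 0, 1) \urcorner;$$
$$\ulcorner S(\mathit{FN} \cap R, M, 0, 1) \urcorner;$$
$$\ulcorner S(\mathit{FN} \cap \bar{R}, M, 0, 1) \urcorner;$$
$$U(\ulcorner s \in R \urcorner, K^*) = (0, 10^6);$$
$$U(\ulcorner s \in \bar{R} \urcorner, K^*) = (1000, 10^6 + 1000).$$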
Finally, the subject may consider a set of rational corpora $K_{np}$ which are like his except that they contain the knowledge of the frequency with which M
occurs in N; $K_{np}$ contains:
$$\ulcorner S(N, M, p, p) \urcorner;$$
$$\ulcorner S(\mathit{FN} \cap R, M, p, p) \urcorner;$$
$$\ulcorner S(\mathit{FN} \cap \bar{R}, M, p, p) \urcorner;$$
$$U(\ulcorner s \in R \urcorner, K_{np}) = p \times 10^6;$$
$$U(\ulcorner s \in \bar{R} \urcorner, K_{np}) = p \times 10^6 + 1000.$$
In any event, we see that taking both boxes is more profitable than taking
one box.
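The interval-valued utilities in the three cases can be reproduced with a brief Python sketch. It assumes only what the text states: the probability of M lies in the same interval whichever act is chosen, and taking both boxes adds $1000. The function names and the sample value p = 0.3 are hypothetical, chosen for illustration.

```python
MILLION = 10**6   # contents of the right-hand box when M holds
BONUS = 1000      # extra amount obtained by taking both boxes


def utility_interval(p_low, p_high, takes_both):
    """Interval-valued utility when Prob(M) lies in [p_low, p_high]."""
    extra = BONUS if takes_both else 0
    return (MILLION * p_low + extra, MILLION * p_high + extra)


def two_boxing_better(p_low, p_high):
    """True when both endpoints are higher for taking both boxes."""
    one = utility_interval(p_low, p_high, takes_both=False)
    two = utility_interval(p_low, p_high, takes_both=True)
    return two[0] > one[0] and two[1] > one[1]


# The three corpora discussed: partial knowledge, total ignorance, and a
# known frequency p (0.3 is a hypothetical value for illustration).
for label, (low, high) in {"K_n": (0.1, 0.5),
                           "K*": (0.0, 1.0),
                           "K_np, p = 0.3": (0.3, 0.3)}.items():
    print(label,
          utility_interval(low, high, takes_both=False),
          utility_interval(low, high, takes_both=True),
          two_boxing_better(low, high))
```

Because the act has no bearing on the interval for M, the $1000 shifts both endpoints upward in every case.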
9. THE ROAD TO DAMASCUS
This is strikingly different from Newcomb's problem because of the com-
petitive element that enters into it. We suppose that Death is trying specifi-
cally to outwit the traveller. (If there is an appointment book made up in
advance, the previous analyses go through and the probability of death for
the traveller is the same whatever he does: it is epistemically independent
of his decision.) Furthermore, we suppose that Death has psychic powers, so that the decision to go to Aleppo will, with high probability, be psychically perceived by Death and the decision to go to Damascus will, with high probability, be perceived by Death, in either case with unfortunate results. What
is the traveller to do?
Let t be his situation, F the set of such situations, A the subset in which the traveller decides to go to Aleppo, $\bar{A}$ the subset in which he decides to go
to Damascus, $D_A$ the set in which Death seeks him in Aleppo and $D_{\bar{A}}$ the set in which Death seeks him in Damascus. We have supposed that the traveller's
corpus $K_t$ contains:
$$\ulcorner S(F \cap A, D_A, 0.9, 0.9) \urcorner;$$
$$\ulcorner S(F \cap \bar{A}, D_{\bar{A}}, 0.9, 0.9) \urcorner.$$
That is, due to Death's psychic powers the frequency with which Death seeks
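a traveller who has decided to go to Aleppo is high, and similarly for Damascus. Thus:
$$U(\ulcorner \alpha_F \in A \urcorner, K_t) = \mathrm{Des}(\ulcorner \alpha_F \in A \wedge \alpha_F \in D_A \urcorner)\,\mathrm{Prob}(K_t \text{ and } \ulcorner \alpha_F \in A \urcorner, \ulcorner \alpha_F \in D_A \urcorner) + \mathrm{Des}(\ulcorner \alpha_F \in A \wedge \alpha_F \in D_{\bar{A}} \urcorner)\,\mathrm{Prob}(K_t \text{ and } \ulcorner \alpha_F \in A \urcorner, \ulcorner \alpha_F \in D_{\bar{A}} \urcorner) = -100 \times 0.9 = -90.$$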
Similarly,
$$U(\ulcorner \alpha_F \in \bar{A} \urcorner, K_t) = -90.$$
Since Death is psychic, the same holds for the specific situation t:
$$U(\ulcorner t \in A \urcorner, K_t) = -90;$$
$$U(\ulcorner t \in \bar{A} \urcorner, K_t) = -90.$$
But the traveller has a way out: since Death is depending on his psychic
powers to perceive the traveller's decision, the traveller can improve his
chances by not deciding either to go to Damascus or to go to Aleppo. How?
By tossing a coin! We have not supposed that Death is totally prescient, but
merely that he can, with high probability, read the traveller's mind. Let T be
the statement that the traveller goes to Aleppo or Damascus according as his
toss lands heads or tails; TF the set of F situations in which a coin toss is
used. Then
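$$U(T, K_t) = \mathrm{Des}(\ulcorner T \wedge t \in A \wedge t \in D_A \urcorner)\,\mathrm{Prob}(K_t, \ulcorner T \wedge t \in A \wedge t \in D_A \urcorner) + \mathrm{Des}(\ulcorner T \wedge t \in \bar{A} \wedge t \in D_A \urcorner)\,\mathrm{Prob}(K_t, \ulcorner T \wedge t \in \bar{A} \wedge t \in D_A \urcorner) + \mathrm{Des}(\ulcorner T \wedge t \in A \wedge t \in D_{\bar{A}} \urcorner)\,\mathrm{Prob}(K_t, \ulcorner T \wedge t \in A \wedge t \in D_{\bar{A}} \urcorner) + \mathrm{Des}(\ulcorner T \wedge t \in \bar{A} \wedge t \in D_{\bar{A}} \urcorner)\,\mathrm{Prob}(K_t, \ulcorner T \wedge t \in \bar{A} \wedge t \in D_{\bar{A}} \urcorner).$$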
We suppose that in general Death spends half his time in Damascus and half
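his time in Aleppo; we have already supposed that he has no foreknowledge
of the outcomes of coin tosses. This means that we may plausibly suppose the
traveller to have the following statistical knowledge:
$$\ulcorner S(\mathit{TF}, A \cap D_A, 1/4, 1/4) \urcorner;$$
$$\ulcorner S(\mathit{TF}, A \cap D_{\bar{A}}, 1/4, 1/4) \urcorner;$$
$$\ulcorner S(\mathit{TF}, \bar{A} \cap D_A, 1/4, 1/4) \urcorner;$$
$$\ulcorner S(\mathit{TF}, \bar{A} \cap D_{\bar{A}}, 1/4, 1/4) \urcorner.$$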
So far as the traveller is concerned,
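$$\mathrm{RAN}(t, \mathit{TF}, \ulcorner A \cap D_A \urcorner, K_t),$$
$$\mathrm{RAN}(t, \mathit{TF}, \ulcorner A \cap D_{\bar{A}} \urcorner, K_t),$$
$$\mathrm{RAN}(t, \mathit{TF}, \ulcorner \bar{A} \cap D_A \urcorner, K_t),$$
$$\mathrm{RAN}(t, \mathit{TF}, \ulcorner \bar{A} \cap D_{\bar{A}} \urcorner, K_t).$$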
We may thus compute $U(T, K_t) = -50$. The situation is not only stable,
but the traveller has improved his expected utility.
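The comparison between deciding outright and tossing a coin can be sketched in Python. The desirability of meeting Death (-100) is taken from the calculation above. The desirability of escaping (0) is an assumption, inferred from the fact that -100 × 0.9 alone yields -90. The function names are illustrative.

```python
DES_DEATH = -100   # desirability of meeting Death, from the text
DES_ESCAPE = 0     # assumption: implied by -100 x 0.9 = -90


def utility_of_deciding(p_death_follows=0.9):
    """Expected utility when Death perceives the decision with frequency 0.9."""
    return DES_DEATH * p_death_follows + DES_ESCAPE * (1 - p_death_follows)


def utility_of_coin_toss():
    """Expected utility over the four cells of TF, each with frequency 1/4."""
    cities = ("Aleppo", "Damascus")
    total = 0.0
    for destination in cities:
        for deaths_city in cities:
            met = destination == deaths_city
            total += 0.25 * (DES_DEATH if met else DES_ESCAPE)
    return total


print(utility_of_deciding(), utility_of_coin_toss())
```

The coin toss does better only because Death is assumed to have no foreknowledge of its outcome, so that his location is independent of the traveller's destination.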
10. CONCLUSIONS
What is the upshot of all this? First of all, that an analysis of these seemingly
problematic decision situations can be provided in which no use is made of
counterfactuals, and in particular in which no use is made of Stalnaker's
implausible principle that one or the other of
If A were true, B would be true, or
If A were true, B would be false,
holds. Although we may want the object language to contain an intensional
statistical predicate, there is no need for this to be the case, and we
have little reason, in most non-theoretical contexts, to introduce such a
predicate.
In the metalanguage, similarly, everything is extensionalized: we do not
talk about what Solomon's corpus would be if he were to decide to make
$\ulcorner s \in B \urcorner$ true; we can actually refer to a certain set of sentences (namely
$K_s$ and $\ulcorner s \in B \urcorner$), which exists at least as unproblematically as any other
set. (This may not be very unproblematic.)
We need only one decision rule, that of maximizing expected proper
epistemic utility. (This rule breaks down in some situations, due to the
fact that utilities should be regarded as interval valued, which fact in turn
is a consequence of the fact that probabilities are interval valued. But this
refinement does not enter into any of the examples discussed in this paper.)
In those cases in which, in general, a proposition is an indicator of a
property known to have a bearing on the outcome of the act of making that
proposition true, we can in general suppose that the agent, insofar as he
regards himself as an agent, must regard himself in that act as belonging to a
subset of the general class of situations, in which subset the property in
question is properly epistemically independent of the proposition that that agent is contemplating making true.
We can define expected stochastic utility, with which some of the examples have been concerned, but this is of interest to an agent only when it has the same value as expected proper epistemic utility - that is, when the agent's
situation is a random member of the class of situations with which the sto-
chastic utility is concerned.
We can derive a sure thing principle from our decision rule; it is simply a
consequence of that rule, and has the form:
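If $\ulcorner s \in R \urcorner$ is properly epistemically independent of $\ulcorner s \in C \urcorner$,
relative to K, and $\ulcorner s \in \bar{R} \urcorner$ is properly epistemically independent of $\ulcorner s \in C \urcorner$, relative to K, and $U(T, K \text{ and } \ulcorner s \in C \urcorner)$ is greater
than $U(\sim T, K \text{ and } \ulcorner s \in C \urcorner)$, and $U(T, K \text{ and } \ulcorner s \in \bar{C} \urcorner)$ is greater
than $U(\sim T, K \text{ and } \ulcorner s \in \bar{C} \urcorner)$, then $U(T, K)$ is greater than
$U(\sim T, K)$.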
But this is not a very interesting principle - it merely, sometimes, simplifies
some calculations.
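A small Python sketch may make plain why the principle is a consequence of the decision rule. It assumes, as a reading of the independence conditions and not as something stated in the text, that the unconditional utility is the probability-weighted mixture of the utilities conditional on C and on not-C, with the same weight for either act. All names and numbers are hypothetical.

```python
def mixed_utility(p_c, u_given_c, u_given_not_c):
    """Utility as a mixture over C and not-C, weighted by Prob(C) = p_c."""
    return p_c * u_given_c + (1 - p_c) * u_given_not_c


def sure_thing_holds(p_c, u_t, u_not_t):
    """Check the principle for one case.

    u_t and u_not_t are pairs (utility given C, utility given not-C)
    for the act T and for its alternative.
    """
    dominates = u_t[0] > u_not_t[0] and u_t[1] > u_not_t[1]
    if not dominates:
        return True  # antecedent fails, so the principle says nothing
    return mixed_utility(p_c, *u_t) > mixed_utility(p_c, *u_not_t)


# Hypothetical utilities: T is better than its alternative under C and under not-C.
print(all(sure_thing_holds(k / 10, (5, 2), (4, 1)) for k in range(11)))
```

The check succeeds for every weight because a mixture of two larger numbers, taken with the same weights, exceeds the mixture of two smaller ones. If the weight on C differed between the acts, as it does under the stochastic analysis of Reboam's case, the inference would not go through.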
We can make sense of situations in which the analysis of Harper and
Gibbard yields unstable conditional utilities by means of randomization.
According to the analysis provided by Harper and Gibbard, U-maximiz-
ation sometimes leads to instability, and V-maximization simply isn't defined
once a person knows what he is going to do. Neither of these problems besets
the analysis offered here. Finally, the semantics for all that is employed here requires the use of only
one possible world, unless we want to give an intensional interpretation to
some of the statistical statements involved. (For example, we might suggest that the traveller toss a fair coin, which would bring in a number of theoretical
considerations.) One world, classical sentential connectives, one notion of epistemic probability and one of utility (of which, as special cases, we may distinguish between stochastic and properly epistemic), one principle of decision, and yet we can respond to the most common intuitions about the cases examined. We have to give up the general Bayesian principle of condition-
alization, but I would argue that even it has its advantages.
The University of Rochester
BIBLIOGRAPHY
[1] Gibbard, Allen and Harper, William L., 'Counterfactuals and Two Kinds of Expected Utility', in Hooker, Leach and McClennen (eds.), Foundations and Applications of Decision Theory, Vol. I, D. Reidel Publishing Co., Dordrecht, Holland, 1977.
[2] Kyburg, Henry E., Jr., The Logical Foundations of Statistical Inference, D. Reidel Publishing Co., Dordrecht, Holland, 1974.
[3] Levi, Isaac, 'Newcomb's Many Problems', Theory and Decision 6 (1975), 161-175.