Uncertainty Notes 2

Embed Size (px)

Citation preview

  • 7/30/2019 Uncertainty Notes 2

    1/39

    Notes on Rational Choice Under Uncertainty

    John DugganSeptember 4, 2012

    Contents1 Rational Choice 2

    2 Preferences Over Lotteries 3

    3 Expected Utility 6

    4 Uniqueness of the vN-M Representation 11

    5 Choice Under Uncertainty 14

    6 The One-dimensional Choice Model 20

    7 Measures of Risk Aversion 25

    8 Stochastic Dominance 29

    9 The Multidimensional Choice Model 33

    10 Mean-Variance Analysis 33

    11 Experimental Evidence 35

    12 The Paradox of Voting 37

    We consider a decision maker whose welfare is determined by the outcome realizedfrom a set X of alternatives, and we allow for the possibility that there is uncertaintyabout which alternative will be realized. For example,

    X may consist of the possible winners of an election, candidate A or candidateB , and a candidate who adopts a particular campaign platform may not knowwho the winner will be;

    1

  • 7/30/2019 Uncertainty Notes 2

    2/39

    X may consist of various possible economic outcomes, and a politician may notknow the economic effects of different public policies;

    X may consist of two types of politician, conservative and liberal, and votersmay not know for sure what type a candidate is;

    X may consist of the possible winners of an election, candidates A or B, and avoter who votes for A (or B) may not know who the winner will be.

    These are all examples of decision problems under uncertainty: What campaign plat-form should a candidate adopt? Which policy should a politician choose? Whichcandidate should a voter vote for? Should a voter bother to vote at all? To answerthese questions, we model uncertainty as a probability distributionor lotteryoveroutcomes, and we extend the rational choice framework to explore the deep implica-tions of two fundamental axioms (the expected utility axioms) on preferences overlotteries. This analysis plays a central role in modern approaches to formal modeling

    in economics and political science.

    1 Rational ChoiceThe general framework of rational choice theory consists of a decision maker whomust choose from a set A of possible alternatives. We maintain the assumption thather preferences are represented by an ordering P . Here, xP y means x is strictlypreferred to y. Formally, we assume P is asymmetric (not both xP y and yP x) andnegatively transitive ( xP y implies either xP z or zP y). Associated with this strictpreference relation is a weak preference relation R dened so that xRy means not

    yP x; this relation is complete ( xRy or yRx ) and transitive ( xRyRz implies xRz ).Also associated is an indifference relation I dened so that neither xP y nor yP x(equivalently, xRy and yRx ); this is symmetric ( xIy implies yI x) and transitive(xIyIz implies xIz ). Here, A is totally general, but it is typically either assumed to benite or a subset of Euclidean space. Under either assumption, we can very generallyrepresent the decision makers preferences by a utility function, say U : A R , thatassigns higher numbers to preferred alternatives. Formally, U (x) > U (y) if and onlyif xP y . If U is a utility function, then every monotonic transformation of U is aswell. That is, letting f : R R be strictly increasing on the range of U , the mappingV : X R dened by V (x) = f (U (x)) is also a utility function. This goes to showthat a utility function is only informative through the relative magnitudes of utilitiesassigned; the curvature of the utility function is arbitrary, and for this reason wesometimes use the term ordinal utility function. The rational choice hypothesisis that the decision maker will choose an alternative that is best according toP . Formally, an alternative x is a viable choice if it is maximal in the sense thatthere is no alternative y such that yP x (equivalently, xRy for all y). In terms of utilities, the decision maker will choose an alternative x that maximizes utility, i.e.,U (x) = max {U (y) | y A}.

    2

  • 7/30/2019 Uncertainty Notes 2

    3/39

    2 Preferences Over LotteriesAssume for now that the set X of possible the outcomes is nite, and enumeratethis set as {x1, . . . , x m }. We represent uncertainty about the outcome from X by aprobability distribution, a vector = ( 1, . . . , m ), which tells us the probability of outcome x j is j . Of course, j 0 for all j , and m j =1 j = 1. In choice theory, weusually call a lottery. Let denote the set of all lotteries, i.e.,

    = R m+ |m

    j =1

    j = 1 .

    In other words, is the unit simplex in R m . The simplex has m vertices, 1, . . . , m ,where j = (0 , . . . , 0, 1, 0, . . . , 0) is a vector with a one in the j th coordinate, zeroselsewhere. This is the lottery that puts probability one on outcome x j (it is a surething), and for this reason we refer to these lotteries as degenerate. See below.

    1

    2 3

    1

    2 3

    In the above examples, a decision maker needs to make a choice that will determinesome lottery. To understand how that choice will be or should be made, we need toconsider preferences over lotteries. We assume that the decision maker is an individualand has a strict preference relation P on , with weak preference R and indifferenceI derived as in the standard rational choice model: given , , R if and only if not P , and I if and only if R and R . Assume that P is an ordering, i.e., itis asymmetric and negatively transitive (equivalently, R is complete and transitive).We will investigate the most prominent model of choice under uncertainty, which isdened by the following primitives.

    Expected Utility Model:

    X outcomes, x1, . . . , x m lotteries on X P ordering on

    It may be that P is represented by a utility function, but we do not impose thatassumption directly. Although it is not modeled explicitly, we may imagine that the

    3

  • 7/30/2019 Uncertainty Notes 2

    4/39

    decision maker is allowed to choose from a set of feasible lotteries, and that alottery is a viable choice if it is maximal in , i.e., for all , R .

    Note that we have formulated a special case of the rational choice framework, wherethe set of alternatives is . There is nothing corresponding to the set X in the rationalchoice model because the latter is totally abstract; here, we consider a choice problemin which the set of alternatives has a special structure (it is a simplex) and a specicinterpretation (lotteries over outcomes).

    Do all orderings of lotteries make sense? Or can we rule some out on rst principles?To formulate one restriction, assume that the decision maker has strict preferencesover degenerate lotteries, and without loss of generality suppose 1P 2P P m . Weinterpret this to mean that outcome x1 is strictly preferred to all other outcomes.In this case, it would make sense to assume that 1 is strictly preferred to all otherlotteries (not just the degenerate ones). Why? The difference between 1 and any

    other lottery, say , is that puts positive probability on outcomes other than x1, andthese outcomes are worse from the decision makers point of view. Another sensiblerestriction is that if there is one lottery that is strictly preferred to all others,then it should be degenerate. Otherwise, puts positive probability on at least twooutcomes, and one of those, say x j , will be at least as good as the other, say xk . Butthen if we move probability from xk to x j , we should have a lottery that is at leastas good as , contradicting the supposition that is uniquely best.

    To discuss further restrictions, it is useful, given two lotteries , and [0, 1],to dene the new lottery + (1 ). Writing this out in vector form,

    ( 1 + (1 )1, . . . , m + (1 )m ),

    this is just the lottery that puts probability j + (1 ) j on each alternativex j . We may refer to this as a mixture of and . (You can check that theseprobabilities are non-negative and sum to one.) What is the intuitive meaning of thenew lottery? Imagine a two-stage lottery, where we randomly decide whether to uselottery or lottery (with probability on and 1 on ), and we then drawthe nal outcome in X according to whichever lottery was chosen in the rst stageof randomization. The probability distribution on X determined by this two-stagelottery is just + (1 ). Geometrically, of course, the mixed lottery is just aconvex combination of the original ones. To describe such a two-stage lottery, it iscustomary to use the term compound lottery.

    Returning to the question of preference restrictions, the following logic is appealing.Let and be two lotteries such that P , and consider the mixture = +(1 ), where (0, 1). Then it stands to reason that PP . Why? Viewedas a compound lottery, is equivalent to choosing with probability 1 and abetter lottery with positive probability > 0. It only differs from in that with

    4

  • 7/30/2019 Uncertainty Notes 2

    5/39

    probability 1 it chooses the worse lottery, so it stands to reason that P ; andit only differs from in that with probability it chooses a better lottery, so weconclude P . More generally, considering , (0, 1) with > , the same logicsuggests that we should have

    P ( + (1 ))P ( + (1 ))P ,

    so lotteries with more weight on are preferred to lotteries with more weight on .Thus, it seems reasonable to assume that preferences over lotteries possess some nat-ural convexity properties. In fact, this extends the intuition from the above examplein which one lottery was uniquely best.

    Are there other restrictions we can impose? The literature on decision theory hasconsidered many restrictions, formulated as axioms on the ordering P . A very weakaxiom is that the ordering P has no gaps.

    Intermediate values For all ,, with PP , there exists (0, 1) such thatI ( + (1 ) ).

    The above axiom is implied if, for example, the decision makers ordering has acontinuous utility representation. Indeed, suppose U : R is a continuous utilityrepresentation, so U () > U () > U ( ). Dene the function f : [0, 1] R by

    f () = U ( + (1 ) ).

    Since U is continuous, it follows that f is continuous as well. By the intermediatevalue theorem, the image f ([0, 1]) of the unit interval [0, 1] under f is convex. Sincef (0) = U ( ) and f (1) = U () it follows that U ( ), U () f ([0, 1]). Notice thatthe utility U () lies between U ( ) and U (), so convexity of f ([0, 1]) implies thatU () f ([0, 1]), which means there is some (0, 1) such that U () = f (), asclaimed. For this reason, the Intermediate Values axiom is generally regarded as quiteplausible.

    Another common axiom used is next.

    Independence For all ,, such that P , and for all (0, 1), we have

    ( + (1 ) )P ( + (1 ) ).

    Independence is also referred to as the Substitution axiom. The intuitive justica-tion for this axiom is to consider choosing between + (1 ) and + (1 ) .Viewing these as compound lotteries, they differ only in that the rst draws from with positive probability , while the second draws from ; since is preferred to ,the idea is that the rst compound lottery should be preferred to the second.

    5

  • 7/30/2019 Uncertainty Notes 2

    6/39

    To understand the Independence axiom more deeply, let and be two lotteries suchthat P . Given , (0, 1) with > , dene

    = + (1 ) = + (1 ).

    From the discussion above and the fact that puts greater weight on the lottery ,which is preferred to , we might think it reasonable that is preferred to . Infact, this is an implication of the Independence axiom. To see why, note that theaxiom implies P by setting = . We then apply the axiom again with playingthe role of in the axiom, with / playing the role of in the axiom, and setting = . Then

    +

    P

    +

    ,

    or equivalently, P . We conclude that compound lotteries with greater weight on

    are indeed preferable to the decision maker, depicted below to the left.

    In the above, we have applied the Independence axiom only in two special cases,where we set equal to either or . But the axiom can be applied more generallyto the case where is any third lottery. Above to the right is the case where is athird lottery, and the mixtures = + (1 ) and = + (1 ) lie on linesparallel to that connecting and .

    3 Expected UtilityWhile the axioms of Independence and Intermediate values are simple and seem quitereasonable, it turns out that in conjunction, they are very restrictive. In fact, they

    imply that preferences over lotteries have a specicand very convenientfunctionalform, known as expected utility. The following characterization is due to vonNeumann and Morgenstern (1944). The theorem provides the theoretical foundationfor most formal analyses of choice under uncertainty.

    6

  • 7/30/2019 Uncertainty Notes 2

    7/39

    Theorem 3.1 (von Neumann and Morgenstern) The strict preference relation P satises Independence and Intermediate values if and only if there is a function u : X R such that, for all , ,

    P

    m

    j =1 j u(a j ) >

    m

    j =1 j u(a j ). (1)

    We call any function satisfying ( 1) for all and a von Neumann-Morgenstern representation. We specically avoid the term utility, as we elaborate at a laterpoint. The quantity

    U () =m

    j =1

    j u(a j )

    is called the expected utility of lottery , and you can see that the function U is reallyan ordinal utility representation of the decision makers preferences over lotteries:condition (1) is equivalent to

    U () > U () P,

    as required for an ordinal utility representation. Of course, the condition in ( 1) isequivalent to the following: for all , ,

    R U () U ().

    Both conditions imply that for all , , the decision maker is indifferent between and if and only if U () = U ().

    What do the indifference curves of P look like in ? Of course, the indifference curvesare the level sets of U . Rewriting the expression for expected utility as

    U () = (u(x1), . . . , u (xm )),

    note that U is a linear function of with gradient ( u(x1), . . . , u (xm )), so its levelsets are parallel lines (or planes or hyperplanes in higher dimensions). A monotonictransformation of U will also represent the decision makers preferences over lotteries,but it may not have the nice expected utility form.

    It is straightforward to verify that expected utility preferences satisfy the two axioms.We now assume P satises the axioms and deduce the expected utility form. Weprovide a simple proof of Theorem 3.1 for the case of three outcomes in which thecorresponding degenerate lotteries are strictly ranked. We begin with three generallemmas that do not depend on these assumptions. The rst states that if one lottery is strictly preferred to another , then every lottery on the line between the two that

    7

  • 7/30/2019 Uncertainty Notes 2

    8/39

    Figure 1: Proof of Lemma 3.3

    is closer to is strictly preferred to every lottery on the line closer to . In fact, we

    have already given the argument for this following the statement of the Independenceaxiom.

    Lemma 3.2 If P , then for all , [0, 1] with > , we have

    ( + (1 ))P ( + (1 )).

    The next lemma extends the Independence axiom to the case in which the decisionmaker is indifferent between two lotteries: in this case, the decision maker is indifferentamong all lotteries on the line between them.

    Lemma 3.3 If I , then for all (0, 1), we have ( + (1 ))I.

    Proof. Let = + (1 ), and suppose that I but not I . Then either P or P . The two cases are analogous, so we focus on the rst. Let = 1+ 2 , anddene = + (1 ). In particular, lies between and (see Figure 3), andtherefore PP by Lemma 3.2. Then PI implies P . But lies between and , so another application of Lemma 3.2 implies P , a contradiction.

    The third lemma extends the previous lemma to show that indifference is maintainedwhen I and we mix those lotteries with the same weight on any third lottery. Notethat the lemma is stated for all nonnegative scalars , rather than between zeroand one, as long as the resulting mixtures are indeed lotteries.

    Lemma 3.4 If I , then for all and all 0 with + (1 ) and + (1 ) , we have

    ( + (1 ) )I ( + (1 ) ).

    8

  • 7/30/2019 Uncertainty Notes 2

    9/39

    Figure 2: Proof of Lemma 3.4

    Proof. Assume I . If I , then the result follows from Lemma 3.3, so assumenot I . Then P or P , and as the cases are analogous, assume without loss of generality that P . Dene = + (1 ) and = + (1 ) , and suppose

    not I . Without loss of generality, suppose P . First, consider the > 1 case.Letting = 1 , we have =

    + (1 ) and = + (1 ) . But thenLemma 3.2 implies P , contradicting I . Second, consider < 1. By Lemma3.3, P . Since IP P , the Intermediate Values axiom implies that there is alottery between and such that I . Formally, there exists (, 1) suchthat = + (1 ) . Now translate the line between and so that it runsbetween and the lottery depicted in Figure 2. Formally, we let = 1 and dene = + (1 ) . By the rst part of the lemma, since I and = + (1 ) ,we must have I . Finally, note that

    =1

    + 1 1 =

    + 1

    ,

    and since (0, 1), Lemma 3.2 implies that P , which contradicts II .

    To complete the sketch of the proof, we focus on the simple case of three basic alterna-tives, and we assume that the degenerate lotteries corresponding to these alternativesare strictly ranked: 1P 2P 3. By the Intermediate Values axiom, we may choosean (0, 1) and dene = 1(1 ) 3 such that I 2. By Lemma 3.4, the deci-sion makers indifference curve between 2 and is a straight line, and her otherindifference curves will be straight lines parallel to this. See Figure 3. Moreover, in-difference curves closer to the preferred outcome x

    1correspond to preferred lotteries.

    For example, let be on the 1 side of the indifference curve through 2, and suppose 2R. Letting lie on this indifference curve and satisfy = + (1 ) 3 forsome (0, 1) and using Lemma 3.2, we then have P I 2R, which implies P ,a contradiction.

    Next, we exhibit a linear utility function over the set of lotteries that represents thepreferences depicted in Figure 3. Dene the linear function U () = , where the

    9

  • 7/30/2019 Uncertainty Notes 2

    10/39

    = (1 , , 0)

    2

    = 1 + (1 ) 3

    1

    3

    Figure 3: Parallel indifference curves

    xed gradient is = (1 , , 0). (Technically speaking, the gradient depicted in thegure is the projection of onto the unit simplex.) In particular,

    U ( 2) = U (0, 1, 0) =

    and

    U () = U ( 1 + (1 ) 3) = U (, 0, 1 ) = .

    That is, U takes the same value at 2 and , so the line connecting these two lotterieslies on a level set of U . To verify that U represents the decision makers preferencesover lotteries, we need only check that it takes higher values on line segments closerto 1, but this is immediate; it follows because U ( 1) = 1 > 0 = U ( 3). We conclude

    that u(a1) = 1 , u(a2) = , u (a3) = 0

    is a von Neumann-Morgenstern representation for the decision maker, as required.

    To see the upshot of Theorem 3.1, note that the set of lotteries is an innite (in fact,uncountable) set, and so a decision makers ordering of lotteries could be conceivablyvery complex. But according to the theorem, the comparison of two lotteries reducesto two simple calculations of expected utility: the lottery with the greater expectedutility is preferred. For example, suppose that there are three outcomes, x,y,z , anda decision maker has the von Neumann-Morgenstern representation u below.

    x y z u 1 .6 0

    Then the expected utility of the lottery = ( .3, .4, .3) is just U () = ( .3)(1)+( .4)(.6)+(.3)(0) = .54, and the expected utility of = ( .4, .2, .4) is U () = ( .4)(1) + ( .2)(.6) +(.4)(0) = .52, and we see that she prefers to .

    10

  • 7/30/2019 Uncertainty Notes 2

    11/39

    4 Uniqueness of the vN-M RepresentationSince expected utility U , from Theorem 3.1, is an ordinal utility representation of the decision makers preferences over lotteries, it follows that any strictly increasingtransformation of U is an equally valid representation. In particular, a corollary of the theorem is that if preferences satisfy Independence and Intermediate values, thenthere is a mapping v : X R such that for all , ,

    P m

    j =1

    v(x j ) j >m

    j =1

    v(x j ) j ,

    giving us a multiplicative representation rather than an additive one. The proof istrivial: under the two axioms, Theorem 3.1 yields a von Neumann-Morgenstern utilityu satisfying (1); since the exponential function is strictly increasing, it follows thatthe function V : R dened by

    V () = eU ( ) =m

    j

    (eu (x j )) j

    is an ordinal utility representation; and then we can dene v by v(x j ) = eu (x j ) tofulll the claim.

    We cannot, however, take arbitrary increasing transformations of the von Neumann-Morgenstern representation u without altering the preferences of the decision maker.While u is a function dened on elements of X , it is used to represent preferencesover lotteries on X . Of course, it determines preferences over X as well, because wecan identify outcome x j with the degenerate lottery j . It then follows from (1) thatx j is weakly preferred to xk if and only if u(x j ) u(xk ). If you take an arbitrarymonotonic transformation of u, you will not change the implied ordering of X , butyou may change the decision makers preferences over lotteries.

    To see this, suppose X contains three alternatives, and consider the von Neumann-Morgenstern utility function u and two monotonic transformations of it: 2 u + 3 andu3.

    u 2u + 3 u3

    x 1 5 1

    y .6 4.2 .2z 0 3 0Obviously, neither of these transformations of u has affected the implied ordering of X , but consider the decision makers preference between y for sure, i.e., the lottery(0, 1, 0), and the even chance lottery between x and z , i.e., (.5, 0, .5). Under u and2u + 3, the former is preferred to the latter, but this preference is reversed for u3!

    11

  • 7/30/2019 Uncertainty Notes 2

    12/39

    The lesson is that if we are using a von Neumann-Morgenstern representation functionto represent preferences over lotteries, then we cannot subject it to arbitrary mono-tonic transformations. Transformations like 2 u + 3 (called affine transformations)are okay, but not others. The next result formalizes these observations.

    Theorem 4.1 Let u : X R be a von Neumann-Morgenstern representation. Then v : X R is also a von Neumann-Morgenstern representation if and only if there exist constants > 0 and such that v = u + , i.e., for all x X , v(x) = u (x)+ .

    Proof. If u is a von Neumann-Morgenstern representation, if > 0, and if v = u + ,then it is easy to check that v also satises (1) for all and . For the other direction,suppose u and v satisfy (1) for all and but there do not exist > 0 and suchthat v = u + . In particular, note that u and v are not constant on X . Then theremust exist x ,y,z ,w X such that

    u(x) u(y)u(z ) u(w) = v(x) v(y)v(z ) v(w) . (2)

    Otherwise, equality must hold instead, i.e.,

    v(x) v(y)u(x) u(y)

    =v(z ) v(w)u(z ) u(w)

    ,

    for all x ,y,z ,w X . Choose c, d X arbitrarily such that u(c) = u(d), and dene = ( v(c) v(d))/ (u(c) u(d)). Note that a > 0. (Why?) Then

    v(x) v(y) = (u(x) u(y))

    for all x, y X . Now choose b X arbitrarily, dene = v(b) u (b), and note thatv(x) = u (x) +

    for all x X , a contradiction. Therefore, we may choose a,b,c,d X such that ( 2)holds. I claim a consequence is that u and v do not both satisfy ( 1) for all lotteries and . To see this, let = (1 /m, . . . , 1/m ) denote the even chance lottery over allalternatives, and dene as follows. For > 0 small, take probability from b andtransfer it to a, and take probability

    u(a) u(b)u(c) u(d)

    from c and transfer it to d. Leave the probabilities of all other alternatives the same.Then

    m

    j =1

    j u(x j ) m

    j =1

    j u(x j )

    = u (a) + u (b) +u(a) u(b)u(c) u(d)

    u(c) u(a) u(b)u(c) u(d)

    u(d),

    12

  • 7/30/2019 Uncertainty Notes 2

    13/39

    and then you can check that ( 2) impliesm

    j =1

    j v(x j ) m

    j =1

    j v(x j ) = 0 .

    Thus, given and as above, u and v cannot both satisfy ( 1), a nal contradictionthat completes the proof.

    Lets consider some implications of Theorem 4.1. First, suppose u and v are bothvon Neumann-Morgenstern representation with v = u + , take any a,b,c,d X ,and suppose

    u(a) u(b) > u (c) u(d).

    Then(u (a) + u (b) ) > (u (c) + u (d) ),

    which is equivalent to v(a) v(b) > v (c) v(d).

    Therefore, unlike an ordinal utility function, u does give us a way of comparingmagnitudes of utility differences: if the increase in utility going from b to a is largerfor u than the increase going from d to c, then it is larger for all von Neumann-Morgenstern representations. We can actually say something more: given any twopairs of alternatives, the ratio of utility differences

    u(a) u(b)u(c) u(d)

    is the same for all von Neumann-Morgenstern representations. You should prove thisfor yourself and show that this implies the rst claim.

    Does this give us an unambiguous measure of how many times bigger the increase inwelfare going from b to a is compared to the increase going from d to c? Well, sortof. We just have to be careful about how we interpret the term welfare here. Tobe precise, it tells us about the decision makers willingness to trade off probabilitieson certain outcomes. Suppose, for the sake of argument, that u(a) > u (b) andu(c) > u (d). If we take probability from a and put it on b, we have to move

    u(a) u(b)

    u(c) u(d)times more probability from d to c to compensate the decision maker. Indeed, supposewe move probability from a to b and move from d to c in order to maintain theequality

    ((a) )u(a) + ( (b) + )u(b) + ( (c) + )u(c) + ( (d) )u(d)= (a)u(a) + (b)u(b) + (c)u(c) + (d)u(d),

    13

  • 7/30/2019 Uncertainty Notes 2

    14/39

    or equivalently,

    =u(a) u(b)u(c) u(d)

    ,

    as claimed.

    From a more practical perspective, Theorem 4.1 allows us to pin down the valuesof a von Neumann-Morgenstern representation at any two alternatives. In the con-struction in the proof of Theorem 3.1, we provided a u that attained a maximumvalue of one and a minimum value of zero. In fact, this can be done as long as thedecision maker is not completely different over all lotteries; and we can choose anytwo outcomes, say x and y, with x P y , and we can specify that the numbers assignedto x and y are any numbers a and b (respectively) as long as a > b. To see how this ispossible, suppose we have one von Neumann-Morgenstern representation u, and weseek a positive affine transformation of u, say v, such that v(x) = a and v(y) = b.

    We then need to solve two equations in two unknowns,u (x) + = au (y) + = b,

    with unique solution

    =a b

    u(x) u(y)and =

    bu(x) au(y)u(x) u(y)

    .

    In particular, if we want to specify that the number assigned to the best degenerate

    lottery is one and the number assigned to the worst degenerate lottery is zero, wecan obtain the von Neumann-Morgenstern representation v from u satisfying theseconditions as follows:

    v(x j ) =1

    maxh =1 ,...,m u(xh ) minh =1 ,...,m u(xh )u(x j ) min

    h =1 ,...,mu(xh ) .

    This gives us an equally valid representation of the decision makers preferences, andit attains a maximum value of one and minimum of zero.

    Clearly, u is different from the notion of an ordinal utility function. A term that is alsosometimes used for u to make this distinction clear is cardinal utility function. This

    jargon is used in other ways, however, so we adhere to the terminology of expectedutility representation or von Neumann-Morgenstern utility.

    5 Choice Under UncertaintyWe can impose more structure on the model to analyze the decision makers problem.Assume there are possible states of the world, S = {s1, . . . , s }, which summarize

    14

  • 7/30/2019 Uncertainty Notes 2

    15/39

    all details of the world that are relevant to the decision maker but unknown to herat the time she makes a choice. The decision makers beliefs about the state of theworld are represented by a probability distribution = ( 1, . . . , ), where j is theprobability of state s j . The decision maker makes one of k choices, C = {c1, . . . , ck}and cares both about the choice made and state of the world realized. To map thismodel into the previous framework, an outcome is a pair x = ( ci , s j ), and the set of outcomes is

    X = C S = {c1, . . . , ck} { s1, . . . , s }.

    The decision maker has an ordering on the set of lotteries over X satisfying Indepen-dence and Intermediate Values axioms, and by Theorem 3.1, we may represent thosepreferences succinctly by a von Neumann-Morgenstern representation u dened onX .

    Each choice ci determines the following lottery on X : alternative ( ch , s j ) is realizedwith probability

    Pr( ch , s j ) = j if h = i,0 else.

    That is, the decision makers choice determines the rst component of a = ( c, s), whileuncertainty about the second component (the state of the world) is summarized bythe probability distribution . Then the decision makers expected utility choice of ci determines a lottery (ci |) over outcomes and thereby her expected utility, as in

    ci (ci |)

    j =1 j u(ci , s j ).

    To simplify notation, we write this expected utility as

    U (ci |) =

    j =1

    j u(ci , s j ),

    where we modify notation slightly to bring out the dependence of U on the choiceand the beliefs about the state of the world. When is degenerate on a single states j , we write the expected utility as U (ci |s j ).

    An optimal choice ci is one that maximizes the decision makers expected utility givenher beliefs about the state of the world:

    U (ci |) = max j =1 ,...,k

    U (c j |).

    Clearly, there will always be a choice ci that solves this problem. And if there ismore than one choice that solves the above problem, all such optimal choices must

    15

  • 7/30/2019 Uncertainty Notes 2

    16/39

    generate the same expected utility. There could be two optimal choices, say ci andch , and then it does not matter which the decision maker makes; in fact, she coulddecide between them randomly without affecting her expected utility.

    There may be some choices that are optimal for no beliefs; that is, some ci may besuch that for every , there exists ch with U (ch |) > U (ci |). The statement of thiscondition allows ch to vary with ; for different beliefs, it may be that ci is inferiorto different choices. This is immaterial, for it is still the case that ci would neverbe optimal. A much stronger condition is that there is a single choice that generateshigher expected utility than ch in every state of the world. We say ch strictly dominates ci if ch provides a strictly higher utility than ci in every state of the world: for alls j , u(ch , s j ) > u (ci , s j ). Then it is obvious that ci is never optimal, and it is actuallythe case that a single choice is superior to ci in every state of the world. Thus, if ciis strictly dominated by some other choice, an implication is that it is never optimalfor any beliefs, but strict dominance is much stronger than sub-optimality.

    In the following example, there are three states and four possible choices. Here, choicec3 strictly dominates c4, but no other choice strictly dominates any other.

    s1 s2c1 4 0c2 0 4c3 1 1c4 0 0

    Of course, c4 can never be optimal. In contrast, c1 is optimal if 1 12 , and c2 isoptimal if 2 12 . Note, however, that while c3 is not strictly dominated, it (like c4)can never be optimal. Indeed, if 1 12 , then U (c1 |) 2 > 1 = U (c3 |), and if 2 12 , then U (c2 |) 2 > 1 = U (c3 |).

    The above example is depicted in Figure 4, where the horizontal axis representsvon Neumann-Morgenstern number in state s1, and the vertical corresponds to s2.Each black dot represents the pair ( u(ci , s1), u(ci , s2)) of numbers corresponding toa particular choice in the two states. Since is a vector with dimensionality (inthis case = 2), we can place the decision makers beliefs in this gure emanatingfrom the origin. If a choice generates utilities ( u1, u2), then the expected payoff fromthat choice given is 1u1 + 2u2, which is a linear function of utility pairs, and sowe can depict the decision makers expected utility by the level sets of this linearfunction. In the gure, is such that 2 12 , and the choice c2 generates the highestexpected utility. If we imagine varying the beliefs to range from probability one ons1 to probability one on s2, then at rst, c1 will be the unique optimal choice; at1 = 2 = 12 , both c1 and c2 will be optimal; and then c2 will be the unique optimalchoice, as depicted.

    To give a complete characterization of optimality, we must consider the possibility

    16

  • 7/30/2019 Uncertainty Notes 2

    17/39

    utility

    utilityfrom s2

    from s1

    U

    Figure 4: Optimal choices

    that the decision maker randomizes over choices. A mixed choice, denoted =( 1, . . . , k ), is a probability distribution over choices c1, . . . , ck . The interpretation isthat ci is chosen with probability i . We view each pure choice ci as a special caseof mixed choice in which is degenerate on ci . The expected utility from a mixedchoice is

    U ( |) =k

    i=1

    iU (ci |).

    Now a mixed choice is optimal if it maximizes the decision makers expected utilityover all mixed choices, i.e.,

    U ( |) = max

    U ( |).

    We extend the notion of strict dominance in the straightforward way: say strictly dominates ci if for all s j , U ( |s j ) > U (ci |s j ).

    In the original formulation of the decision makers problem, we have already notedthat there will always be a choice that maximizes the decision makers expected utility,and it turns out that any optimal pure choice will remain optimal when mixing is

    allowed: mixing does not increase the highest achievable expected utility. In fact, thefollowing theorem establishes that a mixed choice is optimal if and only if it placesprobability one on choices that are optimal in the original problem.

    Theorem 5.1 A mixed choice is optimal if and only if for all i = 1 , . . . , k ,

    i > 0 U (ci |) = maxh =1 ,...,k

    U (ch |).

    17

  • 7/30/2019 Uncertainty Notes 2

    18/39

    Proof. First, assume is optimal among mixed choices, and suppose that there isa choice ci with i that is not optimal among pure choices, i.e., there exists ch withU (ch |) > U (ci |). Let be the mixed choice dened to be equal to except thatwe transfer probability i to choice ch :

    j = i + h if j = h0 if j = i j else.

    Then we have

    U ( |) =k

    j =1

    j U (c j |) >k

    j =1

    j U (c j |) = U ( |),

    contradicting optimality of . Next, assume that puts probability one on choicesthat are optimal among pure choices, and suppose there exists such that U ( |) >U ( |). Then there must exist h, i = 1 , . . . , k such that h , i > 0 and U (ch |) >U (ci |), but this contradicts the assumption that i > 0 implies ci is optimal amongpure choices.

    Despite the foregoing result, it may be that a (non-optimal) pure choice is inferiorto a mixed choice. Perhaps surprisingly, it may be that a pure choice is not strictlydominated by any other pure choice, but it is strictly dominated by a mixed choice.Returning to the above example, note that the mixed choice = ( 12 ,

    12 , 0) that places

    equal probability on choices c1 and c2 generates an expected utility of two in eachof the two states. Therefore, strictly dominates c3. In Figure 4, this mixed choicecorresponds to the midpoint on the line connecting the pairs ( u(c

    1, s

    1), u(c

    1, s

    2)) and

    (u(c2, s1), u(c2, s2)). In general, a mixed choice corresponds to a convex combinationof payoff vectors corresponding to pure choices, and the set of pairs attainable througha mixed choice is the convex hull of the set

    {(u(ci , s1), . . . , u (ci , s )) | i = 1 , . . . , ck },

    the gray region in the gure.

    It is no coincidence that c3 is never optimal for any beliefs: if a choiceci is strictlydominated by a mixed choice , then it cannot be optimal. Indeed, consider anybeliefs . By strict dominance, we have U ( |s j ) > U (ci |s j ) for all states, and thereforeU ( |) > U (ci |). Rewriting this, we have

    j =1

    j U (c j |) > U (ci |), (3)

    so at least one expected utility on the left is strictly greater than U (ci |), i.e.,U (c j |) > U (ci |) for some c j , so ci is not optimal.

    18

  • 7/30/2019 Uncertainty Notes 2

    19/39

    utility

    utilityfrom s2

    from s1

    V u i

    U

    p

    Figure 5: Proof of Theorem 5.2

    The next result provides a complete characterization of sub-optimality, showing thatthe opposite direction of the above claim holds: a choice is optimal for no beliefs if and only if it is strictly dominated by some mixed choice. Interestingly, the proof isgeometric, relying on the separating hyperplane theorem.

    Theorem 5.2 The following conditions on a choice ci are equivalent:

    (a) for every belief , there exists c j such that U (c j |) > U (ci |),

    (b) there is a mixed choice that strictly dominates ci .

    Proof. We have already argued that (b) implies (a). Assume (a) holds. To everychoice ch , associate a vector uh = ( u(ch |s1), . . . , u (ch |s)) R of utilities correspond-ing to the different states of the world. Note that for every , U (ch |) = uh .Furthermore, the vector of utilities associated with the mixed choice is the convexcombination kh =1 h u

    h . Let

    U = conv {u1, . . . , u k}

    denote the vectors of utilities generated by all mixed choices. LetV = {v R m | v u i }

    denote the set of vectors that are strictly greater than u i in each coordinate. SeeFigure 5 for the case of two states. Suppose in order to deduce a contradiction thatthere is no mixed choice that strictly dominates ci ; in other words, U V = .Both sets U and V are convex. By the separating hyperplane theorem, there exists

    19

  • 7/30/2019 Uncertainty Notes 2

    20/39

    p R m \ { 0} that weakly separates U and V , i.e., for all u U and all v V , wehave

    p u p ui p v,

    as in the gure. I claim that p 0, i.e., p j 0 for all j = 1 , . . . , . Suppose, instead,

    that p j < 0 for some coordinate. But then

    u() = ( u i1, . . . , ui j 1, u

    i j + , u

    i j +1 , . . . , u

    i) V,

    and we could choose sufficiently large that p v < p ui . Therefore, p 0, and j =1 p j > 0, so we can dene

    =1

    j =1 p j

    p. (4)

    Then the inequality u ui continue to hold for all u U . But note that is a

    probability distribution over states, and ci maximizes the decision makers expectedutility over all choices, contradicting (a).

    We can extend the analysis: say weakly dominates ci if U ( |s j ) U (ci |s j ) for all j , with strict inequality for at least one state. A belief is totally mixed if it placespositive probability on each state, i.e., j > 0 for all j . The next result establishesthat a choice is optimal for some totally mixed belief if and only if it is not weaklydominated by a mixed choice.

    Theorem 5.3 The following conditions on a choice ci are equivalent:

    (a) for every totally mixed belief , there exists c j such that U (c j |) > U (ci |),

    (b) there is a mixed choice that weakly dominates ci .

    The proof that (b) implies (a) is similar to the above, now using the assumption thateach j is positive to obtain the strict inequality in ( 3). The proof that (a) implies(b) follows the proof for Theorem 5.2; we need only modify the denition of V as theset of vectors that weakly dominate u i , and we show that p 0. Details are omitted.

    6 The One-dimensional Choice ModelIn the one-dimensional choice model, we assume the set of outcomes is a subset of thereal line: X R . In contrast to the original setup, X may now be innite. Theorems3.1 and 4.1 remain true, the main complication being how we should dene the notionof a lottery to model uncertainty on the part of the decision maker. It is still possiblethat only a nite number of outcomes are possible, and then a lottery is a probability

    20

  • 7/30/2019 Uncertainty Notes 2

    21/39

    distribution over that set and simply assigns non-negative numbers to those outcomesin a way that sums to one. But there are other distributions, such as the normal oruniform, that cannot be represented in this way. Such a distribution is dened by adensity function, say f : R R , that is non-negative and integrates to one. Thenthe probability that any single outcome is realized is zero, and the probability thatan outcome in the interval [ a, b] is realized is equal to the integral

    ba f (x)dx.

    A lottery that puts probability on a nite set {x1, x2, . . . , x m } of outcomes is calledsimple, and in this case we let (xi ) be the probability that xi is realized. A lotterythat is represented by a density is called continuous. In either case, because outcomesare ordered in the real line, we can dene the probability, F (x), that the realizationfrom is less than or equal to x. Formally, F is the distribution of , dened as .When is simple, it is dened as

    F (x) = j :x j x

    (x j ) or F (x) =

    x

    f (x)dx,

    when is simple or continuous, respectively. In the former case, the distribution F may be any weakly increasing step function with a nite number of steps satisfyingminx F (x) = 0 and max x F (x) = 1; and in the latter case, F may be any continuous,weakly increasing function satisfying inf x F (x) = 0 and sup x F (x) = 1. We usuallydistinguish the distribution of a lottery , distinct from , with the notation G,distinct from F .

    There are lotteries that are neither simple nor continuous (for example, ip a coin,draw from the normal distribution if it comes up heads, and draw from a simple

    lottery if it comes up tails). The most general form of lottery is called a probabilitymeasure, the subject of a branch of mathematics known as measure theory, andwe will restrict attention to simple and continuous lotteries.

    In addition, we can now dene two important statistics of a lottery . The rst isthe expected value, or mean, of the lottery, which is dened as

    E [] =m

    j =1

    x j (x j ) or E [] =

    xf (x)dx,

    when is simple or continuous, respectively. The second is the variance of the lottery,

    which is dened as

    V [] =m

    j =1

    (x j E [])2(x j ),

    or

    V [] =

    (x E [])2f (x)dx,

    21

  • 7/30/2019 Uncertainty Notes 2

    22/39

    when is simple or continuous, respectively.

    The axiomatic result of Theorem 3.1 carries over to this more general setting: as-suming Intermediate Values and Independence, a decision makers preferences overlotteries can be represented by a von Neumann-Morgenstern representation u : X Rso that one lottery is preferred to another if and only if it generates higher expectedutility. The way we dene expected utility now depends on the nature of the lottery.If is simple, it is

    U () =m

    j =1

    (x j )u(x j ),

    exactly analogous to the original setup; and if is continuous, it is

    U () =

    u(x)f (x)dx,

    now an integral rather than a weighted sum.

    As an example of a choice problem that involves lotteries over monetary outcomes,consider a consumers decision about whether to purchase a lottery ticket with thefollowing properties: the ticket costs c dollars, the probability of winning is p, andit pays b dollars if it wins. Let the outcome x of the lottery denote the change inthe consumers income. We identify the ticket with the lottery that yields x = cwith probability 1 p and x = b c with probability p. So the expected change frombuying ticket is

    E [] = p(b c) + (1 p)( c) = pb c.We say the lottery is fair if the cost equals expected winnings, c = pb, or equivalentlyE [] = 0. Of course, not buying the ticket corresponds to the degenerate lottery 0with probability one on zero (no change). Should the consumer purchase the lotteryticket? That depends on which lottery he or she prefers. Assume is fair, andsuppose the consumers von Neumann-Morgenstern representation is u(x) = x. ThenU () = E [] = 0, so the expected utility of buying the lottery ticket is zero, which isthe same as the expected utility of not buying, so the consumer is indifferent. If theprobability of winning the lottery were higher or lower, the consumer would have acorresponding strict preference for buying or not buying the ticket, respectively.

    Does this mean that consumers cannot have a strict preference to buy a fair lotteryticket? No, it depends on the consumers preferences over lotteries. Assume, for thesake of example, that winning the lottery more than pays for the ticket, i.e., b > c,and note that pb = c for a fair lottery. If

    u(x) = x if x < 02x if x 0,

    22

  • 7/30/2019 Uncertainty Notes 2

    23/39

    then the consumer would have a strictly higher expected utility,

    U () = pu(b c) + (1 p)u( c) = 2 p(b c) (1 p)c= p(b c) + pb c = p(b c) > 0,

    from buying the lottery ticket than not buying. If

    u(x) = 2x if x < 0x if x 0,

    then the consumer would have a strictly lower expected utility from buying the lotteryticket. Thus, the consumers decision depends on his or her risk preferences.

    In the foregoing, we have considered the case of a fair lottery ticket, but when theticket is better or worse than fair, the question of whether the consumer should buythe ticket depends on his preferences. A consumer who strictly prefers to buy a fairlottery ticket than buy no ticket will also prefer to buy it if the lottery is slightlyworse than fair, i.e., pb is less than but close to c; and a consumer who strictly prefersnot to buy the ticket will also prefer not to buy if the lottery is only slightly betterthan fair.

    Letting E [ ] denote the degenerate lottery with probability one on the mean of , wesay the decision maker is risk averse if for all non-degenerate lotteries on X , wehave E [ ]P . We say the decision maker is risk loving if for all non-degenerate ,we have P E [ ]. And we say the decision maker is risk neutral if for all , we have E [ ]I .

    The next result characterizes the decision makers attitude toward risk in terms of the curvature of the von Neumann-Morgenstern representation. We say u : X Ris affine linear if there exist , such that for all x X , u(x) = x + . Infact, part 1 of Theorem 6.1 is a version of Jensens inequality, which states thatif a function u is concave, then the expected value of the function with respect to alottery is less than or equal to the function evaluated at the mean of :

    u(E []) m

    j =1

    (x j )u(x j ) or u(E []) u(x)f (x)dx,depending on whether is simple or continuous. The proof focuses on part 1, as riskaversion is the more common assumption in the modeling literature.

    Theorem 6.1 Assume X R is convex, and let u be any von Neumann-Morgenstern representation.

    1. A decision maker is risk averse if and only if u is strictly concave.

    23

  • 7/30/2019 Uncertainty Notes 2

    24/39

    2. A decision maker is risk loving if and only if u is strictly convex.

    3. A decision maker is risk neutral if and only if u is affine linear.

    Proof. To prove the left-to-right direction of part 1, assume risk aversion, andconsider distinct x, y X and (0, 1). Dene the lottery so that (x) = and(y) = 1 , and note that E [] = x + (1 )y. Then

    u(x + (1 )y) = U ( E [ ]) > U () = u (x) + (1 )u(y),

    as desired. Next, we prove the left-to-right direction of part 3. Assume u is affinelinear, and consider any lottery ; for simplicity, suppose is simple. Then

    U ( ) =m

    i=1

    (x i )u (x i ) =m

    i=1

    (x i )(x i + ) = m

    i=1

    (x i )x i + = u (E [ ]) = U ( E [ ]),

    as desired. We now turn back to the right-to-left direction of part 1. Dene theepigraph of u as the subset {(x, y) R 2 | y u(x)} of the Cartesian plane belowthe graph of u, as depicted in This in turn follows from the separating hyperplanetheorem, as depicted in Figure 6. Since u is concave, its epigraph is convex, andthe ordered pair ( E [], u(E [])) lies on the boundary of this set. By the separatinghyperplane theorem, there is a linear function that weakly separates ( E [], u(E []))from this set. Let have gradient p = ( p1, p2). Thus, given ( x, y) such that y u(x),

    p1x + p2y p1E [] + p2u(E []).

    Furthermore, p2 = 0, 1 and it follows that the level set of through ( E [], u(E [])) isthe graph of an affine linear function h : R R ; specically,

    h(x) =p1E [] + p2u(E [])

    p2+

    p1 p2

    x.

    Then, using linearity of h and the fact that u h, we have

    u(E []) = h(E []) = hm

    j =1

    (x j )x j =m

    j =1

    (x j )h(x j ) m

    j =1

    (x j )u(x j ),

    delivering Jensens inequality. When u is strictly concave, the vector ( E [], u(E []))uniquely maximizes over the epigraph of u, so if is non-degenerate, we obtain astrict inequality in the above displayed expression, delivering part 1.

    The usual assumption in applications of expected utility to the one-dimensional modelis that of risk aversion. This is largely due to introspection: for most of us, a bird

    1 Technical point: it may actually be that p2 = 0, but because u is concave, this is only possibleif E [ ] is an extreme point of X , so must then be degenerate, and the result follows trivially.

    24

  • 7/30/2019 Uncertainty Notes 2

    25/39

    E [x]

    p

    u(E [x])

    u

    h

    Figure 6: Jensens Inequality

    in the hand is worth two in the bush. To see why risk neutrality, for example, is notalways compelling, suppose X = R + is money, suppose a decision maker has a vonNeumann-Morgenstern representation u(x) = x, and consider the following lottery:we ip a coin until it comes up heads, and, if that takes t tosses, then we give thedecision maker x = 2 t dollars.2

    H TH TTH TTTH . . .prize 2 4 8 16 . . .prob 1/2 1/4 1/8 1/16 . . .

    What is the most the decision maker would be willing to pay for this lottery? In otherwords, what is the certainty equivalent of this lottery? Because the decision maker isrisk neutral, it is equal to E [], which is innite! Since most of us wouldnt be willingto pay an arbitrarily high amount for this lottery, were not risk neutral or risk loving.Though its not really a paradox, this example is called the St. Petersburg Paradox.In the sequel, we explore the one-dimensional model in more detail.

    7 Measures of Risk Aversion

    It is typical to assume X R is a convex subset of the real line and a continuous vonNeumann-Morgenstern representation u. Then, by the intermediate value theorem,for every lottery on X there is an outcome x X such that u(x) = U (). Infact, we will assume that u is twice continuously differentiable. In many applications,alternatives represent a good (such as money) for the decision maker, so we now

    2 Admittedly, this is not a simple lottery, because it puts positive probability on an innite numberof outcomes, but it makes a nice point.

    25

  • 7/30/2019 Uncertainty Notes 2

    26/39

    utility

    x1 x2E []C a []

    u(x2)

    u(x1)

    u(E [])E [u(x)]

    X

    Ra []

    u

    Figure 7: Risk aversion

    assume that X = R + and u is strictly increasing as well. Then there is exactly onex X such that u(x) = U (). We call this the (absolute) certainty equivalent of ,denoted C a [], which is implicitly dened by u(C a []) = U (). If the decision maker

    is risk averse, then it follows that C a[] E []. (Why?) We refer to the differenceRa [] = E [] C a [] as the (absolute) risk premium of . Obviously, if the decision

    maker is risk neutral, then C a [] = E [], and the risk premium is zero. See Figure 7.

    We can dene relative versions of these quantities, as well. Given any lottery , denethe relative certainty equivalent of , denoted C r [], implicitly by u(C r []E []) =U (). If the decision maker is risk averse, it follows that C r [] 1. (Right?) Thenthe fraction R r [] = 1 C r [] is the relative risk premium of . Obviously, if thedecision maker is risk neutral, then C r [] = 1 and the relative risk premium is zero.

    So far, we have the necessary conceptual apparatus to categorize a decision maker asrisk averse or not, but we have not attempted to measure the level of risk aversion.One approach would be to compare the risk aversion of two decision makers by com-paring their respective certainty equivalents. But certainty equivalents are denedwith respect to a given lottery, so this would give us a measure of risk aversion withrespect to a specic . To develop a measure that is independent of any particularlottery, we consider risk preferences over local lotteries at a given initial monetary

    26

  • 7/30/2019 Uncertainty Notes 2

    27/39

    level. Formally, our measure of the absolute risk aversion of u at x is

    u (x)u (x)

    .

    This provides a local measure of risk aversion at a monetary level x. It may be thatu is more risk averse than some other u around x but less so around y. Note that riskaversion essentially implies u (x) < 0, so the above measure is positive, with highermeasures indicating less acceptance of risk. Note also that the measure of absoluterisk aversion is invariant with respect to affine transformations of u:

    D 2(u (x) + )D(u (x) + )

    = u (x)u (x)

    = u (x)u (x)

    .

    Thus, it depends only on the decision makers risk preferences, and not on any cardinalproperties of a particular von Neumann-Morgenstern representation.

    Interestingly, the absolute risk aversion of u at x is related to the risk premium of small, fair lotteries around x. Given > 0, let ( ) be the even chance lottery betweenx + and x . To simplify notation, let C a ( ) denote the certainty equivalent of ( ), and let R a ( ) denote the corresponding risk premium. Then, by denition, C a ( )satises

    u(C a ( )) =12

    u(x ) +12

    u(x + )

    for all (0, x). Differentiating this identity with respect to , we have

    u (C a ( ))d

    dC a ( ) =

    1

    2u (x ) +

    1

    2u (x + ),

    and therefore, dd C a (0) = 0. With this in mind, we take the second derivative of the

    above identity to arrive at

    u (C a ( ))dd

    C a ( )2

    + u (C a ( ))d2

    d 2C a ( ) =

    12

    u (x ) +12

    u (x + )

    Evaluating this at = 0 and substituting into d2

    d 2 Ra (0) = d

    2

    d 2 C a (0), it follows that

    d2

    d 2Ra (0) =

    u (x)u (x)

    ,

    so the absolute risk aversion of u at x is just the second derivative of the risk premiumfor small gambles around x. This reveals the behavioral content of this measure: if the absolute risk aversion of u at x is greater than that of u, then a decision makerwhose preferences are characterized by u is willing to pay a higher premium to avoidthe risk of small gambles around x.

    To elaborate, if the absolute risk aversion of u at x exceeds that of u, then the rst

    27

  • 7/30/2019 Uncertainty Notes 2

    28/39

    X R a , morerisk

    Ra , lessrisk

    averse

    averse

    derivative of Ra ( ) at zero is increasing faster foru than it is for u; by twice continuous differen-tiability, this is true for close to zero. Since therst derivative dd R

    a (0) is equal to zero for bothutilities, it follows that the risk premium of u ex-ceeds that of u for small , as depicted to theright.

    When u is strictly concave but not increasing, i.e., it achieves a maximum on X ,we can extend the analysis. Now the certainty equivalent is not uniquely dened, asthere are two solutions to the equation in the denition of that concept. That is,there are C a [] and C a + [] such that C a [] E [] C a+ [] and

    u(C a []) = u(C a+ []) = U (),

    and two corresponding denitions of risk premium,

    Ra [] = E [] C a []Ra + [] = C a + [] E [],

    both dened to be positive numbers indicating how far (to the left or right) from themean of the decision maker would be willing to move to avoid risk. Following theabove, we can dene the risk premiums Ra ( ) and R a ( ) corresponding to small,fair gambles around a given x. Although for a given positive , these premiums maydiffer, the derivative of each at = 0 is zero, and the second derivatives are equal:

    d

    d Ra

    (0) =

    d

    d Ra +

    (0) = 0 and

    d2

    d 2 Ra

    (0) =

    d2

    d 2 Ra +

    (0).Again, the absolute risk aversion of u at x is the second derivative of Ra and R a+at = 0, with the same interpretation as before.

    Returning to the assumption that u is strictly increasing, we can dene relative risk aversion of u at x,

    xu (x)u (x)

    ,

    in terms of relative certainty equivalents. Given > 0, let ( ) be the even chancelottery between (1 + )x and (1 )x. Let C r ( ) be the relative certainty equivalentof this lottery and R r ( ) the relative risk premium. By denition,

    u(C r ( )x) =12

    u((1 )x) +12

    u((1 + )x),

    and differentiating with respect to ,

    u (C r ( )x)dd

    C r ( )x = x2

    u ((1 )x) +x2

    u ((1 + )x),

    28

  • 7/30/2019 Uncertainty Notes 2

    29/39

    F

    F

    G

    G

    Figure 8: First order stochastic dominance

    which implies dd C r (0) = 0. Taking the second derivative, we have

    u (C r ( )x)dd

    C r ( )x2

    + u (C r ( )x)d2

    d 2C r ( ) =

    x2

    2u ((1 )x) +

    x2

    2u ((1 + )x).

    Evaluating this at = 0 and substituting into R r (0) = 1 C r (0), it follows that

    d2

    d 2R r (0) =

    xu (x)u (x)

    ,

    so the relative risk aversion of u at x is just the second derivative of the relativerisk premium for small gambles (in percentage terms) around x. Again, the anal-ysis extends to the general case of strictly concave, but possibly single-peaked, vonNeumann-Morgenstern representations.

    8 Stochastic DominanceIn the one-dimensional choice model, we are able to make some comparisons betweenlotteries based on minimal structure of the decision makers preferences. For example,under the reasonable assumption that the decision maker prefers more money toless, so u is increasing, the lottery with probability 999 , 999/ 1, 000, 000 on 1 andprobability 1 / 1, 000, 000 on 999, 999 is less desirable than the even chance lotteryon 1 and 999, 999. The difference between these lotteries is that we have movedprobability from worse outcomes to better ones, which must be an improvement forthe decision maker. Formally, we say rst order stochastically dominates if for allx R , F (x) G(x), where F is the distribution of and G is the distribution of .Note the latter inequality may be the reverse of that expected; think of probabilitybeing squeezed to the right up the real line. See Figure 8 for continuous and discretepictures of a lottery with distribution F that stochastically dominates a lottery withdistribution G in the rst order sense.

    In fact, rst order stochastic dominance is characterized by the property that it alwaysamounts to an improvement for a decision maker with monotonic preferences.

    29

  • 7/30/2019 Uncertainty Notes 2

    30/39

    Theorem 8.1 Assume X = R , and let and be any two lotteries on R with bounded support. Then rst order stochastically dominates if and only if for every weakly increasing von Neumann-Morgenstern utility representation u, we have U () U ().

    Proof. First, assume rst order stochastically dominates , and let u be weakly in-creasing. For simplicity, let have distribution F and density f , while has distribu-tion G and density g, and let u be differentiable. We need to show

    u(x)f (x)dx

    u(x)g(x)dx, or equivalently,

    u(x)(f (x) g(x))dx 0.

    Using integration by parts, this is equivalent to

    limx u(x)(F (x) G(x)) limx u(x)(F (x) G(x))

    u

    (x)(F (x) G(x))dx,

    and because the lotteries have bounded support, the lefthand side is just zero. Andfor each x, u (x) 0 and F (x) G(x) 0, so the right-hand side is non-positive,as required. Now assume that for all weakly increasing functions u, the inequality

    u(x)f (x)dx

    u(x)g(x)dx holds. Consider any x R . To show F (x) G(x), dene u as

    u(y) = 0 if y x1 else.

    Then1 F (x) =

    u(y)f (y)dy

    u(y)g(y)dy = 1 G(x),

    which implies F (x) G(x), completing the proof.

    Note that one direction of Theorem 8.1 is that if U () U () for every weaklyincreasing von Neumman-Morgenstern utility representation u, then rst orderstochastically dominates . In fact, a stronger result is true: if U () U () for everycontinuous, strictly increasing von Neumman-Morgenstern utility representation u,then rst order stochastically dominates . To prove this stronger result, we wouldnot specify u as the step function in the second part of the proof of Theorem 8.1.Instead, we would suppose there is some x such that = F (x) G(x) > 0, andwe would choose a continuous, strictly increasing u that is close enough to the stepfunction so that

    1 F (x) +2

    >

    u(y)f (y)dy

    u(y)g(y)dy > 1 G(x)

    2,

    which implies F (x) G(x) < , a contradiction.

    30

  • 7/30/2019 Uncertainty Notes 2

    31/39

    F

    F

    G

    G

    Figure 9: Second order stochastic dominance

    Thus, in general, if u is weakly increasing, then a rst order stochastic increase in leads to a higher expected value of u. This remains true if u is also concave. Butwith the added structure of concavity, it may be that the expected value of u isincreased by a larger class of transformations of . Returning to the lottery example,suppose that a lottery ticket costs only 50 cents and pays off 500 , 000 dollars, i.e., thelottery ticket yields x = .5 with probability 999 , 999/ 1, 000, 000 and x = 499, 999.5with probability 1 / 1, 000, 000. The expected payoff of this lottery is still zero, butcompared to the original example, the downside and the upside of the lottery arereduced by half; the mean of the lottery is unchanged, but the risk is reduced. Thisshould be preferable for a risk averse decision maker.

    In fact, this transformation is an example of another type of stochastic dominance.We say second order stochastically dominates if for all x R ,

    x F (y)dy

    x

    G(y)dy, where F and G are, respectively, the distributions of and . Obviously,

    rst order stochastic dominance implies second order stochastic dominance, and theconverse does not hold generally, i.e., there are strictly more lotteries that dominate in the second order sense than in the rst order sense. See Figure 9 for continuousand discrete pictures of second order stochastic dominance. In fact, second orderstochastic dominance is characterized by the property that it always increases theexpected utility of a risk averse decision maker with monotonic preferences.

    Theorem 8.2 Assume X = R , and let and be any two lotteries on R with bounded support. Then second order stochastically dominates if and only if for every weakly increasing and concave von Neumann-Morgenstern utility representation

    u, we have U () U ().Proof. First, assume second order stochastically dominates , and let u be weaklyincreasing and concave. For simplicity, let have distribution F and density f , while has distribution G and density g, and let u be twice differentiable. As in the proof of Theorem 8.1, we need to show

    0

    u (x)(F (x) G(x))dx.

    31

  • 7/30/2019 Uncertainty Notes 2

    32/39

    Using integration by parts again, the righthand side is equal to

    limx

    [u (x) x

    (F (y) G(y))dy] lim

    x [u (x)

    x

    (F (y) G(y))dy]

    u

    (x) x

    (F (y) G(y))dydx.Since the lotteries have bounded support, the rst two terms above are zero. Andfor each x, u (x) 0 and

    x (F (y) G(y))dy 0, so the entire expression is

    non-positive, as required. Now assume that for all weakly increasing and concavefunctions u, we have

    u(x)f (x)dx

    u(x)g(x)dx. Consider any x R . To

    show x

    F (y)dy x

    G(y)dy, dene the weakly increasing and concave functionu as

    u(y) = y if y x,x else.

    Then using integration by parts twice, we have

    x x

    F (y)dy = xF (x) lim

    yyF (y)

    x

    F (y)dy + x(1 F (x))

    = x

    yf (y)dy +

    xxf (y)dy

    = x

    u(y)f (y)dy

    x

    u(y)g(y)dy

    = x

    yg(y)dy +

    xxg(y)dy

    = xG(x) limy

    yG(y) x

    G(y)dy + x(1 G(x))

    = x x

    G(y)dy,

    which implies x

    F (y)dy x

    G(y)dy, completing the proof.

    The analysis of rst order stochastic dominance can be applied to the case of a niteset of outcomes and a decision maker with a prespecied ranking of the outcomes.Letting P be an ordering (asymmetric and negatively transitive) of X , we say avon Neumann-Morgenstern representation u : X R is compatible with P if for allx, y X , u(x) > u (y) if and only if xP y (equivalently, u(x) u(y) if and only if xRy ). Given such a representation, we write

    U ( |u) =m

    j =1

    (x j )u(x j )

    32

  • 7/30/2019 Uncertainty Notes 2

    33/39

    for the expected utility calculated using u, where we write u explicitly as a parameter.

    Here, the outcomes may not be real numbers, but we can order them according to P and dene a notion of rst order stochastic dominance. Specically, say rst order stochastically dominates with respect to P if for all j = 1 , . . . , m , we have

    k :x j Rx k

    (xk ) k :x j Rx k

    (xk ).

    Theorem 8.1 can be used to prove the following characterization of this notion of rstorder stochastic dominance in terms of compatible representations.

    Theorem 8.3 Assume X = {x1, . . . , x m }, let and be any two lotteries on X , and let P be an ordering of X . Then rst order stochastically dominates with respect to P if and only if for all u : X R compatible with P , we have U ( |u) U (|u).

    Notice that if the two lotteries in the previous theorem are distinct, then cannot rstorder stochastically dominate with respect to P , and so there must be a compatiblerepresentation that has strictly higher expected utility from than from . Thisyields the following corollary.

    Corollary 8.4 Assume X = {x1, . . . , x m }, let and be any distinct lotteries on X , and let P be an ordering of X . Then rst order stochastically dominates with

    respect toP if and only if (i) for all u : X

    R

    compatible with P , we have U ( |u) U (|u), and (ii) there is some u compatible with P such that U ( |u) > U (|u).

    9 The Multidimensional Choice ModelEverything carries over. We dene simple lotteries and risk aversion (loving, neutral-ity) just as in the one-dimensional choice model, and Theorems 3.1, 4.1, and 6.1 holdin their original forms. Furthermore, everything carries over to lotteries on R n withcontinuous distributions too. Of course, the analysis of risk premiums and stochasticdominance relies on the structure of the one-dimensional choice model.

    10 Mean-Variance AnalysisIn the multidimensional choice model, it turns out that quadratic von Neumann-Morgenstern representations are especially easy to work with. The decision makersexpected utility from a lottery can be decomposed into two terms, one corresponding

    33

  • 7/30/2019 Uncertainty Notes 2

    34/39

    to the mean of the lottery and one corresponding to the variance. In the multidimen-sional setting, the mean and variance of a lottery are dened as

    E [] =m

    j =1

    (x j )x j

    V [] =m

    j =1

    (x j )|| x j E []|| 2

    when is simple. Of course, the above expressions are dened analogously in termsof density when has a continuous distribution.

    Theorem 10.1 Assume X R d , and the von Neumann-Morgenstern representation u is quadratic with ideal point x, i.e., u(x) = || x x|| 2. Then for every lottery ,we have

    U () = u(E []) V [].Proof. For simplicity, we give the argument for a simple lottery. Note that

    U ( ) =m

    j =1

    (x j )u (x j )

    =m

    j =1 (x j )(x x ) (x x )

    = m

    j =1

    (x j )[x j x j 2x x + x x ]

    = m

    j =1

    (x j ) E [ ] E [ ] 2x j x + x x

    + x j x j 2x j E [ ] + 2x j E [ ] E [ ] E [ ]]

    = m

    j =1

    (x j )[E [ ] E [ ] 2x j x + x x ]

    m

    j =1

    (x j ) x j x j 2x j E [ ] E [ ] E [ ]

    = E [ ] E [ ] + 2E [ ] x x x

    m

    j =1

    (x j ) x j x j 2x j E [ ] + E [ ] E [ ]

    = (E [ ] x ) (E [ ] x)

    m

    j =1

    (x j )[(x j E [ ]) (x j E [ ])]

    = u (E [ ]) V [ ],

    34

  • 7/30/2019 Uncertainty Notes 2

    35/39

    where the rst equality follows by denition of expected utility, the second fromdenition of quadratic utility, the third from expansion, the fourth from adding andsubtracting like terms, the fth from breaking the sum, the sixth because E [] and xare constants independent of j , the seventh from gathering terms, and the last fromdenition of quadratic utility, mean, and variance.

    Theorem 10.1 has interesting implications for the one-dimensional model with quadraticutilities. Assume there is an odd number n of voters with ideal points indexedx1 < x2 < < xn , and let m = n +12 be the index of the voter with the medianideal point. It follows that a majority of voters prefer one lottery to another if and only if the median voter prefers to . Indeed, suppose the median votersexpected utility is such that U m () > U m (). Without loss of generality, supposeE [] E [], and consider any i > m . Then

    u i(E []) ui (E []) + ( V [] V []) um (E []) um (E []) + ( V [] V []) > 0,

    and we conclude that U i () > U i (), as claimed.

    11 Experimental EvidenceWe can empirically test the accuracy of the Independence and Intermediate valuesaxioms. In fact, there has been much work in experimental psychology and economicsto see whether these axioms are consistent with observed choice behavior. Thoughthey often are, the empiricists have noted some choice situations that are especiallyproblematic for expected utility theory.

    One such situation is as follows. Suppose there are three possible wealth levels,

    x1 = 0x2 = 1 , 000, 000x3 = 5 , 000, 000,

    and four possible lotteries over them,

    (a) (0 , 1, 0)

    (b) ( .01, .89, .10)

    (c) (.89, .11, 0)

    (d) ( .90, 0, .10).

    Experiments have shown that subjects tend to prefer lottery (a) to (b) and lottery(d) to (c).

    35

  • 7/30/2019 Uncertainty Notes 2

    36/39

    If such a decision maker has an ordering of lotteries satisfying Independence andIntermediate values, then there is a von Neumann-Morgenstern representation, sayu, representing those preferences. A preference for (a) over (b) implies

    u(x2) > (.01)u(x1) + ( .89)u(x2) + ( .10)u(x3),

    or equivalently,(.11)u(x2) > (.01)u(x1) + ( .10)u(x3).

    But a preference for (d) over (c) implies the opposite inequality, an impossibility.Therefore, the decision makers preferences over lotteries cannot satisfy the vonNeumann-Morgenstern axioms: either Independence or Intermediate values is vio-lated. This is known as the Allais Paradox.

    A more fundamental problem for expected utility theory is the Ellsburg paradox,which challenges the assumption that we can model uncertainy on a decision makers

    part using the mathematical construction of lotteries. Here, we imagine two problemsof decision making under uncertainty that involve drawing colored balls from an urn.The urn contains 90 balls, 30 of which are red and 60 of which are either black orwhite. The exact numbers of black balls and white balls is unknown; it is known onlythat the sum of those numbers if 60. We rst give the decision maker a choice betweentwo gambles with payoffs as follows: (a) $100 if a red ball is drawn, zero otherwise,(b) $100 if a black ball is drawn, zero otherwise. Next, we give the decision maker achoice between: (c) $100 if a red or white ball is drawn, zero otherwise, (d) $100 if ablack ball is drawn, zero otherwise.

    In experimental situations, subjects tend to choose (a) over (b) and to choose (d) over

    (c). To capture this decision problem in our framework, we must assume that thedecision maker has some beliefs about the likelihood of drawing a black ball as opposedto a white one. Let = (1 / 3, ,) represent her beliefs, where the probability of drawing a black ball is and the probability of a white ball is . Then, however,these preferences are inconsistent with the expected utility hypothesis. Letting u bea von Neumann-Morgenstern representation, a preference for (a) over (b) implies

    13

    u(100) +23

    u(0) > u (100) + (1 )u(0),

    or equivalently,

    13

    (u(100) u(0)) > (u(100) u(0)) ,

    which assuming u(100) > u (0) is equivalent to < 13 . But a preference for (d) over(c) implies

    ( + )u(100) +13

    u(0) >13

    + u(100) + u (0),

    36

  • 7/30/2019 Uncertainty Notes 2

    37/39

    or equivalently,

    ( + )(u(100) u(0) >13

    + (u(100) u(0)) ,

    which is equivalent to > 13 , a contradiction.

    12 The Paradox of VotingAs a simple application of expected utility theory, we consider a model of voting dueto Riker and Ordeshook (1968), one of the earliest applications of formal methodsto political science. Here, a voter must decide whether to vote for candidate A, tovote for candidate B , or to abstain from voting altogether, so the possible choices areC = {A,B,H }, representing voting for A or B or staying home. We assume that,from the voters point of view, the actions of other voters are subject to uncertainty,and we let vA denote the number of votes received by A from other voters and vB thenumber of votes received by B from other voters. Assuming there are n voters total,the possible states of the world are then S = {(vA , vB ) R 2+ | vA + vB n 1},representing the number of other voters voting for A and B. The election is majorityrule, with a fair coin toss deciding the election; we could have incorporated theoutcome of the coin toss in the state of the world, but it is simpler to fold it intothe voters von Neumann-Morgenstern representation. We denote the voters beliefsabout the state of the world by , and we write, e.g., P (vA > v B ) for the probabilityof the event that a majority of other voters support A over B .

    Assume the voters preferences have a von Neumann-Morgenstern representation asfollows,

    u(ci , s j ) =

    uA if ci = H , vA > v BuA + D C if ci = A, vA vBuA + D C if ci = B, vA vB + 2u A + u B

    2 if ci = H , vA = vBu A + u B

    2 if ci = A, vA = vB 1u A + u B

    2 if ci = B, vA = vB + 1uB if ci = H , vB > v AuB + D C if ci = A, vB vA + 2

    uB + D C if ci = B, vB vA ,where s j = ( vA , vB ), uA is the utility from A winning, D 0 is a utility the voterreceives from fullling his/her responsibilities as a citizen (the duty term), andC 0 is a cost of voting. Assume for simplicity that uA > u B . Note that in the eventthat the election results in a tie, the voters utility is u A + u B2 , reecting the assumptionthat the winner is determined by the toss of a coin.

    37

  • 7/30/2019 Uncertainty Notes 2

    38/39

  • 7/30/2019 Uncertainty Notes 2

    39/39

    References[1] J. von Neumann and O. Morgenstern (1944) Theory of Games and Economic

    Behavior, Princeton: Princeton University Press.

    [2] W. Riker and P. Ordeshook (1968) A Theory of the Calculus of Voting, Amer-ican Political Science Review, 62: 2542.