
A Theoretical Foundation of Ambiguity Measurement

Yehuda Izhakian∗†

April 17, 2015

Abstract

Ordering alternatives by their degree of ambiguity is a crucial element in decision-making processes. The current paper introduces an empirically applicable, stake-independent ambiguity measure that allows for such ordering. This measure relies upon the idea that, in the presence of ambiguity, probabilities are themselves uncertain, and related preferences are applied to these probabilities such that aversion to ambiguity is defined as aversion to mean-preserving spreads in probabilities. Thereby, the degree of ambiguity can be measured by the volatility of probabilities. The applicability of this measure is demonstrated by incorporating ambiguity into an asset pricing model.

Keywords: Ambiguity Measure, Ambiguity Aversion, Knightian Uncertainty, Uncertain Probabilities, Ambiguity Premium.

JEL Classification Numbers: D81, D83, G11, G12.

∗ Department of Economics and Finance, Zicklin School of Business, Baruch College; [email protected]

† I thank Menachem Abudy, Yakov Amihud, Doron Abramov, David Backus, Adam Brandenburger, Menachem Brenner, Xavier Gabaix, William Greene, Eitan Goldman, Yaniv Grinstein, Sergiu Hart, Edi Karni, Ruth Kaufman, Peter Klibanoff, Ilan Kremer, Evgeny Lyandres, Fabio Maccheroni, Massimo Marinacci, Sujoy Mukerji, Yacov Oded, Efe Ok, Jacob Sagi, David Schmeidler, Uzi Segal, Marciano Siniscalchi, Laura Veldkamp, Paul Wachtel, Jan Werner, Jaime Zender, Stanley Zin and especially Itzhak Gilboa, Mark Machina and Thomas Sargent for valuable discussions and suggestions. I would also like to thank the seminar and conference audiences at Bar Ilan University, Baruch College, Indiana University, Johns Hopkins University, Michigan State University, New York University, Norwegian School of Business, Tel Aviv University, The Interdisciplinary Center (IDC) Herzliya, The Hebrew University of Jerusalem, University of Colorado, University of Houston, University of Michigan, Arne Ryde Workshop in Financial Economics 2013, Decision: Theory, Experiments and Applications (D-TEA) 2013, Netspar International Pension Workshop 2013, North American Meetings of the Econometric Society Northwestern University 2012, Risk Uncertainty and Decision (RUD) 2013, University of Chicago Workshop on Ambiguity and Robustness in Macroeconomics and Finance 2013, and Foundations of Utility and Risk (FUR) 2014.


1 Introduction

How should uncertain alternatives be ranked by the criterion of ambiguity? Consider the following

example: a large urn contains 30 balls which are either black or yellow, in an unknown proportion,

and a second smaller urn contains only 10 balls which are also either black or yellow, in an unknown

proportion. Which of the following two bets is more ambiguous: “A ball drawn from the large urn is yellow” or “A ball drawn from the small urn is yellow”? Suppose you were offered $10 if a ball drawn from the large urn is yellow or $10 if a ball drawn from the small urn is yellow. Which of these two bets would you choose? Answering questions of this type is part of almost any real-life decision, which implies that decision-making involves ordering alternatives by their degree of ambiguity.

This paper introduces an empirically applicable ambiguity measure, underpinned by a new theoretical concept, that allows for such ordering of alternatives. This new concept proposes that, in the presence of ambiguity, probabilities are themselves uncertain, and preferences concerning ambiguity are applied directly to these probabilities such that aversion to ambiguity is defined as aversion to mean-preserving spreads in probabilities, analogous to the Rothschild-Stiglitz (1970) aversion to mean-preserving spreads in outcomes. Thereby, the degree of ambiguity can be measured by the

volatility of probabilities, just as the degree of risk can be measured by the volatility of outcomes.

The resulting measure is objective, stake independent, simple and intuitive. It measures the degree of

ambiguity independently of individuals’ preferences and can be computed from the data in empirical

studies. These are key qualities for the introduction of ambiguity into economic and financial models.

The decision making framework underpinning the current paper is expected utility with uncertain

probabilities (henceforth EUUP), introduced by Izhakian (2014). This framework assumes two tiers

of uncertainty, one with respect to consequences (outcomes) and the other with respect to the probabilities of these consequences. A decision maker (DM) in this environment applies two differentiated phases of the decision process, each referring to one of these tiers. In the first phase, the probability formation phase, she forms a representation of her perceived probabilities for all the events that are relevant to her decision. Then, in the second phase, the valuation phase, she assesses the value of each alternative using her perceived probabilities and chooses accordingly. Ambiguity (the uncertainty about probabilities) plays a role in the probability formation phase, while risk (the uncertainty about consequences) plays a role in the valuation phase.1 This structure introduces a complete distinction of risk from ambiguity with regard to both beliefs and tastes. The degree of ambiguity and attitudes toward it are then measured with respect to one tier, while risk and risk attitudes are measured with respect to the other tier.

1 Risk is defined as a condition in which the event to be realized is a priori unknown, but the odds of all possible events are perfectly known. Ambiguity (Knightian uncertainty) refers to conditions in which not only is the event to be realized a priori unknown, but the odds of events are also either not uniquely assigned or unknown.

The main idea of EUUP is that, in the probability formation phase, perceived probabilities are

formed in a Bayesian approach by the “certainty equivalent probabilities” of uncertain probabilities.2

That is to say, an uncertain probability is modeled explicitly in a state space that is subject to a

prior probability, and the perceived probability is the unique certain probability value that the DM

is willing to accept in exchange for the uncertain probability of a given event. Perceived probabilities

are subjectively formed based upon the DM’s preferences concerning ambiguity. These preferences

are applied to probabilities such that aversion to ambiguity is defined as aversion to mean-preserving

spreads in probabilities. Thereby, the Rothschild and Stiglitz (1970) approach can be used over

probabilities to define an ordering by ambiguity.

Based upon probability ordering, this paper shows that the degree of ambiguity can be measured

by four times the expected volatility of probabilities, across the relevant events. Formally, the measure

of ambiguity is given by

$$\mho^2[f] = 4 \int_X \mathrm{E}[\varphi_f(x)]\, \mathrm{Var}[\varphi_f(x)]\, dx,$$

where f is an act (a bounded measurable function from states into consequences); X is a convex subset of the real numbers (consequences); φf (·) is an uncertain probability density function; and the expectation E [·] and the variance Var [·] are taken with respect to second-order probabilities (probabilities over a set of probability distributions).3 The measure ℧² (mho²) can be viewed as an objective measure of ambiguity, as it measures ambiguous beliefs (information) in isolation from individuals’ tastes for ambiguity. The main advantage of this measure is that it can be computed from the data and can be employed in empirical tests.4 Stake independence is another major advantage of ℧²; unlike risk measures, it does not depend upon the magnitude of consequences. This property is important, for example, for assessing the ambiguity associated with a particular stock market, regardless of the investment amount and the associated risk.
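As a rough illustration of the finite-state version of the measure, the following Python sketch applies it to the two urns of the opening example. The second-order belief is an assumption made only for this illustration (every composition of an urn, from 0 to n yellow balls, is taken to be equally likely), and the names `mho2` and `urn_priors` are hypothetical helpers, not part of the paper.

```python
from fractions import Fraction

def mho2(pmfs, weights):
    """Finite-state ambiguity measure: 4 * sum_i E[phi(x_i)] * Var[phi(x_i)].

    pmfs    -- candidate probability mass functions, one list per prior
    weights -- second-order probabilities of those priors (must sum to 1)
    """
    total = Fraction(0)
    for i in range(len(pmfs[0])):          # loop over outcomes x_i
        mean = sum(w * p[i] for p, w in zip(pmfs, weights))
        var = sum(w * (p[i] - mean) ** 2 for p, w in zip(pmfs, weights))
        total += mean * var
    return 4 * total

def urn_priors(n_balls):
    """Assumed belief: 0..n yellow balls, each composition equally likely."""
    pmfs = [[Fraction(k, n_balls), Fraction(n_balls - k, n_balls)]
            for k in range(n_balls + 1)]
    weights = [Fraction(1, n_balls + 1)] * (n_balls + 1)
    return pmfs, weights

large_urn = mho2(*urn_priors(30))   # urn with 30 balls
small_urn = mho2(*urn_priors(10))   # urn with 10 balls
print(large_urn, small_urn)
```

Under this assumed belief the small urn comes out as the more ambiguous one. The payoff of $10 never enters the computation, which is the sense in which the measure is stake independent.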

Several approaches to estimating ambiguity have been proposed in the literature. Dow and Werlang (1992) measure uncertainty as the sum of the probability of an event and the probability of its complementary event. Ui (2011) measures ambiguity by the difference between the minimal possible mean and the true mean. Bewley (2011) and Boyle et al. (2011) measure ambiguity by a critical confidence interval. Maccheroni et al. (2013) measure ambiguity by the variance of an unknown mean. These studies assume that the variance of outcomes is known and suggest a stake-dependent measure, based only on the variation of the mean. Nevertheless, ambiguous variance has been found to be an important element in decision-making processes, as stressed, for example, in Epstein and Ji (2013).5 The ambiguity measure ℧², proposed in this paper, is stake-independent and encompasses both ambiguous variance and ambiguous mean, as well as the ambiguity of all higher moments of the probability distribution (i.e., skewness, kurtosis, etc.), through the uncertainty of probabilities. Relative entropy, measured by the deviation of a probability distribution from a reference probability distribution (reference model), can also be interpreted as a measure of ambiguity; see, for example, Hansen et al. (1999), Hansen and Sargent (2001) and Maccheroni et al. (2006). However, while the use of relative entropy is restricted to cases of a single prior relative to a known true probability distribution, ℧² can be employed in cases of multiple priors, when either a single true probability distribution does not exist or it is unknown.

2 In this paper the terms perceived probabilities and subjective probabilities are used interchangeably.

3 The measure ℧² is also applicable in a finite state space. In this case, ℧²[f] = 4 ∑i E [φf (xi)] Var [φf (xi)], where φf (·) is a probability mass function.

4 See, for example, Brenner and Izhakian (2011, 2012).

Measuring the degree of ambiguity allows alternatives to be ranked by the criterion of ambiguity.

The ambiguity measure is a critical instrument for introducing ambiguity into models that attempt

to explain observable phenomena such as financial anomalies. It provides a way to address important

questions that arise regarding the nature of ambiguity, in general, and the nature of the aggregate

ambiguity of portfolios, in particular. Accounting for ambiguity might shed light on some phenomena

that previously could not be fully explained. Notable examples include the fact that individuals tend

to hold very small portfolios, 3-4 stocks (Goetzmann and Kumar, 2008), the equity premium puzzle

(Mehra and Prescott, 1985), the risk-free rate puzzle (Weil, 1989), the phenomenon of the observed

equity volatility being too high to be justified by changes in the fundamentals (Shiller, 1981), and the

home bias puzzle (Coval and Moskowitz, 1999).

To demonstrate the applicability of the proposed measure of ambiguity, this paper generalizes asset

pricing theory (Arrow-Pratt) to incorporate ambiguity. Relaxing the assumption that probabilities

are known, it shows that the price of an asset is determined not only by its degree of risk and

the DM’s attitude toward risk, but also by its degree of ambiguity and the DM’s attitude toward

ambiguity. The paper constructs an uncertainty premium and proves that it can be separated into a

risk premium and an ambiguity premium. It provides a well-defined ambiguity premium, attributed to

ambiguity and preferences concerning ambiguity and completely distinguished from the risk premium.

Previous models have been mainly focused on the theoretical aspects of the implication of ambiguity

for the equity premium (e.g., Chen and Epstein (2002), Izhakian and Benninga (2011), Ui (2011), and

Maccheroni et al. (2013)). Unlike these models, the ambiguity premium in the current paper can be

computed from the data and tested empirically. For example, Brenner and Izhakian (2011) show that ambiguity, measured by ℧², has a significant impact on the market portfolio return.6

5 In empirical asset pricing and macroeconomic contexts, stochastic time-varying volatility also plays an important role; see, for example, Bollerslev et al. (1988), Fernandez-Villaverde et al. (2010) and Bollerslev et al. (2011).

The rest of the paper is organized as follows. Section 2 presents the decision-making framework.

Section 3 simplifies the framework as a preparation for the extraction of an ambiguity measure. Using

this simplified representation, Section 4 defines ordering of events by ambiguity and Section 5 uses this

ordering to suggest a measure of ambiguity. Section 6 analyzes the special properties of the proposed

measure, and Section 7 discusses it relative to alternative measures of ambiguity. To demonstrate

an application of this measure for asset pricing, Section 8 models the ambiguity premium. Section 9

concludes. All proofs are provided in the Appendix.

2 The decision making framework

The decision making framework employed in this paper is expected utility with uncertain probabilities

(EUUP), proposed by Izhakian (2014). EUUP assumes two different tiers of uncertainty, one with respect to consequences (outcomes) and the other with respect to the probabilities of these consequences. Each tier is modeled by a separate state space. A decision maker (DM) in this framework applies two differentiated phases of the decision process, each referring to one of these tiers. Preferences for ambiguity, which are applied to uncertain probabilities in the first phase of a decision process, rely upon the Savage (1954) axiomatic foundation. They underpin perceived probabilities, which are structured from uncertain probabilities in a Bayesian approach. Given these perceived (nonadditive) probabilities, preferences for risk are applied to consequences in the second phase of a decision process.7

These preferences, which rely upon the foundations of Schmeidler’s (1989) Choquet expected utility

and Tversky and Kahneman’s (1992) cumulative prospect theory, are formulated by Wakker’s (2010)

axiomatization.

Formally, let S be a (finite or infinite) nonempty state space, called the primary space, endowed

with a σ-algebra, E , of subsets of S. Generic elements of this σ-algebra are called events and are

denoted by E. Define X ⊆ R to be a convex set of consequences that contains the interval [0, 1]. Let

a primary act f : S → X be a bounded E-measurable function from states into consequences, and

denote the set of all these (Savage) acts by F and the set of all simple measurable acts by F0. A simple

primary act can be represented as a sequence of pairs, f = (E1 : x1, . . . , En : xn) , where (E1, . . . , En) is

a generic partition of the state space S; xj is the consequence if event Ej occurs; and the consequences

x1, . . . , xn are listed in a non-decreasing order. A primary indicator act δE = (E^C : 0, E : 1) assigns the outcome 1 to event E ∈ E and the outcome 0 to its complementary event E^C ∈ E. The domain of the first-order preference relation, ≿¹, is the set of primary acts F0, and the relations ≾¹, ≺¹, ≻¹ and ∼¹ are defined as usual. A consequence x ∈ X is considered to be unfavorable if x ≤ k and favorable if k < x, where k is a reference point. An event E ∈ E is considered to be unfavorable under act f if f (E) ≤ k and favorable if k < f (E).

6 As far as I am aware, prior studies do not conduct direct empirical tests of models of decision making under ambiguity other than through parametric fitting and calibrations. Uppal and Wang (2003), Epstein and Schneider (2008), and Ju and Miao (2012), for example, calibrate their models to the data. Several papers attribute different explanatory variables to ambiguity. For example, Anderson et al. (2009) attribute the disagreement of professional forecasters to ambiguity.

7 Schmeidler (1989), in his pioneering study, introduces the idea that, in the presence of ambiguity, the probabilities that reflect the DM’s willingness to bet may not be additive, i.e., the sum of the probabilities can be either smaller or greater than 1.

Probabilities of events E occurring in the primary space are determined in a (finite or infinite)

nonempty secondary space, defined by a set P of all possible additive probability measures over the

primary space S. A first-order probability measure P ∈ P is then viewed as a state of nature in

this secondary space, and the state space P is assumed to be endowed with the maximal σ-algebra,

Π = 2^P, of subsets of P. A secondary act, f̂ : P → X, is a bounded function from the secondary space P into the set of consequences X. The set of all secondary acts is denoted F̂. A secondary act f̂ that describes the resulting expected outcome of a primary act f contingent upon a prior P ∈ P (on S) is denoted f; that is, f : P → X satisfies f (P) = ∫S f dP. The set of all secondary acts f ∈ F̂ is denoted F̃, and the subset of all secondary acts in F̃ that are associated with primary indicator acts is denoted ∆. A secondary act δE : P → [0, 1] in ∆, associated with a primary indicator act δE = (E^C : 0, E : 1), is given by

δE (P) = P (E)

for every P ∈ P. A secondary act δE can, therefore, be viewed as a function that assigns each event

E ∈ E with its possible probabilities. In this view, δE can be interpreted as an uncertain variable

describing the probability P (E) of event E. A second-order non-atomic finitely-additive probability

measure χ on Π assigns each subset A ∈ Π of first-order probability measures in P with a probability

χ (A).8 This second-order belief is implicated in the DM’s second-order preference relation ≿² over the set of all secondary acts F̂. Viewing δE : P → [0, 1] as describing the (uncertain) probability P (E) of event E, the preference relation ≿² over ∆ defines a preference over probabilities.9
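The two-tier structure can be made concrete with a small finite example. The Python sketch below is illustrative only: the three states, the three candidate priors and the second-order weights are hypothetical values chosen for this example, and a finite χ necessarily has atoms, so the non-atomicity requirement is set aside here.

```python
from fractions import Fraction as F

# Primary space: three states; an event is a set of states.
STATES = ("s1", "s2", "s3")
EVENT_E = {"s1", "s2"}

# Secondary space: a finite set of candidate first-order measures (hypothetical values).
PRIORS = {
    "P_a": {"s1": F(1, 2), "s2": F(1, 4), "s3": F(1, 4)},
    "P_b": {"s1": F(1, 4), "s2": F(1, 4), "s3": F(1, 2)},
    "P_c": {"s1": F(1, 3), "s2": F(1, 3), "s3": F(1, 3)},
}
# Second-order belief chi over the priors (hypothetical values).
CHI = {"P_a": F(1, 2), "P_b": F(1, 4), "P_c": F(1, 4)}

def delta(event, prior):
    """Secondary indicator act: delta_E(P) = P(E)."""
    return sum(prior[s] for s in event)

def uncertain_probability(event):
    """Distribution of P(E) under chi, as {probability value: chi-weight}."""
    dist = {}
    for name, prior in PRIORS.items():
        value = delta(event, prior)
        dist[value] = dist.get(value, 0) + CHI[name]
    return dist

dist_E = uncertain_probability(EVENT_E)
print(dist_E)
```

Each prior is a state of nature in the secondary space, and `dist_E` is the uncertain probability of E seen as a random variable whose law is given by χ.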

Suppose that X and S satisfy the required richness of EUUP, the preference relation ≿¹ on the set of primary acts F0 satisfies Wakker’s (2010, Theorem 12.3.5) axioms, the preference relation ≿² on the set of secondary acts F̂ satisfies Savage’s (1954) axioms, and that they jointly satisfy Izhakian’s (2014) axiom. Then, by Izhakian (2014, Theorem 1), there exists a function V : F0 → R such that

$$f \succsim^1 g \iff V(f) \ge V(g),$$

for every f, g ∈ F0, where

$$V(f) = \int_{-\infty}^{k} \left[ \Gamma^{-1}\!\left( \int_{\mathcal{P}} \Gamma\!\left( \delta_{\{s \in S \,\mid\, U(f(s)) \ge z\}}(\mathrm{P}) \right) d\chi \right) - 1 \right] dz + \int_{k}^{\infty} \Gamma^{-1}\!\left( \int_{\mathcal{P}} \Gamma\!\left( \delta_{\{s \in S \,\mid\, U(f(s)) \ge z\}}(\mathrm{P}) \right) d\chi \right) dz; \tag{1}$$

U : X → R is a strictly increasing, continuous, bounded function, normalized such that U (k) = 0; and Γ : [0, 1] → R is a non-constant bounded function.10 Furthermore, χ is uniquely determined, U is unique up to a unit, and Γ is unique up to a positive linear transformation.

8 To maintain non-atomicity when P is finite, one can define the probability measure χ on the product σ-algebra Π ⊗ 2^J, where J is a non-singleton convex set of probability distributions over some auxiliary state space S1.

9 To understand this interpretation, consider the two secondary acts δE , δF ∈ ∆, associated with the primary indicator acts δE , δF ∈ F (whose outcomes are the same), and assume a DM who prefers δE to δF . This means that she prefers to get the good outcome with the (uncertain) probability P (E) rather than with the (uncertain) probability P (F ).

The Bayesian approach asserts that everything that is not known should be modeled explicitly in a

state space and be subject to a prior probability. The model of Equation (1) applies this approach to

uncertain probabilities. As a result, the function V takes the form of a two-sided Choquet integration

to unfavorable outcomes and to favorable outcomes (relative to the reference point). This functional

representation of the DM’s aggregate preferences makes a complete distinction between beliefs and

tastes and between risk and ambiguity. First-order beliefs are formed by the uncertain probability

measure P; second-order beliefs are formed by the probability measure χ; risk preferences are formed

by the utility function U;11 and ambiguity preferences are formed by the function Γ. The function Γ,

referred to as an outlook function, forms the DM’s attitude toward ambiguity. As with risk attitudes,

there are three types of attitudes toward ambiguity: aversion to ambiguity (formed by a concave Γ),

loving of ambiguity (formed by a convex Γ) and indifference to ambiguity (formed by a linear Γ).

To simplify the functional representation V in Equation (1), secondary acts (δE) can be replaced

by their resulting probabilities to obtain

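$$V(f) = \int_{-\infty}^{k} \left[ \Gamma^{-1}\!\left( \int_{\mathcal{P}} \Gamma\big( \mathrm{P}(\{s \in S \,\mid\, U(f(s)) \ge z\}) \big)\, d\chi \right) - 1 \right] dz + \int_{k}^{\infty} \Gamma^{-1}\!\left( \int_{\mathcal{P}} \Gamma\big( \mathrm{P}(\{s \in S \,\mid\, U(f(s)) \ge z\}) \big)\, d\chi \right) dz. \tag{2}$$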

This functional representation considers acts taking infinitely many values in an infinite state space.

It is important to note that all the results in this paper can be applied to a discrete representation in

a finite state space with acts taking finitely many values.

3 Preliminaries

To elicit a measure of the degree of ambiguity, the functional representation V has to be further simplified. The key to this additional simplification is the DM’s perceived probabilities. In EUUP, these are derived from the nature of the uncertainty about probabilities (ambiguity) and the DM’s preferences concerning this uncertainty. In particular, the concept of EUUP is that perceived probabilities are formed by the certainty-equivalent probabilities of uncertain probabilities in a Bayesian approach. Formally, the perceived probability Q(E) of event E ∈ E is defined by12

$$\mathrm{Q}(E) = \Gamma^{-1}\!\left( \int_{\mathcal{P}} \Gamma(\mathrm{P}(E))\, d\chi \right). \tag{3}$$

This probability is a function of first-order (uncertain) probabilities, formed by a set P of possible probability measures over E ; a second-order probability measure χ (second-order belief) over P; and the DM’s preferences concerning ambiguity, applied to probabilities and captured by Γ. Equation (3) proposes that while making decisions, the DM, who views uncertain probabilities as a set of priors, aggregates these probabilities in a nonlinear way to form her perceived probabilities.13

10 EUUP stems from the multiple priors paradigm (Gilboa and Schmeidler, 1989) and results in a two-sided variation of CEU (Gilboa, 1987 and Schmeidler, 1989). It combines the concept of nonadditive probabilities with the idea of reference-dependent beliefs, which is applied to differentiate between the probability of unfavorable and favorable events. Since the focus of this paper is ambiguity measurement, as opposed to preferences for ambiguity, it is assumed for simplicity that the DM has the same preference for ambiguity, formed by Γ, concerning unfavorable and favorable events.

11 As usual, a concave U implies risk aversion, and a convex U implies risk loving.
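A discrete analogue of Equation (3) can be sketched in a few lines of Python. Everything numerical here is assumed for illustration: the two equally likely priors, and the square-root outlook function, which is simply a convenient concave choice and not a form taken from the paper. The helper name `perceived_probability` is likewise hypothetical.

```python
import math

def perceived_probability(probs, weights, gamma, gamma_inv):
    """Q(E) = gamma_inv( sum_j chi_j * gamma(P_j(E)) ), a discrete analogue of Equation (3)."""
    return gamma_inv(sum(w * gamma(p) for p, w in zip(probs, weights)))

# Assumed belief: P(E) is 0.2 or 0.6 with equal second-order probability.
probs_E = [0.2, 0.6]
probs_Ec = [1 - p for p in probs_E]      # complementary event under each prior
chi = [0.5, 0.5]

# Concave outlook function (assumed for illustration): gamma(p) = sqrt(p).
q_E = perceived_probability(probs_E, chi, math.sqrt, lambda y: y ** 2)
q_Ec = perceived_probability(probs_Ec, chi, math.sqrt, lambda y: y ** 2)

# Linear outlook function: ambiguity neutrality, Q(E) equals the expected probability.
q_neutral = perceived_probability(probs_E, chi, lambda p: p, lambda y: y)

print(q_E, q_Ec, q_E + q_Ec, q_neutral)
```

With the concave outlook function both perceived probabilities fall below their expected values, so they sum to less than one, which matches the subadditivity attributed to ambiguity aversion. The linear case returns the expected probability.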

To simplify the exposition of perceived probabilities in Equation (3), the perceived probability

Q(E) of an event E ∈ E can be approximated by taking a second-order Taylor approximation with

respect to its first-order probabilities P (E) around its expected probability E [P (E)].14 To this end,

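$$\mathrm{E}[\mathrm{P}(E)] = \int_{\mathcal{P}} \mathrm{P}(E)\, d\chi$$

is defined to be the expected probability of event E; and

$$\mathrm{Var}[\mathrm{P}(E)] = \int_{\mathcal{P}} \big( \mathrm{P}(E) - \mathrm{E}[\mathrm{P}(E)] \big)^2 d\chi$$

to be the variance of the probability of event E.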

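Theorem 1. Assume a strictly-increasing, continuous and twice-differentiable Γ satisfying

$$\frac{1}{2} \left( \frac{\Gamma''(\mathrm{E}[\mathrm{P}(F)])}{\Gamma'(\mathrm{E}[\mathrm{P}(F)])}\, \mathrm{Var}[\mathrm{P}(F)] - \frac{\Gamma''(\mathrm{E}[\mathrm{P}(E \cup F)])}{\Gamma'(\mathrm{E}[\mathrm{P}(E \cup F)])}\, \mathrm{Var}[\mathrm{P}(E \cup F)] \right) \le \mathrm{E}[\mathrm{P}(E)]$$

for any events E, F ∈ E. Then, for a relatively small P (E), the perceived probability of event E is

$$\mathrm{Q}(E) \approx \mathrm{E}[\mathrm{P}(E)] + \frac{1}{2}\, \frac{\Gamma''(\mathrm{E}[\mathrm{P}(E)])}{\Gamma'(\mathrm{E}[\mathrm{P}(E)])}\, \mathrm{Var}[\mathrm{P}(E)].$$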
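The form of this approximation can be motivated by a second-order Taylor argument applied to Equation (3). The lines below are only a heuristic sketch, not the proof given in the Appendix; the shorthand μ = E[P(E)] and σ² = Var[P(E)] is introduced here for brevity.

```latex
\begin{align*}
\Gamma(\mathrm{Q}(E)) &= \int_{\mathcal{P}} \Gamma(\mathrm{P}(E))\, d\chi
  \approx \Gamma(\mu) + \tfrac{1}{2}\,\Gamma''(\mu)\,\sigma^2
  && \text{(expand } \Gamma(\mathrm{P}(E)) \text{ around } \mu\text{; the linear term averages to zero)} \\
\Gamma(\mathrm{Q}(E)) &\approx \Gamma(\mu) + \Gamma'(\mu)\,\big(\mathrm{Q}(E) - \mu\big)
  && \text{(expand the left-hand side to first order)} \\
\Longrightarrow\quad \mathrm{Q}(E) &\approx \mu + \frac{1}{2}\,\frac{\Gamma''(\mu)}{\Gamma'(\mu)}\,\sigma^2 .
\end{align*}
```

Equating the two expansions and dividing by Γ′(μ), which is positive for a strictly increasing Γ, gives the expression in the theorem.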

Notice that the approximated perceived probabilities satisfy Q(∅) = 0, Q(S) = 1, and monotonicity with respect to set inclusion, i.e., Q(E) ≤ Q(F ) if E ⊂ F (by Lemma 2).15 The condition on Γ bounds the level of ambiguity aversion (the concavity of Γ) and the level of ambiguity loving (the convexity of Γ) to assure that the approximated perceived probabilities are nonnegative and that the probability of an event is not smaller than the probability of any of its sub-events. Henceforth it is assumed that Γ satisfies this condition. The perceived probabilities, proposed in Theorem 1, provide a natural way to simplify the functional representation of preferences over acts to a more applicable form.

12 This functional representation is obtained by the value function V of an indicator act, formed in Equation (1), and by replacing the secondary act with its resulting probabilities; see Izhakian (2014).

13 As a consequence of probabilistic sensitivity, i.e., the nonlinear ways in which individuals may interpret probabilities, perceived probabilities are nonadditive. That is, the sum of the probabilities can be either smaller or greater than 1. Ambiguity aversion results in a subadditive probability measure, while ambiguity loving results in a superadditive measure.

14 The same method was applied by Arrow (1965) and Pratt (1964) to consequences within the expected utility framework, whereas in this case it is applied to probabilities.

15 In fact, Q is a capacity, a subjective nonadditive probability; see, for example, Schmeidler (1989).
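A quick numerical check can indicate how close the approximation of Theorem 1 is to the exact perceived probability of Equation (3). The sketch below assumes an exponential outlook function Γ(p) = −exp(−ηp), for which Γ′′/Γ′ = −η, together with a hypothetical three-point second-order belief; neither choice comes from the paper, and the function names are illustrative.

```python
import math

def exact_q(probs, weights, eta):
    """Exact Q(E) for the assumed outlook function gamma(p) = -exp(-eta * p)."""
    mean_gamma = sum(w * -math.exp(-eta * p) for p, w in zip(probs, weights))
    return -math.log(-mean_gamma) / eta          # inverse of gamma

def approx_q(probs, weights, eta):
    """Theorem 1: Q(E) ~ E[P] + 0.5 * (gamma''/gamma') * Var[P]; here gamma''/gamma' = -eta."""
    mean = sum(w * p for p, w in zip(probs, weights))
    var = sum(w * (p - mean) ** 2 for p, w in zip(probs, weights))
    return mean - 0.5 * eta * var

probs = [0.08, 0.10, 0.12]        # assumed possible values of P(E)
chi = [1 / 3, 1 / 3, 1 / 3]       # assumed second-order probabilities
eta = 2.0                         # assumed degree of ambiguity aversion

q_exact = exact_q(probs, chi, eta)
q_approx = approx_q(probs, chi, eta)
print(q_exact, q_approx, abs(q_exact - q_approx))
```

For this small and tightly dispersed probability, the two values agree to several decimal places, and both lie below the expected probability of 0.10.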

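Proposition 1. Suppose that the axioms of Izhakian (2014, Theorem 1) and the conditions of Theorem 1 are satisfied. The value of an act f ∈ F0, formed in Equation (2), can then be written

$$V(f) \approx -\int_{-\infty}^{k} \mathrm{E}\big[\mathrm{P}(\{s \in S \,\mid\, U(f(s)) \le z\})\big]\, dz + \int_{k}^{\infty} \mathrm{E}\big[\mathrm{P}(\{s \in S \,\mid\, U(f(s)) \ge z\})\big]\, dz + \int_{-\infty}^{\infty} \frac{1}{2}\, \frac{\Gamma''\big(\mathrm{E}[\mathrm{P}(\{s \in S \,\mid\, U(f(s)) \ge z\})]\big)}{\Gamma'\big(\mathrm{E}[\mathrm{P}(\{s \in S \,\mid\, U(f(s)) \ge z\})]\big)}\, \mathrm{Var}\big[\mathrm{P}(\{s \in S \,\mid\, U(f(s)) \ge z\})\big]\, dz.$$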

To simplify the notation, the following conventions are used. Pf (x) stands for the cumulative probability P ({s ∈ S | f (s) ≤ x}), and φf (x) stands for the probability density φ ({s ∈ S | f (s) = x}).16 It is assumed that a density function φf (x) exists and is well defined for every x ∈ X . When it is clear from the context, the subscript f , indicating an act, is omitted. With this notation in place, the next theorem presents a dual representation of the value function V.

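Theorem 2. Suppose that the axioms of Izhakian (2014, Theorem 1) and the conditions of Theorem 1 are satisfied. The dual representation W of the value function V can then be approximated by

$$W(f) \approx \int_{-\infty}^{k} U(x) \left( \mathrm{E}[\varphi_f(x)] - \frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}_f(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}_f(x)])}\, \mathrm{E}[\varphi_f(x)]\, \mathrm{Var}[\varphi_f(x)] \right) dx + \int_{k}^{\infty} U(x) \left( \mathrm{E}[\varphi_f(x)] + \frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}_f(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}_f(x)])}\, \mathrm{E}[\varphi_f(x)]\, \mathrm{Var}[\varphi_f(x)] \right) dx. \tag{4}$$

That is,

$$f \succsim^1 g \iff W(f) \ge W(g).$$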

The functional representation provided by this theorem eases the use of EUUP theory in general and the extraction of an ambiguity measure in particular. The following corollary demonstrates this.

Corollary 1. Assume a DM typified by a constant relative ambiguity aversion (CAAA), i.e., Γ (P (E)) = (P (E))^{1−η} / (1 − η).17 The value function then takes the form

$$W(f) \approx \int_{-\infty}^{k} U(x) \Big( \mathrm{E}[\varphi_f(x)] - \eta\, \mathrm{E}[\varphi_f(x)]\, \mathrm{Var}[\varphi_f(x)] \Big)\, dx + \int_{k}^{\infty} U(x) \Big( \mathrm{E}[\varphi_f(x)] + \eta\, \mathrm{E}[\varphi_f(x)]\, \mathrm{Var}[\varphi_f(x)] \Big)\, dx.$$

16 In a discrete representation, when the state space S is finite, φf (x) stands for the probability mass function.

17 CAAA means that, given a return value, while shifting linearly the range of its possible probabilities, the attitude toward ambiguity remains unchanged. See Izhakian (2014) for a detailed discussion about the nature of different attitudes toward ambiguity.

Notice that if the DM is ambiguity neutral or if there is no ambiguity, i.e., the variance of proba-

bilities is zero, the functional representation of the value of an act collapses to the classical expected

utility representation

$$W(f) = \int_{-\infty}^{\infty} U(x)\, \mathrm{E}[\varphi_f(x)]\, dx.$$

That is, no disutility occurs.

4 Ordering ambiguous events

A preliminary step in ordering acts by their degree of ambiguity is to define such an order over events.

This ordering is determined by the second-order preference relation ≿². Viewing a secondary act δE : P → [0, 1] as describing the (uncertain) probability P (E) of event E, the preference relation ≿² over the set ∆ of such acts may well be referred to as a preference relation over probabilities. To see this, consider the two secondary acts δE , δF ∈ ∆, associated with the primary indicator acts δE , δF ∈ F (whose outcomes are the same), and assume a DM who prefers δE to δF . This means that she prefers to get the good outcome with the (uncertain) probability P (E) rather than with the (uncertain) probability P (F ). That is, she prefers P (E) to P (F ). With this notion, an ordering of events by

their degree of ambiguity, induced by the DM’s preferences, can be defined as follows.

Definition 1. Let the uncertain probabilities of events E,F ∈ E have the same expectation, i.e.,

E [P (E)] = E [P (F )]. Event F is more ambiguous than event E if and only if

δE ≿² δF

by any ambiguity-averse DM.

This definition provides a subjective ordering of events that arises from the DM’s preferences concerning ambiguity. Notice that preferences concerning ambiguity, ≿², apply only to the probabilities of events and not to their consequences. Notice also that, by Izhakian (2014, Theorem 1), in EUUP18

$$\delta_E \succsim^2 \delta_F \iff \Gamma^{-1}\!\left( \int_{\mathcal{P}} \Gamma(\mathrm{P}(E))\, d\chi \right) \ge \Gamma^{-1}\!\left( \int_{\mathcal{P}} \Gamma(\mathrm{P}(F))\, d\chi \right),$$

implying that an objective ordering by the degree of ambiguity can be defined by mean-preserving

spreads in probabilities. Rothschild and Stiglitz (1970) apply the idea of mean-preserving spreads to

outcomes in order to define a ranking by risk, whereas here, this idea is applied to probabilities in

order to define a ranking by ambiguity.

18 This is obtained immediately by applying the value function in Equation (1) to indicator acts.

Definition 2. Event F ∈ E is more ambiguous than event E ∈ E if there exists a random variable ϵ

such that

$$\mathrm{P}(F) - \mathrm{E}[\mathrm{P}(F)] \overset{d}{=} \mathrm{P}(E) - \mathrm{E}[\mathrm{P}(E)] + \epsilon,$$

where $\overset{d}{=}$ means equal in distribution and E [ϵ |P (E)] = E [ϵ] = 0.19 That is, P (F ) is a mean-preserving spread of P (E). If ϵ is not identically zero, then F is strictly more ambiguous than E.

This definition does not assume that events share an identical expected probability or similar probability distributions. In this definition, one event being more ambiguous than another is a condition on

the deviations of its possible probabilities from the respective expectations of probabilities. It implies

that, given a random variable ϵ with E [ϵ] = 0, an event with the uncertain probability P (E) + ϵ is

more ambiguous than event E with the uncertain probability P (E). In turn, this implies that any

event with a non-constant probability is strictly more ambiguous than an event with its expected

probability. The next proposition ties the subjective ordering of Definition 1 by the DM’s preferences

to the objective ordering of Definition 2, guided by the notion that every ambiguity-averse DM prefers

a less ambiguous event to a more ambiguous one, assuming both have the same consequence and the

same expected probability.

Proposition 2. Suppose E and F are events in E with identical expected probabilities. Then,

$$\mathrm{P}(F) - \mathrm{E}[\mathrm{P}(F)] \overset{d}{=} \mathrm{P}(E) - \mathrm{E}[\mathrm{P}(E)] + \epsilon \iff \delta_E \succsim^2 \delta_F$$

by every ambiguity-averse DM, where E [ϵ |P (E)] = E [ϵ] = 0. That is, Definitions 1 and 2 of the more ambiguous event coincide.
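The link between a mean-preserving spread in probabilities and the preference of an ambiguity-averse DM can be illustrated numerically. In the Python sketch below, the two-point belief about P(E), the independent noise term and the square-root outlook function are all assumptions chosen for this example; the helper names are hypothetical.

```python
import math
from itertools import product

def moments(probs, weights):
    """Mean and variance of an uncertain probability under a discrete second-order belief."""
    mean = sum(w * p for p, w in zip(probs, weights))
    var = sum(w * (p - mean) ** 2 for p, w in zip(probs, weights))
    return mean, var

def perceived(probs, weights):
    """Perceived probability under the assumed concave outlook function gamma(p) = sqrt(p)."""
    return sum(w * math.sqrt(p) for p, w in zip(probs, weights)) ** 2

# Assumed event E: P(E) is 0.4 or 0.6 with equal second-order probability.
p_E, w_E = [0.4, 0.6], [0.5, 0.5]

# Event F: P(F) = P(E) + eps, with eps = -0.1 or +0.1 independent of P(E),
# so that E[eps | P(E)] = E[eps] = 0.
eps, w_eps = [-0.1, 0.1], [0.5, 0.5]
p_F = [p + e for p, e in product(p_E, eps)]
w_F = [wp * we for wp, we in product(w_E, w_eps)]

mean_E, var_E = moments(p_E, w_E)
mean_F, var_F = moments(p_F, w_F)
q_E, q_F = perceived(p_E, w_E), perceived(p_F, w_F)
print(mean_E, mean_F, var_E, var_F, q_E, q_F)
```

The two events share the same expected probability, F has the larger variance of probabilities, and the perceived probability of F is lower, so this particular ambiguity-averse DM prefers the bet on E.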

The next step is to define the conditions under which spreads in probabilities can be measured

by the variance of probabilities such that the higher the variance of probabilities, Var [P (E)], the

higher the degree of ambiguity. These conditions apply to the nature of both, the DM’s preference for

ambiguity and to her beliefs. The former refers to cases where the DM’s attitude toward ambiguity

is quadratic or of the CAAA type. The latter refers to cases where the probabilities of events are

uniformly or elliptically distributed, i.e., the second-order probability distribution (probabilities of

probability distributions) is uniform or elliptical.20 The probability P (E) of event E is said to be

19 The condition E [ϵ |P (E)] = E [ϵ] means that ϵ is mean-independent of the uncertain probability P (E) of event E. Note that equality in distribution is a much weaker condition than equality and that mean-independence is weaker than independence; independence implies mean-independence, but the converse is not true.

20 For applications of elliptically distributed returns to asset pricing theory see, for example, Owen and Rabinovitch (1983).


elliptically distributed if its probability characteristic function is of the form

\[ \phi_{P(E)}(t) = e^{it\,\mathrm{E}[P(E)]} \, \Psi\!\left( \tfrac{1}{2} t^{2} \, \mathrm{Var}[P(E)] \right), \]

where i = √−1 and Ψ is a characteristic generator.21
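As a brief illustration, offered as a sketch rather than as part of the formal development: the normal case corresponds to the characteristic generator Ψ(u) = e^{−u}. Substituting this generator into the form above, and setting aside the truncation to [0, 1] that bounded probabilities require, recovers the familiar normal characteristic function:

\[ \phi_{P(E)}(t) = e^{it\,\mathrm{E}[P(E)]} \, e^{-\frac{1}{2} t^{2} \mathrm{Var}[P(E)]} = \exp\!\left( it\,\mathrm{E}[P(E)] - \tfrac{1}{2} t^{2} \, \mathrm{Var}[P(E)] \right). \]

In this case the distribution of P (E) is pinned down by its mean and variance alone, which is the feature that lets variance order spreads within a family sharing one generator.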

Theorem 3. Suppose E and F are events in E with identical expected probabilities. Then,

F is more ambiguous than E ⇐⇒ Var [P (F )] ≥ Var [P (E)] ,

when one or more of the following conditions hold:

(i) The probabilities of events E and F are uniformly distributed;

(ii) The probabilities of events E and F are truncated elliptically distributed with an identical char-

acteristic generator;22

(iii) The DM’s attitude toward ambiguity is of the CAAA type;

(iv) The DM’s attitude toward ambiguity is quadratic.

This theorem proposes that, given an event, the greater the spread of its possible probabilities, the

greater its ambiguity. It suggests that, when probabilities are uniformly or elliptically distributed, the

ordering of events by the variance of their probabilities coincides with the ordering of Definitions 1

and 2. Henceforth, it is assumed that the probabilities of all events are either uniformly or elliptically

distributed. If needed, this assumption can be replaced by assuming a CAAA or a quadratic outlook

function, Γ. At this point, the order of events by their degree of ambiguity, measured by Var [P (·)],

is well-defined. Using this order, one can define stochastic dominance with respect to ambiguity by

applying probabilities to outcomes.23

Definition 3. Let f, g ∈ F0 be two primary acts under which the expected probabilities of each con-

sequence x ∈ X are identical. That is, E [φf (x)] = E [φg (x)], for any given x ∈ X . Act f first-order

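stochastically (“cumulatively”) dominates act g with respect to ambiguity if and only if

\[ \int_{-\infty}^{x} \mathrm{E}[\varphi_f(z)] \, \mathrm{Var}[\varphi_f(z)] \, dz \;\le\; \int_{-\infty}^{x} \mathrm{E}[\varphi_g(z)] \, \mathrm{Var}[\varphi_g(z)] \, dz \]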

for any x ∈ X .
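The condition in Definition 3 can be checked numerically. The following is a minimal sketch under assumptions that are not in the text: outcomes are finite and ordered, so the integrals become cumulative sums, and the priors are treated as equally likely. The function names and the two example acts are hypothetical.

```python
from fractions import Fraction
from itertools import accumulate


def weighted_variances(priors):
    """Return E[phi(x_i)] * Var[phi(x_i)] for each outcome x_i.

    `priors` is a list of probability vectors over the same ordered outcomes.
    Treating the priors as equally likely is an assumption of this sketch.
    """
    n = len(priors)
    terms = []
    for probs_i in zip(*priors):  # probabilities of one outcome across priors
        mean = sum(probs_i) / n
        var = sum((p - mean) ** 2 for p in probs_i) / n
        terms.append(mean * var)
    return terms


def dominates(priors_f, priors_g):
    """Discrete analogue of first-order dominance with respect to ambiguity."""
    cum_f = accumulate(weighted_variances(priors_f))
    cum_g = accumulate(weighted_variances(priors_g))
    return all(a <= b for a, b in zip(cum_f, cum_g))


F = Fraction
# Hypothetical acts over outcomes ordered as (-1, 0, 1).
# Both have expected probabilities (0.35, 0.30, 0.35).
act_f = [(F("0.35"), F("0.30"), F("0.35"))]  # probabilities known
act_g = [(F("0.4"), F("0.2"), F("0.4")),     # two equally likely priors
         (F("0.3"), F("0.4"), F("0.3"))]

print(dominates(act_f, act_g))
print(dominates(act_g, act_f))
```

Act f has no dispersion in its probabilities, so its cumulative weighted variance is zero at every outcome and it dominates act g, but not the reverse.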

The notion of first-order stochastic dominance with respect to ambiguity allows for the definition

of the relation between the objective ordering of acts by stochastic dominance and the subjective

21 Particular forms of the elliptical distribution include the normal, Student-t, logistic, exponential power, and Laplace distributions.

22 Notice that, since probability values are bounded between 0 and 1, truncated elliptical distributions are considered.

23 Sarin and Wakker (1992) also extend the notion of stochastic dominance to uncertainty with respect to probabilities. They define “that an act f stochastically (“cumulatively”) dominates an act g if the DM regards each cumulative consequence set at least as likely under f as under g”.


ordering by the DM’s preferences concerning ambiguity. The following theorem settles this relation.

Theorem 4. Suppose f, g ∈ F0 are two primary acts under which the expected probabilities of each

consequence x ∈ X are identical. Act f first-order stochastically dominates act g if and only if

any ambiguity-averse DM weakly prefers f to g, i.e., W(f) ≥ W(g), under every increasing utility

function U and every twice-differentiable increasing concave outlook function Γ.

The idea of stochastic dominance with respect to ambiguity can be developed further to define

second-order stochastic dominance.

Definition 4. Let f, g ∈ F0 be two primary acts under which the expected probabilities of each conse-

quence x ∈ X are identical. Act f second-order stochastically (“cumulatively”) dominates act g with

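respect to ambiguity if and only if

\[ \int_{-\infty}^{x} \int_{-\infty}^{z} \mathrm{E}[\varphi_f(t)] \, \mathrm{Var}[\varphi_f(t)] \, dt \, dz \;\le\; \int_{-\infty}^{x} \int_{-\infty}^{z} \mathrm{E}[\varphi_g(t)] \, \mathrm{Var}[\varphi_g(t)] \, dt \, dz \]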

for any x ∈ X .

Notice that, similarly to stochastic dominance with respect to risk, first-order stochastic dominance

with respect to ambiguity implies a second-order stochastic dominance. As with first-order stochas-

tic dominance, it can be shown that there is a tight relationship between second-order stochastic

dominance with respect to ambiguity and preferences concerning ambiguity.

5 Ambiguity measurement

The well-defined ordering of events by their degree of ambiguity paves the way for defining an ordering

of acts by their degree of ambiguity, a necessary step toward extracting a measure of ambiguity (over

acts). To inspect the impact of ambiguity, this ordering is performed over acts with identical properties

except for their degree of ambiguity. That is, they have the same set of possible consequences with

the same expected probability (implying the same risk) such that the only difference between them

is the dispersion of probabilities around their expectation. The ordering of acts by their degree of

ambiguity, as revealed from the DM’s subjective choices, can then be defined as follows.

Definition 5. Let f, g ∈ F0 be two primary acts whose expected probabilities of any given consequence

x ∈ X are identical. Act g is more ambiguous than act f if and only if

\[ f \succsim^1 g, \]

by any ambiguity-averse DM.


To validate a measure of ambiguity, it has to be shown that ordering acts by the proposed measure

coincides with the ordering provided by a DM. The next theorem proposes a new measure of ambiguity.

It asserts that the degree of ambiguity associated with an act can be measured by the expected

volatility of its related probabilities.

Theorem 5. Assume an ambiguity-averse DM whose preferences satisfy the conditions of Theorem 2

and whose reference point is k = −∞. Then

\[ f \succsim^1 g \iff \mho^2[f] \le \mho^2[g], \]

where

\[ \mho^2[f] = 4 \int_{\mathcal{X}} \mathrm{E}[\varphi_f(x)] \, \mathrm{Var}[\varphi_f(x)] \, dx, \]

f, g ∈ F0 are primary acts under which the expected probabilities of each consequence x ∈ X are

identical, and where the probabilities of each consequence x ∈ X are uniformly or elliptically distributed

(with the same characteristic generator).

This theorem ties the measure of ambiguity, denoted ℧² (mho squared), to preferences concerning ambiguity.

The idea that ambiguity, the uncertainty about probabilities, takes the form of probability

perturbations and that aversion to ambiguity takes the form of aversion to mean-preserving spreads

in probabilities underpins the construction of f2. Thereby, just as the degree of risk can be measured

by the volatility of outcomes, so too can the degree of ambiguity be measured by the volatility of

probabilities. Theorem 5 proves that if two acts are identical except in their degree of ambiguity, then

any ambiguity-averse DM prefers the act with the lower f2 over the act with the higher f2. That

is, she prefers the act whose associated probabilities are on average less volatile (less spread) over

the act whose associated probabilities are on average more volatile (more spread).24 Therefore, the

measure ℧² aggregates the variances of probabilities, which measure the dispersion of the probability

of each outcome, while weighting the variance of the probability of each outcome by

its expected probability.

Theorem 5 assumes that the DM’s reference point is k = −∞, which means that all outcomes are

considered favorable such that all outcomes are assigned a positive utility. This can be viewed

as if the utility function is normalized such that the minimal utility is 0. Note that when the utility is

always positive the Choquet expected utility of Schmeidler (1989) is obtained. This assumption can be

replaced by the assumption that the outcomes of acts are symmetrically distributed. Such a symmetry

in a framework with uncertain probabilities is defined as follows.

24 Jewitt and Mukerji (2011), for example, study the ranking of ambiguous acts as revealed by preferences, based upon the smooth model of Klibanoff et al. (2005).


Definition 6. The outcomes of an act f ∈ F0 are said to be symmetrically distributed around a point

of symmetry k if

E [φf (k − x)] = E [φf (k + x)] and Var [φf (k − x)] = Var [φf (k + x)]

for any x ∈ X .

With this definition of symmetry in place, Theorem 5 can be restated. It is important to note

that for measuring ambiguity by the following theorem, the more restrictive assumption of normally

distributed outcomes, which allows measuring risk by variance, can be relaxed to only symmetry.

Theorem 6. Assume an ambiguity-averse DM whose preferences satisfy the conditions of Theorem 2.

Then

\[ f \succsim^1 g \iff \mho^2[f] \le \mho^2[g], \]

where f, g ∈ F0 are primary acts whose outcomes are symmetrically distributed around a reference

point k, with identical expected probabilities of each consequence x ∈ X , and where the probabilities

of each consequence x ∈ X are uniformly or elliptically distributed (with the same characteristic

generator).

The measure of ambiguity ℧² carries the units of squared probabilities. A normalized (to the

units of probability) measure can then be simply defined by

\[ \mho[f] = 2 \sqrt{ \int_{\mathcal{X}} \mathrm{E}[\varphi_f(x)] \, \mathrm{Var}[\varphi_f(x)] \, dx }. \]

The measure f2 in Theorems 5 and 6 considers acts taking infinitely many values in an infinite state

space. This measure, however, can be applied to acts taking finitely many values in a finite state

space. In this case it takes the form

\[ \mho^2[f] = 4 \sum_{i} \mathrm{E}[\varphi_f(x_i)] \, \mathrm{Var}[\varphi_f(x_i)]. \]

This measure can also be applied to any nonempty subset Y ⊂ X ⊆ R of consequences

\[ \mho^2[f, \mathcal{Y}] = 4 \int_{\mathcal{Y}} \mathrm{E}[\varphi_f(y)] \, \mathrm{Var}[\varphi_f(y)] \, dy, \]

or to any given event E ∈ E

\[ \mho^2[f, E] = 4 \int_{f^{-1}(x) \in E} \mathrm{E}[\varphi_f(x)] \, \mathrm{Var}[\varphi_f(x)] \, dx. \]

Similarly to risk, stochastic dominance (with respect to ambiguity) is closely related to ambiguity

measurement. The next proposition ties the notion of stochastic dominance with respect to ambiguity

and the measure of ambiguity f2.


Proposition 3. Suppose that the conditions of Theorem 5 or of Theorem 6 are satisfied. Let f, g ∈ F0

be two primary acts under which the expected probabilities of each consequence x ∈ X are identical

and are uniformly or elliptically distributed (with the same characteristic generator). Then, g is more

ambiguous than f , i.e., f2 [g] ≥ f2 [f ], if and only if g is first-order stochastically dominated by f

with respect to ambiguity.

This proposition implies that if for any consequence the cumulative probability uncertainty associated

with act f is preferred (by an ambiguity-averse DM) to the cumulative probability uncertainty

associated with act g, then g is more ambiguous than f . It shows that the ordering of acts by first-

order stochastic dominance with respect to ambiguity coincides with their ordering by the degree of

ambiguity, measured by f2. The next section further investigates the special properties of f2.

6 Properties of f2

It is worth opening this section with an example that demonstrates some properties of the measure f2.

Consider, first, a large urn with 30 colored balls which are either black or yellow, in an unknown

proportion. Assume that drawing a black ball (B) entitles the DM to a sum of $0, and a yellow

ball (Y ) entitles her to a sum of $1. The probability of B (an unfavorable event) can be one of the

values 0/30, 1/30, ..., 30/30, where the DM is assumed to act as if each is equally likely. Thus, the degree of

ambiguity (in units of probability) is ℧ = 0.596. Now, consider a smaller urn with only 10 colored

balls which are either black or yellow, in an unknown proportion. The ambiguity associated with a

bet on the color of a ball drawn from this urn is higher than in the large urn; it is ℧ = 0.632. If

there is only one ball in the urn, of unknown color, then ℧ = 1, and in the other extreme case, if

there is an infinite number of balls in the urn, then ℧ = 1/√3 ≈ 0.577. Table 1 is a stylized description of these

variations. The perceived probabilities and the value of each alternative are computed, respectively,

by Equations (3) and (2), assuming a DM whose preferences concerning ambiguity are represented

by Γ (P (E)) = √(P (E)), her preferences concerning risk are represented by U (c) = 1 − e^{−c}, where c

stands for consumption, and her reference point is k = 0.25
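The degrees of ambiguity quoted for the urns can be reproduced from the finite form of the measure. The sketch below assumes, as the example does, that every composition of the urn is equally likely; the function name `urn_ambiguity` is hypothetical and introduced only for illustration.

```python
from fractions import Fraction
from math import sqrt


def urn_ambiguity(n_balls):
    """Ambiguity, in units of probability, of a two-colour urn of unknown composition.

    All n_balls + 1 compositions are treated as equally likely (an assumption
    taken from the example in the text).
    """
    # Possible probabilities of drawing black: 0/n, 1/n, ..., n/n.
    probs_black = [Fraction(j, n_balls) for j in range(n_balls + 1)]
    probs_yellow = [1 - p for p in probs_black]

    def mean(values):
        return sum(values) / len(values)

    def variance(values):
        m = mean(values)
        return sum((v - m) ** 2 for v in values) / len(values)

    # Finite form: mho^2 = 4 * sum_i E[phi(x_i)] * Var[phi(x_i)],
    # summed over the two outcomes (black, yellow).
    mho_squared = 4 * sum(mean(p) * variance(p) for p in (probs_black, probs_yellow))
    return sqrt(mho_squared)


for n in (30, 10, 1):
    print(n, round(urn_ambiguity(n), 3))
```

Increasing `n_balls` moves the result toward 1/√3, the value for the continuum of compositions.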

It can be observed that a larger number of possible probability values, which in this case are

uniformly spread over the interval [0, 1], implies a lower degree of ambiguity. To see the intuition for

this, notice that, since the consequence of the favorable (unfavorable) event is identical for all bets,

when the DM makes her choice over urns, she actually bets on the composition of the urn rather

than on the consequence. Suppose she chooses to bet on the 10-ball urn and that her probability

(proportion of balls) assessment is wrong. The minimal size of her error (in terms of probability) is

25 The utility function U is normalized such that U (k) = 0 when k = 0.


#Balls
Total   Y            B            P                         Q       V       ℧
30      15           15           15/30                     0.500   0.316   0.000
∞       0, ..., ∞    0, ..., ∞    0, ..., 1                 0.445   0.281   0.577
30      0, ..., 30   0, ..., 30   0/30, 1/30, ..., 30/30    0.435   0.275   0.596
10      0, ..., 10   0, ..., 10   0/10, 1/10, ..., 10/10    0.417   0.263   0.632
1       0, 1         0, 1         0, 1                      0.250   0.158   1.000

Table 1: Degrees of ambiguity

1/10. If, however, she chooses to bet on the 30-ball urn and her probability assessment is wrong, the

minimal size of her error is only 1/30. Accordingly, the degree of ambiguity of the 30-ball urn is lower

than the degree of ambiguity of the 10-ball urn. The next observation defines the property that arises

from this intuition formally.

Observation 1. Suppose a finite set of probability measures P such that the probability density φ (x)

of each x ∈ X is uniformly distributed over [ax, bx]. Then, the higher the cardinality n of P, the lower

the degree of ambiguity f2 [f ] of any f ∈ F0.

The next observation identifies the range of possible values that the ambiguity measure f2 can

obtain.

Observation 2. The values of the ambiguity measure f2 are always between 0 and 1.

The minimal possible degree of ambiguity, f2 = 0, is attained when all probabilities are perfectly

known. The maximal possible degree of ambiguity, f2 = 1, is attained when there are two possible

outcomes and the probability of each is either 0 or 1 with equal odds. In this most extreme case, the

weighted cumulative variance of probabilities attains its maximal possible value, 1/4; see Observation 2.

Variances of probabilities are therefore normalized by 4 to provide an ambiguity measure ranging

between 0 and 1.

It is important to note that f2 is an objective measure of ambiguity. It does not depend upon

the reference point, k, which determines the sets of unfavorable and favorable events. It also does not

depend upon the DM’s subjective preferences. However, the most important property of f2 is stake

independency. Given an event, its degree of ambiguity, measured by f2, is invariant to the consequence

of this event. That is, given an event, changing its associated (by an act) consequence does not affect

the degree of ambiguity of this event. Consider, for example, an event with an unknown probability of

winning $100. Changing the magnitude of gain to $1000 does not affect either its perceived probability


or its degree of ambiguity. This property of stake independency is of primary importance, as it allows

for the measurement of ambiguity independently of risk.

An interesting property of the measure f2 is that ambiguity may be “canceled out”.26 This

can happen when composing a “portfolio” of acts. To see this, consider the two binary acts f = (E : 1, E^C : 2)

and g = (E^C : 1, E : 2), where E^C stands for the complementary event of E. Even if

separately each act has a strictly positive degree of ambiguity, a portfolio consisting of only these two

acts has a zero degree of ambiguity (and in this extreme case, also a zero degree of risk). The reason

is that E ∪ E^C = S and the probability P (S) of S is always exactly one, which in turn implies that

the degree of ambiguity of the entire state space (measured by 4Var [P (S)]), as well as the degree

of ambiguity of an empty subset of the state space, is zero. In a less extreme scenario, consider two

acts f = (E^C : 0, E : x) and g = (F^C : 0, F : x), where E and F are mutually exclusive events. In

this case, the ambiguity associated with the outcome x may be lower under the portfolio {f, g} than

under each act f or act g separately. This may happen when the possible probabilities of events E

and F are negatively correlated. To see this, note that the ambiguity associated with x under {f, g} can be written

\[ \mho^2[\{f,g\}, x] = \mho^2[\{f,g\}, E \cup F] = 4\,\mathrm{Var}[P_f(E)] + 4\,\mathrm{Var}[P_g(F)] + 8\,\mathrm{Cov}[P_f(E), P_g(F)] = \mho^2[f,E] + \mho^2[g,F] + 8\,\mathrm{Cov}[P_f(E), P_g(F)]. \]

An important conclusion that arises from this example is that

unpacking an event E ∪ F into disjoint components E and F with different outcomes increases its

cumulative ambiguity when Cov [P (E) ,P (F )] < 0.27 Note that the ambiguity associated with a union

of an event and its complementary event is always zero, since the probability of an event is perfectly

negatively correlated with the probability of its complementary event; see Lemma 1.
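The decomposition used in this argument can be checked on a small numerical case. The sketch below uses three hypothetical, equally likely priors over the mutually exclusive events E and F (the numbers are illustrative, not from the text) and follows the paragraph in measuring the ambiguity of an event by four times the variance of its probability.

```python
from fractions import Fraction as F


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical equally likely priors, given as (P(E), P(F)) pairs.
priors = [(F(1, 10), F(1, 2)), (F(3, 10), F(2, 5)), (F(1, 2), F(1, 5))]
p_e = [p for p, _ in priors]
p_f = [q for _, q in priors]
p_union = [p + q for p, q in priors]  # E and F are mutually exclusive

packed = 4 * variance(p_union)                     # ambiguity of E ∪ F as one event
unpacked = 4 * variance(p_e) + 4 * variance(p_f)   # sum over the separate events
cross_term = 8 * covariance(p_e, p_f)

print(packed, unpacked, cross_term)

# An event and its complement: the probabilities always sum to one.
p_complement = [1 - p for p in p_e]
union_with_complement = 4 * variance([p + q for p, q in zip(p_e, p_complement)])
print(union_with_complement)
```

Because the two probabilities move in opposite directions across priors, the cross term is negative and the packed event carries less ambiguity than the sum of its parts; for an event and its complement the ambiguity vanishes entirely.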

Ellsberg’s three-color experiment can also be viewed as demonstrating the effect that unpacking

events has on the degree of ambiguity. In this experiment the DM is presented with an urn. She

is told that the urn contains 90 colored balls, 30 of them red and the others either black or yellow

in an unknown proportion. A ball will be drawn from the urn at random and the prize for a correct

bet is $100. The experiment consists of two parts. In the first part, the DM has to choose between

two bets: the next drawn ball is red (R), or the next drawn ball is black (B), formed respectively by

acts f and g. Then, in the second part, the DM has to choose between betting that the next drawn

ball is red or yellow (RY ) or, alternatively, that the next drawn ball is black or yellow (BY ), formed

respectively by acts f∗ and g∗. The DM in this example does not have any information indicating

which of the possible urn compositions (probabilities) is more likely, and thus she acts as if she assigns

an equal weight to each possibility. The following table formalizes this experiment in terms of acts

26 This notion coincides with Epstein and Zhang’s (2001) and Siniscalchi’s (2009) notion of complementarity.

27 Support theory, of Tversky and Koehler (1994) and Rottenstreich and Tversky (1997), documents that the judged probability of an event generally increases when its description is unpacked into disjoint components and decreases by unpacking its alternative description.


and summarizes the degree of ambiguity associated with each event and each act. The ambiguity of

the events with the high payoff that are relevant to each act are underlined.

        Prize ($)            Event ℧                                   Act ℧
Act     R     Y     B        R       Y       B       RY      BY
f       100   0     0        0.000   0.233   0.233   0.329   0.000     0.000
g       0     0     100      0.000   0.233   0.233   0.329   0.000     0.584
f∗      100   100   0        0.000   0.233   0.233   0.329   0.000     0.584
g∗      0     100   100      0.000   0.233   0.233   0.329   0.000     0.000

Table 2: Ellsberg’s three-color experiment

Behavioral experiments have demonstrated that individuals usually prefer R over B and BY over

RY ; formally, f ≿¹ g and g∗ ≿¹ f∗.28 It can be observed from Table 2 that, aligned with Theorem 6,

℧ [f ] < ℧ [g] and ℧ [g∗] < ℧ [f∗]. This means that DMs usually prefer the less ambiguous bet, implying

ambiguity-averse behavior. Table 2 demonstrates that under act g event BY is unpacked into

events B and Y such that the ambiguity associated with act g is higher than that associated with act

f . Under act f∗ event BY can also be viewed as unpacked such that the ambiguity associated with

f∗ is higher than that associated with act g∗.

The ambiguity measure f2, extracted in Theorems 5 and 6, measures the degree of ambiguity

at the highest possible accuracy. It measures the volatility of probabilities in the resolution of each

possible outcome separately. Sometimes such a resolution is not required and a simpler measure can

be defined at the expense of accuracy. For example, at the expense of a loss of some information, the

measure of ambiguity can be applied over only the two fundamental events: unfavorable and favorable

events. That is,

\[ \mho^2[f] = 4\,\mathrm{Var}[P_f(UF)] = 4\,\mathrm{Var}[P_f(FV)], \]

where Pf (UF ) is the probability of the unfavorable event UF = {s ∈ S | f(s) ≤ k} under act f , and

Pf (FV ) is the probability of the favorable event FV = {s ∈ S | f(s) > k} under act f .

7 Alternative measures of ambiguity

Since the seminal works of Knight (1921) and Ellsberg (1961) several attempts have been made to

define a measure of ambiguity. Hansen et al. (1999), Hansen and Sargent (2001) and Maccheroni et

al. (2006), for example, interpret relative entropy as a measure of ambiguity (or of model uncertainty).

28 In expected utility theory, the DM’s assessments of the likelihoods of R, B and Y can be described by some probability measure P. The DM is assumed to prefer a greater chance of winning $100 to a smaller chance of winning $100, such that the choices above imply that P (R) > P (B) and P (B ∪ Y ) > P (R ∪ Y ). However, since R, B and Y are mutually exclusive events, no such conventional probability measure exists; hence, it is considered a paradox.


Relative entropy, also called the Kullback-Leibler distance, is measured by the deviation of a proba-

bility distribution from a reference distribution (reference model). Formally, the relative entropy of

probability distribution P with respect to distribution Q is defined by

\[ D_{KL}(P \,|\, Q) = \int_{-\infty}^{\infty} p(x) \ln \frac{p(x)}{q(x)} \, dx, \]

where p and q are respectively the probability densities of P and Q. While the use of relative entropy

is restricted to cases of a single prior relative to a known true probability distribution, f2 can be

employed in cases of multiple priors where either a single true probability distribution does not exist

or it is not known.

Sometimes the literature takes the variance of variance or the variance of mean as measures of

ambiguity (see, for example, Maccheroni et al. (2013)). The measure f2 is broader than either

of these measures in that it accounts for both, as well as for the variance of all higher moments

of the probability distribution (i.e., skewness, kurtosis, etc.) through the variance of probabilities.

Furthermore, f2 solves some major issues that arise from the exclusive use of either the variance of

variance or the variance of mean as measures of ambiguity.

To illustrate a major drawback of using the variance of mean as a measure of ambiguity, consider

the following two bets: a bet A with the outcomes x = (−1, 0, 1) and, respectively, probabilities

P = (0.5, 0, 0.5), and a bet B with the same outcomes, but with two equally likely possible probability

distributions P1 = (0.4, 0.2, 0.4) and P2 = (0.3, 0.4, 0.3). The expected outcome of bet A is EA [x] = 0.

The expected outcome of bet B is either EB [x |P1] = 0 or EB [x |P2] = 0, respectively contingent

upon the probability distributions P1 and P2. Measuring ambiguity by the variance of mean indicates

that both A and B have a zero degree of ambiguity, i.e., both are unambiguous. However, by definition

(and as f2 indicates), B, which has a positive degree of ambiguity, is more ambiguous than A, which

clearly is unambiguous.

The use of the variance of variance as a measure of ambiguity also bears a major drawback.

To illustrate this, one can take the following example: a bet A with the outcomes x = (−1, 0, 1)

and, respectively, probabilities P = (0.48, 0.04, 0.48), and a bet B with the same outcomes but with

two equally likely possible probability distributions P1 = (0.6, 0, 0.4) and P2 = (0.4, 0, 0.6). The

variance of the outcomes of bet A is VarA [x] = 0.96. The variance of the outcomes of bet B is either

VarB [x |P1] = 0.96 or VarB [x |P2] = 0.96, respectively contingent upon the probability distributions

P1 and P2. Measuring ambiguity by the variance of variance indicates that both A and B have a zero

degree of ambiguity. However, by definition (and as ℧² indicates), B is more ambiguous than A.
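The two counterexamples can be verified directly. The sketch below recomputes, for bet B in each example, the variance of mean or the variance of variance alongside the finite form of ℧², treating the two probability distributions as equally likely, as the text states. The helper names are hypothetical.

```python
from fractions import Fraction as F

OUTCOMES = (-1, 0, 1)


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def outcome_mean(probs):
    """Expected outcome under one probability distribution."""
    return sum(p * x for p, x in zip(probs, OUTCOMES))


def outcome_variance(probs):
    """Variance of outcomes under one probability distribution."""
    m = outcome_mean(probs)
    return sum(p * (x - m) ** 2 for p, x in zip(probs, OUTCOMES))


def mho_squared(priors):
    """Finite form: 4 * sum_i E[phi(x_i)] * Var[phi(x_i)], equally likely priors."""
    return 4 * sum(mean(col) * variance(col) for col in zip(*priors))


# Bet B of the variance-of-mean example.
bet_b_first = [(F("0.4"), F("0.2"), F("0.4")), (F("0.3"), F("0.4"), F("0.3"))]
# Bet B of the variance-of-variance example.
bet_b_second = [(F("0.6"), F(0), F("0.4")), (F("0.4"), F(0), F("0.6"))]

variance_of_mean = variance([outcome_mean(p) for p in bet_b_first])
variance_of_variance = variance([outcome_variance(p) for p in bet_b_second])

print(variance_of_mean, mho_squared(bet_b_first))
print(variance_of_variance, mho_squared(bet_b_second))
```

In both cases the outcome-based statistic is zero while ℧² is strictly positive, which is the point of the two examples.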

Both variance of variance and variance of mean are functions of outcomes, which makes them

stake dependent. As such, neither of these two measures allows for the measurement of the degree


of ambiguity in isolation from the degree of risk. Consider, for example, an event with an unknown

probability of winning $100. One would expect that changing the magnitude of gain to $1000 affects

neither the perceived probability of that event nor its degree of ambiguity. This requirement, however,

is not satisfied by the variance of variance or by the variance of mean. Both will indicate that the

bet with the $1000 prize is more ambiguous than the bet with the $100 prize, even though both are

bets on the same event and the change of its associated outcome from $100 to $1000 does not provide

any new information about its likelihood. On the other hand, as a stake-independent measure of

ambiguity, f2 will indicate that both bets have the same degree of ambiguity. The reason is that,

while variance of variance and variance of mean are functions of outcomes, and therefore subject to

risk, f2 is solely a function of probabilities. This means that f2 is not affected by the magnitude

or the sign of consequences. Increasing or decreasing the consequences of an act does not change its

degree of ambiguity, but it does change its degree of risk. Stake independency is a major advantage

of f2. This property is of primary importance because it allows for the measurement of the degree

of ambiguity independently of risk, as well as for the detection of the implications of ambiguity in

isolation from risk, in empirical and behavioral studies.

The point to emphasize is that a decision-making process considers not only the degree of ambiguity

but also the degree of risk. Hence, when making choices, these two factors jointly play a role. For

example, a consolidated uncertainty measure that aggregates risk and ambiguity can be defined by

\[ \Upsilon(x) = \frac{\sqrt{\mathrm{Var}[x]}}{1 - \mho^2[x]}, \]

where the variance of outcomes Var [x] is taken using expected probabilities (see Izhakian (2012)).

Namely, the variance of outcomes is defined by

\[ \mathrm{Var}[x] = \int \mathrm{E}[\varphi(x)] \, \big( x - \mathrm{E}[x] \big)^{2} \, dx, \]

and the expected outcome is defined by the double expectation (with respect to probabilities and to

outcomes)

\[ \mathrm{E}[x] = \int \mathrm{E}[\varphi(x)] \, x \, dx. \]

8 Application for asset pricing

To demonstrate the qualities of f2, this section presents an application of the theory to asset pricing.

The prices that financial decision makers (investors) are willing to pay for assets could be affected

by the fact that they do not know the precise probabilities of future returns. They might require a

premium for bearing ambiguity in addition to the premium they require for bearing risk.

The risk premium can be viewed as the premium that a DM is willing to pay for exchanging a risky


bet for its expected outcome. The ambiguity premium can be viewed as the premium she is willing to

pay for exchanging an ambiguous bet for a risky but unambiguous bet that has an identical expected

outcome.29 The uncertainty premium can be viewed as the total premium that a DM is willing to pay

for exchanging an ambiguous bet for its expected outcome, i.e., the accumulation of the risk premium

and the ambiguity premium. In this view, the uncertainty premium, denoted K, can be defined by

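\[
\begin{aligned}
U(\mathrm{E}[x] - K) \approx{} & \int_{-\infty}^{k} U(x) \left( \mathrm{E}[\varphi(x)] - \frac{\Gamma''(1 - \mathrm{E}[P(x)])}{\Gamma'(1 - \mathrm{E}[P(x)])} \, \mathrm{E}[\varphi(x)] \, \mathrm{Var}[\varphi(x)] \right) dx \\
& + \int_{k}^{\infty} U(x) \left( \mathrm{E}[\varphi(x)] + \frac{\Gamma''(1 - \mathrm{E}[P(x)])}{\Gamma'(1 - \mathrm{E}[P(x)])} \, \mathrm{E}[\varphi(x)] \, \mathrm{Var}[\varphi(x)] \right) dx,
\end{aligned}
\tag{5}
\]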

where x is the outcome of some act f and φ (x) is its probability (under act f); and C = E [x] − K

is the certainty equivalent satisfying C ∼1 f . That is, C is the constant sure outcome for which

the DM is willing to exchange a risky and ambiguous (uncertain) outcome of act f . The next theo-

rem approximates the uncertainty premium and separates it into a risk premium and an ambiguity

premium.30

Theorem 7. Assume a DM whose preferences are characterized by a twice-differentiable utility func-

tion U and a twice-differentiable outlook function Γ. For relatively small outcomes with relatively small

probabilities the uncertainty premium is

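\[ K \approx -\frac{1}{2} \frac{U''(\mathrm{E}[x])}{U'(\mathrm{E}[x])} \, \mathrm{Var}[x] \;-\; \mathrm{E}\!\left[ \frac{\Gamma''(1 - \mathrm{E}[P(x)])}{\Gamma'(1 - \mathrm{E}[P(x)])} \right] \mathrm{E}\big[ |x - \mathrm{E}[x]| \big] \, \mho^2[x], \tag{6} \]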

where the former is the risk premium and the latter is the ambiguity premium.31

Concerning financial decisions, consequences can be described by rates of return, denoted r. Assume

a DM who decides to save one unit of wealth and invest it in an uncertain (risky and ambiguous)

portfolio. The uncertainty premium in this case takes the following form.32

Corollary 2. Suppose that the conditions of Theorem 7 hold. The uncertainty premium, in terms of

rate of return, takes the form

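\[ K \approx -\frac{1}{2} \frac{U''(1 + \mathrm{E}[r])}{U'(1 + \mathrm{E}[r])} \, \mathrm{Var}[r] \;-\; \mathrm{E}\!\left[ \frac{\Gamma''(1 - \mathrm{E}[P(r)])}{\Gamma'(1 - \mathrm{E}[P(r)])} \right] \mathrm{E}\big[ |r - \mathrm{E}[r]| \big] \, \mho^2[r]. \tag{7} \]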

This model (and Theorem 7) provides two distinctions. First, it distinguishes between risk and

ambiguity premiums such that these two premiums are orthogonal. Second, within each premium it

29 The ambiguity premium can also be viewed as the price that a DM is willing to pay for the information about the true probabilities of events.

30 The proof of this theorem applies the same methodology to probabilities as used by Arrow (1965) and Pratt (1964) for consequences.

31 Formally,

\[ \mathrm{E}\!\left[ \frac{\Gamma''(1 - \mathrm{E}[P(x)])}{\Gamma'(1 - \mathrm{E}[P(x)])} \right] = \int_{\mathcal{X}} \mathrm{E}[\varphi(x)] \, \frac{\Gamma''(1 - \mathrm{E}[P(x)])}{\Gamma'(1 - \mathrm{E}[P(x)])} \, dx \quad \text{and} \quad \mathrm{E}\big[ |x - \mathrm{E}[x]| \big] = \int_{\mathcal{X}} \mathrm{E}[\varphi(x)] \, |x - \mathrm{E}[x]| \, dx. \]

32 This representation is obtained by applying the same development of Theorem 7, where x = 1 + r.


distinguishes between the sources of premiums—preferences and beliefs. The risk premium,

\[ R \approx -\frac{1}{2} \frac{U''(1 + \mathrm{E}[r])}{U'(1 + \mathrm{E}[r])} \, \mathrm{Var}[r], \]

is the Arrow-Pratt risk premium. Independently, a higher risk, measured by Var [r], or a higher

aversion to risk, measured by the coefficient of absolute risk aversion −U′′(·)/U′(·), results in a greater risk

premium.

The ambiguity premium,

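\[ A \approx -\mathrm{E}\!\left[ \frac{\Gamma''(1 - \mathrm{E}[P(r)])}{\Gamma'(1 - \mathrm{E}[P(r)])} \right] \mathrm{E}\big[ |r - \mathrm{E}[r]| \big] \, \mho^2[r], \]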

possesses attributes resembling those of the risk premium, but with respect to probabilities rather

than to consequences. A complete separation between ambiguity, measured by ℧², and tastes for

ambiguity, measured by the coefficient of absolute ambiguity aversion −Γ′′(·)/Γ′(·), is achieved. Ambiguity

aversion (−Γ′′(·)/Γ′(·) > 0) implies a positive ambiguity premium. Ambiguity loving (−Γ′′(·)/Γ′(·) < 0) implies

a negative premium. Ambiguity neutrality (−Γ′′(·)/Γ′(·) = 0) implies a zero premium, obtained also when

probabilities are perfectly known (i.e., when ℧² = 0). A higher degree of ambiguity or a higher aversion

to ambiguity results in a greater ambiguity premium. The ambiguity premium is also a function of

the expected absolute deviation of outcomes from expectation. This component scales the ambiguity

premium to the units of outcomes. For example, one may consider the case of measuring the ambiguity

premium in terms of dollars versus in terms of percentage rate of return.

The next corollary shows the different premiums in the case of a DM typified by constant relative

risk aversion (CRRA) and CAAA.

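Corollary 3. Suppose that the conditions of Theorem 7 hold, and assume a DM who is characterized

by CRRA,

\[ U(c) = \begin{cases} \dfrac{c^{1-\gamma} - k^{1-\gamma}}{1-\gamma}, & \gamma \neq 1, \\[1ex] \ln(c) - \ln(k), & \gamma = 1, \end{cases} \]

and CAAA, Γ (P (E)) = −e^{−ηP (E)}/η.33 The uncertainty premium is then

\[ K \approx \frac{\gamma}{2} \, \mathrm{Var}[r] + \eta \, \mathrm{E}\big[ |r - \mathrm{E}[r]| \big] \, \mho^2[r]. \]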

³³ A more standard formulation of CRRA, $U(c) = \frac{c^{1-\gamma}}{1-\gamma}$ for $\gamma \neq 1$ and $U(c) = \ln(c)$ for $\gamma = 1$, is not always normalized to U (k) = 0.

Several studies have documented ambiguity-averse behavior concerning gains (favorable events) and ambiguity-loving behavior concerning losses (unfavorable events); see, for example, Maffioletti and Santoni (2005), Abdellaoui et al. (2005), and Du and Budescu (2005). The ambiguity premium constructed in Corollary 2 can be refined to support different ambiguity preferences concerning unfavorable and favorable events. Allowing this flexibility, the ambiguity premium takes the form

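$$A \approx -\left[\int_{-\infty}^{k} E[\varphi(r)]\,\frac{\Gamma''_{UF}(1-E[P(r)])}{\Gamma'_{UF}(1-E[P(r)])}\,dr + \int_{k}^{\infty} E[\varphi(r)]\,\frac{\Gamma''_{FV}(1-E[P(r)])}{\Gamma'_{FV}(1-E[P(r)])}\,dr\right] E\big[\,|r-E[r]|\,\big]\, f^2[r],$$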

where ΓUF (·) captures ambiguity preferences concerning unfavorable events and ΓFV (·) captures

ambiguity preferences concerning favorable events.

The implications of ambiguity for the equity premium have been studied mainly from a theoretical perspective. Chen and Epstein (2002), Izhakian and Benninga (2011), Ui (2011), and Maccheroni et al. (2013) add an ambiguity premium to the conventional risk premium.³⁴ In these models the ambiguity premium is also a function of risk attitude, whereas in the model of Equation (7) the ambiguity premium is independent of risk attitude.

The pricing model of Equation (7) has been tested empirically by Brenner and Izhakian (2011).

This study of the risk–ambiguity–return relationship employs the measure of ambiguity, f2, as an

explanatory factor of the aggregate return on the stock market. To do so, it assumes that each

subset of stock returns is generated by the choices of a single representative DM conditional upon a

different prior P within her subjective set of priors P.³⁵ The probability distribution of returns in

each subset is then estimated to reveal the set of priors. Assuming some structure on second-order

beliefs, Brenner and Izhakian (2011) compute f2 from the data and investigate its effect on stock

market returns. They find that ambiguity has a significant impact on expected returns. Their study

provides a possible explanation for the equity premium puzzle, demonstrating that f2 can be useful in

empirical studies of the implications of ambiguity.

9 Conclusion

Almost any real-life decision entails ambiguity. Naturally, one of the first steps of a decision-making

process is to rank alternative choices by their degree of ambiguity. The key to addressing this need

is a simple well-defined measure of ambiguity. The search for such a measure that can quantify the

degree of ambiguity associated with different alternatives can be viewed as having started with the

seminal study of Knight (1921). The measure of ambiguity introduced in this paper aims to address

this need. Ambiguity in this paper takes the form of probability perturbation (uncertain probabilities)

and aversion to ambiguity the form of aversion to mean-preserving spreads in these probabilities. In

this view, just as the degree of risk can be measured by the volatility of outcomes, so too can the

degree of ambiguity be measured by the volatility of probabilities. This concept provides a natural

objective stake-independent ambiguity measure, denoted f2, which is simply four times the expected volatility of probabilities across the relevant events.

³⁴ Segal and Spivak (1990) also analyze the ambiguity premium, which they call a premium of order 2.

³⁵ A representative DM can be defined as an artificial DM whose tastes and beliefs are such that, if all investors in the economy had tastes and beliefs identical to hers, the equilibrium in the economy would remain unchanged; see, for example, Constantinides (1982).

The measure of ambiguity f2 has two main qualities. First, it is simple, applicable and can be

used for the empirical measurement of the degree of ambiguity. Second, it is an objective stake-

independent measure. That is, it is independent of risk and independent of individuals’ preferences.

These qualities are of primary importance for introducing ambiguity into theoretical, behavioral and,

especially, empirical studies. The importance of ambiguity (the uncertainty about probabilities) for understanding economic and financial decision processes has been recognized in the literature for the past half century. Relevant studies have acknowledged that attempts to portray a realistic picture of observable phenomena and anomalies should also consider the dimension of uncertainty with respect to probabilities. Accounting for ambiguity might shed light on many economic and financial phenomena that previously could not be fully explained. The measure of ambiguity introduced in this paper can be employed for this purpose. For example, it can be employed for investigating the nature of the risk-ambiguity relationship and its implication for optimal decision making. Hopefully, this measure will pave the way not only for the introduction of ambiguity into empirical studies, but also for the expansion of theoretical and behavioral studies regarding the nature of ambiguity and related preferences.


References

Abdellaoui, M., F. Vossmann, and M. Weber (2005) “Choice-Based Elicitation and Decomposition of Decision Weights for Gains and Losses Under Uncertainty,” Management Science, Vol. 51, No. 9, pp. 1384–1399.

Anderson, E. W., E. Ghysels, and J. L. Juergens (2009) “The Impact of Risk and Uncertainty on Expected Returns,” Journal of Financial Economics, Vol. 94, No. 2, pp. 233–263.

Arrow, K. J. (1965) Aspects of the Theory of Risk Bearing, Helsinki: Yrjo Jahnssonin Saatio.

Bewley, T. F. (2011) “Knightian Decision Theory and Econometric Inferences,” Journal of Economic Theory, Vol. 146, No. 3, pp. 1134–1147.

Bollerslev, T., R. F. Engle, and J. M. Wooldridge (1988) “A Capital Asset Pricing Model with Time-Varying Covariances,” Journal of Political Economy, Vol. 96, No. 1, pp. 116–131.

Bollerslev, T., N. Sizova, and G. Tauchen (2011) “Volatility in Equilibrium: Asymmetries and Dynamic Dependencies,” Review of Finance, Vol. 16, No. 1, pp. 31–80.

Boyle, P. P., L. Garlappi, R. Uppal, and T. Wang (2011) “Keynes Meets Markowitz: The Tradeoff Between Familiarity and Diversification,” Management Science, Vol. 58, pp. 1–20.

Brenner, M. and Y. Izhakian (2011) “Asset Prices and Ambiguity: Empirical Evidence,” Stern School of Business, Finance Working Paper Series, FIN-11-010.

Brenner, M. and Y. Izhakian (2012) “Pricing Systematic Ambiguity in Capital Markets,” Stern School of Business, Finance Working Paper Series, FIN-12-008.

Chen, Z. and L. Epstein (2002) “Ambiguity, Risk, and Asset Returns in Continuous Time,” Econometrica, Vol. 70, No. 4, pp. 1403–1443.

Constantinides, G. M. (1982) “Intertemporal Asset Pricing with Heterogeneous Consumers and without Demand Aggregation,” The Journal of Business, Vol. 55, No. 2, pp. 253–67.

Coval, J. D. and T. J. Moskowitz (1999) “Home Bias at Home: Local Equity Preference in Domestic Portfolios,” The Journal of Finance, Vol. 54, No. 6, pp. 2045–2073.

Dow, J. and S. R. d. C. Werlang (1992) “Uncertainty Aversion, Risk Aversion, and the Optimal Choice of Portfolio,” Econometrica, Vol. 60, No. 1, pp. 197–204.

Du, N. and D. V. Budescu (2005) “The Effects of Imprecise Probabilities and Outcomes in Evaluating Investment Options,” Management Science, Vol. 51, No. 12, pp. 1791–1803.

Ellsberg, D. (1961) “Risk, Ambiguity, and the Savage Axioms,” Quarterly Journal of Economics, Vol. 75, No. 4, pp. 643–669.

Epstein, L. G. and S. Ji (2013) “Ambiguous Volatility and Asset Pricing in Continuous Time,” Review of Financial Studies, Vol. 26, No. 7, pp. 1740–1786.

Epstein, L. G. and M. Schneider (2008) “Ambiguity, Information Quality, and Asset Pricing,” The Journal of Finance, Vol. 63, No. 1, pp. 197–228.

Epstein, L. G. and J. Zhang (2001) “Subjective Probabilities on Subjectively Unambiguous Events,” Econometrica, Vol. 69, No. 2, pp. 265–306.

Fernandez-Villaverde, J., P. Guerron-Quintana, J. F. Rubio-Ramírez, and M. Uribe (2010) “Risk Matters: The Real Effects of Volatility Shocks,” American Economic Review, Vol. 101, pp. 2530–2561.

Gilboa, I. (1987) “Expected Utility with Purely Subjective Non-Additive Probabilities,” Journal of Mathematical Economics, Vol. 16, No. 1, pp. 65–88.

Gilboa, I. and D. Schmeidler (1989) “Maxmin Expected Utility with Non-Unique Prior,” Journal of Mathematical Economics, Vol. 18, No. 2, pp. 141–153.

Goetzmann, W. N. and A. Kumar (2008) “Equity Portfolio Diversification,” Review of Finance, Vol. 12, No. 3, pp. 433–463.

Goldberger, A. (1991) A Course in Econometrics: Harvard University Press, 1st edition.

Hansen, L. P. and T. J. Sargent (2001) “Robust Control and Model Uncertainty,” American Economic Review, Vol. 91, No. 2, pp. 60–66.

Hansen, L. P., T. J. Sargent, and T. D. Tallarini (1999) “Robust Permanent Income and Pricing,” The Review of Economic Studies, Vol. 66, No. 4, pp. 873–907.

Izhakian, Y. (2012) “Capital Asset Pricing under Ambiguity,” Stern School of Business, Economics Working Paper Series, ECN-12-02.

Izhakian, Y. (2014) “Expected Utility with Uncertain Probabilities Theory,” SSRN eLibrary, 2017944.

Izhakian, Y. and S. Benninga (2011) “The Uncertainty Premium in an Ambiguous Economy,” The Quarterly Journal of Finance, Vol. 1, pp. 323–354.

Jewitt, I. and S. Mukerji (2011) “Ordering Ambiguous Acts,” University of Oxford, Department of Economics, Economics Series Working Papers.

Ju, N. and J. Miao (2012) “Ambiguity, Learning, and Asset Returns,” Econometrica, Vol. 80, pp. 559–591.

Klibanoff, P., M. Marinacci, and S. Mukerji (2005) “A Smooth Model of Decision Making under Ambiguity,” Econometrica, Vol. 73, No. 6, pp. 1849–1892.

Knight, F. H. (1921) Risk, Uncertainty and Profit, Boston: Houghton Mifflin.

Maccheroni, F., M. Marinacci, and D. Ruffino (2013) “Alpha as Ambiguity: Robust Mean-Variance Portfolio Analysis,” Econometrica, Vol. 81, pp. 1075–1113.

Maccheroni, F., M. Marinacci, and A. Rustichini (2006) “Ambiguity Aversion, Robustness, and the Variational Representation of Preferences,” Econometrica, Vol. 74, No. 6, pp. 1447–1498.

Maffioletti, A. and M. Santoni (2005) “Do Trade Union Leaders Violate Subjective Expected Utility? Some Insights From Experimental Data,” Theory and Decision, Vol. 59, No. 3, pp. 207–253.

Mehra, R. and E. C. Prescott (1985) “The Equity Premium: A Puzzle,” Journal of Monetary Economics, Vol. 15, No. 2, pp. 145–161.

Owen, J. and R. Rabinovitch (1983) “On the Class of Elliptical Distributions and Their Applications to the Theory of Portfolio Choice,” The Journal of Finance, Vol. 38, No. 3, pp. 745–52.

Pratt, J. W. (1964) “Risk Aversion in the Small and in the Large,” Econometrica, Vol. 32, No. 1/2, pp. 122–136.

Rothschild, M. and J. E. Stiglitz (1970) “Increasing Risk: I. A Definition,” Journal of Economic Theory, Vol. 2, No. 3, pp. 225–243.

Rottenstreich, Y. and A. Tversky (1997) “Unpacking, Repacking, and Anchoring: Advances in Support Theory,” Psychological Review, Vol. 104, No. 2, pp. 406–415.

Sarin, R. K. and P. P. Wakker (1992) “A Simple Axiomatization of Nonadditive Expected Utility,” Econometrica, Vol. 60, No. 6, pp. 1255–1272.

Savage, L. J. (1954) The Foundations of Statistics, New York, USA: Wiley.

Schmeidler, D. (1989) “Subjective Probability and Expected Utility without Additivity,” Econometrica, Vol. 57, No. 3, pp. 571–587.

Segal, U. and A. Spivak (1990) “First Order Versus Second Order Risk Aversion,” Journal of Economic Theory, Vol. 51, No. 1, pp. 111–125.

Shiller, R. J. (1981) “Do Stock Prices Move Too Much to be Justified by Subsequent Changes in Dividends?” American Economic Review, Vol. 71, No. 3, pp. 421–436.

Siniscalchi, M. (2009) “Vector Expected Utility and Attitudes Toward Variation,” Econometrica, Vol. 77, No. 3, pp. 801–855.

Tversky, A. and D. Kahneman (1992) “Advances in Prospect Theory: Cumulative Representation of Uncertainty,” Journal of Risk and Uncertainty, Vol. 5, No. 4, pp. 297–323.

Tversky, A. and D. J. Koehler (1994) “Support Theory: A Nonextensional Representation of Subjective Probability,” Psychological Review, Vol. 101, pp. 547–567.

Ui, T. (2011) “The Ambiguity Premium vs. the Risk Premium under Limited Market Participation,” Review of Finance, Vol. 15, No. 2, pp. 245–275.

Uppal, R. and T. Wang (2003) “Model Misspecification and Under Diversification,” The Journal of Finance, Vol. 58, No. 1, pp. 2465–2486.

Wakker, P. and A. Tversky (1993) “An Axiomatization of Cumulative Prospect Theory,” Journal of Risk and Uncertainty, Vol. 7, No. 2, pp. 147–175.

Wakker, P. (2010) Prospect Theory: For Risk and Ambiguity: Cambridge University Press.

Weil, P. (1989) “The Equity Premium Puzzle and The Risk-Free Rate Puzzle,” Journal of Monetary Economics, Vol. 24, No. 3, pp. 401–421.

Appendix

Lemma 1. The covariance between the probability of event E and the probability of its complementary event $E^C$ satisfies

$$\mathrm{Cov}\big[P(E), P(E^C)\big] = -\mathrm{Var}[P(E)] = -\mathrm{Var}\big[P(E^C)\big],$$

where $\mathrm{Cov}\big[P(E), P(E^C)\big] = \int_{\mathcal{P}} \big(P(E) - E[P(E)]\big)\big(P(E^C) - E[P(E^C)]\big)\, d\chi$ is the covariance of the probabilities of events E and $E^C$.

Lemma 2. Assume a twice-differentiable outlook function Γ, satisfying

$$\frac{1}{2}\left(\frac{\Gamma''(E[P(F)])}{\Gamma'(E[P(F)])}\,\mathrm{Var}[P(F)] - \frac{\Gamma''(E[P(E\cup F)])}{\Gamma'(E[P(E\cup F)])}\,\mathrm{Var}[P(E\cup F)]\right) \le E[P(E)]$$

for any events E, F ⊆ S. Then

$$Q(F) \le Q(E \cup F).$$

Lemma 3.

$$\frac{d}{d\varphi(x)} \int_{-\infty}^{x} \varphi(z)\, dz = \frac{\varphi(x)}{\varphi'(x)}.$$

Lemma 4. Assume two secondary acts $\delta_E, \delta_F \in \Delta$, whose resulting probabilities are uniformly distributed or elliptically distributed with an identical characteristic generator, and have an identical expectation, i.e., E [P (E)] = E [P (F )]. Let Std [P (E)] and Std [P (F )] be, respectively, the standard deviations of their resulting probabilities. Then

$$P(F) - E[P(F)] \overset{d}{=} \lambda\,\big(P(E) - E[P(E)]\big),$$

where $\lambda = \frac{\mathrm{Std}[P(F)]}{\mathrm{Std}[P(E)]}$.

Lemma 5. Let Z and ϵ be two random variables. If ϵ is mean-independent of Z, then

E [Zϵ] = E [Z] E [ϵ] .

Lemma 6. Let Y and X be two random variables. If Y is mean-independent of X, then Y is also

mean-independent of Z = h(X), where h : R → R.

Lemma 7. The following mean-independencies hold:

(i) $(\varphi(x) - E[\varphi(x)])^2$ is mean-independent of $\varphi(x)$, implying that

$$E\big[\varphi(x)\,(\varphi(x) - E[\varphi(x)])^2\big] = E[\varphi(x)]\; E\big[(\varphi(x) - E[\varphi(x)])^2\big];$$

(ii) Var [φ (x)] is mean-independent of x, implying that E [x Var [φ (x)]] = E [x] E [Var [φ (x)]];

(iii) Var [φ (x)] is mean-independent of P (x), implying that E [P (x) Var [φ (x)]] = E [P (x)] E [Var [φ (x)]];

(iv) |x − E [x]| is mean-independent of E [P (x)], implying that

$$E\big[E[P(x)]\,|x - E[x]|\big] = E\big[E[P(x)]\big]\; E\big[\,|x - E[x]|\,\big].$$

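Lemma 8. If Y is mean-independent of X, and $\varphi_Y$, $\varphi_X$, $\varphi_{Y,X}$ exist, then

$$\int_{-\infty}^{k}\int_{-\infty}^{k} \varphi_{Y,X}(y,x)\, y x\, dy\, dx = \int_{-\infty}^{k} \varphi_Y(y)\, y\, dy \int_{-\infty}^{k} \varphi_X(x)\, x\, dx$$

and

$$\int_{k}^{\infty}\int_{k}^{\infty} \varphi_{Y,X}(y,x)\, y x\, dy\, dx = \int_{k}^{\infty} \varphi_Y(y)\, y\, dy \int_{k}^{\infty} \varphi_X(x)\, x\, dx,$$

for any k ∈ R.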

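Proof of Lemma 1. Since P (E) is additive, $P(E^C) = 1 - P(E)$. Thus, the covariance between P (E) and $P(E^C)$ can be written

$$\begin{aligned} \mathrm{Cov}\big[P(E), P(E^C)\big] &= \int_{\mathcal{P}} \big(P(E) - E[P(E)]\big)\big(P(E^C) - E[P(E^C)]\big)\, d\chi \\ &= \int_{\mathcal{P}} \big(P(E) - E[P(E)]\big)\big(E[P(E)] - P(E)\big)\, d\chi, \end{aligned}$$

and therefore

$$\mathrm{Cov}\big[P(E), P(E^C)\big] = -\mathrm{Var}[P(E)].$$

The second equality is obtained by

$$\mathrm{Var}[P(E)] = \int_{\mathcal{P}} \big(P(E) - E[P(E)]\big)^2 d\chi = \int_{\mathcal{P}} \big(P(E^C) - E[P(E^C)]\big)^2 d\chi = \mathrm{Var}\big[P(E^C)\big].$$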
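As a quick numerical illustration of Lemma 1, the following sketch checks the identity for a finite set of priors. The three probabilities assigned to E and the second-order weights are hypothetical values chosen only for the example, and the helper names are not from the paper.

```python
def weighted_mean(values, weights):
    return sum(w * v for v, w in zip(values, weights))


def covariance(xs, ys, weights):
    """Covariance of xs and ys under the second-order weights."""
    mx, my = weighted_mean(xs, weights), weighted_mean(ys, weights)
    return sum(w * (x - mx) * (y - my) for x, y, w in zip(xs, ys, weights))


# Hypothetical second-order belief: three priors assign these probabilities to E.
p_event = [0.2, 0.5, 0.9]
chi = [0.3, 0.3, 0.4]                      # second-order probabilities of the priors
p_complement = [1.0 - p for p in p_event]  # additivity: P(E^C) = 1 - P(E)

cov = covariance(p_event, p_complement, chi)
var_event = covariance(p_event, p_event, chi)
var_complement = covariance(p_complement, p_complement, chi)

# Lemma 1: Cov[P(E), P(E^C)] = -Var[P(E)] = -Var[P(E^C)]
print(cov, -var_event, -var_complement)
```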

Proof of Lemma 2. By Theorem 1,

$$\begin{aligned} Q(E\cup F) - Q(F) &\approx E[P(E)] + E[P(F)] + \frac{1}{2}\,\frac{\Gamma''(E[P(E\cup F)])}{\Gamma'(E[P(E\cup F)])}\,\mathrm{Var}[P(E\cup F)] - E[P(F)] - \frac{1}{2}\,\frac{\Gamma''(E[P(F)])}{\Gamma'(E[P(F)])}\,\mathrm{Var}[P(F)] \\ &= E[P(E)] + \frac{1}{2}\,\frac{\Gamma''(E[P(E\cup F)])}{\Gamma'(E[P(E\cup F)])}\,\mathrm{Var}[P(E\cup F)] - \frac{1}{2}\,\frac{\Gamma''(E[P(F)])}{\Gamma'(E[P(F)])}\,\mathrm{Var}[P(F)], \end{aligned}$$

which is nonnegative by the lemma’s hypothesis.

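Proof of Lemma 3. Let u = φ (z); then changing the integration variable provides

$$\int_{-\infty}^{x} \varphi(z)\, dz = \int_{\varphi(-\infty)}^{\varphi(x)} \frac{u}{\varphi'(z)}\, du = \int_{\varphi(-\infty)}^{\varphi(x)} \frac{u}{\varphi'(\varphi^{-1}(u))}\, du.$$

Differentiating with respect to φ(x) gives

$$\frac{d}{d\varphi(x)} \int_{\varphi(-\infty)}^{\varphi(x)} \frac{u}{\varphi'(\varphi^{-1}(u))}\, du = \left.\frac{u}{\varphi'(\varphi^{-1}(u))}\right|_{u=\varphi(x)} = \frac{\varphi(x)}{\varphi'(x)}.$$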
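Lemma 3 can be checked numerically on a point where the density is locally invertible. The sketch below is an illustration only: it assumes a standard normal density for φ (the paper does not fix a particular density), and the function names and the step size `h` are choices made for the example.

```python
import math


def pdf(x):
    """Standard normal density, playing the role of phi (an assumption)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf(x):
    """Integral of phi from -infinity to x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def pdf_prime(x):
    """Derivative of the standard normal density."""
    return -x * pdf(x)


def derivative_wrt_density(x, h=1e-4):
    """Central-difference estimate of d(cdf)/d(pdf) at x (requires phi'(x) != 0)."""
    return (cdf(x + h) - cdf(x - h)) / (pdf(x + h) - pdf(x - h))


x0 = -1.0
numeric = derivative_wrt_density(x0)
closed_form = pdf(x0) / pdf_prime(x0)   # right-hand side of Lemma 3
print(numeric, closed_form)
```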

Proof of Lemma 4. Let y = P(E) − E [P (E)] and z = P(F ) − E [P (F )], and assume that event F is more ambiguous than event E. To show that $z \overset{d}{=} \lambda y$ it has to be proved that λy and z have an identical characteristic function.

Consider, first, the case of uniformly distributed y and z. Since E [y] = 0 and E [z] = 0, $y \in [-a_y, a_y]$ and $z \in [-a_z, a_z]$, where $a_y$ and $a_z$ are nonnegative. The characteristic functions of z and λy are, respectively,

$$\phi_z(t) = \frac{e^{ita_z} - e^{-ita_z}}{2ita_z} \quad\text{and}\quad \phi_{\lambda y}(t) = \frac{e^{it\lambda a_y} - e^{-it\lambda a_y}}{2it\lambda a_y}.$$

Since E [z] = E [y] = 0 and y and z are uniformly distributed, one can write the ratio of their standard deviations as

$$\lambda = \frac{\mathrm{Std}[z]}{\mathrm{Std}[y]} = \sqrt{\frac{(2a_z)^2/12}{(2a_y)^2/12}}$$

to show that $a_z = \lambda a_y$. This implies that $\phi_z(t) = \phi_{\lambda y}(t)$, and therefore $z \overset{d}{=} \lambda y$.

Consider now the case of elliptically distributed z and λy. That is,

$$z \sim el\big(E[z], \mathrm{Var}[z], \Psi\big) \quad\text{and}\quad \lambda y \sim el\big(\lambda E[y], \lambda^2 \mathrm{Var}[y], \Psi\big).$$

Therefore, the characteristic functions of z and λy are, respectively,

$$\phi_z(t) = e^{itE[z]}\,\Psi\!\left(\tfrac{1}{2}t^2\,\mathrm{Var}[z]\right) \quad\text{and}\quad \phi_{\lambda y}(t) = e^{it\lambda E[y]}\,\Psi\!\left(\tfrac{1}{2}t^2\lambda^2\,\mathrm{Var}[y]\right).$$

Since $\phi_z$ and $\phi_{\lambda y}$ have an identical characteristic generator Ψ, E [z] = λE [y] = 0 and Std [z] = λStd [y], then $\phi_z = \phi_{\lambda y}$, which implies $z \overset{d}{=} \lambda y$.

Proof of Lemma 5. The expectation E [Zϵ] of Zϵ over the joint distribution of Z and ϵ can be

taken first over the distribution of ϵ conditional upon Z, and then over the marginal distribution of

Z. That is,

E [Zϵ] = E [E [Zϵ|Z]] .

Then, Z can be passed out of the inner expectation, implying that

E [Zϵ] = E [ZE [ϵ|Z]] .

By mean-independence, E [ϵ|Z] = E [ϵ]. Therefore,

E [Zϵ] = E [Z] E [ϵ] .
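The role of mean-independence in Lemma 5 can be seen in a small discrete example. The joint distribution below is hypothetical: ϵ has conditional mean zero for each value of Z, but its spread depends on Z, so the two variables are mean-independent without being independent.

```python
from fractions import Fraction as Fr

# Hypothetical joint law of (Z, eps): keys are (z, eps), values are probabilities.
joint = {
    (1, -1): Fr(1, 4), (1, 1): Fr(1, 4),
    (2, -2): Fr(1, 4), (2, 2): Fr(1, 4),
}


def expect(g):
    """Expectation of g(Z, eps) under the joint law."""
    return sum(p * g(z, e) for (z, e), p in joint.items())


e_z = expect(lambda z, e: z)
e_eps = expect(lambda z, e: e)
e_z_eps = expect(lambda z, e: z * e)

# Lemma 5: E[Z eps] = E[Z] E[eps] under mean-independence.
print(e_z_eps, e_z * e_eps)

# Full independence would also factor E[Z eps^2]; here it does not.
e_z_eps2 = expect(lambda z, e: z * e * e)
e_eps2 = expect(lambda z, e: e * e)
print(e_z_eps2, e_z * e_eps2)
```

The second comparison shows why the lemma is stated for mean-independence only: the product rule holds for the first conditional moment of ϵ but need not hold for higher ones.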

Proof of Lemma 6. See Goldberger (1991), page 61, M1.

Proof of Lemma 7.

(i) One can write

$$\varphi(x) - E[\varphi(x)] \overset{d}{=} E[\varphi(x)] - E[\varphi(x)] + \epsilon.$$

Since E [ϵ|φ (x)] = E [ϵ] = 0, clearly ϵ and φ (x) are mean-independent. Let $\mathrm{Var}[\varphi(x)] = \sigma^2$; then by construction $E[\epsilon^2 \mid \varphi(x)] = \sigma^2 = E[\epsilon^2]$. That is, $\epsilon^2$ is mean-independent of φ (x), implying that

$$E\big[\varphi(x)\,(\varphi(x) - E[\varphi(x)])^2\big] = E[\varphi(x)]\; E\big[(\varphi(x) - E[\varphi(x)])^2\big].$$

(ii) Writing the conditional expectation of Var [φ (x)] explicitly provides

$$E\big[\mathrm{Var}[\varphi(x)] \mid x\big] = E\Big[E\big[(\varphi(x) - E[\varphi(x)])^2\big] \,\Big|\, x\Big].$$

By the law of iterated expectations,³⁶

$$E\Big[E\big[(\varphi(x) - E[\varphi(x)])^2\big] \,\Big|\, x\Big] = E\Big[E\big[(\varphi(x) - E[\varphi(x)])^2 \mid x\big]\Big].$$

By (i), $(\varphi(x) - E[\varphi(x)])^2$ is mean-independent of φ (x). By Lemma 6, $(\varphi(x) - E[\varphi(x)])^2$ is also mean-independent of $\varphi^{-1}(\varphi(x)) = x$. Therefore,

$$E\Big[E\big[(\varphi(x) - E[\varphi(x)])^2 \mid x\big]\Big] = E\Big[E\big[(\varphi(x) - E[\varphi(x)])^2\big]\Big],$$

implying that E [x Var [φ (x)]] = E [x] E [Var [φ (x)]].

(iii) By (ii), Var [φ (x)] is mean-independent of x. By Lemma 6, Var [φ (x)] is also mean-independent of the function P (x) of x. Therefore, E [P (x) Var [φ (x)]] = E [P (x)] E [Var [φ (x)]].

(iv) Write

$$x - E[x] \overset{d}{=} E[x] - E[x] + \epsilon.$$

Since E [ϵ|x] = E [ϵ] = 0, clearly ϵ is mean-independent of x. By Lemma 6, ϵ is also mean-independent of E [P (x)]. Therefore, E [ϵ|E [P (x)]] = E [ϵ]. Mean-independence implies uncorrelatedness; see, for example, Goldberger (1991), page 63, M2. Therefore, E [ϵE [P (x)]] = E [ϵ] E [E [P (x)]]. Since E [P (x)] ≥ 0 for every x, |E [ϵE [P (x)]]| = E [|ϵ| E [P (x)]] = E [|ϵ|] E [E [P (x)]]. Substituting for ϵ = x − E [x] completes the proof.

Proof of Lemma 8. Define Z = h(X) such that z = x if x ≤ k and otherwise z = 0. Since Y is mean-independent of X, by Lemma 6, it is also mean-independent of Z. Therefore, E [Y Z] = E [Y ] E [Z]. Writing the expectation explicitly provides

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \varphi_{Y,Z}(y,z)\, y z\, dy\, dz = \int_{-\infty}^{k} \varphi_Y(y)\, y\, dy \int_{-\infty}^{k} \varphi_Z(z)\, z\, dz + \int_{k}^{\infty} \varphi_Y(y)\, y\, dy \int_{-\infty}^{k} \varphi_Z(z)\, z\, dz,$$

which implies

$$\int_{-\infty}^{k}\int_{-\infty}^{k} \varphi_{Y,X}(y,x)\, y x\, dy\, dx = \int_{-\infty}^{k} \varphi_Y(y)\, y\, dy \int_{-\infty}^{k} \varphi_X(x)\, x\, dx.$$

The second part is proved similarly.

³⁶ See, for example, Goldberger (1991), page 47, T8.

Proof of Theorem 1. The perceived probability, Q(E), of event $E \in \mathcal{E}$ can be written

$$Q(E) = \Gamma^{-1}\big(\Gamma(E[P(E)] - \Lambda)\big) = \Gamma^{-1}\left(\int_{\mathcal{P}} \Gamma(P(E))\, d\chi\right), \tag{8}$$

for some Λ ∈ R. Taking the first-order Taylor approximation of Γ (E [P (E)] − Λ) around E [P (E)] yields

$$\Gamma(E[P(E)] - \Lambda) \approx \Gamma(E[P(E)]) - \Lambda\,\Gamma'(E[P(E)]). \tag{9}$$

The second-order Taylor approximation of Γ (P (E)) in Equation (8) around E [P (E)] is

$$\Gamma(P(E)) \approx \Gamma(E[P(E)]) + \Gamma'(E[P(E)])\,\big(P(E) - E[P(E)]\big) + \frac{1}{2}\,\Gamma''(E[P(E)])\,\big(P(E) - E[P(E)]\big)^2. \tag{10}$$

Since Γ (E [P (E)]), Γ′ (E [P (E)]) and Γ′′ (E [P (E)]) are constants, the expectation of Equation (10) is

$$\int_{\mathcal{P}} \Gamma(P(E))\, d\chi \approx \Gamma(E[P(E)]) + \frac{1}{2}\,\Gamma''(E[P(E)])\,\mathrm{Var}[P(E)]. \tag{11}$$

Equating (9) to (11) and organizing terms yields

$$\Lambda \approx -\frac{1}{2}\,\frac{\Gamma''(E[P(E)])}{\Gamma'(E[P(E)])}\,\mathrm{Var}[P(E)].$$

Substituting Λ into Equation (8), together with Lemma 2 (which assures nonnegativity), proves the theorem.
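The quality of the approximation in this proof can be inspected numerically. The sketch below assumes the CAAA outlook function Γ(p) = −e^{−ηp}/η used elsewhere in the paper, for which Γ′′/Γ′ = −η, so the approximate perceived probability is E[P(E)] − (η/2)Var[P(E)]. The two priors, their weights, and the value of η are hypothetical, as are the function names.

```python
import math


def perceived_probability_exact(probs, weights, eta):
    """Q(E) = Gamma^{-1}(E[Gamma(P(E))]) with Gamma(p) = -exp(-eta * p) / eta."""
    expected_gamma = sum(w * (-math.exp(-eta * p) / eta) for p, w in zip(probs, weights))
    # Inverse of Gamma: p = -ln(-eta * v) / eta
    return -math.log(-eta * expected_gamma) / eta


def perceived_probability_approx(probs, weights, eta):
    """Second-order approximation: E[P(E)] - 0.5 * eta * Var[P(E)]."""
    mean = sum(w * p for p, w in zip(probs, weights))
    var = sum(w * (p - mean) ** 2 for p, w in zip(probs, weights))
    return mean - 0.5 * eta * var


# Hypothetical beliefs: P(E) is 0.28 or 0.32 with equal second-order probability.
probs, weights, eta = [0.28, 0.32], [0.5, 0.5], 2.0
exact = perceived_probability_exact(probs, weights, eta)
approx = perceived_probability_approx(probs, weights, eta)
print(exact, approx)
```

For a narrow spread of probabilities such as this one the two values agree closely, and both lie below the expected probability of 0.3, as expected for an ambiguity-averse DM.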

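Proof of Theorem 2. By Wakker and Tversky (1993, Equation 6.1), the dual representation of Equation (2) can be written

$$\begin{aligned} W(f) = &-\int_{-\infty}^{k} U(x)\, d\left[\Gamma^{-1}\left(\int_{\mathcal{P}} \Gamma(1 - P_f(x))\, d\chi\right) - 1\right] \\ &+ \int_{k}^{\infty} U(x)\, d\left[\Gamma^{-1}\left(\int_{\mathcal{P}} \Gamma(1 - P_f(x))\, d\chi\right)\right]. \end{aligned} \tag{12}$$

The subscript f can be omitted to write

$$\frac{d}{dx}\,\Gamma^{-1}\big(E[\Gamma(1 - P(x))]\big) = E\left[-\frac{\Gamma'(1 - P(x))\,\varphi(x)}{\Gamma'\big(\Gamma^{-1}(E[\Gamma(1 - P(x))])\big)}\right]$$

and to denote

$$D(\varphi(x)) = -\frac{\Gamma'(1 - P(x))\,\varphi(x)}{\Gamma'\big(\Gamma^{-1}(E[\Gamma(1 - P(x))])\big)}.$$

By Lemma 3, differentiating D with respect to φ (x) provides

$$\frac{d}{d\varphi(x)}\,D(\varphi(x)) = \frac{\Gamma''(1 - P(x))\,\varphi^2(x) - \Gamma'(1 - P(x))}{\Gamma'\big(\Gamma^{-1}(E[\Gamma(1 - P(x))])\big)} - \frac{\Gamma'(1 - P(x))\,\varphi(x)\,\Gamma''\big(\Gamma^{-1}(E[\Gamma(1 - P(x))])\big)\,E[\Gamma'(1 - P(x))\,\varphi(x)]}{\Big(\Gamma'\big(\Gamma^{-1}(E[\Gamma(1 - P(x))])\big)\Big)^3}.$$

Notice that, since φ (z) is additive, $\int_{-\infty}^{x} E[\varphi(z)]\, dz = E[P(x)]$. Taking the first-order Taylor approximation of D with respect to φ (x) around E [φ (x)] provides

$$\begin{aligned} E[D(\varphi(x))] &\approx D(E[\varphi(x)]) + E\left[\frac{d}{d\varphi(x)}\,D(E[\varphi(x)])\,\big(\varphi(x) - E[\varphi(x)]\big)\right] \\ &= -E[\varphi(x)] + \frac{\Gamma''(1 - E[P(x)])}{\Gamma'(1 - E[P(x)])}\,E\big[\varphi(x)\,(\varphi(x) - E[\varphi(x)])^2\big]. \end{aligned}$$

By Lemma 7, $(\varphi(x) - E[\varphi(x)])^2$ is mean-independent of φ (x), which implies $E\big[\varphi(x)(\varphi(x) - E[\varphi(x)])^2\big] = E[\varphi(x)]\,E\big[(\varphi(x) - E[\varphi(x)])^2\big]$. Therefore,

$$E[D(\varphi(x))] \approx -E[\varphi(x)] + \frac{\Gamma''(1 - E[P(x)])}{\Gamma'(1 - E[P(x)])}\,E[\varphi(x)]\,\mathrm{Var}[\varphi(x)].$$

Substituting for E [D (φ (x))] in Equation (12), while accounting for the sign switch of U (x) E [φ (x)] when moving from negative to positive utility across k (see Wakker and Tversky (1993)), provides

$$\begin{aligned} W(f) \approx &\int_{-\infty}^{k} U(x)\left(E[\varphi(x)] - \frac{\Gamma''(1 - E[P(x)])}{\Gamma'(1 - E[P(x)])}\,E[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\right) dx \\ &+ \int_{k}^{\infty} U(x)\left(E[\varphi(x)] + \frac{\Gamma''(1 - E[P(x)])}{\Gamma'(1 - E[P(x)])}\,E[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\right) dx. \end{aligned}$$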

Proof of Theorem 3. This proof considers ambiguity aversion; the proof for ambiguity loving is similar.

(i+ii) Let y = P(E) − E [P (E)] and z = P(F ) − E [P (F )], and assume that event F is more ambiguous than event E. Then, by Definition 2, $z \overset{d}{=} y + \epsilon$, where ϵ is mean-independent of y; therefore

$$\mathrm{Var}[P(F)] = \mathrm{Var}[P(E)] + \mathrm{Var}[\epsilon].$$

For the opposite direction, assume that Var [P (F )] ≥ Var [P (E)] and define $\lambda = \frac{\mathrm{Std}[P(F)]}{\mathrm{Std}[P(E)]} \ge 1$. By the distributions of P (E) and P (F ), the random variables y and z are either uniformly distributed or elliptically distributed with an identical characteristic generator and E [z] = E [y] = 0. The random variable λy has the same characteristic function as z with E [λy] = E [z] = 0 and $\mathrm{Var}[z] = \lambda^2\,\mathrm{Var}[y]$. Therefore, by Lemma 4, $z \overset{d}{=} \lambda y$. Next, write

$$x + y = \alpha\,(x + \lambda y) + (1 - \alpha)\,x,$$

where $\alpha = \frac{1}{\lambda}$ and x is a random variable satisfying E [x | y] = E [x] = 0. Then, since Γ is concave, by the Jensen inequality

$$\Gamma(x + y) \ge \alpha\,\Gamma(x + \lambda y) + (1 - \alpha)\,\Gamma(x).$$

Taking expectations of both sides yields

$$E\big[E[\Gamma(x + y) \mid x]\big] \ge \alpha\,E\big[E[\Gamma(x + \lambda y) \mid x]\big] + (1 - \alpha)\,E[\Gamma(x)]. \tag{13}$$

Since E [λy] = 0, a concave Γ implies

$$E\big[\Gamma(E[x + \lambda y \mid x])\big] = E[\Gamma(x)] \ge E\big[E[\Gamma(x + \lambda y) \mid x]\big],$$

which jointly with Equation (13) implies

$$E\big[E[\Gamma(x + y) \mid x]\big] \ge E\big[E[\Gamma(x + \lambda y) \mid x]\big].$$

Let x = 0, then

$$E[\Gamma(y)] \ge E[\Gamma(\lambda y)] = E[\Gamma(z)],$$

which, by Izhakian (2014, Proposition 5), implies $\delta_E \succsim^2 \delta_F$.

(iii) Let $\Gamma(P(E)) = -\frac{e^{-\eta P(E)}}{\eta}$. Taking a second-order Taylor approximation around E [P (E)] yields

$$\Gamma(P(E)) \approx -1 + \big(P(E) - E[P(E)]\big) - \frac{1}{2}\,\eta\,\big(P(E) - E[P(E)]\big)^2,$$

and taking expectation yields

$$E[\Gamma(P(E))] \approx -1 - \frac{1}{2}\,\eta\,\mathrm{Var}[P(E)].$$

This implies that, concerning an ambiguity-averse DM,

$$E[\Gamma(P(E))] \ge E[\Gamma(P(F))] \iff \mathrm{Var}[P(E)] \le \mathrm{Var}[P(F)],$$

and, therefore, by Izhakian (2014, Proposition 5),

$$\delta_E \succsim^2 \delta_F \iff \mathrm{Var}[P(E)] \le \mathrm{Var}[P(F)].$$

(iv) Let $\Gamma(P(E)) = -(P(E) - \alpha)^2$, where P (E) ≤ α for some α ∈ R. Taking expectation provides

$$E[\Gamma(P(E))] = -\Big(\mathrm{Var}[P(E)] + \big(E[P(E)] - \alpha\big)^2\Big).$$

Since E [P (E)] = E [P (F )] = 0, then

$$E[\Gamma(P(E))] \ge E[\Gamma(P(F))] \iff \mathrm{Var}[P(E)] \le \mathrm{Var}[P(F)],$$

and, therefore, by Izhakian (2014, Proposition 5),

$$\delta_E \succsim^2 \delta_F \iff \mathrm{Var}[P(E)] \le \mathrm{Var}[P(F)].$$

Proposition 2 then completes the proof.

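Proof of Theorem 4.

(⇐=) Suppose that V (f) ≥ V (g) but f does not stochastically dominate g with respect to ambiguity. Then, there exists x∗ ∈ X such that

$$\int_{-\infty}^{x^*} E[\varphi_f(z)]\,\mathrm{Var}[\varphi_f(z)]\, dz > \int_{-\infty}^{x^*} E[\varphi_g(z)]\,\mathrm{Var}[\varphi_g(z)]\, dz.$$

Define U (x) such that U (x) = −1 if x < x∗ ≤ k and otherwise U (x) = 0. Assume CAAA, i.e., $\Gamma(P(E)) = \frac{(P(E))^{1-\eta}}{1-\eta}$. Then, since $E[\varphi_f(x)] = E[\varphi_g(x)]$ for every x ∈ X, by Equation (4)

$$V(f) - V(g) \approx -\eta \int_{-\infty}^{x^*} E[\varphi_f(z)]\,\Big[\mathrm{Var}[\varphi_f(z)] - \mathrm{Var}[\varphi_g(z)]\Big]\, dz.$$

Clearly V (f) − V (g) < 0, which is a contradiction.

(=⇒) Since $E[\varphi_f(x)] = E[\varphi_g(x)]$ for every x ∈ X, by Equation (4)

$$\begin{aligned} V(f) - V(g) \approx &-\int_{-\infty}^{k} U(x)\,\frac{\Gamma''(1 - E[P_f(x)])}{\Gamma'(1 - E[P_f(x)])}\,E[\varphi_f(x)]\,\Big[\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\Big]\, dx \\ &+ \int_{k}^{\infty} U(x)\,\frac{\Gamma''(1 - E[P_f(x)])}{\Gamma'(1 - E[P_f(x)])}\,E[\varphi_f(x)]\,\Big[\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\Big]\, dx. \end{aligned}$$

By Lemma 7, Var [φ (x)] is mean-independent of x as well as of P (x). Therefore, by Lemma 8,

$$\begin{aligned} V(f) - V(g) \approx &-\int_{-\infty}^{k} E[\varphi_f(x)]\,U(x)\,\frac{\Gamma''(1 - E[P_f(x)])}{\Gamma'(1 - E[P_f(x)])}\, dx \int_{-\infty}^{k} E[\varphi_f(x)]\,\Big[\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\Big]\, dx \\ &+ \int_{k}^{\infty} E[\varphi_f(x)]\,U(x)\,\frac{\Gamma''(1 - E[P_f(x)])}{\Gamma'(1 - E[P_f(x)])}\, dx \int_{k}^{\infty} E[\varphi_f(x)]\,\Big[\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\Big]\, dx. \end{aligned}$$

Since U (x) ≥ 0 for x ≥ k, U (x) ≤ 0 for x ≤ k, and Γ′′(·)/Γ′(·) ≤ 0, if act f first-order stochastically dominates act g, then V (f) − V (g) ≥ 0.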

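Proof of Theorem 5. Since $E[\varphi_f(x)] = E[\varphi_g(x)]$ for any x and k = −∞, then by Theorem 2

$$W(f) - W(g) \approx \int_{-\infty}^{\infty} U(x)\,\frac{\Gamma''(1 - E[P_f(x)])}{\Gamma'(1 - E[P_f(x)])}\,E[\varphi_f(x)]\,\Big(\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\Big)\, dx.$$

By Lemma 7, $\mathrm{Var}[\varphi_f(x)]$ is mean-independent of x, as well as of $P_f(x)$. Therefore,

$$W(f) - W(g) \approx \int_{-\infty}^{\infty} E[\varphi_f(x)]\,U(x)\,\frac{\Gamma''(1 - E[P_f(x)])}{\Gamma'(1 - E[P_f(x)])}\, dx \int_{-\infty}^{\infty} E[\varphi_f(x)]\,\Big(\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\Big)\, dx.$$

Since $\int_{-\infty}^{\infty} E[\varphi_f(x)]\,U(x)\,\frac{\Gamma''(1 - E[P_f(x)])}{\Gamma'(1 - E[P_f(x)])}\, dx \le 0$, then

$$W(f) \ge W(g) \iff \int_{-\infty}^{\infty} E[\varphi_f(x)]\,\mathrm{Var}[\varphi_f(x)]\, dx \le \int_{-\infty}^{\infty} E[\varphi_f(x)]\,\mathrm{Var}[\varphi_g(x)]\, dx$$

and, by Theorem 2,

$$f \succsim^1 g \iff f^2[f] \le f^2[g].$$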

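Proof of Theorem 6. Since $E[\varphi_f(x)] = E[\varphi_g(x)]$ for any x, then by Theorem 2

$$\begin{aligned} W(f) - W(g) \approx &-\int_{-\infty}^{k} U(x)\,\frac{\Gamma''(1 - E[P_f(x)])}{\Gamma'(1 - E[P_f(x)])}\,E[\varphi_f(x)]\,\Big(\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\Big)\, dx \\ &+ \int_{k}^{\infty} U(x)\,\frac{\Gamma''(1 - E[P_f(x)])}{\Gamma'(1 - E[P_f(x)])}\,E[\varphi_f(x)]\,\Big(\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\Big)\, dx. \end{aligned}$$

By Lemma 7, $\mathrm{Var}[\varphi_f(x)]$ is mean-independent of x, as well as of $P_f(x)$. Therefore, by Lemma 8,

$$\begin{aligned} W(f) - W(g) \approx &-\int_{-\infty}^{k} E[\varphi_f(x)]\,U(x)\,\frac{\Gamma''(1 - E[P_f(x)])}{\Gamma'(1 - E[P_f(x)])}\, dx \int_{-\infty}^{k} E[\varphi_f(x)]\,\Big(\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\Big)\, dx \\ &+ \int_{k}^{\infty} E[\varphi_f(x)]\,U(x)\,\frac{\Gamma''(1 - E[P_f(x)])}{\Gamma'(1 - E[P_f(x)])}\, dx \int_{k}^{\infty} E[\varphi_f(x)]\,\Big(\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\Big)\, dx. \end{aligned}$$

Since $-\int_{-\infty}^{k} E[\varphi_f(x)]\,U(x)\,\frac{\Gamma''(1 - E[P_f(x)])}{\Gamma'(1 - E[P_f(x)])}\, dx \le 0$ and $\int_{k}^{\infty} E[\varphi_f(x)]\,U(x)\,\frac{\Gamma''(1 - E[P_f(x)])}{\Gamma'(1 - E[P_f(x)])}\, dx \le 0$ then, by the symmetry of outcomes around k,

$$W(f) - W(g) \ge 0 \iff \int_{-\infty}^{k} E[\varphi_f(x)]\,\Big(\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\Big)\, dx + \int_{k}^{\infty} E[\varphi_f(x)]\,\Big(\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\Big)\, dx \le 0,$$

which implies

$$W(f) \ge W(g) \iff \int_{-\infty}^{\infty} E[\varphi_f(x)]\,\mathrm{Var}[\varphi_f(x)]\, dx \le \int_{-\infty}^{\infty} E[\varphi_f(x)]\,\mathrm{Var}[\varphi_g(x)]\, dx$$

and, by Theorem 2,

$$f \succsim^1 g \iff f^2[f] \le f^2[g].$$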

Proof of Theorem 7. The first-order Taylor approximation of the LHS of Equation (5) with respect to K, around 0, is

$$\mathrm{LHS} = U(E[x] - K) = \int_{-\infty}^{\infty} E[\varphi(x)]\,U(E[x] - K)\, dx \approx \int_{-\infty}^{\infty} E[\varphi(x)]\,\big(U(E[x]) - K\,U'(E[x])\big)\, dx.$$

Writing the RHS of Equation (5) as

$$\mathrm{RHS} = \underbrace{\int_{-\infty}^{\infty} E[\varphi(x)]\,U(x)\, dx}_{I} + \underbrace{\left(\int_{k}^{\infty} U(x)\,\frac{\Gamma''(1 - E[P(x)])}{\Gamma'(1 - E[P(x)])}\,E[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\, dx - \int_{-\infty}^{k} U(x)\,\frac{\Gamma''(1 - E[P(x)])}{\Gamma'(1 - E[P(x)])}\,E[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\, dx\right)}_{II},$$

the second-order Taylor approximation of I with respect to x, around E [x], is then

$$\begin{aligned} I &\approx \int_{-\infty}^{\infty} E[\varphi(x)]\left(U(E[x]) + U'(E[x])\,(x - E[x]) + \frac{1}{2}\,U''(E[x])\,(x - E[x])^2\right) dx \\ &= U(E[x]) + \frac{1}{2}\,U''(E[x])\,\mathrm{Var}[x]. \end{aligned}$$

Taking the first-order Taylor approximation of II with respect to x, around E [x], provides³⁷

$$\begin{aligned} II \approx &-\int_{-\infty}^{k} \big(U(E[x]) + U'(E[x])\,(x - E[x])\big)\,\frac{\Gamma''(1 - E[P(x)])}{\Gamma'(1 - E[P(x)])}\,E[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\, dx \\ &+ \int_{k}^{\infty} \big(U(E[x]) + U'(E[x])\,(x - E[x])\big)\,\frac{\Gamma''(1 - E[P(x)])}{\Gamma'(1 - E[P(x)])}\,E[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\, dx. \end{aligned}$$

Since E [x] is relatively close to the reference point k and U (k) = 0, then U (E [x]) ≈ 0. Therefore,

$$II = U'(E[x]) \int_{-\infty}^{\infty} |x - E[x]|\,\frac{\Gamma''(1 - E[P(x)])}{\Gamma'(1 - E[P(x)])}\,E[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\, dx.$$

Since, by Lemma 7, Var [φ (x)] is mean-independent of x, as well as of P (x),

$$II = U'(E[x]) \int_{-\infty}^{\infty} E[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\, dx \int_{-\infty}^{\infty} E[\varphi(x)]\,|x - E[x]|\,\frac{\Gamma''(1 - E[P(x)])}{\Gamma'(1 - E[P(x)])}\, dx.$$

By Lemma 7 again, |x − E [x]| is also mean-independent of P (x). Therefore,

$$II = U'(E[x]) \int_{-\infty}^{\infty} E[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\, dx \int_{-\infty}^{\infty} E[\varphi(x)]\,|x - E[x]|\, dx \int_{-\infty}^{\infty} E[\varphi(x)]\,\frac{\Gamma''(1 - E[P(x)])}{\Gamma'(1 - E[P(x)])}\, dx.$$

Combining the LHS, the RHS, I and II, the uncertainty premium is

$$K \approx -\frac{1}{2}\,\frac{U''(E[x])}{U'(E[x])}\,\mathrm{Var}[x] - E\left[\frac{\Gamma''(1 - E[P(x)])}{\Gamma'(1 - E[P(x)])}\right] E\big[\,|x - E[x]|\,\big]\, f^2[x].$$

Proof of Proposition 1. Immediately obtained by substituting the perceived probabilities approximated by Theorem 1 into the value function in Equation (2), while accounting for U (x) ≤ 0 when x ≤ k and substituting E [P ({s ∈ S | U(f (s)) ≤ z})] + E [P ({s ∈ S | U(f (s)) ≥ z})] for 1.

³⁷ Note that this component is of the order of magnitude of the variance of probabilities. Thus, it is smaller by one order of magnitude than probabilities.

Proof of Proposition 2. Let y = P(E) − E [P (E)] and z = P(F ) − E [P (F )], and assume that F is more ambiguous than E. By Definition 2, $z \overset{d}{=} y + \epsilon$. By Izhakian (2014, Proposition 5), the DM’s preference $\succsim^2$ is characterized by the outlook function Γ : [0, 1] → R, implying that

$$E[\Gamma(z)] = E\big[E[\Gamma(y + \epsilon) \mid y]\big].$$

Ignoring the outer expectation on the RHS for the moment, ambiguity aversion, formed by a concave Γ, implies

$$E[\Gamma(z)] = E[\Gamma(y + \epsilon)] \le \Gamma(E[y + \epsilon]) = \Gamma(y).$$

Taking expectation implies E [Γ (z)] ≤ E [Γ (y)]. Hence, by Izhakian (2014, Proposition 5), $\delta_F \precsim^2 \delta_E$.

For the opposite direction, let $\delta_F \precsim^2 \delta_E$. Then, by Izhakian (2014, Proposition 5),

$$E[\Gamma(z)] \le E[\Gamma(y)].$$

It needs to be shown that there exists an ϵ that satisfies Definition 2. The proof considers two probability distributions $P \in \mathcal{P}$; it can then be extended to any number of probability distributions. Let y and z take two possible values, $(y_1, y_2)$ and $(z_1, z_2)$, with probabilities (α, 1 − α) and (β, 1 − β), respectively. Without loss of generality, assume that $z_1 \ge y_1 \ge y_2 \ge z_2$. The random variable ϵ can then be constructed as $\epsilon_1 = (z_1 - y_1,\; z_2 - y_1)$ with probabilities $\left(\frac{y_1 - z_2}{z_1 - z_2},\; \frac{z_1 - y_1}{z_1 - z_2}\right)$ and $\epsilon_2 = (z_1 - y_2,\; z_2 - y_2)$ with probabilities $\left(\frac{y_2 - z_2}{z_1 - z_2},\; \frac{z_1 - y_2}{z_1 - z_2}\right)$. It can be verified that the probabilities of $\epsilon_1$ and $\epsilon_2$ are all nonnegative, and that $E[\epsilon_1 \mid y_1] = 0$ and $E[\epsilon_2 \mid y_2] = 0$. Therefore, ϵ is mean-independent of y and E [z] = E [y + ϵ] = 0. The probability that $y + \epsilon = z_1$ is

$$\alpha\,\frac{y_1 - z_2}{z_1 - z_2} + (1 - \alpha)\,\frac{y_2 - z_2}{z_1 - z_2}.$$

Since E [y] = E [z], then

$$\alpha = \frac{z_2 - y_2 + \beta\,(z_1 - z_2)}{y_1 - y_2}.$$

Together, this implies that the probability that $y + \epsilon = z_1$ is equal to β, and that the probability that $y + \epsilon = z_2$ is equal to 1 − β. That is, $z \overset{d}{=} y + \epsilon$.
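The two-point construction of ϵ in this proof can be verified with exact arithmetic. The sketch below follows the construction step by step; the function names and the particular values of y, z and α are hypothetical, chosen so that both variables have mean zero.

```python
from fractions import Fraction as Fr


def spread_noise(y_vals, z_vals):
    """For each value of y, return the conditional law of eps as [(eps, prob), ...]."""
    z1, z2 = z_vals
    noise = {}
    for y in y_vals:
        p_up = (y - z2) / (z1 - z2)          # probability that y + eps = z1
        noise[y] = [(z1 - y, p_up), (z2 - y, 1 - p_up)]
    return noise


def law_of_sum(y_vals, alpha, noise):
    """Distribution of y + eps when y takes y_vals with probabilities (alpha, 1 - alpha)."""
    law = {}
    for y, p_y in zip(y_vals, (alpha, 1 - alpha)):
        for eps, p_eps in noise[y]:
            law[y + eps] = law.get(y + eps, 0) + p_y * p_eps
    return law


# Hypothetical mean-zero example with z1 >= y1 >= y2 >= z2.
y_vals, alpha = (Fr(1, 10), Fr(-1, 10)), Fr(1, 2)
z_vals, beta = (Fr(1, 5), Fr(-1, 5)), Fr(1, 2)

noise = spread_noise(y_vals, z_vals)
law = law_of_sum(y_vals, alpha, noise)
cond_means = [sum(e * p for e, p in noise[y]) for y in y_vals]
print(law, cond_means)
```

The resulting law of y + ϵ places probability β on z1 and 1 − β on z2, and both conditional means of ϵ are zero, which is what the proof requires.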

Proof of Proposition 3. By Theorem 4, W (f) ≥ W(g) ⇐⇒ f stochastically dominates g. Then,

by Theorem 5, W (f) ≥ W(g) ⇐⇒ f2 [f ] ≤ f2 [g]. The same holds by Theorem 6.

Proof of Corollary 1. CAAA implies $\Gamma'(P(E)) = e^{-\eta P(E)}$ and $\Gamma''(P(E)) = -\eta\, e^{-\eta P(E)}$. Substituting into Equation (4) proves the corollary.

Proof of Corollary 2. Obtained by substituting 1 + r for x into Equation (6) of Theorem 7 and

rearranging terms.

Proof of Corollary 3. CRRA implies $U'(x) = x^{-\gamma}$ and $U''(x) = -\gamma\, x^{-\gamma-1}$. CAAA implies $\Gamma'(P(E)) = e^{-\eta P(E)}$ and $\Gamma''(P(E)) = -\eta\, e^{-\eta P(E)}$. Substituting into Equation (7) proves the corollary.

Proof of Observation 1. Consider an outcome x ∈ X. Its expected probability can be written

$$E[\varphi(x)] = a_x + \sum_{i=1}^{n} (b_x - a_x)\,\frac{i}{n} = a_x + \frac{1}{2}\,(b_x - a_x),$$

and its variance can be written

$$\mathrm{Var}[\varphi(x)] = \sum_{i=1}^{n} \left(a_x + (b_x - a_x)\,\frac{i}{n} - E[\varphi(x)]\right)^2 = \sum_{i=1}^{n} \left((b_x - a_x)\,\frac{i}{n}\right)^2 - \frac{1}{4}\,(b_x - a_x)^2.$$

Differentiating Var [φ (x)] with respect to n provides

$$\frac{d}{dn}\,\mathrm{Var}[\varphi(x)] = -2 \sum_{i=1}^{n} \left((b_x - a_x)^2\,\frac{i^2}{n^3}\right),$$

which proves the claim.

Proof of Observation 2. Given an outcome x ∈ X, the maximal variance of its probability is attained when the possible probabilities are only either 0 or 1. In this case, the expected probability of x is E [φ (x)] = χ. Therefore, the variance of the probability of x is

$$\mathrm{Var}[\varphi(x)] = \chi\,(1 - \chi)^2 + (1 - \chi)\,(0 - \chi)^2 = \chi - \chi^2,$$

which attains its maximal value when $\chi = \frac{1}{2}$. In this case, $\mathrm{Var}[\varphi(x)] = \frac{1}{4}$, and therefore f2 = 1. Notice that the expected probability χ satisfies $\chi = \frac{1}{n}$, where n is the number of different possible outcomes. Therefore, the maximal value of f2 is attained when there are only two possible outcomes.
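The bound in Observation 2 can be reproduced with a short computation. The sketch assumes the discrete analogue of the measure, four times the sum over outcomes of the expected probability times the variance of that probability, applied to two outcomes whose probabilities are either 0 or 1; the function name and the search grid are choices made for the example.

```python
def f2_zero_one(chi):
    """f^2 for two outcomes whose probabilities are 0 or 1.

    With second-order probability chi the first outcome is certain,
    and with probability 1 - chi the second outcome is certain.
    """
    priors = [(1.0, 0.0), (0.0, 1.0)]
    weights = (chi, 1.0 - chi)
    total = 0.0
    for j in range(2):
        mean = sum(w * p[j] for p, w in zip(priors, weights))
        var = sum(w * (p[j] - mean) ** 2 for p, w in zip(priors, weights))
        total += mean * var
    return 4.0 * total


# Search a grid of second-order probabilities for the maximal degree of ambiguity.
grid = [i / 100 for i in range(101)]
best = max(grid, key=f2_zero_one)
print(best, f2_zero_one(best))
```

In this two-outcome case the measure reduces to 4χ(1 − χ), so the grid search recovers the maximizer χ = 1/2 and the maximal value of one stated in the proof.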
