ADAPTIVE HEDONIC UTILITY1
Arthur J. Robson and Lorne A. Whitehead
April, 2018
ABSTRACT
Recent research in neuroscience provides a foundation for a cardinal
utility function that is both hedonic and adaptive. We model such ad-
aptation as arising from a limited capacity to make fine distinctions,
where the utility functions adapt in real time. For minimizing the
probability of error, an optimal mechanism is particularly simple. For
maximizing expected fitness, a still simple mechanism is approxim-
ately optimal. The model predicts the hedonic treadmill and a variety
of behavioral anomalies related to “behavioral contrast”, including a
preference for rising consumption streams.
Keywords: Utility, hedonic, adaptation, choice, neuroscience, bounded
rationality. JEL Codes: A12, D11.
Arthur J. Robson, Department of Economics, Simon Fraser University,
Burnaby BC Canada V5A 1S6. Email: [email protected].
Lorne Whitehead, Department of Physics and Astronomy, University of
British Columbia, Vancouver BC Canada. Email: [email protected].
1We thank Paul Glimcher, Kenway Louie, Larry Samuelson, Michael Shadlen, Philippe To-
bler, Ryan Webb and Michael Woodford for helpful discussions. We also thank audiences at
the workshop “Biological Basis of Preferences and Strategic Behavior” at SFU, at the conference
“Economics and Biology of Contests” at the Queensland University of Technology, at the Warwick
Economic Theory Workshop, at the World Congress of the Game Theory Society in Maastricht,
at the Society for Neuroeconomics Conference in Berlin, at the Canada Series Seminars at the
Weatherhead Center at Harvard, at the Workshop on Information Processing and Behavioral
Variability at Columbia University, and at seminars at the University of Queensland, the Univer-
sity of Melbourne, SFU, SFU Human Evolutionary Studies Program, HESP, Stanford University,
Columbia University, Chapman University, and Penn State University. Robson thanks HESP, the
Canada Research Chairs Program and the Social Sciences and Humanities Research Council of
Canada for financial support.
1. Introduction
Jeremy Bentham is famous for, among other things, the dictum “the
greatest happiness for the greatest number”, which, as Paul Samuel-
son was fond of observing, involved one too many “greatests” to be
operationally meaningful. The happiness that Bentham described was
cardinal, and so capable of being summed across individuals to obtain
a basic welfare criterion. Conventional welfare economics, even allow-
ing for non-additivity, remains needful of some degree of cardinality.
However, in the context of individual decisions, and of consumer theory
in particular, economics has completely repudiated any need for
cardinality, on the basis of "Occam's Razor," a theme that culminates
in the theory of revealed preference.
On the other hand, there is persuasive neurological evidence that eco-
nomic decisions are actually orchestrated within the brain by a mech-
anism that relies on hedonic signals, which signals are measurable and
therefore cardinal. In particular, there is evidence that economic de-
cisions are mediated by means of neurons that produce dopamine, a
neurotransmitter that is associated with pleasure.2
Although there is no logical need for the utility used in consumer theory
to be hedonic and cardinal, it is so, as a brute fact.
Further, there is neurological evidence that this hedonic utility is ad-
aptive, so that dopamine-producing neurons adapt rapidly to changes
in the distribution of physical rewards.3 Thus, if the variance of re-
wards increases, the sensitivity of such a neuron to a given increase in
reward is reduced.
A primary motive of the present paper is then to harmonize the neur-
ological view of hedonic utility with economics. We develop a model of
hedonic adaptive utility that draws on neuroscience. The model is not
fundamentally at odds with conventional economic theory in that the
2 See, for example, a key paper for the present purpose, Stauffer, Lak and Schultz (2014).
3 See Tobler, Fiorillo and Schultz (2005), for example.
only reason for a divergence from economics is the inability to make
arbitrarily fine distinctions.
The key paper by Stauffer, Lak and Schultz (2014) is described in
the next section. It provides evidence linking increments in von Neu-
mann Morgenstern utility, in a cardinal sense, to the activity of the
dopamine neurons. These neurons evaluate economic options, in an
adaptive fashion. Our model also concerns how these evaluations feed
into a decision rule that is noisy. The rule involves “just noticeable
differences”—JND’s—in the activity of dopamine neurons.4 Adapta-
tion involves shifting the thresholds at which a just noticeable jump in
dopamine neuron activity occurs.5
It pays to shift the capacity to discriminate to the thick of the action,
so hedonic utility needs to adapt, and it needs to adapt rapidly. We
show that simple neural adjustment mechanisms exist by exhibiting a
particular simple automatic mechanism.6
The present paper then presents a mechanism that generates rapid
adaptation to an entirely novel distribution. When the objective is to
minimize the probability of error, a particularly simple rule of thumb
yields optimal adaptation for an arbitrary number of thresholds.7 When
the objective is the more plausible one of maximizing the expected
outcome chosen, a different rule of thumb yields adaptation that is
approximately optimal for a large number of thresholds.
A common feature of all previous models is that the time frame for
adaptation is undefined.8 That is, adaptation to the distribution is
shown to be optimal, but it is left open how such adaptation occurs. It
4 Matlin (1988) is a textbook account of JND's.
5 This formulation was used by Laughlin (1981) to introduce the efficient coding hypothesis to capture maximal informational transfer by neurons.
6 We abstract from the interesting but challenging question of how conscious inputs influence automatic processing.
7 This rule of thumb generates efficient coding, as in Laughlin (1981). His criterion is the formal one of informational transfer; ours is to minimize the probability of error in a concrete binary choice problem.
8 This observation applies to Robson (2001), Rayo and Becker (2007), and Netzer (2009), for example.
is crucial for most realistic applications that the time frame over which
adaptation occurs be short. We address this issue here.
We demonstrate the empirical power of this approach with an
application to the so-called hedonic treadmill.9 The adaptation of
utility here predicts that the answers to surveys concerning life
satisfaction will revert to previous levels subsequent to a shift in
circumstances.10
There are also behavioral implications of the model that derive from the
theory of "behavioral contrast."11 Individuals bid higher in auctions
if they were previously exposed to auctions with a low range of values.12
The model also suggests an explanation for the puzzling preference of
individuals for rising consumption streams.
The following effective but annoying sales strategy offers an informal
example of these behavioral implications of the model. When you are
buying a car, the salesman suggests that you need various more-or-
less-worthless add-ons, undercoating for example, that cost perhaps
hundreds of dollars. The salesman is relying on the hundreds of dollars
seeming insignificant relative to the thousands that the car costs. The
effectiveness of this sales technique is consistent with the shift in utility
that larger stakes induce in the present model.
2. Background from Neuroscience
A remarkable paper that grounds the current work in neuroscientific
fact is Stauffer, Lak, and Schultz (2014).13 They argue that von Neu-
mann Morgenstern utility is realized in the brain, in an hedonic fashion,
by the activity of dopamine-producing neurons. These neurons number
about a million in humans and are located in the midbrain, between
the ears and behind the mouth. Dopamine is a neurotransmitter, a
9 See Schkade and Kahneman (1998), as a key example.
10 The model does not corroborate, however, the position of Schkade and Kahneman (1998) that such a shift is evidence of a mistake.
11 See Crespi (1942).
12 See Vestergaard and Schultz (2015), and Khaw, Glimcher and Louie (2017), for example.
13 See Schultz (2016) for a less formal treatment of the issues.
chemical that relays a signal from one neuron to the next, and it has
a number of functions in the brain, a key one of which is to generate
hedonic motivation. These dopamine producing neurons have forward
connections—“projections”—to all of the sites in the brain that are
known to implement decisions.
Most basically, perhaps, a burst of activity of the dopamine neurons
is associated with the arrival of an unanticipated physical reward.14
Furthermore, a larger reward generates a greater intensity of the burst
of activity in the neuron (as measured by the number of impulses per
second).
One of the most firmly established results in the neuroscience literature
that bears on decisions is the “reward prediction error”, which is as
follows.15 Suppose the individual is trained to anticipate a particular
reward by a cue, perhaps a unique visual signal. The dopamine neurons
then shift much of their firing activity back in time from the actual
reward to the cue. If the size of the reward is as expected, there is
indeed no further response by the neuron. If the reward is larger than
expected, however, there is a supplementary burst of activity upon the
receipt of the reward, which is larger the larger the discrepancy on the
upside; if the reward is smaller than expected, the firing rate of the
neuron is reduced below the base rate.16
Stauffer, Lak, and Schultz argue that von Neumann Morgenstern util-
ity can be related rather convincingly to this reward prediction error,
14 Many of these experiments are done on monkeys, and involve implanted electrodes reading the activity of individual dopamine neurons. The rewards that the monkeys obtain are typically food or fruit juice.
15 Caplin and Dean (2008) present a model of this phenomenon, and sketch several applications to economics.
16 The dopamine neuron system apparently represents unexpected upside shifts in rewards more accurately than unexpected downside shifts. Different systems, sometimes involving another neurotransmitter, serotonin, may help generate appropriate responses to downside surprises. Rogers (2011) reviews experimental evidence involving drugs concerning the roles of dopamine and serotonin. See also Weller, Levin, Shiv, and Bechara (2007) for evidence that the neural systems that deal with gains and losses may be partially dissociated.
by proceeding as follows. First they estimate von Neumann Morgen-
stern utility in a revealed preference manner, by deriving the certainty
equivalents for a variety of binary gambles that are presented to the
monkeys. That is, this step makes no use of neural data. The von
Neumann Morgenstern utility is convex at low levels of juice reward
but concave at higher levels, so monkeys adhere to this property of
prospect theory.
Next, Stauffer, Lak, and Schultz consider the response by dopam-
ine neurons to several binary gambles, where the absolute difference
between the high and the low reward is held constant. Each gamble
is signalled by an associated cue. Neural activity then occurs with the
arrival of the cue, but there is additional activity if the higher reward
from the binary gamble is obtained. The extra neural activity is low
for gambles involving low rewards and for those involving high rewards,
but is high for gambles involving intermediate rewards. This additional
dopamine neuron activity is then in close cardinal agreement with the
incremental (“marginal”) utility estimated from revealed preference.17
Finally, Stauffer, Lak and Schultz establish that the dopamine neuron
responses that anticipate a binary gamble reflect the expected utility of
the gamble. They consider a pair of gambles with a common mean,
but where one of these gambles has a higher expected von Neumann
Morgenstern utility. A cue announcing the first of these gambles then
generates higher dopamine neuron firing rates than does one announ-
cing the second.
Further enhancing the neuroscientific foundation of utility, Lak, Stauffer,
and Schultz (2014) show that the relationship between dopamine neuron
firing rates and choice extends to multidimensional preferences. First
they examine monkeys' preferences over multidimensional rewards from
a purely behavioral point of view. They show that such preferences exist,
in a sense that allows for noise. Monkeys chose between rewards
that were either quantitatively safe or risky. In addition, there could
17See their Figure 3, in particular.
be risk concerning the type of juice. They derive a value for each
gamble in terms of a certainty equivalent of a third type of juice, and
show that these values predict the choices made between the gambles.
Next, they tie these behaviorally constructed reward valuations to
dopamine neuron firing rates. They do this by looking at each reward
separately, rather than in a choice context, hypothesizing, that is, that
this context makes no difference. The behavioral choice preferences
reflect these separately observed dopamine neuron firing rates.
Tobler, Fiorillo, and Schultz (2005) help complete the background
needed. The important new twist arises from conditioning the mon-
keys to expect one of two 50-50 gambles, where these two gambles arise
in random order. One gamble pairs a low volume of juice with a me-
dium volume, the other pairs the same medium volume with a high
volume. In each gamble, the higher volume generates an increase in
neural activity, the lower volume a decrease. The identical medium
volume generates greater activity when paired with the low volume than
when paired with the high volume.18
Baddeley, Tobler, and Schultz (2016) further investigate adaptation
in humans, finding partial rather than complete adaptation. The
rationale they advance for its desirability is that unlikely
signals still need to generate appropriate reactions.
The process by which rewards are encoded is then relatively well un-
derstood. Indeed, so are some of the precise ways in which choice is
implemented.19 Less well understood is how the encoded rewards are
18 Adaptation is a pervasive property of neurons. Baccus and Meister (2002), for example, consider the adaptive properties of visual neurons. The full adaptation of dopamine neurons that we hypothesize here is analogous to that under the efficient coding hypothesis of Laughlin (1981), who illustrates the hypothesis with data for visual neurons.
19 Shadlen and Shohamy (2016), for example, discuss how sequential sampling drives the choice of a physical action. A random walk arises in the premotor neurons of a monkey seeking a reward for predicting the preponderant drift of a moving pattern of dots. (A premotor neuron, as the name suggests, is immediately upstream from the motor neurons that cause the monkey to press one button or another, for example.) The random walk is driven up or down by accumulating evidence favoring one or the other of two choices. When the random walk hits an endogenously determined barrier, the corresponding choice is implemented. These models work remarkably well,
compared prior to implementing a decision.20 However, there are
tantalizing hints that the comparison of value is the comparison of
dopamine neuron activity.21
3. The Model
The neuroscientific evidence discussed in the previous section suggests
the following broad hypotheses. These hypotheses are based on the
evidence but also sharpen it.
1) Consumption of an unanticipated deterministic amount of a good
generates dopamine neuron firing rates that correlate with the quantity
of the good. Further, a cue that predicts the imminent arrival of such
an outcome or, more generally, of a gamble, generates dopamine neuron
firing rates that linearly reflect expected utility. That is, a choice made
between several such outcomes or gambles is consistent with maximizing
the expected utility based on these firing rates.22
2) Since neurons reflecting consumption (or reflecting a cue predict-
ing a gamble) produce dopamine, a supplementary hypothesis is that
decisions are made on the basis of hedonic utility.23
3) The encoded values of gambles adapt to the statistical circum-
stances. This adaptation is not always complete in reality, but we
with convincing details accounted for, but the issues they raise concerning motor implementation of a choice are not central here.
20 Possibly, for example, there is further processing of value prior to comparison. It could even be that the process of comparing values proceeds in parallel to the encoding of values. See Hare, Schultz, Camerer, O'Doherty, and Rangel (2001), for example. Indeed, the hedonic interpretation is not crucial to the validity of the model here, which could reflect processing by neurons that do not produce dopamine. Nevertheless, the present formulation seems parsimonious in the light of the empirical findings. That is, value is represented hedonically by dopamine neurons and this is mirrored in comparison and decision.
21 Jocham, Klein, and Ullsperger (2011), for example, administered a D2-selective antagonist, amisulpride, to humans engaging first in learning values and then in exploiting these. (D2 is a particular type of dopamine receptor.) Amisulpride did not affect reinforcement learning but enhanced some subsequent choices made on the basis of the learned values.
22 However, the neurons that encode the value of each hypothetical option in the context of a particular choice do not seem to be dopamine producing. See Schultz (2015, p. 920).
23 This is not strictly necessary. For example, if the model captured the activity of non-dopamine-producing decision neurons, it could remain otherwise valid.
consider the limiting case of complete adaptation for simplicity and
to see most clearly the novel consequences of adaptation. It is hypo-
thesized that the ability to make fine perceptual distinctions between
outcomes is limited, and that it is this that drives adaptation.
The above suggests the following model. Suppose there are two pro-
spective rewards whose actual fitness values are yi ∈ [0, 1] for i = 1, 2.24
After processing, the prospect of each reward is encoded as a neural
firing rate given by h(yi) for i = 1, 2, where h : [0, 1] → [0, 1].25 This
formulation of h allows for noise in these neuron firing rates.26
Given the neural firing rates h(yi), a noisy choice is then made, where
J(h(y1) − h(y2)), say, is the probability of choosing y1. It is assumed
that
J(0) = 1/2; J(h)→ 1, as h→ 1,
and J(h)→ 0, as h→ −1.27
We wish to obtain a simple tractable form of this model that retains
the following desirable properties—
i) Choice is noisy, since this is one of the most salient and familiar
features of actual choice.28
ii) There is a simple explicit parametric formulation of adaptation,
which arises from a limited ability to make fine distinctions. This
24 The approach can readily be generalized to allow the outcomes to be food, or something else that is monotonically related to fitness.
25 The restriction of rewards to [0, 1] is without loss of generality, given only that there are some bounds on rewards. The restriction of neural activity to [0, 1] is similarly mathematically harmless, given bounds on neural activity. The existence of some such bounds on neural activity is empirically clear and relevant. Such bounds imply that error in choice cannot be eliminated by exploiting extreme neural activity levels.
26 All neural activity is noisy. See, for example, Tolhurst, Movshon, and Dean (1983), who investigate noise in single visual neurons. They argue that behavior is, however, less noisy since it may be driven by integrating signals from a (small) number of such neurons. See also Renart and Machens (2014) for a more recent survey of neuron noise and its effect on behavior.
27 A more general formulation would allow J to depend on h1 and h2 as independent arguments, which would allow the absolute level of the arguments to matter.
28 Mosteller and Nogee (1951), for example, were forced to allow noisy choice when evaluating expected utility theory in the laboratory. Indeed, they describe this noise with a function akin to J.
ability can be sharpened, at an implicit cost, so that, in the limiting
extreme, choice becomes perfect.
These goals can be attained without relying on multiple sources of
noise. Consider the matched pair of functions h and J as follows. For
each integer N take δ = 1/N and assume—
A. The deterministic function h : [0, 1] → {0, δ, 2δ, ..., (N − 1)δ, 1} is
nondecreasing. It is specified by the thresholds xn, for n = 1, ..., N, at
which h jumps from (n − 1)δ to nδ, where we set x0 = 0 and xN+1 = 1.
Such a step function can approximate an arbitrary continuous function
if δ is small.
B. The function J is given by
J(h1 − h2) = 1/2, for |h1 − h2| < δ,
J(h1 − h2) = 0, for h1 − h2 ≤ −δ, J(h1 − h2) = 1, for h1 − h2 ≥ δ.
That is, the functions h and J are described in terms of the same
parameter, δ > 0, interpreted as a “just noticeable difference”—JND.
A difference in firing rate of δ can be fully distinguished, but smaller
differences cannot.29
Choice represented in this way remains noisy. That is, if it is repeatedly
true that h(y1) = h(y2) then sometimes y1 is chosen and sometimes
y2. Adaptation is captured by the function h; in particular by the
parameters x1 < x2 < ... < xN . These parameters can advantageously
adapt to the environment because choice is made imperfectly, given the
JND δ > 0. As δ → 0, however, y1 will be chosen if and only if y1 > y2.
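The matched pair h and J can be made concrete in a few lines. The following Python sketch is ours, not code from the paper; the function names and the particular thresholds are illustrative only.

```python
import random

def make_h(thresholds):
    """Step function h: the fraction of the N thresholds that the
    reward y surpasses, so h takes values in {0, 1/N, ..., 1}."""
    N = len(thresholds)
    return lambda y: sum(1 for x in thresholds if y > x) / N

def choose(y1, y2, h, delta):
    """JND choice rule J: if the firing rates differ by at least the
    JND delta, the larger is chosen for sure; otherwise the choice is
    an even coin flip (J = 1/2)."""
    d = h(y1) - h(y2)
    if d >= delta:
        return y1
    if d <= -delta:
        return y2
    return random.choice([y1, y2])

# Example with N = 4 thresholds and delta = 1/4.
h = make_h([0.2, 0.4, 0.6, 0.8])
# 0.5 and 0.55 share an interval, so the choice between them is noisy;
# 0.5 and 0.85 are separated by a threshold and are ranked without error.
assert choose(0.5, 0.85, h, 0.25) == 0.85
```

As the text notes, shrinking δ (adding thresholds) makes the noisy region vanish, so the larger reward is eventually chosen for sure.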
The foregoing motivates the following choice problem.30 The individual
must choose one of two outcomes, i = 1, 2. These are realizations
29 That choice is driven by differences and that valuations attached to thresholds are fixed leads to full adaptation. One way in which absolute values could be incorporated would be to have the valuations attached to each interval update using estimated absolute expected fitness from that interval. Absolute values would then also matter. Such an updating procedure entails collecting more information and might be slow. The present model plausibly then captures the short run effects.
30 This is now as in Robson (2001).
yi ∈ [0, 1], for i = 1, 2, that were drawn independently from the cu-
mulative distribution function, F . This has a continuous probability
density function, f > 0 on (0, 1). The cdf F represents the background
distribution of rewards to which the individual is accustomed.
As implied by the construction of h and the JND formulation of J , the
only precise information that the individual has prior to choosing one
of the arms is the interval [xn, xn+1] that contains each realization. If
the two realizations belong to different intervals, the gamble lying in
the interval further to the right is clearly better; if the two realizations
lie in the same interval, choice is noisy with an error being made with
probability 1/2.
We interpret the number of thresholds that an outcome surpasses as
utility, so that Un = n/N, for n = 0, ..., N , consistent with the evid-
ence in Stauffer, Lak, and Schultz (2014). However, for the essentially
deterministic choices considered here, we could assign any utility levels
Un ∈ [0, 1] to the outcomes lying in the interval [xn, xn+1), for
n = 0, ..., N, such that 0 = U0 < U1 < ... < UN = 1.31
What are the optimal 0 ≤ x1 ≤ .... ≤ xN ≤ 1? Robson (2001) shows
that the thresholds that minimize the probability of error are equally
spaced in terms of probability. If N = 1, for example, the threshold
should be at the median of F . At the other extreme, when N →∞, it
follows that the limiting density of thresholds matches the pdf, f , and
that U(y) = F (y), where U(y) is the utility assigned to y in this limit.
This result is in striking agreement with the efficient coding hypothesis
proposed by Laughlin (1981). He considers a function precisely analog-
ous to h, where a continuous input intensity is mapped onto a finite set
of “responses”, spaced apart by the just noticeable difference. Laughlin
31 Robson (2018) investigates the implications of the model for choice between gambles. This entails setting utility Un = n/N, for n = 0, ..., N, in a cardinal sense. The model becomes more complex in that it considers choice between gambles rather than deterministic outcomes; it does not model the adaptation of the thresholds, on the other hand, but presumes they are set optimally. Robson (2018) provides a basis for key aspects of Kahneman-Tversky (1979) and a resolution of the puzzle raised by Rabin (2000).
then argues that the response function of a neuron to a single y should
match the cumulative distribution function in order to maximize the
information content of the neural responses. (See Louie and Glimcher,
2012, for a recent review of this efficient coding hypothesis.) We replace
the abstract notion of information transfer with a more concrete binary
choice problem but arrive at precisely the same conclusion.32 However,
this agreement only holds for the probability of error criterion.
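The N = 1 case of the probability-of-error criterion can be verified directly. An error (made with probability 1/2) occurs only when both independent draws fall on the same side of the threshold x, so the error probability is (1/2)[F(x)^2 + (1 − F(x))^2], which is minimized where F(x) = 1/2, the median. A quick numerical check (our sketch, not code from the paper):

```python
# Error probability with a single threshold, as a function of F(x):
# an error (probability 1/2) occurs only when both draws fall in the
# same interval, i.e. both below or both above the threshold.
def error_prob(Fx):
    return 0.5 * (Fx ** 2 + (1 - Fx) ** 2)

# Scan F(x) over a fine grid: the minimum is at F(x) = 1/2, the median.
grid = [i / 1000 for i in range(1001)]
best = min(grid, key=error_prob)
assert abs(best - 0.5) < 1e-12
```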
Woodford (2012) has a similar basic motivation to the present paper,
and produces related predictions, but adopts an alternative approach
based on informational transfer and rational inattention. Rather than
the constraint being the number of thresholds, the constraint is the
informational transmission capacity of the channel. Adaptation arises
to use the available resources to make the most accurate distinctions
possible and the key predictions arise from this adaptation. There is no
counterpart in Woodford, however, to the explicit process of adaptation
here.
If the criterion were instead to maximize the expected value of the yi
chosen, expected fitness, that is, and N = 1, the optimal threshold is
at the mean of F . With N thresholds, each threshold should be at the
mean of the distribution of outcomes, conditional on the outcome lying
between the next threshold to the left and the next threshold to the
right. This uniquely characterizes the optimal thresholds in this case.
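This fixed-point characterization can be computed by simple iteration: repeatedly move each threshold to the conditional mean between its neighbors. A minimal sketch (ours) for the uniform distribution, where the conditional mean reduces to the midpoint of the two neighbors and the optimal thresholds are xn = n/(N + 1):

```python
def iterate_thresholds(x, steps=2000):
    """Repeatedly set each threshold to the mean of the distribution
    conditional on lying between its neighbours, with fixed endpoints
    x_0 = 0 and x_{N+1} = 1.  For U[0,1] the conditional mean is the
    midpoint of the two neighbours."""
    x = list(x)
    N = len(x)
    for _ in range(steps):
        for n in range(N):
            left = x[n - 1] if n > 0 else 0.0
            right = x[n + 1] if n < N - 1 else 1.0
            x[n] = 0.5 * (left + right)
    return x

# From an arbitrary start, N = 3 thresholds converge to 1/4, 1/2, 3/4.
x = iterate_thresholds([0.1, 0.2, 0.3])
assert all(abs(xn - (n + 1) / 4) < 1e-6 for n, xn in enumerate(x))
```

For a non-uniform f, the midpoint step would be replaced by the conditional mean of f over the interval between the neighbors.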
Now, in the limit as N → ∞, Netzer (2009) shows that the density of
thresholds is proportional to f(y)^{2/3}, so that

U(y) = ∫_0^y f(s)^{2/3} ds / ∫_0^1 f(s)^{2/3} ds.
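As an illustration, suppose f(y) = 2y on [0, 1]. Then f(y)^{2/3} is proportional to y^{2/3}, and U(y) = y^{5/3}. The probability-of-error criterion would instead give U(y) = F(y) = y^2, so the fitness criterion also concentrates thresholds where f is high, but less strongly so.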
In either case, the thresholds are optimally concentrated where the ac-
tion is—where f is high. That is, if the distribution shifts, the pattern
of thresholds must shift to match.
Previous work has not considered the mechanism of adaptation. If
the thresholds were chosen by evolution, this would make adjustment
32 The rate of informational transfer is a symmetric concave function of the probabilities of each output. Hence maximizing this entails equalizing these probabilities.
painfully slow, too slow, indeed, to fit the stylized facts. How then
could the thresholds adjust to a novel distribution, F?
In order to study this question, suppose then that the thresholds react
to draws, are allowed to move, that is, but are confined to a finite grid
KG = {0, ε, 2ε, ..., Gε, 1}, for an integer G such that (G+ 1)ε = 1. This
restriction is for technical simplicity since it means that the adjustment
process for the thresholds will be a (finite) Markov chain. Define the
state space SG = (KG)N and let S = [0, 1]N .
At first we abstract from the choice between the two arms. Instead,
we focus on the process by which the thresholds adjust.33 Hence we
use y at first to represent either y1 or y2. Eventually, when considering
the performance of the limiting rule of thumb, we will again need to
consider the outcomes on both arms.
Suppose the thresholds are time dependent, given as x_n^t ∈ KG, where
0 ≤ x_1^t ≤ ... ≤ x_N^t ≤ 1, at time t = 1, 2, ....
3.1. Minimizing the Probability of Error. The first of the two
criteria considered here is to minimize the probability of error. This
is less basic than maximizing expected fitness, but it leads to simpler
results, and is intuitively illuminating. It is also noteworthy that the
rule of thumb for the probability of error case generates efficient neural
coding.
Consider the rule of thumb for adjusting the thresholds—
(3.1) x_n^{t+1} =
  x_n^t + ε, with probability ξ, if y ∈ (x_n^t, x_{n+1}^t];
  x_n^t − ε, with probability ξ, if y ∈ [x_{n−1}^t, x_n^t);
  x_n^t, otherwise,
for n = 1, ..., N.
33It simplifies matters to suppose that most draws adjust the thresholds, but choices are made
only occasionally.
The parameter ξ ∈ (0, 1) represents additional idiosyncratic uncer-
tainty about whether each threshold will actually move, even if the
outcome lies in a neighboring subinterval.34
This is perhaps the simplest possible rule in this context. It moves
thresholds towards where the action is, roughly speaking. This seems
like a step in the right direction, at least. More than that, we will show
that, in the limit of the invariant distribution as the grid size ε → 0,
the thresholds are in exactly the right place.
The rule may not provide the most rapid possible adjustment, but it
is sufficient for the present purpose.35
We have—
Theorem 3.1. In the limit as G → ∞, so that ε → 0, the invariant
joint distribution of the thresholds x_n^t converges to one with point mass
on the vector with components x_n^*, where F(x_n^*) = n/(N + 1), for n =
1, ..., N.
This theorem is a corollary of a more general result—Theorem 3.2—to
follow.
That is, this rule of thumb generates optimal adaptation of the utility
function to any unknown distribution, in a non-parametric way, for any
number of thresholds, N .
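The convergence in Theorem 3.1 is easy to see in simulation. The following Python sketch is ours, not code from the paper: it drops the grid KG and instead clamps steps of size ε at the neighboring thresholds, so thresholds cannot cross. For uniform draws with N = 3, the limiting thresholds F(x_n^*) = n/(N + 1) are the quartiles 0.25, 0.5, 0.75.

```python
import random

def adapt(draw, N=3, eps=0.001, xi=0.5, T=200_000, seed=1):
    """Simulate rule (3.1): with probability xi, a threshold steps eps
    towards an outcome that lands in one of its two adjacent
    subintervals.  Steps are clamped at the neighbouring thresholds."""
    rng = random.Random(seed)
    x = [0.1 * (n + 1) for n in range(N)]     # arbitrary starting point
    for _ in range(T):
        y = draw(rng)
        for n in range(N):
            lo = x[n - 1] if n > 0 else 0.0
            hi = x[n + 1] if n < N - 1 else 1.0
            if x[n] < y <= hi and rng.random() < xi:
                x[n] = min(x[n] + eps, hi)    # outcome just to the right
            elif lo <= y < x[n] and rng.random() < xi:
                x[n] = max(x[n] - eps, lo)    # outcome just to the left
    return x

# Uniform draws: thresholds settle near the quartiles 0.25, 0.5, 0.75.
x = adapt(lambda rng: rng.random())
assert all(abs(xn - q) < 0.05 for xn, q in zip(x, (0.25, 0.5, 0.75)))
```

Replacing the uniform draw with any other distribution moves the thresholds towards the corresponding quantiles, with no knowledge of F built into the rule.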
34 This technical device simplifies the argument that the Markov chain is irreducible.
35 A Bayesian optimal updating rule entails a prior distribution over distributions F. Suppose, for example, there is one threshold, and the pdf is either f1(y) = 2 with support [0, 1/2] or f2(y) = 2 with support [1/2, 1], where these pdf's are equally likely. The optimal initial threshold should then be at 1/2. If the outcomes are to the left, the pdf must be f1 and the next position of the threshold should be at 1/4; if the outcomes are to the right, the pdf must be f2 and the threshold should be set next at 3/4. That is, there is rapid resolution of the uncertainty about the distribution. On the other hand, this procedure would be wildly inappropriate for a different prior. Even with a definite general prior, it is not obvious that placing the threshold at the median of the posterior is always fully optimal. Furthermore, if the distribution is subject to occasional change, this will also affect the Bayesian optimal rule. Although the current rule can only be slower than the optimal rule with a specified prior and mechanism for change, it ultimately yields optimal placement of the thresholds, in a robust fashion, without a specified prior, and without a specified mechanism for redrawing the distribution.
3.2. General Case—Maximizing Fitness. The most basic general
criterion is to maximize expected fitness. That is, individuals who
successfully do this should outperform those who do not.36
The situation is now more complicated than it was with the criterion of
minimizing the probability of error. There are no longer simple rules of
thumb that implement the optimum exactly. However, there do exist
simple rules of thumb that implement the optimum approximately, for
large N . These rules of thumb involve conditioning on the arrival of
a realization in the adjacent interval, as above, but also modify the
probability of moving using the distance to the next threshold, in a
symmetric way.
Although it is possible to accurately estimate the median of a distri-
bution from the limited information available to such a rule of thumb,
it is not possible to do this for the mean. Hence the results for the
probability of error case are sharper than those for the expected
fitness case.
To see that simple rules of thumb like this cannot implement the op-
timum exactly, consider first the case that N = 1. Suppose that F has
median 1/2 but a mean that is not 1/2. Consider a symmetric rule of
thumb based on the arrival of an outcome to the left or the right of
the current position of the threshold at x, say, and the distance to the
ends—x or 1 − x. This will then generate a limiting position for the
threshold at 1/2, thus failing to implement the optimum. This is also
an issue for any number of thresholds, since this argument applies to
the position of any threshold relative to its two neighbors.
It is important that this general rule of thumb uses only informa-
tion that is available—the location of the neighboring thresholds and
whether an outcome lies in the subinterval just to the right or just to
the left. It would contradict the interpretation of the model here to use
detailed information about the precise location of the outcome within
a subinterval.
36 This assumes that the risk is independent across individuals. See Robson (1996) for a
treatment of this issue. Another possibility would be that fitness depends on relative payoffs.
At the same time, the general rule of thumb here makes greater demands
on neural processing than does the rule of thumb for the probability
of error case. The need to utilize the position of adjacent thresholds
must entail a greater complexity cost.
The general rule of thumb considered here is—
(3.2)
x^{t+1}_n =
  x^t_n + ε   with probability ξ(x^t_{n+1} − x^t_n)^β   if y ∈ (x^t_n, x^t_{n+1}]
  x^t_n − ε   with probability ξ(x^t_n − x^t_{n−1})^β   if y ∈ (x^t_{n−1}, x^t_n]
  x^t_n       otherwise
Again, the parameter ξ ∈ (0, 1) and the draws that are made with
probability ξ(x^t_{n+1} − x^t_n)^β or ξ(x^t_n − x^t_{n−1})^β conditional on the outcome
lying in the subinterval just to the right or left, respectively, are made
independently across thresholds.37
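A minimal implementation of rule (3.2) can make this concrete (our own sketch; ε, ξ, the seed, and the cdf F(y) = y² are illustrative choices). With β = 0, the thresholds should settle at the quartiles of F, where all subintervals carry equal probability.

```python
import random

def step_rule(x, y, eps, xi, beta, rng):
    # One period of rule (3.2): threshold x[n] moves +eps with probability
    # xi * (gap to the right)**beta if y lies in the subinterval just to its
    # right, and -eps with probability xi * (gap to the left)**beta if y lies
    # just to its left; the draws are independent across thresholds.
    z = [0.0] + x + [1.0]     # x_0 = 0 and x_{N+1} = 1 bracket the interior thresholds
    new = list(x)
    for n in range(1, len(z) - 1):
        if z[n] < y <= z[n + 1] and rng.random() < xi * (z[n + 1] - z[n]) ** beta:
            new[n - 1] += eps
        elif z[n - 1] < y <= z[n] and rng.random() < xi * (z[n] - z[n - 1]) ** beta:
            new[n - 1] -= eps
    return sorted(new)        # renumber if an update reverses the order

rng = random.Random(1)
x = [0.25, 0.50, 0.75]
for _ in range(150_000):
    y = rng.random() ** 0.5   # a draw from F(y) = y^2 on [0, 1]
    x = step_rule(x, y, eps=0.0005, xi=0.5, beta=0.0, rng=rng)
# the thresholds settle near the quartiles of F, roughly (0.50, 0.71, 0.87)
```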
If the parameter β = 0, we have the old rule of thumb. Formally, then,
Theorem 3.1 follows from Theorem 3.2.
If β > 0, this will encourage the closing up of large gaps that arise where
f is small, which is useful for maximizing expected fitness. Consider, for
example, a threshold situated so that the probability of an outcome in
the adjacent interval to the left equals the probability of an outcome
37 A few technical considerations are as follows. Given the Markov chain described here, it is
possible that the order of thresholds is reversed at some stage so that x^{t+1}_{n+1} < x^{t+1}_n, for example.
In such a case assume that the thresholds are renumbered so as to preserve the natural order.
It is also possible that the process superimposes one threshold on another so that x^{t+1}_{n+1} = x^{t+1}_n,
for example. In this case the independence of the draws made conditional on an outcome lying in
a neighboring subinterval will eventually break such a tie. These possibilities become vanishingly
improbable as G → ∞, so these issues are purely details.
The above considerations simplify the proof that the Markov chain defined here is irreducible.
That is, there exists a number of repetitions such that, for any initial configuration, x^0, say, there
is positive probability of being in any final configuration, x^T, say. There is therefore a unique
invariant distribution for this chain. See Footnote 61 in the Appendix.
just to the right. Suppose, however, that the distance to the next
threshold on the right exceeds the distance to the left, because the pdf
f is lower to the right. It will then pay to move to the right, since the
expected fitness stakes on the right exceed those on the left. Indeed, if
β = 1/2, the resulting rule will be shown to be approximately optimal
for large N.
We have—
Theorem 3.2. In the limit as G → ∞ so that ε → 0, the invariant
joint distribution of the thresholds x^t_n converges to one that assigns a
point mass to the vector with components x^*_n, n = 1, ..., N. These are
the unique solutions to
(F(x^*_{n+1}) − F(x^*_n))(x^*_{n+1} − x^*_n)^β = (F(x^*_n) − F(x^*_{n−1}))(x^*_n − x^*_{n−1})^β,
for n = 1, ..., N.
Proof. See the Appendix.
The intuition for Theorem 3.2 straightforwardly extends that given for
Theorem 3.1. Again, the limiting position of each threshold is such
that the probability of moving to the left is equal to the probability
of moving to the right. The proof can be outlined more precisely as
follows—
Step 1. The Markov chain can be approximated by an ordinary differential
equation (ODE) system, when ε is small, given by
dx_n/dt = H(x_n, x_{n+1}) − H(x_{n−1}, x_n), n = 1, ..., N,
where H(x_n, x_{n+1}) = ξ(x_{n+1} − x_n)^β(F(x_{n+1}) − F(x_n)), n = 0, ..., N.38
This is obtained by letting ε → 0 over a fixed small interval of time.
With no compensation for this, the system would grind to a halt. Hence
this shrinkage of ε is precisely offset by increasing the number of repetitions.
This increase in the number of repetitions has no effect on the
invariant distribution, however. The ODE system is then obtained by
shrinking the small interval of time.
38 This is Eq 8.2 from the Appendix.
Step 2. A Lyapunov function is then used to show that the ODE
system is globally asymptotically stable. The system converges to the
vector x∗n, n = 1, ..., N , as t→∞, from any initial vector.
Step 3. Finally, we suppose that some limit of the invariant dis-
tributions of the Markov chain does not put full weight on x∗n, n =
1, ..., N . The ODE system must then strictly lower the expected value
of the Lyapunov function, under this limit of the invariant distribu-
tions. Given that the requisite continuity properties can be shown to
hold, the Markov chain also lowers the expected value of the Lyapunov
function, using the invariant distribution of that Markov chain as the
initial distribution, for all small enough ε. This establishes the desired
contradiction.
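The ODE of Step 1 can also be integrated directly, which illustrates Step 2 (a sketch under our own illustrative choices: forward-Euler stepping, F(x) = x², β = 1/2, and ξ = 1, since ξ only rescales time). From an arbitrary admissible starting vector, the system settles where the quantities (F(x_{n+1}) − F(x_n))(x_{n+1} − x_n)^β are equalized across subintervals, as Theorem 3.2 requires.

```python
def H(a, b, F, beta):
    # H(x_n, x_{n+1}) = (x_{n+1} - x_n)**beta * (F(x_{n+1}) - F(x_n)), with xi = 1
    return (b - a) ** beta * (F(b) - F(a))

def integrate(x, F, beta, dt=0.01, T=300.0):
    # forward-Euler integration of dx_n/dt = H(x_n, x_{n+1}) - H(x_{n-1}, x_n)
    for _ in range(int(T / dt)):
        z = [0.0] + x + [1.0]
        x = [x[i] + dt * (H(z[i + 1], z[i + 2], F, beta) - H(z[i], z[i + 1], F, beta))
             for i in range(len(x))]
    return x

F = lambda t: t * t                  # cdf F(x) = x^2 on [0, 1]
x = integrate([0.2, 0.4, 0.6], F, beta=0.5)
z = [0.0] + x + [1.0]
balance = [(z[i + 1] - z[i]) ** 0.5 * (F(z[i + 1]) - F(z[i])) for i in range(4)]
# the four balance terms are (nearly) equal at the limiting configuration
```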
3.3. Approximate Optimality of the Rule of Thumb. We now
consider the efficiency of the rule of thumb relative to the optimum con-
figuration of thresholds, for the expected fitness criterion. We demon-
strate approximate efficiency, as N →∞, for the particular case of f ’s
that are step functions, so that
f(y) =
  α_1 > 0 if y ∈ [0, 1/M)
  ...
  α_m > 0 if y ∈ [(m − 1)/M, m/M)
  ...
  α_M > 0 if y ∈ [(M − 1)/M, 1]
where Σ^M_{m=1} α_m = M.
For each N, there exists a unique positioning of the N interior thresholds,
under the rule of thumb, in the limit as G→∞ so that ε→ 0. Suppose
that the expected deficit in y, for the limiting rule of thumb relative
to the full information ideal, is given by L(N). (The “full information
ideal” entails always choosing the higher outcome.)
Theorem 3.3. As N → ∞, the limiting efficiency of the rule of
thumb is characterized by N²L → (Σ_m α_m^{2β/(1+β)})(Σ_m α_m^{1/(1+β)})²/(6M³).
This expression is uniquely minimized by choice of β = 1/2. Hence
the rule of thumb with the best limiting efficiency satisfies N²L →
(Σ_m α_m^{2/3})³/(6M³) as N → ∞.
Proof. See the Appendix. The strict optimality of β = 1/2 follows
from the Hölder Inequality.
Suppose the expected deficit in y, relative to the full information ideal,
for the optimal positioning of thresholds, is given by L∗(N).
Theorem 3.4. The optimal allocation of the thresholds has limiting
efficiency characterized by N²L* → (Σ_m α_m^{2/3})³/(6M³), as N → ∞.
Proof. See the Appendix.
Hence the rule of thumb, when β = 1/2, has the same limiting efficiency
as the optimal allocation of thresholds. That is, the rule of thumb is
approximately optimal for large N.39 Suppose the efficiencies here can
be represented as Taylor series in 1/N.40 The first nonzero term is the
term in 1/N², which is the same for L* and L.41 They may then only
disagree for terms of higher order.42
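The limiting expression of Theorem 3.3 is straightforward to evaluate. The sketch below (our own; the step heights α are an arbitrary example averaging one) scans a grid of β and confirms that β = 1/2 minimizes N²L, with the minimum agreeing with the expression in Theorem 3.4.

```python
def limiting_eff(alpha, beta):
    # the N^2 L limit of Theorem 3.3 for a step density with M equal-width
    # steps of heights alpha (the heights must average one)
    M = len(alpha)
    s1 = sum(a ** (2 * beta / (1 + beta)) for a in alpha)
    s2 = sum(a ** (1 / (1 + beta)) for a in alpha)
    return s1 * s2 ** 2 / (6 * M ** 3)

alpha = [0.5, 1.0, 1.5, 1.0]                       # an arbitrary example
value, best_beta = min((limiting_eff(alpha, b / 100), b / 100)
                       for b in range(1, 201))
optimum = sum(a ** (2 / 3) for a in alpha) ** 3 / (6 * len(alpha) ** 3)
# best_beta sits at 0.5, and value coincides with the Theorem 3.4 expression
```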
39 This approximation is additional to those already involved in i) the convergence of the
Markov chain to an invariant distribution and ii) taking the limit of the invariant distribution as
ε → 0.
40 The approximation result here remains valid, as an "asymptotic expansion", even if a Taylor
series does not exist.
41 The minimized probability of error is easily seen to be 1/(2(N + 1)). The efficiency loss for
maximum fitness has a leading term in 1/N² instead because the size of an error is of order 1/N.
42 Netzer uses the same device of considering a step function f. However, he does not consider
the adjustment process that is the focus here as the underpinning of utility adaptation. There is,
then, no counterpart of the rule of thumb used here. His main result is to show that, in the limit
as M → ∞, the density of thresholds is proportional to f^{2/3}, in contrast to the limiting density
of thresholds in the probability of error case, which is simply f. This main result of Netzer is an
incidental by-product of the present approach. This observation does not, however, help extend
our approximation results to a general f.
4. Self-Reported Happiness
4.1. Hedonic Treadmill. The model has immediate implications for
phenomena that rely on self-reported life satisfaction. Conspicuous
among these is the hedonic treadmill. Schkade and Kahneman (1998)
formulate a well-known version of this treadmill that concerned students
at the University of Michigan and at UCLA.43 Students in the
two locations reported similar degrees of life satisfaction, but Michigan
students projected that UCLA students would be significantly happier.
Schkade and Kahneman describe this as a conflict between “decision
utility”—which is applied when deciding whether to move from Michigan,
and which is based on a substantial increase in life satisfaction in
California—and “experienced utility”—which reflects the more mod-
est increase ultimately obtained once there. Schkade and Kahneman
argue then that “decision utility” is defective.
The model here entails the adaptation of utility, which implies such
a distinction between decision and experienced utilities. There is no
sense, however, in which either decision or experienced utility is defect-
ive, in contrast to Schkade and Kahneman.44 Either utility function is
optimal for the corresponding circumstances, in the light of the inabil-
ity to make arbitrarily fine distinctions.
We can demonstrate the hedonic treadmill by simulating the follow-
ing specific version of the model. Consider the class of cdf’s given by
F (x) = xγ with pdf’s f(x) = γxγ−1, with γ > 0, for all x ∈ [0, 1]. Sup-
pose ε = 0.0005. Consider the probability of error case, for example,
so that β = 0, with nine thresholds, so that these thresholds will be
optimally positioned at the deciles of the distribution. Take 100,000
periods, where γ = 1 for the first 20,000 periods and γ = 5 thereafter,
so that probability mass is shifted to the upper end of the interval [0, 1].
43 See also Frederick and Loewenstein (1999).
44 Robson and Samuelson (2011) addressed these issues in the Rayo and Becker (2005) model
of a limited ability to make fine distinctions. However, this model, in common with other papers
in this small literature, lacks an explicit account of real time adaptation.
Suppose the thresholds are placed initially at 0.1, 0.2,...,0.9—that is, at
the deciles of the distribution for γ = 1. This is essentially equivalent
to supposing that the γ = 1 regime has been in effect for a long time.
The results of simulating this version of the model are presented in Fig-
ure 1.45 The distribution of thresholds quickly puts most mass near the
deciles, as shown by the restoration of the uniform empirical frequency
of outcomes in each interval.46
An outcome in Figure 1 near 0.5, for example, once generated utility
near 0.5; after the shift to a substantially more favorable distribution,
it generates a rock-bottom level of utility. But the distribution of
thresholds was optimal for the original distribution and became optimal
again for the new distribution.
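The experiment just described can be reproduced in a few lines (our own sketch, not the authors' Excel code; the seed and ξ = 1/2 are our choices). Starting the nine thresholds at the deciles of the uniform regime and then drawing from F(y) = y⁵, the thresholds re-converge to the deciles of the new distribution, (k/10)^{1/5}.

```python
import random

def adapt(x, gamma, periods, eps=0.0005, xi=0.5, seed=0):
    # beta = 0 rule of thumb; each period an outcome y with cdf F(y) = y^gamma
    # is drawn, and the thresholds bordering its subinterval move by +/- eps
    rng = random.Random(seed)
    for _ in range(periods):
        y = rng.random() ** (1 / gamma)
        z = [0.0] + x + [1.0]
        for n in range(1, len(z) - 1):
            if z[n] < y <= z[n + 1] and rng.random() < xi:
                x[n - 1] += eps
            elif z[n - 1] < y <= z[n] and rng.random() < xi:
                x[n - 1] -= eps
        x.sort()
    return x

x = [k / 10 for k in range(1, 10)]   # deciles of the long-standing uniform regime
x = adapt(x, gamma=5.0, periods=250_000)
# the thresholds approach the deciles of F(y) = y^5, i.e. (k/10)**(1/5)
```

An outcome near 0.5, which previously delivered mid-range utility, now falls below the lowest threshold, exactly the reversal described in the text.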
For β = 0, as in Figure 1, average utility reverts completely to its original
level, after a shift in the cdf F. This is a pure form of the "hedonic
treadmill". For β = 1/2, however, reversion is generally incomplete
or can overshoot. Figure 2 illustrates partial reversion, presenting a
rolling average of utility, where utility is defined so that average utility
for γ = 1 is normalized to 0.5.47
Indeed, long run average expected utility generally depends on the
distribution, when β > 0.48 To illustrate overshooting, consider before
and after distributions F and G, respectively, say, such that the
average utility under G exceeds that under F. Suppose now that G is
45 All the simulations here were done using Excel.
46 These simulations also serve to demonstrate the robustness of the limiting results, Theorems
3.1 and 3.2, which concern limits of invariant distributions as the grid size ε tends to 0. That is,
these results hold approximately for finite time and reasonable positive grid sizes. Theorems 3.3
and 3.4 rely on taking the additional limit as N → ∞, then showing that β = 1/2 yields a rule
of thumb that is approximately optimal. Simulations also show that β = 1/2 is approximately
optimal even for small values of N. These results are also then robust. Further, although there is
a gain from β > 0, this gain is not overwhelming, relative to β = 0. The additional complexity
cost of rules of thumb with β > 0 might then outweigh the gain over the rule with β = 0.
47 Average utility needs to be smoothed to be meaningful. We use a rolling average of the last
1,000 periods.
48 Expected utility also depends on the distribution under the optimal allocation of a finite
number of thresholds.
Figure 1. Rapid Adaptation of the Thresholds to a Novel Distribution. (The figure plots the frequency of outcomes in each interval, by interval number, over the last 30,000 steps.)
scaled down so that F first-order stochastically dominates G.49 Such
scaling ensures that average utility will plunge with the advent of G.
However, in the long run, average utility is independent of this scaling-
down, no matter how dramatic this might be, and so average utility
will eventually climb back past its old level.50
From a welfare point of view, not merely is average utility an unreliable
guide as to the extent of the change in the distribution, it may reverse
the apparent direction.
49 This is always possible if G has a strictly positive pdf.
50 That is, suppose the distribution is shifted to become F_k(x) = F(kx), for k ≥ 1. The new
thresholds are then x^k_n = x_n/k since the unique characterization of thresholds in Theorem 3.2 is
satisfied with these x^k_n.
Figure 2. Rule of Thumb with β = 1/2. Modified Hedonic Treadmill.
4.2. Happiness Economics. Recently, there has been an upsurge of
interest within economics in using national surveys on self-reported life
satisfaction to at least partially substitute for Gross National Product.
It is clear that GNP has serious defects—it is highly incomplete, in
particular. Clark et al. (2018) make an eloquent case in favor of this
new strategy, for example. The approach they propose goes beyond
the present positive approach—where hedonic utility is used to predict
behavior—to embrace a normative view—where hedonic utility is used
to make welfare judgments.
The adaptive nature of happiness raises issues for such a normative
approach. In the model with complete adaptation (with β = 0), suppose
the distribution of material outcomes doubles across the board. Is it
desirable to claim that there is no increase in welfare?51 Indeed, as a
positive matter, wouldn’t potential immigrants, for example, strictly
prefer the doubled outcomes?
51 Average utility is always independent of the distribution F when β = 0.
More generally (when β > 0), average hedonic utility is not merely
an unreliable guide as to the extent of a welfare change derived from
a change in the distribution of material outcomes F; there may be
reversals. That is, a distribution G that is first-order stochastically
dominated by another distribution F may nevertheless generate higher
average utility, as discussed in the previous subsection.
These issues merit reexamination in a more general model that allows
incomplete adaptation. However, the scenarios described above seem
bound to remain as possibilities, and the interpretation of national
happiness surveys in terms of welfare seems problematic.
5. Behavioral Applications
There are behavioral implications as well that fall under the rubric of
the classic psychological phenomena of behavioral contrast effects. In
an early experiment, Crespi (1942) found that rats run faster towards
larger rewards. Further, rats trained to run towards a small reward
run faster towards a large reward than do rats trained on large rewards
throughout.
In the present model, once the rats were inured to low rewards, the
advent of higher rewards would occasion a (temporary) increase in
hedonic utility—as in Figures 1 and 2—and hence a greater willingness
to undergo exertion for the reward.
These effects are not limited to rats.52 Vestergaard and Schultz (2017),
for example, consider human bidding behavior in second-price auctions.
Lesser-valued options with rising value profiles were shown to be
systematically preferred to better options with declining time profiles.53
52 Neither are the effects limited to the upside. Kahneman, Fredrickson, Schreiber, and
Redelmeier (1993) confront human subjects with a choice between a short series of painful experiences
and a longer series, where the longer series is the shorter series augmented by the addition of a
somewhat less painful interval to conclude. They show that people may prefer the longer series
that ends less painfully. See also Kahneman, Wakker, and Sarin (1997).
53 Khaw, Glimcher, and Louie (2017) also present experimental evidence in favor of such
maladaptation. They show that the subjective value of an option in an auction varies inversely
with the average value of recently observed items. Wittman, et al. (2016) investigate the neural
5.1. Behavioral Adaptation—Simulations. In the model, an in-
creasing sequence of rewards leads to a sequence of pleasant surprises
that are then associated with higher levels of utility than would arise
with a decreasing sequence of rewards.
Suppose, for example, that rewards are produced by an arbitrary cdf F
that shifts to the right with time in a linear fashion. That is, the time-shifted
cdf's are given by F_t(y) = F(y − αt), for some α > 0. Consider
thresholds x_n for n = 1, ..., N and redefine these, relative to the moving
distribution, as x_n − αt.
It follows then that the approximating ODE as in Eq 8.2 from the
Appendix becomes
dx_n/dt = H(x_n, x_{n+1}) − H(x_{n−1}, x_n) − α, n = 1, ..., N,
where H(x_n, x_{n+1}) = ξ(x_{n+1} − x_n)^β(F(x_{n+1}) − F(x_n)), n = 0, ..., N, as
long as α is small enough.
The proof of Proposition 8.1 from the Appendix can readily be adapted
to show that x_n → x^*_n, where the x^*_n are characterized by H(x^*_n, x^*_{n+1}) =
H(x^*_{n−1}, x^*_n) + α, n = 1, ..., N.
It follows immediately that the long run effect of such an increase in
rewards is to shift all thresholds down, with each threshold being shifted
down relative to the one before. Hence average utility is unambiguously
shifted up, with this effect being more pronounced for larger α.54
An analogous argument applies if α < 0 so that average utility is shifted
down by a decreasing cdf.
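For the uniform-F case recorded in footnote 54, the stationary thresholds can be computed directly from the balance condition H(x_{n−1}, x_n) + α = H(x_n, x_{n+1}): with β = 0 and ξ = 1 it says that successive gaps grow by α. A minimal sketch (our own):

```python
def drift_thresholds(N, alpha):
    # Uniform F on [0, 1], beta = 0, xi = 1: stationarity under a rightward
    # drift at rate alpha requires successive gaps to grow by alpha, with the
    # N + 1 gaps summing to one (footnote 54's closed form).
    assert N * (N + 1) * alpha / 2 < 1
    d1 = (1 - N * (N + 1) * alpha / 2) / (N + 1)   # the first gap
    x, out = 0.0, []
    for n in range(N):
        x += d1 + n * alpha                        # gaps d1, d1 + alpha, ...
        out.append(x)
    return out

x = drift_thresholds(3, 0.01)
# every threshold sits below its no-drift counterpart n/(N + 1), so average
# utility is shifted up, and the gaps between successive thresholds grow
```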
This analysis is illustrated in Figure 3, where the cdf is the uniform
distribution, and there are three thresholds. This cdf shifts up in a
linear fashion over the first 50, 000 periods. During this phase, the
(Footnote 53, continued) mechanisms by which humans track the time trends of a series of rewards. Blanchard et al.
(2014) show that monkeys are also biased in favor of sequences of rewards that increase relative
to those that decrease.
54 If F is the uniform distribution on [0, 1], for example, it follows that x^*_n =
n(1 − N(N+1)α/2)/(N+1) + n(n−1)α/2, for n = 1, ..., N, given that N(N+1)α/2 < 1. Hence
the gap between successive x^*_n grows with n.
thresholds lag behind the quartiles of the current cdf, as the above
argument implies. This raises average utility.
At period 50,000, there is an abrupt jump up in the cdf, which creates a
spike in average utility. After period 50,000, the cdf declines linearly, at
the same rate that it rose before; the thresholds are above the quartiles,
and average utility falls, despite the higher average rewards in this
second phase.
Consider then a choice between rising consumption—as in the first
phase in Figure 3—and falling consumption—as in the second phase.
After a temporal reversal, the rewards in the second phase vector dominate
those in the first, but, except for the immediate aftermath of the
jump in rewards at the midpoint, which is an artifact of the precise
sequence here, utility levels in the first phase similarly vector dominate
those in the second. Applying an extended version of Hypothesis
2 of Section 3, to allow for vectors of utility levels, and for temporal
permutations of these vectors, this provides a firm basis for predicting
a preference for the first phase.55
5.2. Speed versus Accuracy—Simulations. In the current model,
when ε is small, convergence to the invariant distribution is slow, but
ultimately precise.56 This issue could be sharpened by assuming that
the underlying cdf, F , was subject to occasional change. Suppose, to
be more precise, that there is a (finite, say) set of cdf’s {Fj}. With a
Poisson arrival rate, the current cdf from this set is switched to a new
one, drawn at random from this set. It is intuitively compelling that
there should then be an optimal ε > 0 and that this should vary with
the rate of introduction of novelty, in particular.57
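This intuition can be illustrated with a small experiment (our own sketch; for simplicity the cdf alternates deterministically between F(y) = y and F(y) = y² every 20,000 periods, rather than at Poisson times, and ξ = 1/2). A very small ε cannot track the switches, a very large ε tracks them instantly but wobbles widely, and an intermediate ε does best.

```python
import random

def avg_error(eps, periods=120_000, switch=20_000, xi=0.5, seed=7):
    # Track three thresholds while the cdf alternates between F(y) = y and
    # F(y) = y^2; return the time-averaged distance of the thresholds from
    # the quartiles of whichever cdf is currently in force.
    rng = random.Random(seed)
    x = [0.25, 0.50, 0.75]
    targets = {1.0: [0.25, 0.50, 0.75],
               2.0: [0.25 ** 0.5, 0.50 ** 0.5, 0.75 ** 0.5]}
    total = 0.0
    for t in range(periods):
        gamma = 1.0 if (t // switch) % 2 == 0 else 2.0
        y = rng.random() ** (1 / gamma)
        z = [0.0] + x + [1.0]
        for n in range(1, len(z) - 1):
            if z[n] < y <= z[n + 1] and rng.random() < xi:
                x[n - 1] += eps
            elif z[n - 1] < y <= z[n] and rng.random() < xi:
                x[n - 1] -= eps
        x.sort()
        total += max(abs(a - b) for a, b in zip(x, targets[gamma]))
    return total / periods

slow, mid, fast = avg_error(0.00001), avg_error(0.001), avg_error(0.05)
# mid should beat both extremes: slow cannot track, fast wobbles
```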
55 This explanation obviates the need to invoke "recency bias", which could also predict a
preference for the first phase over the second.
56 Increasing N must also slow convergence, if only because, although there are now more
thresholds, in general at most two of these are adjusted in each period.
57 In a similar spirit, the number of thresholds might be allowed to vary with the problem at
hand. That is, if a problem has particularly high stakes, N might be allowed to increase, but at
a cost.
Figure 3. Behavioral Adaptation
This tradeoff between speed and accuracy seems bound to be theoretically
robust. That is, other models that differ in detail but still capture
rapid adaptation seem bound to also produce such a tradeoff.
Figure 4 illustrates these observations for the specific version of the
model in Section 4. It depicts the evolution of the three thresholds
over time, now contrasting two different values of the grid size ε; namely
0.002 and 0.000125, top and bottom, respectively. It is evident here
that a smaller value of ε slows down the speed of adjustment but
improves the precision of the ultimate allocation of thresholds.
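The contrast in Figure 4 is easy to reproduce (our own sketch; uniform F, three thresholds, ξ = 1/2, a fixed seed). The larger grid size reaches the neighborhood of the quartiles sooner, while the smaller grid size ends up closer to them.

```python
import random

def run(eps, periods=100_000, xi=0.5, seed=3):
    # beta = 0 rule with uniform F; returns (first period at which all three
    # thresholds are within 0.05 of the quartiles, mean distance from the
    # quartiles over the last 20,000 periods)
    rng = random.Random(seed)
    x = [0.05, 0.10, 0.15]                  # start far from the quartiles
    hit, tail = None, 0.0
    for t in range(periods):
        y = rng.random()
        z = [0.0] + x + [1.0]
        for n in range(1, len(z) - 1):
            if z[n] < y <= z[n + 1] and rng.random() < xi:
                x[n - 1] += eps
            elif z[n - 1] < y <= z[n] and rng.random() < xi:
                x[n - 1] -= eps
        x.sort()
        err = max(abs(x[i] - 0.25 * (i + 1)) for i in range(3))
        if hit is None and err < 0.05:
            hit = t
        if t >= periods - 20_000:
            tail += err
    return hit, tail / 20_000

hit_big, err_big = run(eps=0.002)
hit_small, err_small = run(eps=0.000125)
# larger eps: faster to arrive; smaller eps: more precise in the long run
```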
Adaptation should be slow when circumstances change infrequently,
but fast when circumstances change frequently. (This would consider
the parameter ε as endogenous, tailored to the circumstances.) This
Figure 4. Speed versus Accuracy
is consistent with adaptation to living in a new locale taking several
years, but adaptation to playing a game of penny ante poker being
much faster.
5.3. A Long Run Behavioral Application. The analysis of Sections
5.1 and 5.2 provides an application to a puzzle in labor economics—why
do wage streams paid to an individual by a given firm typically
rise over time?
This question is posed forcefully by Frank and Hutchens (1993).58 They
illuminate the issues by considering airline pilots and intercity bus
drivers, occupations in which it is hard to see commensurate productivity
gains.59
Frank and Hutchens propose that utility should be modified to include
a term in the rate of growth of consumption. The present model suggests
an alternative neurological explanation for increasing sequences
of consumption to be favored, ceteris paribus.
6. Sketch of Application to Choice over Gambles
The current model involves essentially deterministic choices. That is,
although the two outcomes are chosen from a common distribution,
choice is made subsequent to their realization. There are fruitful
applications that involve choice between gambles. For this, it is desirable
that preferences should be shaped by choices between gambles, where
the gambles themselves are drawn from a common distribution.
Robson (2018) develops such a model and shows that it accounts for
the adaptive S-shaped utility that is a key feature of Kahneman and
Tversky (1979). Furthermore, the paper resolves the puzzle posed most
forcefully by Rabin (2000) concerning the substantial aversion to risk
that can be observed with small stakes.
58 Dustmann and Meghir (2005) provide recent evidence of the returns to tenure. Perhaps
most relevantly, unskilled workers' wages grow noticeably with tenure, but grow with experience
only for a short period, and not at all with sector tenure.
59 Frank and Hutchens also argue that other resolutions to this puzzle in the literature do not
seem convincing for airline pilots or intercity bus drivers. For example, Becker (1964) argues
that firm-specific human capital might mean that premium wages in later years might motivate
workers to stay with the firm. But most training of airline pilots is done by the military prior to
their being hired by an airline. Similarly, Frank and Hutchens argue that the bonding contract
explanation of Lazear (1981) is not well-suited to pilots.
7. Conclusions
A key motivation here was to develop a model that is consistent with
the burgeoning neuroscience evidence about how decisions are orchestrated
in the brain. There is good evidence that economic decisions
are made by a neural mechanism with hedonic underpinnings.
We present a model where the cardinal levels of hedonic utility shift in
response to changing circumstances, as is also consistent with neuroscience.
This adaptation acts to reduce the error caused by a limited
ability to make fine distinctions, and is evolutionarily optimal.
There is no ultimate conflict with economics, however, since this limited
ability is the only reason there are mistakes at all; as this ability
improves, behavior converges to that implied by a standard economic
model.
The model provides a direct explanation of the "hedonic treadmill",
where reported utility levels tend to revert to the former levels after a
positive (or negative) shock to the circumstances. More generally, this
reversion may be partial, or there may be overshooting. Accordingly,
it provokes skepticism about a welfare interpretation of national
happiness surveys.
The model generates predictions concerning observed behavior. The
most straightforward of these is that individuals should make more
mistakes over small stakes decisions when they are inured to larger
stakes (and vice versa).
There are behavioral implications of the theory related to "behavioral
contrast". For example, rapid but incomplete adaptation explains why
bidding behavior in auctions would exhibit memory effects that appear
irrational. Finally, slow adaptation would explain why increasing
consumption profiles might be preferred to decreasing profiles even if the
latter dominate the former.
Department of Economics, Simon Fraser University and Department
of Physics and Astronomy, University of British Columbia
8. Appendix—Proofs
Proof of Theorem 3.2.
Suppose that the (finite) Markov chain described by Equation (3.2) is
represented by the matrix A_G.60 This is a |S_G| by |S_G| matrix which is
irreducible, so that there exists an integer P such that (A_G)^P has only
strictly positive entries.61
Consider an initial state x^t_G ∈ S_G where 0 ≤ x^t_{1,G} ≤ x^t_{2,G} ≤ ... ≤ x^t_{N,G} ≤ 1.
Consider then the random variable x^{t+∆}_G that represents the state
of the chain at t + ∆, where ∆ > 0 is fixed, for the moment. Suppose
there are R iterations of the chain, where R = ⌊∆/ε⌋. These iterations
arise at times t + rε for r = 1, ..., R. Suppose the process is constant
between iterations, so it is defined for all t′ ∈ [t, t + ∆].
We consider the limit as G → ∞. Taking this limit implies ε → 0,
but also speeds up the process in a compensatory way, in that R→∞,
making the limit non-trivial. This speeding up is only a technical device
and has no effect on the invariant distribution, in particular.
We adopt the notational simplification that
H(x_n, x_{n+1}) = ξ(x_{n+1} − x_n)^β(F(x_{n+1}) − F(x_n)), n = 0, ..., N.
Indeed, the key results here only rely on the properties that H_1(x_n, x_{n+1}) < 0
and H_2(x_n, x_{n+1}) > 0.
We have then that x^{t+∆}_{n,G} = x^t_{n,G} + Σ^R_{r=1} ε_r, where
60 See Mathematical Society of Japan (1987, 260, p. 963), for example.
61 Consider any initial configuration, x^0, say, and any desired final configuration, x^T, say. First
move x^0_1 to x^T_1 by means of outcomes just to the right or left, as required, that do not affect any
other thresholds. This might entail x_1 crossing the position of other thresholds, but temporarily
suspend the usual convention of renumbering the thresholds, if so. This will take at most G + 1
periods. Then move x^0_2 to x^T_2 in an analogous way. And so on. There is a finite time, (G + 1)N,
such that the probability of all this is positive, given the assumptions in Section 3.2.
(8.1)
ε_r =
  ε    with probability H(x^{t+rε}_{n,G}, x^{t+rε}_{n+1,G})
  −ε   with probability H(x^{t+rε}_{n−1,G}, x^{t+rε}_{n,G})
  0    otherwise.
It follows that
(x^{t+∆}_{n,G} − x^t_{n,G})/∆ = (Σ^R_{r=1} ε_r/ε)/(R(∆/(εR))),
where (∆/(εR)) → 1, as G → ∞.
Since
x^{t+rε}_n ∈ [x^t_{n,G} − ∆, x^t_{n,G} + ∆], r = 1, ..., R,
it follows that
Pr{ε_r/ε = 1} ∈ [H(x^t_{n,G} + ∆, x^t_{n+1,G} − ∆), H(x^t_{n,G} − ∆, x^t_{n+1,G} + ∆)], r = 1, ..., R,
and that
Pr{ε_r/ε = −1} ∈ [H(x^t_{n−1,G} + ∆, x^t_{n,G} − ∆), H(x^t_{n−1,G} − ∆, x^t_{n,G} + ∆)], r = 1, ..., R,
with probability 1, in the limit as G → ∞, so that ε → 0 and R → ∞.
Hence, if, finally, ∆ → 0, it follows that
(x^{t+∆}_{n,G} − x^t_{n,G})/∆ → H(x^t_n, x^t_{n+1}) − H(x^t_{n−1}, x^t_n),
with probability 1, so that, with probability 1—
(8.2) dx_n/dt = H(x_n, x_{n+1}) − H(x_{n−1}, x_n), n = 1, ..., N.62
Lemma 8.1. There exist unique x^*_n, n = 1, ..., N, such that dx_n/dt = 0,
n = 1, ..., N.
Proof. Choose any x_1 > 0. Then there exist x_2 < x_3 < ... < x_{N+1}
such that H(0, x_1) = H(x_1, x_2) = ... = H(x_N, x_{N+1}). Clearly x_n, n =
2, ..., N + 1, are strictly increasing and continuous in x_1, with x_{N+1} → 0
as x_1 → 0 and x_{N+1} → ∞ as x_1 → ∞. Hence there exists a unique x_1
such that x_{N+1} = 1. This generates the x^*_n, n = 1, ..., N, as claimed in
Theorem 3.2.
62 This expression is valid even if there are ties so that x_n = x_{n+1}, for example. In this case,
x_n and x_{n+1} immediately split apart, relying on the convention that x_n ≤ x_{n+1}.
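The construction in this proof is effectively an algorithm: pick x_1, extend the sequence so that all the H values are equal, and adjust x_1 by bisection until x_{N+1} = 1. A sketch (our own; it assumes F is supplied as an increasing function that extends monotonically past 1, so the inner bisection can bracket):

```python
def solve_star(F, N, beta):
    # Shooting construction from Lemma 8.1: given x1, build x2 < ... < x_{N+1}
    # with H(0, x1) = H(x1, x2) = ... = H(x_N, x_{N+1}), where
    # H(a, b) = (b - a)**beta * (F(b) - F(a)); then bisect on x1 so x_{N+1} = 1.
    def H(a, b):
        return (b - a) ** beta * (F(b) - F(a))

    def extend(x1):
        xs, c = [0.0, x1], H(0.0, x1)
        for _ in range(N):
            lo, hi = xs[-1], 10.0        # H is increasing in its second argument
            for _ in range(100):
                m = (lo + hi) / 2
                lo, hi = (m, hi) if H(xs[-1], m) < c else (lo, m)
            xs.append((lo + hi) / 2)
        return xs

    lo, hi = 1e-9, 1.0                   # x_{N+1} is increasing in x1
    for _ in range(100):
        m = (lo + hi) / 2
        lo, hi = (m, hi) if extend(m)[-1] < 1.0 else (lo, m)
    return extend((lo + hi) / 2)[1:-1]

# uniform F, beta = 0, three thresholds: the solution is the quartiles
u = solve_star(lambda t: t, 3, 0.0)
# F(x) = x^2, beta = 0, one threshold: F(x1) = 1 - F(x1) gives x1 = sqrt(1/2)
q = solve_star(lambda t: t * t, 1, 0.0)
```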
Proposition 8.1. The differential equation system given by Equation
(8.2) is globally asymptotically stable. That is, given any initial x(0)
where 0 ≤ x1(0) ≤ x2(0) ≤ ... ≤ xN(0) ≤ 1, it follows that x(t) → x∗
as t→∞.
Proof. The proof proceeds by finding a Lyapunov function.63 Reversing
the usual order of the thresholds, for expositional clarity, the
second derivatives are given by
d²x_N/dt² = (H^N_1 − H^{N−1}_2) dx_N/dt − H^{N−1}_1 dx_{N−1}/dt, ...,
(8.3)
d²x_n/dt² = H^n_2 dx_{n+1}/dt + (H^n_1 − H^{n−1}_2) dx_n/dt − H^{n−1}_1 dx_{n−1}/dt, ...,
d²x_1/dt² = H^1_2 dx_2/dt + (H^1_1 − H^0_2) dx_1/dt,
where H^n_1 = H_1(x_n, x_{n+1}) < 0 and H^n_2 = H_2(x_n, x_{n+1}) > 0, for
compactness of notation.
Shifting to vector notation and using "dot" notation for derivatives,
for further compactness, Equations (8.2) and (8.3) can be written as
(8.4) ẋ = D(x) and ẍ = E(x)ẋ, respectively,
where vectors are by default column vectors and "T" denotes transpose
so that, for example, x^T = (x_N, x_{N−1}, ..., x_1).
The vector D(x) is implied by Equation (8.2); the matrix E(x) is given
as follows—
E(x) =
  [ H^N_1 − H^{N−1}_2    −H^{N−1}_1               0                       ...
    H^{N−1}_2            H^{N−1}_1 − H^{N−2}_2    −H^{N−2}_1              ...
    0                    H^{N−2}_2                H^{N−2}_1 − H^{N−3}_2   ...
    ...                  ...                      ...                     ... ].
63 See Mathematical Society of Japan (1987, 126F, p. 492), for example.
Define B_n, n = 1, ..., N, as the n-th principal nested minor of E(x),
where these minors are defined relative to the lower right corner of
E(x).
Lemma 8.2. The matrix E(x) is negative definite because the sign of
B_n is (−1)^n for n = 1, 2, ..., N.
Proof. From the definition of B_n, it follows that

  B_n = (H^n_1 - H^{n-1}_2) B_{n-1} + H^{n-1}_1 H^{n-1}_2 B_{n-2},

so that, rearranging,

  B_n - H^n_1 B_{n-1} = -H^{n-1}_2 (B_{n-1} - H^{n-1}_1 B_{n-2}).

It also follows that B_1 = H^1_1 - H^0_2 < 0 and B_2 - H^2_1 B_1 = H^1_2 H^0_2 > 0.
Hence the sign of B_n - H^n_1 B_{n-1} is (-1)^n.

Suppose, as an induction hypothesis, that the sign of B_{n-1} is (-1)^{n-1}.
Since B_n = (B_n - H^n_1 B_{n-1}) + H^n_1 B_{n-1}, it follows that the sign of
B_n is (-1)^n, as required to complete the proof of Lemma 8.2.
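The sign pattern in Lemma 8.2 can be checked numerically. The sketch below builds a matrix of the form E(x) from arbitrary values H^n_1 < 0 and H^n_2 > 0 (the particular numbers are illustrative assumptions, not derived from the model) and verifies that the nested minors B_n, taken from the lower right corner, alternate in sign.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6
# H1[n] stands for H_1(x_n, x_{n+1}) < 0; H2[n] for H_2(x_n, x_{n+1}) > 0.
H1 = -rng.uniform(0.5, 2.0, N + 1)
H2 = rng.uniform(0.5, 2.0, N + 1)

# Build E(x) with columns ordered (x_N, x_{N-1}, ..., x_1), as in the text.
E = np.zeros((N, N))
for i in range(N):
    n = N - i                      # row i is the equation for x_n
    E[i, i] = H1[n] - H2[n - 1]    # diagonal entry H^n_1 - H^{n-1}_2
    if i > 0:
        E[i, i - 1] = H2[n]        # coefficient on dx_{n+1}/dt
    if i < N - 1:
        E[i, i + 1] = -H1[n - 1]   # coefficient on dx_{n-1}/dt

# B_n is the determinant of the lower-right n x n block; its sign is (-1)^n.
signs = [np.sign(np.linalg.det(E[N - n:, N - n:])) for n in range(1, N + 1)]
print(signs)  # alternating, starting negative
```

Since the induction in the proof uses only the signs of H^n_1 and H^n_2, the alternation holds for any admissible choice of these values.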
Global asymptotic stability now follows. Define a Lyapunov function as

  V(x) = ẋ^T ẋ = D(x)^T D(x), so that V(x) ≥ 0 and V(x) = 0 iff x = x*.

Hence, since E(x) is negative definite,

  V̇ = 2 ẋ^T ẍ = 2 ẋ^T E(x) ẋ ≤ 0, and V̇ = 0 iff x = x*.

It follows that the ordinary differential equation system given by
Equation (8.2) is globally asymptotically stable. That is, given any
initial x(0) with 0 ≤ x_1(0) ≤ x_2(0) ≤ ... ≤ x_N(0) ≤ 1, it must be that
x(t) → x* as t → ∞. This completes the proof of Proposition 8.1.
We can now complete the proof of Theorem 3.2. Suppose that F_G is the
cdf representing the unique invariant distribution of the Markov chain
with transition matrix A_G. Extend F_G to be defined on the entire space
S. By compactness, it follows that there exists a subsequence of the
F_G that converges weakly to a cdf F defined on S (Billingsley, 1968,
Chapter 1). That is, F_G ⇒ F as G → ∞. We will show that F puts
full weight on the singleton x*. Once this is shown, it follows that the
full sequence must also converge to this F.

Suppose then, by way of contradiction, that F does not put full weight
on x*, so that ∫ V(x) dF(x) > 0.
Reconsider then the construction that led to the differential equation
system that approximates the Markov chain, as described from the
beginning of this Appendix. Choose any x ∈ S, where x ≠ x* and
0 ≤ x_1 < x_2 < ... < x_N ≤ 1. Now let x_G be any of the points in S_G
that are closest to x. Let x^Δ_G(x) be the random variable describing the
Markov chain at t = Δ that starts at x_G at t = 0. Consider now the
limit as G → ∞, so that the number of repetitions in the fixed time
Δ, given by R = ⌊Δ/ε⌋, also tends to infinity. Suppose x^Δ(x) is the
solution to Equation (8.2), that is, to ẋ = D(x), at t = Δ, given it has
initial value x at t = 0.

Given that x ≠ x* and 0 ≤ x_1 < x_2 < ... < x_N ≤ 1, it follows
that V(x^Δ(x)) < V(x), since we showed that V̇ < 0 on [0, Δ]. By
hypothesis, ∫ V(x) dF(x) > 0. It follows that

(8.5)  ∫ V(x^Δ(x)) dF(x) < ∫ V(x) dF(x).

That this inequality holds in the limit implies that it must hold for
large enough G, as follows.

First, the derivation of the approximating system ẋ = D(x) implies, in
particular, that

(8.6)  E V(x^Δ_G(x)) → V(x^Δ(x)) as G → ∞.
It now follows that

  | ∫ E V(x^Δ_G(x)) dF_G(x) − ∫ V(x^Δ(x)) dF(x) |
    ≤ | ∫ E V(x^Δ_G(x)) dF_G(x) − ∫ V(x^Δ(x)) dF_G(x) |
    + | ∫ V(x^Δ(x)) dF_G(x) − ∫ V(x^Δ(x)) dF(x) |.

The first term on the right hand side tends to zero as G → ∞, by the
Lebesgue dominated convergence theorem, given Equation (8.6). The
second term on the right hand side also tends to zero as G → ∞, since
F_G ⇒ F and the integrand is continuous. Hence

(8.7)  ∫ E V(x^Δ_G(x)) dF_G → ∫ V(x^Δ(x)) dF as G → ∞.
Secondly, since F_G ⇒ F as G → ∞ and V is continuous, it follows
that

(8.8)  ∫ V(x) dF_G(x) → ∫ V(x) dF(x).

Altogether, then, Equations (8.5), (8.7) and (8.8) imply that, whenever
G is sufficiently large,

(8.9)  ∫ E V(x^Δ_G(x)) dF_G(x) < ∫ V(x) dF_G(x),

which is a contradiction, since F_G is the invariant distribution.
To show this explicitly, revert to matrix notation for the finite Markov
chain with transition matrix A_G. Suppose then that f_G is the column
vector describing the associated invariant distribution, so that
f_G^T = f_G^T A_G. As before, let R = ⌊Δ/ε⌋. We have

  E V(x^Δ_G(x)) = Σ_{x ∈ S_G} [e(x)^T (A_G)^R](x) V(x),

where e(x) is the unit vector that assigns 1 to x ∈ S_G and 0 to all
other elements of S_G. It follows that Equation (8.9) becomes

  Σ_{x ∈ S_G} [f_G^T (A_G)^R](x) V(x) < Σ_{x ∈ S_G} f_G^T(x) V(x),

which is a contradiction, since f_G^T (A_G)^R = f_G^T. This completes the
proof of Theorem 3.2.
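The final step turns on nothing more than the identity f_G^T (A_G)^R = f_G^T, which forces the expectation of V under the invariant distribution to equal the average of V itself, ruling out the strict inequality (8.9). A minimal numerical sketch, using an assumed 3-state transition matrix (the entries and the values of V are illustrative, not derived from the model):

```python
import numpy as np

# Illustrative 3-state stochastic matrix (rows sum to 1).
A = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])

# Invariant distribution f: the left eigenvector of A for eigenvalue 1.
w, v = np.linalg.eig(A.T)
f = np.real(v[:, np.argmin(np.abs(w - 1.0))])
f = f / f.sum()

R = 7                                  # number of repetitions, as in R = floor(Delta/eps)
V = np.array([3.0, 1.0, 2.0])          # arbitrary values of V on the three states

lhs = f @ np.linalg.matrix_power(A, R) @ V   # analogue of the left side of (8.9)
rhs = f @ V                                  # analogue of the right side of (8.9)
print(abs(lhs - rhs) < 1e-12)                # the two sides coincide
```

Since f^T A^R = f^T for every R, the two integrals in (8.9) are identical under any invariant distribution, whatever V is.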
Proof of Theorem 3.3.

Lemma 8.3. Consider a uniform distribution with pdf 1/s on the interval
[0, s]. The loss from choosing at random, relative to the full
information ideal, is s/6.

The expected payoff from choosing randomly between the two arms is
clearly s/2. The expected payoff from choosing the higher of the two
arms, on the other hand, as under the full information ideal, is 2s/3.
To see this, note that

  K(y) = Pr{max{y_1, y_2} < y} = Pr{y_1 < y and y_2 < y} = (y/s)^2,

so that ∫_0^s y dK(y) = 2s/3. It follows that the expected loss from
choosing at random is 2s/3 − s/2 = s/6, proving Lemma 8.3.
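Lemma 8.3 is easy to confirm by simulation. The sketch below (with an assumed interval length s = 2) estimates both expected payoffs by Monte Carlo:

```python
import numpy as np

s = 2.0                                        # interval length; illustrative value
rng = np.random.default_rng(1)
y = rng.uniform(0.0, s, size=(2, 1_000_000))   # payoffs of the two arms

random_pick = y[0].mean()                      # choosing an arm at random: about s/2
best_pick = y.max(axis=0).mean()               # full-information ideal: about 2s/3
print(best_pick - random_pick)                 # about s/6
```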
Define now the expected fitness loss, relative to the full information
ideal, for the step function pdf as in the statement of Theorem 3.4, to
be L(N). It follows that

(8.10)  L = (1/6) Σ_{m=1}^{M} (n_m − 1) s_m (α_m s_m)^2 + Σ_{m=1}^{M} d_m.

Here, n_m is the number of thresholds that lie in the subinterval
[(m−1)/M, m/M]; these must be evenly spaced, a distance s_m apart, except
at the ends of the subinterval. In the intervals that overlap m/M, the
expected loss is d_m, say.

This expression for L(N) holds because the loss between each pair of
adjacent thresholds in [(m−1)/M, m/M] is s_m/6, conditional on both
outcomes being in that range; there are n_m − 1 such ranges; and the
probability of both outcomes lying in each range is (α_m s_m)^2.
The limiting equilibrium of the rule of thumb entails

  H(x_{n−1}, x_n) = H(x_n, x_{n+1}) = H(x_{n+1}, x_{n+2}).

If x_n ≤ m/M < x_{n+1}, it follows that H(x_{n−1}, x_n) = H(x_{n+1}, x_{n+2}),
so that

  α_m (s_m)^{1+β} = α_{m+1} (s_{m+1})^{1+β},

since s_m = x_n − x_{n−1} and s_{m+1} = x_{n+2} − x_{n+1}. It follows that
there exists λ such that

  s_m = λ (α_m)^{−1/(1+β)}, m = 1, ..., M.

It also follows that

  d_m ≤ (1/6) ᾱ^2 s̄^3, where ᾱ = max_m α_m and s̄ = 2 max_m s_m.
Furthermore,

  (n_m − 1) s_m ≤ 1/M and, since (n_m + 1) s_m ≥ 1/M, (n_m − 1) s_m ≥ 1/M − 2 s_m.

Each value of N induces a corresponding value of λ; further, λ → 0 as
N → ∞.
The foregoing implies that

  L ≥ (1/6) Σ_{m=1}^{M} (1/M − 2 s_m)(α_m s_m)^2 and
  L ≤ (1/6) Σ_{m=1}^{M} (1/M)(α_m s_m)^2 + (M/6) s̄^3 ᾱ^2.

There exists η such that s̄ ≤ ηλ. Since it is also true that
s_m = λ (α_m)^{−1/(1+β)}, m = 1, ..., M, it follows that

  L/λ^2 → Σ_{m=1}^{M} α_m^{2β/(1+β)} / (6M) as λ → 0.
Furthermore, since (n_m − 1) s_m ∈ [1/M − 2 s_m, 1/M], it follows that

  n_m ≤ α_m^{1/(1+β)}/(Mλ) + 1 and n_m ≥ α_m^{1/(1+β)}/(Mλ) − 1.

Since Σ_m n_m = N, it follows that

  Nλ ∈ [Σ_m α_m^{1/(1+β)}/M − Mλ, Σ_m α_m^{1/(1+β)}/M + Mλ].

Thus

  Nλ → Σ_m α_m^{1/(1+β)}/M as λ → 0.

Hence

  N^2 L → Σ_m α_m^{2β/(1+β)} (Σ_m α_m^{1/(1+β)})^2 / (6M^3) as N → ∞.
Lemma 8.4. The expression Σ_m α_m^{2β/(1+β)} (Σ_m α_m^{1/(1+β)})^2 is
minimized uniquely by the choice β = 1/2.

This follows from the Hölder inequality (Royden, 1988, p. 119), since

  Σ_m α_m^{2β/(3(1+β))} α_m^{2/(3(1+β))} = Σ_m α_m^{2/3}
    ≤ (Σ_m α_m^{2β/(1+β)})^{1/3} (Σ_m α_m^{1/(1+β)})^{2/3}.

Furthermore, equality can only hold here if α_m^{2β/(1+β)} is
proportional to α_m^{1/(1+β)} across m; that is, only if β = 1/2.

It follows that, when β = 1/2,

  N^2 L → (Σ_m α_m^{2/3})^3 / (6M^3).
This completes the proof of Theorem 3.3.
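The role of β = 1/2 in Lemma 8.4 can be checked by direct computation. The sketch below evaluates the expression Σ_m α_m^{2β/(1+β)} (Σ_m α_m^{1/(1+β)})^2 on a grid of β values, for assumed step heights α_m (the numbers are illustrative):

```python
import numpy as np

alpha = np.array([0.5, 1.0, 2.0, 4.0])    # illustrative step heights

def objective(beta):
    p = 1.0 / (1.0 + beta)
    return (alpha ** (2 * beta * p)).sum() * (alpha ** p).sum() ** 2

betas = np.linspace(0.05, 3.0, 600)
best = betas[np.argmin([objective(b) for b in betas])]
print(best)                                # close to 0.5

# At beta = 1/2 the expression collapses to (sum_m alpha_m^{2/3})^3,
# the bound given by the Holder inequality.
print(np.isclose(objective(0.5), (alpha ** (2 / 3)).sum() ** 3))   # True
```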
Proof of Theorem 3.4.

We will now show that the optimal rule has the same leading term as the
rule of thumb with β = 1/2. Suppose that the expected loss from the
optimal placement of the thresholds, relative to the full information
ideal, is given by L*. An entirely similar argument to that used for L
shows, upon multiplying by N^2, that

  N^2 L* ∈ [(1/6) Σ_{m=1}^{M} (1/M − 2 s_m)(α_m N s_m)^2,
            (1/6) Σ_{m=1}^{M} (1/M)(α_m N s_m)^2 + (M N^2/6) s̄^3 ᾱ^2].

Consider the vector (n_1/N, ..., n_m/N, ..., n_M/N). By compactness,
there must exist a convergent subsequence such that n_m/N → γ_m as
N → ∞. We will characterize the γ_m uniquely, so that the entire
sequence must then converge to these values as well.

It must be that s_m → 0 for all m as N → ∞, since otherwise it would not
be true that L* → 0, contradicting the optimality of L*. It follows from
(n_m − 1) s_m ≤ 1/M and (n_m − 1) s_m ≥ 1/M − 2 s_m that n_m s_m → 1/M.

If γ_m = 0, it follows from (n_m/N)(N s_m) ≥ 1/M − s_m that N s_m → ∞.
This implies that N^2 L* → ∞, which is not optimal. Hence we have
N s_m → 1/(M γ_m).

It follows now that N s̄ is bounded above and that s̄ = 2 max_m s_m → 0
as N → ∞. Hence

  N^2 L* → (Σ_m α_m^2/γ_m^2)/(6M^3).

The optimal rule must minimize this expression over the choice of the
γ_m ≥ 0, m = 1, ..., M, where Σ_m γ_m = 1. Since this function is convex
in the γ_m ≥ 0, m = 1, ..., M, the first-order conditions are necessary
and sufficient for a global minimum. There must then exist a λ such that
α_m^2/γ_m^3 = λ^3, so γ_m = α_m^{2/3}/λ. It follows that λ = Σ_m α_m^{2/3}.
Hence

  N^2 L* → (Σ_m α_m^{2/3})^3/(6M^3).

This completes the proof of Theorem 3.4.
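The closing optimization can also be verified numerically: with γ_m proportional to α_m^{2/3}, the limit Σ_m α_m^2/γ_m^2 equals (Σ_m α_m^{2/3})^3, and no other feasible allocation does better. A sketch with assumed α_m values:

```python
import numpy as np

alpha = np.array([0.5, 1.0, 2.0, 4.0])            # illustrative step heights

def loss(gamma):
    # leading term of N^2 L*, up to the common factor 1/(6 M^3)
    return (alpha ** 2 / gamma ** 2).sum()

gamma_star = alpha ** (2 / 3) / (alpha ** (2 / 3)).sum()
print(np.isclose(loss(gamma_star), (alpha ** (2 / 3)).sum() ** 3))   # True

# Random feasible allocations (gamma > 0, summing to 1) never beat gamma_star.
rng = np.random.default_rng(2)
worse = [loss(rng.dirichlet(np.ones(alpha.size))) for _ in range(1000)]
print(min(worse) >= loss(gamma_star) - 1e-9)                          # True
```

This matches the limiting loss of the rule of thumb with β = 1/2, as the theorem asserts.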
References

[1] Stephen A. Baccus and Markus Meister. Fast and slow contrast adaptation in retinal circuitry.
Neuron, 36:909–919, 2002.
[2] Gary S. Becker. Human Capital. Columbia University Press, New York, 1964.
[3] Patrick Billingsley. Convergence of Probability Measures. John Wiley and Sons, New York,
1968.
[4] Tommy C. Blanchard, Lauren S. Wolfe, Ivo Vlaev, Joel S. Winston, and Benjamin Y. Hayden.
Biases in preferences for sequences of outcomes in monkeys. Cognition, 130:289–299, 2014.
[5] Christopher J. Burke, Michelle Baddeley, Philippe Tobler, and Wolfram Schultz. Partial
adaptation of obtained and observed value signals preserves information about gains and
losses. Journal of Neuroscience, 36(39):10016–25, 2016.
[6] Andrew Caplin and Mark Dean. Dopamine, reward prediction, and economics. Quarterly
Journal of Economics, 123(2):663–701, 2008.
[7] Andrew E. Clark, Sarah Fleche, Richard Layard, Nattavudh Powdthavee, and George Ward.
The Origins of Happiness: The Science of Well-Being Over the Life Course. Princeton
University Press, Princeton and Oxford, 2018.
[8] Leo P. Crespi. Quantitative variation of incentive and performance in the white rat. The
American Journal of Psychology, 55(4):467–517, 1942.
[9] Christian Dustmann and Costas Meghir. Wages, experience and seniority. Review of Economic
Studies, 72:77–108, 2005.
[10] Robert H. Frank and Robert M. Hutchens. Wages, seniority, and the demand for rising
consumption profiles. Journal of Economic Behavior and Organization, 21:251–276, 1993.
[11] Shane Frederick and George Loewenstein. Hedonic adaptation. In Daniel Kahneman, Edward
Diener, and Norbert Schwarz, editors, Well-Being: The Foundations of Hedonic Psychology,
pages 302–329. Russell Sage Foundation Press, New York, 1999.
[12] Todd A. Hare, Wolfram Schultz, Colin F. Camerer, John P. O'Doherty, and Antonio Rangel.
Transformation of stimulus values into motor commands during simple choice. Proceedings
of the National Academy of Sciences of the USA, 108:18120–25, 2011.
[13] Gerhard Jocham, Tilmann A. Klein, and Markus Ullsperger. Dopamine-mediated reinforcement
learning signals in the striatum and ventromedial prefrontal cortex underlie value-based
choices. Journal of Neuroscience, 31:1606–13, 2011.
[14] Daniel Kahneman, Barbara L. Fredrickson, Charles A. Schreiber, and Donald Redelmeier.
When more pain is preferred to less: Adding a better end. Psychological Science, 4(6):401–405,
1993.
[15] Daniel Kahneman and Amos Tversky. Prospect theory: An analysis of decision under risk.
Econometrica, 47(2):263–291, 1979.
[16] Daniel Kahneman, Peter P. Wakker, and Rakesh Sarin. Back to Bentham? Explorations of
experienced utility. Quarterly Journal of Economics, 112(2):375–406, 1997.
[17] Mel W. Khaw, Paul W. Glimcher, and Kenway Louie. History-dependent adaptation in subjective
value: A waterfall illusion for choice. Center for Neural Science, New York University, 2017.
[18] Simon Laughlin. A simple coding procedure enhances a neuron's information capacity.
Zeitschrift für Naturforschung C, 36:910–912, 1981.
[19] Edward Lazear. Agency, earnings profiles, productivity, and hours restrictions. American
Economic Review, 71:606–620, 1981.
[20] Kenway Louie and Paul W. Glimcher. Efficient coding and the neural representation of value.
Annals of the New York Academy of Sciences, 1251:13–32, 2012.
[21] Mathematical Society of Japan, Kiyoshi Ito, ed. Encyclopedic Dictionary of Mathematics,
Third Ed. MIT Press, Cambridge, MA, 1987.
[22] Frederick Mosteller and Philip Nogee. An experimental measurement of utility. Journal of
Political Economy, 59(5):371–404, 1951.
[23] Nick Netzer. Evolution of time preferences and attitudes towards risk. American Economic
Review, 99(3):937–55, 2009.
[24] Matthew Rabin. Risk aversion and expected-utility theory: A calibration theorem.
Econometrica, 68:1281–1292, 2000.
[25] Luis Rayo and Gary Becker. Evolutionary efficiency and happiness. Journal of Political
Economy, 115(2):302–337, 2007.
[26] Alfonso Renart and Christian K. Machens. Variability in neural activity and behavior. Current
Opinion in Neurobiology, 25:211–220, 2014.
[27] Arthur J. Robson. A biological basis for expected and non-expected utility. Journal of
Economic Theory, 68(2):397–424, 1996.
[28] Arthur J. Robson. The biological basis of economic behavior. Journal of Economic Literature,
39(1):11–33, 2001.
[29] Arthur J. Robson. Adaptive hedonic utility and risk-taking. Working paper, Simon Fraser
University, 2018.
[30] Arthur J. Robson and Larry Samuelson. The evolutionary optimality of decision and experienced
utility. Theoretical Economics, 6:311, 2011.
[31] Robert D. Rogers. The roles of dopamine and serotonin in decision making: Evidence from
pharmacological experiments in humans. Neuropsychopharmacology, 36:114–132, 2011.
[32] H. L. Royden. Real Analysis. Prentice Hall, Englewood Cliffs, NJ, 1988.
[33] David A. Schkade and Daniel Kahneman. Does living in California make people happy? A
focusing illusion in judgments of life satisfaction. Psychological Science, 9(5):340–346, 1998.
[34] Wolfram Schultz. Neuronal reward and decision signals: From theories to data. Physiological
Reviews, 95:853–951, 2015.
[35] Wolfram Schultz. Dopamine reward prediction error coding. Dialogues in Clinical
Neuroscience, 18(1), 2016.
[36] Michael N. Shadlen and Daphna Shohamy. Decision making and sequential sampling from
memory. Neuron, 90:927–39, 2016.
[37] William R. Stauffer, Armin Lak, and Wolfram Schultz. Dopamine reward prediction error
responses reflect marginal utility. Current Biology, 2014.
[38] Philippe N. Tobler, Christopher D. Fiorillo, and Wolfram Schultz. Adaptive coding of reward
value by dopamine neurons. Science, 307:1642–1645, 2005.
[39] D. J. Tolhurst, J. A. Movshon, and A. F. Dean. The statistical reliability of signals in single
neurons in cat and monkey visual cortex. Vision Research, 23(8):775–785, 1983.
[40] Martin Vestergaard and Wolfram Schultz. Choice mechanisms for past, temporally extended
outcomes. Proceedings of the Royal Society: Series B, 282:20141766, 2015.
[41] Joshua A. Weller, Irwin P. Levin, Baba Shiv, and Antonio Bechara. Neural correlates of
adaptive decision making for risky gains and losses. Psychological Science, 18(11):958–64,
2007.
[42] Marco K. Wittmann, Nils Kolling, Rei Akaishi, Bolton K. H. Chau, Joshua W. Brown, Natalie
Nelissen, and Matthew F. S. Rushworth. Predictive decision making driven by multiple time-linked
reward representations in the anterior cingulate cortex. Nature Communications, 2016.
DOI 10.1038/ncomms12327.
[43] Michael Woodford. Inattentive valuation and reference-dependent choice. Department of
Economics, Columbia University, 2012.