
1

Learning and the Economics of Small Decisions

Ido Erev, Technion

Based on a chapter written with Ernan Haruvy for the 2nd Vol. of the Handbook of Experimental Economics edited by Kagel and Roth.

The chapter focuses on the relationship between basic learning phenomena and mainstream behavioral economic research.

To clarify the analysis, we start with replications of the basic learning phenomena in a simplified “standard paradigm” (see Hertwig and Ortmann, 2002).


2

The clicking paradigm

The current experiment includes many trials. Your task, in each trial, is to click on one of the two keys presented on the screen. Each click will be followed by the presentation of the keys’ payoffs. Your payoff for the trial is the payoff of the selected key.

You selected Right. Your payoff in this trial is 1. Had you selected Left, your payoff would be 0.
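The trial structure described above can be sketched in a few lines of Python. This is only an illustration: the function names (`draw`, `play_trial`) and the two deterministic keys are assumptions chosen to reproduce the feedback screen shown above, not the software used in the experiments.

```python
import random


def draw(prospect, rng):
    """Draw one payoff from a prospect given as [(payoff, probability), ...]."""
    payoffs = [payoff for payoff, _ in prospect]
    weights = [prob for _, prob in prospect]
    return rng.choices(payoffs, weights=weights, k=1)[0]


def play_trial(choice, left, right, rng):
    """Return (obtained, forgone) payoffs for one click on "Left" or "Right"."""
    outcome = {"Left": draw(left, rng), "Right": draw(right, rng)}
    other = "Left" if choice == "Right" else "Right"
    return outcome[choice], outcome[other]


rng = random.Random(0)
left = [(0, 1.0)]   # hypothetical key that always pays 0
right = [(1, 1.0)]  # hypothetical key that always pays 1
obtained, forgone = play_trial("Right", left, right, rng)
print(f"You selected Right. Your payoff in this trial is {obtained}. "
      f"Had you selected Left, your payoff would be {forgone}.")
```

Replacing either key with a list such as `[(10, 0.1), (-1, 0.9)]` gives a risky key of the kind used in the problems that follow.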

10

Not a test of rational decision theory; the rationality assumption is not even wrong.

3

Problem   S   R                P(R) (%)

5         0   (10, .1; -1)     27

6         0   (-10, .1; +1)    56

400 trials, ¼ cent per point

1. Underweighting of rare events (Barron & Erev, 2003)

Risk Seeking

Experience-Description gap (Hertwig et al., 2004)

Underestimation? (Barron & Yechiam, 2009)

Robust to prior information (Jessup et al., 2008)

Reversed certainty effect

Similar pattern in Honey Bee (Shafir et al., 2008)

Taleb’s Black Swan effect

Sensitivity to magnitude: -20 vs. -10 (Ert & Erev, 2010)

Risk Aversion

4

Problem   H               L               P(H) (%)

1         1               0               96

2         (11, .5; -9)    0               58

3         0               (9, .5; -11)    53

2. The payoff variability effect (Myers & Sadler, 1960; Busemeyer & Townsend, 1993).

Neither!!

Risk aversion or loss aversion?
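A quick check of the expected values in Problems 1 to 3 may help here. The sketch below assumes the notation (x, p; y) means "x with probability p, y otherwise"; the helper name `expected_value` is illustrative.

```python
def expected_value(prospect):
    """Expected value of a prospect given as [(payoff, probability), ...]."""
    return sum(payoff * prob for payoff, prob in prospect)


problems = {
    1: {"H": [(1, 1.0)], "L": [(0, 1.0)]},
    2: {"H": [(11, 0.5), (-9, 0.5)], "L": [(0, 1.0)]},
    3: {"H": [(0, 1.0)], "L": [(9, 0.5), (-11, 0.5)]},
}

# Expected-value advantage of H over L in each problem.
ev_gap = {
    number: expected_value(p["H"]) - expected_value(p["L"])
    for number, p in problems.items()
}
print(ev_gap)
```

Under this reading, H is better than L by one point in all three problems. The variable option is H in Problem 2 and L in Problem 3, yet the H rate moves toward 50% in both, which seems hard to attribute to risk aversion or loss aversion alone.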

[Figure: proportion of H choices by blocks of 20 trials, for Problem 1, Problem 2, and Problem 3]

3. The Big Eye effect (Ben Zion et al., 2010; Grosskopf et al., 2006)

x ~ N(0, 300), y ~ N(0, 300)

R1: x

R2: y

M: Mean(R1, R2) + 5
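One possible way to see why asset M can look unattractive after the fact is to compute how often it yields the best payoff of the three. The sketch below assumes x and y are independent, and it treats the "300" in N(0, 300) as ambiguous between a standard deviation and a variance, so both readings are computed; the function name `prob_m_is_best` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def prob_m_is_best(sd, bonus=5.0):
    """P(M > max(R1, R2)) for independent R1, R2 ~ N(0, sd^2), M = mean + bonus."""
    # M = (R1 + R2) / 2 + bonus exceeds max(R1, R2) exactly when |R1 - R2| < 2 * bonus.
    difference = NormalDist(mu=0.0, sigma=sd * sqrt(2))
    return difference.cdf(2 * bonus) - difference.cdf(-2 * bonus)


p_if_sd = prob_m_is_best(300.0)            # reading 300 as a standard deviation
p_if_variance = prob_m_is_best(sqrt(300))  # reading 300 as a variance
print(round(p_if_sd, 3), round(p_if_variance, 3))
```

Under either reading M has the highest expected payoff, but it is the best option ex post in only a minority of trials, so a decision maker who compares obtained and forgone payoffs trial by trial will usually see one of the other assets do better.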

[Figure: proportion of asset M choices by trial]

Deviation from: maximization, risk aversion, loss aversion.

Implies under-diversification.

Robust to prior information.

4. The hot stove effect (Hogarth & Einhorn, 1992; March and Denrell, 2002).

6

5. Surprise-triggers-change (Nevo & Erev, 2010)

Evaluation of the sequential dependency in 2-alternative studies reveals a 4-fold recency pattern:

This pattern violates reinforcement learning and similar “positive recency models”. It is consistent with the observation of high correlation between price change and volume of trade in the stock market (Karpoff, 1987), and with decrease of compliance found after an audit (Kastlunger, Kirchler, Mittone & Pitters, 2010).

Can be captured with the assumption that surprise triggers change

Problem               Previous trial   Repeated R choices (%)   Switches to R (%)

0 or (+1, .9; -10)    After +1         84                       21

                      After -10        69                       31

0 or (+10, .1; -1)    After +10        60                       23

                      After -1         79                       6

7

6. High sensitivity to sequential dependencies (Biele, Erev & Ert, 2008)

S: 0 with certainty

R: 1 if the State is High; -1 if the State is Low

State transition probabilities (rows: State at t; columns: State at t+1):

        High   Low

High    .95    .05

Low     .05    .95

R after +1   R after -1   R after Safe (0)

0.98         0.23         0.15
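To make the sequential dependency concrete, the sketch below computes the expected payoff of R on the next trial given the current state, using the transition probabilities from the slide. The dictionary layout and the name `expected_r` are illustrative assumptions.

```python
# P(state at t+1 | state at t), as given on the slide.
transition = {
    "High": {"High": 0.95, "Low": 0.05},
    "Low": {"High": 0.05, "Low": 0.95},
}
payoff_r = {"High": 1, "Low": -1}  # payoff of R in each state; S always pays 0


def expected_r(state_now):
    """Expected payoff of choosing R on the next trial, given the current state."""
    return sum(prob * payoff_r[state] for state, prob in transition[state_now].items())


print(expected_r("High"), expected_r("Low"))
```

The expected payoff of R is about +0.9 after a High state (R just paid +1) and about -0.9 after a Low state (R just paid -1), against 0 for S, so following the last observed outcome of R is the payoff-maximizing response in this environment.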

8

Implications for descriptive models:

The basic learning phenomena are extremely robust: They appear to be common to humans and other animals, are consistent with stock market phenomena, and can be easily replicated.

The current replications kept the environment fixed and focused on a single variable: the incentive structure. Thus, it should be possible to capture these regularities with a general model without “situation specific parameters.”

In addition, the results show important limitations of traditional reinforcement learning models.

9

I-SAW (Inertia, Sampling and Weighting; Nevo & Erev, 2010)

Three response modes: Exploration, exploitation and inertia.

At each exploitation trial, player i computes the estimated subjective value of alternative j as:

ESV(j) = (1 - w_i) (Mean of sample of m_i from j) + w_i (Grand Mean of j)

Sampling by similarity, and the very last outcome is more likely to be in the sample. The alternative with the highest ESV is selected.

Exploration implies random choice.

Inertia implies repetition of the last choice. The probability of inertia decreases when the outcomes are surprising. Surprise is computed from the gap between the payoff at t and the payoffs in the previous trials.

An example of a case-based decision model (Gilboa & Schmeidler, 1995; see related ideas in Kareev, 2000; Osborne and Rubinstein, 1998; Gonzalez et al., 2003).
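The three response modes described above can be sketched as a toy agent. This is a heavily simplified illustration and not the published I-SAW model: the class name `SketchISAW`, the parameter values, the rule that always includes the last outcome in the sample, the uniform (rather than similarity-based) sampling, and the squashing of surprise into [0, 1) are all assumptions made for brevity. It also assumes feedback on both obtained and forgone payoffs.

```python
import random
from statistics import mean


class SketchISAW:
    """Toy agent with exploration, inertia, and exploitation modes."""

    def __init__(self, n_alternatives=2, m=3, w=0.3, p_explore=0.1,
                 max_inertia=0.6, seed=0):
        self.rng = random.Random(seed)
        self.m, self.w = m, w
        self.p_explore, self.max_inertia = p_explore, max_inertia
        self.history = [[] for _ in range(n_alternatives)]
        self.last_choice = None
        self.surprise = 0.0  # in [0, 1); 0 means the last payoffs were as expected

    def esv(self, j):
        """ESV(j) = (1 - w) * (mean of a sample of m outcomes) + w * (grand mean)."""
        outcomes = self.history[j]
        # Simplification: the last outcome is always sampled; the rest are uniform draws.
        sample = [outcomes[-1]] + [self.rng.choice(outcomes) for _ in range(self.m - 1)]
        return (1 - self.w) * mean(sample) + self.w * mean(outcomes)

    def choose(self):
        n = len(self.history)
        no_data = self.last_choice is None or not self.history[0]
        if no_data or self.rng.random() < self.p_explore:
            choice = self.rng.randrange(n)        # exploration: random choice
        elif self.rng.random() < self.max_inertia * (1 - self.surprise):
            choice = self.last_choice             # inertia: repeat the last choice
        else:
            choice = max(range(n), key=self.esv)  # exploitation: highest ESV
        self.last_choice = choice
        return choice

    def observe(self, payoffs):
        """Record this trial's payoffs for all alternatives and update surprise."""
        gaps = []
        for j, payoff in enumerate(payoffs):
            if self.history[j]:
                gaps.append(abs(payoff - mean(self.history[j])))
            self.history[j].append(payoff)
        gap = mean(gaps) if gaps else 0.0
        self.surprise = gap / (gap + 1.0)  # hypothetical squashing into [0, 1)


# Usage: 400 trials of "0 or (10, .1; -1)"; alternative 0 is S, alternative 1 is R.
env = random.Random(1)
agent = SketchISAW(seed=2)
risky_choices = 0
for _ in range(400):
    choice = agent.choose()
    risky_payoff = 10 if env.random() < 0.1 else -1
    agent.observe([0, risky_payoff])
    risky_choices += choice
risky_rate = risky_choices / 400
print(f"Proportion of R choices: {risky_rate:.2f}")
```

Because inertia becomes less likely when `surprise` is high, the agent is more likely to change its behavior after unexpected payoffs, which is the qualitative property the model is meant to capture.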

10

Choice prediction competitions (Erev, Ert & Roth, 2008, 2010)

1. Individual choice tasks http://tx.technion.ac.il/~eyalert/Comp.html

The task: Predicting the proportion of risky choices in a binary choice task in the clicking paradigm without information concerning forgone payoffs.

Two studies (estimation and competition), each with 60 conditions. We published the results of the estimation study and challenged other researchers to predict the results of the second.

The models were ranked based on their squared error.
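As an illustration of this kind of ranking, the sketch below scores two hypothetical models by their mean squared error against observed choice rates. All numbers and model names are made up for the example and are not competition data; the exact scoring rule used in the competitions may differ in detail.

```python
def mean_squared_error(predicted, observed):
    """Average squared gap between predicted and observed choice rates."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


# Made-up observed risky-choice rates in three conditions (illustration only).
observed = [0.20, 0.50, 0.80]

# Made-up predictions from two hypothetical models.
predictions = {
    "model_a": [0.25, 0.45, 0.70],
    "model_b": [0.40, 0.50, 0.60],
}

scores = {name: mean_squared_error(pred, observed) for name, pred in predictions.items()}
ranking = sorted(scores, key=scores.get)  # lowest error first
print(ranking)
```

A lower score means the model's predicted choice rates were closer to the observed ones across conditions.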

The best baseline is a predecessor of I-SAW. The winning submission, submitted by Stewart, West & Lebiere, is based on a similar instance-based (“episodic”) logic (with a quantification in ACT-R).

Reinforcement learning and similar “semantic” models did not do well.

11

2. Market Entry game http://tx.technion.ac.il/~eyalert/Comp.html

The task: Predicting behavior in a four-person market entry game.

The best baseline is a predecessor of I-SAW. The winning submission, submitted by Chen et al., is a version of I-SAW.

Reinforcement learning and similar “semantic” models did not do well.

12

A second look at the experience-description gap (Marchiori, Di Guida & Erev)

The tendency to overweight rare events in decisions from description (the pattern captured by PT) may not be a reflection of a distinct decision process.

It can be a product of the nature of past experiences in similar situations.

For example, when the agent is asked to choose between:

S: -5 with certainty

R: -5000 with probability 1/1000

She recalls past experiences with events estimated with similarly low probabilities.

Previous research (e.g., Erev et al., 1994) suggests that events estimated to occur with probability 1/1000 actually occur with much higher probability (around 1/10). Thus, reliance on these experiences can lead to the pattern predicted by prospect theory.
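The arithmetic behind this argument can be sketched as follows. The sketch assumes that R pays 0 when the loss does not occur (the slide does not state the alternative outcome), and the helper name `expected_value` is illustrative.

```python
def expected_value(loss, probability):
    """Expected value of a prospect paying `loss` with `probability`, 0 otherwise (assumed)."""
    return loss * probability


safe = -5
risky_as_described = expected_value(-5000, 1 / 1000)  # stated probability
risky_as_experienced = expected_value(-5000, 1 / 10)  # frequency suggested by experience

print(safe, risky_as_described, risky_as_experienced)
```

Taken at face value, the description makes S and R equally attractive in expectation (about -5 each). If events described as 1/1000 have in fact occurred about one time in ten, the experienced value of R is about -500, so preferring S is a best reply to experience even though it looks like overweighting of the rare event.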

13

2. The effect of social interaction and prior information

1. In certain situations the additional complexity has limited effect:

Constant sum game (Erev & Roth, 1998)

The Market Entry Game competition

2. In some cases the prior information affects the learning process:

Reciprocation in the repeated prisoner's dilemma game

3. The experience-description gap in games:

Other-regarding preferences and the mythical fixed pie

14

      S        B1      B2      B3      E

S     10, 5    9, 0    9, 0    9, 0    9, 0

B1    0, 4     0, 0    0, 0    0, 0    0, 0

B2    0, 4     0, 0    0, 0    0, 0    0, 0

B3    0, 4     0, 0    0, 0    0, 0    0, 0

E     0, 4     0, 0    0, 0    0, 0    12, 12

When the game is played with fixed matching, known payoff matrix, and noiseless feedback, players select the fair and efficient outcome (E, E).

Violating any one of these conditions leads to the (S, S) outcome, as predicted by I-SAW and similar models.
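To see why both outcomes can be stable, the sketch below enumerates the pure-strategy Nash equilibria of the matrix above. It assumes each cell lists the row player's payoff first and the column player's payoff second; the function name `pure_nash` is illustrative.

```python
actions = ["S", "B1", "B2", "B3", "E"]

# payoffs[row][col] = (row player's payoff, column player's payoff); assumed convention.
payoffs = [
    [(10, 5), (9, 0), (9, 0), (9, 0), (9, 0)],
    [(0, 4), (0, 0), (0, 0), (0, 0), (0, 0)],
    [(0, 4), (0, 0), (0, 0), (0, 0), (0, 0)],
    [(0, 4), (0, 0), (0, 0), (0, 0), (0, 0)],
    [(0, 4), (0, 0), (0, 0), (0, 0), (12, 12)],
]


def pure_nash(payoffs):
    """Return all cells where neither player gains by deviating unilaterally."""
    n = len(payoffs)
    equilibria = []
    for r in range(n):
        for c in range(n):
            row_best = all(payoffs[r][c][0] >= payoffs[k][c][0] for k in range(n))
            col_best = all(payoffs[r][c][1] >= payoffs[r][k][1] for k in range(n))
            if row_best and col_best:
                equilibria.append((actions[r], actions[c]))
    return equilibria


equilibria = pure_nash(payoffs)
print(equilibria)
```

Under this reading, (S, S) and (E, E) are the only pure-strategy equilibria. Reaching (E, E) from (S, S) requires both players to try E at the same time, and a player who tries E alone earns 0 instead of 10 or 5, which is one way to see why learners who respond to their own recent payoffs may settle on (S, S).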

Optimistic implications from a different story

Interpersonal conflicts (Erev & Greiner, 2010)

15

3. The economics of small decisions

The experimental paradigms considered here focus on small decisions from experience: The expected stakes were very low (a few cents or less per choice), and the decision makers did not spend more than a few seconds on each choice. We believe that this set of paradigms is not just a good test bed for basic learning phenomena; it is also a good simulation of natural environments in which experience is likely to shape economic behavior. In many of these environments, small decisions from experience can lead to consequential outcomes.

16

1. Gentle COP: Enforcement of safety rules (Erev & Rodansky, 2004)

Enforcement is necessary

Workers like enforcement programs

Probability is more important than magnitude

Large punishments are too costly; therefore, gentle enforcement can be optimal

Safety Climate (Zohar, 1980; Zohar and Luria, 1994)

[Figure: series for ear plugs, eye protection, and gloves; axis scale 0 to 100]

17

2. Car recall (Barron, Leider & Stack, 2008)

3. The decision to explore and the NCAA (Gopher et al., 1989)

Two teams in 2005/6 and 2006/7: Memphis and U of Florida

Ten additional teams in 2007/8, including Kansas

The world's first brain training tool for basketball players

18

4. Stock market patterns

Black Swan

Insufficient investment in the stock market, and insufficient diversification.

High correlation between price change and volume of trade on the following day

19

4. Summary

Many of the classical properties of human and animal learning can be reliably reproduced in the easy-to-run (and easy-to-model) clicking paradigm.

The main results can be captured with models that assume best reply to small samples of experiences in similar cases. The implied behavioral processes are evolutionarily reasonable, but can lead to robust deviations from maximization in relatively static environments.
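A worked example may clarify how best reply to small samples can produce underweighting of rare events. The sketch below computes, by exact enumeration, the probability that a small sample of past outcomes from the risky option has a positive mean (and therefore looks better than a safe 0). The sample size k = 5 and the function name are assumptions for illustration, not estimates from the studies.

```python
from math import comb


def prob_sample_mean_positive(high, low, p_high, k):
    """P(mean of k independent draws from (high, p_high; low) > 0), by enumeration."""
    total = 0.0
    for n_high in range(k + 1):
        sample_sum = n_high * high + (k - n_high) * low
        if sample_sum > 0:
            total += comb(k, n_high) * p_high**n_high * (1 - p_high) ** (k - n_high)
    return total


k = 5  # hypothetical sample size
p5 = prob_sample_mean_positive(10, -1, 0.1, k)  # Problem 5: R = (10, .1; -1), EV = +0.1
p6 = prob_sample_mean_positive(1, -10, 0.9, k)  # Problem 6: R = (-10, .1; +1), EV = -0.1
print(round(p5, 3), round(p6, 3))
```

With this sample size, R looks better than the safe option in about 41% of samples in Problem 5, where R has the higher expected value, and in about 59% of samples in Problem 6, where R has the lower expected value. The direction matches the observed R rates of 27% and 56%, although the sketch is not meant to reproduce them.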

These simple models fail when the description of the game suggests easy and efficient super-game strategies. The clearest example is reciprocation in the prisoner's dilemma game with full information, fixed matching, and low noise. We believe that this set of situations is interesting but small and overrated.

The current understanding of decisions from experience is sufficient to shed light on many natural problems.
