Graduate Microeconomics II Lecture 7: Moral Hazardhomepages.ulb.ac.be/~plegros/documents/classes/micro2/L7-Moral hazard.pdf · Graduate Microeconomics II Lecture 7: Moral Hazard Patrick

Graduate Microeconomics II
Lecture 7: Moral Hazard

Patrick Legros


Outline

Introduction

A principal-agent model
  The value of information

Rent extraction
  Limited liability of the agent

The “exponential linear normal” model

Moral hazard in teams

Introduction

• Prevent workers from shirking

• Ensure that firms produce the right quality

• How much information about effort is embodied in the output that is observed?

• Role of monitoring? Value of additional information?

• Does joint production help or does it make things worse from an incentive point of view?

As for adverse selection (screening), there are two basic constraints on the design of incentive schemes:

Incentive compatibility: agents must prefer taking the equilibrium action to taking any other action.

Voluntary participation: agents must prefer participating to refusing the contract.

A principal-agent model

• An agent (a worker in a firm) has an outside opportunity of $\bar u$. If hired in a firm, she has to take an action $a \in [0,\infty)$. The verifiable output is a random variable $y$ with distribution $F(y,a)$ and density $f(y,a)$.

• Higher actions are assumed to lead to higher expected output. More generally, assume first-order stochastic dominance (FOSD): if $a > a'$, then $F(\cdot,a)$ first-order stochastically dominates $F(\cdot,a')$.

• The agent is risk averse, and her utility from a monetary payoff $x$ when she takes action $a$ is

$$u(x) - c(a)$$

where $u$ is increasing and concave and $c$ is increasing and convex.

• The principal (owner of the firm) is a residual claimant: if the output is $y$ and the worker is paid $x$, he has utility

$$v(y - x)$$


A principal-agent model
Full contractibility

Suppose that the effort of the agent is verifiable. The contract between the principal and the agent can then specify the effort $a$ that the agent must exert and the output-contingent wage $w(y)$ that the agent will receive. The principal solves

$$\max_{a,\{w(y)\}} \int v(y - w(y))\, f(y,a)\,dy$$

s.t.

$$\int u(w(y))\, f(y,a)\,dy - c(a) \ge \bar u$$

The constraint is the participation constraint of the agent. If $\lambda \ge 0$ is the multiplier on this constraint, the problem can be written as the maximization over $a$ and $\{w(y)\}$ of the Lagrangian

$$\int \left[ v(y - w(y)) + \lambda u(w(y)) \right] f(y,a)\,dy - \lambda \left[ c(a) + \bar u \right]$$


Assuming that $a > 0$, pointwise maximization then yields

$$\forall y:\quad -v'(y - w(y)) + \lambda u'(w(y)) = 0 \tag{1}$$

$$\int \left[ v(y - w(y)) + \lambda u(w(y)) \right] f_a(y,a)\,dy - \lambda c'(a) = 0 \tag{2}$$

$$\int u(w(y))\, f(y,a)\,dy - c(a) - \bar u = 0 \tag{3}$$

(1) implies that for each $y$,

$$\frac{v'(y - w(y))}{u'(w(y))} = \lambda$$

hence for each output level the ratio of marginal utilities is constant. This is the Borch rule.

• Note that by concavity of $u$ and $v$, the optimal $w(y)$ is increasing in $y$.

• Question: why is (3) binding? That is, why is $\lambda > 0$?
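The monotonicity claim in the first bullet can be checked by differentiating the Borch rule. The following is a sketch only, assuming that $w$ is differentiable, that $\lambda > 0$, and that at least one of $u$ and $v$ is strictly concave (none of these is stated explicitly on the slide):

```latex
% Differentiate v'(y - w(y)) = \lambda u'(w(y)) with respect to y.
\begin{align*}
v''(y - w(y))\,\bigl(1 - w'(y)\bigr) &= \lambda\, u''(w(y))\, w'(y) \\
\Longrightarrow\quad
w'(y) &= \frac{v''(y - w(y))}{v''(y - w(y)) + \lambda\, u''(w(y))} \in [0, 1].
\end{align*}
% Both v'' and u'' are nonpositive, so the ratio lies between 0 and 1:
% the wage rises with output, but less than one for one.
```

Under these assumptions the two polar cases are easy to read off: a risk-neutral principal ($v'' = 0$) gives $w'(y) = 0$, a fixed wage, and a risk-neutral agent ($u'' = 0$) gives $w'(y) = 1$, so the agent bears all the risk.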


A principal-agent model
The risk-incentive tradeoff

Assume now that $a$ is not verifiable. The principal can no longer impose a value of $a$ but can still design a contract $\{w(y)\}$ in such a way that the agent is induced to take the “right” value of $a$.

The design must take into account the incentive problem. Faced with an output-contingent compensation, the agent chooses $a$ to maximize $\int u(w(y))\, f(y,a)\,dy - c(a)$. Hence, by choosing $w(\cdot)$, the principal effectively induces the agent to choose the $a$ that maximizes her expected utility. The problem is now

$$\max_{\{w(y)\},\,a} \int v(y - w(y))\, f(y,a)\,dy \tag{4}$$

s.t.

$$\int u(w(y))\, f(y,a)\,dy - c(a) \ge \bar u \tag{5}$$

$$a \in \arg\max_{a'} \int u(w(y))\, f(y,a')\,dy - c(a') \tag{6}$$


A principal-agent model
Two difficulties

1. (6) may be difficult to deal with: in particular, there may be several optimal values of effort for a given incentive scheme $w$.

If the objective function of the agent is concave in $a$, then it is possible to replace the incentive compatibility constraint (6) by the corresponding first-order condition. Conditions under which this is appropriate have been derived by Mirrlees, Holmstrom, Rogerson, and Jewitt. One such (pretty strong) condition is that $F(y,a)$ is convex in $a$ (the convexity of the distribution function condition, used together with the monotone likelihood ratio property). Grossman and Hart (1982) analyze the problem by assuming finitely many states and action levels and “bypass” the first-order problem altogether.

2. If the agent and the principal have large levels of wealth that can be used in contracting, then it may be possible to get very close to the first-best situation by imposing very large penalties for low levels of output (Mirrlees).


A principal-agent model
The first order approach

It is convenient to replace the FOC by a weak inequality (Jewitt). The problem of the principal is

$$\max_{\{w(y)\},\,a} \int v(y - w(y))\, f(y,a)\,dy \tag{7}$$

s.t.

$$\int u(w(y))\, f(y,a)\,dy - c(a) \ge \bar u \tag{8}$$

$$\int u(w(y))\, f_a(y,a)\,dy - c'(a) \ge 0 \tag{9}$$

Let $\lambda$ be the multiplier on the IR constraint (8) and let $\mu$ be the multiplier on the IC constraint (9).


Pointwise optimization with respect to $w(y)$ leads to

$$\frac{v'(y - w(y))}{u'(w(y))} = \lambda + \mu\, \frac{f_a(y,a)}{f(y,a)}$$

where $f_a(y,a)/f(y,a)$ is the likelihood ratio.

Note that $w(y)$ is increasing in $y$ only if $f_a/f$ is increasing in $y$ (the MLRP condition). This condition holds for many usual distributions (uniform, normal in particular).
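As an illustration of the MLRP condition, the sketch below checks the normal case numerically. It assumes output is distributed $N(a, \sigma^2)$, so that effort shifts the mean. The helper names and the parameter values are illustrative choices, not part of the lecture. In this case the likelihood ratio has the closed form $(y - a)/\sigma^2$, which is increasing in $y$.

```python
import math

def normal_pdf(y, a, sigma):
    """Density of N(a, sigma^2) evaluated at y (effort a shifts the mean)."""
    z = (y - a) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def likelihood_ratio(y, a, sigma, h=1e-5):
    """f_a(y, a) / f(y, a), with f_a from a central finite difference in a."""
    f_a = (normal_pdf(y, a + h, sigma) - normal_pdf(y, a - h, sigma)) / (2.0 * h)
    return f_a / normal_pdf(y, a, sigma)

# Illustrative values only (assumptions, not taken from the lecture).
a, sigma = 1.0, 2.0
ys = [-3.0, -1.0, 1.0, 3.0, 5.0]

ratios = [likelihood_ratio(y, a, sigma) for y in ys]
closed_form = [(y - a) / sigma**2 for y in ys]

# MLRP: the likelihood ratio is increasing in y.
mlrp_holds = all(r1 < r2 for r1, r2 in zip(ratios, ratios[1:]))
```

The ratio is zero at $y = a$, negative below it and positive above it, which is the sign pattern that drives the comparison with the Borch rule.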


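A principal-agent model
The cost of IC: $\mu > 0$

If $\mu = 0$, we are back to the Borch rule and the first best can be implemented. However, this is impossible if $F(y,a)$ is a nontrivial function of $a$ and if the first-best effort is positive. Going back to the first-best problem, if the optimal $a$ is positive, it solves

$$\int \left[ v(y - w(y)) + \lambda u(w(y)) \right] f_a(y,a)\,dy - \lambda c'(a) = 0$$

Since the IC condition (9) must also hold, with $\int u(w(y))\, f_a(y,a)\,dy - c'(a) = 0$ at an interior choice of the agent, we must have

$$\int v(y - w(y))\, f_a(y,a)\,dy = 0,$$

that is, the principal does not benefit from an increase in the effort of the agent. This cannot be: under the Borch rule with a risk-averse agent, $y - w(y)$ is increasing in $y$, so under FOSD a higher effort raises the principal's expected utility.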


A principal-agent model
Good news-bad news

As long as MLRP holds, that is, when $f_a(y,a)/f(y,a)$ is increasing in $y$, the second-best compensation $w(y)$ is increasing in $y$. Since $\int f(y,a)\,dy = 1$ for all $a$,

$$\int f_a(y,a)\,dy = 0,$$

so there exists $\hat y$ such that the likelihood ratio is equal to zero at $\hat y$.

Write $w_\lambda(y)$ for the compensation given by the Borch rule $v'/u' = \lambda$ with the same multiplier $\lambda$. Hence we have

$w(y) > w_\lambda(y)$ if $y > \hat y$ [Good news]

$w(y) < w_\lambda(y)$ if $y < \hat y$ [Bad news]

In other words, relative to the first-best compensation, the compensation involves a “bonus” that increases as output rises above $\hat y$ and a “malus” that increases as output falls below $\hat y$.


A principal-agent model
The value of additional information

Suppose that in addition to output $y$ there is an additional signal $z$ and that there is a joint distribution $F(y,z,a)$ with density $f(y,z,a)$.

Starting from a world where only $y$ is contractible (hence $f(y,a) = \int f(y,z,a)\,dz$), the principal and agent have second-best payoffs $V^*, U^*$.

A question raised by Holmstrom (1979) is the following: if $z$ is contractible in addition to $y$, does the resulting second-best optimum lead to higher payoffs for the principal and the agent?


The answer can be gleaned from the FOCs with and without $z$.

With $y$ only as a basis for compensation schemes:

$$\frac{v'(y - w(y))}{u'(w(y))} = \lambda + \mu\, \frac{f_a(y,a)}{f(y,a)}$$

With $y$ and $z$ as a basis for compensation schemes:

$$\frac{v'(y - w(y,z))}{u'(w(y,z))} = \lambda + \mu\, \frac{f_a(y,z,a)}{f(y,z,a)}$$

Hence, $w(y,z)$ is a non-trivial function of $z$ only if the likelihood ratio $f_a(y,z,a)/f(y,z,a)$ is a non-trivial function of $z$.


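Proposition: The additional signal $z$ leads to a Pareto improvement if and only if $z$ is informative about $a$, that is, if and only if we cannot write

$$f(y,z,a) = g(y,a)\,h(y,z)$$

(equivalently, if and only if $y$ is not a sufficient statistic for $(y,z)$ with respect to $a$).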
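To see why the factorization matters, it helps to compute the likelihood ratio when it does hold. A short sketch, using only the objects already defined in the proposition:

```latex
% Suppose f(y,z,a) = g(y,a) h(y,z). Then h does not depend on a, so
\begin{align*}
f_a(y,z,a) &= g_a(y,a)\, h(y,z), \\
\frac{f_a(y,z,a)}{f(y,z,a)}
  &= \frac{g_a(y,a)\, h(y,z)}{g(y,a)\, h(y,z)}
   = \frac{g_a(y,a)}{g(y,a)} .
\end{align*}
% The likelihood ratio is independent of z, so the first-order condition
% for w(y,z) is the same for every z and the optimal wage depends on y only.
```

In that case conditioning the wage on $z$ would only add noise to the agent's compensation without improving incentives, which is why the signal has no value.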


Principal-agent models
The rent extraction motive

In the previous problem the principal distorts the compensation scheme relative to the first best (Borch rule) in order to create incentives for the agent: since the agent is risk averse, under MLRP, more “variance” in compensation induces the agent to exert high effort in order to avoid the low compensation corresponding to low output levels.

Obviously, this distortion is needed only if the agent is risk averse. There is in fact no incentive problem if the agent is risk neutral! Why?

Because the agent is risk neutral, the Borch rule requires that the principal gets full insurance (that is, the agent bears all the risk): the principal gets a fixed payoff $P$ and the agent gets the residual $y - P$. Letting $a^*$ be the first-best effort, it is clear that with the first-best compensation the agent will choose $a^*$; that is, the first best is implemented in the second best!
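The “it is clear” step can be spelled out in two lines. This is a sketch assuming the risk-neutral agent has $u(x) = x$ and that the participation constraint pins down $P$:

```latex
% Agent's problem under the contract w(y) = y - P:
\[
\max_{a}\ \int (y - P)\, f(y,a)\,dy - c(a)
  \;=\; \max_{a}\ \mathbb{E}[y \mid a] - c(a) - P .
\]
% First-best problem: the principal receives the constant P, so total
% surplus is maximized by the effort that solves
\[
a^{*} \in \arg\max_{a}\ \mathbb{E}[y \mid a] - c(a).
\]
% P is a constant in the agent's problem, so both problems have the same
% maximizer a^*.
```

The agent is the residual claimant and therefore internalizes the full marginal return to effort.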

However, this construction supposes that the agent is able to pay $P$ to the principal for all possible output realizations. This may not be possible for low output levels (if the agent does not have additional wealth or cannot easily borrow).

If a fixed payment $P$ is not feasible, then the principal will not “sell the firm” to the agent and the resulting second-best solution will not be first best.

The source of inefficiency here is akin to a rent extraction motive.


Rent extraction motive
A simple example

Suppose there are two output levels ($R$ and $0$), a risk-neutral principal and agent, and the probability of high output is $\pi(a)$, where $a \in \{0, 1\}$ is the action of the agent. The cost of action $a$ is $a$. Let $(s_R, s_0)$ be the contract. Assuming that the agent has limited liability, we need $s_y \ge 0$ for all $y$.

• For $a = 0$, the easiest way is to pay the agent a fixed wage $s_R = s_0 = w$. For participation, we need $w = \bar u$ and the principal gets

$$V^{a=0} = \pi(0) R - \bar u.$$

• For $a = 1$, the IC constraint is

$$\pi(1) s_R + (1 - \pi(1)) s_0 - 1 \ge \pi(0) s_R + (1 - \pi(0)) s_0 \iff (\pi(1) - \pi(0))(s_R - s_0) \ge 1$$


Incentives depend on the gap $s_R - s_0$. Since the expected utility of the agent is

$$U = \pi(1)(s_R - s_0) + s_0 - 1$$

and since $s_0 \ge 0$, the lowest second-best payoff to the agent conditional on her taking action $a = 1$ is

$$U_m = \frac{\pi(1)}{\pi(1) - \pi(0)} - 1$$

• If $U_m < \bar u$, the principal needs to increase the compensation to the agent (keeping the “gap” large enough), e.g., by increasing $s_0$.

• If $U_m \ge \bar u$, the principal must give the agent a rent (equal to $U_m - \bar u$) if he wants action 1 to be taken.


The payoff to the principal when $a = 1$ is then

$$V^{a=1} = \pi(1) R - \max\{U_m, \bar u\} - 1$$

This has to be contrasted with $V^{a=0} = \pi(0) R - \bar u$. Clearly, the principal prefers to implement $a = 1$ if and only if

$$(\pi(1) - \pi(0)) R \ge \max\{U_m - \bar u, 0\} + 1$$

Efficiency would require that $a = 1$ when $(\pi(1) - \pi(0)) R \ge 1$.

However, if $U_m - \bar u$ is large enough, the principal will choose to implement $a = 0$, because providing rents to the agent becomes too costly from his point of view.
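The formulas of this example are easy to evaluate numerically. The sketch below is one way to do so; the function name `second_best` and the parameter values are illustrative assumptions, not part of the lecture, and the outside option is taken to be non-negative so that the fixed wage for $a = 0$ respects limited liability.

```python
def second_best(R, pi0, pi1, u_bar):
    """Principal's payoffs in the two-outcome example with limited liability.

    R is the high output, pi0 and pi1 are the probabilities of high output
    under a = 0 and a = 1, and u_bar >= 0 is the agent's outside option.
    The cost of action a = 1 is 1, as in the example.
    """
    assert 0.0 <= pi0 < pi1 <= 1.0 and u_bar >= 0.0
    # Lowest agent payoff compatible with IC and s_0 >= 0:
    # set s_0 = 0 and the gap s_R - s_0 = 1 / (pi1 - pi0).
    U_m = pi1 / (pi1 - pi0) - 1.0
    rent = max(U_m - u_bar, 0.0)
    V0 = pi0 * R - u_bar                      # flat wage, action a = 0
    V1 = pi1 * R - max(U_m, u_bar) - 1.0      # action a = 1
    return {
        "U_m": U_m,
        "rent": rent,
        "V0": V0,
        "V1": V1,
        "principal_implements_a1": V1 >= V0,
        "efficient_a1": (pi1 - pi0) * R >= 1.0,
    }

# Illustrative numbers only (assumptions, not taken from the lecture).
example = second_best(R=3.0, pi0=0.5, pi1=0.9, u_bar=0.0)
```

With these numbers the agent's rent is about 1.25, the principal earns about 1.5 from $a = 0$ and about 0.45 from $a = 1$, so he implements $a = 0$ even though $(\pi(1) - \pi(0)) R = 1.2 \ge 1$ makes $a = 1$ efficient.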


Principal-agent
The “exponential linear normal” model

Output is $y = a + \varepsilon$, where $\varepsilon \sim N(0, \sigma^2)$ and $a$ is effort. The principal is risk neutral and the agent has utility $-\exp(-r(x - c(a)))$, where $r$ is the agent's coefficient of absolute risk aversion and $c$ is the cost of effort.

Limit attention to linear sharing rules $w(y) = s + by$.

The optimal sharing rate is then

$$b = \frac{1}{1 + r\sigma^2 c''}$$

which is decreasing in $r$, in $\sigma^2$ and in $c''$.
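The slide states the optimal sharing rate without derivation. One standard route, sketched here under the assumption that the agent's utility is the CARA form $-\exp(-r(x - c(a)))$ and that the fixed part $s$ is set so that participation binds, goes through the agent's certainty equivalent:

```latex
% With w(y) = s + b y and y = a + \varepsilon, the agent's net payoff
% s + b a + b\varepsilon - c(a) is normal, so her certainty equivalent is
\[
CE(a) = s + b a - c(a) - \tfrac{r}{2}\, b^{2} \sigma^{2}.
\]
% Agent's first-order condition (incentive compatibility):
\[
b = c'(a).
\]
% The risk-neutral principal adjusts s to hold the agent at her outside
% option, so he maximizes total certainty-equivalent surplus subject to IC:
\[
\max_{a}\ a - c(a) - \tfrac{r}{2}\, c'(a)^{2} \sigma^{2}
\quad\Longrightarrow\quad
1 - c'(a) - r \sigma^{2} c'(a)\, c''(a) = 0 .
\]
% Solving for c'(a) = b:
\[
b = \frac{1}{1 + r \sigma^{2} c''(a)} .
\]
```

With $r = 0$ or $\sigma^2 = 0$ the formula gives $b = 1$: the agent becomes the full residual claimant, as in the risk-neutral case.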



Moral hazard in teams

• Two agents and joint production. Actions are $a_i$, $i = 1, 2$, and the cost for agent $i$ is $c(a_i)$.

• There is joint production and only the final output $y(a_1 + a_2)$ is contractible.

• A sharing rule defines shares for each agent, $s_1(y)$ and $s_2(y)$.

• Budget balancing requires $s_2(y) = y - s_1(y)$.

The first best requires choosing $a^* = (a_1^*, a_2^*)$ to maximize $y(a_1 + a_2) - c(a_1) - c(a_2)$, hence

$$y'(a_1^* + a_2^*) = c'(a_i^*), \quad i = 1, 2$$


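For any differentiable sharing rule, $a^*$ is a Nash equilibrium if each agent $i$ maximizes $s_i(y(a_i + a_j^*)) - c(a_i)$ at $a_i = a_i^*$. Writing $y'(a^*)$ for $y'(a_1^* + a_2^*)$ and $c'(a^*)$ for the common value of $c'(a_i^*)$, this requires

$$y'(a^*)\, s_1'(y(a^*)) = c'(a^*) \tag{10}$$

$$y'(a^*)\, s_2'(y(a^*)) = c'(a^*) \tag{11}$$

By budget balancing, $s_1'(y) + s_2'(y) = 1$ and therefore (11) is

$$y'(a^*) - y'(a^*)\, s_1'(y(a^*)) = c'(a^*),$$

and adding to (10) we have

$$y'(a^*) = 2 c'(a^*)$$

which contradicts the definition of $a^*$, since the first best requires $y'(a^*) = c'(a^*)$.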
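A small numerical illustration of this impossibility result follows. It is a sketch under assumed functional forms that are not in the lecture: output $y = a_1 + a_2$, cost $c(a) = a^2/2$, and a linear budget-balanced sharing rule in which agent 1 keeps a fraction `share_1` of output. The helper names are hypothetical.

```python
def best_response(share, a_other, grid_size=2001):
    """Effort in [0, 2] maximizing share * (a + a_other) - a^2 / 2 (grid search)."""
    grid = [2.0 * k / (grid_size - 1) for k in range(grid_size)]
    return max(grid, key=lambda a: share * (a + a_other) - 0.5 * a * a)

def total_surplus(a1, a2):
    """Output minus both effort costs under the assumed functional forms."""
    return (a1 + a2) - 0.5 * a1**2 - 0.5 * a2**2

# First best: y' = 1 = c'(a_i) = a_i for each agent.
first_best = (1.0, 1.0)

# Nash equilibrium under equal shares. The best response does not depend on
# the other agent's effort here, so one round of best responses is enough.
share_1 = 0.5
share_2 = 1.0 - share_1            # budget balance: shares sum to one
nash = (best_response(share_1, 0.0), best_response(share_2, 0.0))

surplus_first_best = total_surplus(*first_best)
surplus_nash = total_surplus(*nash)
```

Each agent equates her marginal cost to her share of the marginal product, so with shares summing to one total effort is 1 instead of the first-best 2, whatever the split.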
