Contract Theory
Alex Young
December 19, 2012
Contents

1 Preface
2 Hidden Information, Screening
  2.1 The Simple Economics of Adverse Selection
    2.1.1 First-Best Outcome: Perfect Price Discrimination
      2.1.1.1 Comparison of type-specific contracts with each other
      2.1.1.2 In case it's been too long since you've last done a maximization problem
      2.1.1.3 Mas-Colell et al. [1995]
      2.1.1.4 Just what is the single crossing condition, exactly?
    2.1.2 Adverse Selection, Linear Pricing, and Simple Two-Part Tariffs
      2.1.2.1 Linear Pricing
      2.1.2.2 Single Two-Part Tariff
      2.1.2.3 Did you forget the envelope theorem?
    2.1.3 Second-Best Outcome: Optimal Nonlinear Pricing Algorithm
      2.1.3.1 Comparison with the First Best
  2.2 Application: Credit Rationing
    2.2.1 Solution
  2.3 A Continuum of Types
    2.3.1 The Implementation Problem
    2.3.2 The Optimization Problem's Algorithm
      2.3.2.1 How do you get the local incentive compatibility constraint?
      2.3.2.2 Doesn't that look familiar?
  2.4 Application: Costly State Verification
3 Hidden Information, Signaling
  3.1 Spence's Model of Education as a Signal
    3.1.1 Did you forget the beer-quiche game?
    3.1.2 Refinements
      3.1.2.1 Cho and Kreps' Intuitive Criterion
      3.1.2.2 Maskin and Tirole's Informed Principal Problem
4 The Principal-Agent Problem [Mas-Colell et al., 1995]
  4.1 Comment
  4.2 Hidden Actions (Moral Hazard)
    4.2.1 The Optimal Contract when Effort is Observable
      4.2.1.1 A risk-averse owner
    4.2.2 The Optimal Contract when Effort is Not Observable
      4.2.2.1 A risk-neutral manager
      4.2.2.2 A risk-averse manager
    4.2.3 Comparing the first- and second-best wage contracts
      4.2.3.1 An additional demonstration of the implications of risk aversion
    4.2.4 The Value of Information
  4.3 Linear Contracts, Exponential Utility, and Normally Distributed Performance (LEN) [Bolton and Dewatripont, 2004]
    4.3.1 Did you forget the properties of the log-normal distribution?
  4.4 Moral Hazard in Teams [Bolton and Dewatripont, 2004]
    4.4.1 Unobservable Individual Outputs
    4.4.2 Observable Individual Outputs
  4.5 Combining Moral Hazard and Adverse Selection
    4.5.1 Optimal Contract, Moral Hazard Only
    4.5.2 Optimal Contract, Adverse Selection Only
    4.5.3 Optimal Contract, Moral Hazard and Adverse Selection
5 Applications
  5.1 Application: Antle and Fellingham [1995]
    5.1.1 The Model
    5.1.2 Solution of Benchmark Case
      5.1.2.1 How?
  5.2 Prendergast [1993]
    5.2.1 Simplified Representation
1 Preface
These notes mostly follow Bolton and Dewatripont [2004] and Mas-Colell et al. [1995].
Textbooks always omit steps in derivations due to space constraints; moreover, it's a good
exercise for the reader to fill in the blanks. Since I habitually forget what I have done,
why not keep a record of all those "aha!" moments I had while trudging through the tedious
algebra? Why not comment on what the algebra means in English?
I thank Qi Chen for his teaching, Zeqiong Huang and Thomas Steffen for studying together,
and of course my father for many helpful Skype conversations.
2 Hidden Information, Screening
2.1 The Simple Economics of Adverse Selection
Consider a transaction between a buyer and a seller, where the seller does not know the
buyer's exact valuation but sets the terms of the contract. The buyer's preferences are
represented by the utility function

u(q, T, θ) = ∫_0^q P(x, θ) dx − T

where q is the number of units purchased, T is the total amount paid to the seller, and P(x, θ)
is the inverse demand curve of a buyer with preference characteristics θ. For tractability,
consider the following functional form for the buyer's preferences:

u(q, T, θ) = θv(q) − T
where v(0) = 0, v′(q) > 0, and v″(q) < 0 ∀q. That is,

u(0, 0, θ) = θv(0) − 0 = 0
∂u(q, T, θ)/∂q = θv′(q) > 0
∂²u(q, T, θ)/∂q² = θv″(q) < 0

The seller's unit production cost is c > 0. His profit from selling q
units against a sum of money T is given by

π = T − cq

The question is: what is the best (i.e., the profit-maximizing) pair (T, q) that the seller will
be able to induce the buyer to choose? The answer depends on the information the seller
has on the buyer's preferences. For simplification, consider the case where there are only two
types of buyers: θ ∈ {θ_L, θ_H}, with θ_H > θ_L. The consumer is of type θ_L with probability
β ∈ [0, 1] and of type θ_H with probability (1 − β). The probability β can also be interpreted
as the proportion of consumers of type θ_L.
2.1.1 First-Best Outcome: Perfect Price Discrimination
Suppose instead that the seller knows the buyer's exact valuation. The seller can then
treat each type of buyer separately and offer type-specific contracts, that is, (T_i, q_i) for type
i (i = H, L). Assume the buyer obtains a payoff of ū if she does not take the offer. The
seller will try to maximize his profits subject to inducing the buyer to accept the proposed
contract:

max_{T_i, q_i} T_i − cq_i

subject to

θ_i v(q_i) − T_i ≥ ū    (IR)

The constraint means that the buyer will participate only if doing so does not make her
worse off than not participating (i.e., individual rationality). The solution to the problem is
the contract (q_i, T_i) such that

θ_i v′(q_i) = c
θ_i v(q_i) = T_i + ū
2.1.1.1 Comparison of type-specific contracts with each other We can relate the
quantities and prices of one type to the other. From the first-order conditions,

θ_H v′(q_H) = c = θ_L v′(q_L)

Since θ_H > θ_L, it must be that v′(q_H) < v′(q_L). Since v″(·) < 0, q_H > q_L. It makes intuitive
sense that the high type will consume more than the low type. What about price? Since both
(IR) constraints bind at ū,

θ_H v(q_H) − T_H = θ_L v(q_L) − T_L
θ_H v(q_H) − θ_L v(q_L) = T_H − T_L

Since θ_H > θ_L and v(q_H) > v(q_L) (because q_H > q_L and v′(·) > 0), the left-hand side is
positive, and so is the right-hand side. Therefore, T_H > T_L. In the first best, the seller leaves
no rents to either buyer type. Hence, along with the higher consumption for the high type
comes a higher price. This will play a role in the next section, when the seller cannot observe
the buyer's type: we will see that the seller cannot use these bundles in such a situation to
maximize profit.
2.1.1.2 In case it's been too long since you've last done a maximization problem
To see this, first determine whether the IR constraint will hold as an equality in the optimal
solution. The answer is yes if either the partial derivative of the profit function with
respect to Ti or the partial derivative with respect to qi is positive, for then the optimal
pair could not lie below the IR constraint since profit could be made greater by increasing
whichever argument had positive marginal profit [Graham, 2005].
∂π/∂T_i = 1 > 0
∂π/∂q_i = −c < 0

Since the partial derivative with respect to T_i is positive, the (IR) constraint binds at the
optimum. Express the constraint in terms of T_i, substitute into the profit function, and take the FOC:

d/dq_i {(θ_i v(q_i) − ū) − cq_i} = θ_i v′(q_i) − c = 0

θ_i v′(q_i) = c
Thus, the seller finds it optimal to maximize his total surplus by setting a quantity such
that marginal utility equals marginal cost, and then setting payment to appropriate the full
surplus and leave no rent to the buyer above ū. Without adverse selection, the total profit
of the seller is thus
π = β(T_L − cq_L) + (1 − β)(T_H − cq_H)
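The first-best contracts can be sketched numerically. Everything below that is not in the notes — the functional form v(q) = 2√q and the parameter values — is an illustrative assumption:

```python
# Numeric sketch of the first-best (perfect price discrimination) contracts.
# Illustrative assumption (not from the notes): v(q) = 2*sqrt(q), so v'(q) = 1/sqrt(q).
# First-best: theta_i * v'(q_i) = c        =>  q_i = (theta_i / c)**2
#             theta_i * v(q_i) = T_i + u   =>  T_i = theta_i * v(q_i) - u_bar (IR binds).
import math

def first_best(theta, c, u_bar):
    q = (theta / c) ** 2                   # from theta * v'(q) = c
    T = theta * 2 * math.sqrt(q) - u_bar   # seller extracts all surplus above u_bar
    return q, T

theta_L, theta_H, c, u_bar = 1.0, 2.0, 1.0, 0.0
qL, TL = first_best(theta_L, c, u_bar)
qH, TH = first_best(theta_H, c, u_bar)
assert qH > qL and TH > TL    # high type consumes more and pays more
print(qL, TL, qH, TH)         # 1.0 2.0 4.0 8.0
```

The sketch confirms the comparison above: q_H > q_L and T_H > T_L.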
2.1.1.3 Mas-Colell et al. [1995] Mas-Colell et al. [1995] discuss hidden information
in the context of the principal-agent problem. That is, postcontractual uncertainty takes
the form of hidden information, not hidden action, and the problem is one of inducing
truthtelling, not effort. The unobservable in the model is the random realization of the
manager's disutility from effort.
The manager's Bernoulli utility function over wages and effort depends on a state of nature
θ that is realized after the contract is signed and that only the manager observes. We assume
that θ ∈ ℝ and that

u(w, e, θ) = v(w − g(e, θ))
g(e, θ) measures disutility of effort. It has the following properties:

g_e(e, θ) > 0 (Disutility of effort is increasing in effort)
g_ee(e, θ) > 0 (Marginal disutility of effort is increasing in effort)
g_eθ(e, θ) < 0 (Marginal disutility of effort is decreasing in the state — the single-crossing property used below)
2.1.1.4 Just what is the single crossing condition, exactly? Now then, letting π(e)
denote the gross payoff from effort and e_i the optimal effort in state θ_i,

π′(e_L) − g_e(e_L, θ_H) > π′(e_L) − g_e(e_L, θ_L)    (Single crossing condition)
                        = 0    (FOC in state θ_L)
                        = π′(e_H) − g_e(e_H, θ_H)    (Equal across states)

π′(e_L) − g_e(e_L, θ_H) > π′(e_H) − g_e(e_H, θ_H)

e_L < e_H    (π′(e) − g_e(e, θ_i) is strictly decreasing in e)
2.1.2.1 Linear Pricing Let D_i(P) denote the demand of a type-θ_i buyer at unit price P.
The buyer's net surplus can be written as follows:

S_i(P) ≡ θ_i v(q_i) − P q_i = θ_i v[D_i(P)] − P·D_i(P)

Define average demand and average net surplus as

D(P) ≡ βD_L(P) + (1 − β)D_H(P)
S(P) ≡ βS_L(P) + (1 − β)S_H(P)

With linear pricing, the seller's problem is the classic monopoly pricing problem; he solves

max_P π = max_P (T − cq) = max_P [P·D(P) − c·D(P)] = max_P (P − c)D(P)

with the monopoly price P^m given by

d{(P − c)D(P)}/dP = P·D′(P) + D(P) − c·D′(P) = 0
⇒ P^m = c − D(P^m)/D′(P^m)

In this solution, we have both positive rents for the buyers (S(P^m) > 0) and inefficiently low
consumption (i.e., θ_i v′(q_i) = P^m > c). The seller can do better by moving away from linear
pricing only if buyers cannot make arbitrage profits by trading in a secondary market.
Otherwise, buyers would buy at the minimum average price and then resell in the secondary
market if they do not want to consume everything they have purchased.
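The monopoly FOC can be spot-checked by brute force. The demand specification below — each type demanding D_i(P) = (θ_i/P)², which follows from an assumed v(q) = 2√q — is illustrative, not from the notes:

```python
# Grid-search sketch of the monopoly (linear-pricing) problem max_P (P - c) D(P).
# Illustrative assumption: v(q) = 2*sqrt(q), so theta*v'(q) = P gives D_theta(P) = (theta/P)**2.
beta, theta_L, theta_H, c = 0.5, 1.0, 2.0, 1.0

def D(P):
    # average demand across the two types
    return beta * (theta_L / P) ** 2 + (1 - beta) * (theta_H / P) ** 2

grid = [c + 0.001 * k for k in range(1, 10000)]     # candidate prices above cost
P_m = max(grid, key=lambda P: (P - c) * D(P))       # maximize monopoly profit
assert P_m > c            # monopoly price exceeds marginal cost
print(round(P_m, 2))      # → 2.0 (for this demand, profit (P-c)/P**2 peaks at P = 2c)
```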
2.1.2.2 Single Two-Part Tariff We assume that there is only one buyer and that β is
a probability; thus, there are no arbitrage opportunities for the buyer. Therefore,
a single two-part tariff T(q) = Z + Pq, where P is the unit price and Z the fixed fee, will improve
upon linear pricing for the seller. For a given P, the highest fixed fee the seller can set while
serving both types is Z = S_L(P). A type-θ_H buyer will always purchase q > 0 when charged T(q) = S_L(P) + Pq,
since θ_H > θ_L. If the seller sets Z = S_L(P), he solves

max_P T(q) − cq = max_P [S_L(P) + (P − c)q] = max_P [S_L(P) + (P − c)D(P)]

d{S_L(P) + (P − c)D(P)}/dP = S_L′(P) + (P − c)D′(P) + D(P) = 0

⇒ P* = c − [D(P*) + S_L′(P*)]/D′(P*)

By the envelope theorem,

S_L′(P) = d/dP [θ_L v(D_L(P)) − P·D_L(P)]
        = θ_L v′(D_L(P))·D_L′(P) − (P·D_L′(P) + D_L(P))
        = (θ_L v′(D_L(P)) − P)·D_L′(P) − D_L(P)    [first term = 0 by the buyer's FOC]
        = −D_L(P)

so that D(P) + S_L′(P) = D(P) − D_L(P) = (1 − β)[D_H(P) − D_L(P)] > 0; in addition, D′(P) < 0, so P* > c.
2.1.2.3 Did you forget the envelope theorem? Recall that a property of the indirect
utility function is that it is strictly increasing in income w [Jehle and Reny, 2001]. We prove
this using the Envelope Theorem.

Proof. To solve

v(p, w) = max_{x ∈ ℝⁿ₊} u(x) subject to p·x = w

form the Lagrangian:

ℒ(x, λ) = u(x) + λ(w − p·x)

The first-order conditions give

∂ℒ(x, λ)/∂x_i = ∂u(x)/∂x_i − λp_i = 0  ⇒  λ = [∂u(x)/∂x_i]/p_i > 0

since ∂u(x)/∂x_i > 0 and p_i > 0. By the Envelope Theorem,

∂v(p, w)/∂w = ∂ℒ(x, λ)/∂w = λ > 0

Alternatively, instead of using the envelope theorem, consider a graphical argument and use
the Fundamental Theorem of Calculus:

S_L′(P) = d/dP S_L(P) = d/dP ∫_P^a D_L(t) dt = −d/dP ∫_a^P D_L(t) dt = −D_L(P)

where a is where the demand curve intersects the (horizontal) P axis.

Figure 1: An inverse demand curve.

Thus, if the seller decides to serve both types of customers, the first-best outcome cannot be
achieved and underconsumption remains relative to the first-best outcome. Nevertheless, an
optimal single two-part tariff contract is always preferred by the seller to an optimal linear
pricing contract, because the seller can always raise his profits by setting Z = S_L(P^m).
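The claim S_L′(P) = −D_L(P) can also be checked numerically. The specification below (v(q) = 2√q and the value of θ_L) is an illustrative assumption, not from the notes:

```python
# Numeric check of the envelope result S_L'(P) = -D_L(P).
# Illustrative assumption: v(q) = 2*sqrt(q), so D_L(P) = (theta_L/P)**2 and
# S_L(P) = theta_L * v(D_L(P)) - P * D_L(P).
theta_L = 1.5

def D_L(P):
    return (theta_L / P) ** 2

def S_L(P):
    q = D_L(P)
    return theta_L * 2 * q ** 0.5 - P * q

P, h = 2.0, 1e-6
numeric_slope = (S_L(P + h) - S_L(P - h)) / (2 * h)   # central difference
assert abs(numeric_slope - (-D_L(P))) < 1e-6          # envelope theorem: S_L' = -D_L
```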
2.1.3 Second-Best Outcome: Optimal Nonlinear Pricing Algorithm
Since the seller does not observe the buyer's type, he is forced to offer a set of choices
independent of her type. The seller solves

max_{T(q)} β[T(q_L) − cq_L] + (1 − β)[T(q_H) − cq_H]

subject to

q_i = arg max_q {θ_i v(q) − T(q)} for i = L, H    (IC)
θ_i v(q_i) − T(q_i) ≥ 0 for i = L, H    (IR)

The problem as-is involves optimization over a schedule T(q) under constraints that themselves
involve optimization problems. To simplify the problem, follow these steps:
1. Use the revelation principle.
2. Observe that the participation constraint of the high type will not bind at the optimum:

θ_H v(q_H) − T(q_H) ≥ θ_H v(q_L) − T(q_L)    (IC_H)
                    > θ_L v(q_L) − T(q_L)    (θ_H > θ_L)
                    ≥ 0    (IR_L)
That is, the high type will earn an informational rent.
3. Solve the relaxed problem without the incentive constraint that is satisfied at the first-
best optimum.
The first-best problem's outcome is not incentive compatible, because the θ_H buyer will
prefer to choose (q_L, T_L) rather than his own first-best allocation: it gives him
a strictly positive surplus:

θ_H v(q_L) − T_L > θ_L v(q_L) − T_L    (θ_H > θ_L)
                = 0    (No rent in first best)

(θ_H − θ_L)v(q_L) > 0

In contrast, type θ_L will not find it attractive to raise his consumption to the level q_H:

θ_H v(q_H) − T(q_H) = 0    (No rent in first best)
                    > θ_L v(q_H) − T(q_H)    (θ_H > θ_L)
Therefore, we omit the (IC_L) constraint. That is, we omit the constraint for the type
that does not want to deviate; the constraint for the type that does will bind.
4. Observe that the two remaining constraints of the relaxed problem will bind at the
optimum.
5. Eliminate T_L and T_H from the maximand using the two binding constraints, perform
the unconstrained optimization, and check that (IC_L) is indeed satisfied.
How do you get the second FOC (w.r.t. q_L) in Step 5?

β[θ_L v′(q_L) − c] − (1 − β)(θ_H − θ_L)v′(q_L) = v′(q_L)[βθ_L − (1 − β)(θ_H − θ_L)] − βc = 0

v′(q_L) = c / [θ_L − ((1 − β)/β)(θ_H − θ_L)]
        = (c/θ_L) / [1 − ((1 − β)/β)·(θ_H − θ_L)/θ_L]

θ_L v′(q_L) = c / [1 − ((1 − β)/β)·(θ_H − θ_L)/θ_L]
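Under an illustrative specification v(q) = 2√q (an assumption, not from the notes), the second-best FOCs can be solved in closed form and compared with the first best:

```python
# Sketch of second-best quantities. Illustrative assumption: v(q) = 2*sqrt(q).
# FOCs: theta_H * v'(q_H) = c, and
#       theta_L * v'(q_L) = c / (1 - ((1-beta)/beta)*(theta_H - theta_L)/theta_L).
# Parameters chosen so the "wedge" stays positive (seller serves both types).
beta, theta_L, theta_H, c = 0.6, 1.0, 1.2, 1.0

q_H = (theta_H / c) ** 2                    # high type: efficient quantity
wedge = 1 - ((1 - beta) / beta) * (theta_H - theta_L) / theta_L
q_L_first = (theta_L / c) ** 2              # first best for the low type
q_L_second = (theta_L * wedge / c) ** 2     # from v'(q_L) = c / (theta_L * wedge)

assert q_L_second < q_L_first < q_H         # downward distortion for the low type only
print(round(q_L_second, 3), round(q_L_first, 3), round(q_H, 3))   # → 0.751 1.0 1.44
```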
2.1.3.1 Comparison with the First Best From the previous analysis, we see that

θ_L v′(q_L^S) = c / [1 − ((1 − β)/β)·(θ_H − θ_L)/θ_L] > c = θ_L v′(q_L^F)    (No rent in first best)

θ_L v′(q_L^S) > θ_L v′(q_L^F)
v′(q_L^S) > v′(q_L^F)    (θ_L > 0)
q_L^S < q_L^F    (v″(·) < 0)

Since (IR_L) binds in both regimes, T_L = θ_L v(q_L), and since q_L^F > q_L^S and v′(·) > 0,

θ_L[v(q_L^F) − v(q_L^S)] > 0

Therefore T_L^F > T_L^S. So in the second best, the low type has a lower quantity and a lower
price compared to the first best.
What about the high type? We saw previously that

θ_H v′(q_H^S) = c

Marginal benefit also equaled marginal cost in the first best:

θ_H v′(q_H^F) = c

Since v″(·) < 0, q_H^F = q_H^S; that is, the optimal quantity for the high type is unchanged in the
second best relative to the first best. What about the price?

θ_H v(q_H^S) − T_H^S = θ_H v(q_L^S) − T_L^S    (IC_H)
                     > θ_L v(q_L^S) − T_L^S    (θ_H > θ_L)
                     = 0    (IR_L)
                     = θ_H v(q_H^F) − T_H^F    (No rent in first best)

θ_H v(q_H^S) − T_H^S > θ_H v(q_H^F) − T_H^F
T_H^F − T_H^S > 0    (q_H^S = q_H^F)
T_H^F > T_H^S
So in the second best, the high type gets the same quantity as he gets in the first best at a
lower price. Let us link this result with the result for the low type, who gets a lower quantity
at a lower price in the second best compared to the first best. The seller designs the bundle
such that it is not worth it for the high type to pick the low type's bundle. He does so in
two ways:
1. He gives the high type a good deal by offering him his optimal quantity at a lower
price (relative to the first best).
2. He makes the low type's bundle less attractive by reducing the quantity.
2.2 Application: Credit Rationing
Consider a population of risk-neutral borrowers who each own a project that requires an
initial outlay of I = 1 and yields a random return X, where X ∈ {R, 0}. Let p ∈ [0, 1]
denote the probability that X = R. A borrower can be of two types i = s, r, where s is "safe"
and r is "risky". The borrower of type i has a project with return characteristics (p_i, R_i).
We assume

p_i R_i = m > 1 for i = s, r    (A1)
p_s > p_r and R_s < R_r    (A2)

A bank can offer to finance the initial outlay in exchange for a future repayment. Assume
that there is a single bank and excess demand for funds: the bank has a total amount of
funds κ < 1 per borrower, with κ ≥ max{β, 1 − β}, so that available funds are sufficient to
avoid crowding out either type of borrower completely.

Call (x_i, D_i) a contract that offers financing with probability x_i and requires repayment D_i. The
bank solves

max_{x_i, D_i} [βx_s(p_s D_s − 1) + (1 − β)x_r(p_r D_r − 1)]
subject to
0 ≤ x_i ≤ 1, i = s, r    (Regularity)
D_i ≤ R_i, i = s, r    (IR)
x_i p_i(R_i − D_i) ≥ x_j p_i(R_i − D_j), i, j = s, r    (IC)
βx_s + (1 − β)x_r ≤ κ    (Supply)
2.2.1 Solution
1. We have already applied the revelation principle.
2. The risky type has an incentive to mimic the safe type. Therefore, his (IR) constraint
does not bind:

R_r ≥ (x_r D_r − x_s D_s)/(x_r − x_s)    (IC_R)
    > (x_r D_r − x_s D_r)/(x_r − x_s)    (D_r > D_s)
    = D_r

and the (IR) of the safe type does.

3. The first-best problem's outcome is not incentive compatible, because the risky type
will prefer to choose (x_s, D_s) rather than his own first-best allocation: it gives
him a strictly positive surplus (i.e., an informational rent):

x_s p_r(R_r − D_s) > x_s p_s(R_s − D_s)    (R_r > R_s = D_s)
                   = 0    (No rent in first best)

since R_r > D_s.
In contrast, the safe type will not find it attractive to choose (x_r, D_r):

x_r p_r(R_r − D_r) = 0    (No rent in first best)
                   > x_r p_s(R_s − D_r)    (R_s < R_r = D_r)

Therefore, we omit the incentive constraint of the safe type.

4. The bank's problem is now to solve

max_{x_i, D_i} [βx_s(p_s D_s − 1) + (1 − β)x_r(p_r D_r − 1)]

subject to

D_s = R_s    (IR_S)
x_r(R_r − D_r) = x_s(R_r − R_s)    (IC_R)

5. As both of these constraints bind, the problem becomes

max_{x_s, x_r} {βx_s(p_s R_s − 1) + (1 − β)[x_r(p_r R_r − 1) − x_s p_r(R_r − R_s)]}
= max_{x_s, x_r} {βx_s(m − 1) + (1 − β)[x_r(m − 1) − x_s p_r(R_r − R_s)]}

subject to

0 ≤ x_s ≤ x_r ≤ 1
βx_s + (1 − β)x_r ≤ κ

The derivative with respect to x_r is (1 − β)(m − 1) > 0; therefore x_r = 1: there is no
rationing of risky borrowers.
The derivative with respect to x_s is β(m − 1) − (1 − β)p_r(R_r − R_s). Thus,

x_s = 0, if β(m − 1) − (1 − β)p_r(R_r − R_s) < 0
x_s = (κ − (1 − β))/β, otherwise
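The solution above can be sketched with concrete numbers. The parameter values are illustrative assumptions, not from the notes:

```python
# Sketch of the credit-rationing solution: x_r = 1 always, and the safe type is
# rationed completely when beta*(m-1) < (1-beta)*p_r*(R_r - R_s).
beta, kappa = 0.5, 0.8
p_s, R_s = 0.9, 1.5      # safe type:  m = p_s * R_s = 1.35
p_r, R_r = 0.5, 2.7      # risky type: m = p_r * R_r = 1.35
m = p_s * R_s
assert abs(p_r * R_r - m) < 1e-9    # (A1): equal expected project returns

x_r = 1.0
if beta * (m - 1) < (1 - beta) * p_r * (R_r - R_s):
    x_s = 0.0                        # rationing: the safe type gets no funding
else:
    x_s = (kappa - (1 - beta)) / beta

print(x_s, x_r)   # → 0.0 1.0 (the safe type is rationed here)
```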
2.3 A Continuum of Types
Suppose that θ is distributed according to the p.d.f. f(θ) with c.d.f. F(θ) on an interval [θ̲, θ̄].
With the revelation principle, the seller solves

max_{q(·), T(·)} E[π(θ)] = max_{q(·), T(·)} E[T(θ) − cq(θ)] = max_{q(·), T(·)} ∫_θ̲^θ̄ [T(θ) − cq(θ)]f(θ) dθ

subject to

θv[q(θ)] − T(θ) ≥ 0 ∀θ ∈ [θ̲, θ̄]    (IR)
θv[q(θ)] − T(θ) ≥ θv[q(θ̂)] − T(θ̂) ∀θ, θ̂ ∈ [θ̲, θ̄]    (IC)

If all (IC)s hold, the (IR)s can be replaced by

θ̲v[q(θ̲)] − T(θ̲) ≥ 0    (IR)

Consider decomposing the seller's problem into an implementation problem (which q(θ) are IC?)
and an optimization problem (among all feasible q(θ), which one is the arg max_{q(·)} E[π(θ)]?).
2.3.1 The Implementation Problem
We want to show that if the buyer's utility function satisfies the single-crossing condition

∂/∂θ [−(∂u/∂q)/(∂u/∂T)] > 0

then the set of (IC)s in the seller's problem is equivalent to the following:

1. Monotonicity: dq(θ)/dθ ≥ 0.
2. Local incentive compatibility: θv′[q(θ)]·dq(θ)/dθ = T′(θ) ∀θ ∈ [θ̲, θ̄].

Proof. We first show that (IC) satisfaction implies monotonicity and local incentive compatibility.
If consumption q(θ) and payment T(θ) are differentiable, the following conditions for
the buyer's problem are satisfied at θ̂ = θ:

d/dθ̂ [θv(q(θ̂)) − T(θ̂)] |_{θ̂=θ} = θv′[q(θ)]·dq(θ)/dθ − T′(θ) = 0    (FOC)

θv″[q(θ)]·(dq(θ)/dθ)² + θv′[q(θ)]·d²q(θ)/dθ² − T″(θ) ≤ 0    (SOC)

The FOC is the local incentive compatibility constraint. If we differentiate it with respect
to θ, we obtain

v′[q(θ)]·dq(θ)/dθ + [θv″[q(θ)]·(dq(θ)/dθ)² + θv′[q(θ)]·d²q(θ)/dθ² − T″(θ)] = 0

Since the bracketed term is nonpositive by the SOC,

v′[q(θ)]·dq(θ)/dθ ≥ 0  ⇒  dq(θ)/dθ ≥ 0
We next show that monotonicity and local incentive compatibility imply that all the buyer's
(IC)s hold. Suppose not. Then there exists at least one pair (θ, θ̂) with θ̂ ≠ θ such that

θv[q(θ̂)] − T(θ̂) > θv[q(θ)] − T(θ)

Suppose θ̂ > θ. Then for every x ∈ [θ, θ̂], monotonicity (dq(x)/dx ≥ 0) implies

θv′[q(x)]·dq(x)/dx ≤ xv′[q(x)]·dq(x)/dx

Using local incentive compatibility, xv′[q(x)]·dq(x)/dx − T′(x) = 0, so

[θv[q(θ̂)] − T(θ̂)] − [θv[q(θ)] − T(θ)] = ∫_θ^θ̂ (θv′[q(x)]·dq(x)/dx − T′(x)) dx
                                        ≤ ∫_θ^θ̂ (xv′[q(x)]·dq(x)/dx − T′(x)) dx
                                        = 0

contradicting the supposed strict inequality. The case θ̂ < θ is symmetric.
2.3.2 The Optimization Problem's Algorithm

2.3.2.1 How do you get the local incentive compatibility constraint?
Proof.

θv[q(θ)] − T(θ) ≥ θv[q(θ̂)] − T(θ̂)    (Type θ)
θ̂v[q(θ̂)] − T(θ̂) ≥ θ̂v[q(θ)] − T(θ)    (Type θ̂)

θ(v[q(θ)] − v[q(θ̂)]) ≥ T(θ) − T(θ̂)    (From first ineq.)
θ̂(v[q(θ)] − v[q(θ̂)]) ≤ T(θ) − T(θ̂)    (From second ineq.)

θ̂(v[q(θ)] − v[q(θ̂)]) ≤ T(θ) − T(θ̂) ≤ θ(v[q(θ)] − v[q(θ̂)])

Divide through by θ − θ̂ (taking θ > θ̂) and let θ̂ → θ:

lim_{θ̂→θ} θ̂·(v[q(θ)] − v[q(θ̂)])/(θ − θ̂) ≤ lim_{θ̂→θ} (T(θ) − T(θ̂))/(θ − θ̂) ≤ lim_{θ̂→θ} θ·(v[q(θ)] − v[q(θ̂)])/(θ − θ̂)

θv′[q(θ)]·dq(θ)/dθ = T′(θ)

by the Sandwich Theorem.
The standard procedure for solving this is to first ignore the monotonicity constraint and
solve the problem with only the (IR) and LIC constraints. Define
W(θ) ≡ θv[q(θ)] − T(θ) = max_θ̂ {θv[q(θ̂)] − T(θ̂)}

By the envelope theorem,

dW(θ)/dθ = ∂/∂θ [θv(q(θ̂)) − T(θ̂)] |_{θ̂=θ} = v[q(θ)]
or, integrating,

∫_θ̲^θ dW(x)/dx dx = ∫_θ̲^θ v[q(x)] dx

W(θ) − W(θ̲) = ∫_θ̲^θ v[q(x)] dx

W(θ) = ∫_θ̲^θ v[q(x)] dx + W(θ̲)

At the optimum, the participation constraint of the lowest type is binding: W(θ̲) = 0. Since
T(θ) = θv[q(θ)] − W(θ), we can rewrite the seller's profits as

π = ∫_θ̲^θ̄ [T(θ) − cq(θ)]f(θ) dθ
  = ∫_θ̲^θ̄ [(θv[q(θ)] − W(θ)) − cq(θ)]f(θ) dθ
  = ∫_θ̲^θ̄ (θv[q(θ)] − ∫_θ̲^θ v[q(x)] dx − cq(θ)) f(θ) dθ
  = ∫_θ̲^θ̄ {θv[q(θ)] − cq(θ)}f(θ) dθ − ∫_θ̲^θ̄ (∫_θ̲^θ v[q(x)] dx) f(θ) dθ

Solve the second integral through integration by parts:

u = ∫_θ̲^θ v[q(x)] dx,  dv = f(θ) dθ
du = v[q(θ)] dθ,  v = F(θ)
Carrying out the integration by parts and maximizing the resulting expression pointwise in
q(θ) yields the first-order condition

[θ − (1 − F(θ))/f(θ)] v′[q(θ)] = c    (2.31)
From this equation, we make two observations:

1. There is underconsumption, since first-best efficiency requires θv′[q(θ)] = c.
2. We can obtain an expression for the price-cost margin:

P[q(θ)] ≡ T′(θ)/q′(θ) = θv′[q(θ)] = c + [(1 − F(θ))/f(θ)]·v′[q(θ)]

(P − c)/P = (1 − F(θ)) / (θ f(θ))
Recall that we have ignored the monotonicity constraint. We have to check that the optimal
solution defined by (2.31) and the local incentive compatibility constraint satisfies the
monotonicity constraint. In general, that depends on the form of the buyer's utility function
and/or on the form of the density function f(θ). A sufficient condition for the monotonicity
constraint to be satisfied is that the hazard rate is increasing in θ:

dh(θ)/dθ ≡ d/dθ [f(θ)/(1 − F(θ))] > 0

Letting

g(θ) ≡ 1/h(θ) = (1 − F(θ))/f(θ)

the FOC can be rewritten as

[θ − g(θ)]v′[q(θ)] = c

Implicitly differentiating this equation with respect to θ then yields

[θ − g(θ)]v″[q(θ)]·dq/dθ + v′[q(θ)][1 − g′(θ)] = 0

dq/dθ = −v′[q(θ)][1 − g′(θ)] / ([θ − g(θ)]v″[q(θ)])

Since v(·) is concave and θ − g(θ) > 0 at the optimum, dq/dθ ≥ 0 if g′(θ) ≤ 0. Hence, a
sufficient condition for monotonicity is that g(θ) = 1/h(θ) is decreasing in θ, i.e., that the
hazard rate is increasing.
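The sufficient condition is easy to verify for a concrete distribution. The uniform distribution and the functional form v(q) = 2√q below are illustrative assumptions, not from the notes:

```python
# Check of the monotonicity condition for θ ~ Uniform[0, 1] (illustrative):
# h(θ) = f/(1-F) = 1/(1-θ) is increasing, so g(θ) = (1-F)/f = 1-θ is decreasing,
# and θ - g(θ) = 2θ - 1. With v(q) = 2*sqrt(q), the FOC (θ - g(θ)) v'(q) = c
# gives the schedule q(θ) = ((2θ - 1)/c)**2 on the region where 2θ - 1 > 0.
c = 1.0
thetas = [k / 100 for k in range(51, 100)]        # types with 2θ - 1 > 0
q = [((2 * t - 1) / c) ** 2 for t in thetas]
assert all(q2 >= q1 for q1, q2 in zip(q, q[1:]))  # monotonicity holds
print(round(q[0], 4), round(q[-1], 4))            # → 0.0004 0.9604
```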
2.3.2.2 Doesn't that look familiar? Well, it should. Recall the optimal auction design
problem [Krishna, 2010]. The seller maximizes

Σ_{i∈N} m_i(0) + ∫_X [Σ_{i∈N} (x_i − (1 − F_i(x_i))/f_i(x_i)) Q_i(x)] f(x) dx

subject to

m_i(0) ≥ 0    (IR)
dq_i(x_i)/dx_i ≥ 0    (IC-1)
U_i(x_i) = U_i(0) + ∫_0^{x_i} q_i(t_i) dt_i    (IC-2)

But (IC-1) is just a monotonicity condition, and (IC-2) is a local incentive compatibility
constraint. Moreover, the design problem is regular if the virtual valuation is an increasing
function of x_i, and a sufficient condition for regularity is that the hazard rate function λ_i(x_i)
is increasing:

ψ_i(x_i) ≡ x_i − 1/λ_i(x_i) = x_i − (1 − F_i(x_i))/f_i(x_i)
2.4 Application: Costly State Verification
Consider a financial contracting problem involving two risk-neutral agents:
1. an entrepreneur with an investment project but no investment funds, and
2. a financier with unlimited funds.
The fixed investment requires a setup cost of I > 0 at t = 0 and generates a random cash
flow at t = 1 of θ ∈ [0, +∞), with p.d.f. f(θ). The entrepreneur observes the realized cash
flow and can credibly disclose it to the investor only by incurring a certification cost K > 0.
The contract-design problem is to specify in advance which cash-flow realizations should be
certified.

The set of contracts from which the contracting parties can choose is as follows: the contract
specifies whether an audit should take place following the realization of cash flows and what
fraction of realized cash flows the entrepreneur should pay back to the investor. The revelation
principle reduces the set of relevant contracts.

Thus, following the realization of θ, the entrepreneur truthfully reveals θ, and the contract
specifies the probability p(θ) ∈ [0, 1] of certifying (auditing) cash flows. When there is no
certification, the contract can only specify a repayment r(θ). With certification, the contract
specifies repayment contingent on both the announced cash flow θ̂ and the true, certified
cash flow θ: r(θ̂, θ). r_a(θ) denotes the repayment for audited cash flows.
A central result of the CSV approach to financial contracting is that standard debt contracts
may be optimal financial contracts. But this result is driven by fairly strong assumptions. For
example, financial contracts are assumed to specify only deterministic certification policies:

p(θ) ∈ {0, 1} ∀θ

That is, commitment to random audits is not feasible. How do we then show that a standard
debt contract is an optimal financial contract?
Assuming that the project has positive NPV, the optimal contracting problem reduces to
minimizing expected audit costs, subject to meeting the entrepreneur's incentive constraints
for truthful revelation and the financier's participation:

min_{p(·), r(·), r_a(·)} K ∫_0^{+∞} p(θ)f(θ) dθ

subject to

∫_0^{+∞} p(θ)[r_a(θ) − K]f(θ) dθ + ∫_0^{+∞} [1 − p(θ)]r(θ)f(θ) dθ ≥ I    (IR)
r_a(θ_1) ≤ r(θ_2) ∀θ_1 ≠ θ_2 : p(θ_1) = 1 and p(θ_2) = 0    (IC-1)
r(θ_1) = r(θ_2) = r ∀θ_1 ≠ θ_2 : p(θ_1) = 0 = p(θ_2)    (IC-2)
r_a(θ) ≤ θ ∀θ : p(θ) = 1    (Limited wealth (A))
r(θ) ≤ θ ∀θ : p(θ) = 0    (Limited wealth (B))
The incentive constraints imply that for any two cash-flow realizations that do not require
certification, the repayment to the financier has to be the same: r(θ_1) = r(θ_2) = r.
Otherwise, the entrepreneur could lie about realized cash flows and announce the one with the
lower repayment. Similarly, a cash-flow realization that requires certification should not entail
a higher repayment than the repayment for a cash-flow realization that does not involve
certification: r_a(θ_1) ≤ r(θ_2). Otherwise, the entrepreneur could lie about the realization of
θ_1 and thus make a smaller repayment.
The only dimensions along which the financial contract can be varied are the certification
subset and the size of the repayment in the certification subset. Characterizing the optimal
contract proceeds as follows:

1. Any feasible contract must include the cash-flow realization θ = 0 in the audit subset.
Otherwise, the entrepreneur could always claim to have a cash flow of zero, thereby
avoiding an audit as well as any repayments to the financier.

2. Any contract that minimizes expected audit costs must be such that, for any cash-flow
realization θ in the audit subset, r_a(θ) = min{θ, r}.

3. Any contract with a disconnected audit subset, say [0, θ_1] ∪ [θ_2, θ_3] with θ_1 < θ_2, would be
inefficient, since then an obvious improvement is available by shifting to a connected
subset with the same probability mass.

We conclude that the uniquely optimal financial contract, which minimizes expected audit
costs, is such that

1. there is a single connected audit region [0, D̄), with D̄ < +∞;
2. over this region, audit repayments are r_a(θ) = θ; and
3. for cash flows θ ≥ D̄, which are not audited, the repayment is r = D̄. The unique cutoff D̄
is given by the solution to the (IR) constraint

∫_0^D̄ (θ − K)f(θ) dθ + [1 − F(D̄)]D̄ = I

and expected audit costs are F(D̄)K.
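The cutoff equation can be solved numerically for a concrete distribution. The uniform cash-flow distribution and the parameter values below are illustrative assumptions, not from the notes:

```python
# Sketch: solving the (IR) constraint for the audit cutoff under an illustrative
# assumption θ ~ Uniform[0, 1], so f = 1 and F(θ) = θ. Then
#   ∫_0^D (θ - K) dθ + (1 - D) D = D(1 - K) - D**2 / 2.
K, I = 0.05, 0.3

def lhs(D):
    # expected repayment to the financier, net of audit costs
    return D * (1 - K) - D ** 2 / 2

lo, hi = 0.0, 1 - K          # lhs is increasing on [0, 1-K]; take the smaller root
assert lhs(hi) >= I          # the project can be financed at all
for _ in range(80):          # bisection
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if lhs(mid) < I else (lo, mid)
D_bar = (lo + hi) / 2
print(round(D_bar, 3))       # → 0.4; expected audit costs F(D_bar)*K = 0.4 * 0.05 = 0.02
```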
3 Hidden Information, Signaling
3.1 Spence's Model of Education as a Signal

A worker's productivity can be either r_H or r_L, with r_H > r_L > 0. Let β_i ≡ Pr(r = r_i) ∈ [0, 1]
be the firm's prior belief that r = r_i. Workers are willing to work at any wage w > 0, and
firms are willing to hire any worker at a wage less than the worker's expected productivity.
A worker of type i = L, H can get e years of education at cost c(e) = θ_i e before entering the
labor market. The key assumption is that θ_H < θ_L: the marginal cost of education is lower
for high-productivity workers.

In a first stage, the worker chooses education, and in a second stage, the wage is determined
through bargaining. We assume that in the bargaining phase, the worker has all the
bargaining power.

Suppose that the worker's productivity is perfectly observable. Then each worker type
chooses a level of education e_i = 0. Since the worker's productivity is known, the highest
offer the firm is willing to accept is w_i = r_i regardless of the level of education of the worker,
and we have the first-best solution.
Now suppose that productivity is not observable. Then the first-best solution can no longer
be an equilibrium outcome. We specify precisely the game played by the worker and firm,
as well as the notion of equilibrium:
1. The worker chooses a level of education, possibly randomly, to maximize his expected
return. p_i(e) ≡ Pr(e | θ_i) denotes the probability that the worker/principal of type i
chooses education level e.
2. The outcome is entirely driven by how the agent's beliefs have been affected by the
observation of the principal's education level. μ(i | e) ≡ Pr(θ_i | e) denotes the agent's
revised beliefs about productivity upon observing e. Then the equilibrium wage in the
second stage is given by

w(e) = μ(H | e)r_H + μ(L | e)r_L

where μ(L | e) = 1 − μ(H | e). This is the maximum wage the firm is willing to pay
given its updated beliefs.

Imposing the minimum consistency requirements on the agent's conditional beliefs leads to
the definition of a perfect Bayesian equilibrium: a set of (possibly mixed) strategies
{p_i(e)} for the principal's types and conditional beliefs μ(i | e) for the agent such that

1. All education levels observed with positive probability in equilibrium must maximize the
worker's expected payoff: ∀e : p_i(e) > 0,

e ∈ arg max_ê {μ(H | ê)r_H + μ(L | ê)r_L − θ_i ê}

2. Firms' posterior beliefs conditional on equilibrium education levels must satisfy Bayes'
rule:

μ(i | e) = Pr(θ_i | e)
         = Pr(e | θ_i)Pr(θ_i) / [Pr(e | θ_H)Pr(θ_H) + Pr(e | θ_L)Pr(θ_L)]    (Familiar now?)
         = p_i(e)β_i / Σ_{j=1}^2 β_j p_j(e)

whenever p_j(e) > 0 for at least one type.
3. Posterior beliefs are otherwise not restricted: if p_i(e) = 0 for i = L, H (so that
Σ_{j=1}^2 β_j p_j(e) = 0, and Bayes' rule gives no prediction for posterior beliefs), then μ(i | e)
can take any value in [0, 1].

4. Firms pay workers their expected productivity:

w(e) = μ(H | e)r_H + μ(L | e)r_L

Thus, an equilibrium outcome is a situation where the agent's beliefs about what action
each type of principal takes are correct (i.e., the action believed to be chosen by a given
type is indeed the action chosen by that type) and where, given the agent's updated beliefs
(following the action), each type of principal is acting optimally.
When a PBE is taken to be the outcome, essentially the only restriction imposed is that, in
equilibrium, the agent's beliefs are consistent with the agent's knowledge of the optimizing
behavior of the principal. To solve for a PBE, one typically proceeds as follows:

1. Using one's basic understanding of how the game works, one guesses (i.e., conjectures)
conditional beliefs μ(i | e) for the agent.
2. Then one determines the principal's best response p_i(e) given these beliefs.
3. Finally, one checks whether the beliefs μ(i | e) are consistent with the principal's
optimizing behavior.

In signaling games, the difficulty is usually not to find a PBE. Rather, the problem is
that there exist too many PBEs.
3.1.1 Did you forget the beer-quiche game?
Lets use the beer quiche game as an example of how to solve for a PBE. As a student,
I found my professors instruction to be frustrating at first. Conjecture an equilibrium and
36
8/10/2019 Contract Theory, by Alex Young
37/86
Figure 2: Image from Professor Meghan Busse.
then verify it? What the hell is that?! Im supposed to know the answer before I solve for
it?
No, you are not supposed to know the answer from the onset. But that leads to another
frustration during an exam: you have limited time, but several of your conjectures may
turn out not to be equilibria. Well, hopefully when you write a paper, youre not under a
ninety-minute time constraint, so lets proceed with the steps.
There are two players in the game: the sender and the receiver. The sender is one individual
who can be of two types: brave or coward. If the sender is the brave type, he prefers beer
to quiche; if he is the coward type, he prefers quiche to beer. The receiver prefers to fight
the coward type but not the brave type.
1. Let us first conjecture a separating equilibrium: if brave, then beer; if coward, then quiche. This is the sender's strategy.

2. We can show by Bayes' Rule that such a conjecture allows the receiver to perfectly distinguish between the types:
\[ \Pr(t_b \mid b) = \frac{\Pr(b \mid t_b)\Pr(t_b)}{\Pr(b \mid t_b)\Pr(t_b) + \Pr(b \mid t_c)\Pr(t_c)} = \frac{1 \cdot 0.8}{1 \cdot 0.8 + 0 \cdot 0.2} = 1 \]
By the same procedure, we see that $\Pr(t_b \mid q) = 0$, $\Pr(t_c \mid q) = 1$, and $\Pr(t_c \mid b) = 0$. Hence, the receiver will not fight if he sees beer but will fight if he sees quiche. This is the receiver's response.
3. Knowing that the receiver will respond that way, will the sender still stick to the original strategy? No. The brave type gets his preferred breakfast without a fight, but the coward type will have to fight if he chooses quiche. If instead he lied and pretended to be the brave type, then he would receive a higher payoff due to not fighting, even though he prefers quiche to beer. So our conjectured separating equilibrium is not an equilibrium.

It is easy to see that the other candidate separating equilibrium, if brave, then quiche; if coward, then beer, makes no sense. The coward type will again deviate, as doing so would yield the best payoff for him: he gets his preferred breakfast without a fight. What about pooling equilibria? Let's follow the steps again:

1. Conjecture always beer.

2. We can show by Bayes' Rule that such a conjecture implies that the receiver cannot distinguish between the types, since both choose the same message. Fighting after beer yields $0 \cdot 0.8 + 1 \cdot 0.2 = 0.2$, while not fighting yields $1 \cdot 0.8 + 0 \cdot 0.2 = 0.8$, so the receiver will not fight if he sees beer. We must also specify off-equilibrium beliefs. Let $\mu$ denote $\Pr(t_b \mid q)$, so fighting after quiche yields $1 - \mu$ and not fighting yields $\mu$. To prevent the coward type from deviating, the response must be fight if quiche. Accordingly, we need $1 - \mu > \mu$, i.e., $\mu < 1/2$.
3. Knowing that the receiver will respond that way, will the sender still stick to the original strategy? Yes. The brave type will not deviate as he receives his best payoff in equilibrium. The coward type might have deviated if he could have gotten the payoff of 3, but he cannot, as the receiver will fight if he sees quiche. Thus, the coward type will not deviate either.
There is another candidate pooling equilibrium, but we will save that for later.
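The deviation checks above can be mechanized. A minimal sketch, assuming the standard Cho-Kreps payoffs (getting the preferred breakfast is worth 1 and avoiding a fight is worth 2, which is consistent with the "payoff of 3" mentioned above; these numbers are an assumption, since the payoff matrix itself sits in Figure 2):

```python
# Beer-quiche deviation checks under the assumed payoffs:
# preferred breakfast worth 1, avoiding a fight worth 2 (best payoff = 3).

def sender_payoff(sender_type, meal, receiver_fights):
    preferred = "beer" if sender_type == "brave" else "quiche"
    return (1 if meal == preferred else 0) + (0 if receiver_fights else 2)

# Conjectured separating equilibrium: brave -> beer, coward -> quiche,
# so the receiver fights iff he sees quiche.
coward_eq = sender_payoff("coward", "quiche", receiver_fights=True)       # 1
coward_dev = sender_payoff("coward", "beer", receiver_fights=False)       # 2
assert coward_dev > coward_eq   # the coward deviates: not an equilibrium

# Pooling on beer with off-path belief mu = Pr(brave | quiche) < 1/2, so fight if quiche.
coward_pool = sender_payoff("coward", "beer", receiver_fights=False)      # 2
coward_pool_dev = sender_payoff("coward", "quiche", receiver_fights=True) # 1
assert coward_pool > coward_pool_dev   # no profitable deviation: pooling survives
```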
We conjecture the following PBE. If the observed education level is high, it is likely that the worker is type $H$. Even if a type $L$ worker could obtain a wage $w = r_H$ by acquiring education, he would not be willing to choose a level of education above $\bar e$, where $\bar e$ is given by¹
\[ r_H - \theta_L \bar e = r_L \]
Thus a candidate is to set $\mu(\theta_H \mid e) = 1$ for all $e \ge \bar e$ and $\mu(\theta_H \mid e) = 0$ for all $e < \bar e$. If the principal optimizes against these beliefs, then he chooses $e = 0$ when he is of type $L$ and $\bar e$ when he is of type $H$.

There are many other PBEs, which can be classified into three categories:

1. Separating PBEs: The signal chosen by the principal identifies the principal's type exactly.

¹That is, $\bar e$ is a hurdle. Any level of education higher than $\bar e$ means the type $L$ worker is better off not pursuing any education at all: $\forall e > \bar e$, $r_H - \theta_L e < r_L$.
2. Pooling PBEs: The observed signal reveals no additional information about the principal's type.

3. Semiseparating PBEs: Some, but not all, information about the principal's type is obtained from the observation of the signal.

A separating equilibrium is a PBE where each type of principal chooses a different signal in equilibrium: $e_H \ne e_L$, $\mu(\theta_H \mid e_H) = 1$, $\mu(\theta_L \mid e_L) = 1$, and $w_i = r_i$. The set of separating equilibrium levels of education is given by
\[ S_s = \left\{ (e_H, e_L) : e_L = 0 \;\&\; e_H \in \left[ \frac{r_H - r_L}{\theta_L}, \frac{r_H - r_L}{\theta_H} \right] \right\} \]
because incentive compatibility requires
\[ r_H - \theta_L e_H \le r_L \]
\[ r_H - \theta_H e_H \ge r_L \]
Therefore
\[ \frac{r_H - r_L}{\theta_L} \le e_H \le \frac{r_H - r_L}{\theta_H} \quad \Longrightarrow \quad e_H \in \left[ \frac{r_H - r_L}{\theta_L}, \frac{r_H - r_L}{\theta_H} \right] \]

A pooling equilibrium is a PBE where each type of principal chooses the same signal in equilibrium: $e_H = e_L$, $\mu(\theta_H \mid e_H) = \beta_H$, $\mu(\theta_L \mid e_L) = \beta_L$, and $w(e_H) = w(e_L) = \beta_L r_L + \beta_H r_H \equiv \bar r$. The set of pooling equilibrium levels of education is
\[ S_p = \left\{ (e_H, e_L) : e_L = e_H = e_p \;\&\; e_p \in \left[ 0, \frac{\beta_L r_L + \beta_H r_H - r_L}{\theta_L} \right] \right\} \]
because incentive compatibility requires
\[ \bar r - \theta_L e_p \ge r_L \]
Therefore
\[ e_p \le \frac{\bar r - r_L}{\theta_L} = \frac{\beta_L r_L + \beta_H r_H - r_L}{\theta_L} \]
which, together with $e_p \ge 0$, gives
\[ e_p \in \left[ 0, \frac{\beta_L r_L + \beta_H r_H - r_L}{\theta_L} \right] \]
Finally, there also exists a set of semiseparating equilibria, where at least one type of principal is mixing between two signals, one of which is also chosen with positive probability by the other type of principal. The key requirement here is that, for mixing to be optimal, the type of principal who is mixing has to be indifferent between the two signals that are played with positive probability.

3.1.2 Refinements

A large body of research in game theory has attempted to develop criteria, based on theoretical considerations alone, for selecting a particular subset of equilibria. The basic idea is to enrich the specification of the game by introducing restrictions on the set of allowable off-equilibrium beliefs, using the observation that some types of players are more likely to choose some deviations than others (i.e. who wants to deviate?).
3.1.2.1 Cho and Kreps' Intuitive Criterion The basic observation this selection criterion builds on is that most deviations from equilibrium play can never be in the interest of some principal types. Beliefs conditional on off-equilibrium actions, therefore, ought to reflect that these actions are more likely to be chosen by some types than others. That is, beliefs conditional on off-equilibrium actions must be restricted to reflect that only some types are ever likely to choose these actions.

Formally (i.e. use this in papers, not workshops), the Cho-Kreps intuitive criterion follows: Let $u_i^* = w(e_i^*) - \theta_i e_i^*$ denote the equilibrium payoff of type $i$. Then $\mu(\theta_j \mid e) = 0$ for any off-equilibrium $e \ne e_i^*, e_j^*$ whenever $r_H - \theta_j e < u_j^*$ and $r_H - \theta_i e \ge u_i^*$ ($i = L, H$; $i \ne j$).

In a language comprehensible to mortals, the intuitive criterion states that when a deviation is dominated for one type of player but not the other, the deviation should not be attributed to the player for which it is dominated. By dominated, one means that the player is getting a worse payoff than his equilibrium payoff for any belief of the uninformed party following the deviation.
Perhaps an example will help illuminate or remind you about the intuitive criterion. Recall the beer quiche game:

1. Let's finish with the final conjecture: always quiche.

2. By the same procedure, the receiver does not learn the sender's type, since both types choose the same message. Fighting yields
\[ 0 \cdot 0.8 + 1 \cdot 0.2 = 0.2 \]
Not fighting yields
\[ 1 \cdot 0.8 + 0 \cdot 0.2 = 0.8 \]
Therefore, the receiver will choose to not fight if quiche. We again must specify off-equilibrium probabilities. Let $\mu$ denote $\Pr(t_b \mid b)$, with $\Pr(t_c \mid b) = 1 - \mu$. Fighting yields
\[ 0 \cdot \mu + 1 \cdot (1 - \mu) = 1 - \mu \]
Not fighting yields
\[ 1 \cdot \mu + 0 \cdot (1 - \mu) = \mu \]
To prevent the brave type from deviating, the response must be fight if beer. Accordingly, we need $1 - \mu > \mu$, i.e., $\mu < 1/2$: the receiver must believe that a deviation to beer from quiche came from the coward type.
That makes no sense. The coward type would NEVER deviate to beer; he already received his best payoff in equilibrium. The probability that the sender is the coward type given that the receiver sees beer should not be greater than 1/2; it should be zero.

Therefore, while always quiche is a perfect Bayesian equilibrium, it does not survive refinement via the intuitive criterion. Note that in the always beer case, the brave type had no reason to deviate, and accordingly, the probability that the sender was the brave type given that the receiver saw quiche was less than 1/2. Hence, always beer survives the intuitive criterion.

The intuitive criterion predicts implausible equilibrium outcomes in some situations. The least-cost separating equilibrium, in which $e_L = 0$ and $e_H = (r_H - r_L)/\theta_L$, is the same for all $\beta_i > 0$, $i = L, H$. Suppose now that $\beta_L$ is arbitrarily small ($\beta_L \approx 0$). Then it seems excessive to pay $c(e_H; \theta_H) = \theta_H (r_H - r_L)/\theta_L$ just to raise the wage by a commensurately small amount $[\Delta w = r_H - (1 - \beta_L) r_H - \beta_L r_L = \beta_L (r_H - r_L) \approx 0]$. Then the pooling equilibrium where no education costs are incurred seems more plausible, as it is Pareto-dominant. Moreover, note that with no adverse selection (i.e. $\beta_L = 0$), $e_L = e_H = 0$ in equilibrium, so the pooling equilibrium without education is the limit of this complete-information case, not the Cho-Kreps equilibrium.
3.1.2.2 Maskin and Tirole's Informed-Principal Problem The problem of multiplicity of PBEs is reduced when the timing of the game is changed so as to let the principal offer the agent a contingent contract before the choice of the signal. Consider the linear model of education specified previously, but invert the stages of contracting and education choice. That is,

1. now the worker offers his employer a contract before undertaking education.

2. This contract then specifies a wage schedule contingent on the level of education chosen by the worker after signing the contract.

Let $\{w_e\}$ denote the contingent wage schedule specified by the contract. There are two different cases to consider:

1. $\bar r = \beta_H r_H + \beta_L r_L \le r_H - \theta_H (r_H - r_L)/\theta_L$. In this case a type $H$ worker is better off in the least-cost separating equilibrium than in the efficient pooling equilibrium.

2. $\bar r > r_H - \theta_H (r_H - r_L)/\theta_L$. Here the type $H$ worker is better off in the efficient pooling equilibrium.
4 The Principal-Agent Problem [Mas-Colell et al., 1995]

4.1 Comment

One thing about contract theory (and perhaps teaching game theory more generally) is that some people do certain topics way, way, way better than others. We can see this in textbooks. When I took Micro 1, I found the consumer choice, classical demand theory, and production chapters to be unreadable; I considered it a testament to the readability of Jehle and Reny [2001] that I could read the sections in that book and still do the assigned problems in Mas-Colell et al. [1995]². But Chapter 14 in Mas-Colell et al. is amazingly readable. The notes accordingly follow Mas-Colell et al. for the basic treatment instead of Bolton and Dewatripont [2004].

²Of course, by "do" I mean that I stared at the page for half an hour, took a stab at how to begin, and then consulted the solutions manual.
4.2 Hidden Actions (Moral Hazard)

The owner of a firm (the principal) wishes to hire a manager (the agent) for a project. The project's profits are affected in part by the manager's actions. How should the principal design the manager's compensation in a way that indirectly gives him (the manager) the incentive to take the desired action when the manager's actions are unobservable?

Let $\pi$ denote the project's observable profits, and let $e \in E \subset \mathbb{R}$ denote the manager's action choice. For the nonobservability of managerial effort to matter, $e$ must not be perfectly deducible from observing $\pi$. We thus assume that although the project's profits are affected by $e$, they are not fully determined by it. Profit can take values in $[\underline{\pi}, \bar{\pi}]$ and is stochastically related to $e$ in a manner described by the density $f(\pi \mid e)$, with $f(\pi \mid e) > 0$ for all $e \in E$ and all $\pi \in [\underline{\pi}, \bar{\pi}]$. Put another way, any realization of $\pi$ can arise following any given $e$.

We focus on the case in which the manager has only two possible effort choices, $e_H$ and $e_L$. $e_H$ leads to a higher profit level for the firm than $e_L$ but entails greater difficulty for the manager. That is, we assume that the distribution of $\pi$ conditional on $e_H$ first-order stochastically dominates the distribution conditional on $e_L$:
\[ F(\pi \mid e_H) \le F(\pi \mid e_L) \;\; \forall \pi \quad \Longrightarrow \quad \int \pi f(\pi \mid e_H)\, d\pi > \int \pi f(\pi \mid e_L)\, d\pi \]

The manager is an expected utility maximizer with a Bernoulli utility function $u(w, e)$ over his wage and effort level with the following properties:
\[ u(w, e) = v(w) - g(e) \]
\[ v'(w) > 0 \quad \text{(prefers more income to less)} \]
\[ v''(w) \le 0 \quad \text{(weakly risk averse)} \]
\[ g(e_H) > g(e_L) \quad \text{(dislikes a high level of effort)} \]

The owner receives the project's profits less any wage payments made to the manager. We assume that the owner is risk neutral; his objective is to maximize his expected return.
4.2.1 The Optimal Contract when Effort is Observable

The owner chooses a take-it-or-leave-it contract to offer the manager. The optimal contract for the owner then solves the following problem:
\[ \max_{e \in \{e_L, e_H\},\, w(\pi)} \int \left( \pi - w(\pi) \right) f(\pi \mid e)\, d\pi \]
\[ \text{s.t.} \quad \int v(w(\pi)) f(\pi \mid e)\, d\pi - g(e) \ge \bar u \]

We write the Lagrangian corresponding to a discrete version of the model and take the FOC:
\[ \mathcal{L} = \sum_i (\pi_i - w_i) p_{i,e} + \gamma \left[ \sum_i v(w_i) p_{i,e} - g(e) - \bar u \right] \]
\[ \frac{\partial \mathcal{L}}{\partial w_i} = -p_{i,e} + \gamma v'(w_i) p_{i,e} = 0 \]
\[ \Longrightarrow \gamma = \frac{1}{v'(w_i)} = \frac{1}{v'(w(\pi))} \quad \text{(Borch rule)} \]

Exercise 14.B.1 asks us whether the reservation utility constraint must be binding in an optimal contract. We can see this a few ways. First, from the previous derivation, since $v'(\cdot) > 0$, $\gamma > 0$. This implies that the constraint must bind. Second, if the constraint did not bind, then the owner could reduce the wage payment while still satisfying the constraint and making himself strictly better off. But that would mean that such a wage payment was not optimal.
If the manager is strictly risk averse (i.e. $v''(w) < 0$), the Borch rule implies that $w(\pi)$ is constant across realizations of $\pi$: the risk-neutral owner fully insures the manager, paying a fixed wage $w_e^*$ that depends only on the required effort and satisfies the binding reservation utility level:
\[ v(w_e^*) - g(e) = \bar u \]
\[ \Longrightarrow w_e^* = v^{-1}(\bar u + g(e)) \]
\[ g(e_H) > g(e_L) \Longrightarrow w_{e_H}^* > w_{e_L}^* \]
Thus, in the principal-agent model with observable managerial effort, an optimal contract specifies that the manager choose the $e^*$ that maximizes
\[ \int \pi f(\pi \mid e)\, d\pi - v^{-1}(\bar u + g(e)) \]
and pays the manager a fixed wage
\[ w^* = v^{-1}(\bar u + g(e^*)) \]
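The first-best contract above can be computed in a few lines. A sketch with $v(w) = \sqrt{w}$ (so $v^{-1}(x) = x^2$); all the numbers are illustrative, not from the text:

```python
# First-best contract with observable effort: fixed wage w_e = v^{-1}(u_bar + g(e))
# under v(w) = sqrt(w). All parameter values are illustrative.
u_bar = 1.0
g = {"eL": 0.2, "eH": 0.5}          # effort cost
E_pi = {"eL": 8.0, "eH": 10.0}      # expected profit under each effort

def fixed_wage(e):
    # w_e = v^{-1}(u_bar + g(e)): full insurance for a strictly risk-averse manager
    return (u_bar + g[e]) ** 2

profits = {e: E_pi[e] - fixed_wage(e) for e in g}
e_star = max(profits, key=profits.get)   # effort maximizing E[pi | e] - v^{-1}(u_bar + g(e))
```

Here high effort is optimal because the extra expected profit (2.0) exceeds the extra wage needed to compensate the manager (2.25 vs. 1.44).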
4.2.1.1 A risk-averse owner Exercise 14.B.2 asks us to derive the FOC characterizing the optimal compensation scheme for a two-effort-level hidden-action (i.e. effort unobservable) model with a strictly risk-averse principal. Let $V(\cdot)$ denote the owner's utility function over wealth. As before, the owner chooses a take-it-or-leave-it contract to offer the manager. The optimal contract for the owner then solves the following problem:
\[ \max_{e \in \{e_L, e_H\},\, w(\pi)} \int V(\pi - w(\pi)) f(\pi \mid e)\, d\pi \]
s.t.
\[ \int v(w(\pi)) f(\pi \mid e)\, d\pi - g(e) \ge \bar u \quad \text{(Individual Rationality)} \]
\[ e \in \arg\max_{\tilde e} \int v(w(\pi)) f(\pi \mid \tilde e)\, d\pi - g(\tilde e) \quad \text{(Incentive Compatibility)} \]
Note that, for implementing $e_H$, the IC constraint can be written as
\[ \int v(w(\pi)) f(\pi \mid e_H)\, d\pi - g(e_H) \ge \int v(w(\pi)) f(\pi \mid e_L)\, d\pi - g(e_L) \]

We write the Lagrangian corresponding to a discrete version of the model and take the FOC:
\[ \mathcal{L} = \sum_i V(\pi_i - w_i) p_{i,e_H} + \gamma \left[ \sum_i v(w_i) p_{i,e_H} - g(e_H) - \bar u \right] + \mu \left[ \sum_i v(w_i) p_{i,e_H} - g(e_H) - \sum_i v(w_i) p_{i,e_L} + g(e_L) \right] \]
\[ \frac{\partial \mathcal{L}}{\partial w_i} = -p_{i,e_H} V'(\pi_i - w_i) + \gamma v'(w_i) p_{i,e_H} + \mu \left( v'(w_i) p_{i,e_H} - v'(w_i) p_{i,e_L} \right) = 0 \]
\[ \Longrightarrow \gamma + \mu \left[ 1 - \frac{p_{i,e_L}}{p_{i,e_H}} \right] = \frac{V'(\pi_i - w_i)}{v'(w_i)} \]
or, in the continuous version,
\[ \gamma + \mu \left[ 1 - \frac{f(\pi \mid e_L)}{f(\pi \mid e_H)} \right] = \frac{V'(\pi - w(\pi))}{v'(w(\pi))} \quad \text{(Borch rule)} \]
4.2.2 The Optimal Contract when Effort is Not Observable

When effort is not observable, the only way to get the manager to take the desired action is to relate his pay to the realization of profits, which we assumed was random. Nonobservability of effort can thus lead to inefficiencies.

4.2.2.1 A risk-neutral manager When the manager has no concern about risk bearing, the owner can still achieve the same outcome as when effort is observable. Suppose that the owner offers a compensation schedule of the form
\[ w(\pi) = \pi - \alpha \quad \text{(selling the project to the manager)} \]
If the manager accepts this contract, he chooses $e$ to maximize his expected utility
\[ \int w(\pi) f(\pi \mid e)\, d\pi - g(e) = \int \pi f(\pi \mid e)\, d\pi - \alpha - g(e) \]
But whatever $e$ maximizes the above expression is also the $e$ that maximizes the owner's objective given observable effort and a risk-neutral manager³:
\[ \int \pi f(\pi \mid e)\, d\pi - g(e) - \bar u \]
Thus, $w(\pi) = \pi - \alpha$ induces the first-best effort level $e^*$. The manager is willing to accept it as long as it gives him an expected utility of at least $\bar u$; that is, if
\[ \int \pi f(\pi \mid e^*)\, d\pi - \alpha - g(e^*) \ge \bar u \]
Let $\alpha^*$ be the level at which the above inequality holds with equality. Note that the owner's payoff given $w(\pi) = \pi - \alpha$ is $\alpha$. We rearrange:
\[ \int \pi f(\pi \mid e^*)\, d\pi - \alpha^* - g(e^*) = \bar u \]
\[ \Longrightarrow \alpha^* = \int \pi f(\pi \mid e^*)\, d\pi - g(e^*) - \bar u \]
But that is the maximized value of the owner's expected profit given a risk-neutral manager and observable effort. Both the owner and the manager get the same payoff as when effort is observable.

³The $\alpha$ and the $\bar u$ are both constants that drop out in the FOC with respect to $e$.
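The "selling the project" logic can be checked on a toy two-outcome version. In this sketch $\pi \in \{0, 10\}$ with $\Pr(\pi = 10 \mid e) = e$, $g(e) = 10e^2$, and $\bar u = 1$; all of these numbers are assumptions for illustration:

```python
# Selling the project to a risk-neutral manager: w(pi) = pi - alpha implements
# the first-best effort, since alpha is a constant in the manager's problem.
u_bar = 1.0
R = 10.0
g = lambda e: 10.0 * e * e
grid = [i / 1000 for i in range(1001)]   # effort grid on [0, 1]

# First best: effort maximizing total surplus E[pi | e] - g(e) = R*e - g(e)
e_first_best = max(grid, key=lambda e: R * e - g(e))

# Under w(pi) = pi - alpha the manager maximizes R*e - alpha - g(e);
# alpha shifts the objective by a constant, so the argmax is unchanged
# (alpha = 1.23 here is an arbitrary hypothetical value).
e_manager = max(grid, key=lambda e: R * e - 1.23 - g(e))
assert e_manager == e_first_best

# The owner extracts the whole surplus: alpha* = E[pi | e*] - g(e*) - u_bar
alpha_star = R * e_first_best - g(e_first_best) - u_bar
```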
4.2.2.2 A risk-averse manager Now, incentives for high effort can be provided only at the cost of having the manager face risk. The principal solves
\[ \max_{e \in \{e_L, e_H\},\, w(\pi)} \int \left( \pi - w(\pi) \right) f(\pi \mid e)\, d\pi \]
s.t.
\[ \int v(w(\pi)) f(\pi \mid e)\, d\pi - g(e) \ge \bar u \quad \text{(Individual Rationality)} \]
\[ e \in \arg\max_{\tilde e} \int v(w(\pi)) f(\pi \mid \tilde e)\, d\pi - g(\tilde e) \quad \text{(Incentive Compatibility)} \]
As before, note that the IC constraint for implementing $e_H$ can be written as
\[ \int v(w(\pi)) f(\pi \mid e_H)\, d\pi - g(e_H) \ge \int v(w(\pi)) f(\pi \mid e_L)\, d\pi - g(e_L) \]

We write the Lagrangian corresponding to a discrete version of the model and take the FOC:
\[ \mathcal{L} = \sum_i (\pi_i - w_i) p_{i,e_H} + \gamma \left[ \sum_i v(w_i) p_{i,e_H} - g(e_H) - \bar u \right] + \mu \left[ \sum_i v(w_i) p_{i,e_H} - g(e_H) - \sum_i v(w_i) p_{i,e_L} + g(e_L) \right] \]
\[ \frac{\partial \mathcal{L}}{\partial w_i} = -p_{i,e_H} + \gamma v'(w_i) p_{i,e_H} + \mu \left( v'(w_i) p_{i,e_H} - v'(w_i) p_{i,e_L} \right) = 0 \]
\[ \Longrightarrow \gamma + \mu \left[ 1 - \frac{p_{i,e_L}}{p_{i,e_H}} \right] = \frac{1}{v'(w_i)} \]
\[ \gamma + \mu \left[ 1 - \frac{f(\pi \mid e_L)}{f(\pi \mid e_H)} \right] = \frac{1}{v'(w(\pi))} \quad \text{(Borch rule)} \]
Note that we can draw comparisons between the first best and the second best. We let $\hat w$ be the fixed wage payment such that
\[ \frac{1}{v'(\hat w)} = \gamma \]
If $f(\pi \mid e_L)/f(\pi \mid e_H) < 1$, then, since $\mu > 0$,
\[ \frac{1}{v'(w(\pi))} = \gamma + \mu \left[ 1 - \frac{f(\pi \mid e_L)}{f(\pi \mid e_H)} \right] = \frac{1}{v'(\hat w)} + \underbrace{\mu \left[ 1 - \frac{f(\pi \mid e_L)}{f(\pi \mid e_H)} \right]}_{> 0} \]
\[ \Longrightarrow \frac{1}{v'(w(\pi))} > \frac{1}{v'(\hat w)} \Longrightarrow v'(w(\pi)) < v'(\hat w) \Longrightarrow w(\pi) > \hat w \quad (v''(\cdot) < 0) \]
By the same procedure, we conclude that if $f(\pi \mid e_L)/f(\pi \mid e_H) > 1$, then $w(\pi) < \hat w$: the manager is paid more at profit realizations that are relatively more likely under high effort, and less at realizations that are relatively more likely under low effort.
4.2.3 Comparing the first- and second-best wage contracts

Given the variability introduced into the manager's compensation in the second-best scenario, the expected value of the manager's wage payment must be strictly greater than the fixed wage payment in the first best. Why? The manager must be assured an expected utility level of $\bar u$ or else he will not participate. The owner subsequently compensates him, through a higher average wage payment, for any risk he bears.
\[ v\left( E[w(\pi) \mid e_H] \right) > E[v(w(\pi)) \mid e_H] \quad \text{($v$ strictly concave, Jensen's inequality)} \]
\[ = \int v(w(\pi)) f(\pi \mid e_H)\, d\pi = \bar u + g(e_H) \quad \text{(binding IR constraint)} \]
\[ = v(w_{e_H}^*) \]
\[ \Longrightarrow E[w(\pi) \mid e_H] > w_{e_H}^* \quad (v'(\cdot) > 0) \]
To reiterate, from the owner's perspective, if he wishes to induce the manager to perform $e_H$ when effort is unobservable, on average he must pay more than in the case where effort is observable. Nonobservability of effort can thus lead to an inefficiently low level of effort being implemented.
4.2.3.1 An additional demonstration of the implications of risk aversion This comes from Bolton and Dewatripont [2004]. Take a standard moral hazard problem where the principal considers offering the agent a lottery rather than a fixed payment $w_i$ if output $q_i$ is observed. Specifically, the agent would receive $w_{i,j}$ with probability $p_{i,j} \ge 0$, with
\[ j = 1, 2, \ldots, m \quad \text{and} \quad \sum_{j=1}^{m} p_{i,j} = 1 \]
Assume the agent's utility function is $u(w) - \psi(a)$. Show that such a randomizing incentive scheme cannot be optimal if the principal is risk neutral and the agent is strictly risk averse.

Proof. Fix an output $q_i$, $i \in \{L, H\}$, and consider for simplicity a two-point lottery with $Q \equiv p_{i,1}$. Let $\bar w_i$ be the certain payment that leaves the agent indifferent with the lottery: $v(\bar w_i) = Q v(w_{i,1}) + (1 - Q) v(w_{i,2})$. By strict concavity,
\[ v\left( Q w_{i,1} + (1 - Q) w_{i,2} \right) > Q v(w_{i,1}) + (1 - Q) v(w_{i,2}) = v(\bar w_i) \]
\[ \Longrightarrow Q w_{i,1} + (1 - Q) w_{i,2} > \bar w_i \]
Therefore, since there is another wage payment for a given output that makes the agent indifferent but at less cost to the principal, the lottery is not optimal. This leads us to the next section, which is in fact where I found out how to solve this problem.
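The inequality in the proof can be checked with concrete numbers. A sketch with $v(w) = \sqrt{w}$ and an illustrative 50/50 lottery:

```python
# Randomized pay is dominated: with v(w) = sqrt(w), the certain payment that
# leaves the agent indifferent costs less than the lottery's expected payout.
import math

Q, w1, w2 = 0.5, 4.0, 16.0          # illustrative two-point lottery
v = math.sqrt

expected_cost_lottery = Q * w1 + (1 - Q) * w2       # what the lottery costs the principal
agent_utility = Q * v(w1) + (1 - Q) * v(w2)         # agent's expected utility from the lottery
w_bar = agent_utility ** 2                          # certain payment with v(w_bar) = agent_utility
assert v(w_bar) == agent_utility
assert w_bar < expected_cost_lottery                # the certain payment is strictly cheaper
```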
4.2.4 The Value of Information

Suppose that another statistical signal of effort, $y$, is available to the owner in addition to the realization of profits (i.e. the outcome), and that the joint density of $\pi$ and $y$ given $e$ (the action / effort) is $f(\pi, y \mid e)$. In principle, the agent's wage could depend on both output and this additional signal. When should that be the case? That is, when should the optimal compensation function depend on $y$?

We can derive the Borch rule given that $w$ is a function of $\pi$ and $y$:
\[ \gamma + \mu \left[ 1 - \frac{f(\pi, y \mid e_L)}{f(\pi, y \mid e_H)} \right] = \frac{1}{v'(w(\pi, y))} \]
Suppose that $y$ is simply a noisy random variable that is unrelated to $e$ (cf. the lottery problem in Bolton and Dewatripont [2004]). In that case,
\[ f(\pi, y \mid e) = f_1(\pi \mid e) f_2(y) \]
But then
\[ \gamma + \mu \left[ 1 - \frac{f_1(\pi \mid e_L) f_2(y)}{f_1(\pi \mid e_H) f_2(y)} \right] = \gamma + \mu \left[ 1 - \frac{f_1(\pi \mid e_L)}{f_1(\pi \mid e_H)} \right] = \frac{1}{v'(w(\pi, y))} \]
Thus, the optimal compensation package is independent of $y$. Why? Suppose that the principal does offer a contract whose wage payments depend on $y$. But all $y$ does is induce additional uncertainty in the agent's wage that is unrelated to effort; that is, the agent bears additional risk for no reason. If the principal instead offers, for each outcome, the certain payment $\hat w(\pi)$ such that
\[ v(\hat w(\pi)) = E[v(w(\pi, y)) \mid \pi] \]
then the agent gets exactly the same expected utility under $\hat w(\pi)$ as under $w(\pi, y)$ for any level of effort. But the agent faces less risk under the certain payment-per-outcome contract $\hat w(\pi)$. Thus, the expected wage payments are lower and the principal is better off under that contract than under the one that depends on $y$:
\[ v\left( E[w(\pi, y) \mid \pi] \right) > E[v(w(\pi, y)) \mid \pi] = v(\hat w(\pi)) \]
\[ \Longrightarrow E[w(\pi, y) \mid \pi] > \hat w(\pi) \quad (v'(\cdot) > 0) \]
57
8/10/2019 Contract Theory, by Alex Young
58/86
4.3 Linear Contracts, Exponential Utility, and Normally Distributed Performance (LEN) [Bolton and Dewatripont, 2004]

Performance is assumed to be equal to effort plus noise:
\[ q = a + \varepsilon \]
where $\varepsilon$ is normally distributed with zero mean and variance $\sigma^2$. The principal is assumed to be risk neutral. The agent has CARA risk preferences represented by
\[ u(w, a) = -\exp\left( -\eta \left[ w - \psi(a) \right] \right) \]
where $w$ is the amount of monetary compensation and $\eta > 0$ is the agent's coefficient of absolute risk aversion ($\eta = -u''/u'$ from Arrow and Pratt). Effort cost is measured in monetary units. For simplicity, the cost-of-effort function is assumed to be quadratic:
\[ \psi(a) = \frac{1}{2} c a^2 \]
Suppose that the principal and agent can write only linear contracts of the form
\[ w = t + sq \]
where $t$ is the fixed compensation level and $s$ is the variable, performance-related component of compensation. The principal's problem is then to solve
\[ \max_{a, t, s} E(q - w) \]
subject to
\[ t + sa - \frac{1}{2} c a^2 - \frac{1}{2} \eta s^2 \sigma^2 = \bar w \quad \text{(IR, in certainty-equivalent form)} \]
and the agent's effort choice $a = s/c$ (the FOC of the agent's certainty equivalent with respect to $a$). Plugging in $t$ and $a$ and taking the FOC with respect to $s$, we have
\[ s = \frac{1}{1 + \eta c \sigma^2} \]
Effort and the variable compensation component go down as $c$, $\eta$, and $\sigma^2$ increase.
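The closed-form slope can be confirmed against a direct search over the principal's reduced objective. A sketch with illustrative parameter values:

```python
# LEN optimal slope s = 1/(1 + eta*c*sigma^2), verified by grid search over the
# principal's objective after substituting the binding IR and a = s/c.
eta, c, sigma2 = 2.0, 1.0, 0.5   # illustrative parameters

def principal_value(s):
    # E(q - w) = a - (t + s*a); substituting t from the binding IR and a = s/c
    # gives s/c - s^2/(2c) - eta*s^2*sigma2/2 (up to the constant w_bar)
    return s / c - s * s / (2 * c) - eta * s * s * sigma2 / 2

s_closed_form = 1 / (1 + eta * c * sigma2)
s_grid = max((i / 10000 for i in range(10001)), key=principal_value)
assert abs(s_grid - s_closed_form) < 1e-3
```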
4.3.1 Did you forget the properties of the log-normal distribution?

All you really need to remember, thanks to Wikipedia, is that if $x \sim N(\mu, \sigma^2)$, then $\exp(x)$ is distributed lognormally with mean
\[ \exp\left( E[x] + \frac{1}{2} \mathrm{Var}[x] \right) = \exp\left( \mu + \frac{1}{2} \sigma^2 \right) \]
4.4 Moral Hazard in Teams [Bolton and Dewatripont, 2004]

A risk-neutral principal has a contract with $n \ge 2$ agents, each of whom has the following utility function, separable in income and effort:
\[ u_i(w_i) - \psi_i(a_i) \]
\[ u_i'(\cdot) > 0, \quad u_i''(\cdot) \le 0, \quad \psi_i'(\cdot) > 0, \quad \psi_i''(\cdot) \ge 0 \]
The agents' actions produce outputs:
\[ q_{n \times 1} = (q_1, \ldots, q_n) \quad \text{with joint CDF } F(q \mid a) \]
where $a_{n \times 1} = (a_1, \ldots, a_n)$ is the vector of actions taken by the agents.
We consider two cases:

1. Moral hazard in a team: the output produced by the agents is a single aggregate, $Q$, with conditional distribution $F(Q \mid a)$.

2. Each agent produces an individual random output $q_i$, which may be imperfectly correlated with the other agents' outputs.

4.4.1 Unobservable Individual Outputs

$n \ge 2$ agents form a partnership to produce some deterministic (scalar) aggregate output
\[ Q = Q(a) \]
by supplying a vector of individual hidden actions $a$. We assume that the output function has the following properties:
\[ \frac{\partial Q}{\partial a_i} > 0 \quad \text{(strictly increasing)}, \qquad \frac{\partial^2 Q}{\partial a_i^2} \le 0 \quad \text{(concave)} \]
We suppose further that all agents are risk neutral:
\[ u_i(w) = w \]
A partnership is a vector of output-contingent compensations for each agent, $w(Q) = [w_1(Q), \ldots, w_n(Q)]$, such that
\[ \sum_{i=1}^{n} w_i(Q) = Q \quad \text{(budget balance)} \]
the higher effort of other agents. Thus, there is a free-riding problem.
We know that a risk-neutral individual agent would supply the first-best level of effort, given that all the other agents supply the first best, if he is compensated with the full marginal return of his effort. The level of effort that maximizes social welfare in the first best is defined by
\[ a^* = \arg\max_{a} \left\{ Q(a) - \sum_{i=1}^{n} \psi_i(a_i) \right\} \]
From the FOC, at the optimum we have
\[ \frac{\partial Q(a^*)}{\partial a_i} - \psi_i'(a_i^*) = 0 \quad \forall i \]
Fix an arbitrary partnership contract $w(Q)$. Each agent $i$ independently chooses $a_i$ to maximize his own utility given the other agents' actions, $a_{-i}$. Thus, agent $i$ solves
\[ \max_{a_i} \; w_i(Q(a_i, a_{-i})) - \psi_i(a_i) \]
From the FOC, we have
\[ \frac{d w_i(Q(a_i, a_{-i}))}{dQ} \cdot \frac{\partial Q(a_i, a_{-i})}{\partial a_i} - \psi_i'(a_i) = 0 \]
It seems that we may achieve a Nash equilibrium where all agents supply first-best effort levels if each agent gets a compensation contract $w_i[Q(a_i, a_{-i})]$ providing him with the full marginal return from his effort when all other agents also supply the first-best effort level:
\[ \frac{d w_i(Q(a_i, a_{-i}))}{dQ} = 1 \]
But that implies
\[ d w_i(Q(a_i, a_{-i})) = dQ \Longrightarrow w_i(Q(a_i, a_{-i})) = Q + k \]
Since
\[ \sum_{i=1}^{n} (Q + k) = nQ + nk > Q, \]
we cannot implement such a contract while satisfying budget balance. What can we do? Introduce a budget breaker into the organization. A third party agrees to sign a contract with all $n$ agents, offering to pay each of them $w_i(Q) = Q$. There then exists a Nash equilibrium where all agents supply their first-best action and where the budget breaker pays out $nQ(a^*)$ in equilibrium.

Proof. The proof is due to S. Viswanathan. If all other agents pick $a_{-i}^*$ and agent $i$ picks $a_i > a_i^*$, he gets nothing more even though he does produce more. Since action is costly, $\psi_i'(\cdot) > 0$, this is suboptimal. If instead agent $i$ picks $a_i < a_i^*$, then he gets nothing. (And neither does anyone else?) Therefore, $a^*$ is Nash.
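The two halves of the argument, free riding under budget balance and the impossibility of full marginal returns, can both be illustrated numerically. A sketch with $Q(a) = \sum_i a_i$ and $\psi_i(a) = a^2/2$ (an assumed functional form, chosen for a clean first-best of 1):

```python
# Free riding under budget balance: with Q(a) = sum(a_i), psi(a) = a^2/2, and
# equal shares w_i = Q/n, each agent's Nash effort is 1/n versus first-best 1.
n = 4
a_first_best = 1.0   # from dQ/da_i = psi'(a_i): 1 = a_i

# Agent i maximizes Q/n - a_i^2/2 given the others' efforts; FOC gives a_i = 1/n.
grid = [i / 1000 for i in range(2001)]
a_nash = max(grid, key=lambda a: a / n - a * a / 2)
assert abs(a_nash - 1 / n) < 1e-3
assert a_nash < a_first_best       # the team under-supplies effort

# And the "full marginal return" contract w_i = Q + k violates budget balance:
Q, k = 3.0, 0.0                    # any output level; n*(Q + k) > Q whenever n >= 2, Q > 0
assert n * (Q + k) > Q
```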
4.4.2 Observable Individual Outputs

Previously, the problem was one of eliciting cooperation / avoiding free riding. We now turn to the problem of harnessing competition between agents. When does an optimal linear incentive scheme for agent $i$ also take into account agent $j$'s performance? We consider the setting in which both agents have CARA risk preferences, normally distributed individual outputs, and incentive contracts linear in output.

We have two identical agents, each producing an individual output $q_i$ by supplying effort $a_i$:
\[ q_1 = a_1 + \varepsilon_1 + \nu \varepsilon_2 \]
\[ q_2 = a_2 + \varepsilon_2 + \nu \varepsilon_1 \]
\[ \varepsilon_1, \varepsilon_2 \sim N(0, \sigma^2) \quad \text{(independently and normally distributed)} \]
When $\nu \ne 0$, the outputs are correlated. Hence, it should be optimal to base an individual agent's compensation on both output realizations, as both provide information about an individual agent's action choice.

The principal is risk neutral, and both agents have CARA risk preferences as follows:
\[ u(w, a) = -\exp\left( -\eta \left[ w - \psi(a) \right] \right) \]
where $a$ is the agent's effort, $\eta$ is the coefficient of absolute risk aversion, and $\psi(a) = \frac{1}{2} c a^2$.
We restrict attention to linear incentive contracts of the form
\[ w_1 = z_1 + v_1 q_1 + u_1 q_2 \]
\[ w_2 = z_2 + v_2 q_2 + u_2 q_1 \]
The principal's problem is symmetric. Therefore, we illustrate his maximization problem in relation to agent 1:
\[ \max_{a_1, z_1, v_1, u_1} E(q_1 - w_1) \]
\[ \text{s.t.} \quad E\left[ -\exp\left\{ -\eta \left[ w_1 - \psi(a) \right] \right\} \right] \ge u(\bar w) \quad \text{(IR)} \]
\[ a_1 \in \arg\max_{a} E\left[ -\exp\left\{ -\eta \left[ w_1 - \psi(a) \right] \right\} \right] \quad \text{(IC)} \]
We can rewrite the agent's objective, using the mean of the lognormal distribution recalled above:
\begin{align*}
E\left[ -\exp\left\{ -\eta \left[ w_1 - \psi(a) \right] \right\} \right] &= E\left[ -\exp\left\{ -\eta \left[ z_1 + v_1(a_1 + \varepsilon_1 + \nu\varepsilon_2) + u_1(a_2 + \varepsilon_2 + \nu\varepsilon_1) - \psi(a_1) \right] \right\} \right] \\
&= -\exp\left\{ -\eta \left[ z_1 + v_1 a_1 + u_1 a_2 - \frac{1}{2} c a_1^2 \right] + \frac{1}{2} \eta^2 \mathrm{Var}\left[ v_1\varepsilon_1 + v_1\nu\varepsilon_2 + u_1\varepsilon_2 + u_1\nu\varepsilon_1 \right] \right\} \\
&= -\exp\left\{ -\eta \left[ z_1 + v_1 a_1 + u_1 a_2 - \frac{1}{2} c a_1^2 \right] + \frac{1}{2} \eta^2 \mathrm{Var}\left[ (v_1 + \nu u_1)\varepsilon_1 + (u_1 + \nu v_1)\varepsilon_2 \right] \right\} \\
&= -\exp\left\{ -\eta \left[ z_1 + v_1 a_1 + u_1 a_2 - \frac{1}{2} c a_1^2 - \frac{1}{2} \eta \sigma^2 \left( (v_1 + \nu u_1)^2 + (u_1 + \nu v_1)^2 \right) \right] \right\}
\end{align*}
Solving the above is equivalent to solving⁵
\[ \max_{a_1} \; z_1 + v_1 a_1 + u_1 a_2 - \frac{1}{2} c a_1^2 - \frac{1}{2} \eta \sigma^2 \left[ (v_1 + \nu u_1)^2 + (u_1 + \nu v_1)^2 \right] \]
From the FOC, we have
\[ v_1 - c a_1 = 0 \Longrightarrow a_1 = \frac{v_1}{c} \]
We plug this back in (and use $a_2 = v_2/c$ by symmetry):
\begin{align*}
& z_1 + v_1 \frac{v_1}{c} + u_1 \frac{v_2}{c} - \frac{1}{2} c \left( \frac{v_1}{c} \right)^2 - \frac{1}{2} \eta \sigma^2 \left[ (v_1 + \nu u_1)^2 + (u_1 + \nu v_1)^2 \right] \\
&= z_1 + \frac{1}{2} \frac{v_1^2}{c} + \frac{u_1 v_2}{c} - \frac{1}{2} \eta \sigma^2 \left[ (v_1 + \nu u_1)^2 + (u_1 + \nu v_1)^2 \right] = \bar w \quad \text{(certainty equivalent of the reservation utility)}
\end{align*}
The principal's problem can now be written as
\begin{align*}
\max_{a_1, z_1, v_1, u_1} E(q_1 - w_1) &= \max_{z_1, v_1, u_1} E\left( a_1 + \varepsilon_1 + \nu\varepsilon_2 - [z_1 + v_1 q_1 + u_1 q_2] \right) \\
&= \max_{z_1, v_1, u_1} \left\{ \frac{v_1}{c} - \left[ z_1 + \frac{v_1^2}{c} + \frac{u_1 v_2}{c} \right] \right\}
\end{align*}
s.t.
\[ z_1 + \frac{1}{2} \frac{v_1^2}{c} + \frac{u_1 v_2}{c} - \frac{1}{2} \eta \sigma^2 \left[ (v_1 + \nu u_1)^2 + (u_1 + \nu v_1)^2 \right] = \bar w \]
⁵Draw the graph of $-\exp(-x)$ if you ever forget / are too drunk to remember.
Since the constraint binds, we can substitute $z_1$ out and rewrite it as an unconstrained problem (dropping the constant $\bar w$):
\begin{align*}
&\max_{v_1, u_1} \; \frac{v_1}{c} - \frac{v_1^2}{c} - \frac{u_1 v_2}{c} + \frac{1}{2} \frac{v_1^2}{c} + \frac{u_1 v_2}{c} - \frac{1}{2} \eta \sigma^2 \left[ (v_1 + \nu u_1)^2 + (u_1 + \nu v_1)^2 \right] \\
&= \max_{v_1, u_1} \; \frac{v_1}{c} - \frac{1}{2} \frac{v_1^2}{c} - \frac{1}{2} \eta \sigma^2 \left[ (v_1 + \nu u_1)^2 + (u_1 + \nu v_1)^2 \right]
\end{align*}
We solve this sequentially:

1. For a given $v_1$, $u_1$ is determined to minimize risk.

2. $v_1$ is then set optimally to trade off risk sharing and incentives.
We minimize the variance with respect to $u_1$:
\[ \frac{\partial}{\partial u_1} \left\{ \frac{1}{2} \eta \sigma^2 \left[ (v_1 + \nu u_1)^2 + (u_1 + \nu v_1)^2 \right] \right\} = \frac{1}{2} \eta \sigma^2 \left[ 2\nu (v_1 + \nu u_1) + 2 (u_1 + \nu v_1) \right] = 0 \]
\[ \Longrightarrow u_1 = -\frac{2\nu}{\nu^2 + 1} v_1 \]
We postpone the interpretation of $u_1$. For now, we substitute $u_1$ in and take the FOC with respect to $v_1$:
\begin{align*}
&\max_{v_1} \; \frac{v_1}{c} - \frac{1}{2} \frac{v_1^2}{c} - \frac{1}{2} \eta \sigma^2 \left[ \left( v_1 - \frac{2\nu^2}{\nu^2 + 1} v_1 \right)^2 + \left( \nu v_1 - \frac{2\nu}{\nu^2 + 1} v_1 \right)^2 \right] \\
&= \max_{v_1} \; \frac{v_1}{c} - \frac{1}{2} \frac{v_1^2}{c} - \frac{1}{2} \eta \sigma^2 v_1^2 \frac{(1 - \nu^2)^2}{(1 + \nu^2)^2} \left( 1 + \nu^2 \right) \\
&= \max_{v_1} \; \frac{v_1}{c} - \frac{1}{2} \frac{v_1^2}{c} - \frac{1}{2} \eta \sigma^2 v_1^2 \frac{(1 - \nu^2)^2}{1 + \nu^2}
\end{align*}
We get
\[ \frac{1}{c} - \frac{v_1}{c} - \eta \sigma^2 \frac{(1 - \nu^2)^2}{1 + \nu^2} v_1 = 0 \]
\[ \Longrightarrow v_1 = \frac{1}{1 + c \eta \sigma^2 \dfrac{(1 - \nu^2)^2}{1 + \nu^2}} = \frac{1 + \nu^2}{1 + \nu^2 + c \eta \sigma^2 (1 - \nu^2)^2} > 0 \quad (-1 \le \nu \le 1) \]
When $\nu = 0$, $v_1$ reduces to
\[ \frac{1}{1 + c \eta \sigma^2}, \]
the expression for the variable wage component in the LEN case. When $\nu \ne 0$, the correlation in the two agents' outputs can reduce the overall risk exposure of any individual agent and thus enable the provision of stronger incentives to both agents. To see this, consider the extremes where $\nu = \pm 1$: $v_1 = 1$. That is, by filtering out a common shock, the optimal incentive scheme can almost eliminate each agent's exposure to risk and approximate first-best incentives by letting $v_1$ tend to 1 (i.e. full marginal return). Equilibrium effort is higher because of lower risk exposure.
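The limit cases just described can be checked directly from the closed forms for $v_1$ and $u_1$. A sketch with illustrative parameters:

```python
# Relative-performance weights: u1 = -2*nu/(1+nu^2) * v1 and
# v1 = (1+nu^2) / (1+nu^2 + c*eta*sigma2*(1-nu^2)^2). Parameters illustrative.
eta, c, sigma2 = 2.0, 1.0, 0.5

def v1(nu):
    return (1 + nu * nu) / (1 + nu * nu + c * eta * sigma2 * (1 - nu * nu) ** 2)

def u1(nu):
    return -2 * nu / (nu * nu + 1) * v1(nu)

assert abs(v1(0.0) - 1 / (1 + eta * c * sigma2)) < 1e-12  # LEN slope when nu = 0
assert abs(v1(1.0) - 1.0) < 1e-12   # perfectly correlated shock: full marginal return
assert u1(0.0) == 0.0               # no relative-performance component without correlation
assert u1(1.0) < 0                  # with nu > 0, penalize the rival's good luck
```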
4.5 Combining Moral Hazard and Adverse Selection

Consider the problem of a risk-neutral seller of a firm interacting with a risk-neutral buyer. The buyer can generate an uncertain revenue stream by running the firm: $X \in \{0, R\}$. The buyer may be more or less able at running the firm, and ability translates into a higher or lower probability of getting the outcome $R > 0$. In addition, the buyer can raise the probability of getting $R$ by working harder.

Let $\theta$ denote ability, and suppose that $\theta \in \{\theta_L, \theta_H\}$ with $\theta_L < \theta_H$. The seller does not know the buyer's type, and his prior is that the buyer has ability $\theta_H$ ($\theta_L$) with probability $\beta$ ($1 - \beta$). $e$ denotes effort, and a buyer with ability $\theta$ who supplies effort $e$ generates $R$ with probability $\theta e$ at a private cost of
\[ \psi(e) \equiv \frac{1}{2} c e^2 \]
To summarize, the setup is a model with

1. two states (or outcomes),

2. two types, and

3. a continuum of actions.

The seller's problem is to offer a menu of contracts $(t_i, r_i)$ to maximize his expected return from the sale, where $t_i$ denotes an up-front cash payment and $r_i$ a repayment to be paid from future revenues generated by the buyer. We assume limited liability: $X \ge r_i \ge 0$.⁶

The buyer's payoff under such a contract takes the form
\[ \theta_i e_i (R - r_i) - t_i - \frac{1}{2} c e_i^2 \]
The FOC gives the choice of effort that maximizes his payoff:
\[ \theta_i (R - r_i) - c e_i = 0 \Longrightarrow e_i = \frac{\theta_i (R - r_i)}{c} \]
Bolton and Dewatripont [2004] note that the optimal $e_i$ is independent of $t_i$ and decreasing in $r_i$.⁷ Plugging that in to the payoff produces the maximized payoff:
\[ \theta_i \frac{\theta_i (R - r_i)}{c} (R - r_i) - t_i - \frac{1}{2} c \left( \frac{\theta_i (R - r_i)}{c} \right)^2 = \frac{1}{2} \frac{\theta_i^2 (R - r_i)^2}{c} - t_i \]

⁶That is, the repayment cannot exceed the realization of revenue.
The seller solves
\begin{align*}
\max_{(t_i, r_i)} \; & \beta \left[ t_H + \theta_H e_H^* r_H \right] + (1 - \beta) \left[ t_L + \theta_L e_L^* r_L \right] \\
= \max_{(t_i, r_i)} \; & \beta \left[ t_H + \frac{\theta_H^2 (R - r_H)}{c} r_H \right] + (1 - \beta) \left[ t_L + \frac{\theta_L^2 (R - r_L)}{c} r_L \right]
\end{align*}
subject to
\[ \frac{1}{2} \frac{\theta_i^2 (R - r_i)^2}{c} - t_i \ge \frac{1}{2} \frac{\theta_i^2 (R - r_j)^2}{c} - t_j \quad \text{(IC)} \]
\[ \frac{1}{2} \frac{\theta_i^2 (R - r_i)^2}{c} - t_i \ge 0 \quad \text{(IR)} \]
4.5.1 Optimal Contract, Moral Hazard Only
When there is no adverse selection problem (that is, the seller can observe the buyer's ability / type), he can offer ability-specific contracts:
$$\max_{\{t_i, r_i\}} \; t_i + \frac{\theta_i^2 (R - r_i) r_i}{c}$$
subject to
$$\frac{1}{2c} \left[ \theta_i (R - r_i) \right]^2 - t_i \geq 0 \quad \text{(IR)}$$
⁷It's obvious, but think of an extreme case when $r_i = R$. In that case, the seller demands everything. Would the buyer work in such a case?
Since the participation constraint binds, the seller solves
$$\max_{\{r_i\}} \; \frac{1}{2c} \left[ \theta_i (R - r_i) \right]^2 + \frac{\theta_i^2 (R - r_i) r_i}{c}$$
We take the FOC:
$$\frac{1}{c} \left[ \theta_i (R - r_i) \right] (-\theta_i) + \frac{1}{c} \theta_i^2 \left[ (R - r_i) - r_i \right] = 0$$
$$-R + r_i + R - 2 r_i = 0$$
$$-r_i = 0$$
$$r_i^* = 0$$
This implies that
$$t_i^* = \frac{1}{2c} \left[ \theta_i (R - r_i^*) \right]^2 = \frac{1}{2c} \left[ \theta_i R \right]^2$$
That is, the seller sells the firm for an upfront cash payment and does not maintain any additional financial participation / future ownership.
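A grid search confirms the corner $r_i^* = 0$. The parameter values below are illustrative assumptions, not from the text.

```python
# After substituting the binding IR, the seller's value from a type-i
# contract is [theta*(R - r)]^2/(2c) + theta^2*(R - r)*r/c, which
# simplifies to theta^2*(R^2 - r^2)/(2c) and so peaks at r = 0.
# theta, R, c below are illustrative assumptions.
theta, R, c = 0.8, 10.0, 4.0

def seller_value(r):
    t = (theta * (R - r)) ** 2 / (2 * c)      # binding IR pins down t
    return t + theta ** 2 * (R - r) * r / c

r_grid = [R * i / 1000 for i in range(1001)]  # r in [0, R]
r_best = max(r_grid, key=seller_value)
print(r_best, seller_value(r_best))           # value at r = 0 is theta^2 R^2 / (2c)
```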
4.5.2 Optimal Contract, Adverse Selection Only
We now suppose that the buyer's effort level is fixed at some level $\bar{e}$ but that the seller cannot observe the buyer's ability.
$$\max_{\{t_i, r_i\}} \; \beta (t_H + \theta_H \bar{e} r_H) + (1 - \beta)(t_L + \theta_L \bar{e} r_L)$$
subject to
$$\theta_i \bar{e} (R - r_i) - t_i \geq \theta_i \bar{e} (R - r_j) - t_j \quad \text{(IC)}$$
$$\theta_i \bar{e} (R - r_i) - t_i - \frac{1}{2} c \bar{e}^2 \geq 0 \quad \text{(IR)}$$
Bolton and Dewatripont [2004] claim that the problem has a simple solution:
$$r_i^* = R, \qquad t_i^* = -\frac{1}{2} c \bar{e}^2$$
And indeed it does.⁸ How? Let's follow the algorithm from before.
1. A less-productive agent always has to be paid more than a more-productive agent. Hence, the more-productive agent has the incentive to deviate / lie.
2. Thus, IR-L and IC-H bind.
3. We solve for $t_L$ and $t_H$:
$$\theta_L \bar{e} (R - r_L) - t_L - \frac{1}{2} c \bar{e}^2 = 0 \quad \text{(IR-L)} \implies t_L = \theta_L \bar{e} (R - r_L) - \frac{1}{2} c \bar{e}^2$$
$$\theta_H \bar{e} (R - r_H) - t_H = \theta_H \bar{e} (R - r_L) - t_L \quad \text{(IC-H)} \implies t_H = \theta_H \bar{e} (R - r_H) - \theta_H \bar{e} (R - r_L) + \theta_L \bar{e} (R - r_L) - \frac{1}{2} c \bar{e}^2$$
4. We plug these in to the maximization problem to simplify it:
$$\max_{\{r_i\}} \; \beta \left[ \theta_H \bar{e} (R - r_H) - \theta_H \bar{e} (R - r_L) + \theta_L \bar{e} (R - r_L) - \frac{1}{2} c \bar{e}^2 + \theta_H \bar{e} r_H \right] + (1 - \beta) \left[ \theta_L \bar{e} (R - r_L) - \frac{1}{2} c \bar{e}^2 + \theta_L \bar{e} r_L \right]$$
⁸I thank Zeqiong Huang for alerting me to the corner solution.
5. We take the FOC with respect to $r_H$:
$$\beta \left( -\theta_H \bar{e} + \theta_H \bar{e} \right) = 0$$
With IC-H binding, whatever the seller collects through $r_H$ he hands back through a lower $t_H$, so the objective is flat in $r_H$.
6. We do the same with respect to $r_L$:
$$\beta \left( \theta_H \bar{e} - \theta_L \bar{e} \right) = \beta (\theta_H - \theta_L) \bar{e} > 0$$
7. What does it mean when the FOC with respect to $r_L$ is strictly positive while the objective does not depend on $r_H$? It means that we have a corner solution: keep increasing $r_L$ until it can't be increased any more, which happens when $r_L = R$; and since the seller is indifferent over $r_H$, setting $r_H = R$ as well (so that both types receive the same contract) is also optimal.
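The two FOC signs can be verified numerically: after substituting the binding IR-L and IC-H, the objective is flat in $r_H$ and strictly increasing in $r_L$. A minimal check, where every parameter value is an illustrative assumption:

```python
# Substituted seller objective in the adverse-selection-only case.
# beta, thH, thL, R, c, ebar are illustrative assumptions.
beta, thH, thL, R, c, ebar = 0.5, 0.9, 0.6, 10.0, 4.0, 1.0

def obj(rH, rL):
    tL = thL * ebar * (R - rL) - 0.5 * c * ebar ** 2        # IR-L binds
    tH = (thH * ebar * (R - rH) - thH * ebar * (R - rL)
          + thL * ebar * (R - rL) - 0.5 * c * ebar ** 2)    # IC-H binds
    return beta * (tH + thH * ebar * rH) + (1 - beta) * (tL + thL * ebar * rL)

flat_in_rH = abs(obj(0.0, 3.0) - obj(7.0, 3.0)) < 1e-9      # objective flat in rH
rising_in_rL = obj(5.0, 9.0) > obj(5.0, 1.0)                # strictly increasing in rL
print(flat_in_rH, rising_in_rL, obj(R, R))                  # corner value at r_i = R
```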
Intuitively, the solutions to the two subcases have different interpretations. When there is moral hazard but no adverse selection, the seller literally sells the firm to the buyer. He accepts an upfront payment and maintains no additional ownership. This is reminiscent of how moral hazard can be avoided, even when effort is unobservable and can only be imperfectly deduced from output, so long as the agent is risk-neutral, for then the principal can simply sell the project to the agent.
When there is adverse selection but no moral hazard, the seller does not actually sell anything to the buyer. Rather, he employs the buyer. In exchange for an upfront payment (note that $t_i^* = -\frac{1}{2} c \bar{e}^2 < 0$, so cash flows from the seller to the buyer, compensating him exactly for his effort cost), the buyer hands all future revenue back to the seller.
4.5.3 Optimal Contract, Moral Hazard and Adverse Selection
The problem becomes, unsurprisingly, a combination of the two subcases we have considered, and it is none other than the problem we set up at the beginning:
$$\max_{(t_i, r_i)} \; \beta \left[ t_H + \frac{\theta_H^2 (R - r_H) r_H}{c} \right] + (1 - \beta) \left[ t_L + \frac{\theta_L^2 (R - r_L) r_L}{c} \right]$$
subject to
$$\frac{1}{2} \frac{\theta_i^2 (R - r_i)^2}{c} - t_i \geq \frac{1}{2} \frac{\theta_i^2 (R - r_j)^2}{c} - t_j \quad \text{(IC)}$$
$$\frac{1}{2} \frac{\theta_i^2 (R - r_i)^2}{c} - t_i \geq 0 \quad \text{(IR)}$$
We first comment on why the previous solutions are not optimal when both problems exist. Recall that when there is no adverse selection but there is moral hazard, there are no future repayments, but there is a fixed upfront cost that is type-specific:
$$t_H^* = \frac{\theta_H^2 R^2}{2c} > \frac{\theta_L^2 R^2}{2c} = t_L^*$$
Given that $r_H^* = r_L^* = 0$, if the seller offered the moral-hazard-only menu in a dual-problem setup, the high type would never choose $(r_H, t_H)$. Why should he? If he chooses $(r_L, t_L)$, he can pay less for the same thing.
Now, recall that when there is no moral hazard but there is adverse selection, the seller pays the buyer upfront, and the buyer in turn pays the seller everything in the future. But the upfront payment $t_i^*$ we derived is the same for both types:
$$t_H^* = t_L^* = -\frac{c \bar{e}^2}{2}$$
Following the same algorithm as before, IR-L and IC-H bind, so we substitute
$$t_L = \frac{\theta_L^2 (R - r_L)^2}{2c}, \qquad t_H = \frac{\theta_H^2 (R - r_H)^2}{2c} - \frac{\theta_H^2 (R - r_L)^2}{2c} + \frac{\theta_L^2 (R - r_L)^2}{2c}$$
into the seller's objective. We take the FOC with respect to $r_H$:
$$\frac{1}{2} \frac{1}{c} \theta_H^2 \cdot 2 (R - r_H)(-1) + \frac{1}{c} \theta_H^2 (R - 2 r_H) = 0$$
$$-(R - r_H) + R - 2 r_H = 0$$
$$-r_H = 0$$
$$r_H^S = 0$$
Just as in the second-best adverse selection case, we have the so-called "efficiency at the top." We take the FOC with respect to $r_L$:
$$\beta \left[ \frac{1}{2} \frac{1}{c} \theta_H^2 \cdot 2 (R - r_L) - \frac{1}{2} \frac{1}{c} \theta_L^2 \cdot 2 (R - r_L) \right] + (1 - \beta) \left[ -\frac{1}{2} \frac{1}{c} \theta_L^2 \cdot 2 (R - r_L) + \frac{1}{c} \theta_L^2 (R - 2 r_L) \right] = 0$$
Multiplying through by $c$:
$$\beta \left( \theta_H^2 (R - r_L) - \theta_L^2 (R - r_L) \right) + (1 - \beta) \left( \theta_L^2 (R - 2 r_L) - \theta_L^2 (R - r_L) \right) = 0$$
$$\beta (\theta_H^2 - \theta_L^2)(R - r_L) + (1 - \beta)(R - 2 r_L - R + r_L) \theta_L^2 = 0$$
$$\beta (\theta_H^2 - \theta_L^2) R - \beta (\theta_H^2 - \theta_L^2) r_L - (1 - \beta) r_L \theta_L^2 = 0$$
$$\beta (\theta_H^2 - \theta_L^2) R = r_L \left[ \beta (\theta_H^2 - \theta_L^2) + (1 - \beta) \theta_L^2 \right]$$
$$r_L^S = \frac{\beta (\theta_H^2 - \theta_L^2)}{\beta (\theta_H^2 - \theta_L^2) + (1 - \beta) \theta_L^2} R > 0$$
From that we can derive $t_L^S$, where the $S$ denotes second-best:
$$t_L^S = \frac{\left[ \theta_L (R - r_L^S) \right]^2}{2c} < \frac{\left[ \theta_L R \right]^2}{2c} = t_L^F$$
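A two-dimensional grid search over $(r_H, r_L)$, with IR-L and IC-H substituted in, reproduces $r_H^S = 0$ and the closed form for $r_L^S$. All parameter values below are illustrative assumptions, not from the text.

```python
# Grid check of the combined moral-hazard-plus-adverse-selection solution.
# beta, thH, thL, R, c are illustrative assumptions.
beta, thH, thL, R, c = 0.5, 0.9, 0.6, 10.0, 4.0

def W(rH, rL):
    tL = (thL * (R - rL)) ** 2 / (2 * c)                 # IR-L binds
    tH = ((thH * (R - rH)) ** 2 - (thH * (R - rL)) ** 2
          + (thL * (R - rL)) ** 2) / (2 * c)             # IC-H binds
    return (beta * (tH + thH ** 2 * (R - rH) * rH / c)
            + (1 - beta) * (tL + thL ** 2 * (R - rL) * rL / c))

grid = [R * i / 500 for i in range(501)]                 # step 0.02 on [0, R]
rH_best, rL_best = max(((a, b) for a in grid for b in grid), key=lambda p: W(*p))

d = thH ** 2 - thL ** 2
rL_formula = beta * d * R / (beta * d + (1 - beta) * thL ** 2)
print(rH_best, rL_best, rL_formula)
```

The grid maximizer sits at $r_H = 0$ and within one grid step of the closed-form $r_L^S$.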
transferred from the owner to the manager, who consumes cash above what he puts into production. The manager's utility for cash transferred, $y$, and production requirement, $x$, with cost per dollar $c$, is therefore
$$U(y, x; c) = y - cx$$
The owner designs a menu of contracts $\{x(c), y(c)\}$, $c \in C$, such that the manager selects $(x(c), y(c))$ when he knows the per-dollar cost is $c$. The owner's objective in choosing among alternative budgets is to maximize expected profits:
$$\int_0^1 \left[ x(c) - y(c) \right] f(c) \, dc$$
There are five constraints on the owner's choice of schemes.
1. The manager must receive an expected utility at least as high as his next-best alternative, with the manager's opportunity cost of participation denoted $\bar{U}$ and normalized to zero.
2. The contract must respect the manager's lack of cash.
3. The contract must induce a manager who knows the cost is $c$ to select $(x(c), y(c))$.
4. To ensure a solution exists, we must assume the production of cash flows is bounded.
5. Cash produced must not be negative.
Therefore, the owner's problem is
$$\max_{(x(c), y(c))} \; \int_0^1 \left[ x(c) - y(c) \right] f(c) \, dc$$
subject to
$$\int_0^1 \left[ y(c) - c x(c) \right] f(c) \, dc \geq 0 \quad \text{(IR-1)}$$
$$y(c) - c x(c) \geq 0 \quad \forall c \in C \quad \text{(IR-2)}$$
$$y(c) - c x(c) \geq y(c') - c x(c') \quad \forall c, c' \in C \quad \text{(IC)}$$
$$x(c) \leq x^{\max} \quad \forall c \in C \quad \text{(Existence)}$$
$$x(c) \geq 0 \quad \forall c \in C \quad \text{(Nonnegativity)}$$
5.1.2 Solution of Benchmark Case
Antle and Eppen [1985] showed that the optimal amounts of production and resources transferred are given by a simple hurdle strategy: if the reported cost is above a hurdle cost $c^*$, nothing is produced and no resources are given; if the reported cost is at or below $c^*$, $x^{\max}$ is produced and $c^* x^{\max}$ is given.
5.1.2.1 How? If the problem is simplified from a continuum of types to two, following the algorithm outlined above, the owner's problem is equivalent to
$$\max_x \; \left( 1 - c - \frac{F(c)}{f(c)} \right) x$$
Hence, by inspection [Antle and Eppen, 1985],
$$x^*(c) = \begin{cases} x^{\max} & \text{if } 1 - c - \dfrac{F(c)}{f(c)} > 0 \\ 0 & \text{otherwise} \end{cases}$$
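For a concrete illustration, suppose $C$ is uniform on $[0, 1]$, so $F(c) = c$ and $f(c) = 1$; the virtual surplus $1 - c - F(c)/f(c) = 1 - 2c$ changes sign at the hurdle $c^* = 1/2$. A sketch under this assumed distribution:

```python
# Hurdle rule under an assumed uniform cost distribution on [0, 1]:
# F(c) = c, f(c) = 1, so the virtual surplus is 1 - c - c = 1 - 2c.
xmax = 100.0

def x_star(c):
    virtual_surplus = 1 - c - c / 1.0   # 1 - c - F(c)/f(c)
    return xmax if virtual_surplus > 0 else 0.0

print([x_star(c) for c in (0.3, 0.49, 0.51, 0.9)])  # full production only below c* = 1/2
```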
The solution involves a trade-off between productive efficiency and distributional consequences. Lowering (raising) the hurdle cost gives up (increases) valuable production. But it allows the owner to capture (forces him to give up) more of the surplus, in the form of reduced (excess) resources transferred to the manager.
The optimal policy balances these factors. Consider a candidate for the critical hurdle cost, $c_k$. If the hurdle is lowered to $c_{k-1}$, production with an expected gross revenue to the owner of $p_k x^{\max}$ is lost. But the expected resources allocated decline:
$$\sum_{i=1}^{k} p_i c_k x^{\max} = \sum_{i=1}^{k-1} p_i c_k x^{\max} + p_k c_k x^{\max} \quad \text{(Original)}$$
$$\sum_{i=1}^{k-1} p_i c_{k-1} x^{\max} \quad \text{(New)}$$
$$\sum_{i=1}^{k-1} p_i (c_k - c_{k-1}) x^{\max} + p_k c_k x^{\max} \quad \text{(Difference)}$$
Likewise, if the hurdle is raised to $c_{k+1}$, production with an expected gross revenue to the owner of $p_{k+1} x^{\max}$ is gained. But the expected resources allocated increase:
$$\sum_{i=1}^{k+1} p_i c_{k+1} x^{\max} = \sum_{i=1}^{k} p_i c_{k+1} x^{\max} + p_{k+1} c_{k+1} x^{\max} \quad \text{(New)}$$
$$\sum_{i=1}^{k} p_i c_k x^{\max} \quad \text{(Original)}$$
$$\sum_{i=1}^{k} p_i (c_{k+1} - c_k) x^{\max} + p_{k+1} c_{k+1} x^{\max} \quad \text{(Difference)}$$
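The "Difference" identities above are simple accounting; a small discrete example (the costs and probabilities below are illustrative assumptions) checks the lowering case directly:

```python
# Arithmetic check of the hurdle-shift comparison on a discrete example.
# costs, probs, xmax are illustrative assumptions.
costs = [0.2, 0.4, 0.6, 0.8]          # c_1 < c_2 < c_3 < c_4
probs = [0.25, 0.25, 0.25, 0.25]
xmax = 100.0

def expected_transfer(k):
    # hurdle at c_k: every type i <= k produces xmax and receives c_k * xmax
    return sum(probs[i] for i in range(k)) * costs[k - 1] * xmax

k = 3                                  # candidate hurdle c_3, lowered to c_2
direct = expected_transfer(k) - expected_transfer(k - 1)
formula = (sum(probs[i] for i in range(k - 1)) * (costs[k - 1] - costs[k - 2]) * xmax
           + probs[k - 1] * costs[k - 1] * xmax)
print(direct, formula)
```

The raising case checks out the same way by symmetry.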
5.2 Prendergast [1993]
5.2.1 Simplified Representation
The following is a simplification of Prendergast [1993] due to Qi Chen. A risk-neutral manager-owner of a firm chooses a binary decision $x \in \{0, 1\}$ to maximize profits. Choosing the correct decision results in an additional profit of $\Pi$ to the firm. We let $\theta \in \{0, 1\}$ denote the optimal decision, and the ex ante prior places equal probabilities on each decision being optimal.
The manager privately observes a signal $s_m$ regarding the true state, where
$$\Pr[s_m = \theta] = \gamma_m > \frac{1}{2}$$
The manager also employs a risk-neutral worker who privately observes another independent signal $s_w$ regarding the true state, where
$$\Pr[s_w = \theta] = \gamma_w > \frac{1}{2}$$
In addition, the worker privately observes a signal $\eta$ as to what the manager has observed, where
$$\Pr[\eta = s_m] = \gamma > \frac{1}{2}$$
Thus, the worker has information about the manager's observation, which is indirectly informative about the true state, and direct information about the true state. Note that information about the manager's observation can easily be motivated as some guess about what the manager's tastes / preferences are.
The manager offers wage contracts of the following form: the worker is paid a fixed wage (normalized to zero) plus a non-negative bonus $b$ if his reported optimal decision $\hat{x}_w$ matches the manager's signal $s_m$, and zero otherwise.
If the worker exerts effort at cost $c$, then $\gamma_w = \bar{\gamma}_w$; and if he does not, then $\gamma_w = \underline{\gamma}_w < \bar{\gamma}_w$. In addition, we assume that $\bar{\gamma}_w > \underline{\gamma}_w > \gamma_m$, such that in a first-best world, the manager would always ignore his own information and choose $x = s_w$ regardless of the worker's effort.
We assume that the worker's effort is socially efficient:
$$(\bar{\gamma}_w - \underline{\gamma}_w) \Pi > c$$
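To see how effort raises the chance of collecting the bonus, note that a truthful worker's report matches the manager's signal whenever both signals are right or both are wrong. A quick numerical illustration, where all accuracy values are assumptions, not from the text:

```python
# Probability that the worker's truthful report of his own signal matches
# the manager's signal, given conditionally independent signals and
# equally likely states: match = both correct or both incorrect.
# gbar (accuracy with effort), gunder (without), gm are illustrative assumptions.
gbar, gunder, gm = 0.9, 0.7, 0.6

def match_prob(gw):
    return gw * gm + (1 - gw) * (1 - gm)

print(match_prob(gbar), match_prob(gunder))  # effort raises the match probability when gm > 1/2
```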
Lastly, we assume that the probability of the agent getting paid conditional on working ex-
ceeds the probability of getting paid if the agent reports based on what he thinks the manager
wants to hear, which is in turn greater than the probability of getting paid conditional on