
Implementing second-best environmental policy under adverse selection

Glenn Sheriff *

School of International and Public Affairs, Columbia University, 420 W. 118th Street, Room 1405, New York, NY 10027, USA

Journal of Environmental Economics and Management 57 (2009) 253–268

journal homepage: www.elsevier.com/locate/jeem

doi:10.1016/j.jeem.2008.10.001

0095-0696/$ - see front matter © 2008 Elsevier Inc. All rights reserved.

* Fax: +1 212 864 4847. E-mail address: [email protected]

Article info

Article history: Received 17 November 2006; available online 28 November 2008

Keywords: Empirical contract theory; Environmental policy; Stochastic frontier analysis; Adverse selection

Abstract

A key obstacle to practical application of mechanism design theory to regulation is the difficulty of obtaining consistent beliefs regarding information that theoretical models assume to be commonly held. This article presents a solution to this problem by developing an easily implemented empirical methodology with which the government can use available data to develop beliefs regarding the technology and distribution of types in a regulated sector characterized by hidden information. Results are used to calibrate a second-best land conservation mechanism and evaluate its cost relative to simpler alternatives.

1. Introduction

The theoretical literature on optimal regulation under adverse selection has grown tremendously in the past three decades. In spite of this progress, actual policies implementing even the most basic optimal mechanisms remain scarce. Here, “optimal” refers to a mechanism that achieves the best outcome for the regulator subject to information and other constraints. In an adverse selection context, such mechanisms can be typically characterized as a set of contracts offered to regulated firms in which payments vary with an observable action in a non-linear way [3]. For example, a system of French regulatory contracts paying firms for installation of industrial wastewater treatment equipment may be optimal if there is private information regarding abatement cost [31]. Auctions are gaining popularity as a means of mitigating information problems. Auctions, however, are not necessarily optimal allocation mechanisms since they do not take advantage of all the information at the government’s disposal [19].

One obstacle to the transition from mechanism design theory to practice is the difficulty of obtaining information that theoretical models assume to be commonly held. Models typically characterize a second-best (as opposed to the full information first-best) menu of contracts stipulating payments and allocations among which the regulated firms choose. The precise terms of each contract are defined up to commonly held beliefs regarding: (a) the production technology and (b) the probability distribution of firm types.¹ The optimal values of the contract terms can vary greatly depending on these two sets of beliefs. From the standpoint of applied theory, the development of consistent beliefs regarding these items is therefore of paramount importance. Unfortunately, the extant literature provides little guidance in this regard.

In this article, I show how to calculate the terms of a second-best mechanism by modeling type as a source of heterogeneity that is unobserved both to the regulator and the econometrician. Such a methodology has a wide range of potential applications for general principal-agent problems characterized by adverse selection. It is particularly relevant in regulatory applications that fit into the general framework of [3]. With respect to environmental policy, this framework readily applies to cases in which environmental benefits are privately provided or in which firms have an effective right to pollute.

¹ Type refers to a productivity parameter that is private information to each firm.

As an application, I consider the problem of designing a hypothetical program to encourage owners to set aside a targeted amount of land from agricultural production at least cost to the government. Land conservation payments are becoming popular in both the developed and developing world. Prominent examples include the Wetlands Reserve Program and Conservation Reserve Program (CRP) in the U.S., Australia’s Bush Tender program, and Costa Rica’s Pago de Servicios Ambientales (payment for environmental services) program. Such programs share the characteristics that participation is voluntary, and that opportunity costs of participation are both heterogeneous across landowners and not directly observable to the government.²

There are several policy instruments available to the government for meeting a given land set aside target. The choice of instrument involves a trade-off between simplicity and cost-effectiveness. At one extreme is a uniform Pigouvian-style subsidy. This instrument is relatively simple to administer, but involves potentially large excess payments to those landowners with low participation costs. At the other extreme is first-degree price discrimination. This instrument is difficult to implement. It requires the government to obtain perfect information regarding costs and to design a contract tailored to each landowner. It is cost effective, however, since it results in an efficient allocation with each owner receiving a payment exactly equaling his opportunity cost. Between these extremes lie third-degree price discrimination (geographically differentiated Pigouvian subsidies) and second-degree price discrimination (the Baron and Myerson mechanism).

Theory can rank these instruments by expected cost. Textbook environmental economics policy prescriptions such as Pigouvian taxes and subsidies or emissions markets are sub-optimal in a world characterized by asymmetric information and a social cost of raising public funds [22]. With the empirical tools provided in this paper, however, one can go a step further towards making an informed choice between instruments based on the magnitude of cost differences.

Such analysis also contributes to the discussion in the mechanism design literature surrounding the “Wilson doctrine.” In an influential piece, [34] notes the prevalence of simple rather than theoretically optimal trading rules in markets with asymmetric information. As noted above, the terms of a second-best contract mechanism are highly dependent upon the regulator’s beliefs regarding the production technology and distribution of types. In contrast, some mechanisms (like Pigouvian instruments and some types of auctions) are always allocatively efficient, even if they do not optimize the regulator’s objective. Proponents of the Wilson doctrine argue that such mechanisms (which also have the virtue of simplicity) are therefore preferable to a complicated mechanism that is optimal only if the regulator’s beliefs are correct. Here I provide policy-makers an easily implementable means of evaluating the magnitude of potential gains offered by theoretically optimal mechanisms.

To identify both the production technology and the distribution of types, the empirical methodology uses a two-part additive error structure in the spirit of the stochastic frontier models [1,23]. One part of the error is stochastic noise (e.g., measurement error or random shocks), while the other represents firm types.³ This econometric model extends the stochastic frontier literature by adapting robust generalized method of moments (GMM) estimation techniques recently developed in other contexts.

Previous research attempting to identify production technology and distribution of types under adverse selection can be divided into two branches. One branch of the empirical contract theory literature assumes that existing regulations are already optimal in a second-best sense [21,31,35]. The optimality conditions provide equations that can be econometrically estimated with observable data. In an analysis of regulated California water utilities, [35] derives the regulator’s optimal choice for a menu of contracts stipulating firms’ transfers and capital stock. He then uses these equations to estimate firm production technology and distribution of types. Similarly, [31] uses the equilibrium regulatory contract terms (emissions and transfers) to estimate the parameters of a distribution of types and production technology for French industrial firms under wastewater emission contracts. Using the equation characterizing the optimal contract terms to conduct a semi-parametric estimation of the abatement technology, [21] also examines wastewater abatement.

This line of research is descriptive rather than prescriptive. It may describe regulator beliefs regarding commonly held information. However, parameter estimates are consistent only under the assumption that the regulator is in fact already behaving optimally conditional on her pre-existing beliefs. It provides no guidance to a regulator as to how she might develop these beliefs in the first place.

In contrast, the methodology developed here is designed to inform a regulator who is designing a new policy or reforming an existing one, and thus cannot use the optimality of pre-existing contract terms as a basis on which to conduct econometric inference. It is more closely related to a second branch of the literature that estimates the actual distribution of types rather than the regulator’s beliefs. In initial work along this line, [10] interpret regression residuals as firm types for a model of urban transport. The model is restrictive since it does not allow the researcher to disentangle firm types from random shocks. A more flexible approach allows for both random types and stochastic noise. Studies of French telecommunications [5] and urban transport [15] assume parametric distributions for both unobserved variables. Maximum likelihood techniques allow estimation of the parameters of the relevant value functions and probability distributions.

² Although the data used in the application are for U.S. agricultural producers, the program modeled here is not intended to mimic the rich detail of the actual CRP. For example, the CRP is a dynamic game, with auctions repeated almost every year, whereas the program modeled here is static. Moreover, existing programs typically do not have a simple explicit goal such as cost minimization. Rather than provide a blueprint for the CRP (or any other existing program), the primary goal of this paper is to develop a methodology for how to incorporate regulated firms’ private information into environmental policy design.

³ The stochastic frontier literature typically refers to the second component of the error term as a firm’s “technical efficiency.” The contract theory literature typically attributes private information to unobserved (to the regulator) differences in efficiency. Different efficiency levels correspond to different “types” of firms. The interpretation of type used in this article is consistent with that of technical efficiency in the stochastic frontier literature.

By allowing for both random error and unobserved types, [5,15] are the works most closely related to this article. The methodology presented here has three principal advantages over earlier approaches. First, a disadvantage to using the maximum likelihood approach often employed in stochastic frontier analysis is that parameter estimates are inconsistent in the presence of errors that are not i.i.d. [18]. The approach developed here is robust to arbitrary heteroskedasticity and geographic clustering. Second, previous papers estimated a single equation. In a framework with two unobserved sources of variation, deriving and estimating the likelihood function for a joint system of several equations with potentially correlated errors becomes quite cumbersome, and to my knowledge has not been done. In contrast, the framework employed here easily allows one to increase estimation efficiency by using a system of restricted cost function and expenditure share equations. Finally, this approach is easily implementable with cross-sectional data and computationally undemanding.

2. Alternative contract schemes

I consider four versions of a program designed to induce landowners to remove environmentally sensitive land from production. All versions share two salient features. First, the government procures a targeted quantity of land, weighted by an environmental quality index. Second, the program is voluntary. The regulator’s objective is to allocate set asides and transfer payments to minimize the expected cost of satisfying the constraints.

The theoretical model builds upon earlier work by [9,30]. There is a fixed population of landowners who can be differentiated by observable characteristics that depend on a general geographical location (their county), indexed by $i \in \{1,\ldots,I\}$. In a given county, landowners are identical up to an unobserved (to the government) productivity parameter $\theta \in (0,1]$, referred to as their type. Let $f(\theta) \equiv dF(\theta)/d\theta$ denote the government’s beliefs regarding the probability density function of types, assumed to be the same for all counties. This function satisfies the monotone hazard rate condition: $d[F(\theta)/f(\theta)]/d\theta > 0$.⁴

Let $a$ denote the amount of land idled by an individual. The restricted profit function $r^i(a,\theta)$ indicates the market income obtained from the unenrolled land.⁵ It completely characterizes the production technology. This function satisfies $r^i_\theta > 0$, $r^i_a < 0$, $r^i_{aa} < 0$, $r^i_{a\theta} < 0$, and $r^i_{aa\theta} < 0$.⁶ The first property indicates that $\theta$ increases productivity. The next indicates that enrolling land reduces profit. The third property shows that this marginal opportunity cost becomes greater in absolute value as enrollment increases (least productive land is enrolled first).⁷ The fourth property, commonly referred to as the single-crossing condition, indicates that the marginal opportunity cost is increasing in type (all else equal, more productive landowners forego more profit from idling an additional acre). The final property is a regularity condition that helps ensure that the problem is well behaved.⁸

The pair $\langle a^i(\theta), t^i(\theta) \rangle$ denotes contract terms in county $i$ for type $\theta$, where $a^i(\theta)$ is the amount of land enrolled in the program and $t^i(\theta)$ is the transfer payment. The environmental constraint is a requirement that the contracts meet the enrollment target:

$$\sum_{i=1}^{I} n^i \omega^i \int_0^1 a^i(\theta)\,dF(\theta) \ge \sum_{i=1}^{I} n^i \omega^i A. \tag{1}$$

Here, $n^i$ is the number of landowners in county $i$, $\omega^i$ is the environmental weight for county $i$, and $A$ is the target average acreage idled per landowner. Maximum enrollment for any individual is constrained by his total available acreage

$$a^i(\theta) \le \bar{a}^i \quad \text{for all } (\theta, i), \tag{2}$$

where $\bar{a}^i$ is the size of a holding in county $i$. Let $\lambda \ge 0$ and $\gamma^i(\theta) \ge 0$, respectively, denote the Lagrange multipliers for (1) and (2).⁹
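As a quick illustration of constraint (1), the following sketch checks whether a set of county-level expected enrollments meets the environmentally weighted target. The county counts, weights, and acreages are made-up numbers, and `weighted_enrollment` and `meets_target` are hypothetical helper names, not part of the article.

```python
from typing import Sequence


def weighted_enrollment(n: Sequence[float], omega: Sequence[float],
                        expected_a: Sequence[float]) -> float:
    """Left-hand side of constraint (1): sum over counties of n^i * omega^i * E[a^i(theta)]."""
    return sum(ni * wi * ai for ni, wi, ai in zip(n, omega, expected_a))


def meets_target(n: Sequence[float], omega: Sequence[float],
                 expected_a: Sequence[float], target_A: float) -> bool:
    """True when weighted expected enrollment is at least sum over counties of n^i * omega^i * A."""
    required = sum(ni * wi for ni, wi in zip(n, omega)) * target_A
    return weighted_enrollment(n, omega, expected_a) >= required


# Hypothetical two-county example (all numbers are illustrative assumptions).
n = [100, 50]               # landowners per county
omega = [1.0, 2.0]          # environmental weight per acre
expected_a = [20.0, 45.0]   # expected acres idled per landowner
print(meets_target(n, omega, expected_a, target_A=30.0))
```

In this example the first county idles less than the target average, but the constraint is still met because the second county, which carries a higher environmental weight, idles more.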

⁴ This assumption is standard in mechanism design theory. It is satisfied by a wide range of common distributions [2]. It helps ensure that equilibria are separating, i.e., that it is not optimal for an interval of types to receive the same contract. For a discussion of how to solve this type of problem when this assumption is violated see [16].

⁵ Output price is assumed to be perfectly elastic so that a landowner’s profit is not affected by the amount of land idled by others.

⁶ As a notational convention, subscripts on functions denote partial derivatives and superscripts denote county indexes.

⁷ If $r^i_{aa} = 0$, a bang-bang solution results in which farmers either enroll all or none of their land [30]. In such a case, the second-best and Pigouvian schemes are equivalent.

⁸ The estimation results from Section 3 reject the hypotheses that these assumptions are violated.

⁹ I assume both $n^i$ and $\bar{a}^i$ to be exogenously fixed. Thus there is no entry or exit from the sector, nor can farmers react to the program by buying or selling land.

The second policy constraint is that program participation be voluntary. The regulator must compensate landowners for at least their opportunity cost:

$$t^i(\theta) \ge r^i(0,\theta) - r^i(a^i(\theta),\theta) \quad \text{for all } (\theta, i). \tag{3}$$

Define surplus payments received in excess of the minimum necessary to satisfy (3) by

$$s^i(\theta) \equiv t^i(\theta) - [r^i(0,\theta) - r^i(a^i(\theta),\theta)]. \tag{4}$$

The voluntary participation constraint (3) can then be expressed succinctly as

$$s^i(\theta) \ge 0 \quad \text{for all } (\theta, i). \tag{5}$$

It is convenient to use Eq. (4) to express expected program cost in terms of surplus:

$$\sum_{i=1}^{I} n^i \int_0^1 [s^i(\theta) + r^i(0,\theta) - r^i(a^i(\theta),\theta)]\,dF(\theta). \tag{6}$$

The policy alternatives in the following subsections progress from least to most constrained. The first-best allows the government to allocate transfers and payments across types as it sees fit, subject only to constraints (1), (2), and (5). The second-best adds the constraint that the allocation must be incentive compatible. The county-level Pigouvian subsidy requires that the same environmentally weighted per-acre payment be offered to all landowners in a given county, with participation levels determined by the individual. Since this system is necessarily incentive compatible, it is more constrained than the second-best. The last alternative adds the constraint that a single environmentally weighted per-acre payment be offered to all landowners, regardless of location. Therefore, the cost of attaining the land set aside target cannot be decreasing as we progress from one alternative to the next. The magnitude of any possible increase in cost is the empirical question addressed in Sections 3 and 4.

2.1. First-best (first-degree price discrimination)

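The Lagrangian for the first-best minimization problem is

$$L^{FB} = \sum_{i=1}^{I} n^i \int_0^1 \left\{ s^i(\theta) + r^i(0,\theta) - r^i(a^i(\theta),\theta) + \lambda\omega^i [A - a^i(\theta)] - \frac{\gamma^i(\theta)}{f(\theta)} [\bar{a}^i - a^i(\theta)] \right\} dF(\theta) \tag{7}$$

subject to surplus constraint (5). Since surplus payments only increase cost, this constraint binds for all types. The first-best land allocation also satisfies the following Kuhn–Tucker conditions for all $i$ and $\theta$:

$$r^i_a(a^i(\theta),\theta) + \lambda\omega^i - \frac{\gamma^i(\theta)}{f(\theta)} \le 0, \tag{8}$$

$$a^i(\theta) \left[ r^i_a(a^i(\theta),\theta) + \lambda\omega^i - \frac{\gamma^i(\theta)}{f(\theta)} \right] = 0, \tag{9}$$

$$\lambda\omega^i \int_0^1 [A - a^i(\theta)]\,dF(\theta) = \gamma^i(\theta) [\bar{a}^i - a^i(\theta)] = 0. \tag{10}$$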

Consequently, for an interior solution the optimal first-best program satisfies the equimarginal principle. It equates the marginal profit from cultivating an additional acre of land for each landowner to the environmental benefit-weighted shadow cost of tightening the environmental constraint for the entire sector. The first-best program could be implemented with a two-part tariff. The regulator would offer a linear payment per environmental benefit equal to the shadow value of the enrollment target. Landowners would respond by idling the efficient quantity of land. A type-dependent lump-sum tax would recover all surplus payments arising from the subsidy.
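As a worked illustration of conditions (8)–(10), suppose (purely as an assumption for this example, not a functional form used in the article) that the restricted profit function is $r^i(a,\theta) = \theta(1 - a^2)/2$ with a holding of size one. The sign restrictions imposed on the profit function above then hold for $0 < a < 1$, and an interior solution with a slack acreage constraint follows directly:

```latex
\begin{gather*}
r^i_a(a,\theta) = -\theta a, \\
r^i_a(a^i(\theta),\theta) + \lambda\omega^i = 0
  \;\Longrightarrow\; a^i(\theta) = \frac{\lambda\omega^i}{\theta}, \\
t^i(\theta) = r^i(0,\theta) - r^i(a^i(\theta),\theta)
  = \frac{\theta\,[a^i(\theta)]^2}{2}
  = \frac{(\lambda\omega^i)^2}{2\theta}.
\end{gather*}
```

Under this assumed form every landowner's marginal opportunity cost $\theta a^i(\theta)$ equals $\lambda\omega^i$, more productive types idle less land, and each payment exactly offsets opportunity cost, so that constraint (5) binds.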

2.2. Second-best (second-degree price discrimination)

The optimal second-best policy is more complicated. The revelation principle indicates that there is no loss in generality by restricting attention to direct revelation mechanisms satisfying incentive compatibility [24]. Incentive compatibility requires that for all $i$:

$$\theta \in \arg\max_{\hat{\theta}} \{ r^i(a^i(\hat{\theta}),\theta) + t^i(\hat{\theta}) \} \quad \text{for all } (\theta, \hat{\theta}) \in (0,1]^2. \tag{11}$$

This requirement, combined with the participation constraint (5), imposes two restrictions on the set of feasible contract allocations [3]. First, enrollment is monotonically non-increasing in type:

$$\frac{da^i(\theta)}{d\theta} \le 0. \tag{12}$$

Second, expected surplus weakly decreases over type at the rate

$$\frac{ds^i(\theta)}{d\theta} = r^i_\theta(a^i(\theta),\theta) - r^i_\theta(0,\theta). \tag{13}$$

Intuitively, if the government offered payments exactly to offset costs, low types would choose contracts designed for higher types (but not vice versa). The lower the type, the higher the surplus payment necessary to induce a landowner to choose the contract intended for him.

Since surplus is non-increasing, the best the principal can do while satisfying (5) and (13) is to set $s^i(1) = 0$. Using Eq. (13), surplus is then

$$s^i(\theta) = \int_\theta^1 [r^i_\theta(0,z) - r^i_\theta(a^i(z),z)]\,dz. \tag{14}$$

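Temporarily ignoring (12), substitution of Eq. (14) into Eq. (6) and integration by parts yields the following Lagrangian for the government’s second-best problem:

$$L^{SB} = \sum_{i=1}^{I} n^i \int_0^1 \left\{ \frac{F(\theta)}{f(\theta)} [r^i_\theta(0,\theta) - r^i_\theta(a^i(\theta),\theta)] + r^i(0,\theta) - r^i(a^i(\theta),\theta) + \lambda\omega^i [A - a^i(\theta)] - \frac{\gamma^i(\theta)}{f(\theta)} [\bar{a}^i - a^i(\theta)] \right\} dF(\theta). \tag{15}$$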
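The integration-by-parts step behind the first term of Eq. (15) can be spelled out as follows. This is a sketch of the omitted algebra, using Eq. (13), the normalization that surplus is zero for the highest type, and $F(0) = 0$; it is not a reproduction of the author's own derivation.

```latex
\begin{align*}
\int_0^1 s^i(\theta)\,dF(\theta)
 &= \Big[ s^i(\theta) F(\theta) \Big]_0^1
    - \int_0^1 \frac{ds^i(\theta)}{d\theta}\,F(\theta)\,d\theta \\
 &= \int_0^1 \big[ r^i_\theta(0,\theta) - r^i_\theta(a^i(\theta),\theta) \big] F(\theta)\,d\theta \\
 &= \int_0^1 \frac{F(\theta)}{f(\theta)}
    \big[ r^i_\theta(0,\theta) - r^i_\theta(a^i(\theta),\theta) \big]\,dF(\theta).
\end{align*}
```

The boundary term vanishes because $s^i(1) = 0$ and $F(0) = 0$, provided surplus stays finite as $\theta$ approaches zero.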

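The second-best land allocation satisfies the following Kuhn–Tucker conditions:

$$r^i_a(a^i(\theta),\theta) + \frac{F(\theta)}{f(\theta)} r^i_{a\theta}(a^i(\theta),\theta) + \lambda\omega^i - \frac{\gamma^i(\theta)}{f(\theta)} \le 0, \tag{16}$$

$$a^i(\theta) \left[ r^i_a(a^i(\theta),\theta) + \frac{F(\theta)}{f(\theta)} r^i_{a\theta}(a^i(\theta),\theta) + \lambda\omega^i - \frac{\gamma^i(\theta)}{f(\theta)} \right] = 0, \tag{17}$$

$$\lambda\omega^i \int_0^1 [A - a^i(\theta)]\,dF(\theta) = \gamma^i(\theta) [\bar{a}^i - a^i(\theta)] = 0. \tag{18}$$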

The restrictions on $F(\theta)$ and $r^i(\cdot)$ ensure satisfaction of (12).

The impact of asymmetric information can be most easily seen for interior solutions. Unlike the first-best case, rather than equating marginal profit of land to its weighted shadow value for all farms, there is a distortion created by the term $F(\theta) r^i_{a\theta}/f(\theta)$ in Eq. (16). As a result, the equimarginal principle is never satisfied. This program could be implemented by the government requesting landowners to choose a contract from a menu of possible choices. Alternatively, the second-best welfare level can be thought of as an upper bound of the welfare obtainable by an optimally designed procurement auction. For an interior solution, the second-best is not obtainable by a Pigouvian subsidy.
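To make the distortion concrete, the sketch below compares first-best and second-best allocations numerically. Everything specific in it is an assumption chosen for tractability, not taken from the article: a single county with unit environmental weight, a restricted profit function equal to theta * (1 - a^2) / 2 (so marginal opportunity cost is theta * a), types uniformly distributed on [0.5, 1], and a target of 0.3 acres per landowner on a holding of size one.

```python
import math

THETA_LO, THETA_HI = 0.5, 1.0   # assumed support of types (uniform density)
TARGET = 0.3                    # assumed target acreage A, with omega = 1


def integrate(g, n=4000):
    """Midpoint-rule expectation of g(theta) for theta ~ Uniform[THETA_LO, THETA_HI]."""
    h = (THETA_HI - THETA_LO) / n
    return sum(g(THETA_LO + (k + 0.5) * h) for k in range(n)) / n


def hazard(theta):
    """F(theta)/f(theta) for the assumed uniform distribution."""
    return theta - THETA_LO


def first_best(theta, lam):
    """Interior solution of (8) under the assumed form: theta * a = lam."""
    return lam / theta


def second_best(theta, lam):
    """Interior solution of (16) under the assumed form: (theta + F/f) * a = lam."""
    return lam / (theta + hazard(theta))


def solve_lambda(rule):
    """Bisection on the shadow value so that expected enrollment equals TARGET."""
    lo, hi = 0.0, 0.5
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if integrate(lambda th: rule(th, mid)) < TARGET:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


lam_fb = solve_lambda(first_best)
lam_sb = solve_lambda(second_best)

# Expected cost per landowner: opportunity cost theta*a^2/2, plus (second-best only)
# the information rent term (F/f)*a^2/2 from the integrand of (15).
cost_fb = integrate(lambda th: th * first_best(th, lam_fb) ** 2 / 2)
cost_sb = integrate(lambda th: (th + hazard(th)) * second_best(th, lam_sb) ** 2 / 2)
print(round(lam_fb, 4), round(lam_sb, 4), round(cost_fb, 4), round(cost_sb, 4))
```

Under these assumptions the acreage cap (2) is slack for every type, the expected cost is roughly 0.032 in the first-best and 0.041 in the second-best, and the second-best schedule asks the lowest type to idle more land and the highest type to idle less land than the first-best does.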

2.3. Differentiated Pigouvian subsidy (third-degree price discrimination)

This program is geographically differentiated inasmuch as a distinct payment scheme applies to each county. Unlike the previous case, transfers are restricted to be a linear payment per environmental benefit, $t^i/\omega^i$, where $t^i$ is a per-acre payment and $\omega^i$ is the benefit per acre. Thus, for this case, $t^i(\theta) = t^i a^i(\theta)$. The first-order condition of incentive compatibility condition (11) requires that an interior solution satisfy

$$-r^i_a(a^i(\theta),\theta) = t^i. \tag{19}$$

Presented with this subsidy, the solution to the landowners’ optimal enrollment problem is

$$a^i(t^i,\theta) \equiv \arg\max_a \{ r^i(a,\theta) + t^i a \}.$$

Knowing this, the government’s optimization problem is to choose the vector $\mathbf{t} \equiv (t^1,\ldots,t^I) \in \mathbb{R}^I_+$ that minimizes expected expenditures. The corresponding Lagrangian is

$$L^{CP} = \sum_{i=1}^{I} n^i \int_0^1 \left\{ t^i a^i(t^i,\theta) + \lambda\omega^i [A - a^i(t^i,\theta)] - \frac{\gamma^i(\theta)}{f(\theta)} [\bar{a}^i - a^i(t^i,\theta)] \right\} dF(\theta). \tag{20}$$

The optimal solution satisfies the following conditions:¹⁰

$$\int_0^1 a^i(t^i,\theta)\,dF(\theta) + t^i \frac{\partial \int_0^1 a^i(t^i,\theta)\,dF(\theta)}{\partial t^i} \ge \lambda\omega^i \frac{\partial \int_0^1 a^i(t^i,\theta)\,dF(\theta)}{\partial t^i} - \frac{\partial \int_0^1 \frac{\gamma^i(\theta)}{f(\theta)} a^i(t^i,\theta)\,dF(\theta)}{\partial t^i}, \tag{21}$$

$$\lambda\omega^i \int_0^1 [A - a^i(t^i,\theta)]\,dF(\theta) = \gamma^i(\theta) [\bar{a}^i - a^i(t^i,\theta)] = 0. \tag{22}$$

The intuition behind this result is most easily understood for interior solutions in which the optimal $t^i$ is strictly positive and constraint (2) does not bind. Then, Eq. (21) can be re-arranged to yield a variant of the inverse-elasticity pricing rule for a monopsonist that discriminates between distinct markets:

$$1 + \frac{1}{\epsilon^i} = \frac{\lambda\omega^i}{t^i}, \tag{23}$$

where

$$\epsilon^i \equiv \frac{\partial \int_0^1 a^i(t^i,\theta)\,dF(\theta)}{\partial t^i} \cdot \frac{t^i}{\int_0^1 a^i(t^i,\theta)\,dF(\theta)} \tag{24}$$

is the price elasticity of expected supply of enrolled acreage in county $i$. The government minimizes cost by providing a higher subsidy in counties with more elastic supply.
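Rule (23) can be solved for the per-acre payment once an elasticity is given. The sketch below does only that arithmetic; the function name, the shadow value of 100, and the elasticity values are illustrative assumptions, and treating the elasticity as a fixed number ignores the fact that in Eq. (24) it generally depends on the payment itself.

```python
def subsidy_from_elasticity(shadow_value: float, omega: float, elasticity: float) -> float:
    """Solve Eq. (23), 1 + 1/eps = lambda * omega / t, for the per-acre payment t."""
    if elasticity <= 0:
        raise ValueError("supply elasticity must be positive")
    return shadow_value * omega * elasticity / (1.0 + elasticity)


lam = 100.0  # assumed shadow value of the enrollment target
for eps in (0.5, 1.0, 3.0):
    # Payment rises with the elasticity and approaches lam * omega from below.
    print(eps, subsidy_from_elasticity(lam, 1.0, eps))
```

With equal environmental weights, the county with the more elastic supply receives the higher payment, and every payment stays below the shadow value that a uniform Pigouvian subsidy would pay.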

2.4. Uniform Pigouvian subsidy (no price discrimination)

The uniform Pigouvian subsidy is similar to the case analyzed in the previous section, with the exception that the payment per environmental benefit may not vary by county, i.e., $t^i/\omega^i = \lambda$ for all $i$. When landowners respond to the subsidy, the land allocation is the same as for the first-best program. Unlike the first-best, however, the information asymmetry prevents the use of a two-part tariff to recover surplus payments.
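The cost of leaving surplus with landowners can be illustrated numerically. The sketch below is a toy calculation under assumptions of my own choosing (one county with unit weight, marginal opportunity cost theta * a, types uniform on [0.5, 1], target 0.3); it is not calibrated to the article's data.

```python
import math

THETA_LO, THETA_HI = 0.5, 1.0  # assumed uniform support of types
TARGET = 0.3                   # assumed target acreage A (omega = 1)


def expectation(g, n=4000):
    """Midpoint-rule expectation of g(theta) for theta ~ Uniform[THETA_LO, THETA_HI]."""
    h = (THETA_HI - THETA_LO) / n
    return sum(g(THETA_LO + (k + 0.5) * h) for k in range(n)) / n


def enrollment(theta, t):
    """Landowner response to a per-acre payment t, from Eq. (19): theta * a = t."""
    return t / theta


# Expected enrollment is linear in t under this form, so the target pins down t directly.
t = TARGET / expectation(lambda th: 1.0 / th)

pigou_cost = t * expectation(lambda th: enrollment(th, t))
opportunity_cost = expectation(lambda th: th * enrollment(th, t) ** 2 / 2)
surplus = pigou_cost - opportunity_cost  # rents a type-dependent lump-sum tax would recover
print(round(t, 4), round(pigou_cost, 4), round(opportunity_cost, 4), round(surplus, 4))
```

In this toy setting the allocation coincides with the first-best, but the government pays twice the aggregate opportunity cost because it cannot claw back the surplus.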

3. Empirical model

In this section, I show how to estimate the parameters of the production technology and the distribution of agent types using commonly available data (e.g., from cross-sectional industrial surveys). These results provide the necessary information to calibrate the allocations for the policy alternatives described above. I begin by specifying a parametric technology. This technology implies a restricted cost function, expenditure share equations (by Shephard’s Lemma), and a revenue to cost ratio (by profit maximization). I then show how robust GMM techniques can be applied to standard stochastic frontier analysis for efficient estimation of this system of equations.

3.1. Specification

I employ a log-linear stochastic frontier model [1,23]. Although it would appear natural to estimate a restricted profit frontier, the data contain observations with negative profit, making this variable unsuitable for a logarithmic transformation. Instead it is more convenient to work with the restricted cost function. Once the parameters of the restricted cost function are estimated, they are used to recover the profit function.

Let $C(w, q, \ell, \theta)$ denote the minimum cost for a type $\theta$ to produce $q \in \mathbb{R}_+$ units of an aggregate agricultural commodity, given a vector of variable input prices $w \equiv (w_1,\ldots,w_N) \in \mathbb{R}^N_{++}$ and a land endowment $\ell \in \mathbb{R}_+$.¹¹ I assume this restricted cost function has a Cobb–Douglas form in which a reduction in type indicates a proportional increase in cost:¹²

$$C(\cdot) = \theta^{-1} \exp\left( \sum_{m=1}^{M} \alpha_m s_m \right) q^{\beta_q}\, \ell^{\beta_\ell} \prod_{n=1}^{N} w_n^{\beta_n}. \tag{25}$$

Here, $s \equiv (s_1,\ldots,s_M)'$ is a vector of observable factors affecting cost. These variables include year and state fixed effects to control for region-wide annual shocks and state-level factors unlikely to have changed significantly in the four-year period. In addition, $s$ includes county-level mean yields as a proxy for local factors that affect production such as average soil and climate characteristics. The variables denoted $\alpha$ and $\beta$ are structural parameters to be estimated econometrically.

¹⁰ As in the standard monopsonist problem, it is necessary to verify satisfaction of the second-order condition since the restrictions imposed on $r^i(\cdot)$ are not sufficient to guarantee a minimum [32].

¹¹ For regression variables and parameters, subscripts represent indexes.

¹² The Cobb–Douglas form was chosen for ease of algebraic manipulation. For estimation purposes a more flexible log-linear specification (e.g., translog) with a log-linear Hicks-neutral productivity parameter could also be used.

Let $q^* \equiv \arg\max_q \{ pq - C(w, q, \ell, \theta) \}$ be profit-maximizing output for a given output price $p \in \mathbb{R}_{++}$. Algebraic manipulation of the first-order condition for an interior solution to this profit maximization problem yields the equilibrium ratio of revenue to cost:

$$\frac{p q^*}{C(w, q^*, \ell, \theta)} = \frac{\partial \ln C(w, q^*, \ell, 1)}{\partial \ln q}. \tag{26}$$
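The step from the first-order condition to Eq. (26) is short. The following sketch fills it in and records the value the ratio takes under the Cobb–Douglas form (25); it is an expansion of the algebra the text refers to, written in the notation of Eq. (25).

```latex
\begin{gather*}
p = \frac{\partial C(w,q^*,\ell,\theta)}{\partial q}
\;\Longrightarrow\;
\frac{p q^*}{C(w,q^*,\ell,\theta)}
 = \frac{q^*}{C}\,\frac{\partial C}{\partial q}
 = \frac{\partial \ln C(w,q^*,\ell,\theta)}{\partial \ln q}, \\
\ln C = -\ln\theta + \sum_{m=1}^{M}\alpha_m s_m + \beta_q \ln q
        + \beta_\ell \ln \ell + \sum_{n=1}^{N}\beta_n \ln w_n
\;\Longrightarrow\;
\frac{\partial \ln C}{\partial \ln q} = \beta_q .
\end{gather*}
```

Because the type enters the logarithm of cost additively, the derivative can be evaluated at a type of one without loss, and under (25) the revenue to cost ratio reduces to the single parameter $\beta_q$.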

Note that ratio (26) is independent of $\theta$.

Shephard’s Lemma and Eq. (26) provide a system of equations for estimating $C(\cdot)$ [11]. The system of estimating equations for a typical observation is

$$\ln\frac{C}{w_N} = \sum_{m=1}^{M} \alpha_m s_m + \sum_{n=1}^{N-1} \beta_n \ln\frac{w_n}{w_N} + \beta_q \ln q^* + \beta_\ell \ln \ell + v_0 - \ln\theta, \tag{27}$$

$$\frac{w_n x_n^*}{C} = \beta_n + v_n, \quad n = 1,\ldots,N-1, \tag{28}$$

$$\frac{p q^*}{C} = \beta_q + v_N, \tag{29}$$

where $C$ is observed cost. The vector of random noise for an individual landowner is $v \equiv (v_0, v_1, \ldots, v_N)'$. Normalization of the restricted cost function by $w_N$ imposes positive linear homogeneity in input prices.

Since output is endogenous under the assumption of profit maximization, the state-level annual output price index $p$ acts as an instrument for $q^*$. In addition, during the period in which the data were collected, 1997–2000, the U.S. government was implementing a land set aside program (the CRP). Since farmers were paid to remove land from production, the amount of land reported as being cultivated is likely to be correlated with individuals’ unobserved productivity characteristics. I therefore employ county population density as an instrument for $\ell$. Finally, individual input price indices are likely to be endogenous since the component weights (e.g., relative use of diesel versus gasoline in determining an aggregate energy price) may also be correlated with unobserved landowner characteristics [14]. To correct for this, I use state-level price indices as instruments for individual-level prices. Vector $z$ denotes the exogenous variables for an individual landowner.
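The role of the instruments can be illustrated with a small simulation. The sketch below is not the article's GMM estimator: it uses a single regressor, made-up parameter values, and the simple instrumental-variable ratio cov(z, y) / cov(z, x) to show why a regressor correlated with unobserved productivity needs an instrument.

```python
import random


def cov(a, b):
    """Sample covariance (population normalization) of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


random.seed(0)
n, beta = 50000, 0.5                                   # illustrative values
z = [random.gauss(0, 1) for _ in range(n)]             # exogenous instrument
u = [random.gauss(0, 1) for _ in range(n)]             # unobserved productivity
x = [zi + ui + random.gauss(0, 0.5) for zi, ui in zip(z, u)]  # endogenous regressor
y = [beta * xi + ui for xi, ui in zip(x, u)]           # outcome

ols = cov(x, y) / cov(x, x)   # biased upward: x is correlated with u
iv = cov(z, y) / cov(z, x)    # consistent: z is uncorrelated with u
print(round(ols, 3), round(iv, 3))
```

In this simulated design the least squares slope converges to a value well above the true coefficient, while the instrumental-variable ratio recovers it.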

Since \(\theta\) is unobservable to the government and the econometrician, \(\ln\theta\) is modeled as a component of the error term in Eq. (27).13 As is common in the stochastic frontier literature, I assume that \(v\) and \(\theta\) satisfy the following moment conditions:

M1. \(E[v \mid z] = 0\);

M2. \(E[v_0^3] = 0\);

M3. \(E[\ln\theta \mid z] = -\sigma\sqrt{2/\pi}\);

M4. \(E[(\ln\theta - E[\ln\theta])^3] = \sigma^3 (1 - 4/\pi)\sqrt{2/\pi}\).

Under M1, \(v\) is a mean-zero disturbance vector uncorrelated with the instruments. In addition, M2 states that \(v_0\) is symmetrically distributed.14 Assumptions M3 and M4 require that \(\ln\theta\) be uncorrelated with the instruments, and indicate that it belongs to a "half-normal" distribution (\(\ln\theta\) has the same distribution as \(-|h|\), where \(h\) is a random variable distributed \(N(0, \sigma^2)\)). Thus, the distribution of \(\ln\theta\) has a single parameter, \(\sigma\).

These distributional assumptions have two practical implications [1]. First, consider a least squares estimator that ignores the type-dependent component of the error structure, using the incorrect moment conditions \(E[v_0 - \ln\theta \mid z] = 0\). By treating the expected compound error as mean zero, rather than mean \(\sigma\sqrt{2/\pi}\), this regression biases estimates of coefficients corresponding to the intercept.15 All other parameter estimates remain consistent. The second practical implication is that the compound error term \((v_0 - \ln\theta)\) is skewed with third central moment equal to \(-\sigma^3 (1 - 4/\pi)\sqrt{2/\pi}\).
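As a quick numerical check of these moments, the following Python sketch compares the closed-form mean and third central moment of \(\ln\theta = -|h|\) with simulated draws. The function names, the seed, and the sample size are illustrative assumptions; the value 0.8612 is the point estimate of the scale parameter reported in Table 2.

```python
import math
import random


def log_type_moments(sigma):
    """Closed-form mean and third central moment of ln(theta) = -|h|, h ~ N(0, sigma^2)."""
    mean = -sigma * math.sqrt(2 / math.pi)                           # condition M3
    third = sigma ** 3 * (1 - 4 / math.pi) * math.sqrt(2 / math.pi)  # condition M4
    return mean, third


def simulated_log_type_moments(sigma, n=200_000, seed=0):
    """Monte Carlo counterparts of the same two moments (illustrative only)."""
    rng = random.Random(seed)
    draws = [-abs(rng.gauss(0.0, sigma)) for _ in range(n)]
    mean = sum(draws) / n
    third = sum((d - mean) ** 3 for d in draws) / n
    return mean, third


sigma = 0.8612  # point estimate of the scale parameter (Table 2)
exact = log_type_moments(sigma)
approx = simulated_log_type_moments(sigma)
print("closed form:", exact)
print("simulated:  ", approx)
```

Both moments of \(\ln\theta\) are negative. Assuming \(v_0\) is independent of \(\ln\theta\), the compound error \(v_0 - \ln\theta\) reverses both signs, which gives the positive mean and positive third central moment stated above.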

The assumption of a half-normal distribution for \(\ln\theta\) is common in the stochastic frontier literature [20]. Intuitively, it can be justified by the notion that in a competitive economy the mode of the distribution of firms should be near the frontier. Since type is unobservable, however, it is impossible to formally test this structural hypothesis with cross-sectional data.16 Nonetheless, as depicted in Fig. 1, the distribution of county-level average corn yields in the region under study appears to support its plausibility.

13 Type cancels out of Eqs. (28) and (29).

14 Violation of M2 could lead to random noise influencing the estimated distribution of types.

15 For the special case in which \(s\) contains only a constant term, least-squares methods estimate the following regression:

\[ \ln C - \ln w_N = \sigma\sqrt{2/\pi} + \alpha + \sum_{n=1}^{N-1} \beta_n \ln w_n + \beta_q \ln q^* + \beta_\ell \ln \ell + \left[ v_0 - \ln\theta - \sigma\sqrt{2/\pi} \right]. \]

The expected value of the error term in brackets is zero, but the presence of the (unobserved) first term on the right-hand side will bias the estimate of \(\alpha\).

16 Distributional assumptions on the type distribution are not necessary for identification if the researcher has access to panel data.

Fig. 1. County distribution of corn yields in Heartland Farm Resource Region. [Histogram: number of counties (vertical axis, 0 to 120) by county average corn yield in bushels per acre (horizontal axis, bins from 75 to 165 and above).]

Let \(e \equiv (e_1, \ldots, e_J)'\) denote regression residuals for Eq. (27), where \(j = 1, \ldots, J\) indexes each observation. The third moment of the residuals is a consistent estimator for the third moment of the combined error term \(v_0 - \ln\theta\). This suggests an additional equation to estimate sequentially after Eqs. (27)–(29) to identify the parameter of the type distribution:

\[ e_j^3 = -\sigma^3 (1 - 4/\pi)\sqrt{2/\pi} + v_{j(N+1)}, \tag{30} \]

where \(v_{j(N+1)}\) is random noise for observation \(j\). The cube root of the estimate of \(\sigma^3\) permits correction of the initial bias in the intercept.17
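A minimal Python sketch of this moment-matching step, assuming residuals from the cost equation are already in hand. The helper names are hypothetical. Regressing the cubed residuals on a constant by least squares is equivalent to taking their sample mean, which is what the code does; the intercept correction subtracts \(\sigma\sqrt{2/\pi}\), the mean of the compound error.

```python
import math
import random

# Positive constant multiplying sigma^3 in Eq. (30): -(1 - 4/pi) * sqrt(2/pi)
SKEW_CONSTANT = -(1 - 4 / math.pi) * math.sqrt(2 / math.pi)


def sigma_from_residuals(residuals):
    """Estimate sigma from the mean of cubed residuals (Eq. (30) with a constant only)."""
    third_moment = sum(e ** 3 for e in residuals) / len(residuals)
    if third_moment <= 0:
        # Wrongly skewed residuals: the half-normal specification does not fit.
        raise ValueError("residuals are not positively skewed")
    return (third_moment / SKEW_CONSTANT) ** (1 / 3)


def corrected_intercept(biased_intercept, sigma):
    """Remove the mean of the compound error, sigma * sqrt(2/pi), from the intercept."""
    return biased_intercept - sigma * math.sqrt(2 / math.pi)


# Illustrative data: symmetric noise v0 minus half-normal ln(theta), then demeaned.
rng = random.Random(1)
true_sigma = 0.8612
errors = [rng.gauss(0.0, 0.3) + abs(rng.gauss(0.0, true_sigma)) for _ in range(200_000)]
mean_error = sum(errors) / len(errors)
residuals = [e - mean_error for e in errors]
sigma_hat = sigma_from_residuals(residuals)
print(round(sigma_hat, 2))
```

The explicit error for non-positive skewness mirrors the caveat that the half-normal specification is inappropriate when the residuals are skewed the wrong way.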

Estimation of the system proceeds in three steps. The first ignores \(\ln\theta\), and estimates Eqs. (27)–(29) by system two-stage least squares (2SLS). Let \(\beta\) denote the vector of parameters for Eqs. (27)–(29). The system 2SLS estimator \(\beta^{2SLS}\) is

\[ \beta^{2SLS} = (Z'X)^{-1}(Z'Y), \tag{31} \]

where \(X\), \(Y\), and \(Z\) are, respectively, the equation-by-equation stacked right-hand side, left-hand side, and exogenous variables for all observations for Eqs. (27)–(29). This estimator is consistent for all parameters, except the intercept. The estimator \(\beta^{2SLS}\) is likely to be inefficient, however, and generate inconsistent estimates of the covariance matrix. In addition to correlation of errors for the same observation across equations, the noise component may be heteroskedastic or influenced by unobserved shocks commonly affecting all landowners in the same geographic area. Such shocks may be short-lived or persist across time.
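To illustrate the instrumental variable formula \((Z'X)^{-1}Z'Y\) in the simplest setting, the sketch below applies it to a single equation with one endogenous regressor and one instrument, where it reduces to a ratio of sample covariances. This is a deliberately simplified assumption; the system estimated in this article stacks three sets of equations and uses several instruments. All names and the simulated data are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def iv_estimate(y, x, z):
    """Just-identified IV: solves (Z'X) b = Z'Y with Z = [1, z] and X = [1, x]."""
    slope = covariance(z, y) / covariance(z, x)
    intercept = mean(y) - slope * mean(x)
    return intercept, slope


def ols_estimate(y, x):
    """Least squares is the special case in which x serves as its own instrument."""
    return iv_estimate(y, x, x)


# Simulated example: x is correlated with the error u, the instrument z is not.
rng = random.Random(2)
n = 20_000
z = [rng.gauss(0, 1) for _ in range(n)]
shock = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + si for zi, si in zip(z, shock)]
u = [0.8 * si + rng.gauss(0, 0.5) for si in shock]
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

print("IV :", iv_estimate(y, x, z))
print("OLS:", ols_estimate(y, x))
```

In this simulated design the true slope is 2, so the gap between the two printed slopes gives a sense of the endogeneity bias that the instruments are meant to remove.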

The next step addresses these potential problems. I use the 2SLS residuals to construct a robust GMM estimator [28,36,37]. The GMM estimator of the parameter vector \(\beta\) from Eqs. (27)–(29) is

\[ \beta^{GMM} = (X'ZW^{-1}Z'X)^{-1}(X'ZW^{-1}Z'Y). \tag{32} \]

Here, the weighting matrix \(W\) is a function of the system 2SLS residuals:

\[ W = \frac{1}{J} \sum_{i=1}^{I} Z^{i\prime} (Y^i - X^i \beta^{2SLS})(Y^i - X^i \beta^{2SLS})' Z^i, \tag{33} \]

where superscripts indicate that the respective matrices contain only the information for county \(i\). Effectively, \(W\) is the average of county-level weighting matrices calculated in turn from farm-level 2SLS residuals (for more details see [36, pp. 328–330]). For this estimator to be appropriate, I assume that errors from different counties are independent, and unobserved county effects are uncorrelated with the instruments. This estimator is asymptotically efficient in the presence of arbitrary heteroskedasticity and arbitrary county-level correlations both within and across time periods.
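The county-level weighting matrix can be sketched in a few lines of Python. The version below is a simplification under stated assumptions: a single equation rather than the stacked system, instruments stored as one list per observation, and residuals already computed from a first-step estimate. The function and argument names are hypothetical.

```python
def clustered_weight_matrix(instruments, residuals, counties):
    """W = (1/J) * sum over counties i of (Z_i' e_i)(Z_i' e_i)'.

    instruments: list of J rows, each a list of k instrument values
    residuals:   list of J first-step residuals
    counties:    list of J county identifiers
    """
    n_obs = len(instruments)
    k = len(instruments[0])

    # Sum z_j * e_j within each county to obtain the k-vector Z_i' e_i.
    county_scores = {}
    for z_row, e, county in zip(instruments, residuals, counties):
        score = county_scores.setdefault(county, [0.0] * k)
        for a in range(k):
            score[a] += z_row[a] * e

    # Add up the outer products of the county vectors, dividing by J observations.
    weight = [[0.0] * k for _ in range(k)]
    for score in county_scores.values():
        for a in range(k):
            for b in range(k):
                weight[a][b] += score[a] * score[b] / n_obs
    return weight


# Tiny worked example: three farms in two counties, a constant plus one instrument.
Z = [[1.0, 2.0], [1.0, 0.0], [1.0, 1.0]]
e = [1.0, -1.0, 2.0]
county = ["A", "A", "B"]
W = clustered_weight_matrix(Z, e, county)
print(W)
```

Summing within a county before taking outer products is what allows arbitrary correlation among farms in the same county, while products across counties are excluded.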

Finally, the third empirical moment of the GMM residuals from Eq. (27) is used as the left-hand side variable in Eq. (30). This last equation is consistently estimated by ordinary least squares. The estimate of \(\sigma^3\) is used to correct the bias in the GMM estimate for the intercept. I use the residuals from this sequence of regressions to calculate the asymptotic covariance matrix for the entire system [27].

The consistency of estimates of the distribution of agent types and the intercept of the restricted cost function (but not other parameters) depends upon both the half-normal distribution of \(\ln\theta\) and the symmetry of \(v\) about its mean. Although these are arguably strong assumptions, they are weaker than those typically employed in stochastic frontier analysis.

17 The half-normal specification is inappropriate if least-squares residuals are negatively skewed. The estimated \(\sigma\) would then be negative, implying the existence of firms with lower cost than the most efficient. See [6,33] for a discussion of suitable distributional assumptions for such cases.


Table 1
Summary statistics.

| Variable | Mean | Standard deviation |
|---|---|---|
| Cost in current dollars (×1000) | 440.27 | 2859.5 |
| Input expenditure shares | | |
| Capital | 0.3760 | 0.1477 |
| Labor | 0.1776 | 0.1114 |
| Energy | 0.0423 | 0.0274 |
| Materials | 0.4040 | 0.1604 |
| Revenue/cost | 1.1615 | 0.8027 |
| Output index (×1000) | 517.51 | 2595.7 |
| Acres | 1039.2 | 1034.0 |
| Price indices (farm level) | | |
| Capital | 0.9857 | 0.0335 |
| Labor | 0.9880 | 0.0843 |
| Energy | 0.9721 | 0.1216 |
| Materials | 0.9859 | 0.0357 |
| Price indices (state level) | | |
| Output | 1.0102 | 0.1014 |
| Capital | 0.9856 | 0.0327 |
| Labor | 0.9880 | 0.0821 |
| Energy | 0.9721 | 0.1202 |
| Materials | 0.9857 | 0.0306 |
| Cty. avg. yield | 135.91 | 15.270 |
| Cty. avg. population density | 78.75 | 146.80 |
| Number of observations | 5547 | |


The commonly used maximum likelihood stochastic frontier approach adds additional assumptions regarding the i.i.d. normality of \(v\). Estimates of all parameters in such models are inconsistent if errors are not i.i.d. [18]. In addition, estimation of a complete stochastic frontier system using maximum likelihood methods for non-i.i.d. errors is likely to be computationally unwieldy. Moreover, standard stochastic frontier estimation packages do not allow use of instrumental variable techniques. Thus, estimates are likely to be inconsistent if explanatory variables such as output are endogenous.18

The resulting estimator is not only robust, but computationally undemanding, even with a system of equations.19

3.2. Data and estimation results

The econometric analysis is conducted on a sample of midwestern U.S. agricultural producers. Producer cost and returns data come from 1997–2000 Agricultural Resource Management Study (ARMS) surveys conducted by the National Agricultural Statistics Service (NASS). The surveys are independent annual cross-sections in which it is not possible to track individual producers across time. ARMS contains data on input expenditures, output quantities, and land. Input and output price data come from the Bureau of Labor Statistics for capital and labor, the Federal Reserve for interest rates, and NASS for other inputs as well as output commodities. County-level population densities come from the U.S. Census Bureau website, and county-level corn yields come from the NASS website.

During the period in which the data were collected, the U.S. government implemented the loan deficiency payment (LDP) program. LDPs are loans made to farmers at harvest at a pre-determined rate per volume of output. In subsequent months, when farmers repaid the loan, they could do so either in full or at the prevailing market rate. LDPs thus acted as a price floor for program commodities. To account for this policy, I specify output price as the maximum of the loan rate and the market-year average price. The U.S. government also provided income support to farmers in the form of production flexibility contracts and marketing loss assistance payments. These payments were decoupled from output in the sense that they were made on the basis of area under cultivation, not production of a specific commodity. In the econometric model, I assume that these decoupled payments had no distortionary effect on production choice.
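The output price rule described above (the larger of the loan rate and the market-year average price) is a simple floor. The sketch below uses hypothetical function and argument names and made-up prices purely for illustration.

```python
def effective_output_price(loan_rate, market_year_average_price):
    """Price received per unit of output when the loan rate acts as a price floor."""
    return max(loan_rate, market_year_average_price)


# Hypothetical prices in dollars per bushel.
print(effective_output_price(1.89, 1.70))  # floor binds: 1.89
print(effective_output_price(1.89, 2.40))  # market price applies: 2.4
```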

I aggregate outputs into a single category and variable inputs into capital services, labor, energy, and materials using a multilateral Tornqvist index [7]. Since ARMS surveys record capital assets as estimated market value at year end, I calculate capital services adapting the methodology of [17].

The estimation procedure implicitly assumes all producers have the same general production technology (up to the type parameter). To limit possible specification bias, I focus attention on one relatively homogenous area, the "Heartland" Farm Resource Region.20 This region comprises the corn belt. It includes the entire states of Illinois, Indiana, and Iowa, as well as portions of Kentucky, Minnesota, Missouri, Nebraska, Ohio, and South Dakota. It is the region with most farms, most cropland, and greatest value of production [12].

18 For example, using Stata's frontier command to estimate Eq. (29) alone by maximum likelihood without instruments does not yield a well-behaved cost function.

19 It is straightforward to program the estimation routine in a matrix language such as Gauss. Computations take seconds to complete on a standard PC.

20 The U.S. is divided into Farm Resource Regions with similar physiographic, soil, and climatic characteristics [12].

Table 2
Cost frontier parameter estimates.

| Parameter | Variable | Value | Standard error |
|---|---|---|---|
| \(\alpha_c\) | ln county average yield | −0.4686 | 0.0504 |
| \(\beta_1\) | ln capital price | 0.3622 | 0.0029 |
| \(\beta_2\) | ln energy price | 0.0429 | 0.0026 |
| \(\beta_3\) | ln materials price | 0.4138 | 0.0027 |
| \(\beta_4\) | ln labor price | 0.1811 | 0.0043 |
| \(\beta_q\) | ln output | 1.1264 | 0.0029 |
| \(\beta_\ell\) | ln acres | −0.0807 | 0.0379 |
| \(\sigma\) | Scale of type distribution | 0.8612 | 0.1112 |
| \(R^2\) | | 0.8393 | |

Notes: Robust to arbitrary heteroskedasticity and county clustering. \(R^2\) for cost equation only.

Table 1 presents summary statistics for the data. There are 5547 observations, where each observation corresponds to an individual farm in a single year. Without the inclusion of decoupled government payments, the average return to non-land inputs is about 16 percent. There is considerable variation in net revenues. Although median returns are about $33 per acre, just over one-third of the sample had negative market returns.

Table 2 reports parameter estimates. For each endogenous variable (input price indices, land cultivated, and output), F tests reject at the 99 percent level the null hypothesis that excluded instruments do not have explanatory power.21 I obtain the restricted profit function used in the theoretical model by algebraic manipulation of the estimated restricted cost function and the first-order condition for \(q^*\). The functional form of the corresponding restricted profit function is

\[ r^i(a, \theta) = [\beta_q - 1] \cdot \left[ [a^i - a]^{\beta_\ell} \, \theta^{-1} \left( \frac{\beta_q}{p} \right)^{\beta_q} \exp\!\left( \sum_{m=1}^{M} \alpha_m s_m \right) \prod_{n=1}^{N} w_n^{\beta_n} \right]^{1/(1-\beta_q)}. \tag{34} \]

For the parameter estimates in Table 2, the restricted profit function satisfies the theoretical restrictions imposed in Section 2. In addition, the estimated function is well behaved, satisfying theoretical monotonicity and curvature conditions with respect to prices.
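The closed form in Eq. (34) can be checked numerically. The Python sketch below assumes the log-linear restricted cost function implied by Eq. (27), collects the characteristic and input price terms into a single hypothetical argument `shifter`, and compares the formula with profit evaluated directly at the output level that solves the first-order condition. Only the output and land coefficients come from Table 2; every other number is made up for illustration.

```python
BETA_Q = 1.1264   # coefficient on ln output (Table 2)
BETA_L = -0.0807  # coefficient on ln acres (Table 2)


def cost(q, land, theta, shifter):
    """Restricted cost: shifter * q^beta_q * land^beta_l / theta."""
    return shifter * q ** BETA_Q * land ** BETA_L / theta


def optimal_output(p, land, theta, shifter):
    """Output solving the first-order condition p = dC/dq."""
    k = shifter * land ** BETA_L / theta
    return (p / (BETA_Q * k)) ** (1 / (BETA_Q - 1))


def restricted_profit(retired, theta, available, p, shifter):
    """Eq. (34): maximized profit as a function of land retired and type."""
    land = available - retired
    inside = land ** BETA_L / theta * (BETA_Q / p) ** BETA_Q * shifter
    return (BETA_Q - 1) * inside ** (1 / (1 - BETA_Q))


# Hypothetical farm: 500 acres available, 100 retired, type 0.6, unit output price.
p, theta, available, retired, shifter = 1.0, 0.6, 500.0, 100.0, 0.2
q_star = optimal_output(p, available - retired, theta, shifter)
direct = p * q_star - cost(q_star, available - retired, theta, shifter)
closed_form = restricted_profit(retired, theta, available, p, shifter)
print(direct, closed_form)
```

With these coefficients the formula is increasing in type and decreasing in land retired, which matches the role the restricted profit function plays in the contract design problem.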

Although model specifications and data sets differ, input own-price elasticities are comparable to results from earlier studies of U.S. agriculture [29]. Evaluated at the sample mean, the estimated average annual return to land is approximately $57 per acre, without including decoupled payments. For the three states entirely included in the sample (Illinois, Indiana, and Iowa), these payments averaged about $45 per acre [13]. Including decoupled payments brings average returns to about $102 per acre. This figure is reasonably close to the average commercial rate of $110 per acre paid by farmers who rented land for crop production in these three states [25].

As shown in the previous section, empirical calibration of the optimal contract schedule requires two components: a restricted profit function for each type of producer and a probability distribution for type. The procedures described in this section provide this information. The estimated restricted cost function is used to calculate \(r^i(a, \theta)\). The estimate of \(\sigma^3\) is used to parameterize \(f(\theta)\).

4. Policy simulations

As an illustration of the methodology, I combine the parameter estimates with the theoretical results from Section 2. I use this information to calibrate the four hypothetical policy options to retire environmentally sensitive land in the midwestern U.S.

The simulations evaluate three policy decisions. The first two involve the value of policy reform. I compare the benefits of changing from a uniform Pigouvian subsidy to a system in which the government acts as a monopsonist employing third-degree price discrimination, i.e., using a system of linear subsidies that vary by the elasticity of the supply of land at the county level. Next, I examine the benefits obtainable by shifting from third-degree price discrimination to the second-best program. This comparison indicates the maximum expense the government should be willing to incur to develop an optimal policy without collecting additional information. The final comparison calculates the value of removing the information asymmetry. For example, suppose type were completely embodied in a measurable soil quality index. By comparing the cost of the first and second-best mechanisms, we could obtain the maximum amount the government should be willing to pay to collect the soil quality information, i.e., to change from second to first-degree price discrimination.

21 The fact that the instruments are strongly correlated with the endogenous variables reduces the impact of inconsistency arising from a possible weak correlation of instruments with errors [4].

Table 3
Simulation results.

| | First-degree price discrimination | Second-degree price discrimination | Third-degree price discrimination | No price discrimination |
|---|---|---|---|---|
| Total cost | 233 | 679 | 708 | 712 |
| Surplus payments | 0 | 445 | 473 | 479 |

Note: Thousands of dollars.

Fig. 2. Distribution of county-level Pigouvian subsidies. [Histogram: number of counties (vertical axis, 0 to 200) by subsidy in cents per benefit-weighted acre (horizontal axis, bins from 35 to 60 and above).]

To keep the simulation tractable, all farms in a given county are assumed to have the same observable characteristics, but differ by their unobserved profitability parameter \(\theta\). Environmental weights are based on the U.S. Department of Agriculture's National Resource Inventory (NRI) erodibility index.22 Available land ranges from 74 to 1150 acres per farm. Using data from [26], I calculate the number of farms in each county by dividing total county cropland by average county farm size.23 Target enrollment is set at about 1.72 million weighted acres. At the average environmental weight, this corresponds to approximately 7 million acres (5 percent of agricultural land in the Heartland resource region). For each policy, the numerical solution of the relevant necessary conditions characterizes the contract terms.24

Table 3 summarizes the simulation results. Since they become progressively less constrained, the programs must be non-increasing in cost as they go from the uniform Pigouvian subsidy to first-degree price discrimination. Interestingly, the only sizable difference among the programs is between second- and first-degree price discrimination. The non-discriminatory subsidy required to attain the target is about 41.5 cents per environmentally weighted acre, resulting in a total cost of $712 thousand.25 Fig. 2 illustrates the optimal distribution of Pigouvian subsidies when the government can condition on observable county characteristics. These subsidies range from 29.2 to 61.8 cents per acre, with the mode occurring at about 45 cents. Relative to a uniform Pigouvian subsidy, this third-degree price discrimination allows only modest cost savings of 0.8 percent, about $3800 per year.

22 The NRI assigns a wind and water erodibility index to each sampled farm. For each county, I calculate a weighted (by farm size) average erodibility index based on the larger of these two values, and normalize the weights such that the maximum county weight is equal to unity.

23 Since the restricted profit function is concave in land, the simplification of using county average farm size rather than a distribution of farm sizes with the same mean may overstate cost estimates of the various programs.

24 The half-normal distribution is unbounded from below. To make the simulation tractable, the minimum type is set to \(\exp(-2)\), excluding about the lowest one percent of types. All computations are performed in GAUSS.

25 These subsidies do not include compensation for lost decoupled government payments. Such payments are not private information to landowners. In principle, the government could compensate landowners for them exactly. This compensation would increase the cost of the land set aside program. However, it would be completely offset by corresponding reductions in decoupled payments, resulting in no net increase in overall government expenditures.

Fig. 3. Regional first-best, second-best, and Pigouvian allocations. (a) Contract terms (cents per acre) and (b) enrollment (acres per type). [Two line plots with series 1st Best, 2nd Best, and Pigouvian: panel (a) plots cents per acre (0 to 9) against acres (0 to 400); panel (b) plots acres (0 to 400) against \(\ln(\theta)\).]

Using second-degree price discrimination, the lowest the government can spend to attain the target by using the information currently at its disposal is about $679 thousand. This represents savings of about $33 thousand per year relative to the uniform subsidy, about 4.6 percent of the total cost.

The relatively small reductions in cost obtainable by engaging in second- or third-degree price discrimination contrast sharply with the first best. If the regulator could use information on individual types to engage in first-degree price discrimination, the cost of the program drops to about $233 thousand, or 3.4 cents per benefit-weighted acre. The second-best mechanism is only able to recover about 6.9 percent of the difference in cost between the uniform Pigouvian subsidy and the first-best mechanism. The maximum the regulator should be willing to pay to obtain the information necessary for first-degree price discrimination is $446 thousand per year.
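The summary figures in this paragraph follow directly from the total costs in Table 3. A small Python sketch of the arithmetic, with hypothetical variable and function names:

```python
# Total program cost in thousands of dollars, from Table 3.
total_cost = {"first": 233, "second": 679, "third": 708, "none": 712}


def relative_savings(cost_none, cost_policy, cost_first):
    """Share of the gap between the uniform subsidy and the first best that a policy recovers."""
    return (cost_none - cost_policy) / (cost_none - cost_first)


second_share = relative_savings(total_cost["none"], total_cost["second"], total_cost["first"])
value_of_information = total_cost["second"] - total_cost["first"]

print(round(second_share, 3))  # 0.069
print(value_of_information)    # 446
```

The same ratio, (none − second)/(none − first), is the relative savings measure reported in the sensitivity tables.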

Rather than present the contract terms for each county, as an illustration Fig. 3 depicts allocations for a second-best contract for a hypothetical county with mean characteristics and an environmental weight of unity. For comparison, it also illustrates contract allocations for first-degree price discrimination and a non-discriminatory Pigouvian subsidy, assuming all counties have identical characteristics.

The dot-dash line indicates contract terms for the Pigouvian subsidy. As indicated in panel (a), all contracts receive the same payment per acre. In panel (b), the enrollment rates by type for the Pigouvian subsidy are not visible since they coincide with those for the first-best contract. Enrollment decreases as type increases. This must be the case since each producer chooses to enroll a quantity of land such that his marginal opportunity cost is equal to the Pigouvian subsidy, and this marginal cost is increasing in type.

The solid line in both panels indicates contract terms for the first-best policy. The fundamental difference with the Pigouvian subsidy is that the government can fully eliminate any surplus payments. Farms with higher opportunity costs idle fewer acres. Thus, as shown in panel (a), smaller land set asides are matched with larger payments per acre. Such an allocation is not feasible if opportunity costs are private information since farms with lower costs could profitably mimic higher types by choosing contracts with low enrollment and high payments per acre.

The dashed lines depict the contract terms for the second-best policy. Panel (a) shows that the second-best payment schedule is non-linear, offering greater payments per acre as enrollment increases. The distortionary impact of the information asymmetry on the distribution of idled land across types can be seen in panel (b). Relative to the first best and Pigouvian subsidy, the allocation is shifted so that lower types enroll more land, and higher types enroll less land. This distortion, combined with the fact that higher types receive lower payments per acre, helps reduce the incentive for lower types to falsely claim to be high types. It is these distortions that reduce surplus payments relative to the Pigouvian subsidy. The optimal distortion is small, however, and cost savings are low.

Under what conditions then would one expect more significant gains? As reported in Table 4, these findings are robust to key cost function parameter values evaluated within two standard deviations of their point estimates. Recent research on a related class of asymmetric information models shows that the distribution of types can affect the efficiency gains of optimal mechanisms [8].


Table 4
Sensitivity of simulation results to cost parameter estimates.

| Parameter adjustment^a | Total cost, first degree^b | Total cost, second degree^b | Total cost, no discrimination^b | Relative savings of second degree^c |
|---|---|---|---|---|
| Baseline | 0.233 | 0.679 | 0.712 | 0.069 |
| \(\beta_q^{GMM}\) + 2 × s.e. | 0.189 | 0.534 | 0.559 | 0.068 |
| \(\beta_q^{GMM}\) + s.e. | 0.209 | 0.601 | 0.629 | 0.069 |
| \(\beta_q^{GMM}\) − s.e. | 0.261 | 0.773 | 0.811 | 0.069 |
| \(\beta_q^{GMM}\) − 2 × s.e. | 0.293 | 0.884 | 0.929 | 0.070 |
| \(\beta_\ell^{GMM}\) + 2 × s.e. | 0.0007 | 0.0021 | 0.0022 | 0.075 |
| \(\beta_\ell^{GMM}\) + s.e. | 0.028 | 0.082 | 0.087 | 0.072 |
| \(\beta_\ell^{GMM}\) − s.e. | 1.46 | 4.26 | 4.46 | 0.068 |
| \(\alpha_1^{GMM}\) + 2 × s.e. | 0.0050 | 0.0145 | 0.0152 | 0.067 |
| \(\alpha_1^{GMM}\) + s.e. | 0.0341 | 0.0993 | 0.104 | 0.068 |
| \(\alpha_1^{GMM}\) − s.e. | 1.59 | 4.64 | 4.87 | 0.070 |
| \(\alpha_1^{GMM}\) − 2 × s.e. | 10.8 | 31.7 | 33.2 | 0.071 |

a Baseline refers to GMM point estimates. Parameters adjusted by one or two standard errors (s.e.), holding all others at baseline.
b $millions.
c (None − second)/(none − first).

Table 5
Sensitivity of simulation results to scale parameter estimate.

| \(\sigma\) | Total cost, first degree^a | Total cost, second degree^a | Total cost, no discrimination^a | Relative savings of second degree^b |
|---|---|---|---|---|
| 1.0 | 0.102 | 0.267 | 0.283 | 0.089 |
| 0.9 | 0.177 | 0.502 | 0.528 | 0.073 |
| 0.8 | 0.399 | 1.20 | 1.26 | 0.064 |
| 0.7 | 1.23 | 3.74 | 3.89 | 0.060 |
| 0.6 | 4.97 | 14.25 | 14.84 | 0.060 |
| 0.5 | 23.1 | 59.6 | 62.1 | 0.064 |
| 0.4 | 112 | 251 | 262 | 0.071 |
| 0.3 | 530 | 1030 | 1080 | 0.082 |
| 0.2 | 2440 | 4000 | 4180 | 0.104 |
| 0.1 | 9920 | 13,370 | 14,080 | 0.169 |

a $millions.
b (None − second)/(none − first).


To evaluate the potential importance of the distribution, I repeat the simulation using alternative specifications of \(F\). First, I change the value of the scale parameter \(\sigma\) of the half-normal distribution, holding all else fixed.26 The half-normal distribution is negatively skewed. Such a distributional assumption is appropriate in this application, since the empirical distribution of residuals is positively skewed.27 This characteristic limits, however, a general inquiry into the effects of the distribution on relative performance of the second-best mechanism. For ease of comparison with earlier work, I also simulate the mechanism using the power function distribution employed in [8]. For this specification, the distribution function of \(\ln\theta\) is \([(\ln\theta - \ln\underline{\theta})/(\ln\bar{\theta} - \ln\underline{\theta})]^{\delta}\), where \(\delta \in \mathbb{R}_{+}\).28 I repeat the simulation, varying the parameter \(\delta\) (results reported in Tables 5 and 6).
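For concreteness, the power function specification can be written out as follows. This Python sketch uses the bounds stated in the accompanying footnote, a lowest type of \(\exp(-2)\) and a highest type of 1; the function names and the inverse-transform sampler are illustrative assumptions rather than the routine used for the simulations.

```python
import random

LN_THETA_MIN = -2.0  # ln of the lowest type, exp(-2)
LN_THETA_MAX = 0.0   # ln of the highest type, 1


def power_cdf(ln_theta, delta):
    """Distribution function of ln(theta): ((ln theta - min) / (max - min)) ** delta."""
    share = (ln_theta - LN_THETA_MIN) / (LN_THETA_MAX - LN_THETA_MIN)
    return min(max(share, 0.0), 1.0) ** delta


def draw_ln_theta(delta, rng):
    """Inverse-transform draw: solve power_cdf(x, delta) = u for x."""
    u = rng.random()
    return LN_THETA_MIN + (LN_THETA_MAX - LN_THETA_MIN) * u ** (1 / delta)


rng = random.Random(3)
for delta in (0.25, 1.0, 4.0):
    draws = [draw_ln_theta(delta, rng) for _ in range(50_000)]
    print(delta, round(sum(draws) / len(draws), 2))
```

Smaller values of the exponent place more probability mass near the lowest type, while larger values concentrate it near the highest type; an exponent of one gives the uniform case.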

In the immediate neighborhood of the estimated \(\sigma\), increasing this parameter tends to enhance the relative performance of the second-best mechanism. As \(\sigma\) varies from about two standard errors below to two standard errors above its point estimate, the relative performance of the second-best mechanism increases from 6 to 9 percent. This effect is not monotonic, however. If probability mass is very concentrated on the highest types (\(\sigma\) very low), the relative performance of the second-best program also improves. For example, if \(\sigma = 0.1\) the relative gains increase to about 17 percent.

26 I also ran simulations varying the parameter of an exponential distribution of types. Since the results are similar to those of the half normal, they are not reported here.

27 See [6,33] for cost frontier estimation with negatively skewed residuals.

28 As with the half-normal distribution, I set \(\underline{\theta} = \exp(-2)\) and \(\bar{\theta} = 1\). Note that \(\delta = 1\) corresponds to the uniform distribution.


Table 6
Simulation results for power function distribution of types.

| \(\delta\) | Total cost, first degree^a | Total cost, second degree^a | Total cost, no discrimination^a | Relative savings of second degree^b |
|---|---|---|---|---|
| 100 | 21,800 | 24,300 | 27,800 | 0.584 |
| 10 | 848 | 1850 | 1920 | 0.073 |
| 4 | 12.5 | 34.2 | 35.6 | 0.065 |
| 2 | 0.262 | 0.667 | 0.705 | 0.085 |
| 1 | 0.020 | 0.033 | 0.037 | 0.228 |
| 0.5 | 0.0069 | 0.0086 | 0.010 | 0.482 |
| 0.25 | 0.0047 | 0.0052 | 0.0063 | 0.696 |
| 0.1 | 0.0038 | 0.0040 | 0.0049 | 0.879 |

a $millions.
b (None − second)/(none − first).


The power function distribution shows a similar result. As \(\delta\) falls, probability mass shifts to lower types. As \(\delta\) falls from 4 to 2, the relative performance of the optimal scheme increases from 6.5 to 8.5 percent, levels similar to those for the half-normal distribution. For a uniform distribution of types, the relative savings increase to almost 23 percent. As the distribution becomes positively skewed, relative savings continue to increase, to almost 70 percent for \(\delta\) equal to 0.25. On the opposite extreme, as \(\delta\) increases to high levels the relative performance of the optimal mechanism improves as well. For \(\delta = 100\), relative savings exceed 58 percent.

Concentrating probability mass on lower types improves the relative performance of the second-best mechanism in two ways: it reduces surplus payments and increases efficiency. To understand the intuition, recall that the second-best mechanism creates a distortion from the first-best land allocation. By comparing Eqs. (8) and (16) we can see that it distorts upward the land retired by lower types and downward the land retired by higher types. The reason for this distortion is that every unit of land retired by a farmer of a given type results in surplus payments being paid to all lower types. The distortion results in a net reduction in total payments: i.e., the reduced surplus payments created by the distortion more than offset the increased cost created by the corresponding loss in efficiency.

Note from Eq. (14) that the surplus allocated to each farm is independent of the probability distribution \(F\). Thus, distorting the land retirement of the highest participating types downward by one unit has more leverage in terms of total surplus reduction the larger the probability mass of lower types. Moreover, the total efficiency loss corresponding to this unit reduction from the first-best allocation grows smaller as the probability mass of high types experiencing it shrinks. To ensure that the target is met, it is necessary to distort the land retirement of lower types upward relative to their first-best allocation. Recall that the restricted profit function is concave in land. As a result, it is less costly to spread a given quantity of land retirement among many farms of a given type than among few farms. Consequently, the efficiency loss arising from this upward distortion grows smaller as the probability mass of lower types increases.

Although the second-best mechanism performs well in relative terms if the distribution of types is positively skewed, its absolute savings are at their lowest. For example, even when \(\delta\) equals 0.1 and the second-best mechanism is able to capture over 88 percent of total possible cost savings, the absolute savings are $900. The larger the proportion of farms earning low profit, the less costly the program, and the smaller the potential absolute savings.

As probability mass concentrates on the highest types, the relative performance of the second-best mechanism also improves. Low types continue to receive large surplus payments. Since their probability density is extremely small, however, their overall impact is negligible. Total expected surplus payments are consequently low. Further, the distortion in the land allocation, \(F(\theta) r^i_{a\theta}/f(\theta)\), is lower for high types if their probability density is high. Due to these effects, the second-best mechanism approaches the first best as the probability distribution approaches a Dirac delta distribution.29

5. Conclusion

Recent articles have shown the usefulness of modeling type as an unobserved random variable for analysis of regulation under asymmetric information. Here, I extend earlier results in two directions. First, I develop a GMM-based methodology for estimating a stochastic cost frontier for a profit-maximizing producer. This approach differs from earlier techniques in that it easily accommodates a system of equations (in this case a cost equation, expenditure share equations, and the ratio of revenue to cost) with instrumental variables and is robust to arbitrary cross-equation correlation, heteroskedasticity, and geographic clustering. Further, it is computationally simple as it does not require non-linear optimization.

Second, I extend the empirical contract theory literature by using the empirical results to calibrate the theoretically optimal contract. Although the econometrician cannot directly observe producer type, the stochastic frontier approach permits consistent estimation of the technology and probability distribution of types in the population. This technique provides the necessary ingredients for specifying an optimal contract mechanism.

29 That is, where the probability mass is concentrated on one point. Simulations with other type distributions such as exponential and normal confirm these qualitative results.

I apply this methodology to a hypothetical program to retire environmentally sensitive agricultural land. The simulation permits cost comparisons of four programs varying by degree of price discrimination. This type of analysis can provide guidance to policy makers interested in reducing the cost of voluntary environmental policies. For example, if the second-best program closely approximates the full-information program, reform efforts may be well spent in designing a complex system of non-linear contracts. For the sample of agricultural producers considered here, however, such does not appear to be the case. Simulated cost advantages from using second or third-degree price discrimination are small. To the extent that there are differences in policy implementation costs (not explicitly modeled here), there may be reason to use a simple non-discriminatory policy like a uniform Pigouvian subsidy. Otherwise, the only manner of achieving significant cost savings is to try to reduce the information problem (perhaps by collecting detailed farm-specific data on soil, climate, producer expertise, etc.) in an attempt to approximate first-degree price discrimination. Results here thus lend empirical support to the Wilson doctrine. Not only does a uniform linear subsidy achieve allocative efficiency in an administratively simple manner, it appears to do so with a small sacrifice of overall cost relative to the second-best.

As a final caveat, this analysis has assumed all idled land in a given county to have an equal environmental benefit. This choice was made primarily due to data limitations. The ARMS surveys do not contain information on environmental benefits, and NRI data are not available for each farm. Environmental benefits are not private information to firms, however. In principle, a government could use such benefit information to improve the design of the program in two ways not considered here. First, if merged with ARMS, environmental benefits could be used as an explanatory variable in cost frontier estimation. If environmental benefits are not uniformly spread across firms, they may change the structure of the estimated cost frontier and distribution of types. Second, as done here with county-level environmental weights, the government could use benefit information to discriminate among classes of firms. Details on how best to measure environmental benefits, incorporate such information in estimation, and evaluate its effect on the relative performance of different policy mechanisms are beyond the scope of this paper and are left for future research.

Acknowledgments

I am indebted to Robert G. Chambers, Jean-Paul Chavas, Sumeet Gulati, Valerie Mueller, Michael Roberts, two anonymous referees and participants at the North American Workshop on Efficiency and Productivity Analysis, Heartland Environmental and Resource Economics Workshop, and Colorado University Workshop on Environmental and Resource Economics for valuable suggestions and criticism. I am also grateful to Roger Claassen and other staff of the USDA Economic Research Service for helpful comments and support in using the ARMS and NRI data sets. This paper has been screened by USDA to ensure that no confidential data have been disclosed.

References

[1] D. Aigner, C.A.K. Lovell, P. Schmidt, Formulation and estimation of stochastic frontier production function models, J. Econometrics 6 (1) (1977) 21–37.
[2] M. Bagnoli, T. Bergstrom, Log-concave probability and its applications, Econometric Theory 26 (2005) 445–469.
[3] D.P. Baron, R.B. Myerson, Regulating a monopolist with unknown costs, Econometrica 50 (4) (1982) 911–930.
[4] J. Bound, D.A. Jaeger, R.M. Baker, Problems with instrumental variable estimation when the correlation between the instruments and the endogenous explanatory variable is weak, J. Amer. Statistical Assoc. (1995) 443–450.
[5] A. Bousquet, M. Ivaldi, Optimal pricing of telephone usage: an econometric implementation, Info. Econ. Pol. 9 (3) (1997) 219–239.
[6] M.A. Carree, Technological inefficiency and the skewness of the error component in stochastic frontier analysis, Econ. Letters 77 (1) (2002) 101–107.
[7] D.W. Caves, L.R. Christensen, W.E. Diewert, Multilateral comparisons of output, input and productivity using superlative index numbers, Econ. J. 92 (365) (1982) 73–86.
[8] Y. Chu, D.E.M. Sappington, Simple cost-sharing contracts, Amer. Econ. Rev. 97 (1) (2007) 419–428.
[9] A.-S. Crepin, Incentives for wetland creation, J. Environ. Econ. Manage. 50 (3) (2005) 598–616.
[10] D.M. Dalen, A. Gomez-Lobo, Estimating cost functions in regulated industries characterized by asymmetric information, Europ. Econ. Rev. 41 (3–5) (1997) 935–942.
[11] W.E. Diewert, Duality approaches to microeconomic theory, in: K.J. Arrow, M.D. Intriligator (Eds.), Handbook of Mathematical Economics, North-Holland, New York, 1982.
[12] Economic Research Service, Farm Resource Regions, Agricultural Information Bulletin 760, U.S. Department of Agriculture, Washington, DC, 2000.
[13] Environmental Working Group, Farm subsidy database <http://www.ewg.org/farm>, 2005.
[14] M.A. Fuss, The demand for energy in Canadian manufacturing: an example of the estimation of production structures with many inputs, J. Econometrics 5 (1977) 89–116.
[15] P. Gagnepain, M. Ivaldi, Incentive regulatory policies: the case of public transit systems in France, RAND J. Econ. 33 (4) (2002) 605–629.
[16] R. Guesnerie, J.-J. Laffont, A complete solution to a class of principal-agent problems with an application to the control of a self-managed firm, J. Public Econ. 25 (1984) 329–369.
[17] R.E. Hall, D.W. Jorgenson, Tax policy and investment behavior, Amer. Econ. Rev. 57 (3) (1969) 391–414.
[18] R.J. Kopp, J. Mullahy, Moment-based estimation and testing of stochastic frontier models, J. Econometrics 46 (1–2) (1990) 165–183.
[19] V. Krishna, Auction Theory, Academic Press, New York, 2002.
[20] S. Kumbhakar, C. Lovell, Stochastic Frontier Analysis, Cambridge University Press, Cambridge, UK, 2000.
[21] P. Lavergne, A. Thomas, Semiparametric estimation and testing in a model of environmental regulation with adverse selection, Empirical Econ. 30 (1) (2005) 171–192.
[22] T.R. Lewis, Protecting the environment when costs and benefits are privately known, RAND J. Econ. 27 (4) (1996) 819–847.
[23] W. Meeusen, J. van den Broeck, Efficiency estimation from Cobb–Douglas production functions with composed error, Int. Econ. Rev. 18 (2) (1977) 435–444.


[24] R.B. Myerson, Incentive compatibility and the bargaining problem, Econometrica 47 (1) (1979) 61–73.
[25] National Agricultural Statistics Service, Agricultural Cash Rents 2001 Summary, U.S. Department of Agriculture, Washington, DC, July 2001.
[26] National Agricultural Statistics Service, Census of Agriculture, U.S. Department of Agriculture, Washington, DC <http://www.nass.usda.gov>, 2002.
[27] W.K. Newey, A method of moments interpretation of sequential estimators, Econ. Letters 14 (2–3) (1984) 201–206.
[28] J.V. Pepper, Robust inferences from random clustered samples: an application using data from the panel study of income dynamics, Econ. Letters 75 (3) (2002) 341–345.
[29] S.C. Ray, A translog cost function analysis of U.S. agriculture, 1939–1977, Amer. J. Agr. Econ. 64 (3) (1982) 490–498.
[30] R.B. Smith, The conservation reserve program as a least-cost land retirement mechanism, Amer. J. Agr. Econ. 77 (1) (1995) 93–105.
[31] A. Thomas, Regulating pollution under asymmetric information: the case of industrial wastewater treatment, J. Environ. Econ. Manage. 28 (3) (1995) 357–373.
[32] J. Tirole, The Theory of Industrial Organization, MIT Press, Cambridge, MA, 1988.
[33] E.G. Tsionas, Efficiency measurement with the Weibull stochastic frontier, Oxford Bull. Econ. Statist. 69 (5) (2007) 693–706.
[34] R. Wilson, Game-theoretic analyses of trading processes, in: T. Bewley (Ed.), Advances in Economic Theory: Fifth World Congress, Cambridge University Press, Cambridge, UK, 1987, pp. 33–70 (Chapter 2).
[35] F. Wolak, An econometric analysis of the asymmetric information, regulator–utility interaction, Ann. Econ. Statist. 34 (1994) 13–69.
[36] J.M. Wooldridge, Econometric Analysis of Cross Section and Panel Data, MIT Press, Cambridge, 2002.
[37] J.M. Wooldridge, Cluster-sample methods in applied econometrics, Amer. Econ. Rev. 93 (2) (2003) 133–138.