

Journal of Econometrics 13 (1980) 27-56. © North-Holland Publishing Company

MAXIMUM LIKELIHOOD ESTIMATION OF ECONOMETRIC FRONTIER FUNCTIONS*

William H. GREENE

Cornell University, Ithaca, NY 14853, USA

1. Introduction

The estimation of production functions has been one of the more popular areas of applied econometrics. Recent work in duality theory which has linked production and cost functions has made this topic even more attractive. Typically, least squares (or some variant, such as two stage or generalized least squares) is used to estimate the model of interest in accordance with the assumption of a normally distributed disturbance in the model. However, definitions of a production function are given in terms of the maximum output attainable at given levels of the inputs. Similarly, a dual cost function gives the minimum cost of producing a given level of output at some set of input prices. [See Christensen and Greene (1976).] It has thus been argued that the disturbances specified in these models, and the techniques used to estimate them, should account for that fact.

These considerations have motivated the recent literature on frontier functions. Numerous studies have been devoted to the respecification of empirical production and cost models to make them more compatible with the underlying theory, and to the derivation of appropriate estimators. In some cases, this has amounted to minor modifications of least squares results. The remaining estimators are based on two distinct specifications. The very recent work on composite disturbances has relaxed somewhat the orthodox interpretation of the underlying function as a strict frontier with all observations lying on one side of it, and has produced well behaved maximum likelihood estimators with all of the usual desirable properties.

Other authors, following the more strict interpretation, have employed what we shall call full frontier estimators which allow only one sided residuals. It has been shown that the received full frontier estimators are also maximum likelihood. However, estimation of them has been something less than wholly successful, due in large measure to the fact that, in spite of their being maximum likelihood, their statistical properties are unknown.

*This is a revised version of an earlier paper, Cornell Working Paper no. 162. The helpful comments of Henry Wan, Peter Schmidt, Jack Kiefer and two anonymous referees are gratefully acknowledged.

The composite disturbance models offer an attractive specification. However, they leave unanswered questions of the properties of full frontier estimators, which do have a theoretical appeal. The purpose of this paper is to provide some results on maximum likelihood estimation of full frontier models. First, the common problems of the received estimators will be analyzed. In short, as Schmidt (1975) points out, one of the standard regularity conditions usually assumed in maximum likelihood estimation is violated. For the frontier model, this means that the results usually invoked for maximum likelihood estimators do not necessarily apply. An alternative frontier estimator is then proposed. A class of probability distributions which can be used for the disturbance model and which allow maximum likelihood estimation to proceed as a regular case is defined. The statistical properties of the resulting estimators are easily established using the standard results in spite of the fact that this is a ‘non-regular’ case. The results for a specific disturbance formulation with some particularly convenient properties will be examined in detail. Finally, the technique will be applied to two well known sets of data.

2. Previous full frontier estimators and their statistical bases

The estimation of econometric frontier functions begins with the study of Aigner and Chu (A-C) (1968). Following the initiative of Farrell (1957), who describes an industry ‘envelope’ isoquant, they propose a method of estimating a production function model which constrains all residuals from the fitted function to be negative, i.e., a ‘full frontier’ model. Their model is

$$y = A x_1^{\alpha_1} x_2^{\alpha_2} u, \tag{2.1}$$

where $y$ is output, $x_1$ and $x_2$ are inputs, and $u$ is a random disturbance. The systematic part of the right-hand side gives the maximum output attainable using inputs $x_1$ and $x_2$. They suggest minimizing the sum of absolute residuals from the log of the production function while constraining all residuals to be negative, which is a linear programming problem. No distributional assumption is made by A-C; but Schmidt (1975) shows that their technique is equivalent to maximum likelihood estimation of the model,

$$\log y = \log A + \alpha_1 \log x_1 + \alpha_2 \log x_2 - \varepsilon, \tag{2.2}$$

where $\varepsilon = -\log u$ and has an exponential distribution,

$$f_\varepsilon(\varepsilon) = \lambda e^{-\lambda\varepsilon}, \qquad \varepsilon \ge 0,\ \lambda > 0. \tag{2.3}$$
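The step behind this equivalence can be sketched from (2.2) and (2.3) alone. The following is a reconstruction rather than a quotation of Schmidt's argument; the observation index $t$ and the sample size $T$ are notation assumed here for the illustration.

```latex
\log L = T\log\lambda \;-\; \lambda\sum_{t=1}^{T}\varepsilon_t ,
\qquad
\varepsilon_t = \log A + \alpha_1\log x_{1t} + \alpha_2\log x_{2t} - \log y_t \;\ge\; 0 .
```

For any fixed $\lambda > 0$, maximizing $\log L$ over $(\log A, \alpha_1, \alpha_2)$ therefore amounts to minimizing $\sum_t \varepsilon_t$ subject to $\varepsilon_t \ge 0$ for all $t$. Since every $\varepsilon_t$ is linear in those parameters, this is the linear program described above.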

As an alternative, A-C propose minimizing the sum of squared residuals, again constraining all residuals to be negative. In the setting of the Cobb-Douglas model, this is a problem of quadratic programming. By similar reasoning, Schmidt goes on to show that the quadratic programming estimator is maximum likelihood if $\varepsilon$ has the half-normal distribution,

$$f_\varepsilon(\varepsilon) = \frac{2}{\sqrt{2\pi}\,\phi}\, e^{-\varepsilon^2/(2\phi^2)}, \qquad \varepsilon \ge 0,\ \phi > 0. \tag{2.4}$$

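Recently, Forsund and Jansen (1977) have studied a production frontier by estimating its dual cost function. The homothetic production function which they estimate is

$$y^{\beta} e^{\gamma y} = A \prod_{j=1}^{n} x_j^{\alpha_j}\, u, \tag{2.5}$$

where $y$ and $x_j$ are as before, $\sum_{j=1}^{n} \alpha_j = 1$, and $u$ has distribution

$$f_u(u) = (1+\alpha)u^{\alpha}, \qquad 0 < u \le 1,\ \alpha > -1. \tag{2.6}$$

The dual cost function is

$$C = B\, y^{\beta} e^{\gamma y} \prod_{j=1}^{n} p_j^{\alpha_j}\, u^{-1}, \tag{2.7}$$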

where $p_j$ is the unit price of $x_j$, and $C$ is total cost. Forsund and Jansen show that maximum likelihood estimates are obtained using linear programming, by minimizing the sum of positive residuals from the log of (2.7). Making the transformation from $u$ to $\varepsilon = \log u^{-1}$, we find

$$f_\varepsilon(\varepsilon) = (1+\alpha)e^{-(1+\alpha)\varepsilon}, \qquad \varepsilon \ge 0. \tag{2.8}$$
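The change of variables leading to (2.8) can be checked by simulation. The sketch below is illustrative only: the helper names, the value `alpha = 1.5`, and the sample size are assumptions introduced here, not part of the original analysis. It draws $u$ from (2.6) by inverting its c.d.f. $u^{1+\alpha}$, sets $\varepsilon = -\log u$, and compares the sample mean and variance with $1/(1+\alpha)$ and $1/(1+\alpha)^2$, the moments of an exponential variate with rate $1+\alpha$.

```python
import math
import random


def draw_u(alpha, rng):
    """Draw u on (0, 1] with density (1 + alpha) * u**alpha.

    The c.d.f. is u**(1 + alpha), so inverting it gives v**(1 / (1 + alpha)).
    """
    v = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0) below
    return v ** (1.0 / (1.0 + alpha))


def simulate_epsilon(alpha, n, seed=0):
    """Simulate n values of epsilon = -log(u)."""
    rng = random.Random(seed)
    return [-math.log(draw_u(alpha, rng)) for _ in range(n)]


alpha = 1.5  # hypothetical value, chosen only for the illustration
eps = simulate_epsilon(alpha, 200_000)
mean_eps = sum(eps) / len(eps)
var_eps = sum((e - mean_eps) ** 2 for e in eps) / (len(eps) - 1)

print("mean:", mean_eps, "theory:", 1.0 / (1.0 + alpha))
print("variance:", var_eps, "theory:", 1.0 / (1.0 + alpha) ** 2)
```

The sample moments should sit close to the theoretical ones, which is what one would expect if $\varepsilon$ is exponential with parameter $1+\alpha$.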

Equating $\lambda$ in (2.3) to $(1+\alpha)$, we see that the stochastic framework here is the same as that applied to A-C’s model.

This completes the list of the received full frontier maximum likelihood estimators. In each case, estimates are obtained by solving a constrained programming problem. The frontier model has been used to study the structure of production and efficiency in production by a number of authors, nearly all of whom have used the linear programming technique. The full frontier model has the theoretical appeal of forcing the fitted function to correspond to the underlying theory in terms of the signs of the residuals. Moreover, the estimators are maximum likelihood for certain stochastic specifications.

Unfortunately, certain problems have beset them. They can be highly sensitive to outliers. To compensate for this problem, Timmer (1971) proposed that the linear programming technique be modified to allow a certain prespecified proportion of the residuals to have the ‘wrong’ sign. While this probably does solve the outlier problem, it must surely compound the statistical problems. The more serious shortcoming of these full frontier estimators is their lack of identifiable statistical properties. Although they are maximum likelihood, the characteristics of the estimation problem prevent us from making any use of this result, except, perhaps, to justify the choice of the computational algorithm. The observation that the programming estimators are ML is not sufficient to enable us to establish their statistical properties. No standard errors have been derived for them, and no statistical inference based on them has been possible.

Beyond some cursory observations, we will not attempt to establish the specific asymptotic properties of the programming estimators. This remains for future research. We will, however, examine in some detail how the problems with these models arise. In the process, the analysis suggests how the problem of inference can be circumvented through the use of an alternative estimator.

3. Estimation of the full frontier function: General results

Let the production or cost function be specified as

$$y_t = \alpha + \beta' x_t + \varepsilon_t, \qquad t = 1, \ldots, T, \tag{3.1}$$

where $y_t$ is output or total cost, whichever is appropriate, $x_t$ is the corresponding vector of exogenous variables, $\varepsilon_t$ is a random disturbance, $\alpha$ and $\beta$ are fixed (for all $t$) but unknown parameters, and $T$ is the size of the sample. The systematic part of the equation gives the ‘optimum’ value of $y_t$, given $x_t$. The random part, $\varepsilon_t$, differs from 0 due to random shocks such as weather, inefficiencies, etc. [See, e.g., Aigner and Chu (1968, p. 258).] The disturbances will always be of one sign, negative for the production case or positive for the cost case. Typically, the model would arise from the logarithmic transformation of $\exp(y_t) = f(x_t)u_t$, where either $0 < u_t \le 1$ or $u_t \ge 1$ for the production and cost cases respectively. The following additional assumptions will be maintained for the remainder of this study:


Assumption 1. The $(K+1) \times 1$ vector $(1, x_t')' = x_t^*$ is independent of $\varepsilon_s$ for all $t$ and $s$.¹

Assumption 2. The random variables $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_T$ are independent and identically distributed with probability density function (p.d.f.) $f(\varepsilon_t)$, cumulative distribution function (c.d.f.) $F(\varepsilon_t)$, finite mean, $\mu$, and finite, positive variance, $\sigma^2$, for all $t$. We will also assume, purely for convenience, that the range of $\varepsilon_t$ is $0 \le \varepsilon_t < \infty$, which implies $\mu > 0$. (The case of negative range involves only a trivial modification of what follows.)

Assumption 3. The $T \times (K+1)$ matrix $X^*$ whose $t$th row is $x_t^{*\prime}$ is observed and has rank $(K+1)$ for all $T \ge K+1$.

Assumption 4. Let $x_{tk}$ be the $t$th observation on the $k$th element of $x_t$ in a sample of size $T$, $k = 1, \ldots, K$, $t = 1, \ldots, T$. The sample distribution function $F_k^T(x_k)$ defined by $F_k^T(x_k) = j/T$, where $j$ is the number of points $x_{1k}, x_{2k}, \ldots, x_{Tk}$ less than or equal to $x_k$, converges to a distribution function $F_k(x_k)$, and $x_k$ is bounded, $k = 1, \ldots, K$.

Assumption 5. The $(K+1) \times (K+1)$ matrix $(1/T)\sum_{t=1}^{T} x_t^* x_t^{*\prime}$ converges to a finite positive definite matrix as $T$ goes to $\infty$.

Without loss of generality, for the time being, $x_t$ will be assumed to have a single element.

3.1. Least squares estimation

With the exception of the non-zero mean of the disturbances, $\varepsilon_t$, all of the assumptions of the classical regression model apply to (3.1). Since the model contains an intercept, it is simple to show that ordinary least squares provides a best linear unbiased and (by virtue of the assumptions about the regressor) consistent estimate of $\beta$. The conventionally computed standard error for this estimate is appropriate, as is the assumption of asymptotic normality. [See Theil (1971, pp. 380-381).] The only parameter not consistently estimated by OLS is $\alpha$. The OLS intercept estimator is consistent for $\alpha + \mu$.² Since $\alpha$ is generally of no interest in any event, if consistent estimation of the slope(s) is all that is desired, the analysis can stop at this point. However, in some instances, the least squares residuals can provide a consistent ‘moment’ estimator of $\alpha$. For example, Richmond (1974) examines a model in which the mean and variance of the one sided disturbance are both equal to $\mu$. Hence, the least squares residual variance, $s^2$, which is unbiased for $\sigma^2$ in any event, is also unbiased for $\mu$; and $\hat{\alpha}_{OLS} - s^2$ is unbiased for $\alpha$. In general, we should expect that when $f(\varepsilon_t)$ is a one parameter family, both $\mu$ and $\sigma^2$ will be functions of that parameter. Since $s^2$ will always be unbiased for $\sigma^2$, a consistent estimate of $\mu$, and hence of $\alpha$, may be obtainable. (See, for example, the moments of the half-normal and exponential distributions below.) If the disturbance distribution involves more than one parameter, it may be possible to estimate all of the parameters of the model using additional moments of the OLS residuals. [See, for example, Aigner, Lovell and Schmidt (1977, p. 28).]

¹See, e.g., Christensen and Greene (1976, pp. 658-659), Zellner, Kmenta, and Drèze (1966), and Schmidt (1975, p. 238).

²See Schmidt’s equation (3), but note the sign reversal in our formulation. Richmond (1974) derives the same result for a specific disturbance model.

Note, though, that even after the correction of $\hat{\alpha}$ for the non-zero disturbance mean, some of the residuals will still have the theoretically wrong sign. Of course, this is not inconsistent with the onesidedness of the underlying disturbance; each residual is, after all, a function of all of the disturbances and all of the data. [See Schmidt (1975).] But, the presence of these ‘wrong’ residuals may impede the computation of efficiency measures which rely on sign uniformity. A biased but consistent estimate of $\alpha$ (albeit of uncertain efficiency) which imposes the sign uniformity on the residuals is easily obtained.

Consider, first, a simplified version of the model, with only an intercept, i.e., $y_t = \alpha + \varepsilon_t$. For the purpose of the discussion which follows, define $\tilde{\varepsilon}_t = \alpha + \varepsilon_t$, and note that for this simple model $\tilde{\varepsilon}_t = y_t$. The OLS estimator of $\alpha$ would be $\bar{y}$, which has plim $\bar{y} = \alpha + \mu \ne \alpha$. As an alternative, we propose the minimum sampled value of $y_t$, which is conventionally denoted $y_{(1)}$. For the simple model here, $y_{(1)} = \tilde{\varepsilon}_{(1)}$. Obviously, $\tilde{\varepsilon}_{(1)} = \alpha + \varepsilon_{(1)}$; so to prove that $y_{(1)}$ is consistent for $\alpha$, it is sufficient to prove that plim $\varepsilon_{(1)} = 0$.

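Proof. In any random sample of $T$ observations on $\varepsilon$, the c.d.f. of the sample minimum is $1 - [1 - F(\varepsilon)]^T$. Therefore, for any $\delta$ greater than zero, $P(\varepsilon_{(1)} \ge \delta) = [1 - F(\delta)]^T$. Since $0 < F(\delta) < 1$ for all $\delta > 0$, $\lim_{T \to \infty} P(\varepsilon_{(1)} \ge \delta) = 0$. Now, let $\delta$ become arbitrarily small. With the condition $\varepsilon_{(1)} \ge 0$, which follows from $\varepsilon_t \ge 0$ for all $t$, this proves the assertion directly.³

Assume, for example, that $\varepsilon$ has the density in (2.3). This is easily shown to yield $f_{\varepsilon_{(1)}}(\varepsilon_{(1)}) = \lambda T e^{-\lambda T \varepsilon_{(1)}}$, which gives the exact results $E(\varepsilon_{(1)}) = 1/(\lambda T)$ and $V(\varepsilon_{(1)}) = 1/(\lambda^2 T^2)$. Both of these vanish as $T$ increases. Thus, for this case, $\varepsilon_{(1)}$ converges in mean square to 0, and $y_{(1)} = \alpha + \varepsilon_{(1)}$ is consistent for $\alpha$.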
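The exact results for the exponential case lend themselves to a quick Monte Carlo check. The following is a minimal sketch, assuming an illustrative value `lam = 2.0` and hypothetical helper names; it simulates the sample minimum of $T$ exponential draws many times and compares the mean and variance of the simulated minima with $1/(\lambda T)$ and $1/(\lambda^2 T^2)$.

```python
import random


def simulate_minima(lam, T, reps, seed=0):
    """Return `reps` simulated sample minima of T exponential(rate=lam) draws."""
    rng = random.Random(seed)
    return [min(rng.expovariate(lam) for _ in range(T)) for _ in range(reps)]


def mean_and_variance(values):
    """Sample mean and (unbiased) sample variance of a list of numbers."""
    m = sum(values) / len(values)
    v = sum((z - m) ** 2 for z in values) / (len(values) - 1)
    return m, v


lam = 2.0  # hypothetical value of lambda, for illustration only
results = {}
for T in (10, 100):
    results[T] = mean_and_variance(simulate_minima(lam, T, reps=5_000))
    print("T =", T,
          "simulated (mean, var):", results[T],
          "theory:", (1.0 / (lam * T), 1.0 / (lam * T) ** 2))
```

Both simulated moments should shrink toward zero as $T$ grows, in line with the mean square convergence of $\varepsilon_{(1)}$.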

³Note that a similar argument will establish the consistency of any order statistic of specific rank, i.e., smallest, second smallest, 50th smallest (but not of any quantile in the sample). For this reason, the efficiency of this estimator seems uncertain. Obviously, the bias of the sample minimum is the smallest among the order statistics.


For the more general case in which $\beta \ne 0$, the simple result above cannot be invoked. When $\beta = 0$, although we cannot observe the disturbances, we can observe their ranks. When there is a regressor in the model, so that we must derive information about the disturbances from the residuals, even this information is obscured. Let $\tilde{e}_t = y_t - b x_t$, where $b$ is any consistent estimator of $\beta$. (The intercept estimator has been discarded.) Note that $\tilde{e}_t$ is the sample estimate of $\tilde{\varepsilon}_t = y_t - \beta x_t$, but that while $\tilde{e}_t$ is observed, $\tilde{\varepsilon}_t$ is not. Moreover, $(\tilde{e}_1, \ldots, \tilde{e}_T)$ is not a random sample as the residuals are, in general, neither independent nor identically distributed. Thus, the standard results for order statistics do not apply here. Nonetheless, $\tilde{e}_{(1)}$ is a consistent estimate of $\alpha$ under the assumptions already made. (Our proof relies on the boundedness of $x_t$, and does not necessarily apply for cases in which $x_t$ is not restricted to be finite for all $t$.)

Proof. By direct substitution, $\tilde{e}_t = \tilde{\varepsilon}_t + (\beta - b)x_t$. It is assumed that plim $(b - \beta) = 0$ in spite of the non-zero mean of $\varepsilon_t$. This would be true, for example, of the OLS estimator. It must then be true that

$$\tilde{e}_t \le \tilde{\varepsilon}_t + \max_t\,(\beta - b)x_t \qquad \forall t.$$

Thus,

$$\min_t \tilde{e}_t \le \min_t \left[\tilde{\varepsilon}_t + \max_t\,(\beta - b)x_t\right]
\;\Rightarrow\; \min_t \tilde{e}_t \le \min_t \tilde{\varepsilon}_t + \max_t\,(\beta - b)x_t
\;\Rightarrow\; \operatorname{plim} \min_t \tilde{e}_t \le \operatorname{plim} \min_t \tilde{\varepsilon}_t + \operatorname{plim} \max_t\,(\beta - b)x_t.$$

But plim $\min_t \tilde{\varepsilon}_t$ = plim $\tilde{\varepsilon}_{(1)} = \alpha$ was proved above. Since $x_t$ is uniformly bounded by Assumption 4, plim $\max_t\,(\beta - b)x_t = 0$ follows from the consistency of $b$.

(*) $\therefore$ plim $\tilde{e}_{(1)} \le \alpha$.

It must also be true that

$$\tilde{e}_t \ge \tilde{\varepsilon}_t + \min_t\,(\beta - b)x_t \qquad \forall t.$$

Thus,

$$\min_t \tilde{e}_t \ge \min_t \left[\tilde{\varepsilon}_t + \min_t\,(\beta - b)x_t\right]
\;\Rightarrow\; \min_t \tilde{e}_t \ge \min_t \tilde{\varepsilon}_t + \min_t\,(\beta - b)x_t
\;\Rightarrow\; \operatorname{plim} \min_t \tilde{e}_t \ge \operatorname{plim} \min_t \tilde{\varepsilon}_t + \operatorname{plim} \min_t\,(\beta - b)x_t.$$

Again, plim $\tilde{\varepsilon}_{(1)} = \alpha$; and plim $\min_t\,(\beta - b)x_t = 0$ follows from the consistency of $b$ and from the assumption that $x_t$ is uniformly bounded.

(**) $\therefore$ plim $\tilde{e}_{(1)} \ge \alpha$.

Since (*) and (**) must both be true, we have proved that plim $\tilde{e}_{(1)} = \alpha$.

The conclusion is that regardless of the distribution of $\varepsilon_t$, if the sequence $x_t$ is well behaved (as defined by our assumptions) and if the distribution of $\varepsilon_t$ meets the requirements above, then the OLS residuals can be used to derive a consistent estimate of $\alpha$. We need only shift the intercept of the estimated function until all residuals (save for the one support point) have the correct sign. As before, ancillary parameters of the disturbance distribution can now be consistently estimated using the moments of the observed residuals.
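The shifted-intercept procedure just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's own computation: the function names, the single bounded regressor drawn uniformly on [0, 5], the exponential disturbance, and the parameter values are all hypothetical choices made for the example. The slope is estimated by OLS, the intercept is set to the smallest value of $y_t - b x_t$, and the ancillary parameter $\lambda$ is then recovered from the mean of the shifted residuals using $\mu = 1/\lambda$.

```python
import random


def ols_slope(x, y):
    """OLS slope in a regression of y on a constant and a single regressor x."""
    xbar = sum(x) / len(x)
    ybar = sum(y) / len(y)
    sxy = sum((xt - xbar) * (yt - ybar) for xt, yt in zip(x, y))
    sxx = sum((xt - xbar) ** 2 for xt in x)
    return sxy / sxx


def shifted_intercept_fit(x, y):
    """Keep the OLS slope b; set the intercept to the minimum of y_t - b * x_t."""
    b = ols_slope(x, y)
    partial = [yt - b * xt for xt, yt in zip(x, y)]
    a = min(partial)
    residuals = [p - a for p in partial]  # all >= 0, at least one equal to 0
    return a, b, residuals


def simulate(alpha, beta, lam, T, seed=0):
    """Simulate y_t = alpha + beta * x_t + eps_t with eps_t ~ exponential(rate=lam)."""
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 5.0) for _ in range(T)]  # bounded regressor
    y = [alpha + beta * xt + rng.expovariate(lam) for xt in x]
    return x, y


# Hypothetical true values, for illustration only.
x, y = simulate(alpha=1.0, beta=0.5, lam=2.0, T=2_000)
a_hat, b_hat, residuals = shifted_intercept_fit(x, y)
lam_hat = 1.0 / (sum(residuals) / len(residuals))  # moment estimate: mean = 1/lambda

print("intercept:", a_hat, "slope:", b_hat, "lambda:", lam_hat)
```

By construction every residual is non-negative and at least one is exactly zero, so the fitted line lies on the correct side of all observations.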

3.2. Maximum likelihood estimation

Consistent estimates of all of the parameters of the frontier function can be obtained using only a simple modification of the ordinary least squares estimator. However, the distribution of $\varepsilon$ is necessarily asymmetric. A maximum likelihood estimator which makes use of this information should be more efficient, at least asymptotically. Maximum likelihood estimation requires a particular assumption about the distribution of the disturbance. The linear and quadratic programming estimators of Aigner and Chu are maximum likelihood if the distribution of $\varepsilon_t$ is assumed to be exponential [(2.3)], which has $\mu = 1/\lambda$ and $\sigma^2 = 1/\lambda^2$, in the first case and truncated normal [(2.4)], which has $\mu = \phi\sqrt{2}/\sqrt{\pi}$ and $\sigma^2 = \phi^2(\pi - 2)/\pi$, in the second.⁴

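⁴In obtaining moments of the half-normal distribution, the gamma integral

$$\int_0^\infty x^{P-1} e^{-a x^{r}}\,dx = (1/r)\,a^{-P/r}\,\Gamma(P/r),$$

where $\Gamma(R) = \int_0^\infty x^{R-1} e^{-x}\,dx$ for $R > 0$, $\Gamma(R) = (R-1)\Gamma(R-1)$ for $R > 1$, and $\Gamma(1/2) = \sqrt{\pi}$, has been used.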
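The use of the gamma integral for the half-normal moments can be traced numerically. The sketch below is an illustration with hypothetical names and an arbitrary scale value `phi = 1.7`; it assumes the half-normal density $2/(\sqrt{2\pi}\phi)\,e^{-\varepsilon^2/(2\phi^2)}$ and evaluates $E(\varepsilon^k)$ by applying the gamma integral with $P = k+1$, $a = 1/(2\phi^2)$ and $r = 2$.

```python
import math


def gamma_integral(P, a, r):
    """Integral of x**(P-1) * exp(-a * x**r) over (0, inf): (1/r) * a**(-P/r) * Gamma(P/r)."""
    return (1.0 / r) * a ** (-P / r) * math.gamma(P / r)


def half_normal_moment(k, phi):
    """E(eps**k) for the half-normal density 2 / (sqrt(2*pi)*phi) * exp(-eps**2 / (2*phi**2))."""
    norm_const = 2.0 / (math.sqrt(2.0 * math.pi) * phi)
    return norm_const * gamma_integral(k + 1, 1.0 / (2.0 * phi ** 2), 2.0)


phi = 1.7  # hypothetical scale value, for illustration only
total_mass = half_normal_moment(0, phi)
mean = half_normal_moment(1, phi)
variance = half_normal_moment(2, phi) - mean ** 2

print("density integrates to:", total_mass)
print("mean:", mean, "closed form:", phi * math.sqrt(2.0) / math.sqrt(math.pi))
print("variance:", variance, "closed form:", phi ** 2 * (math.pi - 2.0) / math.pi)
```

The two computed moments should agree with the closed forms quoted in the text up to floating point error.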


The asymptotic properties of these maximum likelihood estimators remain to be established. The usual approach to the problem for econometric models is to extend the analysis of Cramér (1946) to the regression case at hand as, for example, in Barnett (1976) or Amemiya (1973).⁵ Unfortunately, this approach requires a regularity condition which is absent here. In particular, let $a_T$ be the vector of efficient scores $(1/T)\,\partial \log L/\partial \boldsymbol{\phi}$, where $L$ is the likelihood function based on a sample of size $T$ and $\boldsymbol{\phi}$ is the vector of unknown parameters being estimated. The standard approach requires either $E_{\boldsymbol{\phi}}(a_T) = 0$ or $\lim_{T \to \infty} a_T = 0$. Unfortunately, as shown below, neither result can be established for either of the aforementioned distributions. On the other hand, the consistency proof of Wald (1949) requires less stringent regularity conditions, and it would seem that it could be suitably modified to apply to the LP and QP frontier estimators. The argument in Kendall and Stuart (1973, p. 42), for example, which is based on this proof, should provide a suitable framework.

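Of course, the (appropriately modified) OLS estimator is also consistent, and more easily computed. Our interest, rather, is in asymptotic efficiency. Unfortunately, the observation that these estimators are maximum likelihood is of little value in this regard, as it provides no guidance as to how to formulate or estimate standard errors. Consider the following exact expectations based on a sample of size $T$, $(y_t, x_t)$. For the exponential case, with $\boldsymbol{\phi} = (\lambda, \alpha, \beta)$,

$$E[(1/T)\,\partial \log L/\partial \boldsymbol{\phi}] = \begin{bmatrix} 0 \\ \lambda \\ \lambda\bar{x} \end{bmatrix} \tag{3.2a}$$

and

$$-E[(1/T)\,\partial^2 \log L/\partial \boldsymbol{\phi}\,\partial \boldsymbol{\phi}'] = \begin{bmatrix} \lambda^{-2} & -1 & -\bar{x} \\ -1 & 0 & 0 \\ -\bar{x} & 0 & 0 \end{bmatrix}, \tag{3.2b}$$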

where $\bar{x}$ is the sample mean of the observations on $x_t$. Note, $E(a_T) \ne 0$. Moreover, both $E(a_T a_T')$ and $E[(1/T)\,\partial^2 \log L/\partial \boldsymbol{\phi}\,\partial \boldsymbol{\phi}']$ are singular in every sample, so regardless of what is assumed about the data, the proof of consistency, efficiency and asymptotic normality of Cramér will break down. (The characteristic roots of $E(a_T a_T')$ are $1/\lambda^2$, $\lambda^2 + \lambda^2\bar{x}^2$, and 0.) For the half-normal case, with $\boldsymbol{\phi} = (\phi^2, \alpha, \beta)$,

$$E[(1/T)\,\partial \log L/\partial \boldsymbol{\phi}] = \begin{bmatrix} 0 \\ \sqrt{2}/(\sqrt{\pi}\,\phi) \\ \sqrt{2}\,\bar{x}/(\sqrt{\pi}\,\phi) \end{bmatrix} \tag{3.3a}$$

and

$$-E[(1/T)\,\partial^2 \log L/\partial \boldsymbol{\phi}\,\partial \boldsymbol{\phi}'] = \begin{bmatrix} 1/(2\phi^4) & \sqrt{2}/(\sqrt{\pi}\,\phi^3) & \sqrt{2}\,\bar{x}/(\sqrt{\pi}\,\phi^3) \\ \sqrt{2}/(\sqrt{\pi}\,\phi^3) & 1/\phi^2 & \bar{x}/\phi^2 \\ \sqrt{2}\,\bar{x}/(\sqrt{\pi}\,\phi^3) & \bar{x}/\phi^2 & \overline{x^2}/\phi^2 \end{bmatrix}, \tag{3.3b}$$

where $\overline{x^2}$ is the sample mean square of the observations on $x_t$. Again, $E(a_T) \ne 0$, and clearly $\lim_{T \to \infty} a_T$ will not vanish either. It is easily verified that the expectation in (3.3b) is not positive definite, so again the results of Cramér cannot be applied.

None of the established results for MLE’s of regression models apply here, and it is not clear how asymptotic standard errors for these estimators can be obtained or what asymptotic distribution is appropriate.

⁵The extension is necessary because $(y_1, y_2, \ldots, y_T)$ do not constitute a random sample. While independent (in most settings) the observations are not identically distributed; each has its own mean. One approach is simply to consider repeated sampling of the multivariate observation $(y_t, x_t)$, $t = 1, \ldots, T$, as in Theil (1971, sec. 8.1). Alternatively, one can establish the necessary results directly as do Barnett (1976) and Amemiya (1973).
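The failure of the score condition in the exponential case can be seen directly in a small computation. The sketch below is illustrative only. It assumes the model $y_t = \alpha + \beta x_t + \varepsilon_t$ with a single regressor and the exponential density (2.3), so that the per-observation log density is $\log\lambda - \lambda(y_t - \alpha - \beta x_t)$; the function names and parameter values are hypothetical. It evaluates the average score at the true parameters on simulated data, and the determinant of the matrix obtained by differentiating that log density twice.

```python
import random


def average_score(lam, alpha, beta, x, y):
    """(1/T) dlogL/d(lam, alpha, beta) for the exponential frontier model."""
    T = len(y)
    eps = [yt - alpha - beta * xt for xt, yt in zip(x, y)]
    d_lam = sum(1.0 / lam - e for e in eps) / T  # mean zero at the true lambda
    d_alpha = lam                                # constant, never zero
    d_beta = lam * sum(x) / T                    # lambda times the sample mean of x
    return [d_lam, d_alpha, d_beta]


def neg_expected_hessian(lam, xbar):
    """-E[(1/T) d2logL/dphi dphi'] for phi = (lam, alpha, beta)."""
    return [[1.0 / lam ** 2, -1.0, -xbar],
            [-1.0, 0.0, 0.0],
            [-xbar, 0.0, 0.0]]


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


rng = random.Random(0)
lam, alpha, beta, T = 2.0, 1.0, 0.5, 5_000  # hypothetical true values
x = [rng.uniform(0.0, 5.0) for _ in range(T)]
y = [alpha + beta * xt + rng.expovariate(lam) for xt in x]

score = average_score(lam, alpha, beta, x, y)
xbar = sum(x) / T
H = neg_expected_hessian(lam, xbar)

print("average score at the truth:", score)
print("determinant of the matrix:", det3(H))
```

Only the $\lambda$ component of the score averages out to zero. The $\alpha$ and $\beta$ components equal $\lambda$ and $\lambda\bar{x}$ in every sample, and the last two rows of the matrix are proportional, so its determinant is zero whatever the data.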

The problem, as Schmidt has correctly diagnosed, arises because the range of the observed random variable depends upon the parameters being estimated. Consequently, this case is an irregular one as far as maximum likelihood estimation is concerned. Some special considerations are required to handle this violation of the usual regularity conditions. We note, though, that aside from the range problem, if one is willing to make the necessary assumptions about the exogenous variables, then the two distributions proposed by Schmidt imply a likelihood function which is otherwise well behaved. (That is, there are no discontinuities in the density function of the observed random variable anywhere over its range, all required derivatives exist and are finite, etc.) But, as he observes, the range problem will persist regardless of what distribution is chosen for $\varepsilon$.

The range problem described above does not preclude us from establishing the desirable large sample properties of all maximum likelihood estimates for frontier functions. Nor, as shown below, does it necessarily invalidate the standard analyses usually applied to better behaved problems (as, for example, by Amemiya), including the use of the conventionally computed information matrix to form standard errors.

Assume, for now, that no regression is involved, so that the results for random sampling apply to the observations $y_t$, $t = 1, 2, \ldots, T$, and consider the general problem of maximum likelihood estimation of a parameter vector $\boldsymbol{\phi}$. Let $f(y_t, \boldsymbol{\phi})$ be the common density function of $y_t$, $t = 1, \ldots, T$, and assume that the following regularity conditions all apply:

(1) The parameter space $\Phi$, which may be restricted, for example to exclude non-positive variances, is compact and contains an open neighborhood of the true value of $\boldsymbol{\phi}$, $\boldsymbol{\phi}_0$.

(2) $f(y_t, \boldsymbol{\phi})$ is positive, continuous, and three times continuously differentiable with respect to $\boldsymbol{\phi}$ everywhere in the range of $y_t$ and for all values of $\boldsymbol{\phi}$ in an interval $A$ which contains $\boldsymbol{\phi}_0$ as an interior point.

(3) The range of $y_t$ is independent of $\boldsymbol{\phi}$.

(4) $\partial \log f(y_t, \boldsymbol{\phi})/\partial \boldsymbol{\phi}$ has a positive definite variance matrix with finite elements.

(5) The absolute values of the third derivatives of $\log f(y_t, \boldsymbol{\phi})$ with respect to $\boldsymbol{\phi}$ are bounded by a bounded integrable function of $y_t$ which does not depend on $\boldsymbol{\phi}$.

(Note that the estimation problem defined here is a particularly well behaved one.)

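Now, by definition

$$\int f(y_t, \boldsymbol{\phi})\,dy_t = 1,$$

so that

$$\frac{\partial}{\partial \boldsymbol{\phi}} \int f(y_t, \boldsymbol{\phi})\,dy_t = 0.$$

Conditions 2 and 3 (by Leibnitz’s rule) allow the interchange of the order of integration and differentiation so that we also have (at least in $A$)

$$\int \frac{\partial f(y_t, \boldsymbol{\phi})}{\partial \boldsymbol{\phi}}\,dy_t = 0.$$

It is now easily shown that

$$E_{\boldsymbol{\phi}_0}[\partial \log L/\partial \boldsymbol{\phi}] = 0, \tag{3.4}$$

where $L$ is the likelihood function, $\prod_{t=1}^{T} f(y_t, \boldsymbol{\phi})$, and $E_{\boldsymbol{\phi}_0}(\cdot)$ indicates that the expectation is taken at the true parameter value. Similarly,

$$\frac{\partial}{\partial \boldsymbol{\phi}'} \int f_{\phi}\,dy_t = 0,$$

where $f_{\phi} = \partial f(y_t, \boldsymbol{\phi})/\partial \boldsymbol{\phi}$; and, interchanging again,

$$\int f_{\phi\phi'}\,dy_t = 0,$$

where $f_{\phi\phi'} = \partial^2 f(y_t, \boldsymbol{\phi})/\partial \boldsymbol{\phi}\,\partial \boldsymbol{\phi}'$. This, in turn, yields

$$E_{\boldsymbol{\phi}_0}\!\left[\frac{\partial^2 \log f}{\partial \boldsymbol{\phi}\,\partial \boldsymbol{\phi}'}\right] = -E_{\boldsymbol{\phi}_0}\!\left[\frac{\partial \log f}{\partial \boldsymbol{\phi}}\,\frac{\partial \log f}{\partial \boldsymbol{\phi}'}\right]$$

and

$$E_{\boldsymbol{\phi}_0}\!\left[\frac{\partial^2 \log L}{\partial \boldsymbol{\phi}\,\partial \boldsymbol{\phi}'}\right] = -E_{\boldsymbol{\phi}_0}\!\left[\frac{\partial \log L}{\partial \boldsymbol{\phi}}\,\frac{\partial \log L}{\partial \boldsymbol{\phi}'}\right]. \tag{3.5}$$

With these results in hand, and with the other regularity conditions, consistency, asymptotic efficiency, and asymptotic normality of the maximum likelihood estimator can now be established [as, for example, in Kendall and Stuart (1973, ch. 18)]. The negative of the inverse of $E(\partial^2 \log L/\partial \boldsymbol{\phi}\,\partial \boldsymbol{\phi}')$ provides the appropriate asymptotic covariance matrix for the maximum likelihood estimator.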

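Independence of the range of $y_t$ of the parameters in $\boldsymbol{\phi}$ is normally included among the regularity conditions in order to make the interchange of integration and differentiation needed to establish (3.4) and (3.5) permissible. In fact, given Condition 2, Condition 3 is not necessary but only sufficient. The results in (3.4) and (3.5) can be established without Condition 3 provided other conditions are met.⁶ This has direct relevance for the estimation of frontier functions, as this range problem is generally the only one which prevents the likelihood function from being perfectly well behaved. If we can establish (3.4) and (3.5) (or the necessary analog for the regression case), then, in spite of this violation of the regularity conditions, the analysis of Barnett or Amemiya for maximum likelihood estimation may be appealed to directly to establish the properties of the MLE. (The simple regression model generally considered in the frontier case is covered a fortiori by their results for more involved cases.)

Suppose, then, that $f(y_t, \boldsymbol{\phi})$ satisfies all of the regularity conditions except that the range of $y_t$ depends upon $\boldsymbol{\phi}$. In particular, assume $l(\boldsymbol{\phi}) \le y_t \le u(\boldsymbol{\phi})$. As always,

$$\int_{l(\boldsymbol{\phi})}^{u(\boldsymbol{\phi})} f(y_t, \boldsymbol{\phi})\,dy_t = 1.$$

In $A$, given Condition 2,

$$\frac{\partial}{\partial \boldsymbol{\phi}} \int_{l(\boldsymbol{\phi})}^{u(\boldsymbol{\phi})} f(y_t, \boldsymbol{\phi})\,dy_t = \int_{l(\boldsymbol{\phi})}^{u(\boldsymbol{\phi})} \frac{\partial f(y_t, \boldsymbol{\phi})}{\partial \boldsymbol{\phi}}\,dy_t + f(u(\boldsymbol{\phi}), \boldsymbol{\phi})\,\frac{\partial u(\boldsymbol{\phi})}{\partial \boldsymbol{\phi}} - f(l(\boldsymbol{\phi}), \boldsymbol{\phi})\,\frac{\partial l(\boldsymbol{\phi})}{\partial \boldsymbol{\phi}} = 0. \tag{3.6}$$

⁶The following argument is suggested by Kendall and Stuart (1973, p. 35).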


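[A proof of this form of Leibnitz’s rule may be found in, e.g., Kaplan (1952, p. 221).] The second and third terms after the first equality vanish when the range of $y_t$ is independent of $\boldsymbol{\phi}$, as $\partial u(\boldsymbol{\phi})/\partial \boldsymbol{\phi} = \partial l(\boldsymbol{\phi})/\partial \boldsymbol{\phi} = 0$. However, they also vanish when $f(u(\boldsymbol{\phi}), \boldsymbol{\phi}) = f(l(\boldsymbol{\phi}), \boldsymbol{\phi}) = 0$, even if the range of $y_t$ depends upon $\boldsymbol{\phi}$. The requirement is simply that $f(y, \boldsymbol{\phi})$ be zero at its terminal points. Distributions for which this is the case are numerous. For example,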

is one; the lognormal,

is another. [See Kendall and Stuart (1973, p. 168).]

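Assuming $f(y, \boldsymbol{\phi})$ vanishes at the terminals, we have, in $A$,

$$\int_{l(\boldsymbol{\phi})}^{u(\boldsymbol{\phi})} \frac{\partial f(y_t, \boldsymbol{\phi})}{\partial \boldsymbol{\phi}}\,dy_t = 0,$$

from which (3.4) follows immediately. Applying the previous result,

$$\frac{\partial}{\partial \boldsymbol{\phi}'} \int_{l(\boldsymbol{\phi})}^{u(\boldsymbol{\phi})} \frac{\partial f}{\partial \boldsymbol{\phi}}\,dy_t = \int_{l(\boldsymbol{\phi})}^{u(\boldsymbol{\phi})} \frac{\partial^2 f}{\partial \boldsymbol{\phi}\,\partial \boldsymbol{\phi}'}\,dy_t + \left.\frac{\partial f}{\partial \boldsymbol{\phi}}\right|_{y_t = u(\boldsymbol{\phi})} \frac{\partial u(\boldsymbol{\phi})}{\partial \boldsymbol{\phi}'} - \left.\frac{\partial f}{\partial \boldsymbol{\phi}}\right|_{y_t = l(\boldsymbol{\phi})} \frac{\partial l(\boldsymbol{\phi})}{\partial \boldsymbol{\phi}'} = 0. \tag{3.7}$$

If the derivatives of $f(y_t, \boldsymbol{\phi})$ with respect to $\boldsymbol{\phi}$ are zero when $y_t$ is at the upper and lower terminal points of the distribution, then the operations of differentiation and integration can again be interchanged. From here it is straightforward to obtain (3.5). All of the familiar asymptotic properties for maximum likelihood estimators can now be invoked directly, as the assumption about the range of $y_t$ plays no further role. [See, for example, Kendall and Stuart (1973, ch. 18).]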

It remains to extend these results to the frontier function (regression) case. Consider, then, a disturbance specification for the model (3.1) with continuous p.d.f. $f(\varepsilon)$. Two sets of parameters will be involved here. First are the ancillary parameters of the disturbance distribution, such as $\lambda$ in (2.3); second are the intercept and slopes of the frontier function. The range of $y_t$ is free of the first set. Maximum likelihood estimation of them presents no unusual problems so long as $f(\varepsilon)$ is regular enough with respect to these parameters, which we will assume. Hence, they will be ignored in what follows with no loss of generality. Let $\phi = (\alpha, \beta)$. Then $l(\phi) = \alpha + \beta x_t$, while $u(\phi)$


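is irrelevant. Given $f_\varepsilon(\varepsilon)$ and $y_t = \alpha + \beta x_t + \varepsilon_t$, so that the Jacobian of the transformation from $\varepsilon_t$ to $y_t$ is unity, we have

$$f_y(y_t,\phi) = f_\varepsilon(y_t - l(\phi)).$$

To establish (3.4) we need, from (3.6),

$$f_y(y_t,\phi)\big|_{y_t = l(\phi)} = f_\varepsilon(0) = 0. \qquad (3.8)$$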

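This simply requires that the disturbance density be zero at the contact point. Obviously, this does not hold for the exponential distribution, for which $f_\varepsilon(0) = \lambda$, nor for the half-normal distribution, for which $f_\varepsilon(0) = 2/(\sigma\sqrt{2\pi})$. It does hold, however, for the lognormal distribution. For (3.5), by (3.7), we require

$$\left.\frac{\partial f_\varepsilon(\varepsilon)}{\partial\varepsilon}\right|_{\varepsilon=0} = f_\varepsilon'(0) = 0. \qquad (3.9)$$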

Now, assume that (3.1), Assumptions 1-5, Regularity Conditions 1, 2, 4, and 5, and (3.8) and (3.9) all apply to the problem at hand. The problem of maximum likelihood estimation of $\phi$ thus implied is an extremely well behaved one, and the already established results for MLEs for regression models which are (essentially) linear in their parameters can be invoked directly. (While this set of conditions is quite stringent for the general problem of maximum likelihood estimation, it places virtually no restriction on the sorts of empirical models typically specified for production frontiers.) Thus, there should be no obstacle to ascribing the usual maximum likelihood properties to the MLE of $\phi$. For example, verification can follow the analysis of Amemiya. We note, in passing, that the extension of the results to a nonlinear $l(\phi)$ would be direct, and should not greatly complicate the analysis. An additional assumption about boundedness of the derivatives of $l(\phi)$ would be all that is required.
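Condition (3.8) can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it assumes a location model $y = \alpha + \varepsilon$ with a gamma disturbance of rate `lam` and shape `P`, and uses a simple midpoint rule to evaluate the integral of $\partial f_y/\partial\alpha$ over the range of $y$, which is the expected score for $\alpha$. The function names and the integration settings are hypothetical choices made for illustration.

```python
import math


def gamma_pdf_slope(eps, lam, P):
    """Derivative f'(eps) of the gamma density with rate lam and shape P, eps > 0."""
    log_f = P * math.log(lam) - math.lgamma(P) + (P - 1.0) * math.log(eps) - lam * eps
    return math.exp(log_f) * ((P - 1.0) / eps - lam)


def expected_location_score(lam, P, upper=60.0, n=200_000):
    """Midpoint-rule value of the integral of d f_y / d alpha over the range of y.

    With y = alpha + eps, d f_y / d alpha = -f'(eps), so the integral equals
    f(0) - f(upper), which is f(0) up to truncation error.
    """
    h = upper / n
    total = 0.0
    for i in range(n):
        eps = (i + 0.5) * h          # midpoints avoid evaluating at eps = 0
        total -= gamma_pdf_slope(eps, lam, P)
    return total * h


if __name__ == "__main__":
    # P = 1 is the exponential case: the expected score equals f(0) = lam, not 0.
    print("exponential (P=1):", expected_location_score(2.0, 1.0))
    # P = 3 has f(0) = 0, so the expected score vanishes as (3.8) requires.
    print("gamma (P=3):      ", expected_location_score(2.0, 3.0))
```

In this sketch the exponential case returns a value close to `lam`, so the score for the intercept does not have mean zero, while the shape-3 gamma case returns a value close to zero.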


To summarize, then, the difficulties in maximum likelihood estimation of

frontier functions are not inherent in the problem. Given the set of regularity

conditions and assumptions listed above, the MLE of (3.1) will be consistent, asymptotically efficient and asymptotically normally distributed. An appropriate estimator of the asymptotic covariance matrix for the MLE will be provided by the second derivatives matrix of the log likelihood function.

What is required is a careful choice of the disturbance model.

4. A disturbance specification

The exponential and truncated normal distributions do not satisfy (3.8) and (3.9). There are, however, numerous distributions which do meet these requirements. One which is particularly attractive for the frontier estimator is the gamma density,

$$f(\varepsilon) = G(\lambda, P) = \frac{\lambda^P}{\Gamma(P)}\,\varepsilon^{P-1} e^{-\lambda\varepsilon}, \qquad \varepsilon \ge 0,\ \lambda > 0,\ P > 2. \qquad (4.1)$$

The mean and variance of $\varepsilon$ are $\mu = P/\lambda$ and $\sigma^2 = P/\lambda^2$. The presence of two free parameters in $f(\varepsilon)$ obviates the possibly unwarranted assumption of a functional relationship between $\mu$ and $\sigma^2$ implicit in (2.3) and (2.4). For the general case of $G(\lambda, P)$, P must be positive. The restriction $P > 2$ gives (3.8) and (3.9).

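The log of the likelihood function for this disturbance model is

$$\log L = TP\log\lambda - T\log\Gamma(P) + (P-1)\sum_t \log(y_t - \alpha - \beta'x_t) - \lambda\sum_t (y_t - \alpha - \beta'x_t). \qquad (4.2)$$

The first derivatives of the log likelihood are

$$\begin{bmatrix} \partial\log L/\partial\lambda \\ \partial\log L/\partial P \\ \partial\log L/\partial\alpha \\ \partial\log L/\partial\beta \end{bmatrix} = [\partial\log L/\partial\phi] = \begin{bmatrix} TP/\lambda - \sum_t \varepsilon_t \\ T\log\lambda - T\,\Gamma'(P)/\Gamma(P) + \sum_t \log\varepsilon_t \\ T\lambda - (P-1)\sum_t (1/\varepsilon_t) \\ T\lambda\bar{x} - (P-1)\sum_t (1/\varepsilon_t)x_t \end{bmatrix} \qquad (4.3)$$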


where $\varepsilon_t = y_t - \alpha - \beta'x_t$ and $\bar{x}$ is the (K x 1) column vector of sample means of the K variables in x. With the exception of $\partial\log L/\partial P$, it is simple to verify $E(\partial\log L/\partial\phi) = 0$. $E(\varepsilon) = P/\lambda$ and $E(1/\varepsilon) = \lambda/(P-1)$ are found by direct integration, from which $E(\partial\log L/\partial\lambda) = E(\partial\log L/\partial\alpha) = E(\partial\log L/\partial\beta_k) = 0$ follow directly. To find $E(\ln\varepsilon)$, let $v = \lambda\varepsilon$, so $E(\ln\varepsilon) = E(\ln v) - \ln\lambda$. The distribution of v is easily shown to be G(1, P). While the distribution of ln v is messy, the cumulants of ln v are simply $\kappa_r(\ln v) = d^r\ln\Gamma(P)/dP^r$. [Kendall and Stuart (1973, p. 177).] In general, $\kappa_1 = \mu$ and $\kappa_2 = \sigma^2$. Thus, $E(\ln v) = \Gamma'(P)/\Gamma(P)$, and $E(\partial\log L/\partial P) = 0$ follows immediately.

The second derivatives of the log likelihood are

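$$\frac{\partial^2\log L}{\partial\phi\,\partial\phi'} = -\begin{bmatrix} TP/\lambda^2 & -T/\lambda & -i'X \\ -T/\lambda & T(\Gamma\Gamma'' - \Gamma'^2)/\Gamma^2 & \sum_t (1/\varepsilon_t)x_t' \\ -X'i & \sum_t (1/\varepsilon_t)x_t & (P-1)\sum_t (1/\varepsilon_t^2)x_t x_t' \end{bmatrix}, \qquad (4.4)$$

where i is a column vector of ones, and the intercept has been included in the vector of slopes. The only new result required to derive the exact expectations for (4.4) is $E(1/\varepsilon_t^2) = \lambda^2/((P-1)(P-2))$. This gives

$$T\Sigma_T = -E\left[\frac{\partial^2\log L}{\partial\phi\,\partial\phi'}\right] = \begin{bmatrix} A & \delta\, i'X \\ X'i\,\delta' & \dfrac{\lambda^2}{P-2}X'X \end{bmatrix}, \qquad (4.5)$$

where

$$A = T\begin{bmatrix} P/\lambda^2 & -1/\lambda \\ -1/\lambda & (\Gamma\Gamma'' - \Gamma'^2)/\Gamma^2 \end{bmatrix}, \qquad \delta = \begin{bmatrix} -1 \\ \lambda/(P-1) \end{bmatrix},$$

and

$\Gamma = \Gamma(P)$, $\Gamma' = d\Gamma(P)/dP$, $\Gamma'' = d^2\Gamma(P)/dP^2$.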

Straightforward, though tedious, algebra verifies that $\Sigma_T$ gives the exact covariance matrix of $u_T = (1/\sqrt{T})\,\partial\log L/\partial\phi$ for all $T \ge (K + 1)$. Assumptions 4 and 5 guarantee that $\Sigma_T$ converges to a positive definite matrix with finite elements. All of the other regularity conditions listed above (Conditions 1, 2,


4, and 5) are easily verified. Thus, we will, at this point, invoke the results of Cramér to assert the consistency, asymptotic efficiency, and asymptotic normality of the vector of parameter estimates $\hat\phi$ which maximize (4.2). The limiting distribution of $\sqrt{T}(\hat\phi - \phi_0)$ will be normal with mean 0 and covariance matrix $\Sigma^{-1}$, where $\Sigma = \lim_{T\to\infty}\Sigma_T$; $(1/T)\Sigma^{-1}$, which we will estimate by $(1/T)\hat\Sigma_T^{-1}$, gives the asymptotic covariance matrix for $\hat\phi$.
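The log likelihood and its first derivatives can be written down directly from the gamma density (4.1). The following is a minimal sketch for a single regressor, not the author's program: the function names are hypothetical, and `digamma` is an assumed helper that approximates $\Gamma'(P)/\Gamma(P)$ by a recurrence and a short asymptotic series rather than by the quadrature used in the paper.

```python
import math


def digamma(x):
    """Approximate Gamma'(x)/Gamma(x) for x > 0 (recurrence plus asymptotic series)."""
    acc = 0.0
    while x < 6.0:                  # shift the argument upward: psi(x) = psi(x+1) - 1/x
        acc -= 1.0 / x
        x += 1.0
    t = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - t * (1 / 12 - t * (1 / 120 - t / 252))


def residuals(alpha, beta, y, x):
    return [yt - alpha - beta * xt for yt, xt in zip(y, x)]


def log_likelihood(lam, P, alpha, beta, y, x):
    """Gamma frontier log likelihood; minus infinity outside the open set eps_t > 0."""
    eps = residuals(alpha, beta, y, x)
    if min(eps) <= 0.0:
        return float("-inf")
    T = len(eps)
    return (T * P * math.log(lam) - T * math.lgamma(P)
            + (P - 1.0) * sum(math.log(e) for e in eps) - lam * sum(eps))


def score(lam, P, alpha, beta, y, x):
    """First derivatives with respect to (lam, P, alpha, beta)."""
    eps = residuals(alpha, beta, y, x)
    T = len(eps)
    inv = [1.0 / e for e in eps]
    d_lam = T * P / lam - sum(eps)
    d_P = T * math.log(lam) - T * digamma(P) + sum(math.log(e) for e in eps)
    d_alpha = T * lam - (P - 1.0) * sum(inv)
    d_beta = lam * sum(x) - (P - 1.0) * sum(xt * w for xt, w in zip(x, inv))
    return [d_lam, d_P, d_alpha, d_beta]
```

A quick way to gain confidence in such code is to compare `score` with central finite differences of `log_likelihood` at a point where all residuals are positive.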

The Gamma distribution is obviously asymmetric, hence maximum likelihood estimation of the parameters in (4.2) should be more efficient than least squares, which takes no account of that fact. It will also, unlike OLS, give a consistent estimate of $\alpha$. Ignoring, for now, the inconsistency of the OLS intercept term, we would expect the gain in efficiency obtained by ML to be related to the degree of skewness of the distribution. The skewness coefficient, $E(\varepsilon - E(\varepsilon))^3/\sigma^3$, is readily shown to be $2/\sqrt{P}$.⁷ The parameter P is clearly crucial. Intuitively, we might expect the greatest efficiency gain from ML when P is small (near 2).

In fact, a more direct efficiency comparison is available. The exact variance of the OLS estimator is, as always, $\sigma^2(X_*'X_*)^{-1} = (P/\lambda^2)(X_*'X_*)^{-1}$. The lower right block of the inverse of $T\Sigma_T$ provides the basis upon which we will estimate the asymptotic variance matrix of the maximum likelihood estimator. As an initial approximation, assume $T\Sigma_T$ is block-diagonal. Then the appropriate variance matrix is derived from $((P-2)/\lambda^2)(X_*'X_*)^{-1} = ((P-2)/P)\sigma^2(X_*'X_*)^{-1}$. Again, a small value of P will suggest a large efficiency gain of ML over OLS. As a first guess, the number P/(P-2) should be indicative of the relative asymptotic efficiency of ML over OLS for this model. The limiting case of P = 2 (which is inadmissible) results in a singular second derivatives matrix, while large values of P imply no gain. In accordance with the earlier result, large P implies a symmetric distribution.
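The two shape summaries used in this comparison are simple functions of P. The following short sketch (hypothetical function names) evaluates the first-guess efficiency ratio P/(P-2) and the skewness coefficient for the two values of P estimated in section 5; for P = 5.3804 it should come out close to the 1.59 and 0.8622 quoted there.

```python
import math


def efficiency_ratio(P):
    """First-guess asymptotic efficiency of ML relative to OLS, P/(P-2)."""
    if P <= 2.0:
        raise ValueError("the ratio is defined only for P > 2")
    return P / (P - 2.0)


def skewness(P):
    """Skewness coefficient of the gamma disturbance; it does not depend on the scale."""
    return 2.0 / math.sqrt(P)


if __name__ == "__main__":
    for P in (5.3804, 17.0716):     # estimates of P reported in section 5
        print(f"P = {P}: P/(P-2) = {efficiency_ratio(P):.4f}, "
              f"skewness = {skewness(P):.4f}")
```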

Unfortunately, $T\Sigma_T$ will never be block diagonal, even if all regressors are in deviation form, so long as there is an intercept in the model. The first row of $X'i\,\delta'$ is $T\delta'$ in every sample. Partitioning the inverse of $T\Sigma_T$, we get

$$\left[\frac{\lambda^2}{P-2}X'X - X'i\,\delta'A^{-1}\delta\, i'X\right]^{-1}.$$

This can be simplified to

(4.6)

7 This uses $E(\varepsilon^r) = \Gamma(P+r)/(\lambda^r\Gamma(P))$. Note that $\lambda$ has disappeared. This follows from the fact that $\varepsilon$ is just $(1/\lambda)v$ where $v \sim G(1, P)$, and a constant scale factor will not affect the shape of the distribution.


where

$$\rho = \frac{\lambda^2 (P-2)\,[(P-1)^2\gamma - (P-2)]}{P\,(P\gamma - 1)(P-1)^2},$$

and

$$\gamma = (\Gamma(P)\Gamma''(P) - (\Gamma'(P))^2)/(\Gamma(P))^2.$$

Consider the limiting cases $P \to 2$ and $P \to \infty$. As $P \to 2$, $\rho \to 0$ as does (P-2)/P, and the variance matrix vanishes as before.⁸ As $P \to \infty$, $\rho \to 0$ again, while $(P-2)/P \to 1$. Maximum likelihood estimation provides no gain over OLS if the error distribution in this model is symmetric. For the intermediate cases, we can appeal to the usual results to assert the relative asymptotic efficiency of the MLE.

For a final characterization of this distribution recall that $\varepsilon = v/\lambda$ where $v \sim G(1, P)$. Now, let $z = (v - E(v))/\sigma_v = (v - P)/\sqrt{P}$. The rth cumulant of z is $\kappa_r(z) = (r-1)!\,P^{1-r/2}$ $(r \ge 2)$. As $P \to \infty$, all cumulants except the second will go to zero. ($\kappa_1 = 0$ if $\mu = 0$.) This sequence of cumulants characterizes the normal distribution, so we may conclude that as $P \to \infty$, z tends to standard normality.⁹ As $\varepsilon$ is a linear transformation of z, we see that as $P \to \infty$, the distribution of $\varepsilon$ tends to normality. This implies that the maximum likelihood estimator should approach the ordinary least squares estimator.
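The approach to normality can be seen by tabulating the standardized cumulants. The sketch below assumes the standard result that the rth cumulant of $v \sim G(1, P)$ is $(r-1)!\,P$, so that the standardized variate z has rth cumulant $(r-1)!\,P^{1-r/2}$; the function name is hypothetical.

```python
import math


def standardized_cumulant(r, P):
    """r-th cumulant (r >= 2) of z = (v - P)/sqrt(P), where v has a unit-scale gamma law."""
    if r < 2:
        raise ValueError("use r >= 2; the first cumulant of z is zero")
    return math.factorial(r - 1) * P ** (1.0 - r / 2.0)


if __name__ == "__main__":
    # The second cumulant stays at one; the higher ones shrink as P grows.
    for P in (2.5, 10.0, 100.0, 10000.0):
        row = ", ".join(f"k{r} = {standardized_cumulant(r, P):.4f}" for r in (2, 3, 4))
        print(f"P = {P}: {row}")
```

The third cumulant of z is the skewness coefficient $2/\sqrt{P}$ and the fourth is the degree of excess 6/P, so both summaries used elsewhere in the paper are special cases of this formula.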

This last property makes the gamma density extremely attractive for estimating the production or cost frontier, as it implies that the model is quite flexible in the shapes of error distributions it will accommodate. Suppose the process generating the disturbances is such that the error distribution is symmetric, or nearly so. Our consistent estimate of $(\alpha, \beta')$ will allow us to discern this from the regression residuals. The maximum likelihood estimate of P will tend to be large, while $\hat\lambda$ will adjust to place the mean in the appropriate location. Moreover, the slope estimators should resemble the OLS estimators in value and efficiency, while the intercept term will now be a consistent estimate of $\alpha$. Alternatively, if the observations tend to be grouped close to the frontier, with only a relatively small number in the extreme range, then P should be small, the error distribution will be highly skewed, and we should expect the maximum likelihood estimator to be highly efficient relative to OLS.

This has an implication for the ‘average’ versus frontier estimators discussed in a number of studies. The average estimator is generally

8 This requires that $(P\gamma - 1)$ not go to zero. But $(P\gamma - 1) = (\lambda^2/T^2)|A|$, which must be positive for all P greater than 0. [Consider the asymptotic variance matrix of the MLE for $(\lambda, P)$ based on an observed sample of $\varepsilon$'s. This would be $A^{-1}$, which must be positive definite.]

9 Kendall and Stuart (1968, pp. 47, 68, 94-101, 136, 166-167).


understood to be OLS (although what it estimates is somewhat ambiguous),

while the frontier estimator corresponds to any of the methods thus far

described. The Gamma specification allows a relationship to be established between the average and frontier estimators. If the disturbances about the frontier estimator tend to be symmetrically distributed, we should expect the average estimator to be a displaced or simply scaled version of it with the

same shape. The more skewed the disturbances about the frontier are, the less it should resemble the ‘average’ estimator.

In summary, the Gamma density provides several useful results for the specification and estimation of frontier functions. First, it provides a

maximum likelihood estimator with all of the usual desirable properties. The asymptotic distribution of the estimator is easily derived, and the asymptotic

variance matrix is readily estimated. As shown below, the estimator and these variances are relatively easily computed. The ancillary parameters $\lambda$ and P provide additional information on the shape of the distribution with

which we may characterize our observations on relative (cost or technical) efficiency and offer some evidence on the relationship between the frontier and average estimators. In the case in which the error distribution is

symmetric in the relevant range, so that the simple modification of the results of OLS suggested in section 3.1 is reasonable and appropriate, this is what the maximum likelihood estimator should provide. Hence, the added

efficiency of ML will not be simply artificially built into the problem. When the error distribution is highly skewed away from the frontier, however, large gains in efficiency over OLS can be obtained by accounting for this

asymmetry.

For the gamma density with P > 2, maximum likelihood estimation of the parameters is a regular case. The problem is complicated somewhat because estimates of the likelihood function and its derivatives will involve the logarithms and reciprocals of the residuals. Hence, log L must be maximized in the open set $e_t = (y_t - \alpha - \beta'x_t) > 0$. But, as any $e_t$ approaches 0, log L goes to negative infinity, while $\partial\log L/\partial\phi$ will be unbounded. Therefore, the maximum must be at an interior point. It appears that this situation will arise for any distribution for which (3.8) and (3.9) hold, as the conditions seem to imply that $f(\varepsilon)$ must be of the form $\varepsilon^{h(\phi)}g(\varepsilon, \phi)$.

Since the expected values in (4.5) are known exactly, the method of scoring can be used to maximize (4.2) with respect to $\phi = (\lambda, P, \alpha, \beta')'$. For this procedure, the gamma function and its derivatives must be approximated. For the function $\Gamma(P)$,

$$\frac{d^r\Gamma(P)}{dP^r} = \int_0^\infty w^{P-1}(\log w)^r e^{-w}\, dw.$$


For our applications, a sixteen point Gauss-Laguerre quadrature was used to approximate $\Gamma^{(r)}(P)$,

$$\Gamma^{(r)}(P) \approx \sum_{i=1}^{16} v_i\, h(w_i, r, P),$$

where $h(w_i, r, P) = w_i^{P-1}(\log w_i)^r$, and $v_i$ and $w_i$ are the Gauss-Laguerre weights and nodes. [See IBM (1977, pp. 303-307).] The constraints P > 2 and $\lambda > 0$ were imposed by the method of squaring. The parameters P and $\lambda$ in (4.2) were replaced by $(P_*^2 + 2)$ and $\lambda_*^2$; then (4.2) was maximized with respect to $P_*$ and $\lambda_*$ without constraints.
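The quadrature step can be sketched in a few lines. This is an illustrative reconstruction, not the IBM routine cited above: the nodes are found by bisection using the interlacing of the zeros of successive Laguerre polynomials, the weights use the standard formula $v_i = w_i/((n+1)^2 L_{n+1}(w_i)^2)$, and all function names are hypothetical. The rule is exact for polynomial integrands of degree up to 31; for non-integer P or r > 0 the integrand is not a polynomial and the accuracy of a sixteen point rule is not checked here.

```python
import math


def laguerre(n, x):
    """Evaluate the Laguerre polynomial L_n(x) by the three-term recurrence."""
    prev, cur = 1.0, 1.0 - x
    if n == 0:
        return prev
    for k in range(1, n):
        prev, cur = cur, ((2 * k + 1 - x) * cur - k * prev) / (k + 1)
    return cur


def bisect_root(n, lo, hi):
    """Locate the single zero of L_n in (lo, hi) by bisection."""
    f_lo = laguerre(n, lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = laguerre(n, mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gauss_laguerre(n):
    """Nodes and weights of the n-point Gauss-Laguerre rule."""
    roots = []
    for k in range(1, n + 1):
        # Zeros of L_k interlace those of L_(k-1); 4k + 2 is a safe upper bracket.
        brackets = [0.0] + roots + [4.0 * k + 2.0]
        roots = [bisect_root(k, brackets[i], brackets[i + 1]) for i in range(k)]
    weights = [x / ((n + 1) ** 2 * laguerre(n + 1, x) ** 2) for x in roots]
    return roots, weights


def gamma_derivative(P, r, nodes, weights):
    """Quadrature approximation to the r-th derivative of Gamma(P)."""
    return sum(v * w ** (P - 1.0) * math.log(w) ** r for w, v in zip(nodes, weights))


if __name__ == "__main__":
    nodes, weights = gauss_laguerre(16)
    print("Gamma(5) by quadrature:", gamma_derivative(5.0, 0, nodes, weights))
    print("Gamma(5) from math.gamma:", math.gamma(5.0))
```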

In the iterative procedure, a slight modification of the scoring algorithm was necessary. The procedure would normally be $\phi^{(s+1)} = \phi^{(s)} + d^{(s)}$, where $d^{(s)} = H^{(s)}g^{(s)}$, $H^{(s)}$ is the inverse of (4.5), and $g^{(s)}$ is the current value of (4.3). However, the direction vector, d, was quite large in the early iterations, so the process became unstable. To compensate, the elements of $(\lambda^2/(P-2))X'X$ in (4.5) were multiplied by T prior to inversion. This substantially reduced the size of d at each iteration, and resulted in extremely slow convergence of the process. However, the computations at each iteration are simple, and even excessive numbers of iterations are quite inexpensive. The process was stopped when the relative change in the likelihood function was less than 0.00001.¹⁰
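The trade-off between step size and iteration count can be seen in a reduced problem. The sketch below is an assumption-laden simplification, not the procedure used in the paper: it applies scoring only to $(\lambda, P)$ for an observed sample of positive disturbances, inflates the whole information matrix by a factor `damp` (the paper inflates only the X'X block by T), and approximates the digamma and trigamma functions by asymptotic series. All names are hypothetical.

```python
import math


def digamma(x):
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x
        x += 1.0
    t = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - t * (1 / 12 - t * (1 / 120 - t / 252))


def trigamma(x):
    acc = 0.0
    while x < 6.0:
        acc += 1.0 / (x * x)
        x += 1.0
    t = 1.0 / (x * x)
    return acc + 1.0 / x + 0.5 * t + (1 / 6 - t * (1 / 30 - t / 42)) / (x * x * x)


def score_gamma_sample(eps, damp=1.0, tol=1e-5, max_iter=10_000):
    """Scoring for (lam, P) from a positive sample; returns (lam, P, iterations)."""
    T = len(eps)
    sum_e = sum(eps)
    sum_log = sum(math.log(e) for e in eps)
    mean = sum_e / T
    var = sum((e - mean) ** 2 for e in eps) / T
    lam, P = mean / var, mean * mean / var       # moment starting values

    def loglik(lam, P):
        return (T * P * math.log(lam) - T * math.lgamma(P)
                + (P - 1.0) * sum_log - lam * sum_e)

    old = loglik(lam, P)
    for it in range(1, max_iter + 1):
        g_lam = T * P / lam - sum_e
        g_P = T * (math.log(lam) - digamma(P)) + sum_log
        # Information matrix for (lam, P), inflated by damp to shorten the step.
        a11 = damp * T * P / lam ** 2
        a12 = -damp * T / lam
        a22 = damp * T * trigamma(P)
        det = a11 * a22 - a12 * a12
        d_lam = (a22 * g_lam - a12 * g_P) / det
        d_P = (a11 * g_P - a12 * g_lam) / det
        while lam + d_lam <= 0.0 or P + d_P <= 0.0:   # keep the iterate admissible
            d_lam, d_P = 0.5 * d_lam, 0.5 * d_P
        lam, P = lam + d_lam, P + d_P
        new = loglik(lam, P)
        if abs(new - old) <= tol * abs(old):          # relative change in log L
            break
        old = new
    return lam, P, it
```

With `damp` greater than one the steps are shorter and more iterations are needed, which mirrors the slow but inexpensive convergence described above.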

Starting values for the intercept and slopes were the modified ordinary least squares values of section 3.1. The intercept must be moved slightly further than this procedure dictates in order to ensure that no residual is zero. The modified set of residuals now has mean $\bar{e}$ and (unchanged) variance $s^2$, both positive. Since $E(\varepsilon) = P/\lambda$ and $V(\varepsilon) = P/\lambda^2$, appropriate consistent starting values for P and $\lambda$ are $\bar{e}^2/s^2$ and $\bar{e}/s^2$, respectively. The starting value for $\lambda$ is obviously positive, and the starting value for P was well over 2 in every case attempted.¹¹
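These starting values are method-of-moments estimates computed from shifted OLS residuals. A minimal sketch follows; the function name, the size of the extra `nudge`, and the use of the divisor T in the variance are assumptions made for illustration, since the text does not specify them.

```python
def gamma_starting_values(ols_residuals, nudge=1e-6):
    """Return (P0, lam0, shift) from OLS residuals, following the moment argument.

    The residuals are shifted by -min(e) + nudge so that every one is strictly
    positive; the intercept would be moved by the same amount in the opposite
    direction.
    """
    T = len(ols_residuals)
    shift = -min(ols_residuals) + nudge
    modified = [e + shift for e in ols_residuals]
    mean = sum(modified) / T
    s2 = sum((e - mean) ** 2 for e in modified) / T   # unchanged by the shift
    return mean * mean / s2, mean / s2, shift


if __name__ == "__main__":
    residuals = [-0.4, 0.1, 0.3, -0.2, 0.2]           # hypothetical OLS residuals
    P0, lam0, shift = gamma_starting_values(residuals)
    print(f"P0 = {P0:.4f}, lam0 = {lam0:.4f}, intercept shift = {shift:.6f}")
```

By construction P0/lam0 reproduces the mean of the shifted residuals and P0/lam0**2 reproduces their variance.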

10 For the function in the second application, with 5 parameters in addition to P and $\lambda$, convergence required about 220 iterations, but less than eight seconds of CPU time on an IBM-370. All computations were done in double precision.

11 This does not necessarily produce $\hat{P} > 2$. In all applications considered, however, $\hat{P}$ was greater than 5. It is easy to see, though, that $\hat{P}$ will almost surely be greater than 2 in any sample. The modified set of OLS residuals is obtained by shifting the intercept until every residual is positive. Let $e_{\min}$ be the minimum of the original OLS residuals. Then $e_{\min}$ must be less than 0. The mean of the modified residuals is simply $-e_{\min}$, while the variance remains $s^2$. Thus, $\hat{P} = e_{\min}^2/s^2$. The requirement $\hat{P} > 2$ is equivalent to $e_{\min}/s < -1.4142$. We require that the smallest OLS residual in the sample be at least 1.4142 standard deviations below 0, an event which is extremely likely in any sample, and has probability approaching 1 as $T \to \infty$.


5. Applications

Two data sets will be analyzed to illustrate the estimation of the frontier

function. To provide a comparison with some prior results, the production frontier estimated by Aigner and Chu and by Aigner, Lovell, and Schmidt will be reestimated using the technique of section 4. The second application will involve a dual cost function. Finally, we will consider the question of estimating technical efficiency using the frontier estimates.

5.1. Production function

The Aigner-Chu study uses statewide data on the U.S. primary metals industry (SIC 33) to estimate the parameters of a Cobb-Douglas production function,

$$\ln V = \alpha + \beta_L \ln L + \beta_K (R \ln K) - \varepsilon, \qquad (5.1)$$

where V is value added, L is labor input, K is the gross book value of plant and equipment, and R is the ratio of net to gross book value of plant and equipment. Value added, labor, and capital are computed on a per establishment basis, and there are 28 statewide observations in the sample. [These data were first analyzed by Hildebrand and Liu (1965).]

Aigner and Chu estimated the parameters of (5.1) using linear programming (LP) and quadratic programming (QP). These estimators are maximum likelihood for the exponential distribution and half-normal distribution, respectively. In their recent, innovative paper, Aigner, Lovell and Schmidt (ALS) respecified the disturbance, $\varepsilon$, to be equal to (v - u), where v is assumed to be normally distributed with mean zero and variance $\sigma_v^2$, while u has the half normal distribution in (2.3).¹² Thus, $\varepsilon$ has an asymmetric distribution. The symmetric disturbance, v, is assumed to be due to uncontrollable factors such as weather, making the effective frontier, $\alpha + \beta'x + v$, stochastic. The negative term, -u, is assumed to be due to inefficiency.

Table 1 presents the parameter estimates obtained using five estimators: OLS, LP, QP, maximum likelihood for the stochastic frontier, and maximum likelihood for the full frontier using the Gamma density.¹³ Numbers in parentheses below certain of the estimators are asymptotic 't-ratios' computed using the ratio of the estimate to the square root of the appropriate diagonal element of the estimated asymptotic covariance matrix.

12 They also consider the case in which u has the exponential distribution. See also Meeusen and van den Broeck (1977).

13 The first four of these are taken from Aigner, Lovell, and Schmidt (1977, p. 32).


While the stochastic frontier function is quite close to the OLS estimator, the full (gamma) frontier is substantially different. ALS deduce the first of these comparisons from the very small value of $\sigma_u^2$, which suggests that their $\varepsilon$ is dominated by the symmetric normal error term. Thus, the resemblance to

Table 1

Estimates of eq. (5.1).

Estimator                                      α            β_L           β_K

OLS (Aigner-Lovell-Schmidt)                    0.9146       0.9168        0.04164
                                               (2.04)       (7.31)        (2.19)

LP (Aigner-Chu)                                             0.8730        0.0031

QP (Aigner-Chu)                                             1.071         0.0269

Stochastic frontier (Aigner-Lovell-Schmidt)    0.9600       0.9105        0.04208
                                               (2.06)       (7.68)        (2.34)
    σ̂_u² = 0.000686, σ̂_v² = 0.06921

Full frontier (gamma)                          1.4197       0.7496        0.0756
                                               (5.33)       (10.62)       (7.21)
    λ̂ = 11.7937 (3.55), P̂ = 5.3804 (3.76), σ̂² = 0.0387, μ̂ = 0.4562

Estimated covariance matrix for the coefficient estimates

& 131. 8, 1 B

;

0.0707

ii:

-0.0122 0.0048 o.OQOO1 -0.0011 0.00011

,I ~ 0.0061 a a 11.0301 B - 0.0048 a a 4.5273 2.045X

“*less than 10 lb.

OLS is to be expected. The results for the gamma function are in strong disagreement. The value for P of 5.3804 is rather small, giving a skewness coefficient of 0.8622. By comparison, the skewness for ALS's distribution is only 0.000068.¹⁴ As expected, the gamma parameter estimates are quite different from OLS and the ALS estimates. The large efficiency gains predicted by the small value of P can be seen in the substantially larger 't-ratios'. If anything, the first guess efficiency ratio of 1.59 understates the difference.

14 This can be deduced from their results on pp. 26 and 29f.


5.2. Cost frontier

In his classic study of economies of scale in electric power generation, Nerlove (1963) provided the first empirical application of the well known duality between cost and production functions. His study was primarily

concerned with estimation of the cost function dual to a Cobb-Douglas production function based on a sample of 145 firms producing electric power

in 1955. This study provides an excellent setting in which to apply the notion

of a cost frontier.

Assume that production is characterized by the production function y = F(x), where y is output and x is a vector of inputs. F(x) gives the maximum output producible given x; and F(x) is a smooth, neoclassical production function with positive and decreasing marginal products and convex isoquants for all pairs of inputs. Then F(x) has a one-to-one correspondence with a cost function, q = C(y, p), where q is total cost, p is the vector of input prices, and C(y, p) gives the minimum total cost of producing output y when input prices are p, and y = F(x). C(y, p) will be similarly smooth, concave and linearly homogeneous in p and have positive total and marginal cost [$C(\cdot)$ and $\partial C/\partial y$] and positive factor demands $x_i = \partial C(\cdot)/\partial p_i$.¹⁵

In an empirical setting, if F(x) is interpreted as a production frontier, then

C(y,p) should be interpreted as a cost frontier. Technical efficiency implies cost efficiency and vice-versa. The negative residual on an estimated

production frontier will always have associated with it a positive residual from a cost frontier. The choice of which function to estimate can be based not on the information one expects to obtain, as it is the same in both cases

(by the one-to-one correspondence), but on statistical issues. Nerlove chose a cost function. He argued that output could reasonably be considered exogenously determined for a regulated firm, as could the input prices, while

the factor demands, and hence total cost, should be treated as endogenous. To allow for neutral variations in the returns to scale parameter, Nerlove

generalized the three input Cobb-Douglas cost function to

$$\ln(q_t/P_{Ft}) = \beta_0 + \alpha \ln y_t + \beta (\ln y_t)^2/2 + \theta_K \ln(P_{Kt}/P_{Ft}) + \theta_L \ln(P_{Lt}/P_{Ft}) + \varepsilon_t, \qquad (5.2)$$

where K, L, and F are inputs of capital, labor and fuel respectively. The implied underlying production function is homothetic, but not homogeneous. [See Christensen and Greene (1976, pp. 661, 665).] The linear homogeneity in prices constraint has been imposed, and $\varepsilon_t$ is assumed to be greater than 0 in the current setting. The implied scale economies parameter is

$$\eta_t = 1/(\partial \ln C(y_t, p_t)/\partial \ln y_t) = 1/(\alpha + \beta \ln y_t),$$

15 See Diewert (1974) and Shephard (1953) for proofs of these results.


which clearly varies with $y_t$ as desired. Note that $\eta_t$ is the ratio of average to marginal cost.

Several estimates of (5.2) are presented in table 2. First, the OLS and modified OLS results (per section 3) are given. Second, the multivariate regression results obtained by maximum likelihood estimation (assuming multivariate normality, MLMN) of the system of equations (5.2) and the factor share equations,

$$\partial \ln q_t/\partial \ln P_{Kt} = P_{Kt}K_t/q_t = \theta_K + \varepsilon_{Kt},$$

$$\partial \ln q_t/\partial \ln P_{Lt} = P_{Lt}L_t/q_t = \theta_L + \varepsilon_{Lt},$$

are given along with the intercept correction. Finally, the frontier estimator using the gamma distributed error term (MLG) is presented.¹⁶

There is a surprising pattern in the parameter estimates. The estimates of the price terms, $\theta_K$ and $\theta_L$, using the frontier estimator are very similar to the OLS results. The large differences in the MLMN estimates are obviously due to the information in the share equations used by this estimator. The mean sample values of capital's and labor's share in total cost are 0.439 and 0.106, respectively. As might be expected, the efficiency of the full information estimator of these parameters far exceeds that of either single equation estimator. The output terms behave quite differently. Both MLG estimates are very close to midway between the OLS and MLMN estimates, an outcome which is somewhat surprising in view of the relatively large value of P obtained. As before, noticeable gains in efficiency over OLS are obtained. The frontier estimator performs about as well as MLMN on the output terms, but generally worse on the price terms. Estimates of the mean and variance (both empirical and implied for the MLG case) of the disturbance distribution are presented with each set of estimates. Again, the MLG results more closely resemble OLS.

The relative similarity of the MLG results to OLS might have been predicted on the basis of the large estimate of P. Table 3 gives a comparison of the distributions of $\varepsilon$ estimated for the production and cost frontiers. The efficiency ratio P/(P-2) is far less for the cost case. The skewness coefficient is much smaller, indicating a more nearly symmetric error distribution. As a final measure, the degree of excess, $E(\varepsilon - \mu)^4/\sigma^4 - 3$, is computed. For the

16 See Christensen and Greene (1976) for details on the MLMN estimator. The third factor share is redundant due to the adding up condition, and is dropped. Also, one of Nerlove's observations appears to be inappropriate for the sample. The sample is of the costs of steam power generation for 145 firms, but his observation 6 has costs only 10% to 25% of that of comparably sized firms. However, most of this company's capacity was hydraulic, which would greatly reduce the costs of thermal generation. This observation was dropped in calculating the frontier estimator. All results are based on the reduced sample of 144 observations.


Table 2

Estimates of eq. (5.2).

                        OLS                      MLMN                  MLG
                 Original   Modified      Original   Modified

β_0               10.050      8.668         7.865      6.739          8.496
                 (14.30)    (14.30)a      (47.90)    (47.90)a       (14.24)

α                  0.152                    0.300                     0.238
                  (2.46)                   (5.14)                    (4.67)

β                  0.101                    0.083                     0.090
                  (9.42)                   (8.24)                    (9.82)

θ_K                0.074                    0.426                     0.092
                  (0.49)                  (44.18)                    (0.73)

θ_L                0.481                    0.106                     0.453
                  (2.98)                  (27.88)                    (3.31)

Mean of ε                     1.382                    1.126          1.148

Variance of ε      0.095                    0.033                     0.106

For the estimated gamma density: λ̂ = 14.860 (8.49), P̂ = 17.072 (3.66), P̂/λ̂ = 1.149 (58.32), P̂/λ̂² = 0.077 (5.88).

a The 't-ratio' for the modified intercept will not quite equal that of the original estimate, since $V(\hat\beta_0) \ne V(\hat\beta_0 + e_{\min})$. It is not clear what standard error is appropriate, although the original estimate should be a good approximation.

Table 3

Summary statistics for production and cost frontier disturbance distributions

                                 Primary metals (production)    Electric power (cost)
                                 (Hildebrand & Liu)             (Nerlove)

P̂                               5.3804                         17.0716
Asymptotic efficiency ratio      1.5916                         1.1327
Skewness                         0.8622                         0.4841
Degree of excess                 1.1152                         0.3083

Gamma distribution, this is simply 6/P. This measure is sometimes used as a measure of non-normality, as the 'mesokurtic' value of 0 would be obtained if $\varepsilon$ were normally distributed. Again, the value for the cost function is substantially less than that for the production function.


Table 4

Estimates of scale economies

Output (million kWh)      OLS       MLMN      MLG

    43                    1.88      1.63      1.75
   338                    1.35      1.28      1.32
  1109                    1.16      1.13      1.16
  2226                    1.07      1.06      1.07
  5819                    0.97      0.98      0.98

Implied estimates of minimum efficient scale (million kWh)

                          4429      4600      4753

The scale economies results for the three estimators are very similar in spite of the differences in the parameter estimates. Table 4 presents the value of $\eta_t$ for the firm with the median output in each of five groups. (The firms are ranked by output and there are 29 firms in each group.) The frontier does predict that scale economies are slightly more persistent than suggested by the other two estimators. The predicted minimum efficient scale (MES), at which average cost reaches its minimum, is larger for the frontier, although the difference is certainly not economically meaningful.
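The scale economies measure and the implied minimum efficient scale follow directly from the two output coefficients. The sketch below uses the rounded estimates of $\alpha$ and $\beta$ as read from Table 2 (OLS 0.152 and 0.101, MLMN 0.300 and 0.083, MLG 0.238 and 0.090), so small differences from Table 4 are to be expected; the function names are hypothetical, and the MES is obtained by setting $\eta = 1$, i.e. $\ln y = (1 - \alpha)/\beta$.

```python
import math

# Rounded (alpha, beta) output coefficients as read from Table 2.
COEFFICIENTS = {
    "OLS": (0.152, 0.101),
    "MLMN": (0.300, 0.083),
    "MLG": (0.238, 0.090),
}


def scale_economies(alpha, beta, output):
    """eta = 1 / (alpha + beta * ln y), the ratio of average to marginal cost."""
    return 1.0 / (alpha + beta * math.log(output))


def minimum_efficient_scale(alpha, beta):
    """Output at which eta = 1, i.e. where average cost reaches its minimum."""
    return math.exp((1.0 - alpha) / beta)


if __name__ == "__main__":
    for name, (alpha, beta) in COEFFICIENTS.items():
        etas = ", ".join(f"{scale_economies(alpha, beta, y):.2f}"
                         for y in (43, 338, 1109, 2226, 5819))
        print(f"{name}: eta = {etas}; MES = {minimum_efficient_scale(alpha, beta):.0f}")
```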

5.3. Measuring technical and cost efficiency

One of the primary motivations for estimating frontier functions is to study technical and/or cost efficiency. The formulation of the production model as most authors have considered it is $y = F(x)u$ where $0 < u \le 1$, so that in log form, $\log y = \log F(x) + \log u = \log F(x) - \varepsilon$. In the case of full frontier models, the sample residuals $e_t$ or $\hat\varepsilon_t$, t = 1, ..., T, provide observation specific estimates of the efficiency factors associated with each sample point. The residuals from the cost frontier provide analogous information on cost efficiency.

It is also useful, particularly if the sample contains a large number of observations, to have a summary measure of efficiency for the sample as a whole; and the approach typically taken has been to use the moments of the estimated distribution of u or $\varepsilon$ to characterize the overall efficiency in the sample. For example, Afriat (1972) suggested that u be chosen so that $-\log u$ has a Gamma density. Using the one parameter family with $\lambda = 1$ in (4.1), Richmond (1974) analyzes this suggestion in some detail and finds it has some potentially peculiar implications. In particular, if $\varepsilon$ has the Gamma density with $\lambda = 1$, then $u = e^{-\varepsilon}$ has distribution

$$f_u(u) = (1/\Gamma(P))\,(\log(1/u))^{P-1}, \qquad 0 < u \le 1.$$


The mass of the distribution of $u$ can be concentrated near 1 if $P < 1$, which would imply most firms are relatively efficient, spread uniformly between 0 and 1 if $P = 1$, which would imply a uniform distribution of technical efficiencies, or concentrated near 0 if $P > 1$, which would imply most firms are relatively inefficient. In fact, a similar characterization applies to the A-C (and Schmidt) and Forsund and Jansen (1977) formulation. The distribution of $u = e^{-\varepsilon}$ for the variate in (2.3) or (2.8) is $f_u(u) = \lambda u^{\lambda - 1}$. The median of this distribution is $2^{-1/\lambda}$, which is greater than, equal to, or less than $1/2$ as $\lambda$ is greater than, equal to, or less than one. A similar analysis can be applied to the half normal variate for which $e^{-\varepsilon}$ has a truncated lognormal distribution. The unavoidable conclusion is that without some a priori constraint on the parameters of $f_\varepsilon(\varepsilon)$ or $f_u(u)$, these distributional implications are simply inherent in the model.

Richmond then shows that a summary efficiency measure for the one parameter Gamma disturbance, $E(e^{-\varepsilon}) = 2^{-P}$, can be estimated consistently, but with an upward bias, using the OLS residuals. For the more general model used in this study, summary measures for the distribution of efficiencies may be estimated using the following results for $\varepsilon \sim G(\lambda, P)$ (which can be verified by direct integration):

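$$E(e^{-r\varepsilon}) = \left[\frac{\lambda}{\lambda + r}\right]^{P}, \tag{5.3}$$

and

$$E(e^{r\varepsilon}) = \left[\frac{\lambda}{\lambda - r}\right]^{P}, \qquad \lambda > r. \tag{5.4}$$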
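The direct integration can be sketched as follows. The sketch assumes that the Gamma density of (4.1) has the usual form $f(\varepsilon) = \lambda^{P}\varepsilon^{P-1}e^{-\lambda\varepsilon}/\Gamma(P)$ for $\varepsilon \ge 0$; that form is an assumption here, since (4.1) is not reproduced in this section.

```latex
\begin{aligned}
E(e^{-r\varepsilon})
  &= \int_0^\infty e^{-r\varepsilon}\,\frac{\lambda^{P}}{\Gamma(P)}\,
     \varepsilon^{P-1} e^{-\lambda\varepsilon}\,d\varepsilon
   = \frac{\lambda^{P}}{\Gamma(P)} \int_0^\infty
     \varepsilon^{P-1} e^{-(\lambda + r)\varepsilon}\,d\varepsilon \\
  &= \frac{\lambda^{P}}{\Gamma(P)} \cdot \frac{\Gamma(P)}{(\lambda + r)^{P}}
   = \left[\frac{\lambda}{\lambda + r}\right]^{P}.
\end{aligned}
```

Replacing $r$ by $-r$ gives the second result, where the integral converges only if $\lambda - r > 0$. Setting $\lambda = 1$ and $r = 1$ in the first result gives $2^{-P}$.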

The first of these results, (5.3), provides moments for the distribution of technical (productive) efficiencies; the second, (5.4), for the distribution of cost efficiencies. [Note that Richmond's measure is a particular case of (5.3).] The means and standard deviations for the implied efficiency distributions for the models estimated in this study are presented below in table 5. The moments of the distribution of $\log u$ are reproduced for convenience.

Finally, it would be useful to establish a relationship between the values of technical and cost efficiency for a particular firm. Let $u_t$ be the efficiency factor on the production function and $u_t'$ be that on the dual cost function. If the production function is homogeneous of degree $a$, then $u_t' = u_t^{-1/a}$, which is particularly convenient in the constant returns to scale ($a = 1$) case. If the production function is not homogeneous, then the relationship between $u_t$ and $u_t'$ need not be explicit, and will depend on the type of function specified. In general, this relationship will depend on the degree of returns to scale, which need not be constant, in a complicated manner. Some results for the homothetic case may be found in Forsund and Jansen (1977); however, their disturbance formulation is an unnatural one. This should be a useful area for future research given the inherent limitations of the (homogeneous) Cobb-Douglas model.
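The relationship for the homogeneous case can be motivated by a short derivation. The sketch below assumes that inefficiency is purely technical, so that a firm with efficiency factor $u_t$ producing $y_t$ must buy the inputs that would yield $y_t/u_t$ on the frontier, and that it buys them at minimum cost. The symbols $C(q, p)$ for the frontier cost of output $q$ at prices $p$, and $c(p)$ for its price component, are notation introduced here for illustration.

```latex
\begin{aligned}
F(kx) = k^{a} F(x) \;&\Longrightarrow\; C(q, p) = q^{1/a}\, c(p), \\
\text{actual cost}_t &= C\!\left(\frac{y_t}{u_t},\, p\right)
   = \left(\frac{y_t}{u_t}\right)^{1/a} c(p)
   = C(y_t, p)\, u_t^{-1/a}, \\
u_t' &= u_t^{-1/a} \ge 1 \quad \text{since } 0 < u_t \le 1 .
\end{aligned}
```

In logs, the cost disturbance is then $1/a$ times the production disturbance, and it reduces to the production disturbance itself when $a = 1$.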

Table 5

Efficiency distributions

                                               Production    Cost
$\lambda$                                      11.1931       14.8600
$P$                                            5.3804        17.0721
Gamma density ($\varepsilon$)
  Mean                                         0.4562        1.149
  Standard deviation                           0.1967        0.2715
Efficiency distribution ($u$)
  Mean                                         0.6454        3.2849
  Standard deviation                           0.1182        1.0028
Implied dual efficiency distribution ($u'$)
  Mean                                         1.4827        a
  Standard deviation                           0.2146        a

a The cost function does not correspond to a homogeneous production function. These parameters were not estimated.
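As an illustration of how (5.3) and (5.4) turn estimates of $\lambda$ and $P$ into the summary measures in table 5, the following minimal Python sketch evaluates the closed forms. The function names are hypothetical and chosen here for illustration; the standard deviation is obtained from the first two moments, which is an assumption about how the table entries were computed rather than something stated in the text.

```python
import math


def efficiency_moment(lam, P, r):
    """E[exp(-r * eps)] for eps ~ G(lam, P), i.e. (lam / (lam + r)) ** P.

    r > 0 corresponds to (5.3); r < 0 corresponds to (5.4) and
    requires lam + r > 0 for the moment to exist.
    """
    if lam + r <= 0:
        raise ValueError("moment does not exist unless lam + r > 0")
    return (lam / (lam + r)) ** P


def efficiency_mean_sd(lam, P, sign):
    """Mean and standard deviation of exp(-sign * eps).

    sign = +1 gives technical efficiency (values in (0, 1]);
    sign = -1 gives cost efficiency (values >= 1).
    """
    m1 = efficiency_moment(lam, P, sign * 1.0)
    m2 = efficiency_moment(lam, P, sign * 2.0)
    return m1, math.sqrt(m2 - m1 ** 2)


# Cost column of table 5: lambda = 14.8600, P = 17.0721.
cost_mean, cost_sd = efficiency_mean_sd(14.8600, 17.0721, sign=-1)
print(round(cost_mean, 4), round(cost_sd, 4))
```

For the cost column this gives values close to the tabulated mean and standard deviation of the efficiency distribution (about 3.285 and 1.003).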

6. Summary and conclusions

The frontier function has been something of an enigma for econometric estimation. Beyond (apparent) consistency of some of them, the properties of the received estimators have remained unknown. The ingenious new specification of Aigner, Lovell, and Schmidt has provided a regular estimator with more clearly defined properties and a better behaved likelihood function than previous frontier estimators; however, it lacks the theoretical appeal of the full frontier function. This paper has provided a simple estimator for the full frontier function which has all of the familiar properties of maximum likelihood estimators. Standard errors are computed in the usual fashion, and all of the results for 'regular' MLE's are obtained.

It is shown that the irregular nature of the likelihood function is not an unavoidable problem in the frontier setting, but merely a consequence of the choice of the disturbance distribution. There is a large class of disturbance distributions which may be specified which make the maximum likelihood frontier estimator regular and well behaved. Two simple conditions on the error distribution are shown which are sufficient to make the problem of estimation a regular one to which the standard analysis may be applied.


Heuristically, what is required is that the shape of the distribution of the disturbance term be similar to those of the familiar lognormal or chi-squared distributions (both of which are candidates).

Two estimators have been presented. First, a consistent estimator is obtained by a minor modification of least squares. Then, a maximum likelihood estimator for the gamma distributed error term is presented. Depending on how asymmetric the disturbance distribution is, large efficiency gains can be achieved over OLS by using maximum likelihood.

The gamma distribution shares an attractive property with ALS's specification. If the errors are symmetrically distributed, the distribution approaches normality, and the resulting estimator approaches least squares. This implies that the large efficiency gains of maximum likelihood are obtained only when the error distribution is asymmetric, a result which can be discerned from the regression residuals.

Two applications are then considered. In the first, earlier results of Aigner and Chu are replicated with our new specification. Large efficiency gains over least squares are found, and the results are substantially different from those obtained earlier. The second application re-estimates a cost function of Nerlove. It is found that the gains in efficiency are somewhat smaller than in the first case, but still noticeable. Inferences about scale economies under our specification turn out to be virtually identical. The specification allows a detailed analysis of technical or cost inefficiency, both in terms of the overall characteristics of the distributions of the random variables involved and on an individual observation by observation basis.

In general, the specification given in this study is a substantial modification of the received frontier estimators. To date, all of these have been fit using some programming algorithm, and the result has been an envelope function. For our specification, this is not the case. The estimator is not an envelope in the usual sense, as no points are on its boundary.† The difference is that the statistical properties of the estimator are clearly known.

†Consider, for example, a sample of 1 from the model $y = \alpha + \varepsilon$ and $\varepsilon \sim G(1, 3)$. Any programming estimator will choose $\hat{\alpha} = y$ and $e = 0$, but the MLE will be $\hat{\alpha} = y - 2$ and $e = 2$.
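The single-observation example can be checked numerically. The sketch below is only an illustration: it assumes the Gamma density $\lambda^{P}\varepsilon^{P-1}e^{-\lambda\varepsilon}/\Gamma(P)$ with $\lambda = 1$ and $P = 3$, uses a hypothetical observed value $y = 10$, and replaces a proper optimizer with a crude grid search over candidate intercepts below $y$.

```python
import math


def gamma_loglik(alpha, y, lam=1.0, P=3.0):
    """Log-likelihood of one observation y = alpha + eps, eps ~ G(lam, P)."""
    eps = y - alpha
    if eps <= 0:
        return -math.inf  # the disturbance must be strictly positive
    return (P * math.log(lam) - math.lgamma(P)
            + (P - 1.0) * math.log(eps) - lam * eps)


y = 10.0  # hypothetical single observation

# Candidate intercepts alpha = y - k * 0.001 for k = 1, ..., 10000.
candidates = [y - k * 0.001 for k in range(1, 10001)]
alpha_hat = max(candidates, key=lambda a: gamma_loglik(a, y))
residual = y - alpha_hat

print(round(alpha_hat, 3), round(residual, 3))
```

The likelihood is maximized where the residual equals the mode of the Gamma density, $(P - 1)/\lambda = 2$, so the fitted frontier lies two units below the observation rather than passing through it.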

References

Afriat, N.S., 1972, Efficiency estimation of production functions, International Economic Review 13, Oct., 568-598.

Aigner, D.J. and D.S. Chu, 1968, On estimating the industry production function, American Economic Review 58, 826-839.

Aigner, D., K. Lovell and P. Schmidt, 1977, Formulation and estimation of stochastic frontier production function models, Journal of Econometrics 5, no. 1, 21-38.

Amemiya, T., 1973, Regression analysis when the dependent variable is truncated normal, Econometrica 41, no. 6, Nov., 997-1016.


Barnett, W.A., 1976, Maximum likelihood and iterated Aitken estimation of nonlinear systems of equations, Journal of the American Statistical Association 71, June, 354-360.

Christensen, L.R. and W.H. Greene, 1976, Economies of scale in U.S. electric power generation, Journal of Political Economy 84, no. 1, 655-676.

Cramer, H., 1946, Mathematical methods of statistics (Princeton University Press, Princeton, NJ).

Diewert, E., 1974, Applications of duality theory, in M.D. Intrilligator and D.A. Kendrick, eds., Frontiers of quantitative economics (North-Holland, Amsterdam).

Farrel, J.M., 1957, The measurement of productive efficiency, Journal of the Royal Statistical Society A CXX, Part III, 253-290.

Forsund, F. and E.S. Jansen, 1977, On estimating average and best practice homothetic production functions via cost functions, International Economic Review 18, no. 2, June, 463-476.

Hildebrand, G. and T.C. Liu, 1965, Manufacturing production functions in the United States, 1957 (Cornell University Press, Ithaca, NY).

IBM, 1977, Scientific subroutine package.

Jennrich, R.I., 1969, Asymptotic properties of nonlinear least squares estimators, Annals of Mathematical Statistics 40, April, 633-643.

Kaplan, W., 1952, Advanced calculus (Addison Wesley, Reading, MA).

Kendall, M.G. and A.S. Stuart, 1969, The advanced theory of statistics, Vol. I (Griffin, London).

Kendall, M.G. and A.S. Stuart, 1973, The advanced theory of statistics, Vol. II (Griffin, London).

Meeusen, W. and J. van den Broeck, 1977, Efficiency estimation from Cobb-Douglas production functions with composed error, International Economic Review 18, no. 2, June, 435-555.

Nerlove, M., 1963, Returns to scale in electricity supply, in: Carl F. Christ, ed., Measurement in economics, Studies in mathematical economics and econometrics in honor of Yehuda Grunfeld (Stanford University Press, Stanford, CA).

Richmond, J., 1974, Estimating the efficiency of production, International Economic Review 15, no. 2, June, 515-521.

Schmidt, P., 1975, On the statistical estimation of parametric frontier production functions, Review of Economics and Statistics 58, 238-239.

Shephard, R.W., 1953, Cost and production functions (Princeton University Press, Princeton, NJ).

Theil, H., 1971, Principles of econometrics (Wiley, New York).

Timmer, C.P., 1971, Using a probabilistic frontier production function to measure technical efficiency, Journal of Political Economy 79, 767-794.

Wald, A., 1949, Note on the consistency of the maximum likelihood estimator, Annals of Mathematical Statistics 20, 595-601.

Zellner, A., J. Kmenta and J. Dreze, 1966, Specification and estimation of Cobb-Douglas production functions, Econometrica 34, 784-795.