ARTICLE IN PRESS
Journal of Econometrics 134 (2006) 317–340
0304-4076/$ -
doi:10.1016/j
�CorrespoE-mail ad
(H.-J. Wang)
www.elsevier.com/locate/jeconom
Pitfalls in the estimation of a cost functionthat ignores allocative inefficiency:
A Monte Carlo analysis
Subal C. Kumbhakara, Hung-Jen Wangb,�
aDepartment of Economics, State University of New York– Binghamton, Binghamton, NY 13902, USAbInstitute of Economics, Academia Sinica, Taipei 115, Taiwan
Available online 15 August 2005
Abstract
In the stochastic frontier literature, it is a widely held view that allocative inefficiency can be
lumped together with technical inefficiency in the estimation of cost frontiers. Therefore, a
one-sided error term in the cost function is believed to capture the cost of overall (technical
plus allocative) inefficiency. In this paper we challenge that view through a detailed Monte
Carlo investigation. The results show that failure to include the cost of allocative inefficiency
explicitly in the cost function biases the estimates of: (i) the cost function parameters,
(ii) returns to scale, (iii) input price elasticities, and (iv) cost-inefficiency.
r 2005 Elsevier B.V. All rights reserved.
JEL classification: C15; C21
Keywords: Technical inefficiency; Allocative inefficiency; Stochastic frontier
1. Introduction
Although the mainstream neoclassical paradigm in economics assumes thatproducers always operate efficiently, in reality these producers are not always
see front matter r 2005 Elsevier B.V. All rights reserved.
.jeconom.2005.06.025
nding author. Tel.: +22782 2791x323; fax: +2 2785 3946.
dresses: [email protected] (S.C. Kumbhakar), [email protected]
.
ARTICLE IN PRESS
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340318
efficient. Consequently, two otherwise identical producers may never produce thesame output, and their costs and profit are not the same. This difference in output,cost, and profit can be explained in terms of technical and allocative inefficiency, andsome unforeseen exogenous shocks. Given the input quantities, a producer is said tobe technically inefficient if it fails to produce the maximum possible output.Similarly, a cost or profit maximizing producer is allocatively inefficient if it fails toallocate the inputs optimally, given input and output prices. Both types ofinefficiency are costly in the sense that cost (profit) is increased (decreased) due tothese inefficiency.
The econometric estimation of technical inefficiency (popularly known as thestochastic frontier technique) originated in Aigner et al. (1977) and Meeusen and vanden Broeck (1977). In this framework one specifies the production technology interms of the production function, i.e.,
y ¼ f ðx1; . . . ;xk : bÞ � expðv� uÞ, (1)
where y denotes a single output, x1; . . . ; xk are k inputs used to produce y, f ð:Þ is thedeterministic production frontier, and b is a technology parameter vector to beestimated. Finally, v is a random noise component (an exogenous shock unknown tothe producer) and the one-sided term u (uX0) captures technical inefficiency.1 This islabeled as output-oriented (output-augmenting) technical inefficiency, becauseoutput can be increased by reducing inefficiency, ceteris paribus.
The cost frontier is often used as an alternative representation of the productiontechnology. In this formulation, one specifies the model as
C ¼ Cðw1; . . . ;wk; y : bÞ � expðvþ uÞ, (2)
where C is actual/observed cost and Cðw1; . . . ;wk; y : bÞ is the cost frontier. Sinceinefficiency increases cost, uX0 appears with a positive sign and cost efficiency isdefined as expð�uÞp1 (Coelli et al., 1998, p. 210; Kumbhakar and Lovell, 2000,p. 139). The question is: Does u represent the cost of technical inefficiency (increasein cost due to technical inefficiency) or is it a measure of overall cost inefficiency (costof both technical and allocative)? In other words, can one lump the costs of bothtechnical and allocative inefficiency together and assume it to be an i.i.d. randomvariable (as in Coelli et al., 1998, Chapter 9.4; Kumbhakar and Lovell, 2000,Chapter 4)?
The purpose of this paper is to show that if there is allocative inefficiency and oneestimates the cost function in (2) and thereby either ignores the presence of allocativeinefficiency or erroneously assumes that allocative inefficiency can be lumpedtogether with technical inefficiency in the estimation, then the estimated technologyand estimates of inefficiency are likely to be wrong. In particular, we examine howthe exclusion of allocative inefficiency affects (i) the measure of cost inefficiency,(ii) the parameters of the cost function, (iii) returns to scale, and (iv) price elasticitiesof input demand. All we can say analytically is that parameter estimates are biased
1Alternatively, technical efficiency is expð�uÞp1. Consequently, technical efficiency equals 1 minus
technical inefficiency, especially when technical inefficiency is small.
ARTICLE IN PRESS
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340 319
and inconsistent. Therefore, measures based on the biased parameter estimates arelikely to be biased. Empirically, the issue is not only about bias, but also the degreeof bias in the estimates, especially in the measures of interest such as inputelasticities, returns to scale, etc. The extent of such biases can only be evaluatednumerically. Therefore, we conduct a Monte Carlo analysis to assess these effects.
2. Preliminaries
The cost function framework is based on the assumption that producers minimizecost given the output and the technology. Thus, the optimization problem for aproducer is
minx
w0 � x s.t. y ¼ f ðx expð�lÞÞ, (3)
where x and w are J � 1 vectors of actual input quantities and prices, respectively,and lX0 measures input-oriented (input-saving) technical inefficiency.2 Since thevector x denotes the actual quantities of the inputs and expð�lÞp1, one can labelxj expð�lÞ as the technical inefficiency-adjusted level of input j. Alternatively, whenl is multiplied by 100, it shows the percent by which all the inputs can be reducedwithout reducing output. If producers fail to allocate inputs efficiently, then the first-order conditions of the above are f j=f 1awj=w1, which can be rewritten as
f j
f 1
¼expðxjÞwj
w1�
w�j
w1; j ¼ 2; . . . ; J, (4)
where f j is the partial derivative of f ð�Þ with respect to the jth input and xja0ðj ¼ 2; . . . ; JÞ represents allocative inefficiency for the input pair (j; 1).3
One can view the above optimization problem from a different angle. The first-order conditions in (4) show that producers use the shadow prices (w�) to determinethe technical inefficiency-adjusted level of inputs. Thus, the above optimizationproblem can be viewed as though producers are minimizing shadow cost given theproduction technology, viz.,
C� ¼ w�0 � x expð�lÞ s.t. y ¼ f ðx expð�lÞÞ (5)
instead of the actual cost in choosing inefficiency-adjusted input quantities. Thesolution to this problem gives the technical inefficiency-adjusted input quantities,which in turn can be used to define the shadow cost function, C� ¼ C�ðw�; yÞ, in (5).
An advantage of the shadow cost function approach is that once it is specified,Shephard’s lemma can be used to derive the inefficiency-adjusted input demandfunctions, viz.,
qC�ðw�; yÞ
qw�j¼ xjðw
�; yÞ expð�lÞ. (6)
2If the underlying production technology is homothetic, then u in (1) is a constant multiple of l.3If the production function is not homogeneous then f j=f 1 will depend on l.
ARTICLE IN PRESS
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340320
This approach is an alternative to the primal approach outlined in (3) and (4) inwhich a production function is specified first, and then the corresponding first-orderconditions of cost minimization are derived (Schmidt and Lovell, 1979).
Since C� is unobserved we have to establish the link between actual cost (C) andC�.4 For this, first we define the shadow cost share function of input j, viz.,
S�j ¼w�j x�j
C�ðw�; yÞ¼
q lnC�ðw�; yÞ
q lnw�j
!; where x�j ¼ xjðw
�; yÞ expð�lÞ, (7)
which gives
xjðw�; yÞ ¼
C�ðw�; yÞS�jw�j
expðlÞ. (8)
Therefore, the actual cost is
C ¼XJ
j¼1
wjxjðw�; yÞ ¼
XJ
j¼1
wj
C�ðw�; yÞS�jw�j
expðlÞ
" #ð9Þ
¼ expðlÞC�ðw�; yÞXJ
j¼1
wj
w�jS�j ð10Þ
¼ expðlÞC�ðw�; yÞXJ
j¼1
S�j expð�xjÞ ð11Þ
¼ C0Cal expðlÞ, ð12Þ
where
C0 ¼ C�jl¼0;n¼0 (13)
is the minimum cost5 in the absence of both technical and allocative inefficiency, and
Cal ¼ C�ðw�; yÞX
j
S�j expð�xjÞ
( ),C0 (14)
measures costs due to allocative inefficiency. Thus, overall cost inefficiency (CI) canbe defined as
CI ¼C
C0¼
C
Cjl¼0;na0
�Cjl¼0;na0
C0¼ Cal � Ctech, (15)
where Cal and Ctech measure the cost of allocative and technical inefficiency,respectively. Alternatively, the overall cost efficiency (CE) is (Farrell, 1957)
CE ¼C0
C¼
1
Cal�
1
Ctech¼ AE � TE, (16)
4The following derivations are based on Kumbhakar (1997).5This is also labeled as the deterministic cost frontier or the neo-classical cost function.
ARTICLE IN PRESS
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340 321
where allocative and technical efficiencies (AE and TE) are the reciprocals of Cal andCtech. For a well behaved cost function Cal
X1 and CtechX1, which in turn imply that
AEp1, TEp1, and CEp1. It is clear from (14) that Cal depends on allocativeinefficiency as well as data on y and w, while Ctech ¼ expðlÞ is independent of data.
From the above equation, the actual cost can be written as (appending a two-sidedrandom noise term v):
lnC ¼ lnC0 þ lnCal þ lþ v � lnC0 þ uþ v, (17)
where u ¼ lnCal þ l is a function of n, y, and w. Both the mean and the variance of u
will depend on w, y and the parameters of the cost function.In empirical applications it is often assumed that u (in (17)) follows a half- or a
truncated-normal distribution. The cost function in (17) is then estimated with thepresumption that u captures either the cost of technical inefficiency (therebyignoring/assuming away costs due to allocative inefficiency, i.e., assuming u ¼ l), orthe overall cost inefficiency (the cost of both allocative and technical inefficiency).Neither of these presumptions, however, is a reasonable representation of the truemodel in many cases. It can be clearly seen from (17) that assuming u ¼ l andignoring lnCal result in a mis-specified model. Since the true EðuÞ depends on w andy via lnCal, and that the same variables (w and y) appear in the cost frontierfunction, the estimated coefficients of the cost function will be biased andinconsistent even if one assumes that there is no technical inefficiency (i.e., l ¼ 0)and subsequently applies least squares to (17). On the other hand, if u ¼ lnCal þ l isassumed, then a half- or truncated-normal distribution on u cannot be correct,because lnCal is a function of data, parameters and random variables n. Due to thenon-linearity of lnCal, it is not possible to analytically derive the distribution of u
(which is required for a maximum-likelihood (ML) estimation) from any reasonabledistributions of n and l. Consequently, if one uses the half- or truncated-normaldistribution with constant parameter(s), then the ML estimates will be biased andinconsistent. Recently, Coelli et al. (2003, p. 36–40) also point out that productivitymeasurement based upon single equation cost functions will be biased when there aresystematic deviations from allocative efficiency, due to the biased coefficients.
Thus, the main problem in estimating the mis-specified model (in (17)) using leastsquares or ML is that the estimated parameters will be inconsistent. The MLestimators might be more problematic, because the half (truncated) normalassumption on u is not consistent with a model that allows allocative inefficiency.This parameter inconsistency will be transmitted to everything that is derived fromthe estimated technology such as returns to scale, elasticities of substitution, inputdemand functions, technical and allocative inefficiency, etc. Although the degree ofinconsistency can also be affected by other factors when the data generation processis unknown (which is the case if real data are used), the problem can be avoided in acontrolled environment (Monte Carlo studies), which is what we do here. Wetherefore, we use an extensive Monte Carlo experiment to investigate theconsequences of improper treatments of lnCal in the model. In particular,we examine how the exclusion of the allocative inefficiency component (or theassumption of lumping both allocative and technical inefficiency in u) affects (i) the
ARTICLE IN PRESS
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340322
cost inefficiency measure, (ii) the technology parameters (the cost function), (iii)returns to scale, and (iv) input price elasticities.
To do this, we compare not only the means, standard errors, and root meansquare errors of the concerned statistics, but also the distributions of the statisticsacross competing models. The equality of the distributions across models is testedusing a non-parametric test of Li (1996).
3. Simulations
3.1. Design of the experiment
We consider a translog cost function for lnC� with J ¼ 2. To impose the linearhomogeneity in prices, we use w1 as our numeraire and define w ¼ w2=w1. Allocativeinefficiency of the input pair (x2, x1) is represented by x.
We assume a translog functional form for lnC� and obtain the followingrelationships based on (13) and (14):
lnC0 ¼ b0 þ by ln yþ bw lnwþ 12byyðln yÞ2 þ 1
2bwwðlnwÞ2 þ bywðln y lnwÞ, ð18Þ
lnCal ¼ lnðS�1 þ S�2 expð�xÞÞ þ bwxþ bwwx lnwþ 12bwwx
2þ bywx ln y, ð19Þ
where S�1 and S�2 are
S�1 ¼ ð1� bwÞ � bwwðlnwþ xÞ � byw ln y, ð20Þ
S�2 ¼ bw þ bwwðlnwþ xÞ þ byw ln y. ð21Þ
To generate data we use the following distributions on the non-stochastic (ln y,lnw) and random (x, v, l) variables.
ln y�Nðmy; s2yÞ; lnw�Nðmw;s
2wÞ; x�Nðmx; s
2xÞ; v�Nð0;s2vÞ, ð22Þ
l ¼ jl�j; where l��Nð0; s2lÞ. ð23Þ
The notation of (23) indicates that the technical inefficiency l has a half-normaldistribution.
Given the parameter values (discussed later), we draw N observations based on theabove distributional assumptions for each of the R replications. Two differentmodels are then estimated using the generated data. The first one, labeled as Modela, is the mis-specified model, i.e., one that estimates
lnC ¼ lnC0 þ uþ v, (24)
where u is a random variable with a half-normal distribution. As discussed in theprevious section, this model implicitly assumes that u captures either the cost oftechnical inefficiency (thereby ignoring/assuming away the cost of allocativeinefficiency, i.e., u ¼ l), or the overall (both allocative and technical) cost inefficiencyof the firm. Since neither of the presumptions can be justified in the light of the truemodel, the estimated model is mis-specified.
ARTICLE IN PRESS
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340 323
The second model, labeled as Model b, is the correctly specified model and itassumes that allocative inefficiency n is known/observed. We do this for two reasons.First, by estimating the cost function alone (no matter what technique is used) onecannot obtain estimates of both l and xj. That is, the cost function residuals cannotbe decomposed into l and lnCal.
Second, an estimation of l and n from the cost system, that can separate n (andtherefore lnCal) from l, is a difficult problem, especially when the xj are random.6
Since we focus on the results based on estimating a single equation cost function, it isessential that the correctly specified model can also be estimated using a singleequation method. This purpose can easily be served (in a Monte Carlo analysis) byassuming that the xj are known, while estimating Model b. Thus, from an estimationpoint of view we write the correctly specified model as
lnC ¼ lnCal þ lnC0 þ lþ v, (25)
where lnCal is a well defined function of parameters and data. Consequently,differences in the results of models a and b can be attributed to the effects ofimproper treatment of allocative inefficiency, and they are not due to the use ofdifferent estimation techniques.
We make sure that regularity conditions of the cost function are satisfied in thesimulation model. The regularity conditions require a cost function to be (Varian,1992): (1) nondecreasing in w, (2) homogeneous of degree 1 in w, and (3) concave inw. We further require two additional conditions to be numerically satisfied in each ofthe replications: (4) values of the input shares are between 0 and 1, and (5) non-negative allocative inefficiency (lnCal40). Condition (3) is satisfied by selecting anegative value for bww. For conditions (4) and (5), we begin by drawing the numberof observations larger than N, discarding those violating the conditions, and thenrandomly choosing N observations from the remaining data. The retained data aresubsequently tested for the assumed distributions regarding ln y, lnw, and x to ensurethat the filtering does not systematically alter the underlying distributions.
3.2. Parameter configurations
We assign values to two sets of the parameters: A ¼ fmx;s2x; corrðl
�; xÞ;c;Ngwhere c ¼ s2l=s
2v , and B ¼ fmy; s
2y; mw;s
2w; b0;by;bw;byy;bww;byw;k;Rg, where
k ¼ s2l þ s2v . In the analysis we consider cases with different values of the Aparameters, and we hold values of the B parameters constant in all the experiments.As discussed in Olson et al. (1980), the choice of the bs will not affect the empiricalbiases, variances, and mean square errors; only the means of the bs change. The sameargument holds for my and mw as well. For s2y and s2w, we choose arbitrarily largenumbers to represent a dataset with heterogeneous product and input markets. Thefollowing parameter values of B are chosen in this study: fmy ¼ 2, s2y ¼ 2:5,
6There is no easy solution to this problem in a cross-sectional framework. See Kumbhakar and Tsionas
(2003a) for a solution using a Bayesian approach.
ARTICLE IN PRESS
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340324
mw ¼ 1,s2w ¼ 2:5, b0 ¼ 0:5, by ¼ 0:9, bw ¼ 0:5, byy ¼ 0:1, bww ¼ �0:1, byw ¼ 0:1,k ¼ 0:15, R ¼ 1; 000g.
The set A contains five parameters: {mx, s2x, corr(l
�; x), c;Ng. The Monte Carlostudy of Coelli (1995) considers the effects of c and N, but it does not have allocativeinefficiency in the model. Here, we take into account three additional parametersthat are related to the allocative inefficiency term. The following parameterconfiguration of A is used in the base case: A ¼ fmx ¼ 0, s2x ¼ 0:5, corrðl�; xÞ ¼ 0,c ¼ 4:2, N ¼ 200g. The choice of mx ¼ 0 implies that the allocative inefficiency is apure random error, and there is no systematic bias involved in input allocationdecisions. The assumption of corr(l�; xÞ ¼ 0 implies that technical and allocativeinefficiency are uncorrelated. The variance ratio c equal to 4.2 indicates that thetechnical inefficiency variance (ð1� 2=pÞs2l) is about 1.5 times the variance of thestatistical noise term (s2v).
The Monte Carlo analysis considers the following cases in addition to the basecase: (1) s2x ¼ 1; 1:5, to investigate effects of the variance of allocative inefficiency. (2)mx ¼ 0:5; 1:5, so that there is systematic bias in the input decisions, as may beobserved in a regulated environment. Note that for a model with only two inputprices, we can focus on the cases of mx40 without loss of generality. (3)corrðl�; xÞ ¼ 0:3; 0:8, so that both high and low correlations between technicaland allocative inefficiency are allowed. As discussed by Schmidt and Lovell (1980),the correlation between l� and x implies that technically inefficient producers alsohave input ratios which are more in error (either too high or too low). (4) c ¼ 2; 10,which changes the share of the total variances attributed to the technical inefficiency.(5) N ¼ 50; 500, so as to examine the sample size effect.
3.3. Estimation of inefficiency, returns to scale, and elasticities
We obtain ML estimates of models a and b by numerically maximizing therespective likelihood functions. The likelihood function of observation i is (given inKumbhakar and Lovell, 2000, p. 140):
ln½f ðlnCiÞ� ¼ �1
2lnðs2l þ s2vÞ þ ln f
lnCi � lnCFiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
s2l þ s2vq
0B@
1CA
264
375
� ln1
2
� �þ ln F
�m�s
� �� �, ð26Þ
where
�m ¼s2lðlnCi � lnCF
i Þ
s2v þ s2l, ð27Þ
�s2 ¼s2vs
2l
s2v þ s2l, ð28Þ
ARTICLE IN PRESS
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340 325
and lnCF ¼ lnC0 for model a, lnCF ¼ lnC0 þ lnCal for model b, and fð�Þ and Fð�Þare the probability density and the cumulative distribution functions of a standardnormal variable, respectively. The likelihood function of the model is the sum of theabove function over all observations. The calculations were carried out using Stata7.0 computer software and we used the Stata numerical maximization routine tomaximize the likelihood function.
We implement the Jondrow et al. (1982) formula viz.,
Eðl j lþ vÞ ¼ �mþ �sfð �m= �sÞFð �m= �sÞ
(29)
to compute the observation-specific technical inefficiency for the various models. Forsimplicity, we use the notation l for the above conditional expectation. Theestimated technical inefficiency of the mis-specified model is denoted la which isobtained by estimating model (24) and then substituting the estimated parametersinto (29). The estimated technical inefficiency of the correctly specified model is lb
which is similarly obtained based on (25) and (29). The two inefficiency are to becompared with l based on (29) using the true parameter values. We also computeand report the overall inefficiency from the true parameters and data, i.e., Eðln C
alþ
l j lþ vÞ denoted by ln Calþ l.
The return to scale statistic (RTS) and the own price elasticity of input i (Bii)7are
computed from:
RTS ¼ ðby þ byy ln yþ bywðlnwþ xÞ þbywðexpð�xÞ � 1Þ
ðS�1 þ S�2 expð�xÞÞ
� ��1, ð30Þ
Bii ¼q ln xi
q lnwi
¼bww
S�iþ ðS�i � 1Þ. ð31Þ
3.4. Bias, distribution, and rank correlation
We use multiple metrics in evaluating the consequences of model mis-specifications. They include the bias of the parameters, distributions of theinefficiency, and the rank correlations between the estimated and true inefficiency.
Computing the bias of the parameters warrants an explanation. There arearguments that no bias of the b parameters (other than the intercept) should arise iftechnical inefficiency is not included in the estimation of the Aigner et al. (1977) typemodel. This is because E(lþ v) is a constant where lþ v is the error term in themodel. In the present case, however, the error term has three components (lnCal, l,and v), and E(lnCal þ lþ v) is not a constant, but rather a function of data andparameters (unless the underlying technology is Cobb–Douglas). This makes theestimated parameters of model a (that ignores lnCal) biased and inconsistent (notjust the intercept).
7Since we are dealing with only two inputs, we report estimated values of B11 and B22. The cross price
elasticities can be recovered from B12 ¼ �B11 and B21 ¼ �B22, sinceP
jBij ¼ 0 8i.
ARTICLE IN PRESS
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340326
Even if there is no bias in the b parameters, la will still be different from ln Calþ l,
which is what la tries to capture. This is because la is conditional on lþ v which iscomputed from lnC � lnC0, while in the correctly specified model the value of lþ v
is lnC � lnC0 � lnCal. Thus, unless lnCal � 0, these two measures will differ. Ofcourse, in reality one will be using the estimated values of the parameters that arebiased unless lnCal � 0.
In addition to comparing la, lb, l, and ln Calþ l at their means, we also look at
the overall distributions of these variables. To this end, we plot the kernel densities ofthe estimated variables and also conduct the equal-distribution hypothesis test usingthe non-parametric test of Li (1996).
The bias and the kernel density comparisons are useful, but both of them areinvariant to how the estimated observation-specific inefficiency index is distributedamong the sample observations. It is therefore possible to have two differentinefficiency indices that produce the same sample means and density distributions,but imply quite different—even the opposite—efficiency rankings for the observa-tions. Since efficiency rankings of sample observations are often used in efficiencystudies, we compute and compare Kendall’s rank correlation coefficient between thetrue and the estimated inefficiency index in the Monte Carlo analysis. Thecoefficient, denoted by t, has a value between �1 and 1. A value of �1 implies aperfect negative correlation, and a value of 1 implies a perfect positive correlation ofthe rankings between the series. It can be shown (e.g., Kendall and Gibbons, 1990)that 100 � ð1� tÞ=2 gives the probability that a randomly chosen pair ofobservations have opposite rankings in the two series.
3.5. Simulation results
3.5.1. The base case
The base case has the following parameter values: (mx ¼ 0, s2x ¼ 0:5,corrðl�; xÞ ¼ 0, c ¼ 4:2, N ¼ 200). The estimation results are presented inTable 1, and the kernel densities of l, B, and RTS are presented in Fig. 1.
The mean value of the technical inefficiency from the mis-specified model ðEðlaÞÞ is0.293, which is between the mean values of the true overall inefficiency(E(ln C
alþ l) ¼ 0.351) and the mean of the true technical inefficiency
(E(l) ¼ 0.279). The biases seem obvious. On the other hand, the correctlyspecified model has EðlbÞ equal to 0.275; the (slight) underestimation (com-pared to E(l)) is also observed in other simulation studies (see, e.g., Wang andSchmidt, 2002).
The Kendall correlation coefficients show that the efficiency ranking of the mis-specified model is closer to the ranking of ln C
alþ l ðcoefficient ¼ 0:829Þ than to the
ranking of l ðcoefficient ¼ 0:789Þ. A coefficient of 0.789 means that the probabilityof producing an incorrect ranking between a randomly chosen pair of observations is10.6% (100 � ð1� 0:789Þ=2¼
:10:6). This probability appears to be quantitatively non-
trivial. As a comparison, the correctly-specified model has a mis-ranking probabilityof less than 5%. The probability of mis-ranking is thus doubled for the mis-specifiedmodel.
ARTICLE IN PRESS
Table 1
Base case
la lbModel aa Model b
Mean s.d. RMSE Mean s.d. RMSE
True: Eðln Calþ lÞ ¼ 0:351, EðlÞ ¼ 0:279, Eð ^RTSÞ ¼ 0:869, EðB11Þ ¼ �0:871, EðB22Þ ¼ �0:573
Mean 0.293 0.275
s.d. 0.048 0.045 ^RTS 0.869 0.021 0.871 0.021
Kendall’s rank coefficient ^B11 �0.806 0.102 �0.869 0.092
^B22 �0.529 0.059 �0.570 0.057
ln Calþ l 0.829 0.681 b0 0.553 0.064 0.083 0.504 0.059 0.060
l 0.789 0.912 by0.874 0.048 0.054 0.897 0.044 0.044
Nonparametric distribution test (p value)b bw0.523 0.039 0.045 0.500 0.036 0.036
byy0.124 0.030 0.039 0.102 0.028 0.028
ln Calþ l 0.051 0.024 bww
�0.074 0.029 0.038 �0.097 0.027 0.027
l 0.355 0.384 byw0.076 0.025 0.035 0.098 0.023 0.023
s2l 0.139 0.043 0.046 0.122 0.037 0.037
s2v 0.031 0.012 0.012 0.028 0.011 0.011
The base model: mx ¼ 0, s2x ¼ 0:5, c ¼ s2l=s2v ¼ 4:2, k ¼ s2l þ s2v ¼ 0:15, corrðl�; xÞ ¼ 0; my ¼ 2, s2y ¼ 2:5,
mw ¼ 1, s2w ¼ 2:5, b0 ¼ 0:5, by ¼ 0:9, bw ¼ 0:5, byy ¼ 0:1, bww ¼ �0:1, byw ¼ 0:1; N ¼ 200, R ¼ 1000.aModel a is a mis-specified model of (24) ignoring the existence of allocative inefficiency. Model b is the
correctly specified model of (25) assuming the allocative inefficiency is observable. Notations: ln Calþ l is
the true value of the overall cost inefficiency consisting of both the allocative inefficiency and the technical
inefficiency, l is the true value of the technical inefficiency, la is the estimated technical inefficiency of
Model a, and lb is the estimated inefficiency of Model b. Computations of l, la, and lb are based on the
formula of conditional expectation by Jondrow et al. (1982). The RTS is the returns to scale statistics. The
terms B11 and B22 are the own-price elasticities of the first and the second inputs, respectively. The first row
of the table lists true values of the statistics based on generated data. They are averaged first over the
observations for each replication, and then over the replications.bThe non-parametric distribution test is based on Li (1996). The reported numbers are the p-values of
the null hypothesis that distributions of the corresponding statistics are equal.
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340 327
The rank correlation coefficient is a useful measure if one is interested incomparing relative efficiencies. Absolute efficiency distributions from the twomodels, however, can be quite different, although they may still produce the samerank correlation coefficient (which is invariant to a monotonic transformation ofabsolute efficiencies). Because of this we examine the efficiency distributions from thecompeting models. Specifically, we test (non-parametrically) whether H0 : gðxÞ ¼
hðxÞ for all x (against the alternative, H1 : gðxÞahðxÞ for some x), where gðxÞ andhðxÞ are the efficiency distributions from the two different models.
The p-values of the non-parametric tests are reported in Table 1. As shown, thenull hypothesis of equal distributions between la and l cannot be rejected, but theequality is rejected between the distributions of la and ln C
alþ l. The kernel
ARTICLE IN PRESS
-0.96 -0.7
0
57.7437 ˆ
(solid)�
ˆ
ˆ ˆ
(dashed)�b
�a
0.843 0.896
0.818837
80.4495 ˆ (solid)RTS
ˆ
(long dash)RTSa
ˆ
(short dash)RTSb
69.2872
0.758139
0.236 0.35 0.44
1nCal + ��a (solid)ˆ
�
(dashed)
ˆ ˆ1nCb + �bal
(solid)
�bˆ
(dashed)
(a) (b)
(c)
Fig. 1. Base case. (a) inefficiency distribution; (b) price elasticity distribution; and (c) RTS distribution.
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340328
densities plotted in Fig. 1(a) visually confirm the result. As expected, the distributionof la lies in between the distributions of l and ln C
alþ l.
For the price elasticities, the means of B11 and B22 are upward biased in the mis-specified model. For example, E(B11)¼ �0:806 for the mis-specified model while thetrue value is �0:871. The correctly specified model has E(B11) equal to �0:869. Thekernel densities in Fig. 1(b) shows the upward bias in the mean as well as the largevariance. As for the returns to scale, the result appears insensitive to the ignorance ofthe allocative inefficiency; we shall see appreciable differences in some of thesimulation models later.
Table 1 also shows the means, standard deviations, and the RMSEs of the bcoefficients. As indicated, the RMSE is about 40% larger in the mis-specified modelthan in the correctly specified model. The differences are attributed to the ignoranceof the allocative inefficiency in the model.
How do the above results help predict the consequences of ignoring allocativeinefficiency in the estimation model? In particular, are the estimated inefficiencyreasonable approximations of the true inefficiency? For the case of the base model,the answer is mixed: (a) The mean of la lies in-between the means of l and ln C
alþ l,
and the biases appear obvious; (b) the ranking of la is closer to the ranking ofln C
alþ l than that of l; and (c) the distribution of la is statistically indistinguishable
from the distribution of l, but is statistically different from that of ln Calþ l. Since
the conventional wisdom is that la captures overall cost inefficiency, the aboveresults do not support such a claim, i.e., the distribution of cost inefficiency frommodel a is statistically different from the distribution of costs of technical andallocative inefficiency.
ARTICLE IN PRESS
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340 329
3.5.2. Changes in s2xWe consider changing the variance of allocative inefficiency (s2x) from the base
value of 0.5 to 1 and 1.5. Increases in the variance imply that the firms’ inputallocations are more in errors, and we expect to see larger cost effects. Consequently,ignoring allocative inefficiency in the model should result in larger biases. The resultsare presented in Table 2 and Fig. 2.
Table 2
Changes in s2x
la lbModel a Model b
Mean s.d. RMSE Mean s.d. RMSE
Change from base: s2x ¼ 1
True: Eðln Calþ lÞ ¼ 0:410, EðlÞ ¼ 0:279, Eð ^RTSÞ ¼ 0:871, EðB11Þ ¼ �0:873, EðB22Þ ¼ �0:573
Mean 0.340 0.276
s.d. 0.046 0.044 ^RTS 0.871 0.024 0.873 0.021
Kendall’s rank coefficient ^B11 �0.740 0.088 �0.878 0.085
^B22 �0.489 0.055 �0.572 0.051
ln Calþ l 0.792 0.557 b0 0.555 0.064 0.084 0.503 0.059 0.059
l 0.691 0.913 by0.849 0.048 0.070 0.896 0.041 0.041
Nonparametric distribution test (p value) bw0.545 0.038 0.059 0.501 0.033 0.033
byy0.149 0.028 0.057 0.102 0.024 0.024
ln Calþ l 0.057 0.002 bww
�0.052 0.028 0.056 �0.099 0.024 0.024
l 0.174 0.382 byw0.053 0.023 0.052 0.100 0.020 0.020
s2l 0.185 0.047 0.079 0.123 0.036 0.036
s2v 0.032 0.012 0.012 0.028 0.010 0.010
Change from base: s2x ¼ 1:5
True: Eðln Calþ lÞ ¼ 0:460, EðlÞ ¼ 0:279, Eð ^RTSÞ ¼ 0:872, EðB11Þ ¼ �0:872, EðB22Þ ¼ �0:573
Mean 0.385 0.279
s.d. 0.046 0.042 ^RTS 0.873 0.025 0.875 0.021
Kendall’s rank coefficient ^B11 �0.684 0.084 �0.877 0.081
^B22 �0.456 0.052 �0.572 0.047
ln Calþ l 0.766 0.485 b0 0.544 0.063 0.077 0.503 0.055 0.055
l 0.624 0.913 by0.834 0.049 0.082 0.896 0.040 0.040
Nonparametric distribution test (p value) bw0.564 0.041 0.076 0.501 0.032 0.032
byy0.167 0.030 0.073 0.102 0.023 0.023
ln Calþ l 0.110 0.000 bww
�0.032 0.029 0.073 �0.099 0.022 0.022
l 0.048 0.403 byw0.033 0.024 0.071 0.099 0.018 0.018
s2l 0.237 0.055 0.128 0.125 0.035 0.035
s2v 0.032 0.013 0.013 0.028 0.010 0.010
ARTICLE IN PRESS
Fig. 2. Change variances of x. (a) inefficiency; s2x ¼ 1; (b) inefficiency; s2x ¼ 1:5; (c) price elas.; s2x ¼ 1; (d)
price elas.; s2x ¼ 1:5; (e) RTS; s2x ¼ 1; and (f) RTS; s2x ¼ 1:5.
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340330
As shown in Table 2, the first-order effect is indeed obvious. The mean of the trueoverall inefficiency, Eðln C
alþ lÞ, changes from the base case of 0.351 to 0.410
(s2x ¼ 1) and 0.460 (s2x ¼ 1:5).The biases associated with the mis-specified model increase as well when s2x
becomes larger. In the case of s2x ¼ 1:5, EðlaÞ ¼ 0:385 while EðlÞ ¼ 0:279 andEðln C
alþ lÞ ¼ 0:460. The expected inefficiency is clearly not representative of either
of the true values when the model is mis-specified. The lb, on the other hand,remains close to l. The Kendall correlation coefficients between la and the truevariables also deteriorate as s2x increases, while the correlation coefficient between lb
and l is not affected by the change of variance. Therefore, increases in the bias anddecreases in the correlation are due entirely to the ignorance of allocative inefficiencyin the mis-specified model.
Since our focus is on the observation-specific estimates, we examine thedistributions of the relevant variables. In Figs. 2(a) and (b), the kernel densities ofla shift to the right when s2x increases. Compared with the base case, the hypothesis
ARTICLE IN PRESS
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340 331
testing also shows that the equal distribution hypotheses regarding la are more likelyto be rejected (at least at the margin) when s2x increases.
The density plots of B11 in Figs. 2(c) and (d) clearly show the severe upward-biasesof the mean of B11 as well as the entire distribution of B11 when the model is mis-specified. The density plots of ^RTS in Figs. 2(e) and (f), on the other hand, showminimum impacts on the distribution of estimated returns to scale.
3.5.3. Changes in mxWe consider changing the mean of the allocative inefficiency (mx) to non-zero
values, thus allowing for systematic allocative inefficiency. The systematic bias isparticularly likely when producers operate in a regulated environment. We choosethe values of mx to be 0:5 and 1:5. We expect the systematic bias to cause a greatercost effect, thus ignoring the allocative inefficiency will lead to larger biases. Theresults are presented in Table 3 and Fig. 3.
As shown in Table 3, changing mx to non-zero values indeed leads to larger valuesof ln C
alþ l and larger RMSEs of the b coefficients. There are also noticeable
reductions in the Kendall correlation coefficients when the value of mx is large. Forinstance, with mx ¼ 1:5, the rank coefficient between la and l is 0.494, implying a25% probability of mis-ranking for a randomly chosen pair of observations. Thelarger mx also makes the equal distribution hypothesis between la and the truedistributions more likely to be rejected.
In many ways, the effects of an increase in mx are similar to the effectsof an increase in s2x; after all, both changes entail larger costs of allocativeinefficiency. A notable exception is the impact on the estimated RTS. As shown inTable 3 as well as Figs. 3(e) and (f), increases in mx lead to a significant downwardbias of ^RTS when the model is mis-specified. The bias is not observed in othersimulation cases.
3.5.4. Changes in corrðl�; xÞWe now allow the technical and allocative inefficiency to be correlated through the
correlation of l� and x. The correlation implies that technical inefficient producersalso have input ratios which are more in error (Schmidt and Lovell, 1980). A positivecorrelation between l� and x may make the model mis-specification less serious inview of measuring the overall cost effect, since the effect may be picked up byincluding only one of the inefficiency in the model. We note, however, that biases inother parts of the model, including the b coefficients and the model statistics, are notlikely to subside, because the allocative inefficiency variable, x, enters the costfunction in a highly non-linear way (see lnCal in (19)). We consider cases ofcorr(l�; x) equal to 0.3 and 0.8. The estimation results are in Table 4 while the densityplots are in Fig. 4.
It is shown that the mean of la increases as the correlation moves away from 0.The increase has such an effect that, when the correlation is equal to 0.8, the value ofEðlaÞ (0.366) even surpasses the value of Eðln C
alþ lÞ (0.347). Among other things,
the evidence suggests that researchers should not trust la as being bounded aboveby ln C
alþ l.
ARTICLE IN PRESS
Table 3
Changes in mx
la lbModel a Model b
Mean s.d. RMSE Mean s.d. RMSE
Change from base: mx ¼ 0:5
True: Eðln Calþ lÞ ¼ 0:391, EðlÞ ¼ 0:278, Eð ^RTSÞ ¼ 0:866, EðB11Þ ¼ �0:853, EðB22Þ ¼ �0:588
Mean 0.319 0.275
s.d. 0.047 0.044 ^RTS 0.845 0.020 0.867 0.020
Kendall’s rank coefficient ^B11 �0.834 0.157 �0.854 0.069
^B22 �0.532 0.060 �0.589 0.066
ln Calþ l 0.795 0.589 b0 0.533 0.066 0.074 0.502 0.065 0.065
l 0.720 0.912 by0.903 0.055 0.055 0.898 0.049 0.049
Nonparametric distribution test (p value) bw0.495 0.045 0.045 0.501 0.038 0.038
byy0.120 0.031 0.037 0.102 0.026 0.026
ln Calþ l 0.043 0.003 bww
�0.080 0.031 0.037 �0.099 0.025 0.025
l 0.263 0.382 byw0.081 0.027 0.033 0.099 0.022 0.022
s2l 0.163 0.045 0.061 0.122 0.036 0.036
s2v 0.032 0.012 0.012 0.027 0.011 0.011
Change from base: mx ¼ 1:5
True: Eðln Calþ lÞ ¼ 0:681, EðlÞ ¼ 0:278, Eð ^RTSÞ ¼ 0:855, EðB11Þ ¼ �0:815, EðB22Þ ¼ �0:616
Mean 0.392 0.274
s.d. 0.072 0.043 ^RTS 0.795 0.021 0.856 0.021
Kendall’s rank coefficient ^B11 �0.954 1.016 �0.816 0.035
^B22 �0.564 0.108 �0.622 0.083
ln Calþ l 0.702 0.317 b0 0.572 0.104 0.127 0.505 0.082 0.082
l 0.494 0.912 by1.003 0.091 0.138 0.899 0.058 0.058
Nonparametric distribution test (p value) bw0.395 0.077 0.130 0.498 0.046 0.046
byy0.099 0.039 0.039 0.100 0.025 0.025
ln Calþ l 0.000 0.000 bww
�0.100 0.039 0.039 �0.100 0.022 0.022
l 0.005 0.393 byw0.102 0.034 0.034 0.101 0.020 0.020
s2l 0.249 0.083 0.152 0.121 0.035 0.035
s2v 0.058 0.022 0.037 0.028 0.010 0.010
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340332
For the Kendall correlation coefficients, increases in corr(l�; x) have favorable butsmall effects on them. For instance, with the correlation equal to 0.8, the rankcoefficient between la and ln C
alþ l increases from 0.829 in the base case to 0.852 in
the current case. The result is understandable, because l alone may pick up most ofthe effects of x in this case when the latter is missing in the model.
ARTICLE IN PRESS
-0.96 -0.7
0
60.8718 ˆ (solid)
� ˆ (solid)
�
�a
ˆ (dashed)
�b
ˆ (dashed)
�b
ˆ
�aˆ
-0.96 -0.7
0.038039
61.1884
0.843 0.896
0.690995
68.5851
(lon
g da
sh)
ˆR
TS a ˆ
(solid)RTS
ˆ(short dash)RTSb
0.843 0.896
0.675521
70.4931ˆ
(solid)RTS
ˆ(long dash)RTSa
ˆ
(short dash)RTSb
73.9046
0.637158
0.236 0.35 0.44
64.7057
0.43395
0.236 0.35 0.44(a) (b)
(c) (d)
(e) (f)
Fig. 3. Change means of x. (a) inefficiency; mx ¼ 0:5; (b) inefficiency; mx ¼ 1:5; (c) price elas.; mx ¼ 0:5; (d)price elas.; mx ¼ 1:5; (e) RTS; mx ¼ 0:5; and (f) RTS; mx ¼ 1:5.
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340 333
The correlation makes almost all other estimates more erroneous. For example,the upward bias of input price elasticities is bigger, and the RMSEs also increase formost of the slope coefficients. The exception is on E( ^RTS). For that, we do not seemuch of the effect from changes in values of corr(l�; x).
3.5.5. Changes in cThe variance ratio of c ¼ s2l=s
2v has long been shown to be critical in how
precisely the model can be estimated. A larger variance ratio helps to identify thetruncated error component (i.e., l), and therefore it improves the estimation results.Here, we are interested in seeing if the identification of l has any consequential effecton the model when the model is mis-specified. With c ¼ 4:2 in the base case, weconsider c ¼ 2 and c ¼ 10 in the current simulation. The results are presented inTable 5 and Fig. 5.
Note that given k ¼ s2l þ s2v remains unchanged, a smaller value of c implies asmaller technical inefficiency effect, and therefore the values of EðlÞ and Eðln C
alþ
ARTICLE IN PRESS
Table 4
Changes in corrðl�; xÞ
la lbModel a Model b
Mean s.d. RMSE Mean s.d. RMSE
Change from base: corrðl�; xÞ ¼ 0:3
True: Eðln Calþ lÞ ¼ 0:351, EðlÞ ¼ 0:278, Eð ^RTSÞ ¼ 0:870, EðB11Þ ¼ �0:872, EðB22Þ ¼ �0:573
Mean 0.307 0.274
s.d. 0.046 0.044 ^RTS 0.870 0.021 0.872 0.020
Kendall’s rank coefficient ^B11 �0.803 0.182 �0.881 0.097
^B22 �0.525 0.061 �0.577 0.058
ln Calþ l 0.832 0.690 b0 0.540 0.061 0.073 0.507 0.056 0.057
l 0.793 0.913 by0.868 0.048 0.058 0.892 0.044 0.045
Nonparametric distribution test (p value) bw0.528 0.041 0.049 0.503 0.036 0.037
byy0.129 0.031 0.043 0.105 0.028 0.028
ln Calþ l 0.086 0.025 bww
�0.073 0.031 0.041 �0.101 0.028 0.028
l 0.307 0.404 byw0.072 0.027 0.038 0.098 0.024 0.024
s2l 0.151 0.042 0.052 0.121 0.036 0.036
s2v 0.029 0.011 0.011 0.028 0.010 0.010
Change from base: corrðl�; xÞ ¼ 0:8
True: Eðln Calþ lÞ ¼ 0:347, EðlÞ ¼ 0:274, Eð ^RTSÞ ¼ 0:868, EðB11Þ ¼ �0:873, EðB22Þ ¼ �0:572
Mean 0.366 0.266
s.d. 0.037 0.045 ^RTS 0.870 0.024 0.872 0.023
Kendall’s rank coefficient ^B11 �0.762 0.118 �0.950 0.259
^B22 �0.500 0.062 �0.622 0.060
ln Calþ l 0.852 0.756 b0 0.472 0.055 0.062 0.519 0.059 0.062
l 0.827 0.904 by0.857 0.052 0.068 0.891 0.045 0.045
Nonparametric distribution test (p value) bw0.538 0.042 0.057 0.502 0.036 0.036
byy0.141 0.033 0.052 0.102 0.027 0.027
ln Calþ l 0.324 0.117 bww
�0.059 0.033 0.053 �0.122 0.027 0.035
l 0.067 0.388 byw0.060 0.029 0.049 0.108 0.024 0.025
s2l 0.215 0.043 0.104 0.114 0.035 0.036
s2v 0.022 0.009 0.011 0.028 0.011 0.011
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340334
lÞ change in the same direction as c, as shown in Table 5. As expected, the standarddeviations and the RMSEs of the b coefficients are smaller when c is larger,indicating the model is more precisely estimated. This is true for both the mis-specified and the correctly-specified models. The Kendall correlation coefficients alsoimprove when c increases. Other than these changes, the results are qualitativelysimilar to those of the base case.
ARTICLE IN PRESS
-0.96 -0.7
0
64.9542ˆ
(solid)
ˆ (dashed)
ˆ
-0.96 -0.7
0
80.674
ˆ (solid)
�
�a
�b
�b�a
�
ˆ
(dashed)
ˆ
0.843 0.896
0.694728
63.219 ˆ (solid)
RTS
ˆ(long dash)
RTSa
ˆ
(short dash)RTSb
0.843 0.896
0.720096
69.283ˆ
(solid)RTS ˆ
(long dash)RTSa
ˆ(short dash)RTSb
74.5665
0.888397
0.236 0.35 0.44
70.4403
0.502918
0.236 0.35 0.44(a) (b)
(c) (d)
(e) (f)
Fig. 4. Non-zero correlations between l� and x. (a) inefficiency; corrðl�; xÞ ¼ 0:3; (b) inefficiency;
corrðl�; xÞ ¼ 0:8; (c) price elas.; corrðl�; xÞ ¼ 0:3; (d) price elas.; corrðl�; xÞ ¼ 0:8; (e) RTS;
corrðl�; xÞ ¼ 0:3; and (f) RTS; corrðl�; xÞ ¼ 0:8.
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340 335
3.5.6. Changes in N
We examine here whether the consequences of model mis-specification aresensitive to the choice of sample size. We consider cases of N ¼ 50 and N ¼ 500. Theresults are in Table 6 and Fig. 6.
When the sample size is small, the mean of lb is downward biased, which is consistentwith the finding of Wang and Schmidt, 2002. Although it appears that EðlaÞ is close toEðlÞ, the result is only incidental. The inability to reject the equal-distribution hypothesiswhen the sample size is small may owe to the fact that the power of the non-parametrictest is compromised by the small sample size. On the other hand, the larger variancesand RMSEs shown in the Table are typical for a small sample estimation.
When the sample size is relatively large ðN ¼ 500Þ, the biases of the mis-specifiedmodel are clearly demonstrated. We note that the results from the case of N ¼ 500are consistent with those from the base case where N ¼ 200 is assumed. Therefore,the simulation results presented in this article are not driven by the phenomenon ofsmall sample sizes.
ARTICLE IN PRESS
Table 5
Changes in c
la lbModel a Model b
Mean s.d. RMSE Mean s.d. RMSE
Change from base: c ¼ 2
True: Eðln Calþ lÞ ¼ 0:325, EðlÞ ¼ 0:252, Eð ^RTSÞ ¼ 0:869, EðB11Þ ¼ �0:872, EðB22Þ ¼ �0:573
Mean 0.272 0.250
s.d. 0.059 0.059 ^RTS 0.869 0.023 0.870 0.022
Kendall’s rank coefficient ^B11 �0.814 0.140 �0.877 0.117
^B22 �0.531 0.067 �0.573 0.064
ln Calþ l 0.769 0.623 b0 0.548 0.078 0.092 0.502 0.076 0.076
l 0.799 0.910 by0.876 0.053 0.058 0.900 0.050 0.050
Nonparametric distribution test (p value) bw0.524 0.044 0.050 0.500 0.040 0.040
byy0.124 0.033 0.041 0.101 0.031 0.031
ln Calþ l 0.061 0.028 bww
�0.076 0.033 0.041 �0.099 0.030 0.030
l 0.223 0.221 byw0.076 0.027 0.036 0.099 0.025 0.025
s2l 0.121 0.048 0.052 0.103 0.043 0.044
s2v 0.049 0.016 0.016 0.047 0.015 0.015
Change from base: c ¼ 10
True: Eðln Calþ lÞ ¼ 0:367, EðlÞ ¼ 0:294, Eð ^RTSÞ ¼ 0:869, EðB11Þ ¼ �0:872, EðB22Þ ¼ �0:573
Mean 0.310 0.291
s.d. 0.033 0.030 ^RTS 0.868 0.020 0.870 0.019
Kendall’s rank coefficient ^B11 �0.815 0.142 �0.869 0.081
^B22 �0.533 0.052 �0.571 0.048
ln Calþ l 0.872 0.716 b0 0.554 0.051 0.074 0.504 0.044 0.045
l 0.774 0.915 by0.876 0.042 0.048 0.896 0.038 0.038
Nonparametric distribution test (p value) bw0.520 0.034 0.040 0.500 0.031 0.031
byy0.123 0.026 0.035 0.103 0.024 0.024
ln Calþ l 0.072 0.026 bww
�0.077 0.026 0.035 �0.098 0.023 0.023
l 0.479 0.521 byw0.078 0.022 0.031 0.098 0.020 0.020
s2l 0.153 0.032 0.035 0.135 0.027 0.027
s2v 0.016 0.007 0.007 0.013 0.006 0.006
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340336
4. Conclusion
In this paper we have shown that failure to model the presence of the allocativeinefficiency component in a flexible (translog) cost function leads to biased estimatesof both parameters and cost inefficiency. In particular, we find that the technological
ARTICLE IN PRESS
-0.96 -0.7
0
64.9578
-0.96 -0.7
0
58.6753
ˆ (solid)
ˆ (solid)
ˆb
ˆa
ˆa (dashed)
ˆb
(dashed)
0.843 0.896
0.657396
55.7472
ˆ(solid)RTS
ˆ(long dash)
RTSa ˆ
(short dash)RTSb
0.843 0.896
0
61.1896 ˆ(solid)
RTS
ˆ(long dash)
RTSa
ˆ
(short dash)RTSb
108.112
0
0.236 0.35 0.44
61.2856
0
0.236 0.35 0.44(a) (b)
(c) (d)
(e) (f)
Fig. 5. Change c. (a) inefficiency; c ¼ 2; (b) inefficiency; c ¼ 10; (c) price elas.; c ¼ 2; (d) price elas.;
c ¼ 10; (e) RTS; c ¼ 2; and (f) RTS; c ¼ 10.
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340 337
parameters in the mis-specified model are biased, and their RMSEs can be more thantwice as large as those from a correctly specified model. The biases are particularlyprominent when the variance of the allocative error is large and when the mean ofthe error deviates from zero (systematic input allocation errors). The input priceelasticities tend to be overestimated, and the variances of the estimated elasticities arequite large (compared to those from a correctly specified model). We also find thatthe returns to scale statistic is underestimated when the true model exhibitssystematic input allocation errors. For the cost inefficiency, the estimated valueusually falls between the true technical inefficiency and the true overall inefficiency.In all the cases, the mean biases (with respect to both technical and overallinefficiency) are obvious. Based on these results we conclude that the widely heldview, which ignores the costs of allocative inefficiency as not likely to affect thetechnological parameters, model statistics, and the measures of cost inefficiency, isnot correct.
ARTICLE IN PRESS
Table 6
Changes in N
la lbModel a Model b
Mean s.d. RMSE Mean s.d. RMSE
Change from base: N ¼ 50
True: Eðln Calþ lÞ ¼ 0:351, EðlÞ ¼ 0:279, Eð ^RTSÞ ¼ 0:870, EðB11Þ ¼ �0:873, EðB22Þ ¼ �0:572
Mean 0.277 0.264
s.d. 0.070 0.065 ^RTS 0.877 0.048 0.876 0.045
Kendall’s rank coefficient ^B11 �1.204 12.937 �0.883 0.389
^B22 �0.527 0.154 �0.564 0.134
ln Calþ l 0.753 0.637 b0 0.572 0.121 0.141 0.520 0.115 0.117
l 0.723 0.812 by0.870 0.112 0.116 0.890 0.103 0.103
Nonparametric distribution test (p value) bw0.515 0.091 0.092 0.495 0.083 0.083
byy0.126 0.070 0.074 0.107 0.062 0.063
ln Calþ l 0.266 0.232 bww
�0.068 0.069 0.076 �0.089 0.060 0.061
l 0.507 0.501 byw0.076 0.058 0.063 0.096 0.052 0.052
s2l 0.129 0.059 0.060 0.117 0.052 0.052
s2v 0.028 0.017 0.017 0.024 0.014 0.015
Change from base: N ¼ 500
True: Eðln Calþ lÞ ¼ 0:351, EðlÞ ¼ 0:278, Eð ^RTSÞ ¼ 0:869, EðB11Þ ¼ �0:872, EðB22Þ ¼ �0:573
Mean 0.295 0.276
s.d. 0.026 0.025 ^RTS 0.868 0.013 0.870 0.013
Kendall’s rank coefficient ^B11 �0.810 0.058 �0.873 0.055
^B22 �0.531 0.036 �0.572 0.035
ln Calþ l 0.849 0.689 b0 0.552 0.037 0.064 0.503 0.035 0.035
l 0.803 0.946 by0.874 0.029 0.039 0.897 0.027 0.027
Nonparametric distribution test (p value) bw0.525 0.024 0.035 0.501 0.022 0.022
byy0.124 0.018 0.030 0.102 0.016 0.016
ln Calþ l 0.001 0 bww
�0.077 0.018 0.029 �0.099 0.016 0.016
l 0.312 0.397 byw0.077 0.015 0.028 0.099 0.014 0.014
s2l 0.138 0.024 0.029 0.121 0.021 0.021
s2v 0.031 0.007 0.007 0.028 0.006 0.006
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340338
Given the above conclusions, it is perhaps worth pointing out some alternativeestimation strategies that accommodate both technical and allocative inefficiency.Since both the inefficiency components cannot be estimated from the results of asingle equation approach, the alternative techniques have to be based on a systemapproach. The first approach is to use a dual cost system (e.g., a translog cost andthe associated share equations as in Kumbhakar, 1997). Kumbhakar and Tsionas
ARTICLE IN PRESS
-0.96 -0.7
0
82.7727
ˆ
(solid)�
ˆ
(solid)�
�b �a
�a
(dashed)
�b (dashed)
-0.96 -0.7
0
77.3084
0.843 0.896
1.79822
57.8733
ˆ(long dash)RTSa
ˆ(long dash)RTSa
ˆ
(solid)RTS
ˆ
(solid)RTS
ˆ
(short dash)RTSb
ˆ
(short dash)RTSb
0.843 0.896
0.553808
66.668
78.5721
2.04835
0.236 0.35 0.44
77.1356
0
0.236 0.35 0.44(a) (b)
(c) (d)
(e) (f)
Fig. 6. Change N. (a) inefficiency; N ¼ 50; (b) inefficiency; N ¼ 500; (c) price elas.; N ¼ 50; (d) price elas.;
N ¼ 500; (e) RTS; N ¼ 50; and (f) RTS; N ¼ 500.
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340 339
(2003a, b) utilized this framework to estimate the costs of technical and allocativeinefficiency in both cross-sectional and panel data settings, using a Bayesianapproach. The second approach, which is taken by Kumbhakar and Wang (2003), isto use a primal production system (consisting of the production function and thefirst-order conditions of cost minimization) to estimate technical and allocativeinefficiency. These estimates are then used to compute the increase in costs due totechnical and allocative inefficiency separately.
Acknowledgements
The authors thank two referees for helpful comments. Hung-Jen Wang gratefullyacknowledges financial support by National Science Council of Taiwan Grant no.92-2415-H-001-032.
ARTICLE IN PRESS
S.C. Kumbhakar, H.-J. Wang / Journal of Econometrics 134 (2006) 317–340340
References
Aigner, D., Lovell, C.A.K., Schmidt, P., 1977. Formulation and estimation of stochastic frontier
production function models. Journal of Econometrics 6, 21–37.
Coelli, T., 1995. Estimators and hypothesis tests for a stochastic frontier function: a Monte Carlo analysis.
Journal of Productivity Analysis 6, 247–268.
Coelli, T., Prasada Rao, D.S., Battese, G.E., 1998. An Introduction to Efficiency and Productivity
Analysis. Kluwer Academic Publishers, Boston.
Coelli, T., Estache, A., Perelman, S., Trujillo, L., 2003. A Primer on Efficiency Measurement for Utilities
and Transport Regulators. (World Bank Publication).
Farrell, M.J., 1957. The measurement of productive efficiency. Journal of the Royal Statistical Society
Series A 120, 253–281.
Jondrow, J., Lovell, C.A.K., Materov, I.S., Schmidt, P., 1982. On the estimation of technical inefficiency
in the stochastic frontier production function model. Journal of Econometrics 19, 233–238.
Kendall, M.G., Gibbons, J.D., 1990. Rank Correlation Methods. Oxford University Press, New York.
Kumbhakar, S.C., 1997. Modeling allocative inefficiency in a translog cost function and cost share
equations: an exact relationship. Journal of Econometrics 76, 351–356.
Kumbhakar, S.C., Lovell, C.A.K., 2000. Stochastic frontier analysis, Cambridge University press,
New York.
Kumbhakar, S.C., Tsionas, E.G., 2003a. The joint measurement of technical and allocative inefficiency: an
application of Bayesian inference in nonlinear random effects models, Working Paper.
Kumbhakar, S.C., Tsionas, E.G., 2003b. Measuring technical and allocative inefficiency in the translog
cost system: a Bayesian approach, Working Paper.
Kumbhakar, S.C, Wang, H.J., 2003. Estimation of technical and allocative inefficiency in a stochastic
frontier production model: a system approach, Working Paper.
Li, Q., 1996. Nonparametric testing of closeness between two unknown distribution functions.
Econometric Reviews 15, 261–274.
Meeusen, W., van den Broeck, J., 1977. Technical efficiency and dimension of the firm: some results on the
use of frontier production functions. Empirical Economics 2, 109–122.
Olson, J.A., Schmidt, P., Waldman, D.M., 1980. A Monte Carlo study of estimators of stochastic frontier
production functions. Journal of Econometrics 13, 67–82.
Schmidt, P., Lovell, C.A.K., 1979. Estimating technical and allocative inefficiency relative to stochastic
production and cost frontiers. Journal of Econometrics 9, 343–366.
Schmidt, P., Lovell, C.A.K., 1980. Estimating stochastic production and cost frontiers when technical and
allocative inefficiency are correlated. Journal of Econometrics 13, 83–100.
Varian, H.R., 1992. Microeconomic Analysis. Norton, New York.
Wang, H.J., Schmidt, P., 2002. One-step and two-step estimation of the effects of exogenous variables on
technical efficiency levels. Journal of Productivity Analysis 18, 129–144.