Econometrics in Health Economics Discrete Choice Modeling and Frontier Modeling and Efficiency...


Econometrics in Health Economics

Discrete Choice Modeling and

Frontier Modeling and Efficiency Estimation

Professor William Greene

Stern School of Business, New York University

September 2-4, 2007

Frontier and Efficiency Estimation

Session 5: Efficiency Analysis; Stochastic Frontier Model; Efficiency Estimation

Session 6: Panel Data Models and Heterogeneity; Fixed and Random Effects; Bayesian and Classical Estimation

Session 7: Efficiency Models; Stochastic Frontier and Data Envelopment Analysis; Student Presentation: Silvio Daidone and Francesco D’Amico

Session 8: Computer Exercises and Applications

The Production Function

“A single output technology is commonly described by means of a production function f(z) that gives the maximum amount q of output that can be produced using input amounts (z1,…,zL-1) > 0.”

“Microeconomic Theory,” Mas-Colell, Whinston, Green: Oxford, 1995, p. 129. See also Samuelson (1938) and Shephard (1953).

Thoughts on Inefficiency

Failure to achieve the theoretical maximum
Hicks (ca. 1935) on the benefits of monopoly
Leibenstein (ca. 1966): X inefficiency
Debreu, Farrell (1950s) on management inefficiency

All related to firm behavior in the absence of market restraint – the exercise of market power.

A History of Empirical Investigation

Cobb-Douglas (1927)
Arrow, Chenery, Minhas, Solow (1963)
Joel Dean (1940s, 1950s)
Johnston (1950s)
Nerlove (1960)
Christensen et al. (1972)

Inefficiency in the “Real” World

Measurement of inefficiency in “markets” – heterogeneous production outcomes:
Aigner and Chu (1968)
Timmer (1971)
Aigner, Lovell, Schmidt (1977)
Meeusen, van den Broeck (1977)

Production Functions

Production is a process of transformation of a set of inputs, denoted \( x \in R^K_+ \), into a set of outputs, \( y \in R^M_+ \).

Transformation of inputs to outputs is via the transformation function: T(y,x) = 0.

Defining the Production Set

Level set:

\[ L(y) = \{ x : (y, x) \text{ is producible} \} \]

The production function is defined by the isoquant:

\[ I(y) = \{ x : x \in L(y) \text{ and } \lambda x \notin L(y) \text{ if } 0 \le \lambda < 1 \} \]

The efficient subset is defined in terms of the level sets:

\[ ES(y) = \{ x : x \in L(y) \text{ and } x' \notin L(y) \text{ for } x' \text{ when } x'_k \le x_k \ \forall k \text{ and } x'_j < x_j \text{ for some } j \} \]

Isoquants and Level Sets

The Distance Function

Inefficiency

Production Function Model with Inefficiency

Cost Inefficiency

y* = f(x), C* = g(y*,w)

(Samuelson – Shephard duality results)

Cost inefficiency: If y < f(x), then C must be greater than g(y,w). Implies the idea of a cost frontier.

lnC = lng(y,w) + u, u > 0.

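Specification

Cobb-Douglas

\[ \ln y = \alpha + \sum_{k=1}^{K} \beta_k \ln x_k \]

Translog

\[ \ln y = \alpha + \sum_{k=1}^{K} \beta_k \ln x_k + \tfrac{1}{2} \sum_{k=1}^{K} \sum_{m=1}^{K} \gamma_{km} \ln x_k \ln x_m \]

Box-Cox transformations to cope with zeros

Regularity Conditions: Monotonicity and Concavity

Translog Cost Model

\[ \ln C = \alpha + \sum_{k=1}^{K} \beta_k \ln w_k + \tfrac{1}{2} \sum_{k=1}^{K} \sum_{m=1}^{K} \gamma_{km} \ln w_k \ln w_m + \sum_{s=1}^{L} \delta_s \ln y_s + \tfrac{1}{2} \sum_{s=1}^{L} \sum_{t=1}^{L} \phi_{st} \ln y_s \ln y_t + \sum_{k=1}^{K} \sum_{s=1}^{L} \theta_{ks} \ln w_k \ln y_s \]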

Corrected Ordinary Least Squares

Modified OLS

An alternative approach that requires a parametric model of the distribution of ui is modified OLS (MOLS). The OLS residuals, save for the constant displacement, are pointwise consistent estimates of their population counterparts, -ui. Suppose that ui has an exponential distribution with mean λ. Then the variance of ui is λ², so the standard deviation of the OLS residuals is a consistent estimator of E[ui] = λ. Since this is a one-parameter distribution, the entire model for ui can be characterized by this parameter and functions of it. The estimated frontier function can now be displaced upward by this estimate of E[ui].
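A minimal sketch of the MOLS idea for the exponential case, using only the Python standard library. The simulated data, the single regressor, and the names `ols_simple` and `mols_exponential` are illustrative assumptions, not part of the original slides. The frontier here is deterministic (no symmetric noise term), so the standard deviation of the OLS residuals estimates λ = E[ui], and the OLS intercept is shifted upward by that amount.

```python
import random
import statistics


def ols_simple(x, y):
    """OLS intercept and slope for a single regressor (closed form)."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def mols_exponential(x, y):
    """MOLS under u ~ exponential(mean lambda): shift the OLS constant by sd(residuals)."""
    a, b = ols_simple(x, y)
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    lam_hat = statistics.pstdev(resid)  # sd of residuals estimates E[u] = lambda
    return a + lam_hat, b, lam_hat


# Hypothetical data: y = alpha + beta*x - u with u ~ exponential(mean 0.3)
rng = random.Random(12345)
alpha, beta, lam = 1.0, 0.5, 0.3
x = [rng.uniform(0.0, 4.0) for _ in range(5000)]
y = [alpha + beta * xi - rng.expovariate(1.0 / lam) for xi in x]

a_mols, b_mols, lam_hat = mols_exponential(x, y)
print(a_mols, b_mols, lam_hat)
```

The unshifted OLS constant estimates α - λ; only the intercept changes, and the slope is the OLS slope.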

COLS and MOLS

Deterministic Frontier: Programming Estimators

Estimating Inefficiency

Statistical Problems with Programming Estimators

They do correspond to MLEs.
The likelihood functions are “irregular.”
There are no known statistical properties – no estimable covariance matrix for estimates.
They might be “robust,” like LAD. No one knows for sure. Never demonstrated.

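A Model with a Statistical Basis

Gamma Frontier Model (Greene (1980))

\[ \ln y_i = \alpha + \sum_{k=1}^{K} \beta_k x_{ki} + \varepsilon_i = \alpha + \sum_{k=1}^{K} \beta_k x_{ki} - u_i \]

\[ h(u_i) = \frac{\theta^P}{\Gamma(P)} u_i^{P-1} e^{-\theta u_i}, \quad u_i \ge 0, \ \theta > 0, \ P > 2 \]

\[ \ln L(\alpha, \beta, P, \theta) = N P \ln\theta - N \ln\Gamma(P) + (P-1) \sum_{i=1}^{N} \ln u_i - \theta \sum_{i=1}^{N} u_i \]

\[ u_i = \alpha + \beta^T x_i - y_i > 0 \]

Virtues: Known statistical properties, regular likelihood, etc.

Flaws: Completely unwieldy, impractical. (Nonetheless, was used in several empirical studies.)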
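As a hedged illustration, the gamma deterministic frontier log-likelihood, ln L = N[P ln θ - ln Γ(P)] + (P-1) Σ ln ui - θ Σ ui with ui = α + β'xi - yi > 0, can be coded directly. The function name, argument layout, and the two-observation data set below are assumptions made for this sketch. Parameter values that leave any observation above the frontier (ui ≤ 0) are infeasible, so the function returns minus infinity for them.

```python
import math


def gamma_frontier_loglik(alpha, beta, P, theta, X, y):
    """Log-likelihood of the deterministic gamma frontier (sketch)."""
    # u_i = alpha + beta'x_i - y_i must be strictly positive for every observation
    u = [alpha + sum(b * xk for b, xk in zip(beta, xi)) - yi
         for xi, yi in zip(X, y)]
    if min(u) <= 0.0:
        return float("-inf")
    n = len(u)
    return (n * (P * math.log(theta) - math.lgamma(P))
            + (P - 1.0) * sum(math.log(ui) for ui in u)
            - theta * sum(u))


# Hypothetical data: two observations, one regressor
X = [[1.0], [2.0]]
y = [0.5, 0.0]
ll = gamma_frontier_loglik(alpha=1.0, beta=[0.5], P=3.0, theta=1.5, X=X, y=y)
ll_infeasible = gamma_frontier_loglik(alpha=0.0, beta=[0.0], P=3.0, theta=1.5, X=X, y=y)
print(ll, ll_infeasible)
```

The positivity constraint on every ui is what ties the parameter space to the data and makes estimation awkward in practice.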

Extensions

Cost frontiers, based on duality results:

ln y = f(x) – u, ln C = g(y,w) + u’

u > 0, u’ > 0. Economies of scale and allocative inefficiency blur the relationship.

Corrected and modified least squares estimators based on the deterministic frontiers are easily constructed.

Data Envelopment Analysis

Methodological Problems

Measurement error
Outliers
Specification errors
The overall problem with the deterministic frontier approach

Stochastic Frontier Models

Motivation:
Factors not under control of the firm
Measurement error
Differential rates of adoption of technology

The frontier is randomly placed by the whole collection of stochastic elements which might enter the model outside the control of the firm.

Aigner, Lovell, Schmidt (1977), Meeusen, van den Broeck (1977)

Stochastic Frontier Model

\[ y_i = f(x_i)\, TE_i\, e^{v_i} \]

\[ \ln y_i = \alpha + \beta' x_i + v_i - u_i = \alpha + \beta' x_i + \varepsilon_i \]

ui > 0, but vi may take any value. A symmetric distribution, such as the normal distribution, is usually assumed for vi. Thus, the stochastic frontier is

\[ \alpha + \beta' x_i + v_i \]

and, as before, ui represents the inefficiency.

Least Squares Estimation

Average inefficiency is embodied in the third moment of the disturbance εi = vi - ui.

So long as E[vi - ui] is constant, the OLS estimates of the slope parameters of the frontier function are unbiased and consistent. (The constant term estimates α-E[ui].) The average inefficiency present in the distribution is reflected in the asymmetry of the distribution, which can be estimated using the OLS residuals:

\[ m_3 = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{\varepsilon}_i - \hat{E}[\varepsilon_i] \right)^3 \]
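A small sketch of the third-moment calculation, m3 = (1/N) Σ (ei - mean(e))³. For simplicity the composed errors εi = vi - ui are simulated directly and stand in for OLS residuals; the parameter values, seed, and the name `central_moment` are assumptions of this example. With a normal vi and a half-normal ui the sample third moment should come out negative.

```python
import random
import statistics


def central_moment(e, r):
    """r-th sample central moment of a list of residuals."""
    ebar = statistics.fmean(e)
    return sum((ei - ebar) ** r for ei in e) / len(e)


# Hypothetical composed errors: v ~ N(0, 0.2^2), u = |N(0, 0.4^2)|
rng = random.Random(2007)
sigma_v, sigma_u = 0.2, 0.4
eps = [rng.gauss(0.0, sigma_v) - abs(rng.gauss(0.0, sigma_u))
       for _ in range(20000)]

m2 = central_moment(eps, 2)
m3 = central_moment(eps, 3)
print(m2, m3)  # m3 is negative: the distribution is skewed toward inefficiency
```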

Application to Spanish Dairy Farms

Variable  Units                                                 Mean      Std. Dev.  Minimum   Maximum
Milk      Milk production (liters)                              131,108   92,539     14,110    727,281
Cows      # of milking cows                                     2.12      11.27      4.5       82.3
Labor     # man-equivalent units                                1.67      0.55       1.0       4.0
Land      Hectares of land devoted to pasture and crops         12.99     6.17       2.0       45.1
Feed      Total amount of feedstuffs fed to dairy cows (tons)   57,941    47,981     3,924.1   376,732

N = 247 farms, T = 6 years (1993-1998)

Example: Dairy Farms

[Figure: Kernel density estimate for EI. Horizontal axis: EI; vertical axis: Density.]

The Normal-Half Normal Model

Normal-Half Normal Variable

Decomposition

Standard Form

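Estimation: Least Squares/MoM

OLS estimator of β is consistent
E[ui] = (2/π)^{1/2} σu, so the OLS constant estimates α - (2/π)^{1/2} σu
Second and third moments of OLS residuals estimate

\[ m_2 = \sigma_v^2 + \frac{\pi - 2}{\pi}\, \sigma_u^2 \quad \text{and} \quad m_3 = \sqrt{\frac{2}{\pi}} \left( 1 - \frac{4}{\pi} \right) \sigma_u^3 \]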

A Problem with Method of Moments

Estimator of σu is [m3/(-.21801)]^{1/3}

Theoretical m3 is < 0

Sample m3 may be > 0. If so, no solution for σu. (Negative to 1/3 power.)
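The moment equations for the normal-half normal model, m2 = σv² + (1 - 2/π)σu² and m3 = (2/π)^{1/2}(1 - 4/π)σu³ ≈ -.21801 σu³, can be inverted as in the sketch below. The function name `mom_half_normal` and the choice to return `None` when the sample skewness has the wrong sign are assumptions of this example, not the author's procedure.

```python
import math


def mom_half_normal(m2, m3):
    """Method-of-moments estimates (sigma_u, sigma_v^2) from residual moments.

    Returns None when m3 >= 0, the case in which no solution for sigma_u exists.
    """
    if m3 >= 0.0:
        return None
    k3 = math.sqrt(2.0 / math.pi) * (1.0 - 4.0 / math.pi)  # about -0.21801
    sigma_u = (m3 / k3) ** (1.0 / 3.0)
    # The implied sigma_v^2 can itself turn out negative in a given sample
    sigma_v2 = m2 - (1.0 - 2.0 / math.pi) * sigma_u ** 2
    return sigma_u, sigma_v2


# Population moments implied by sigma_u = 0.4, sigma_v = 0.2
k3 = math.sqrt(2.0 / math.pi) * (1.0 - 4.0 / math.pi)
m2_true = 0.2 ** 2 + (1.0 - 2.0 / math.pi) * 0.4 ** 2
m3_true = k3 * 0.4 ** 3

recovered = mom_half_normal(m2_true, m3_true)
wrong_skew = mom_half_normal(0.1, 0.01)
print(recovered, wrong_skew)
```

Given the estimate of σu, the OLS constant can be corrected by adding (2/π)^{1/2} σu.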

Likelihood Function

Waldman (1982) result on skewness of OLS residuals: If the OLS residuals are positively skewed, rather than negative, then OLS maximizes the log likelihood, and there is no evidence of inefficiency in the data.

Alternative Model: Exponential

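Normal-Exponential Likelihood

\[ \ln L(\text{data}; \beta, \sigma_v, \sigma_u) = \sum_{i=1}^{n} \left[ -\ln\sigma_u + \frac{1}{2} \left( \frac{\sigma_v}{\sigma_u} \right)^2 + \ln\Phi\!\left( \frac{-\left( (v_i - u_i) + \sigma_v^2/\sigma_u \right)}{\sigma_v} \right) + \frac{v_i - u_i}{\sigma_u} \right] \]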
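A sketch of the normal-exponential log-likelihood contribution, ln f(ε) = -ln σu + ½(σv/σu)² + ln Φ(-(ε + σv²/σu)/σv) + ε/σu with ε = v - u. The function names are illustrative, and the normal CDF is written with `math.erfc` (an implementation choice of this sketch) so that the left tail stays accurate. As a sanity check, the implied density is integrated numerically over a grid and should come out close to one.

```python
import math


def norm_cdf(z):
    """Standard normal CDF via erfc, accurate in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def log_density(eps, sigma_v, sigma_u):
    """Log density of eps = v - u, v ~ N(0, sigma_v^2), u ~ exponential(mean sigma_u)."""
    z = -(eps + sigma_v ** 2 / sigma_u) / sigma_v
    return (-math.log(sigma_u) + 0.5 * (sigma_v / sigma_u) ** 2
            + math.log(norm_cdf(z)) + eps / sigma_u)


def loglik(residuals, sigma_v, sigma_u):
    """Sum of log densities over residuals e_i = y_i - alpha - beta'x_i."""
    return sum(log_density(e, sigma_v, sigma_u) for e in residuals)


# Riemann-sum check that the density integrates to about 1 on [-6, 3]
step = 0.001
total_mass = sum(math.exp(log_density(-6.0 + step * k, 0.2, 0.4)) * step
                 for k in range(9001))
print(total_mass)
```

In an actual estimation the residuals depend on (α, β), and `loglik` would be passed to a numerical optimizer over all parameters.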

Truncated Normal Model

Normal-Truncated Normal

Other Models

Other parametric models (we will examine gamma later in the course)
Semiparametric and nonparametric – the recent outer reaches of the theoretical literature
Other variations including heterogeneity in the frontier function and in the distribution of inefficiency

Estimating ui

No direct estimate of ui

Data permit estimation of yi – β’xi. Can this be used? εi = yi – β’xi = vi – ui

Indirect estimate of ui, using E[ui|vi – ui]

vi – ui is estimable with ei = yi – b’xi.

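Fundamental Tool - JLMS

\[ E[u_{it} \mid \varepsilon_{it}] = \frac{\sigma\lambda}{1 + \lambda^2} \left[ \tilde{\mu}_{it} + \frac{\phi(\tilde{\mu}_{it})}{\Phi(\tilde{\mu}_{it})} \right], \quad \tilde{\mu}_{it} = \frac{-\lambda \varepsilon_{it}}{\sigma} \]

where λ = σu/σv and σ² = σu² + σv².

We can insert our maximum likelihood estimates of all parameters.

Note: This estimates E[ui|vi – ui], not ui.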
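A hedged sketch of the JLMS calculation for the normal-half normal model, E[u|ε] = [σλ/(1+λ²)] [μ̃ + φ(μ̃)/Φ(μ̃)] with μ̃ = -λε/σ, λ = σu/σv and σ² = σu² + σv². The function names and the parameter values are assumptions of this example; in practice the maximum likelihood estimates and the residuals ei = yi - b'xi would be plugged in.

```python
import math


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def jlms_half_normal(eps, sigma_u, sigma_v):
    """E[u | eps] for eps = v - u, v normal, u half normal."""
    sigma = math.hypot(sigma_u, sigma_v)   # sqrt(sigma_u^2 + sigma_v^2)
    lam = sigma_u / sigma_v
    mu_tilde = -lam * eps / sigma
    return (sigma * lam / (1.0 + lam ** 2)
            * (mu_tilde + norm_pdf(mu_tilde) / norm_cdf(mu_tilde)))


# Hypothetical residuals: more negative residuals imply larger estimated inefficiency
estimates = [jlms_half_normal(e, 0.4, 0.2) for e in (-0.5, 0.0, 0.5)]
print(estimates)
```

The estimate is strictly positive even for large positive residuals, because it is a conditional mean of a nonnegative variable rather than an estimate of ui itself.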

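Other Distributions

For the Normal-Truncated Normal Model

\[ E[u_{it} \mid \varepsilon_{it}] = \frac{\sigma\lambda}{1 + \lambda^2} \left[ \tilde{\mu}_{it} + \frac{\phi(\tilde{\mu}_{it})}{\Phi(\tilde{\mu}_{it})} \right], \quad \tilde{\mu}_{it} = \frac{\mu}{\sigma\lambda} - \frac{\lambda \varepsilon_{it}}{\sigma} \]

For the Normal-Exponential Model

\[ E[u_{it} \mid \varepsilon_{it}] = z_{it} + \sigma_v \frac{\phi(z_{it}/\sigma_v)}{\Phi(z_{it}/\sigma_v)}, \quad z_{it} = -\varepsilon_{it} - \sigma_v^2/\sigma_u \]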

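Efficiency

\[ E[\exp(-u_i) \mid \varepsilon_i] = \frac{\Phi\left[ (\mu_i^*/\sigma_*) - \sigma_* \right]}{\Phi\left[ \mu_i^*/\sigma_* \right]} \exp\left( -\mu_i^* + \tfrac{1}{2} \sigma_*^2 \right) \]

where

\[ \mu_i^* = \frac{\mu \sigma_v^2 - \varepsilon_i \sigma_u^2}{\sigma^2} \quad \text{and} \quad \sigma_*^2 = \frac{\sigma_u^2 \sigma_v^2}{\sigma^2}, \quad \sigma^2 = \sigma_u^2 + \sigma_v^2 \]

For the Normal-Truncated Normal Model

For the normal-half normal model, μ = 0.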
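A sketch of the efficiency calculation E[exp(-u)|ε] = {Φ[(μ*/σ*) - σ*]/Φ[μ*/σ*]} exp(-μ* + ½σ*²), written here with μ* = (μσv² - εσu²)/σ² and σ*² = σu²σv²/σ², which is the standard result for ε = v - u with u truncated normal (this parameterization is an assumption of the sketch). Setting `mu=0.0` gives the half-normal case. The function name and input values are illustrative.

```python
import math


def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def technical_efficiency(eps, sigma_u, sigma_v, mu=0.0):
    """E[exp(-u) | eps]; mu = 0 corresponds to the normal-half normal model."""
    sigma2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = (mu * sigma_v ** 2 - eps * sigma_u ** 2) / sigma2
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma2)
    ratio = (norm_cdf(mu_star / sigma_star - sigma_star)
             / norm_cdf(mu_star / sigma_star))
    return ratio * math.exp(-mu_star + 0.5 * sigma_star ** 2)


# Hypothetical residual and parameter values
te = technical_efficiency(-0.3, 0.4, 0.2)
print(te)
```

The result lies strictly between zero and one and, by Jensen's inequality, is at least as large as exp(-E[u|ε]).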

Application: Electricity Generation

Estimated Translog Production Frontiers

Inefficiency Estimates

Estimated Inefficiency Distribution

[Figure: Kernel density estimate for TRNCNRML. Horizontal axis: TRNCNRML; vertical axis: Density.]

Confidence Region

Application (Based on Costs)

Horrace/Schmidt Confidence Bounds for Cost Efficiency

[Figure: series EFFN, EFFUPPER, and EFFLOWER plotted against FIRM; vertical axis label: Ee(-u|e).]

Multiple Output Frontier

The formal theory of production departs from the transformation function that links the vector of outputs, y, to the vector of inputs, x:

T(y,x) = 0.

As it stands, some further assumptions are obviously needed to produce the framework for an empirical model. By assuming homothetic separability, the function may be written in the form

A(y) = f(x).

Multiple Output Production Function

\[ \left( \sum_{m=1}^{M} \alpha_m^q\, y_{i,t,m}^q \right)^{1/q} = \beta^T x_{it} + v_{it} - u_{it} \]

Inefficiency in this setting reflects the failure of the firm to achieve the maximum aggregate output attainable. Note that the model does not address the economic question of whether the chosen output mix is optimal with respect to the output prices and input costs. That would require a profit function approach. Berger (1993) and Adams et al. (1999) apply the method to a panel of U.S. banks – 798 banks, ten years.

Duality Between Production and Cost

\[ C(y, w) = \min\{ w^T x : f(x) \ge y \} \]

Implied Cost Frontier Function

Stochastic Cost Frontier

Cobb-Douglas Cost Frontier

Translog Cost Frontier

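Cost frontier with K variable inputs, one fixed input (F) and output, y.

\[ \ln C = \alpha + \sum_{k=1}^{K} \beta_k \ln w_k + \beta_F \ln F + \beta_y \ln y + \tfrac{1}{2} \sum_{k=1}^{K} \sum_{l=1}^{K} \gamma_{kl} \ln w_k \ln w_l + \tfrac{1}{2} \gamma_{FF} \ln^2 F + \tfrac{1}{2} \gamma_{yy} \ln^2 y + \sum_{k=1}^{K} \gamma_{kF} \ln w_k \ln F + \sum_{k=1}^{K} \gamma_{ky} \ln w_k \ln y + \gamma_{Fy} \ln F \ln y + v_i + u_i \]

Cost functions fit subject to theoretical homogeneity in prices restriction:

\[ \sum_{k=1}^{K} \frac{\partial \ln C}{\partial \ln w_k} = 1 \]

Imposed by dividing C and all but one of the input prices by the "last" (numeraire) price.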
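A minimal sketch of how linear homogeneity in input prices is imposed in the data before estimation: cost and all other prices are divided by the last (numeraire) price and then logged. The function name `impose_homogeneity` and the numbers are illustrative assumptions. Scaling cost and every price by the same factor leaves the transformed data unchanged, which is the property the restriction requires.

```python
import math


def impose_homogeneity(cost, prices):
    """Return ln(C / w_K) and [ln(w_k / w_K)] for all prices except the numeraire w_K."""
    *others, numeraire = prices
    ln_cost = math.log(cost / numeraire)
    ln_prices = [math.log(w / numeraire) for w in others]
    return ln_cost, ln_prices


# Hypothetical observation, then the same observation with cost and prices tripled
base = impose_homogeneity(100.0, [2.0, 4.0, 8.0])
scaled = impose_homogeneity(300.0, [6.0, 12.0, 24.0])
print(base, scaled)
```

The transformed variables then enter the translog cost function in place of ln C and ln w_k.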

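Restricted Translog Cost Function

\[ \ln\frac{C}{PF} = \alpha + \beta_K \ln\frac{PK}{PF} + \beta_L \ln\frac{PL}{PF} + \beta_y \ln y + \tfrac{1}{2} \beta_{yy} \ln^2 y + \tfrac{1}{2} \gamma_{KK} \ln^2\frac{PK}{PF} + \tfrac{1}{2} \gamma_{LL} \ln^2\frac{PL}{PF} + \gamma_{KL} \ln\frac{PK}{PF} \ln\frac{PL}{PF} + \gamma_{yK} \ln y \ln\frac{PK}{PF} + \gamma_{yL} \ln y \ln\frac{PL}{PF} + v + u \]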

Cost Application to C&G Data

Estimates of Economic Efficiency

Duality – Production vs. Cost

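Multiple Output Cost Frontier

\[ \ln\frac{C}{w_5} = \alpha + \sum_{m=1}^{M} \beta_m \ln y_m + \frac{1}{2} \sum_{l=1}^{M} \sum_{m=1}^{M} \gamma_{lm} \ln y_l \ln y_m + \sum_{k=1}^{4} \delta_k \ln\frac{w_k}{w_5} + \text{second order terms} + \ldots + v + u \]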

Allocative Inefficiency and Economic Inefficiency

Technical inefficiency: Off the isoquant.

Allocative inefficiency: Wrong input mix.

Cost Structure – Demand System

Cost Function

Cost = f(output, input prices) = C(y, w)

Shephard's Lemma Produces Input Demands

\[ \frac{\partial C^*(y, w)}{\partial w} = \text{cost minimizing demands} = x \]

Cost Frontier Model

Stochastic cost frontier

\[ \ln C(y, w) = g(\ln y, \ln w) + v + u \]

u = cost inefficiency

Factor demands in the form of cost shares

\[ s_k = \frac{\partial \ln C(y, w)}{\partial \ln w_k} = h_k(\ln y, \ln w) + e_k \]

e_k = allocative inefficiency

The Greene Problem

Factor shares are derived from the cost function by differentiation. Where does ek come from?
Any nonzero value of ek, which can be positive or negative, must translate into higher costs. Thus, u must be a function of e1,…,eK such that ∂u/∂ek > 0.
No one had derived a complete, internally consistent equation system: the Greene problem.
Solution: Kumbhakar in several recent papers. Very complicated – near to impractical. Apparently not of interest to practitioners.

Observable Heterogeneity

As opposed to unobservable heterogeneity
Observe: Y or C (outcome) and X or w (inputs or input prices)
Firm characteristics z. Not production or cost, characterize the production process.
Enter the production or cost function?
Enter the inefficiency distribution? How?

Shifting the Outcome Function

\[ \ln y_{it} = f(x_{it}, \beta) + g(z_{it}, \delta) + h(t) + v_{it} - u_{it} \]

Firm specific heterogeneity can also be incorporated into the inefficiency model as follows. This modifies the mean of the truncated normal distribution:

yi = β'xi + vi - ui

vi ~ N[0, σv²]

ui = |Ui| where Ui ~ N[μi, σu²], μi = μ0 + μ1 zi

Heterogeneous Mean

Estimated Efficiency

One Step or Two Step

2 Step: Fit half or truncated normal model, compute JLMS ui, regress ui on zi

Airline example: Fit model without POINTS, LOADFACTOR, STAGE

1 Step: Include zi in the model, compute ui including zi

Airline example: Include 3 variables

Methodological issue: Left out variables in two step approach.

WHO Health Care Study

Application: WHO Data

One vs. Two Step

Unobservable Heterogeneity

Parameters vary across firms
Random variation (heterogeneity, not Bayesian)
Variation partially explained by observable indicators
Continuous variation – random parameter models: Considered with panel data models
Latent class – discrete parameter variation

A Latent Class Model

Latent Class Application

Banking Costs

Heteroscedasticity in v and/or u

Var[vi | hi] = σv² gv(hi, δ) = σvi²

gv(hi, 0) = 1,

gv(hi, δ) = [exp(δ^T hi)]²

Var[Ui | hi] = σu² gu(hi, τ) = σui²

gu(hi, 0) = 1,

gu(hi, τ) = [exp(τ^T hi)]²
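A short sketch of the exponential variance function, Var = σ² [exp(c'h)]², which equals σ² when the coefficient vector c is zero. The same function serves for v (coefficients δ) and for U (coefficients τ). The name `scaled_variance` and the inputs are assumptions of this example.

```python
import math


def scaled_variance(sigma, coef, h):
    """sigma^2 * [exp(coef'h)]^2; reduces to sigma^2 when coef is a zero vector."""
    index = sum(c * hk for c, hk in zip(coef, h))
    return sigma ** 2 * math.exp(index) ** 2


# Hypothetical firm characteristics h_i
homoscedastic = scaled_variance(0.2, [0.0, 0.0], [1.5, -2.0])
heteroscedastic = scaled_variance(0.2, [0.5], [2.0])
print(homoscedastic, heteroscedastic)
```

The exponential form keeps the variance positive for any value of the coefficients, so they can be estimated without constraints.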

Application: WHO Data

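A “Scaling” Model

\[ u_i = h(z_i, \delta)\, u_i^* \quad \text{where } f(u_i^*) \text{ does not involve } z_i \]

Scales both mean and variance of u_i

\[ \ln L = -\frac{N}{2} \ln 2\pi - \sum_{i=1}^{N} \left[ \ln\sigma_i + \ln\Phi\!\left( \frac{\mu_i}{\sigma_{ui}} \right) \right] + \sum_{i=1}^{N} \left[ -\frac{1}{2} \left( \frac{\varepsilon_i + \mu_i}{\sigma_i} \right)^2 + \ln\Phi\!\left( \frac{\mu_i}{\sigma_i \lambda_i} - \frac{\varepsilon_i \lambda_i}{\sigma_i} \right) \right] \]

where

\[ \mu_i = \mu \exp(\delta^T z_i), \quad \sigma_{ui} = \sigma_u \exp(\delta^T z_i), \quad \sigma_{vi} = \sigma_v \exp(\gamma^T z_i), \quad \lambda_i = \sigma_{ui}/\sigma_{vi}, \quad \sigma_i^2 = \sigma_{vi}^2 + \sigma_{ui}^2 \]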

Model Extensions

Simulation Based Estimators
Normal-Gamma Frontier Model
Bayesian Estimation of Stochastic Frontiers

Similar Model Structures
Similar Estimation Methodologies
Similar Results

Normal-Gamma

Very flexible model. VERY difficult log likelihood function.
Bayesians love it. Conjugate functional forms for other model parts

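Normal-Gamma Model

\[ f(u_i) = \frac{\sigma_u^{-P}}{\Gamma(P)} \exp(-u_i/\sigma_u)\, u_i^{P-1}, \quad u_i \ge 0, \ P > 0 \]

\[ \ln L = \sum_{i=1}^{N} \left[ -P \ln\sigma_u - \ln\Gamma(P) + \ln q(P-1, \varepsilon_i) + \frac{1}{2} \left( \frac{\sigma_v}{\sigma_u} \right)^2 + \ln\Phi\!\left( \frac{-\varepsilon_i - \sigma_v^2/\sigma_u}{\sigma_v} \right) + \frac{\varepsilon_i}{\sigma_u} \right] \]

\[ q(r, \varepsilon_i) = E[z^r \mid z > 0, \varepsilon_i], \quad z \sim N[-\varepsilon_i - \sigma_v^2/\sigma_u, \ \sigma_v^2] \]

q(r,εi) is extremely difficult to compute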

Normal-Gamma

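Gamma Frontier Model

Deterministic Frontier

\[ y = x'\beta - u \]

\[ f(u) = \left[ \theta^P / \Gamma(P) \right] e^{-\theta u} u^{P-1}, \quad u \ge 0 \]

Stochastic Frontier

\[ y = x'\beta + v - u = x'\beta + \varepsilon \]

\[ f(v) = N[0, \sigma^2] \]

\[ \text{LogL} = N \left[ P \ln\theta - \ln\Gamma(P) \right] + \frac{N \theta^2 \sigma^2}{2} + \theta \sum_{i=1}^{N} \varepsilon_i + \sum_{i=1}^{N} \ln\Phi\!\left( \frac{\mu_i}{\sigma} \right) + \sum_{i=1}^{N} \ln h(P-1, \varepsilon_i) \]

\[ h(r, \varepsilon_i) = \frac{\int_0^\infty z^r \frac{1}{\sigma} \phi\!\left( \frac{z - \mu_i}{\sigma} \right) dz}{\int_0^\infty \frac{1}{\sigma} \phi\!\left( \frac{z - \mu_i}{\sigma} \right) dz}, \quad \mu_i = -\varepsilon_i - \theta\sigma^2 \]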

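Simulating the Likelihood

\[ \ln L_S = \sum_{i=1}^{N} \left\{ -P \ln\sigma_u - \ln\Gamma(P) + \frac{1}{2} \left( \frac{\sigma_v}{\sigma_u} \right)^2 + \ln\Phi\!\left( \frac{-\varepsilon_i - \sigma_v^2/\sigma_u}{\sigma_v} \right) + \frac{\varepsilon_i}{\sigma_u} + \ln\left[ \frac{1}{Q} \sum_{q=1}^{Q} \left( \mu_i + \sigma_v \Phi^{-1}\!\left( F_{iq} + (1 - F_{iq}) P_L \right) \right)^{P-1} \right] \right\} \]

εi = yi - β^T xi, μi = -εi - σv²/σu, σ = σv, and PL = Φ(-μi/σ) and Fq is a draw from the continuous uniform(0,1) distribution.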
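A sketch of the simulation step, assuming the quantity to approximate is q(r, ε) = E[z^r | z > 0] with z ~ N[μ, σv²] and μ = -ε - σv²/σu. Draws from the truncated normal are produced by the inverse-CDF transformation z = μ + σv Φ⁻¹(F + (1 - F) PL), with PL = Φ(-μ/σv) and F uniform on (0,1). The function name, the number of draws, the seed, and the clamping of probabilities away from 0 and 1 are choices made for this example. For r = 1 the result can be compared with the closed-form truncated normal mean.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def simulate_q(r, eps, sigma_v, sigma_u, n_draws=20000, seed=0):
    """Monte Carlo estimate of E[z^r | z > 0], z ~ N(mu, sigma_v^2), for r >= 0."""
    rng = random.Random(seed)
    mu = -eps - sigma_v ** 2 / sigma_u
    p_left = 0.5 * math.erfc(mu / (sigma_v * math.sqrt(2.0)))  # Phi(-mu/sigma_v)
    total = 0.0
    for _ in range(n_draws):
        f = rng.random()
        p = f + (1.0 - f) * p_left                 # uniform on (P_L, 1)
        p = min(max(p, 1e-12), 1.0 - 1e-12)        # keep inv_cdf inside its domain
        z = max(mu + sigma_v * _STD_NORMAL.inv_cdf(p), 0.0)
        total += z ** r
    return total / n_draws


# Hypothetical residual and parameters; mu = 0.2 here
q0 = simulate_q(0, -0.3, 0.2, 0.4)
q1 = simulate_q(1, -0.3, 0.2, 0.4)
print(q0, q1)
```

In the simulated log-likelihood, r = P - 1 and the logarithm of this average replaces ln q(P-1, εi) for each observation.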

Application to C&G Data

This is the standard data set for developing and testing Exponential, Gamma, and Bayesian estimators.

Application to C&G Data

Bayesian Estimation

Short history – first developed post 1995
Range of applications
Largely replicated existing classical methods
Recent applications have extended received approaches
Common features of the application

Bayesian Formulation of SF Model

Normal – Exponential Model

\[ \ln L(\text{data}; \beta, \sigma_v, \sigma_u) = \sum_{i=1}^{N} \left[ -\ln\sigma_u + \frac{1}{2} \left( \frac{\sigma_v}{\sigma_u} \right)^2 + \ln\Phi\!\left( \frac{-\left( (v_i - u_i) + \sigma_v^2/\sigma_u \right)}{\sigma_v} \right) + \frac{v_i - u_i}{\sigma_u} \right] \]

vi – ui = yi - α - β^T xi.

Estimation proceeds (in principle) by specifying priors over Θ = (α, β, σv, σu), then deriving inferences from the joint posterior p(Θ|data). In general, the joint posterior for this model cannot be derived in closed form, so direct analysis is not feasible. Using Gibbs sampling and known conditional posteriors, it is possible to use Markov Chain Monte Carlo (MCMC) methods to sample from the marginal posteriors and use that device to learn about the parameters and inefficiencies. In particular, for the model parameters, we are interested in estimating E[Θ|data], Var[Θ|data] and, perhaps, even more fully characterizing the density f(Θ|data).

Estimating Inefficiency

One might, ex post, estimate E[ui|data]; however, it is more natural in this setting to include (u1,...,uN) with Θ, and estimate the conditional means with those of the other parameters. The method is known as data augmentation.

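Priors Over Parameters

Diffuse priors are assumed for all of these

\[ p(\alpha, \beta) \propto \text{Uniform over the real "line", so } p(\cdot) = 1 \]

\[ p(1/\sigma_v) \sim \text{Gamma}(1/\sigma_v \mid \phi_v, P_v) = \frac{\phi_v^{P_v}}{\Gamma(P_v)} \exp\left[ -\phi_v (1/\sigma_v) \right] (1/\sigma_v)^{P_v - 1}, \quad 1/\sigma_v \ge 0 \]

\[ p(\sigma_u) = \frac{\phi_u^{P_u}}{\Gamma(P_u)} \exp(-\phi_u \sigma_u)\, \sigma_u^{P_u - 1}, \quad \sigma_u \ge 0 \]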

Priors for Inefficiencies

Posterior

Gibbs Sampling: Conditional Posteriors

Bayesian Normal-Gamma Model

Tsionas (2002)
Erlang form – Integer P
“Random parameters”
Applied to C&G

River Huang (2004)
Fully general
Applied (as usual) to C&G

Bayesian and Classical Results

Methodological Comparison

Bayesian vs. Classical
Interpretation
Practical results: Bernstein – von Mises Theorem in the presence of diffuse priors
Kim and Schmidt comparison (JPA, 2000)
Important difference – tight priors over ui in this context.
Conclusions?
