William Greene
New York University
True Random Effects in Stochastic
Frontier Models
http://people.stern.nyu.edu/wgreene/appc2014.pdf
3/76
Agenda
Skew normality – Adelchi Azzalini
Stochastic frontier model
Panel Data: Time varying and time invariant inefficiency models
Panel Data: True random effects models
Maximum Simulated Likelihood Estimation
Applications of true random effects
Persistent and transient inefficiency in Swiss railroads
A panel data sample selection corrected stochastic frontier model
Spatial effects in a stochastic frontier model
4/76
Skew Normality
5/76
The Stochastic Frontier Model
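$\ln y_i = \alpha + \beta' x_i + v_i - u_i$,
$v_i \sim N[0, \sigma_v^2]$,
$u_i = |U_i|, \quad U_i \sim N[0, \sigma_u^2]$,
$\varepsilon_i = v_i - |U_i|$
Convenient parameterization (notation)
$\varepsilon_i = \sigma_v V_i - \sigma_u |U_i| = \sigma_v N[0,1] - \sigma_u\,|N[0,1]|$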
6/76
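Log Likelihood
$\sigma^2 = \sigma_u^2 + \sigma_v^2, \quad \lambda = \sigma_u / \sigma_v$
$\log L(\alpha, \beta, \sigma, \lambda) = \sum_{i=1}^{N} \left[ \log\frac{2}{\sigma} + \log\phi\!\left(\frac{y_i - \alpha - \beta' x_i}{\sigma}\right) + \log\Phi\!\left(\frac{-\lambda\,(y_i - \alpha - \beta' x_i)}{\sigma}\right) \right]$
$\qquad = \sum_{i=1}^{N} \log\left[ \frac{2}{\sigma}\,\phi\!\left(\frac{\varepsilon_i}{\sigma}\right) \Phi\!\left(\frac{-\lambda\,\varepsilon_i}{\sigma}\right) \right]$
The term in brackets is the Skew Normal Density.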
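A minimal sketch of this log likelihood, assuming the residuals $\varepsilon_i = y_i - \alpha - \beta' x_i$ have already been computed. The function name `sf_loglik` is hypothetical and only the Python standard library is used; this is an illustration, not the code behind the slides.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: provides pdf() and cdf()


def sf_loglik(residuals, sigma, lam):
    """Normal-half normal log likelihood.

    residuals: iterable of eps_i = y_i - alpha - beta'x_i (assumed given)
    sigma:     sqrt(sigma_u^2 + sigma_v^2)
    lam:       sigma_u / sigma_v
    """
    total = 0.0
    for eps in residuals:
        z = eps / sigma
        # log[(2/sigma) * phi(eps/sigma) * Phi(-lam * eps/sigma)]
        total += (math.log(2.0 / sigma)
                  + math.log(STD.pdf(z))
                  + math.log(STD.cdf(-lam * z)))
    return total
```

With `lam = 0` there is no inefficiency and each term collapses to the ordinary normal log density, which gives a quick sanity check.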
7/76
Birnbaum (1950) Wrote About Skew Normality
Effect of
Linear
Truncation on
a Multinormal
Population
8/76
Weinstein (1964) Found f()
Query 2: The Sum of
Values from a
Normal and a
Truncated Normal
Distribution
See, also, Nelson (Technometrics, 1964), Roberts (JASA, 1966)
9/76
Resembles f()
O’Hagan and Leonard (1976) Found
Something Like f()
Bayes Estimation
Subject to Uncertainty
About Parameter
Constraints
10/76
ALS (1977) Discovered How
to Make Great Use of f()
See, also, Forsund and Hjalmarsson (1974), Battese and Corra (1976)
Poirier,… Timmer, … several others.
11/76
The standard skew normal distribution
$f(z) = 2\,\phi(z)\,\Phi(\lambda z)$
Azzalini (1985) Figured Out f()
And Noticed the Connection to ALS
© 2014
12/76
http://azzalini.stat.unipd.it/SN/
13/76
http://azzalini.stat.unipd.it/SN/abstracts.html#sn99
ALS
14/76
How to generate pseudo random draws on $\varepsilon$
1. Draw $U$, $V$ from independent N[0,1]
2. $\varepsilon = \sigma_v V - \sigma_u |U|$
A Useful FAQ About the Skew Normal
15/76
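Random Number Generator
For a particular desired $\lambda$ and $\sigma$, where $\lambda = \sigma_u/\sigma_v$ and $\sigma^2 = \sigma_u^2 + \sigma_v^2$:
Use $\sigma_u = \lambda\sigma/\sqrt{1+\lambda^2}$ and $\sigma_v = \sigma/\sqrt{1+\lambda^2}$
Then $\varepsilon = \sigma_v N(0,1) - \sigma_u\,|N(0,1)|$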
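The recipe can be sketched as follows, assuming the usual definitions $\lambda = \sigma_u/\sigma_v$ and $\sigma^2 = \sigma_u^2 + \sigma_v^2$. The helper names `sigmas_from` and `draw_eps` are hypothetical, and the generator is Python's built-in Mersenne Twister rather than anything specified on the slide.

```python
import math
import random


def sigmas_from(lam, sigma):
    """Recover (sigma_u, sigma_v) from lambda = sigma_u/sigma_v and sigma."""
    sigma_v = sigma / math.sqrt(1.0 + lam ** 2)
    sigma_u = lam * sigma_v
    return sigma_u, sigma_v


def draw_eps(lam, sigma, n, seed=12345):
    """Draw n values of eps = sigma_v * V - sigma_u * |U|, U and V iid N[0,1]."""
    rng = random.Random(seed)
    sigma_u, sigma_v = sigmas_from(lam, sigma)
    return [sigma_v * rng.gauss(0.0, 1.0) - sigma_u * abs(rng.gauss(0.0, 1.0))
            for _ in range(n)]
```

The sample mean of the draws should be close to $-\sigma_u\sqrt{2/\pi}$, the mean of the composed error.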
16/76
How Many Applications of SF Are There?
17/76
$2\,\phi(z)\,\Phi(\lambda z)$
W. D. Walls (2006) On Skewness in the Movies
Cites Azzalini.
18/76
“The skew-normal
distribution
developed by Sahu et
al. (2003)…”
Does not
know Azzalini.
SNARCH Model for Financial Crises (2013)
19/76
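Mixed Logit Model
$\text{Prob}(\text{Choice}_i = j) = \frac{\exp(\beta_i' x_{ij})}{\sum_{j=1}^{J_i} \exp(\beta_i' x_{ij})}$
Random Parameters
$\beta_{ik} = \beta_k + \sigma_k w_{ik}$
Asymmetric (Skewed) Parameter Distribution
$w_{ik} = v_{ik} - |U_{ik}| \sim SN(0, \sigma, \lambda)$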
A Skew Normal Mixed Logit Model (2010)
Greene (2010, knows Azzalini and ALS),
Bhat (2011, knows not Azzalini … or ALS)
20/76
Foundation: An Entire Field
Stochastic Frontier Model
Occasional Modeling Strategy
Culture: Skewed Distribution of Movie Revenues
Finance: Crisis and Contagion
Choice Modeling: The Mixed Logit Model
How can these people find each other?
Where else do applications appear?
Skew Normal Applications
21/76
Stochastic Frontier
22/76
The Cross Section Departure Point: 1977
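Aigner et al. (ALS) Stochastic Frontier Model
$y_i = \alpha + \beta' x_i + v_i - u_i$
$v_i \sim N[0, \sigma_v^2]$
$u_i = |U_i|$ and $U_i \sim N[0, \sigma_u^2]$
Jondrow et al. (JLMS) Inefficiency Estimator
$\hat u_i = E[u_i \mid \varepsilon_i] = \frac{\sigma\lambda}{1+\lambda^2} \left[ \frac{\phi(z_i)}{1 - \Phi(z_i)} - z_i \right]$
$\varepsilon_i = v_i - u_i, \quad \sigma^2 = \sigma_v^2 + \sigma_u^2, \quad \lambda = \sigma_u/\sigma_v, \quad z_i = \lambda\varepsilon_i/\sigma$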
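A short sketch of the JLMS calculation for the normal-half normal case, using the equivalent prefactor $\sigma_u\sigma_v/\sigma = \sigma\lambda/(1+\lambda^2)$. The function name `jlms` is hypothetical and the residual $\varepsilon_i$ is assumed to be available from a fitted frontier.

```python
import math
from statistics import NormalDist

STD = NormalDist()


def jlms(eps, sigma_u, sigma_v):
    """E[u | eps] for the normal-half normal stochastic frontier model."""
    sigma = math.hypot(sigma_u, sigma_v)      # sqrt(sigma_u^2 + sigma_v^2)
    lam = sigma_u / sigma_v
    sigma_star = sigma_u * sigma_v / sigma    # equals sigma*lam/(1+lam^2)
    z = eps * lam / sigma
    # 1 - Phi(z) is computed as Phi(-z) to avoid cancellation for large z
    return sigma_star * (STD.pdf(z) / STD.cdf(-z) - z)
```

More negative residuals (output further below the frontier) give larger estimates of inefficiency, as expected.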
23/76
The Panel Data Models Appear: 1981
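Pitt and Lee Random Effects Approach: 1981
$y_{it} = \alpha + \beta' x_{it} + v_{it} - u_i$
$v_{it} \sim N[0, \sigma_v^2], \quad u_i = |U_i|$ and $U_i \sim N[0, \sigma_u^2]$
($u_i$ is time fixed.)
Counterpart to Jondrow et al. (1982)
$\hat u_i = E[u_i \mid \varepsilon_{i1}, \ldots, \varepsilon_{iT}] = \mu_i^* + \sigma_* \frac{\phi(\mu_i^*/\sigma_*)}{1 - \Phi(-\mu_i^*/\sigma_*)}$
$\varepsilon_{it} = v_{it} - u_i, \quad \mu_i^* = -\frac{T \sigma_u^2\, \bar\varepsilon_i}{\sigma_v^2 + T\sigma_u^2}, \quad \sigma_*^2 = \frac{\sigma_u^2 \sigma_v^2}{\sigma_v^2 + T\sigma_u^2}$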
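A sketch of the panel counterpart, assuming the standard result that $u_i$ given the $T$ residuals of firm $i$ is normal with mean $\mu_i^* = -\sigma_u^2 \sum_t \varepsilon_{it} / (\sigma_v^2 + T\sigma_u^2)$ and variance $\sigma_*^2 = \sigma_u^2\sigma_v^2/(\sigma_v^2 + T\sigma_u^2)$, truncated at zero. The function name `pitt_lee_u` is hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()


def pitt_lee_u(eps_i, sigma_u, sigma_v):
    """E[u_i | eps_i1, ..., eps_iT] with time invariant half normal u_i."""
    T = len(eps_i)
    denom = sigma_v ** 2 + T * sigma_u ** 2
    mu_star = -sigma_u ** 2 * sum(eps_i) / denom
    sigma_star = math.sqrt(sigma_u ** 2 * sigma_v ** 2 / denom)
    r = mu_star / sigma_star
    # mean of a normal(mu_star, sigma_star^2) truncated to [0, infinity)
    return mu_star + sigma_star * STD.pdf(r) / STD.cdf(r)
```

With a single period the expression reduces to the cross section JLMS estimator; with more periods the conditional variance shrinks.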
24/76
Reinterpreting the Within Estimator: 1984
Schmidt and Sickles Fixed Effects Approach: 1984
$y_{it} = \alpha_i + \beta' x_{it} + v_{it}$
$v_{it} \sim N[0, \sigma_v^2]$,
$\alpha_i$ semiparametrically specified: fixed mean, constant variance. ($\alpha_i$ is time fixed.)
Counterpart to Jondrow et al. (1982)
$\hat u_i = \max_j(\hat\alpha_j) - \hat\alpha_i$
(The cost of the semiparametric specification is the
location of the inefficiency distribution. The authors
also revisit Pitt and Lee to demonstrate.)
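A toy sketch of the within estimator and the implied inefficiency measure, assuming a single regressor and a panel stored as a dictionary of (x, y) pairs per firm. The function name `within_inefficiency` and the data layout are hypothetical.

```python
def within_inefficiency(panel):
    """panel: dict mapping firm -> list of (x, y) observations, scalar x.

    Returns the within slope b and u_i = max_j(alpha_j) - alpha_i.
    """
    means = {}
    for firm, obs in panel.items():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        means[firm] = (x_bar, y_bar)

    # Slope from deviations about firm means (the within estimator)
    sxx = sxy = 0.0
    for firm, obs in panel.items():
        x_bar, y_bar = means[firm]
        for x, y in obs:
            sxx += (x - x_bar) ** 2
            sxy += (x - x_bar) * (y - y_bar)
    b = sxy / sxx

    # Firm effects, then distance from the best firm in the sample
    alpha = {firm: y_bar - b * x_bar for firm, (x_bar, y_bar) in means.items()}
    best = max(alpha.values())
    return b, {firm: best - a for firm, a in alpha.items()}
```

By construction the best firm in the sample is assigned zero inefficiency, which is the location issue noted on the slide.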
25/76
Misgivings About Time Fixed Inefficiency: 1990-
Cornwell, Schmidt and Sickles (1990)
$\alpha_{it} = \theta_{i0} + \theta_{i1} t + \theta_{i2} t^2$
Kumbhakar (1990)
$u_{it} = [1 + \exp(bt + ct^2)]^{-1}\, |U_i|$
Battese and Coelli (1992, 1995)
$u_{it} = \exp[-\eta(t - T)]\, |U_i|, \quad u_{it} = \exp[g(t, T, z_{it})]\, |U_i|$
Cuesta (2000)
$u_{it} = \exp[-\eta_i(t - T)]\, |U_i|, \quad u_{it} = \exp[g_i(t, T, z_{it})]\, |U_i|$
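The two best known scaling functions can be written down directly. This is a small sketch with hypothetical function names; each returns the factor that multiplies the time invariant $|U_i|$, assuming the Battese and Coelli (1992) form $\exp[-\eta(t-T)]$ and the Kumbhakar (1990) form $[1+\exp(bt+ct^2)]^{-1}$.

```python
import math


def battese_coelli_scale(t, T, eta):
    """exp[-eta (t - T)]: equals 1 in the last period t = T."""
    return math.exp(-eta * (t - T))


def kumbhakar_scale(t, b, c):
    """[1 + exp(b t + c t^2)]^(-1): always between 0 and 1."""
    return 1.0 / (1.0 + math.exp(b * t + c * t * t))
```

Setting `eta = 0` makes the Battese and Coelli factor equal to one in every period, which is the Pitt and Lee time fixed model.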
26/76
Are the systematically time varying models
more like time fixed or freely time varying?
A Pooled Model: $y_{it} = \alpha + \beta' x_{it} + v_{it} - u_{it}$
Battese and Coelli (1992): $u_{it} = \exp[-\eta(t - T)]\, |U_i|$
Pitt and Lee (1981): $y_{it} = \alpha + \beta' x_{it} + v_{it} - |U_i|$
Where is Battese and Coelli? Closer to the pooled model or to Pitt and Lee?
Greene (2004): Much closer to the Pitt and Lee model
27/76
In these models with time varying inefficiency,
$y_{it} = \alpha + \beta' x_{it} + v_{it} - g(t, z_{it})\, |U_i|$
$v_{it} \sim N[0, \sigma_v^2]$ and $U_i \sim N[0, \sigma_u^2]$,
where does unobserved time invariant heterogeneity end up?
In the inefficiency! Even with the extensions.
28/76
Skepticism About Time Varying Inefficiency
Models: Greene (2004)
29/76
True Random Effects
30/76
True Random and Fixed Effects: 2004
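$y_{it} = \alpha_i + \beta' x_{it} + v_{it} - u_{it}$
$v_{it} \sim N[0, \sigma_v^2], \quad u_{it} = |U_{it}|$ and $U_{it} \sim N[0, \sigma_u^2]$
$\alpha_i$ (time fixed): Unobserved time invariant heterogeneity, not unobserved time invariant inefficiency
$u_{it}$ (time varying): inefficiency
Jondrow et al. (JLMS) Inefficiency Estimator
$\hat u_{it} = E[u_{it} \mid \varepsilon_{it}] = \frac{\sigma\lambda}{1+\lambda^2} \left[ \frac{\phi(z_{it})}{1 - \Phi(z_{it})} - z_{it} \right]$
$\varepsilon_{it} = v_{it} - u_{it}, \quad \sigma^2 = \sigma_v^2 + \sigma_u^2, \quad \lambda = \sigma_u/\sigma_v, \quad z_{it} = \lambda\varepsilon_{it}/\sigma$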
31/76
Estimation of TFE and TRE Models: 2004
True Fixed Effects: MLE
$y_{it} = \alpha_i + \beta' x_{it} + v_{it} - u_{it}$
$v_{it} \sim N[0, \sigma_v^2], \quad u_{it} = |U_{it}|$ and $U_{it} \sim N[0, \sigma_u^2]$
$\alpha_i$: Unobserved time invariant heterogeneity, not unobserved time invariant inefficiency
Just add firm dummy variables to the SF model (!)
True Random Effects: Maximum Simulated Likelihood (RPM)
$y_{it} = (\alpha + w_i) + \beta' x_{it} + v_{it} - u_{it}$
$v_{it} \sim N[0, \sigma_v^2], \quad u_{it} = |U_{it}|$ and $U_{it} \sim N[0, \sigma_u^2], \quad w_i \sim N[0, \sigma_w^2]$
$w_i$: Unobserved time invariant heterogeneity, not unobserved time invariant inefficiency
Random parameters stochastic frontier model
32/76
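Log likelihood function for stochastic frontier model
$\log L(\alpha, \beta, \sigma, \lambda) = \sum_{i=1}^{N} \left[ \log\frac{2}{\sigma} + \log\phi\!\left(\frac{y_i - \alpha - \beta' x_i}{\sigma}\right) + \log\Phi\!\left(\frac{-\lambda\,(y_i - \alpha - \beta' x_i)}{\sigma}\right) \right]$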
33/76
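Simulated log likelihood function for stochastic frontier model
with a time invariant random constant term. (TRE model)
$\log L_S(\alpha, \beta, \sigma, \lambda, \sigma_w) = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T} \frac{2}{\sigma}\, \phi\!\left(\frac{y_{it} - (\alpha + \sigma_w w_{ir}) - \beta' x_{it}}{\sigma}\right) \Phi\!\left(\frac{-\lambda\,(y_{it} - (\alpha + \sigma_w w_{ir}) - \beta' x_{it})}{\sigma}\right)$
$w_{ir}$ = draws from N[0,1].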
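A compact sketch of the simulated log likelihood for the TRE model, assuming a scalar regressor, a panel stored as a list of firms with (y, x) pairs, and plain pseudo random draws. The function name `tre_sim_loglik` and the data layout are hypothetical; a production estimator would pass this function to an optimizer and would usually replace the random draws with Halton draws.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()


def tre_sim_loglik(panel, alpha, beta, sigma, lam, sigma_w, R=200, seed=1):
    """panel: list of firms, each a list of (y, x) pairs with scalar x."""
    rng = random.Random(seed)  # fixed seed: same draws at every evaluation
    total = 0.0
    for firm in panel:
        acc = 0.0
        for _ in range(R):
            w = rng.gauss(0.0, 1.0)           # one draw per firm per replication
            prod = 1.0
            for y, x in firm:                 # conditional on w, periods are independent
                eps = y - (alpha + sigma_w * w) - beta * x
                prod *= (2.0 / sigma) * STD.pdf(eps / sigma) * STD.cdf(-lam * eps / sigma)
            acc += prod
        total += math.log(acc / R)            # log of the simulated firm likelihood
    return total
```

When `sigma_w = 0` the draws have no effect and the function returns the pooled stochastic frontier log likelihood, which is a convenient check.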
34/76
The Most Famous Frontier Study Ever
35/76
The Famous WHO Model
$\log\text{COMP} = \alpha + \beta_1 \log\text{PerCapitaHealthExpenditure} + \beta_2 \log\text{YearsEduc} + \beta_3 \log^2\text{YearsEduc} + \varepsilon$
$\varepsilon = v - u$
Schmidt/Sickles FEM
191 Countries.
140 of them observed 1993-1997.
36/76
The Notorious WHO Results
37
37/76
No, it
doesn’t.
August
12, 2012
37
38/76 Huffington Post, April 17, 2014
39/76
we are #37
40/76
Greene, W., Distinguishing Between
Heterogeneity and Inefficiency:
Stochastic Frontier Analysis of the
World Health Organization’s Panel
Data on National Health Care
Systems, Health Economics, 13, 2004,
pp. 959-980.
41/76
$x = (1, \log Exp, \log Ed, \log^2 Ed)$
$z = (\log PopDen, \log PerCapitaGDP, GovtEff, VoxPopuli, OECD, GINI)$
42/76
Three Extensions of the
True Random Effects Model
43/76
Generalized True Random Effects Stochastic Frontier Model
$y_{it} = \alpha + \beta' x_{it} + A_i - B_i + v_{it} - u_{it}$
Transient random components ($v_{it} - u_{it}$): Time varying normal - half normal SF
Persistent random components ($A_i - B_i$): Time fixed normal - half normal SF
44/76
A Stochastic Frontier Model with Short-Run and
Long-Run Inefficiency:
Colombi, R., Kumbhakar, S., Martini, G., Vittadini,
G., University of Bergamo, WP, 2011, JPA 2014,
forthcoming.
Tsionas, G. and Kumbhakar, S.
Firm Heterogeneity, Persistent and Transient Technical Inefficiency:
A Generalized True Random Effects Model
Journal of Applied Econometrics. Published online, November, 2012.
Extremely involved Bayesian MCMC procedure. Efficiency components estimated by
data augmentation.
45/76
Generalized True Random Effects Stochastic Frontier Model
$y_{it} = (\alpha + \sigma_w w_i - \theta\,|e_i|) + \beta' x_{it} + v_{it} - u_{it}$
Time varying, transient random components
$v_{it} \sim N[0, \sigma_v^2], \quad u_{it} = |U_{it}|$ and $U_{it} \sim N[0, \sigma_u^2]$,
Time invariant random components
$w_i \sim N[0,1], \quad e_i \sim N[0,1]$
The random constant term in this model has a closed skew
normal distribution, instead of the usual normal distribution.
46/76
Estimating Efficiency in the CSN Model
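Moment Generating Function for the Multivariate CSN Distribution
$E[\exp(t' u_i) \mid y_i] = \frac{\bar\Phi_{T+1}(R r_i + \Lambda t,\ \Lambda)}{\bar\Phi_{T+1}(R r_i,\ \Lambda)}\, \exp\!\left(t' R r_i + \tfrac{1}{2}\, t' \Lambda t\right)$
$\bar\Phi_{T+1}(\ldots, \ldots)$ = Multivariate normal cdf. Parts defined in Colombi et al.
Computed using GHK simulator.
$u_i = (e_i, u_{i1}, \ldots, u_{iT})'$, $\quad t = (1, 0, \ldots, 0)',\ (0, 1, \ldots, 0)',\ \ldots,\ (0, 0, \ldots, 1)'$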
47/76
Estimating the GTRE Model
48/76
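Colombi et al. Classical Maximum Likelihood Estimator
$\log L = \sum_{i=1}^{N} \left[ \log\phi_T(y_i - X_i\beta - 1_T\alpha,\ \Sigma + AVA') + \log\bar\Phi_q(R(y_i - X_i\beta - 1_T\alpha),\ \Lambda) + q \log 2 \right]$
$\phi_T(\ldots)$ = T-variate normal pdf.
$\bar\Phi_q(\ldots, \ldots)$ = (T+1)-variate normal integral.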
Very time consuming and complicated.
“From the sampling theory perspective, the application
of the model is computationally prohibitive when T is
large. This is because the likelihood function depends
on a (T+1)-dimensional integral of the normal
distribution.” [Tsionas and Kumbhakar (2012, p. 6)]
49/76
Kumbhakar, Lien, Hardaker
Technical Efficiency in Competing Panel Data Models: A Study of
Norwegian Grain Farming, JPA, Published online, September, 2012.
Three steps based on GLS:
(1) RE/FGLS to estimate $(\alpha, \beta)$
(2) Decompose time varying residuals using MoM and SF.
(3) Decompose estimates of time invariant residuals.
50/76
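Maximum Simulated Full Information log likelihood function for the
"generalized true random effects stochastic frontier model"
$\log L_S(\alpha, \beta, \sigma, \lambda, \sigma_w, \theta) = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T} \frac{2}{\sigma}\, \phi\!\left(\frac{y_{it} - (\alpha + \sigma_w w_{ir} - \theta |U_{ir}|) - \beta' x_{it}}{\sigma}\right) \Phi\!\left(\frac{-\lambda\,(y_{it} - (\alpha + \sigma_w w_{ir} - \theta |U_{ir}|) - \beta' x_{it})}{\sigma}\right)$
$w_{ir}$ = draws from N[0,1]
$|U_{ir}|$ = absolute values of draws from N[0,1]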
51/76
WHO Results: 2014
$x = (1, \log Exp, \log Ed, \log^2 Ed)$
$z = (\log PopDen, \log PerCapitaGDP, GovtEff, VoxPopuli, OECD, GINI)$
$\varepsilon_{it} = A_i - B_i + v_{it} - u_{it}$
52/76
53/76
Empirical application
Cost Efficiency of Swiss Railway
Companies
54/76
Model Specification
TC = f(Y1, Y2, PL, PC, PE, N, NS, dt)
TC : Total costs
Y1 : Passenger-km
Y2 : Ton-km
PL : Price of labor (wage per FTE)
PC : Price of capital (capital costs / total number of seats)
PE : Price of electricity
N : Network length
NS : Number of stations
dt : Time dummies
55/76
Data
50 railway companies
Period 1985 to 1997
unbalanced panel with number of periods (Ti) varying from 1 to 13 and
with 45 companies with 12 or 13 years, resulting in 605 observations
Data source: Swiss federal transport office
Data set available at http://people.stern.nyu.edu/wgreene/
Data set used in: Farsi, Filippini, Greene (2005), Efficiency and
measurement in network industries: application to the Swiss railway
companies, Journal of Regulatory Economics
56/76
57/76
58/76
Cost Efficiency Estimates
59/76
Correlations
60/76
MSL Estimation
61/76
Why is the MSL method so computationally
efficient compared to classical FIML and
Bayesian MCMC for this model?
Conditioned on the permanent effects, the group
observations are independent.
The joint conditional distribution is simple and easy to
compute, in closed form.
The full likelihood is obtained by integrating over only
one dimension. (This was discovered by Butler and
Moffitt in 1982.)
Neither of the other methods takes advantage of this
result. Both integrate over T+1 dimensions.
62/76
63/76
Equivalent Log Likelihood – Identical Outcome
One Dimensional Integration over δi
T+1 Dimensional Integration over Rei.
64/76
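$\log L_S = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R} G_{ir}$, where $G_{ir}$ is the joint conditional density for group $i$ given the $r$th simulated draws of $(w, h)$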
Simulated [over (w,h)] Log Likelihood
Very Fast – with T=13, one minute or so
65/76
Also Simulated Log Likelihood
GHK simulator is used to approximate the T+1 variate normal
integrals.
Very Slow – Huge amount of unnecessary computation.
66/76
247 Farms, 6 years.
100 Halton draws.
Computation time:
35 seconds including
computing efficiencies.
Computation of the GTRE Model is Actually Fast and Easy
67/76
Simulation Variance
68/76
Does the simulation chatter degrade the
econometric efficiency of the MSL estimator?
Hajivassiliou, V., “Some practical issues in maximum simulated
likelihood,” Simulation-based Inference in Econometrics: Methods
and Applications, Mariano, R., Weeks, M. and Schuerman, T.,
Cambridge University Press, 2008
Speculated that Asy.Var[estimator] = V + (1/R)C
The contribution of the chatter would be of second or third order.
R is typically in the hundreds or thousands.
No other evidence on this subject.
69/76
An Experiment
Pooled Spanish Dairy Farms Data
Stochastic frontier using FIML.
Random constant term linear regression with
constant term equal to - |w|, w~ N[0,1]
This is equivalent to the stochastic frontier
model.
Maximum simulated likelihood
500 random draws for the simulation for the base case.
Uses Mersenne Twister for the RNG
50 repetitions of estimation based on 500 random
draws to suggest variation due to simulation chatter.
70/76
$\hat\sigma_v = 0.10371$
$\hat\sigma_u = 0.15573$
71/76
Chatter
.00543
.00590
.00042
.00119
Simulation Noise in Standard Errors of Coefficients
72/76
Quasi-Monte Carlo Integration Based on
Halton Sequences
Coverage of the unit interval is the objective,
not randomness of the set of draws.
Halton sequences --- Markov chain
p = a prime number,
r = the sequence of integers, decomposed as $r = \sum_{i=0}^{I} b_i p^i$
$H(r|p) = \sum_{i=0}^{I} b_i p^{-i-1}, \quad r = r_1, \ldots$ (e.g., 10,11,12,...)
For example, using base p=5, the integer r=37 has $b_0 = 2$, $b_1 = 2$, and $b_2 = 1$ ($37 = 1 \times 5^2 + 2 \times 5^1 + 2 \times 5^0$). Then $H(37|5) = 2 \times 5^{-1} + 2 \times 5^{-2} + 1 \times 5^{-3} = 0.488$.
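The digit reversal behind the worked example can be sketched in a few lines. The function names `halton` and `halton_normal` are hypothetical; the second maps a Halton point to a standard normal draw through the inverse cdf, which is one common way (assumed here, not stated on the slide) to use the sequence in a simulated likelihood.

```python
from statistics import NormalDist


def halton(r, p):
    """H(r|p): write r in base p as sum b_i p^i, return sum b_i p^(-i-1)."""
    h = 0.0
    weight = 1.0 / p
    while r > 0:
        r, digit = divmod(r, p)   # peel off the lowest base-p digit
        h += digit * weight
        weight /= p
    return h


def halton_normal(r, p):
    """Standard normal draw from the Halton point (requires r >= 1)."""
    return NormalDist().inv_cdf(halton(r, p))
```

For instance, `halton(37, 5)` reproduces the value 0.488 from the example, and successive integers in base 2 give 0.5, 0.25, 0.75, and so on.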
73/76
Is It Really Simulation?
Halton or Sobol sequences are not
random
Far more stable than random draws, by a
factor of about 10.
There is no simulation chatter
View the same as numerical quadrature
There may be some approximation error.
How would we know?
74/76
Halton Sequences
Halton sequences --- Markov chain
p = a prime number,
r = the sequence of integers, decomposed as $r = \sum_{i=0}^{I} b_i p^i$
$H(r|p) = \sum_{i=0}^{I} b_i p^{-i-1}, \quad r = r_1, \ldots$ (e.g., 10,11,12,...)
Coverage of the unit interval is the objective,
not randomness of the set of draws.
75/76
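Haltonized Log Likelihood
$\log L(\alpha, \beta, \sigma, \lambda, \sigma_w, \sigma_h) = \sum_{i=1}^{N} \log \int_{\alpha_i} \prod_{t=1}^{T} \frac{2}{\sigma}\, \phi\!\left(\frac{y_{it} - \beta' x_{it} - \alpha_i}{\sigma}\right) \Phi\!\left(\frac{-\lambda\,(y_{it} - \beta' x_{it} - \alpha_i)}{\sigma}\right) f(\alpha_i)\, d\alpha_i$
$\log L_S(\alpha, \beta, \sigma, \lambda, \sigma_w, \sigma_h) = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T} \frac{2}{\sigma}\, \phi\!\left(\frac{y_{it} - \beta' x_{it} - \alpha_{ir}}{\sigma}\right) \Phi\!\left(\frac{-\lambda\,(y_{it} - \beta' x_{it} - \alpha_{ir})}{\sigma}\right)$
$\alpha_{ir} = \alpha + \sigma_w W_{ir} - \sigma_h |H_{ir}|$
$W_{ir} = \Phi^{-1}(\text{Halton}_r[\text{prime}(w), \text{burn in}])$
$H_{ir} = \Phi^{-1}(\text{Halton}_r[\text{prime}(h), \text{burn in}])$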
76/76
Summary
The skew normal distribution
Two useful models for panel data (and one
potentially useful model pending development)
Extension of TRE model that allows both transient and
persistent random variation and inefficiency
Sample selection corrected stochastic frontier
Spatial autocorrelation stochastic frontier model
Methods: Maximum simulated likelihood as an
alternative to received brute force methods
Simpler
Faster
Accurate
Simulation “chatter” is a red herring – use Halton sequences
77/76
Sample Selection
78/76
TECHNICAL EFFICIENCY ANALYSIS CORRECTING FOR
BIASES FROM OBSERVED AND UNOBSERVED
VARIABLES: AN APPLICATION TO A NATURAL RESOURCE
MANAGEMENT PROJECT Empirical Economics: Volume 43, Issue 1 (2012), Pages 55-72
Boris Bravo-Ureta
University of Connecticut
Daniel Solis
University of Miami
William Greene
New York University
79/76
The MARENA Program in Honduras
Several programs have been implemented to address resource degradation while also seeking to improve productivity, managerial performance and reduce poverty (and in some cases make up for lack of public support).
One such effort is the Programa Multifase de Manejo de Recursos Naturales en Cuencas Prioritarias or MARENA in Honduras focusing on small scale hillside farmers.
80/76
Expected Impact Evaluation
81/76
Methods
A matched group of beneficiaries and control
farmers is determined using Propensity Score
Matching techniques to mitigate biases that
would stem from selection on observed
variables.
In addition, we deal with possible self-selection
on unobservables arising from unobserved
variables using a selectivity correction model for
stochastic frontiers introduced by Greene (2010).
82/76
A Sample Selected SF Model
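$d_i = 1[\gamma' z_i + h_i > 0], \quad h_i \sim N[0, 1^2]$
$y_i = \alpha + \beta' x_i + \varepsilon_i, \quad \varepsilon_i \sim N[0, \sigma_\varepsilon^2]$
$(y_i, x_i)$ observed only when $d_i = 1$.
$\varepsilon_i = v_i - u_i$
$u_i = \sigma_u |U_i|$ where $U_i \sim N[0, 1^2]$
$v_i = \sigma_v V_i$ where $V_i \sim N[0, 1^2]$.
$(h_i, v_i) \sim N_2[(0, 0), (1, \rho\sigma_v, \sigma_v^2)]$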
83/76
Simulated logL for the Standard SF Model
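$f(y_i \mid x_i, |U_i|) = \frac{\exp[-\frac{1}{2}(y_i - \alpha - \beta' x_i + \sigma_u |U_i|)^2 / \sigma_v^2]}{\sigma_v \sqrt{2\pi}}$
$f(y_i \mid x_i) = \int_{|U_i|} \frac{\exp[-\frac{1}{2}(y_i - \alpha - \beta' x_i + \sigma_u |U_i|)^2 / \sigma_v^2]}{\sigma_v \sqrt{2\pi}}\, p(|U_i|)\, d|U_i|$
$p(|U_i|) = \frac{2 \exp[-\frac{1}{2} |U_i|^2]}{\sqrt{2\pi}}, \quad |U_i| \ge 0.$ (Half normal)
$f(y_i \mid x_i) \approx \frac{1}{R} \sum_{r=1}^{R} \frac{\exp[-\frac{1}{2}(y_i - \alpha - \beta' x_i + \sigma_u |U_{ir}|)^2 / \sigma_v^2]}{\sigma_v \sqrt{2\pi}}$
$\log L_S(\alpha, \beta, \sigma_u, \sigma_v) = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R} \frac{\exp[-\frac{1}{2}(y_i - \alpha - \beta' x_i + \sigma_u |U_{ir}|)^2 / \sigma_v^2]}{\sigma_v \sqrt{2\pi}}$
This is simply a linear regression with a random constant term, $\alpha_i = \alpha - \sigma_u |U_i|$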
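A small numerical check of this idea, assuming the residual $e = y_i - \alpha - \beta' x_i$ is given: averaging the normal density of $e + \sigma_u|U_{ir}|$ over half normal draws should approximate the closed form skew normal density $(2/\sigma)\phi(e/\sigma)\Phi(-\lambda e/\sigma)$. The function names are hypothetical and the draws are ordinary pseudo random numbers.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()


def sf_density_closed(e, sigma_u, sigma_v):
    """Closed form density of e = v - u (normal - half normal)."""
    sigma = math.hypot(sigma_u, sigma_v)
    lam = sigma_u / sigma_v
    return (2.0 / sigma) * STD.pdf(e / sigma) * STD.cdf(-lam * e / sigma)


def sf_density_sim(e, sigma_u, sigma_v, R=20000, seed=7):
    """Simulated density: average the conditional normal density over |U|."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(R):
        abs_u = abs(rng.gauss(0.0, 1.0))      # half normal draw
        v = e + sigma_u * abs_u               # implied value of the noise term
        acc += math.exp(-0.5 * (v / sigma_v) ** 2) / (sigma_v * math.sqrt(2.0 * math.pi))
    return acc / R
```

The two functions agree up to simulation error, which shrinks as R grows.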
84/76
Likelihood For a Sample Selected SF Model
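$f(y_i \mid x_i, z_i, d_i, |U_i|) = d_i \left[ \frac{\exp(-\frac{1}{2}(y_i - \alpha - \beta' x_i + \sigma_u |U_i|)^2 / \sigma_v^2)}{\sigma_v \sqrt{2\pi}}\, \Phi\!\left(\frac{\rho\,(y_i - \alpha - \beta' x_i + \sigma_u |U_i|)/\sigma_v + \gamma' z_i}{\sqrt{1 - \rho^2}}\right) \right] + (1 - d_i)\, \Phi(-\gamma' z_i)$
$f(y_i \mid x_i, z_i, d_i) = \int_{|U_i|} f(y_i \mid x_i, z_i, d_i, |U_i|)\, f(|U_i|)\, d|U_i|$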
85/76
Simulated Log Likelihood for a Selectivity
Corrected Stochastic Frontier Model
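$\log L_S(\alpha, \beta, \sigma_u, \sigma_v, \gamma, \rho) = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R} \left\{ d_i \left[ \frac{\exp(-\frac{1}{2}(y_i - \alpha - \beta' x_i + \sigma_u |U_{ir}|)^2 / \sigma_v^2)}{\sigma_v \sqrt{2\pi}}\, \Phi\!\left(\frac{\rho\,(y_i - \alpha - \beta' x_i + \sigma_u |U_{ir}|)/\sigma_v + \gamma' z_i}{\sqrt{1 - \rho^2}}\right) \right] + (1 - d_i)\, \Phi(-\gamma' z_i) \right\}$
The simulation is over the inefficiency term.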
86/76
JLMS Estimator of ui
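$\hat f_{ir} = \frac{\exp(-\frac{1}{2}(y_i - \hat\alpha - \hat\beta' x_i + \hat\sigma_u |U_{ir}|)^2 / \hat\sigma_v^2)}{\hat\sigma_v \sqrt{2\pi}}\, \Phi\!\left(\frac{\hat\rho\,(y_i - \hat\alpha - \hat\beta' x_i + \hat\sigma_u |U_{ir}|)/\hat\sigma_v + \hat a_i}{\sqrt{1 - \hat\rho^2}}\right), \quad \hat a_i = \hat\gamma' z_i$
$\hat A_i = \frac{1}{R} \sum_{r=1}^{R} (\hat\sigma_u |U_{ir}|)\, \hat f_{ir}, \quad \hat B_i = \frac{1}{R} \sum_{r=1}^{R} \hat f_{ir}$
$\hat u_i = \hat A_i / \hat B_i$ = Estimator of $E[u_i \mid \varepsilon_i]$
$\quad = \sum_{r=1}^{R} \hat g_{ir}\, \hat\sigma_u |U_{ir}|$ where $\hat g_{ir} = \hat f_{ir} / \sum_{r=1}^{R} \hat f_{ir}, \quad \sum_{r=1}^{R} \hat g_{ir} = 1$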
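A sketch of the simulated estimator as a weighted average of the draws $\sigma_u|U_{ir}|$, assuming the residual $y_i - \hat\alpha - \hat\beta' x_i$, the selection index `a`, and the correlation `rho` are supplied. Function names are hypothetical. With `rho = 0` the selection term is a constant that cancels from the weights, so the result should be close to the closed form JLMS value.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()


def simulated_jlms(resid, sigma_u, sigma_v, rho=0.0, a=0.0, R=20000, seed=11):
    """Weighted average of sigma_u*|U_r| with weights f_r / sum(f_r)."""
    rng = random.Random(seed)
    root = math.sqrt(1.0 - rho * rho)
    num = den = 0.0
    for _ in range(R):
        u = sigma_u * abs(rng.gauss(0.0, 1.0))
        v = resid + u                                   # implied noise term
        f = math.exp(-0.5 * (v / sigma_v) ** 2) / (sigma_v * math.sqrt(2.0 * math.pi))
        f *= STD.cdf((rho * v / sigma_v + a) / root)    # selection correction
        num += u * f
        den += f
    return num / den


def closed_form_jlms(resid, sigma_u, sigma_v):
    """Closed form E[u | eps] for the model without selection."""
    sigma = math.hypot(sigma_u, sigma_v)
    z = resid * (sigma_u / sigma_v) / sigma
    return (sigma_u * sigma_v / sigma) * (STD.pdf(z) / STD.cdf(-z) - z)
```

Comparing `simulated_jlms(-0.1, 0.15, 0.10)` with `closed_form_jlms(-0.1, 0.15, 0.10)` gives a direct view of the simulation error for a given R.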
87/76
Closed Form for the Selection Model
The selection model can be estimated without
simulation
“The stochastic frontier model with correction
for sample selection revisited.” Lai, Hung-pin.
Forthcoming, JPA
Based on closed skew normal distribution
Similar to Maddala’s 1982 result for the linear
selection model. See slide 42.
Not more computationally efficient.
Statistical properties identical.
Suggested possibility that simulation chatter is an element of
inefficiency in the maximum simulated likelihood estimator.
88/76
Spanish Dairy Farms: Selection based on being farm #1-125. 6 periods
The theory works.
Closed Form vs. Simulation
89/76
Variables Used
in the Analysis
Production
Participation
90/76
Findings from the First Wave
91/76
A Panel Data Model
Selection takes place only at the baseline.
There is no attrition.
$d_{i0} = 1[\gamma' z_{i0} + h_{i0} > 0]$  Sample Selector
$y_{it} = w_i + \beta' x_{it} + v_{it} - u_{it}, \quad t = 0, 1, \ldots$  Stochastic Frontier
Selection effect is exerted on $w_i$; $\text{Corr}(h_{i0}, w_i) = \rho$
$P(y_{it}, d_{i0}) = P(d_{i0})\, P(y_{it} \mid d_{i0})$
Conditioned on the selection ($h_{i0}$) observations are independent.
$P(y_{i0}, y_{i1}, \ldots, y_{iT} \mid d_{i0}) = \prod_{t=0}^{T} P(y_{it} \mid d_{i0})$
I.e., the selection is acting like a permanent random effect.
$P(y_{i0}, y_{i1}, \ldots, y_{iT}, d_{i0}) = P(d_{i0}) \prod_{t=0}^{T} P(y_{it} \mid d_{i0})$
92/76
Simulated Log Likelihood
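$\log L_{S,C}(\beta, \sigma_u, \sigma_v, \gamma, \rho) = \sum_{d_i = 1} \log \frac{1}{R} \sum_{r=1}^{R} \prod_{t=0}^{T} \left[ \frac{\exp(-\frac{1}{2}(y_{it} - \beta' x_{it} + \sigma_u |U_{itr}|)^2 / \sigma_v^2)}{\sigma_v \sqrt{2\pi}}\, \Phi\!\left(\frac{\rho\,(y_{it} - \beta' x_{it} + \sigma_u |U_{itr}|)/\sigma_v + a_i}{\sqrt{1 - \rho^2}}\right) \right]$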
93/76
Benefit group is more efficient in both years
The gap is wider in the second year
Both means increase from year 0 to year 1
Both variances decline from year 0 to year 1
Main Empirical Conclusions from Waves 0 and 1
94/76
95/76
Spatial Autocorrelation
96/76
Spatial Stochastic Frontier Models: Accounting for Unobserved
Local Determinants of Inefficiency: A.M.Schmidt, A.R.B.Morris,
S.M.Helfand, T.C.O.Fonseca, Journal of Productivity Analysis, 31,
2009, pp. 101-112
Simply redefines the random effect to be a ‘region effect.’ Just a
reinterpretation of the ‘group.’ No spatial decay with distance.
True REM does not “perform” as well as several other
specifications. (“Performance” has nothing to do with the frontier
model.)
True Random Spatial Effects