Chapter 2: Variably Parametric Nonlinear Regression with Endogenous Switching
James Cunningham
(September, 2012)
Introduction
• Most empirical research in health economics (HE) focuses on measurement of policy-relevant
causal effects: what effect would an exogenously mandated change in the (policy) variable
have on the outcome of interest?
• HE is replete with nonlinear outcomes: non-negative; count-valued; highly skewed; etc.
• The dissertation as a whole treats practical methods of estimating endogenous treatment
effects in nonlinear models.
• This paper — Chapter 2 — develops some flexible but parametric estimators in the case of
binary endogenous switching, methods which are either
o Minimally parametric (requiring specification of the conditional mean)
o Full information (requiring specification of the conditional density)
Introduction (contd)
• These form the foundation of the dissertation, drawing upon the research of Terza (1998, 2009,
etc.).
• I demonstrate two estimators:
o Minimally parametric with specification of a conditional mean; by example we use an
exponential conditional mean with a linear index.
o Fully parametric with specification of the conditional density of the outcome; by example
we use the three-parameter generalized gamma (Manning et al. [2005]).
• In the sections that follow we introduce the estimation objective (average treatment effect);
give detail on the estimators; provide a Monte Carlo study of their efficiency properties; and
apply them to real data.
Estimation Objective: Average Treatment Effect from a Potential Outcomes Perspective
• Consider measurement of the effect of a policy-relevant variable Xp on an outcome Y.
• Distinguish between the observed $X_p$ and its exogenously mandated counterpart $X_p^*$, and
similarly between $Y$ and its potential (possibly counterfactual) value $Y_{X_p^*}$.
• Then the average treatment effect is given by
$$E[Y_1] - E[Y_0] \tag{1}$$
• Due to the (possibly) counterfactual nature of the random variables Y1 and Y0, (1) cannot be
estimated directly.
Estimation Objective: Average Treatment Effect (contd)
• But when controlling for a comprehensive set of variables Xo (observed), Xu (unobserved),
we can iterate expectations:
$$\mathrm{ATE} = E[Y_1] - E[Y_0] = E_{X_o, X_u}\Big[\, E[Y \mid X_p = 1, X_o, X_u] - E[Y \mid X_p = 0, X_o, X_u] \,\Big] \tag{2}$$
• When Xu is correlated with Xp, ignoring the unobserved Xu will spuriously attribute some of
its effect to Xp.
• We can recover causal interpretation by formalizing the correlation between Xp and Xu , as in
$$X_p = 1(W\alpha + X_u > 0) \tag{3}$$
where $1(\cdot)$ is a standard indicator function, $W = [X_o \;\; W^+]$, $W^+$ is a vector of identifying
instrumental variables, and $(X_u \mid W) \sim N(0, 1)$.
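The role of the switching rule can be illustrated with a small simulation. This is a minimal sketch, not part of the original analysis: it assumes an exponential mean in a single index, a single instrument drawn from U(0, 1), and hypothetical parameter values, and it uses the conditional means directly in place of noisy outcomes. The names `simulate`, `beta_p`, `beta_u` and `alpha_w` are illustrative only.

```python
import math
import random

def simulate(n, beta_p=1.0, beta_u=0.5, alpha_w=1.0, seed=0):
    """Return (naive difference in means, average treatment effect) for one sample.

    Assumed setup (hypothetical): Xp = 1(alpha_w * W + Xu > 0), with
    E[Y | Xp, Xu] = exp(Xp * beta_p + Xu * beta_u).
    """
    rng = random.Random(seed)
    y_treated, y_control, effects = [], [], []
    for _ in range(n):
        w = rng.random()                      # instrument, U(0, 1)
        xu = rng.gauss(0.0, 1.0)              # unobserved confounder, N(0, 1)
        xp = 1 if alpha_w * w + xu > 0 else 0  # endogenous switching rule
        y1 = math.exp(beta_p + beta_u * xu)   # mean outcome if treatment were mandated
        y0 = math.exp(beta_u * xu)            # mean outcome if no treatment were mandated
        effects.append(y1 - y0)
        (y_treated if xp else y_control).append(y1 if xp else y0)
    naive = sum(y_treated) / len(y_treated) - sum(y_control) / len(y_control)
    ate = sum(effects) / n
    return naive, ate

if __name__ == "__main__":
    naive, ate = simulate(20000)
    print(f"naive difference in means: {naive:.3f}")
    print(f"average treatment effect:  {ate:.3f}")
```

Because treated units tend to have larger Xu under this rule, the naive comparison of observed group means attributes part of the effect of Xu to Xp.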
Estimation Objective: Average Treatment Effect (contd)
• By iterating expectations, we can then write (2) as
$$\mathrm{ATE} = E_{X_o}\left[\int_{-\infty}^{\infty} \Big\{ E[Y \mid X_p = 1, X_o, X_u] - E[Y \mid X_p = 0, X_o, X_u] \Big\}\, \varphi(X_u)\, dX_u \right] \tag{4}$$
• Then an estimator of (1), through (2) and (3), is
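$$\widehat{\mathrm{ATE}} = \frac{1}{n}\sum_{i=1}^{n}\left[\int_{-\infty}^{\infty} \Big\{ \hat{E}[Y \mid X_p = 1, X_o, X_u] - \hat{E}[Y \mid X_p = 0, X_o, X_u] \Big\}\, \varphi(X_u)\, dX_u \right] \tag{5}$$
where $\hat{E}[\cdot]$ denotes an estimate of an expected value.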
• We thus proceed by specifying estimators as if Xu were observed, just one variable among
others.
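The estimator in (5) can be sketched numerically: for each observation, the difference in estimated conditional means is integrated against the standard normal density, and the results are averaged. The sketch below assumes Simpson's rule on a truncated range and a user-supplied conditional mean function; `integrate_out_xu`, `ate_hat` and `example_mean` are hypothetical names, and the exponential mean is only a stand-in.

```python
import math
from statistics import NormalDist

def integrate_out_xu(cond_mean, xo, lo=-8.0, hi=8.0, steps=800):
    """Simpson's rule for the inner integral in (5) at one value of Xo.

    cond_mean(xp, xo, xu) is any estimated conditional mean function.
    The range [-8, 8] truncates the normal tails (an approximation).
    """
    phi = NormalDist().pdf
    h = (hi - lo) / steps  # steps must be even for Simpson's rule

    def g(xu):
        return (cond_mean(1, xo, xu) - cond_mean(0, xo, xu)) * phi(xu)

    total = g(lo) + g(hi)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * g(lo + j * h)
    return total * h / 3

def ate_hat(cond_mean, xo_values):
    """Sample average of the integrated treatment effect, as in (5)."""
    return sum(integrate_out_xu(cond_mean, xo) for xo in xo_values) / len(xo_values)

def example_mean(xp, xo, xu):
    """Stand-in conditional mean with hypothetical coefficients."""
    return math.exp(1.0 * xp + 1.0 * xo + 0.5 * xu)

if __name__ == "__main__":
    print(ate_hat(example_mean, [-0.5, 0.0, 1.0]))
```

Treating Xu as one more argument of the conditional mean is what lets the same routine serve any specification of the mean.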
Endogenous Treatment Effects in Continuous Nonnegative Models
• Consider the common specification
$$E[Y \mid X_p, X_o, X_u] = \exp(X_p\beta_p + X_o\beta_o + X_u\beta_u) \tag{6}$$
• After some algebra the treatment effect from (5) can be written
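$$\widehat{\mathrm{ATE}} = \frac{1}{n}\sum_{i=1}^{n}\left[\exp(X_o\hat{\beta}_o^{+})\left(\exp(\hat{\beta}_p) - 1\right)\right] \tag{7}$$
where $\hat{\beta}$ denotes an estimate of $\beta$, and $\beta_o^{+}$ is $\beta_o$ with its constant term shifted by $\tfrac{1}{2}\beta_u^2$.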
• We consider minimally and fully parametric approaches to the estimation of the parameters
necessary for (7).
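The algebra leading to (7) is short enough to spell out. The following is a sketch of one route, assuming the exponential conditional mean (6) and the standard normal distribution of Xu from (3); it evaluates the inner integral of (5) for a single observation.

```latex
\begin{aligned}
&\int_{-\infty}^{\infty}\left\{\exp(\beta_p + X_o\beta_o + X_u\beta_u)
      - \exp(X_o\beta_o + X_u\beta_u)\right\}\varphi(X_u)\,dX_u \\
&\quad= \exp(X_o\beta_o)\left(\exp(\beta_p) - 1\right)
      \int_{-\infty}^{\infty}\exp(X_u\beta_u)\,\varphi(X_u)\,dX_u \\
&\quad= \exp\!\left(X_o\beta_o + \tfrac{1}{2}\beta_u^2\right)\left(\exp(\beta_p) - 1\right)
\end{aligned}
```

The last step uses the moment generating function of a standard normal variable, $E[\exp(tZ)] = \exp(t^2/2)$. Absorbing $\tfrac{1}{2}\beta_u^2$ into the constant term of $\beta_o$ gives $\beta_o^{+}$; averaging over the sample and substituting estimates then gives (7).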
Endogenous Treatment Effects in Continuous Nonnegative Models: Minimally Parametric
• If the conditional mean assumption (6) holds, no further assumption is required (beyond the
relationship between Xp and Xu ).
• To derive consistent estimates of the parameters, it can be shown that
$$E[Y \mid X_p, W] = \exp(X_p\beta_p + X_o\beta_o^{+})\left[x_p\,\frac{\Phi(\beta_u + w\alpha)}{\Phi(w\alpha)} + (1 - x_p)\,\frac{1 - \Phi(\beta_u + w\alpha)}{1 - \Phi(w\alpha)}\right] \tag{8}$$
• Equation (8) can be employed in estimation via a two-step procedure: probit in the first stage
and nonlinear least squares in the second.
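The regression function in (8) is simple to code once the first-stage index is available. The sketch below is illustrative only: `cond_mean_given_w` evaluates the right-hand side of (8) for given parameter values, and `simulated_mean` checks it by brute-force simulation of the switching rule. All names and numerical values are hypothetical, and no probit or nonlinear least squares fitting is performed here.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF

def cond_mean_given_w(xp, index_o_plus, w_alpha, beta_p, beta_u):
    """Right-hand side of (8).

    index_o_plus stands for Xo * beta_o^+, i.e. Xo * beta_o + beta_u**2 / 2.
    w_alpha stands for the first-stage index W * alpha.
    """
    if xp == 1:
        correction = PHI(beta_u + w_alpha) / PHI(w_alpha)
    else:
        correction = (1.0 - PHI(beta_u + w_alpha)) / (1.0 - PHI(w_alpha))
    return math.exp(xp * beta_p + index_o_plus) * correction

def simulated_mean(xp, index_o, w_alpha, beta_p, beta_u, draws=200_000, seed=1):
    """Monte Carlo average of exp(Xp*beta_p + Xo*beta_o + Xu*beta_u) given Xp and W."""
    rng = random.Random(seed)
    total, kept = 0.0, 0
    for _ in range(draws):
        xu = rng.gauss(0.0, 1.0)
        # keep only draws of Xu consistent with the observed treatment status
        if (w_alpha + xu > 0) == (xp == 1):
            total += math.exp(xp * beta_p + index_o + beta_u * xu)
            kept += 1
    return total / kept

if __name__ == "__main__":
    beta_p, beta_u, index_o, w_alpha = 1.0, 0.5, 0.3, 0.4
    for xp in (0, 1):
        closed = cond_mean_given_w(xp, index_o + 0.5 * beta_u ** 2, w_alpha, beta_p, beta_u)
        print(xp, closed, simulated_mean(xp, index_o, w_alpha, beta_p, beta_u))
```

In a two-step implementation, `w_alpha` would be replaced by the fitted probit index, and the remaining parameters would be chosen to minimise the sum of squared differences between Y and this function.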
Endogenous Treatment Effects in Continuous Nonnegative Models: Fully Parametric
• When further assumptions can or must be made, we must consider a full-information version
of the model above. Letting gg refer to the generalized gamma, assume that
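$$f(Y \mid X_o, X_p, X_u) = gg(Y \mid X; \mu, \kappa, \sigma) = \frac{\gamma^{\gamma}}{\sigma Y \sqrt{\gamma}\,\Gamma(\gamma)}\exp\!\left(Z\sqrt{\gamma} - U\right) \tag{9}$$
where $X = [X_p \;\; X_o \;\; X_u]$, $\mu = X_p\beta_p + X_o\beta_o + X_u\beta_u$, $\gamma = |\kappa|^{-2}$, $Z = \mathrm{sgn}(\kappa)(\log Y - \mu)/\sigma$,
and $U = \gamma\exp(|\kappa| Z)$.
• The generalized gamma is highly flexible: it fits the nonnegative, highly skewed outcomes
common in HE, and subsumes many popular distributions (gamma, Weibull, exponential,
lognormal).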
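For concreteness, the density can be written as a short function. This sketch assumes the standard three-parameter form of the generalized gamma (with $\gamma = |\kappa|^{-2}$ and square-root terms in $\gamma$), which is taken here to be what (9) denotes; the function name `gg_pdf` is hypothetical. The usage lines illustrate one nested case: with $\kappa = \sigma = 1$ the density reduces to an exponential with mean $\exp(\mu)$.

```python
import math

def gg_pdf(y, mu, kappa, sigma):
    """Generalized gamma density at y > 0, for kappa != 0 and sigma > 0.

    Assumed parameterization: gamma = |kappa|**-2,
    z = sgn(kappa) * (log y - mu) / sigma, u = gamma * exp(|kappa| * z).
    """
    gamma = abs(kappa) ** -2
    z = math.copysign(1.0, kappa) * (math.log(y) - mu) / sigma
    u = gamma * math.exp(abs(kappa) * z)
    # work on the log scale to avoid overflow in gamma**gamma and Gamma(gamma)
    log_f = (gamma * math.log(gamma) - math.log(sigma * y)
             - 0.5 * math.log(gamma) - math.lgamma(gamma)
             + z * math.sqrt(gamma) - u)
    return math.exp(log_f)

if __name__ == "__main__":
    y, mu = 2.0, 0.5
    exponential = math.exp(-mu) * math.exp(-y * math.exp(-mu))
    print(gg_pdf(y, mu, 1.0, 1.0), exponential)
```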
Endogenous Treatment Effects in Continuous Nonnegative Models: Fully Parametric
• Further:
$$E[Y \mid X_p, X_o, X_u] = \exp(\mu + k) \tag{10}$$
where $k = (\sigma/\kappa)\log(\kappa^2) + \log\Gamma(\kappa^{-2} + \sigma/\kappa) - \log\Gamma(\kappa^{-2})$.
• Thus the average treatment effect estimator takes the form of (7), after adding the
correction k to the index.
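• It can be shown that the log-likelihood is
$$L(\alpha, \beta, \mu, \kappa, \sigma \mid Y, X_p; W) = \sum_{i=1}^{n}\log\left(X_{pi}\int_{-w_i\alpha}^{\infty} gg(Y_i \mid X_i; \kappa, \mu, \sigma)\,\varphi(X_u)\,dX_u + (1 - X_{pi})\int_{-\infty}^{-w_i\alpha} gg(Y_i \mid X_i; \kappa, \mu, \sigma)\,\varphi(X_u)\,dX_u\right) \tag{11}$$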
• The parameters β and α can be jointly estimated via maximum likelihood using (11).
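The two ingredients of the full-information approach, the mean correction k in (10) and the likelihood contribution in (11), can be sketched together. This is an illustration under stated assumptions rather than the estimator as implemented: the density uses the standard generalized gamma parameterization, k is written with a log-gamma function in both of its last two terms and is valid for κ > 0, and the integral over Xu is approximated by Simpson's rule on a truncated range. The names `gg_pdf`, `mean_correction` and `loglik_i` are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def gg_pdf(y, mu, kappa, sigma):
    """Generalized gamma density (standard parameterization, kappa != 0)."""
    gamma = abs(kappa) ** -2
    z = math.copysign(1.0, kappa) * (math.log(y) - mu) / sigma
    u = gamma * math.exp(abs(kappa) * z)
    log_f = (gamma * math.log(gamma) - math.log(sigma * y)
             - 0.5 * math.log(gamma) - math.lgamma(gamma)
             + z * math.sqrt(gamma) - u)
    return math.exp(log_f)

def mean_correction(kappa, sigma):
    """k such that E[Y] = exp(mu + k); assumes kappa > 0."""
    g = kappa ** -2
    return ((sigma / kappa) * math.log(kappa ** 2)
            + math.lgamma(g + sigma / kappa) - math.lgamma(g))

def loglik_i(y, xp, index, w_alpha, beta_u, kappa, sigma, steps=400, bound=8.0):
    """Log-likelihood contribution of one observation.

    index stands for Xp*beta_p + Xo*beta_o, so that mu = index + beta_u * xu.
    Treated units integrate Xu over (-w_alpha, bound); untreated units over
    (-bound, -w_alpha). The bound truncates the normal tails (approximation).
    """
    lo, hi = (-w_alpha, bound) if xp == 1 else (-bound, -w_alpha)
    h = (hi - lo) / steps  # steps must be even for Simpson's rule

    def g(xu):
        return gg_pdf(y, index + beta_u * xu, kappa, sigma) * STD_NORMAL.pdf(xu)

    total = g(lo) + g(hi)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * g(lo + j * h)
    return math.log(total * h / 3)

if __name__ == "__main__":
    print(mean_correction(0.8, 0.4))
    print(loglik_i(y=1.2, xp=1, index=0.3, w_alpha=0.5, beta_u=0.5, kappa=0.8, sigma=0.4))
```

Summing `loglik_i` over observations gives an objective that a numerical optimiser could maximise jointly over α, β, κ and σ. When `beta_u` is zero the integral factorises into the density times a probit probability, which gives a convenient correctness check.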
Monte Carlo Simulations
• To evaluate the consistency properties of the above estimators, we undertake a Monte Carlo
study. In all simulations the data generating process takes the following form:
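$$X_o \sim U(-0.5, 1), \quad W \sim U(0, 1), \quad X_u \sim N(0, 1)$$
$$X_p = 1(X_o\alpha_o + W\alpha_w + \alpha_c + X_u > 0)$$
$$\mu = X_p\beta_p + X_o\beta_o + X_u\beta_u + \beta_c, \quad \kappa = 0.8, \quad \sigma = 0.4$$
$$Y \sim \mathrm{GeneralizedGamma}(\mu, \sigma, \kappa)$$
$$[\alpha_o \;\; \alpha_w \;\; \alpha_c] = [1 \;\; 1 \;\; 0.5]$$
$$[\beta_p \;\; \beta_o \;\; \beta_u \;\; \beta_c] = [1 \;\; 1 \;\; 0.5 \;\; 0.25]$$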
• The average treatment effect was estimated with each of the estimators described above.
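One replication of this data generating process could be drawn as follows. This is a sketch, not the code used for the study: it relies on the representation log Y = μ + (σ/κ) log(κ² G) with G distributed Gamma(κ⁻², 1), which holds for κ > 0 under the standard generalized gamma parameterization. The function name `draw_sample` and the dictionary layout are hypothetical.

```python
import math
import random

ALPHA = {"o": 1.0, "w": 1.0, "c": 0.5}             # first-stage coefficients
BETA = {"p": 1.0, "o": 1.0, "u": 0.5, "c": 0.25}   # outcome coefficients
KAPPA, SIGMA = 0.8, 0.4

def draw_sample(n, seed=0):
    """Return a list of (y, xp, xo, w) tuples; Xu is drawn but not returned."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        xo = rng.uniform(-0.5, 1.0)
        w = rng.random()
        xu = rng.gauss(0.0, 1.0)
        # endogenous binary treatment
        xp = 1 if ALPHA["o"] * xo + ALPHA["w"] * w + ALPHA["c"] + xu > 0 else 0
        mu = BETA["p"] * xp + BETA["o"] * xo + BETA["u"] * xu + BETA["c"]
        # generalized gamma draw via a standard gamma variate (kappa > 0 assumed)
        g = rng.gammavariate(KAPPA ** -2, 1.0)
        y = math.exp(mu + (SIGMA / KAPPA) * math.log(KAPPA ** 2 * g))
        rows.append((y, xp, xo, w))
    return rows

if __name__ == "__main__":
    sample = draw_sample(5000)
    print(sum(row[1] for row in sample) / len(sample))  # share treated
```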
Monte Carlo Simulations (contd)
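With 500 repetitions at each of the sample sizes 5,000; 10,000; 50,000; and 100,000, we compute the
absolute percentage bias for each parameter:
$$\mathrm{ABP}(\hat{\beta}) = \frac{1}{m}\sum_{i=1}^{m}\left|\frac{\hat{\beta}_i - \beta}{\beta}\right|$$
where $m$ is the number of repetitions and $\hat{\beta}_i$ is the estimate from repetition $i$.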
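The bias measure is a one-line computation. The sketch below assumes the absolute value is taken inside the sum, as the name suggests; the function name `abp` and the sample estimates are hypothetical.

```python
def abp(estimates, true_value):
    """Mean absolute deviation of the estimates from the true value, as a fraction of it."""
    return sum(abs((b - true_value) / true_value) for b in estimates) / len(estimates)

if __name__ == "__main__":
    # three made-up replications of an estimate whose true value is 1
    print(f"{100 * abp([0.9, 1.1, 1.0], 1.0):.2f}%")
```

Note that, defined this way, the measure reflects sampling variability as well as systematic bias, since deviations in either direction accumulate.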
Endogenous Treatment: Minimally Parametric Exponential Conditional Mean Estimator
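True values: βp = 1, βo = 1, βu = 0.5, βc = 0.25, ATE = 2.22

| n | βp Est | βp ABP | βo Est | βo ABP | βu Est | βu ABP | βc Est | βc ABP | ATE Est | ATE ABP |
|---|---|---|---|---|---|---|---|---|---|---|
| 5,000 | 0.995 | 7.65% | 1.002 | 2.82% | 0.504 | 11.24% | 0.247 | 12.97% | 2.201 | 6.24% |
| 10,000 | 0.996 | 5.58% | 1.002 | 1.97% | 0.504 | 8.14% | 0.249 | 9.88% | 2.208 | 4.47% |
| 50,000 | 1.002 | 2.38% | 1.000 | 0.90% | 0.498 | 3.55% | 0.249 | 4.07% | 2.219 | 1.91% |
| 100,000 | 0.998 | 1.72% | 1.000 | 0.67% | 0.501 | 2.53% | 0.250 | 2.84% | 2.212 | 1.41% |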
Monte Carlo Simulations (contd)
Endogenous Treatment: Full-Information Generalized Gamma Estimator
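True values: βp = 1, βo = 1, βu = 0.5, βc = 0.25

| n | βp Est | βp ABP | βo Est | βo ABP | βu Est | βu ABP | βc Est | βc ABP |
|---|---|---|---|---|---|---|---|---|
| 5,000 | 1.008 | 2.20% | 0.998 | 0.91% | 0.494 | 2.60% | 0.240 | 8.34% |
| 10,000 | 1.007 | 1.62% | 0.999 | 0.67% | 0.495 | 1.80% | 0.243 | 6.12% |
| 50,000 | 1.006 | 0.86% | 0.999 | 0.32% | 0.496 | 1.04% | 0.243 | 3.45% |
| 100,000 | 1.006 | 0.71% | 0.999 | 0.23% | 0.496 | 0.92% | 0.243 | 3.03% |

True values: ATE = 2.22, κ = 0.8, σ = 0.4

| n | ATE Est | ATE ABP | κ Est | κ ABP | σ Est | σ ABP |
|---|---|---|---|---|---|---|
| 5,000 | 2.226 | 2.12% | 0.773 | 7.10% | 0.406 | 2.95% |
| 10,000 | 2.229 | 1.64% | 0.777 | 5.25% | 0.406 | 2.25% |
| 50,000 | 2.227 | 0.86% | 0.777 | 3.21% | 0.406 | 1.53% |
| 100,000 | 0.223 | 0.62% | 0.778 | 2.83% | 0.405 | 1.38% |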
Monte Carlo Simulations (contd)
• On average, both estimators recover the true parameter values reasonably well.
• There are clear efficiency advantages to using the full-information estimator — percentage
biases are low even in small samples.
• In small samples using the minimally parametric estimator, βu appears subject to some bias,
but the implications for treatment effect estimation seem minimal.
• In future revisions simulations should draw upon correct standard errors to characterize the
seriousness of these implications in determining (and correcting for) endogeneity bias in small
samples.
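One informal way to read the simulation tables is to compare how fast the ABP shrinks with the rate that root-n convergence would suggest. The sketch below is an added check, not part of the original study: it takes the reported ABP values at n = 5,000 and n = 50,000 and compares their ratio with the square root of 10. The dictionary and function names are hypothetical.

```python
import math

# (ABP at n = 5,000, ABP at n = 50,000), in percent, copied from the tables above
ABP_MIN = {"beta_p": (7.65, 2.38), "beta_o": (2.82, 0.90), "beta_u": (11.24, 3.55),
           "beta_c": (12.97, 4.07), "ATE": (6.24, 1.91)}
ABP_FULL = {"beta_p": (2.20, 0.86), "beta_o": (0.91, 0.32), "beta_u": (2.60, 1.04),
            "beta_c": (8.34, 3.45), "ATE": (2.12, 0.86)}

ROOT_N_RATIO = math.sqrt(50_000 / 5_000)  # ratio expected if ABP shrinks like 1/sqrt(n)

def ratios(table):
    """Ratio of small-sample to large-sample ABP for each parameter."""
    return {name: small / large for name, (small, large) in table.items()}

if __name__ == "__main__":
    print(f"root-n benchmark: {ROOT_N_RATIO:.2f}")
    for label, table in (("minimally parametric", ABP_MIN), ("full information", ABP_FULL)):
        for name, r in ratios(table).items():
            print(f"{label:22s} {name:7s} {r:.2f}")
```

A ratio close to the benchmark is what one would expect if the ABP were driven by sampling noise alone. A ratio below it is consistent with, though it does not establish, a component of the error that shrinks more slowly with the sample size.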
Real Data Example
• To provide an empirical demonstration, we applied both estimators above to the birthweight
data from Mullahy (1997), who investigated the role played by maternal cigarette smoking in
determining birthweight.
• Consider birthweight production to be a function of a binary indicator (cig) for whether the
mother smoked during pregnancy, other relevant covariates ( Xo), and any unobservable
determinants of birthweight ( Xu ):
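$$E[\mathrm{BirthWeight} \mid cig, X_o, X_u] = \exp(cig \cdot \beta_{cig} + X_o\beta_o + X_u\beta_u) \tag{12}$$
in the minimally parametric case, and
$$(\mathrm{BirthWeight} \mid cig, X_o, X_u) \sim \mathrm{GeneralizedGamma}(\kappa,\; \mu = cig \cdot \beta_{cig} + X_o\beta_o + X_u\beta_u,\; \sigma) \tag{13}$$
in the fully parametric case.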
Real Data Example (contd)
• The observable vector Xo contains birth order (parity), an indicator for race (white v.
nonwhite), an indicator for gender, and a constant.
• The vector of instruments contains parental education, family income, and the per-state
cigarette excise tax.
Results
Birthweight Model with Endogenous Treatment Effect

Columns 2 to 4: Minimally Parametric (Exp Cond Mean). Columns 5 to 7: Fully Parametric (Generalized Gamma).

| Variable | Coefficient | T-Statistic | P-Value | Coefficient | T-Statistic | P-Value |
|---|---|---|---|---|---|---|
| Smoked During Pregnancy | -0.17 | -3.82 | 0.00 | -0.15 | -7.10 | 0.00 |
| Parity | 0.02 | 3.06 | 0.00 | 0.01 | 2.81 | 0.01 |
| White | 0.06 | 4.65 | 0.00 | 0.05 | 4.19 | 0.00 |
| Male | 0.02 | 2.31 | 0.02 | 0.02 | 1.91 | 0.06 |
| Constant | 1.95 | 124.33 | 0.00 | 1.99 | 130.90 | 0.00 |
| Xu | 0.05 | 2.23 | 0.03 | 0.04 | 5.30 | 0.00 |
| Effect of Cig on B.Wt. (lbs) | -1.18 | -4.26 | 0.00 | -1.03 | -7.80 | 0.00 |
| κ | | | | 0.60 | 4.77 | 0.00 |
| σ | | | | 0.16 | 20.00 | 0.00 |
All parameter estimates significant at conventional levels. Standard errors corrected for multi-step estimation.
Real Data Example (contd)
• Results are broadly consistent between the minimally and fully parametric estimators,
although there appear to be some efficiency gains from using maximum likelihood.
• In the minimally parametric case, maternal smoking appears to lead to a loss of 1.18 pounds;
and in the fully parametric case a loss of 1.03 pounds.
• Both are considerably different from a treatment effect estimate using NLS with an
exponential conditional mean that did not correct for endogeneity, which implies an average
drop in birthweight of about 0.57 pounds.
• Estimates of parameters κ and σ are statistically significant, so use of the generalized gamma
does appear to offer an opportunity for greater fit.
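Because both specifications use an exponential index, the smoking coefficient can also be read as a proportional effect: holding Xo and Xu fixed, switching cig from 0 to 1 multiplies expected birthweight by exp(βcig). The sketch below applies this to the two reported coefficients; it is added arithmetic rather than a result from the original analysis, and the function name `proportional_effect` is hypothetical.

```python
import math

def proportional_effect(beta_cig):
    """Proportional change in the conditional mean when cig switches from 0 to 1."""
    return math.exp(beta_cig) - 1.0

if __name__ == "__main__":
    for label, coef in (("minimally parametric", -0.17), ("fully parametric", -0.15)):
        print(f"{label}: {100 * proportional_effect(coef):.1f}%")
```

On the reported coefficients this works out to reductions of roughly 15.6% and 13.9% in expected birthweight, which is in line with the estimated losses of 1.18 and 1.03 pounds.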