8/3/2019 Metrics Guide 1
http://slidepdf.com/reader/full/metrics-guide-1 1/29
Study Guide for Econometrics
Unit 0: Preliminaries
Data types
Cross-sectional data
Time-series data
Panel data (or longitudinal data)
Total sum of squares: TSS = ∑(yi − ȳ)² = (N − 1) ⋅ var(y)
Model sum of squares: MSS = ∑(ŷi − ȳ)² = (N − 1) ⋅ var(ŷ)
Residual sum of squares: RSS = ∑(ŷi − yi)² = (N − 1) ⋅ var(ê)
Coefficient of determination: R² = MSS/TSS = 1 − RSS/TSS = corr(y, ŷ)²
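These identities can be checked numerically. A minimal Python/NumPy sketch (not part of the original Stata examples; the simulated data and variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# OLS fit of y on a constant and x
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ beta

tss = np.sum((y - y.mean()) ** 2)     # equals (N-1)*var(y)
mss = np.sum((yhat - y.mean()) ** 2)  # equals (N-1)*var(yhat)
rss = np.sum((y - yhat) ** 2)         # equals (N-1)*var(ehat)

r2_a = mss / tss
r2_b = 1 - rss / tss
r2_c = np.corrcoef(y, yhat)[0, 1] ** 2
print(r2_a, r2_b, r2_c)  # all three definitions of R-squared agree
```

With an intercept in the model, all three expressions for R² coincide exactly, and TSS = MSS + RSS.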
Interpretation
Quality of model (useful, but incorrect)
Comparison between models
Hypothesis testing
Univariate: H0: βj = βj*;  HA: βj ≠ βj*.
Test statistic t* = (β̂j − βj*)/st.err.(β̂j) has a t-distribution with N − k d.o.f.
Multivariate
Null hypothesis that a set of βs take on particular values; alternative that at least one of them does not.
Test statistic has F-distribution.
Example of Stata commands:
reg y x1 x2 x3           OLS regression
test x1 = 2.13           Test of individual hypothesis
test x2 = -5.9, accum    Joint test of hypotheses
test x2 x3               Joint test of equaling zero
. reg y x1 x2 x3
      Source |       SS       df       MS              Number of obs =     100
-------------+------------------------------           F(  3,    96) = 1476.46
       Model |  5241.10184     3  1747.03395           Prob > F      =  0.0000
    Residual |  113.592501    96  1.18325521           R-squared     =  0.9788
-------------+------------------------------           Adj R-squared =  0.9781
       Total |  5354.69434    99  54.0878216           Root MSE      =  1.0878

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   1.983209    .109285    18.15   0.000      1.76628    2.200137
          x2 |  -7.048816   .1111005   -63.45   0.000    -7.269349   -6.828283
          x3 |   .0388324    .107991     0.36   0.720    -.1755282    .2531929
       _cons |   3.109514   .1091322    28.49   0.000     2.892888     3.32614
------------------------------------------------------------------------------
. test x1 = 2.13
( 1) x1 = 2.13
       F(  1,    96) =    1.80
            Prob > F =    0.1824

. test x2 = -5.9, accum

 ( 1)  x1 = 2.13
 ( 2)  x2 = -5.9

       F(  2,    96) =   55.42
            Prob > F =    0.0000

. test x2 x3

 ( 1)  x2 = 0
 ( 2)  x3 = 0

       F(  2,    96) = 2076.70
            Prob > F =    0.0000
Top-left table: “SS” column contains the MSS, RSS, and TSS. Disregard “df” and “MS”.
Top-right table: number of observations; F-statistic for the “overall significance” of the regression (testing the hypothesis that all of the explanatory variables have zero effect); p-value of this hypothesis; R² of the regression. Disregard “Adj R-squared” and “Root MSE”.
Bottom table: “Coef.” column contains estimates of β; the next column has standard errors of each β̂; then the t-statistic testing the hypothesis that this variable has zero effect; then the p-value of this test; finally, a 95% confidence interval for the estimated coefficient.
Unit 2: Data Concerns
Collinearity
Perfect collinearity: one explanatory variable is a linear function of others.
Implication: ˆβ cannot be estimated.
Solution: Drop one variable; modify interpretation.
Near collinearity: high correlation between explanatory variables.
Implication: ˆβ has large standard errors.
Solutions: Dropping variables (discouraged); change nothing, but focus on joint significance (preferred).
Specification
Rescaling variables: no theoretical difference (some practical concerns) {6.1}
Omitted variables: Omitting x3 from the model causes
E[β̂2] = β2 + (cov(x2, x3)/var(x2)) ⋅ β3  (“omitted variable bias”)
Irrelevant variables: Including irrelevant x3 introduces no bias in estimation of β, and E[β̂3] = β3 = 0.
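The omitted variable bias formula can be verified by simulation; a Python/NumPy sketch (coefficients and data are illustrative, not from the guide):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
x2 = rng.normal(size=n)
x3 = 0.5 * x2 + rng.normal(size=n)           # x3 correlated with x2
y = 1.0 + 2.0 * x2 + 3.0 * x3 + rng.normal(size=n)

# "Short" regression: omit x3 from the model
X = np.column_stack([np.ones(n), x2])
b_short = np.linalg.lstsq(X, y, rcond=None)[0]

# Predicted bias: (cov(x2, x3)/var(x2)) * beta3
bias = np.cov(x2, x3)[0, 1] / np.var(x2, ddof=1) * 3.0
print(b_short[1], 2.0 + bias)  # short-regression slope matches beta2 + bias
```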
Qualitative variables
Dummy variables: values of 0 or 1, depending on whether a condition is met.
Categorical variables: convert to a series of dummy variables; omit the “reference” category.
Nonlinear models
Common nonlinear specifications
Quadratics (for changing marginal effects)
yi = β1 + β2xi + β3xi² + ei;  Δy/Δx = β2 + 2β3xi.
Logarithms (for percentage changes and elasticities)
yi = β1 + β2ln(xi) + ei;  β2 ≈ Δy/%Δx.
ln(yi) = β1 + β2xi + ei;  β2 ≈ %Δy/Δx.
ln(yi) = β1 + β2ln(xi) + ei;  β2 ≈ %Δy/%Δx.
Interactions (for complementarities)
yi = β1 + β2x2i + β3x3i + β4(x2i ⋅ x3i);  Δy/Δx2 = β2 + β4x3i and Δy/Δx3 = β3 + β4x2i.
Interactions with dummy variables
Choosing a specification
Economic theory (preferred)
Eyeballing data
Comparison of R2 values (dangerous)
Testing a specification
Simple: inclusion of higher order terms
Ramsey’s Econometric Specification Error Test (RESET)
Dangers of “data mining” (and specification mining)
Classical measurement error
True model: yi = β1 + β2xi + ei, but x̃i = xi + mi is measured instead.
“Classical”: E[mi | everything] = 0.
Implication: E[β̂OLS] = β ⋅ var(x)/(var(x) + var(m))  (“attenuation bias”; “bias toward zero”)
Special case: tests of H0: β2 = 0 unaffected.
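Attenuation bias can be seen in a short simulation; a Python/NumPy sketch (variances chosen for illustration, not from the guide):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x = rng.normal(size=n)              # true regressor, var(x) = 1
m = rng.normal(size=n)              # classical measurement error, var(m) = 1
y = 2.0 * x + rng.normal(size=n)
x_obs = x + m                       # mismeasured regressor

# OLS slope of y on the mismeasured x
slope = np.cov(x_obs, y)[0, 1] / np.var(x_obs, ddof=1)

# Predicted attenuation: beta * var(x) / (var(x) + var(m)) = 2 * 1/2 = 1
attenuated = 2.0 * 1.0 / (1.0 + 1.0)
print(slope, attenuated)  # slope shrinks toward zero
```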
Unusual observations (“outliers”)
Implication: OLS is highly sensitive to extreme values of yi.
Solutions:
Dropping outliers (dangerous)
Least absolute deviations estimator: β̂LAD to min ∑|yi − xiβ|.
No adjustments (recommended)
Interpretation of OLS results
Experimental data: researcher manipulates x values.
Correlation can be interpreted as causal effect.
Empirical data: generated through real-world processes
Factors contributing to observed correlation, aside from effect of x on y:
Unobserved heterogeneity
Reverse causality
Selection
Ramsey Econometric Specification Test
. reg y x1 x2 x3
      Source |       SS       df       MS              Number of obs =    2134
-------------+------------------------------           F(  3,  2130) =  437.98
       Model |  35040.6408     3  11680.2136           Prob > F      =  0.0000
    Residual |  56803.8176  2130   26.668459           R-squared     =  0.3815
-------------+------------------------------           Adj R-squared =  0.3807
       Total |  91844.4584  2133  43.0588178           Root MSE      =  5.1642

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |  -1.185214   .1127196   -10.51   0.000    -1.406266   -.9641618
          x2 |   2.081589   .1122702    18.54   0.000     1.861418    2.301759
          x3 |   -3.18763   .1100042   -28.98   0.000    -3.403357   -2.971904
       _cons |  -.5286567    .111807    -4.73   0.000    -.7479189   -.3093945
------------------------------------------------------------------------------
. predict yhat
(option xb assumed; fitted values)
. gen yhat2 = yhat^2
. gen yhat3 = yhat^3
. gen yhat4 = yhat^4
. gen yhat5 = yhat^5
. reg y x1 x2 x3 yhat2 yhat3 yhat4 yhat5
      Source |       SS       df       MS              Number of obs =    2134
-------------+------------------------------           F(  7,  2126) =  191.66
       Model |  35535.1167     7  5076.44524           Prob > F      =  0.0000
    Residual |  56309.3417  2126  26.4860497           R-squared     =  0.3869
-------------+------------------------------           Adj R-squared =  0.3849
       Total |  91844.4584  2133  43.0588178           Root MSE      =  5.1465

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |  -1.281567    .128619    -9.96   0.000    -1.533799   -1.029335
          x2 |   2.237487   .1578664    14.17   0.000     1.927898    2.547076
          x3 |  -3.419771   .1976777   -17.30   0.000    -3.807433   -3.032109
       yhat2 |   -.007986   .0107457    -0.74   0.457    -.0290591    .0130871
       yhat3 |  -.0030688   .0018225    -1.68   0.092    -.0066428    .0005051
       yhat4 |  -.0001027   .0001054    -0.97   0.330    -.0003094     .000104
       yhat5 |   .0000144   .0000111     1.29   0.197    -7.49e-06    .0000362
       _cons |   -.361512   .1569125    -2.30   0.021    -.6692299    -.053794
------------------------------------------------------------------------------
. test yhat2 yhat3 yhat4 yhat5
 ( 1)  yhat2 = 0
 ( 2)  yhat3 = 0
 ( 3)  yhat4 = 0
 ( 4)  yhat5 = 0

       F(  4,  2126) =    4.67
            Prob > F =    0.0009
Note: Although none of the “yhat” terms is individually significant at conventional levels in this example, they are jointly highly significant.
Unit 3: Weighted and Generalized Least Squares Regression
Heteroskedasticity: E[ei² | xi] = σi² ≠ σ².
OLS unbiased as long as E[e | X] = 0 holds.
Variance calculation incorrect.
Robust standard errors: var(β̂OLS) = (X′X)⁻¹ X′ diag(êi²) X (X′X)⁻¹.
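The sandwich formula can be computed directly; a Python/NumPy sketch with simulated heteroskedastic data (illustrative only, not the Stata implementation):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 500
x = rng.normal(size=n)
e = rng.normal(size=n) * (1 + np.abs(x))  # error variance grows with |x|
y = 1.0 + 2.0 * x + e

X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
ehat = y - X @ beta

# White (HC0) sandwich: (X'X)^-1 [sum ehat_i^2 x_i x_i'] (X'X)^-1
meat = X.T @ (X * ehat[:, None] ** 2)
V_robust = XtX_inv @ meat @ XtX_inv
se_robust = np.sqrt(np.diag(V_robust))

# Conventional OLS variance for comparison
V_ols = XtX_inv * (ehat @ ehat) / (n - 2)
se_ols = np.sqrt(np.diag(V_ols))
print(se_ols, se_robust)
```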
OLS inefficient.
Testing for heteroskedasticity
White test
Breusch-Pagan test
Generalized Least Squares (GLS)
Objective: pick β̂GLS to min ê′Ωê, for some symmetric N × N matrix Ω.
Estimator: β̂GLS = (X′ΩX)⁻¹(X′ΩY).
Unbiasedness: if E[e X] = 0 , then any GLS estimator is unbiased.
Most efficient: Ω = (E[ee′])⁻¹.
Special cases of GLS:
Weighted least squares (WLS): Ω is a diagonal matrix; most efficient with heteroskedasticity (and no cross-correlation).
Ordinary least squares (OLS): Ω is the identity matrix; most efficient with homoskedasticity (and no cross-correlation).
Feasible Generalized Least Squares (FGLS)
Problem: In practice, Ω is unknown.
Solution: Use OLS to predict e ; then calculate ˆΩ ; use in place of unknown Ω .
Estimator: β̂FGLS = (X′Ω̂X)⁻¹(X′Ω̂Y).
Examples of Stata commands:
reg y x1 x2 x3                   OLS regression
hettest                          Breusch-Pagan test
reg y x1 x2 x3, robust           OLS regression with robust st. errors
reg y x1 x2 x3 [weight=omega]    WLS regression
. reg y x1 x2 x3
      Source |       SS       df       MS              Number of obs =     100
-------------+------------------------------           F(  3,    96) = 1227.70
       Model |  5190.30582     3  1730.10194           Prob > F      =  0.0000
    Residual |  135.285436    96  1.40922329           R-squared     =  0.9746
-------------+------------------------------           Adj R-squared =  0.9738
       Total |  5325.59125    99   53.793851           Root MSE      =  1.1871

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   1.929843   .1192645    16.18   0.000     1.693104    2.166581
          x2 |   -7.02553   .1212458   -57.94   0.000    -7.266201   -6.784859
          x3 |   .0407538   .1178524     0.35   0.730    -.1931813    .2746889
       _cons |   2.985645   .1190978    25.07   0.000     2.749238    3.222053
------------------------------------------------------------------------------
. hettest
Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of y

         chi2(1)      =     9.38
         Prob > chi2  =   0.0022
. reg y x1 x2 x3, robust
Linear regression                                      Number of obs =     100
                                                       F(  3,    96) =  812.97
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.9746
                                                       Root MSE      =  1.1871

------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   1.929843   .1055009    18.29   0.000     1.720425     2.13926
          x2 |   -7.02553    .179712   -39.09   0.000    -7.382255   -6.668804
          x3 |   .0407538   .2123003     0.19   0.848     -.380659    .4621666
       _cons |   2.985645   .1202126    24.84   0.000     2.747025    3.224266
------------------------------------------------------------------------------
Weighted Least Squares for heteroskedasticity
. reg y x1 x2 x3
      Source |       SS       df       MS              Number of obs =     500
-------------+------------------------------           F(  3,   496) =   12.46
       Model |  15297.6754     3  5099.22513           Prob > F      =  0.0000
    Residual |  203043.149   496  409.361188           R-squared     =  0.0701
-------------+------------------------------           Adj R-squared =  0.0644
       Total |  218340.825   499  437.556763           Root MSE      =  20.233

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   3.646955   .8512642     4.28   0.000     1.974427    5.319483
          x2 |  -3.964182   .9390897    -4.22   0.000    -5.809266   -2.119098
          x3 |   .6393417   .8702385     0.73   0.463    -1.070467     2.34915
       _cons |   3.094226   .9082631     3.41   0.001     1.309709    4.878744
------------------------------------------------------------------------------
. predict ehat, resid
. gen ehat2 = ehat^2
. gen x1sq = x1^2

. gen x2sq = x2^2
. gen x3sq = x3^2
. gen x1x2 = x1*x2
. gen x1x3 = x1*x3
. gen x2x3 = x2*x3
. quietly reg ehat2 x1 x2 x3 x1sq x2sq x3sq x1x2 x1x3 x2x3
. predict ehat2hat
(option xb assumed; fitted values)

. gen omega = 1/(ehat2hat)^.5
(204 missing values generated)

. reg y x1 x2 x3 [weight = omega]
(analytic weights assumed)
(sum of wgt is 2.0025e+01)
      Source |       SS       df       MS              Number of obs =     296
-------------+------------------------------           F(  3,   292) =   14.68
       Model |  11698.2506     3  3899.41685           Prob > F      =  0.0000
    Residual |  77554.2174   292  265.596635           R-squared     =  0.1311
-------------+------------------------------           Adj R-squared =  0.1221
       Total |   89252.468   295  302.550739           Root MSE      =  16.297

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   3.223748   .9682773     3.33   0.001     1.318061    5.129435
          x2 |  -5.863431   .9979716    -5.88   0.000     -7.82756   -3.899302
          x3 |   1.226024   .9680623     1.27   0.206    -.6792399    3.131288
       _cons |    3.44261   1.005671     3.42   0.001     1.463327    5.421893
------------------------------------------------------------------------------
Note: Some of the predicted variances were negative, so the weights could not be calculated for these observations, and they were dropped from the WLS regression. The smaller sample size hides some of the increase in precision.
Unit 4: Instrumental Variables Regression
Endogeneity: E[e | X] ≠ 0.
OLS with endogenous regressors: E[β̂OLS] = β + (X′X)⁻¹(X′e); biased.
Instrumental variable: has ability to predict endogenous regressors. Assumptions/requirements for the instrument:
At least as many instruments as explanatory variables: #Z ≥ #X.
Note: if xj is exogenous, it is technically used as an instrument for itself.
Uncorrelated with unobservables: E[e | Z] = 0 ⇒ E[Z′e] = 0. (E[e | X] ≠ 0 usually, but not necessarily.)
Correlated with the endogenous explanatory variables: (Z′X) is invertible when there are the same number of instruments. (Generally: (Z′X) is of full rank.)
Two-Stage Least Squares (2SLS)
First stage: X = Zγ + u ⇒ γ̂ = (Z′Z)⁻¹(Z′X) ⇒ X̂ = Zγ̂.
Second stage: regression of Y on X̂ yields β̂2SLS = (X̂′X)⁻¹(X̂′Y).
(Standard errors incorrect)
Instrumental Variables Regression: direct computation
β̂2SLS = (X̂′X)⁻¹(X̂′Y) is equivalent to (Z′X)⁻¹(Z′Y) = β̂IV (when #Z = #X).
Variance in estimator:
Estimated variance: var(β̂IV) = (Z′X)⁻¹(Z′Z)(X′Z)⁻¹ ⋅ σ̂e².
Inefficiency: var(β̂IV) = var(β̂OLS)/corr(x, z)², when #X = #Z = 1.
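A simulation sketch of OLS bias under endogeneity and the direct IV computation β̂IV = (Z′X)⁻¹(Z′Y) (Python/NumPy; the data-generating process is illustrative, not from the guide):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000
z = rng.normal(size=n)                 # instrument
u = rng.normal(size=n)                 # unobserved confounder
x = z + u + rng.normal(size=n)         # endogenous regressor
y = 2.0 * x + u + rng.normal(size=n)   # u enters both: E[e | x] != 0

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])

b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
b_iv = np.linalg.solve(Z.T @ X, Z.T @ y)  # (Z'X)^-1 (Z'Y), #Z = #X case
print(b_ols[1], b_iv[1])  # OLS slope biased away from 2; IV slope near 2
```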
Post-estimation tests
Hausman test: explanatory variables are endogenous.
Motivation: because of efficiency, OLS is preferred to IV if no endogeneity.
Null hypothesis: E[β̂IV] = β = E[β̂OLS], var[β̂IV] ≥ var[β̂OLS].
Alternative hypothesis: E[β̂IV] = β ≠ E[β̂OLS].
Check for weak instruments: cov(x ,z) ≈ 0 ?
Motivation: with weak instruments, very inefficient; biases get magnified;also, distribution of estimator may not be approximately normal.
Correlations; F-statistics from first stage.
Examples of Stata commands:
IV regression: x1 and x2 are exogenous, x3 and x4 are endogenous, and z1, z2, and z3 (plus x1 and x2) are instruments.
ivreg y x1 x2 (x3 x4 = z1 z2 z3)
IV regression, displaying first-stage results.
ivreg y x1 x2 (x3 x4 = z1 z2 z3), first
Hausman test
ivreg y x1 x2 (x3 x4 = z1 z2 z3)
est sto ivest
reg y x1 x2 x3 x4
est sto olsest
hausman ivest olsest
Test of over-identification
ivreg y x1 x2 (x3 x4 = z1 z2 z3)
predict ehat, resid
reg ehat x1 x2 z1 z2 z3
. ivreg y x1 x2 (x3 x4 = z1 z2 z3)
Instrumental variables (2SLS) regression
      Source |       SS       df       MS              Number of obs =    1234
-------------+------------------------------           F(  4,  1229) =  167.49
       Model |  19929.9571     4  4982.48929           Prob > F      =  0.0000
    Residual |  89870.8488  1229  73.1251821           R-squared     =  0.1815
-------------+------------------------------           Adj R-squared =  0.1788
       Total |  109800.806  1233  89.0517486           Root MSE      =  8.5513

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x3 |  -1.771296   .4852945    -3.65   0.000    -2.723393   -.8191982
          x4 |   4.056543   .1996468    20.32   0.000     3.664856    4.448229
          x1 |   .9336217     .25295     3.69   0.000     .4373601    1.429883
          x2 |   2.284979   .2538783     9.00   0.000     1.786896    2.783062
       _cons |  -5.534924   .2459335   -22.51   0.000     -6.01742   -5.052428
------------------------------------------------------------------------------
Instrumented:  x3 x4
Instruments:   x1 x2 z1 z2 z3
------------------------------------------------------------------------------
. est sto ivest
. reg y x1 x2 x3 x4
      Source |       SS       df       MS              Number of obs =    1234
-------------+------------------------------           F(  4,  1229) =  314.22
       Model |  55516.0702     4  13879.0176           Prob > F      =  0.0000
    Residual |  54284.7358  1229   44.169842           R-squared     =  0.5056
-------------+------------------------------           Adj R-squared =  0.5040
       Total |  109800.806  1233  89.0517486           Root MSE      =   6.646

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   1.059869   .1963066     5.40   0.000     .6747357    1.445002
          x2 |   2.661992   .1938744    13.73   0.000     2.281631    3.042354
          x3 |    .785674   .1315465     5.97   0.000     .5275934    1.043755
          x4 |   2.795138   .0945471    29.56   0.000     2.609647     2.98063
       _cons |  -5.416561   .1896704   -28.56   0.000    -5.788675   -5.044447
------------------------------------------------------------------------------
. est sto olsest
. hausman ivest olsest
                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     ivest        olsest       Difference          S.E.
-------------+----------------------------------------------------------------
          x3 |   -1.771296      .785674        -2.55697        .4671255
          x4 |    4.056543     2.795138        1.261404          .17584
          x1 |    .9336217     1.059869       -.1262471        .1595225
          x2 |    2.284979     2.661992        -.377013        .1639113
------------------------------------------------------------------------------
                          b = consistent under Ho and Ha; obtained from ivreg
           B = inconsistent under Ha, efficient under Ho; obtained from regress

    Test:  Ho:  difference in coefficients not systematic

                  chi2(4) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =       88.01
                Prob>chi2 =      0.0000
Unit 5: Systems of Equations
Endogenous: the value of yi is determined in part by other variables in the system.
Exogenous: the value of xi is determined by unrelated, outside forces.
Simultaneity bias
Model: y1i = β1 + β2y2i + …;  y2i = γ1 + γ2y1i + …
OLS estimates of β and γ are biased.
2SLS for systems of equations with endogenous regressors
Model: y1i = β1 + β2y2i + β3x1i + β4x2i + e1i;  y2i = γ1 + γ2y1i + γ3x1i + γ4x3i + e2i.
Some overlap in exogenous explanatory variables.
Some endogenous variables are predictors of others.
Two or more equations.
Assumptions: all x variables are exogenous.
“Identification”: to measure the effect of yi on other outcomes, must have one variable in this equation that does not appear in others.
Technique:
1. Regress each endogenous explanatory variable yi on its exogenous determinants; obtain predicted values, ŷi.
2. Regress each outcome on its exogenous explanatory determinants and the predicted values of the endogenous variables.
Note: incorrect standard errors.
Inefficiency: does not take advantage of correlation between an individual’s unobservables in different equations.
Seemingly Unrelated Regression (SUR) for systems with exogenous regressors
Model: y1i = x1iβ1 + e1i;  y2i = x2iβ2 + e2i;  etc.
Some (or complete) overlap in explanatory variables.
Two or more equations.
OLS estimation of each equation separately is unbiased, but inefficient.
Motivation for SUR: accounting for cross-equation correlation in unobservables can yield more precise estimates.
Technique: FGLS.
3SLS for systems of equations with endogenous regressors
Model: y1i = β1 + β2y2i + β3x1i + β4x2i + e1i;  y2i = γ1 + γ2y1i + γ3x1i + γ4x3i + e2i.
Some overlap in exogenous explanatory variables.
Some endogenous variables are predictors of others.
Two or more equations.
Motivation: Correct for simultaneity bias, plus improve precision.
Technique: 2SLS combined with FGLS.
“Identification”: to measure the effect of yi on other outcomes, must have one variable in this equation that does not appear in others.
Efficiency: most efficient estimator.
Examples of Stata commands:
. reg3 (y1 = x1 x2) (y2 = x1 x3), sur
Seemingly unrelated regression
----------------------------------------------------------------------
Equation          Obs  Parms        RMSE    "R-sq"       chi2        P
----------------------------------------------------------------------
y1               1000      2    14.30263    0.0384      44.97   0.0000
y2               1000      2    14.02793    0.0783     106.18   0.0000
----------------------------------------------------------------------

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y1           |
          x1 |   1.991289   .4480522     4.44   0.000     1.113123    2.869455
          x2 |  -1.972894   .3969844    -4.97   0.000    -2.750969   -1.194818
       _cons |   5.011258   .4524773    11.08   0.000     4.124419    5.898097
-------------+----------------------------------------------------------------
y2           |
          x1 |    -3.0162   .4393652    -6.86   0.000    -3.877339    -2.15506
          x3 |   3.050422   .3966516     7.69   0.000     2.272999    3.827845
       _cons |   6.868261   .4444411    15.45   0.000     5.997172    7.739349
------------------------------------------------------------------------------
. reg3 (y1 = x1 x2 y2) (y2 = x1 x3 y1)
Three-stage least-squares regression
----------------------------------------------------------------------
Equation          Obs  Parms        RMSE    "R-sq"       chi2        P
----------------------------------------------------------------------
y1               1000      3    26.71598   -0.1711      13.76   0.0032
y2               1000      3    26.19243   -1.8229      59.07   0.0000
----------------------------------------------------------------------

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y1           |
          x1 |   2.549894   1.563719     1.63   0.103    -.5149385    5.614726
          x2 |  -3.443286   1.023338    -3.36   0.001    -5.448992   -1.437579
          y2 |   .7138822    .264056     2.70   0.007      .196342    1.231422
       _cons |   10.61793   1.006379    10.55   0.000     8.645468      12.5904
-------------+----------------------------------------------------------------
y2           |
          x1 |  -5.942887   .9132913    -6.51   0.000    -7.732905   -4.152869
          x3 |   5.445124   1.226737     4.44   0.000     3.040764    7.849484
          y1 |  -.9107926   .4018143    -2.27   0.023    -1.698334     -.123251
       _cons |   12.70989   4.843172     2.62   0.009     3.217444    22.20233
------------------------------------------------------------------------------
Endogenous variables:  y1 y2
Exogenous variables:   x1 x2 x3
------------------------------------------------------------------------------
Unit 6: Policy Analysis
Before-and-after comparisons
Advantages: simplicity
Disadvantage: natural history, natural trend.
Controlling for effects of time
Difference-in-Difference estimation
“Counterfactual”
Natural experiments
Exogeneity requirements
Criticisms
Serial correlation
Exogeneity of policy
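The difference-in-difference idea above can be sketched numerically (Python/NumPy; simulated treatment and period indicators, not from the guide):

```python
import numpy as np

rng = np.random.default_rng(11)
n = 4000
treated = rng.integers(0, 2, n)   # group indicator
post = rng.integers(0, 2, n)      # period indicator
effect = 3.0                      # true treatment effect
# Group and time effects mimic "natural history" and group differences
y = (1.0 + 2.0 * treated + 1.5 * post
     + effect * treated * post + rng.normal(size=n))

# DiD: (treated post-pre change) minus (control post-pre change)
did = ((y[(treated == 1) & (post == 1)].mean()
        - y[(treated == 1) & (post == 0)].mean())
       - (y[(treated == 0) & (post == 1)].mean()
          - y[(treated == 0) & (post == 0)].mean()))
print(did)  # recovers the treatment effect, netting out group and time trends
```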
Unit 7: Panel Data Models
Panel data model: yit = xitβ + eit;  eit = ci + uit.
Repeated observations of same individuals over time.
Permanent component and transitory component to unobservable.
Strict exogeneity: E[uit | xis] = 0.
Error structure: E[ei ei′] = IT×T σu² + 1T×T σc².
Pooled OLS (POLS): treats all observations as if from distinct individuals.
Unbiased if E[eit | xit] = 0; requires strict exogeneity and E[ci | xit] = 0.
Variance calculation incorrect
“Clustered” standard errors
Inefficient, because of cross-correlation between unobservables.
Random effects (RE): GLS with ΩRE⁻¹ = E[ei ei′] = IT×T σu² + 1T×T σc².
Estimator: β̂RE = (X′ΩREX)⁻¹(X′ΩREY).
Unbiased if E[eit | xit] = 0; requires strict exogeneity and E[ci | xit] = 0.
Most precise.
Fixed effects: OLS with transformed data.
Fixed effects transformation: xitFE = xit − x̄i,  yitFE = yit − ȳi.
Estimator: β̂FE = (XFE′XFE)⁻¹(XFE′YFE).
Unbiased if E[eit | xit] = 0; requires only strict exogeneity.
Possibly inefficient.
First differences: OLS with differenced data.
First-differences transformation: Δxit = xit − xi,t−1,  Δyit = yit − yi,t−1.
Estimator: β̂FD = (ΔX′ΔX)⁻¹(ΔX′ΔY).
Unbiased if E[Δeit | Δxit] = 0; requires (less than) strict exogeneity.
Possibly inefficient.
Dummy variables: OLS with a dummy variable for each individual.
Equivalent to Fixed Effects.
Relaxation of strict exogeneity
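A numerical sketch of the pooled-OLS bias and the fixed effects (within) transformation (Python/NumPy; simulated panel, not from the guide):

```python
import numpy as np

rng = np.random.default_rng(5)
N, T = 500, 4
c = rng.normal(size=N)                    # permanent individual effect c_i
x = rng.normal(size=(N, T)) + c[:, None]  # x correlated with c_i
y = 2.0 * x + c[:, None] + rng.normal(size=(N, T))

# Pooled OLS: biased because E[c_i | x_it] != 0
b_pols = np.cov(x.ravel(), y.ravel())[0, 1] / np.var(x.ravel(), ddof=1)

# Within (fixed effects) transformation: subtract individual means
x_fe = x - x.mean(axis=1, keepdims=True)
y_fe = y - y.mean(axis=1, keepdims=True)
b_fe = (x_fe.ravel() @ y_fe.ravel()) / (x_fe.ravel() @ x_fe.ravel())
print(b_pols, b_fe)  # pooled slope biased away from 2; FE slope close to 2
```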
Examples of Stata commands
Dataset originally in “wide” format: 1000 observations of the variables y73, y74, y75, xa73, xa74, xa75, xb73, xb74, xb75, xc (time-invariant), and id (identifier).
. reshape long y xa xb, i(id) j(year)
(note: j = 73 74 75)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                     1000   ->   3000
Number of variables                  11   ->      6
j variable (3 values)                     ->   year
xij variables:
                            y73 y74 y75   ->   y
                         xa73 xa74 xa75   ->   xa
                         xb73 xb74 xb75   ->   xb
-----------------------------------------------------------------------------
. reg y xa xb xc, cluster(id)

Linear regression                                      Number of obs =    3000
                                                       F(  3,   999) = 1221.84
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.5720
                                                       Root MSE      =  5.0551

                                  (Std. Err. adjusted for 1000 clusters in id)
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          xa |   2.149723   .0918978    23.39   0.000     1.969388    2.330058
          xb |  -3.151929   .0910431   -34.62   0.000    -3.330586   -2.973271
          xc |   4.564069   .1006943    45.33   0.000     4.366472    4.761665
       _cons |   1.054455   .0942891    11.18   0.000     .8694272    1.239482
------------------------------------------------------------------------------
. xtreg y xa xb xc, re i(id)
Random-effects GLS regression                   Number of obs      =      3000
Group variable: id                              Number of groups   =      1000

R-sq:  within  = 0.3781                         Obs per group: min =         3
       between = 0.7303                                        avg =       3.0
       overall = 0.5720                                        max =         3

Random effects u_i ~ Gaussian                   Wald chi2(3)       =   3915.38
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          xa |   2.151934   .0920012    23.39   0.000     1.971615    2.332253
          xb |  -3.150849   .0906708   -34.75   0.000     -3.32856   -2.973137
          xc |   4.564102   .0970299    47.04   0.000     4.373927    4.754278
       _cons |   1.054451   .0942954    11.18   0.000     .8696358    1.239267
-------------+----------------------------------------------------------------
     sigma_u |  .73494385
     sigma_e |  5.0024879
         rho |  .02112818   (fraction of variance due to u_i)
------------------------------------------------------------------------------
. est sto reest
. xtreg y xa xb xc, fe i(id)
note: xc omitted because of collinearity

Fixed-effects (within) regression               Number of obs      =      3000
Group variable: id                              Number of groups   =      1000

R-sq:  within  = 0.3782                         Obs per group: min =         3
       between = 0.1315                                        avg =       3.0
       overall = 0.2423                                        max =         3

                                                F(2,1998)          =    607.55
corr(u_i, Xb)  = -0.0056                        Prob > F           =    0.0000

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          xa |   2.204303   .1121128    19.66   0.000     1.984432    2.424173
          xb |  -3.125336    .108589   -28.78   0.000    -3.338295   -2.912376
          xc |  (omitted)
       _cons |   .9069813   .0913603     9.93   0.000     .7278098    1.086153
-------------+----------------------------------------------------------------
     sigma_u |  5.3422733
     sigma_e |  5.0024879
         rho |  .53281074   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(999, 1998) =     1.06           Prob > F = 0.1326
. est sto feest
. hausman feest reest
                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     feest        reest        Difference          S.E.
-------------+----------------------------------------------------------------
          xa |    2.204303     2.151934        .0523684        .0640706
          xb |   -3.125336    -3.150849        .0255129        .0597526
------------------------------------------------------------------------------
                          b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(2) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =        0.82
                Prob>chi2 =      0.6640
. reshape wide xa xb y, i(id) j(year)
(note: j = 73 74 75)

Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                     3000   ->   1000
Number of variables                   8   ->     13
j variable (3 values)              year   ->   (dropped)
xij variables:
                                     xa   ->   xa73 xa74 xa75
                                     xb   ->   xb73 xb74 xb75
                                      y   ->   y73 y74 y75
-----------------------------------------------------------------------------
. gen dy74 = y74-y73
. gen dy75 = y75-y74
. gen dxa74 = xa74-xa73
. gen dxa75 = xa75-xa74
. gen dxb74 = xb74-xb73
. gen dxb75 = xb75-xb74
. reshape long xa xb y dy dxa dxb, i(id) j(year)
(note: j = 73 74 75)
(note: dy73 not found)
(note: dxa73 not found)
(note: dxb73 not found)

Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                     1000   ->   3000
Number of variables                  19   ->     11
j variable (3 values)                     ->   year
xij variables:
                         xa73 xa74 xa75   ->   xa
                         xb73 xb74 xb75   ->   xb
                            y73 y74 y75   ->   y
                         dy73 dy74 dy75   ->   dy
                      dxa73 dxa74 dxa75   ->   dxa
                      dxb73 dxb74 dxb75   ->   dxb
-----------------------------------------------------------------------------
. reg dy dxa dxb
      Source |       SS       df       MS              Number of obs =    2000
-------------+------------------------------           F(  2,  1997) =  414.78
       Model |  47512.8571     2  23756.4285           Prob > F      =  0.0000
    Residual |  114377.487  1997  57.2746556           R-squared     =  0.2935
-------------+------------------------------           Adj R-squared =  0.2928
       Total |  161890.344  1999   80.985665           Root MSE      =   7.568

------------------------------------------------------------------------------
          dy |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         dxa |   2.930953   .1228469    23.86   0.000     2.690032    3.171875
         dxb |  -2.665305    .120511   -22.12   0.000    -2.901646   -2.428965
       _cons |  -.1314784   .1692294    -0.78   0.437    -.4633631    .2004063
------------------------------------------------------------------------------
Unit 8: Discrete and Limited Dependent Variables
Maximum-likelihood estimation
Philosophy: find the parameters that make the observation most likely.
General technique
1. Select a probability distribution to model the phenomenon.
2. Write out the likelihood of observing the outcome as a function of the unknown parameters.
3. Find the values of the parameters that make the observed data most probable.
Examples
Binomial outcome
Linear model with normally distributed unobservables
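The three-step technique can be sketched in Python for the binomial example (the numbers below are invented for illustration):

```python
import math

# Step 1: model k successes in n trials as binomial(n, p).
k, n = 37, 100   # illustrative data

# Step 2: log-likelihood as a function of the unknown parameter p
# (log of C(n,k) p^k (1-p)^(n-k), dropping the constant binomial term).
def log_likelihood(p):
    return k * math.log(p) + (n - k) * math.log(1 - p)

# Step 3: maximize. A grid search suffices here; the analytic MLE is
# the sample frequency k/n.
p_hat = max((i / 1000 for i in range(1, 1000)), key=log_likelihood)
print(p_hat)   # 0.37
```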
Binary outcome models: yi = 0 or yi = 1.
Objects of interest: what we want to know.
Predicted probabilities: P[yi = 1 | xi]
Marginal effects: ∂P[yi = 1 | xi] / ∂xi
Linear probability model: OLS with binary outcome yi.
Advantages
Simplicity: easily calculated.
Ease of interpretation: the β̂ are estimated marginal effects; xi β̂ is the predicted probability.
Permits IV and panel data techniques.
Disadvantages
Heteroskedasticity: given xi, ei takes one of two values.
Implausible predicted probabilities: P[yi = 1 | xi] can be less than 0 or greater than 1.
Inconsistency with models.
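The implausible-prediction problem is easy to demonstrate with a closed-form simple regression on toy binary data (invented for illustration):

```python
# Toy binary-outcome data: OLS intercept and slope in closed form.
x = [0, 1, 2, 3, 4, 5]
y = [0, 0, 0, 1, 1, 1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
     / sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

# b1 is the estimated marginal effect; b0 + b1*x the "predicted probability"
print(b0 + b1 * 5)   # prediction above 1
print(b0 + b1 * 0)   # prediction below 0
```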
Maximum likelihood models
“Latent value” formulation: yi = 1 if xiβ + ei > 0, and yi = 0 otherwise.
Likelihood function: P[yi | xi] = [1 − CDF(−xiβ)]^yi · [CDF(−xiβ)]^(1−yi)
Choice of distribution for ei
Probit model: for normal distribution.
Marginal effects
Logit
Odds ratios
Interpretation of estimated coefficients
Not marginal effects
Sign and relative magnitude only
Marginal effects at the average, ∂P[y = 1 | x] / ∂x (probit)
Odds ratio: P[y = 1 | x + 1] / P[y = 1 | x] (logit)
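A numerical check of the logit interpretation (coefficients below are made up): in a logit, the odds P[y = 1 | x] / P[y = 0 | x] equal exp(xβ), so a one-unit increase in x multiplies the odds by exp(β), no matter where x starts.

```python
import math

def logit_p(xb):
    """P[y = 1 | x] under the logit model."""
    return 1 / (1 + math.exp(-xb))

def odds(p):
    return p / (1 - p)

# Illustrative coefficients (invented)
beta0, beta1, x = -0.5, 0.8, 1.3
ratio = odds(logit_p(beta0 + beta1 * (x + 1))) / odds(logit_p(beta0 + beta1 * x))
print(ratio, math.exp(beta1))   # the two agree
```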
Instrumental variables: the “forbidden regression”
(IV probit)
Logistic regression
Multiple discrete outcomes
Multiple unordered outcomes: multinomial logit
Interpretation
Multiple ordered/ranked outcomes, no scale: ordered probit
Count data: Poisson regression
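Poisson regression also fits the maximum-likelihood template: model the conditional mean as E[y | x] = exp(xβ) and maximize the Poisson log-likelihood. A minimal sketch (data values below are made up):

```python
import math

def poisson_loglik(beta, xs, ys):
    """Log-likelihood for Poisson regression with E[y | x] = exp(x*beta)."""
    ll = 0.0
    for x, y in zip(xs, ys):
        lam = math.exp(x * beta)   # conditional mean
        # log Poisson pmf: y*log(lam) - lam - log(y!)
        ll += y * math.log(lam) - lam - math.lgamma(y + 1)
    return ll

# At beta = 0, lam = 1 for every observation.
print(poisson_loglik(0.0, [1.0, 2.0], [0, 3]))
```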
Censored regression
Tobit
Sample selection
Heckman
Examples of Stata commands:
. probit y x1 x2

Iteration 0:   log likelihood = -1387.1993
Iteration 1:   log likelihood =  -1260.593
Iteration 2:   log likelihood = -1260.3134
Iteration 3:   log likelihood = -1260.3134

Probit regression                                 Number of obs   =       2134
                                                  LR chi2(2)      =     253.77
                                                  Prob > chi2     =     0.0000
Log likelihood = -1260.3134                       Pseudo R2       =     0.0915

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .4028099    .031099    12.95   0.000      .341857    .4637628
          x2 |  -.2626652   .0297566    -8.83   0.000    -.3209871   -.2043433
       _cons |   .4113152   .0292439    14.07   0.000     .3539983    .4686322
------------------------------------------------------------------------------
. logit y x1 x2

Iteration 0:   log likelihood = -1387.1993
Iteration 1:   log likelihood = -1262.4423
Iteration 2:   log likelihood = -1261.0108
Iteration 3:   log likelihood = -1261.0105
Iteration 4:   log likelihood = -1261.0105

Logistic regression                               Number of obs   =       2134
                                                  LR chi2(2)      =     252.38
                                                  Prob > chi2     =     0.0000
Log likelihood = -1261.0105                       Pseudo R2       =     0.0910

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |    .665104   .0529186    12.57   0.000     .5613854    .7688226
          x2 |  -.4295009   .0497464    -8.63   0.000    -.5270021   -.3319997
       _cons |   .6753662   .0490636    13.77   0.000     .5792033    .7715291
------------------------------------------------------------------------------
. mlogit y x1 x2 x3

Iteration 0:   log likelihood = -3433.7503
Iteration 1:   log likelihood = -3295.8062
Iteration 2:   log likelihood = -3289.9601
Iteration 3:   log likelihood = -3289.9381
Iteration 4:   log likelihood = -3289.9381

Multinomial logistic regression                   Number of obs   =       3313
                                                  LR chi2(6)      =     287.62
                                                  Prob > chi2     =     0.0000
Log likelihood = -3289.9381                       Pseudo R2       =     0.0419

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1            |
          x1 |   -.222846   .0407254    -5.47   0.000    -.3026663   -.1430256
          x2 |   .0026875   .0395999     0.07   0.946     -.074927     .080302
          x3 |   -.032302   .0392481    -0.82   0.410    -.1092268    .0446228
       _cons |  -.1885108   .0391199    -4.82   0.000    -.2651843   -.1118373
-------------+----------------------------------------------------------------
2            |
          x1 |  -.7996707   .0541183   -14.78   0.000    -.9057406   -.6936007
          x2 |  -.1135117   .0509981    -2.23   0.026    -.2134661   -.0135573
          x3 |  -.3267343    .051504    -6.34   0.000    -.4276803   -.2257884
       _cons |  -1.070102   .0547543   -19.54   0.000    -1.177419   -.9627856
-------------+----------------------------------------------------------------
3            |  (base outcome)
------------------------------------------------------------------------------