FINAL MEETING – OTHER METHODS Development Workshop

General conclusions on causal analyses

Magic tool of "ceteris paribus"
– Regression is ceteris paribus by definition
– But the data need not be: they are just a subsample of the general population, and many other things confound

Causal effects, i.e. cause and effect
– Propensity Score Matching
– Regression Discontinuity
– Fixed Effects
– Instrumental Variables

If we cannot experiment…

Cross-sectional data:
– "Regression Discontinuity Design"
– "Propensity Score Matching"
– IV

Panel data:
– Before-After Estimators
– Difference in Difference Estimators (DiD)
– "Propensity Score Matching" + DiD

Problems with causal inference

[Diagram. Nodes: Confounding influence (environment); Treatment; Effect; Observables; Unobservables]

Instrumental Variables solution…


[Diagram. Nodes: Instrumental variable(s); Treatment; Outcome; Observed factor; Unobserved factor]

Fixed Effects Solution… (DiD does pretty much the same)


[Diagram. Nodes: Treatment; Outcome; Fixed influences; Observed factor; Unobserved factor]
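The idea that fixed influences drop out once we compare changes rather than levels can be shown with a minimal two-group, two-period sketch. It is written in Python purely for illustration (the workshop itself uses Stata), and every number and name in it (`units`, `TIME_TREND`, `TRUE_EFFECT`) is a hypothetical assumption, not data from the workshop.

```python
# Hypothetical two-group, two-period panel without noise (illustration only).
TRUE_EFFECT = 2.0   # assumed treatment effect
TIME_TREND = 1.0    # assumed common change between period 0 and period 1

# (group, fixed unit effect): treated units start from a higher level,
# which is exactly the kind of fixed influence that confounds a naive comparison.
units = [("treated", 10.0), ("treated", 12.0), ("control", 4.0), ("control", 6.0)]


def outcome(group, unit_effect, period):
    """Fixed unit effect + common trend + effect (treated group, period 1 only)."""
    treated_now = group == "treated" and period == 1
    return unit_effect + TIME_TREND * period + (TRUE_EFFECT if treated_now else 0.0)


def group_mean(group, period):
    """Average outcome of one group in one period."""
    values = [outcome(g, u, period) for g, u in units if g == group]
    return sum(values) / len(values)


# Naive comparison of levels after treatment: mixes the effect with fixed influences.
naive_after = group_mean("treated", 1) - group_mean("control", 1)

# Difference in differences: change among treated minus change among controls.
did = (group_mean("treated", 1) - group_mean("treated", 0)) - (
    group_mean("control", 1) - group_mean("control", 0)
)

print("naive after-period comparison:", naive_after)
print("difference in differences:    ", did)
```

In this constructed example the comparison of levels is far from the assumed effect, while the difference in differences recovers it, because both the fixed unit effects and the common trend cancel.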

Propensity Score Matching


[Diagram. Nodes: Treatment; Outcome; Treatment; Observed factor; Unobserved factor]

Regression Discontinuity Design


[Diagram. Nodes: Treatment; Effect; Group that is key for this policy; Observables; Unobservables]

A motivating story

Today women in Poland have on average 1.7 children
About 50 years ago, women had 2.8 children
Today's women are 6 times more educated than 50 years ago – is the drop from 2.8 to 1.7 an effect of this educational change?

Natural experiment: in 1960 the schooling obligation was extended by one year (11 to 12 years).
– THE SAME women: those born just before 1953 went to primary and secondary schools a year shorter than those born after 1953
– THE SAME = ?

RD allows us to compare fertility (with individual characteristics) for women born around 1953


Idea
– Focus your analyses on a group for which treatment was random (or rather: independent)

How to do it?
– Example: weaker students have lower grades, but are also frequently "delayed" to repeat courses/years; if we give them extra classes, better students will outperform them anyway, so how to test if extra classes help?
– RDD will compare the performance of students just above and just below the "threshold", so quite similar ones
– RDD will only work if people cannot "prevent" or "encourage" treatment by relocating themselves around the "threshold"
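The extra-classes example can be sketched in a few lines of Python (used here only for illustration; the workshop itself works in Stata). The data are simulated, and all names and numbers (`entry_score`, `final_grade`, `CUTOFF`, `TRUE_EFFECT`, the 2-point window) are hypothetical assumptions: students with an entry score below the threshold get extra classes, and the final grade also rises with the entry score.

```python
import random


def mean(values):
    return sum(values) / len(values)


def local_difference(score, outcome, cutoff, bandwidth):
    """Mean outcome just below the cutoff minus mean outcome just above it.

    Assumption: students with score < cutoff are the ones who get extra classes.
    """
    below = [y for s, y in zip(score, outcome) if cutoff - bandwidth <= s < cutoff]
    above = [y for s, y in zip(score, outcome) if cutoff <= s <= cutoff + bandwidth]
    return mean(below) - mean(above)


random.seed(1)
TRUE_EFFECT = 5.0  # hypothetical gain from extra classes
CUTOFF = 50.0      # hypothetical threshold on the entry score

entry_score = [random.uniform(0, 100) for _ in range(20000)]
final_grade = [
    20 + 0.6 * s + (TRUE_EFFECT if s < CUTOFF else 0.0) + random.gauss(0, 3)
    for s in entry_score
]

# Naive comparison: all students with extra classes vs. all students without.
naive = mean([y for s, y in zip(entry_score, final_grade) if s < CUTOFF]) - mean(
    [y for s, y in zip(entry_score, final_grade) if s >= CUTOFF]
)

# RDD-style comparison: only students within 2 points of the threshold.
local = local_difference(entry_score, final_grade, CUTOFF, bandwidth=2.0)

print(f"naive difference: {naive:.2f}")
print(f"local difference: {local:.2f} (assumed true effect {TRUE_EFFECT})")
```

The naive difference is strongly negative, because better students outperform weaker ones anyway. The local difference has the right sign and is much closer to the assumed effect. It is still somewhat off, because grades keep rising with the entry score inside the window. That is the reason RD estimators usually fit a local regression on each side instead of comparing plain means.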


Advantages:
– Really marginal effect
– Causal, if RDD is well applied

Disadvantages:
– Sample size largely limited
– Only "local" character of estimates (marginal ≠ average)

Problems:
– How do we know how far away from the threshold we can go (bandwidth)?
– How do we know if the design is OK?

Regression Discontinuity Design: Application

– Trade-off between a narrow "bandwidth" (for the independence assumption) and a wide "bandwidth" (to increase sample size)
– One can try to find it empirically ("fuzzy" RD design)
– Y is the effect, p is the treatment probability; $Y^{+}$ and $p^{+}$ are their values just above the cut-off, $Y^{-}$ and $p^{-}$ their values just below it:

$$\text{effect at the cut-off} = \frac{Y^{+} - Y^{-}}{p^{+} - p^{-}}$$
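In a "fuzzy" design, the jump in the outcome at the cut-off is divided by the jump in the treatment probability. Writing Y+ and p+ for the values just above the cut-off and Y- and p- for those just below it, the ratio is (Y+ - Y-) / (p+ - p-). The Python sketch below (for illustration only; the workshop uses Stata) applies this to simulated data. The take-up rates, the bandwidth and all variable names are hypothetical assumptions, and plain means within the window stand in for the local regressions a real estimator would use.

```python
import random


def mean(values):
    return sum(values) / len(values)


def wald_estimate(z, treated, y, cutoff, bandwidth):
    """Fuzzy RD: jump in the outcome divided by the jump in treatment probability."""
    above = [i for i, zi in enumerate(z) if cutoff <= zi <= cutoff + bandwidth]
    below = [i for i, zi in enumerate(z) if cutoff - bandwidth <= zi < cutoff]
    y_plus = mean([y[i] for i in above])
    y_minus = mean([y[i] for i in below])
    p_plus = mean([treated[i] for i in above])
    p_minus = mean([treated[i] for i in below])
    return (y_plus - y_minus) / (p_plus - p_minus)


random.seed(2)
TRUE_EFFECT = 2.0  # hypothetical effect of treatment on the outcome

z = [random.uniform(-1, 1) for _ in range(20000)]
# Take-up is not deterministic: 80% above the cut-off, 20% below (assumed values).
treated = [1 if random.random() < (0.8 if zi >= 0 else 0.2) else 0 for zi in z]
y = [1.0 + TRUE_EFFECT * d + random.gauss(0, 1) for d in treated]

estimate = wald_estimate(z, treated, y, cutoff=0.0, bandwidth=0.5)
print(f"Wald estimate: {estimate:.2f} (assumed true effect {TRUE_EFFECT})")
```

When treatment jumps from 0 to 1 exactly at the cut-off, the denominator equals 1 and the ratio reduces to the "sharp" comparison of outcomes.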


How to do this in STATA?

First – download the package: ssc install rd
Second – define your model
– rd $out treatment $in [if] [in] [weight] [, options]
– (outcome, optional treatment variable, assignment variable; variables are separated by spaces, the comma only starts the options)
Third – there are some options
– mbw(numlist): multiples of the "bandwidth" in percent (default: "100 50 200", which means we always do 50%, 100% and 200%)
– z0(real): sets the cutoff Z0 (treatment)
– ddens: asks for extra estimation of discontinuities in the Z density
– graph: automatically draws the graphs we've seen
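To make the role of the bandwidth multiples more concrete, here is a rough Python sketch (for illustration only, not the implementation of the Stata package) of a local linear RD estimate with a triangular kernel, repeated at 50%, 100% and 200% of a base bandwidth. The simulated data, the base bandwidth of 0.3 and the helper names `boundary_fit` and `rd_estimate` are assumptions made for this example.

```python
import random


def boundary_fit(z, y, cutoff, bandwidth, side):
    """Triangular-kernel weighted linear fit on one side of the cutoff.

    Returns the fitted outcome at the cutoff itself.
    """
    pts = []
    for zi, yi in zip(z, y):
        d = zi - cutoff
        on_side = d >= 0 if side == "right" else d < 0
        w = 1.0 - abs(d) / bandwidth  # triangular kernel weight
        if on_side and w > 0:
            pts.append((d, yi, w))
    sw = sum(w for _, _, w in pts)
    d_bar = sum(w * d for d, _, w in pts) / sw
    y_bar = sum(w * yi for _, yi, w in pts) / sw
    slope = sum(w * (d - d_bar) * (yi - y_bar) for d, yi, w in pts) / sum(
        w * (d - d_bar) ** 2 for d, _, w in pts
    )
    return y_bar - slope * d_bar  # intercept = fitted value at d = 0


def rd_estimate(z, y, cutoff, bandwidth):
    """Jump at the cutoff: right-hand fit minus left-hand fit."""
    right = boundary_fit(z, y, cutoff, bandwidth, "right")
    left = boundary_fit(z, y, cutoff, bandwidth, "left")
    return right - left


random.seed(3)
TRUE_JUMP = 1.5  # hypothetical discontinuity at z = 0
z = [random.uniform(-1, 1) for _ in range(4000)]
y = [2 + 3 * zi + (TRUE_JUMP if zi >= 0 else 0.0) + random.gauss(0, 0.2) for zi in z]

BASE_BANDWIDTH = 0.3  # assumed here; the Stata command picks its own default
estimates = {
    m: rd_estimate(z, y, 0.0, BASE_BANDWIDTH * m / 100) for m in (50, 100, 200)
}
for m, est in estimates.items():
    print(f"{m:>3}% of base bandwidth: estimated jump = {est:.3f}")
```

If the estimates at the three bandwidths are close to each other, the result does not hinge on one particular bandwidth choice. Large differences between them are a warning sign.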

Sample results in STATA - data

Contains data from votex.dta
  obs:           349                          102nd Congress
 vars:            20                          5 Nov 2007 17:02
 size:        39,437 (99.9% of memory free)

              storage  display
variable name   type   format      variable label
fips            byte   %8.0g       fips State code
district        byte   %8.0g       Congr district
d               double %10.0g      Dem vote share minus .5
win             byte   %9.0g       Dem Won Race
lne             float  %9.0g       Log fed expenditure in district
i               byte   %9.0g       Incumbent
votingpop       long   %12.0g      Voting Age Population
votpop          double %10.0g      Voting Age Population Share
populatn        long   %12.0g      Population
black           double %12.0g      Black Population Share
blucllr         double %12.0g      Blue-collar Population Share
farmer          double %12.0g      Farmer Population Share
fedwrkr         double %12.0g      Fed Worker Population Share
forborn         double %12.0g      Foreign Born Population Share
manuf           double %12.0g      Manufactur Population Share
unemplyd        double %12.0g      Unemp Population Share
union           float  %9.0g       Unionized Population Share
urban           double %12.0g      Urban Population Share
veterans        double %12.0g      Veteran Population Share
ranwin          byte   %8.0g

Sorted by:  fips  district
     Note:  dataset has changed since last saved

Output from STATA

. use votex
(102nd Congress)

. rd lne d, gr mbw(100)
Two variables specified; treatment is
assumed to jump from zero to one at Z=0.

 Assignment variable Z is d
 Treatment variable X_T unspecified
 Outcome variable y is lne

Command used for graph: lpoly; Kernel used: triangle (default)
Bandwidth: .29287776; loc Wald Estimate: -.07739553
Estimating for bandwidth .29287775925349

         lne |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
       lwald |  -.0773955   .1056062    -0.73   0.464      -.28438    .1295889

Output from STATA - graph

[Figure: graph produced by rd. Recoverable labels: Log fed expenditure in district; Bandwidth .29287775925349; axis ticks 20 to 23 and -.2 to .6]

Output from STATA – "fuzzy" version

gen byte ranwin=cond(uniform()<.1,1-win,win)
rd lne ranwin d, mbw(25(25)300) bdep ox

[Figure: estimated effect (Est) with confidence interval (CI) plotted against bandwidth; bandwidth axis from 7.3e-02 to .88 with .29 marked; effect axis from -.8 to .2]

Quantile regressions

One last thing

A motivating story

[Figure: time series for deciles 1 to 9 and the average, in zł (axis from 0 zł to 4 500 zł), yearly from 1992 to 2007]

Some basic "doubts" of an empirical economist…

Compare similar to similar
Keep statistical properties
Understand beyond "average x"
Understand (and be independent of) "outliers"

Robust estimators

First flavour of robust – regression with robust option
– Helps if the problem is not systematic
– Does not help if the problem is the nature of the process (e.g. heterogeneity)

Second flavour of robust – nonparametric estimators
– Complex from a mathematical point of view
– Takes longer to compute
– But very flexible
=> Koenker (and his followers)

How to do this in STATA?

Estimate at median
– qreg y $in

Estimate at any other percentile
– qreg y $in, quantile(q) where q is your percentile

Estimate differences between different percentiles
– iqreg y $in, quantile(.25 .75) reps(100) + additionally one may bootstrap
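What a quantile regression minimises can be shown with a small brute-force Python sketch (for illustration only; it is not how Stata's qreg is implemented). It assumes a single regressor, and the data, including the one outlier, are made up. For one regressor an optimal line passes through two data points, so the sketch simply tries every pair and keeps the line with the smallest summed "check" loss.

```python
from itertools import combinations


def check_loss(u, q):
    """Koenker-Bassett check function: q*u for u >= 0, (q - 1)*u for u < 0."""
    return u * (q - (1.0 if u < 0 else 0.0))


def simple_qreg(x, y, q=0.5):
    """(intercept, slope) minimising the summed check loss at quantile q.

    Brute force over all pairs of points; only sensible for small samples.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line is not a regression line
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = sum(check_loss(yk - intercept - slope * xk, q) for xk, yk in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]


def ols(x, y):
    """Ordinary least squares (intercept, slope) for comparison."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    slope = sum((xk - x_bar) * (yk - y_bar) for xk, yk in zip(x, y)) / sum(
        (xk - x_bar) ** 2 for xk in x
    )
    return y_bar - slope * x_bar, slope


# Made-up data: y = 1 + 2x exactly, except for one large outlier at the end.
x = [float(k) for k in range(10)]
y = [1.0 + 2.0 * xk for xk in x]
y[-1] = 100.0

median_fit = simple_qreg(x, y, q=0.5)
upper_fit = simple_qreg(x, y, q=0.9)
ols_fit = ols(x, y)

print("median regression (intercept, slope):", median_fit)
print("0.9 quantile      (intercept, slope):", upper_fit)
print("OLS               (intercept, slope):", ols_fit)
```

In this constructed example the median regression stays on the line followed by nine of the ten points, while OLS is pulled towards the outlier. The 0.9 quantile line does reach for the outlier, which is expected: with ten observations, one point is the top 10% of the sample.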

Output from STATA

Iteration  1:  WLS sum of weighted deviations =  121.88268

Iteration  1: sum of abs. weighted deviations =        111
Iteration  2: sum of abs. weighted deviations =        110

Median regression                                    Number of obs =        10
  Raw sum of deviations      157 (about 14)
  Min sum of deviations      110                     Pseudo R2     =    0.2994

           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
           x |         17   3.924233     4.33   0.003     7.950702     26.0493
       _cons |          3   2.774852     1.08   0.311     -3.39882     9.39882

Output from STATA

Iteration  1:  WLS sum of weighted deviations =  80.060899

Iteration  1: sum of abs. weighted deviations =      80.66
Iteration  2: sum of abs. weighted deviations =      79.36
Iteration  3: sum of abs. weighted deviations =      78.66

.33 Quantile regression                              Number of obs =        10
  Raw sum of deviations   122.86 (about 3)
  Min sum of deviations    78.66                     Pseudo R2     =    0.3598

           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
           x |         18      4.608     3.91   0.005     7.373933    28.62607
       _cons |          1   3.258348     0.31   0.767    -6.513764    8.513764

Summarising all this crap


[Diagram. Nodes: (environment); Treatment; Effect; Observables; Unobservables]

Problems

Sample
– size
– heterogeneity

Methods
– None is perfect
– The question is important
– Nonparametric ones (kernel in PSM, or QR) are robust, but robust is not a synonym for miraculous
