FINAL MEETING – OTHER METHODS Development Workshop


General conclusions on causal analyses

The magic tool of "ceteris paribus"
– Regression is ceteris paribus by definition
– But the data need not be: they are just a subsample of the general population, and many other things confound

Causal effects, i.e. cause and effect
– Propensity Score Matching
– Regression Discontinuity
– Fixed Effects
– Instrumental Variables


If we cannot experiment…

Cross-sectional data
– "Regression Discontinuity Design"
– "Propensity Score Matching"
– IV

Panel data
– Before After Estimators
– Difference in Difference Estimators (DiD)
– "Propensity Score Matching" + DiD


Problems with causal inference

[Diagram labels: Confounding influence (environment); Treatment; Effect; Observables; Unobservables]


Instrumental Variables solution…

[Diagram labels: Confounding influence; Treatment; Outcome; Instrumental variable(s); Observed factor; Unobserved factor]


Fixed Effects Solution… (DiD does pretty much the same)

[Diagram labels: Confounding influence; Treatment; Outcome; Fixed influences; Observed factor; Unobserved factor]


Propensity Score Matching

[Diagram labels: Confounding influence; Treatment (twice); Outcome; Observed factor; Unobserved factor]


Regression Discontinuity Design

[Diagram labels: Confounding influence; Treatment; Effect; Group that is key for this policy; Observables; Unobservables]


A motivating story

Today women in Poland have on average 1.7 children
About 50 years ago, women had 2.8 children
Today's women are 6 times more educated than 50 years ago: is the drop from 2.8 to 1.7 an effect of this educational change?
Natural experiment: in 1960 the schooling obligation was extended by one year (from 11 to 12 years)
– THE SAME women born just before 1953 went to primary and secondary school a year shorter than those born after 1953
– THE SAME = ?
RD allows us to compare fertility (with individual characteristics) for women born around 1953


Regression Discontinuity Design

Idea
– Focus your analysis on a group for which treatment was random (or rather: independent)

How to do it?
– Example: weaker students have lower grades, but are also frequently "delayed" to repeat courses/years; if we give them extra classes, better students will outperform them anyway, so how do we test whether extra classes help?
– RDD compares the performance of students just above and just below the "threshold", so quite similar ones
– RDD only works if people cannot "prevent" or "encourage" treatment by relocating themselves around the "threshold"
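The idea of comparing units just above and just below the threshold can be sketched on simulated data. This is a minimal Python illustration, not the Stata `rd` command: the data-generating process, the true jump of 2.0, the bandwidth of 0.3 and the function names `boundary_fit` and `sharp_rd` are all assumptions made for the example. It fits a kernel-weighted line on each side of the cutoff and contrasts the result with a naive comparison of all treated against all untreated units.

```python
import random

def boundary_fit(points, bandwidth):
    """Triangular-kernel weighted linear fit; returns the fitted value at distance 0.

    points is a list of (distance_from_cutoff, outcome) pairs inside the bandwidth.
    """
    weights = [1.0 - abs(d) / bandwidth for d, _ in points]
    sw = sum(weights)
    mean_d = sum(w * d for w, (d, _) in zip(weights, points)) / sw
    mean_y = sum(w * y for w, (_, y) in zip(weights, points)) / sw
    sdd = sum(w * (d - mean_d) ** 2 for w, (d, _) in zip(weights, points))
    sdy = sum(w * (d - mean_d) * (y - mean_y) for w, (d, y) in zip(weights, points))
    slope = sdy / sdd
    return mean_y - slope * mean_d

def sharp_rd(z, y, cutoff, bandwidth):
    """Jump in the outcome at the cutoff, using only observations within the bandwidth."""
    below = [(zi - cutoff, yi) for zi, yi in zip(z, y) if -bandwidth < zi - cutoff < 0]
    above = [(zi - cutoff, yi) for zi, yi in zip(z, y) if 0 <= zi - cutoff < bandwidth]
    return boundary_fit(above, bandwidth) - boundary_fit(below, bandwidth)

# Simulated data (assumption): the outcome rises with the assignment variable z
# and jumps by true_jump for units at or above the cutoff z = 0.
random.seed(1)
true_jump = 2.0
z = [random.uniform(-1, 1) for _ in range(4000)]
y = [1 + 0.5 * zi + (true_jump if zi >= 0 else 0.0) + random.gauss(0, 0.3) for zi in z]

treated = [yi for zi, yi in zip(z, y) if zi >= 0]
untreated = [yi for zi, yi in zip(z, y) if zi < 0]
naive = sum(treated) / len(treated) - sum(untreated) / len(untreated)
rd_estimate = sharp_rd(z, y, cutoff=0.0, bandwidth=0.3)

print("naive difference in means:", round(naive, 3))
print("RD estimate at the cutoff:", round(rd_estimate, 3))
```

In this setup the naive difference also picks up the slope in z, while the local comparison should stay close to the assumed jump.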


Regression Discontinuity Design

Advantages:
– Really a marginal effect
– Causal, if RDD is well applied

Disadvantages:
– Sample size largely limited
– Only "local" character of estimates (marginal ≠ average)

Problems:
– How do we know how far away from the threshold we can go (bandwidth)?
– How do we know if the design is OK?
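One rough way to probe whether the design is OK is to check that units do not pile up on one side of the threshold. The sketch below is a simplified stand-in, not the density test run by `ddens` in Stata: it counts observations in a narrow window on each side of the cutoff and applies an exact binomial test with probability 0.5. The window width, the simulated "relocation" of 60% of units just below the cutoff and the function names are assumptions for illustration.

```python
import math
import random

def binom_two_sided_p(k, n):
    """Two-sided exact binomial p-value for k successes in n trials with p = 0.5."""
    def pmf(i):
        log_p = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                 + n * math.log(0.5))
        return math.exp(log_p)
    lower = sum(pmf(i) for i in range(0, k + 1))
    upper = sum(pmf(i) for i in range(k, n + 1))
    return min(1.0, 2.0 * min(lower, upper))

def density_check(z, cutoff, window):
    """Counts just below and just above the cutoff, plus the binomial p-value."""
    below = sum(1 for zi in z if cutoff - window <= zi < cutoff)
    above = sum(1 for zi in z if cutoff <= zi < cutoff + window)
    return below, above, binom_two_sided_p(below, below + above)

random.seed(3)
clean = [random.uniform(-1, 1) for _ in range(4000)]

# Assumed manipulation: 60% of units just below the cutoff move just above it.
manipulated = [-zi if (-0.05 <= zi < 0 and random.random() < 0.6) else zi
               for zi in clean]

below_c, above_c, p_clean = density_check(clean, cutoff=0.0, window=0.1)
below_m, above_m, p_manip = density_check(manipulated, cutoff=0.0, window=0.1)

print("no manipulation:", below_c, above_c, p_clean)
print("manipulation:   ", below_m, above_m, p_manip)
```

A very small p-value suggests sorting around the threshold, in which case units on the two sides are no longer "quite similar".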


Regression Discontinuity Design: Application

– Trade-off between a narrow "bandwidth" (for the independence assumption) and a wide "bandwidth" (to increase sample size)
– One can try to find it empirically ("fuzzy" RD design)
– Y is the effect, p is the treatment probability
– $Y^{+}$ and $p^{+}$ are the effect and the treatment probability just above the "cut-off"
– $Y^{-}$ and $p^{-}$ are the effect and the treatment probability just below the "cut-off"

$$\text{effect at cutoff} = \frac{Y^{+} - Y^{-}}{p^{+} - p^{-}}$$
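The ratio of the jump in the outcome to the jump in the treatment probability can be sketched numerically. This Python example is only an illustration under assumed values: treatment probability 0.2 below and 0.8 above the cutoff, a true effect of 2.0, a bandwidth of 0.3, and the helper names `boundary_fit` and `jump_at_cutoff` are all hypothetical. Both jumps are estimated with the same kernel-weighted linear fit on each side of the cutoff.

```python
import random

def boundary_fit(points, bandwidth):
    """Triangular-kernel weighted linear fit; returns the fitted value at distance 0."""
    weights = [1.0 - abs(d) / bandwidth for d, _ in points]
    sw = sum(weights)
    mean_d = sum(w * d for w, (d, _) in zip(weights, points)) / sw
    mean_v = sum(w * v for w, (_, v) in zip(weights, points)) / sw
    sdd = sum(w * (d - mean_d) ** 2 for w, (d, _) in zip(weights, points))
    sdv = sum(w * (d - mean_d) * (v - mean_v) for w, (d, v) in zip(weights, points))
    return mean_v - (sdv / sdd) * mean_d

def jump_at_cutoff(z, v, cutoff, bandwidth):
    """Value just above the cutoff minus value just below it."""
    below = [(zi - cutoff, vi) for zi, vi in zip(z, v) if -bandwidth < zi - cutoff < 0]
    above = [(zi - cutoff, vi) for zi, vi in zip(z, v) if 0 <= zi - cutoff < bandwidth]
    return boundary_fit(above, bandwidth) - boundary_fit(below, bandwidth)

# Simulated fuzzy design (assumption): crossing the cutoff raises the
# probability of treatment from 0.2 to 0.8 instead of from 0 to 1.
random.seed(2)
true_effect = 2.0
z = [random.uniform(-1, 1) for _ in range(6000)]
t = [1 if random.random() < (0.8 if zi >= 0 else 0.2) else 0 for zi in z]
y = [1 + 0.5 * zi + true_effect * ti + random.gauss(0, 0.3) for zi, ti in zip(z, t)]

jump_y = jump_at_cutoff(z, y, cutoff=0.0, bandwidth=0.3)   # Y+ minus Y-
jump_p = jump_at_cutoff(z, t, cutoff=0.0, bandwidth=0.3)   # p+ minus p-
wald = jump_y / jump_p

print("jump in outcome:", round(jump_y, 3))
print("jump in treatment probability:", round(jump_p, 3))
print("ratio (local Wald estimate):", round(wald, 3))
```

In a sharp design the denominator equals one, so the ratio collapses to the plain jump in the outcome.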


How to do this in STATA?

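First: download the package
– net install rd (the package is also distributed via SSC: ssc install rd)

Second: define your model
– rd $out [treatment] $in [if] [in] [weight] [, options], where $out is the outcome, treatment is the (optional) treatment variable and $in is the assignment variable; the variables are separated by spaces, not commas

Third: there are some options
– mbw(numlist): multiples of the "bandwidth" in percent (default: "100 50 200", which means we always estimate at 50%, 100% and 200%)
– z0(real): sets the cutoff Z0 (treatment)
– ddens: asks for extra estimation of discontinuities in the density of Z
– graph: draws the graphs we've seen automatically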
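What re-estimating at 50%, 100% and 200% of a base bandwidth amounts to can be sketched as follows. This is a Python illustration on simulated data, not the internals of `rd`: the base bandwidth of 0.3, the true jump of 2.0 and the function names are assumptions.

```python
import random

def boundary_fit(points, bandwidth):
    """Triangular-kernel weighted linear fit; returns the fitted value at distance 0."""
    weights = [1.0 - abs(d) / bandwidth for d, _ in points]
    sw = sum(weights)
    mean_d = sum(w * d for w, (d, _) in zip(weights, points)) / sw
    mean_y = sum(w * y for w, (_, y) in zip(weights, points)) / sw
    sdd = sum(w * (d - mean_d) ** 2 for w, (d, _) in zip(weights, points))
    sdy = sum(w * (d - mean_d) * (y - mean_y) for w, (d, y) in zip(weights, points))
    return mean_y - (sdy / sdd) * mean_d

def rd_jump(z, y, cutoff, bandwidth):
    """Estimated jump at the cutoff and the number of observations used."""
    below = [(zi - cutoff, yi) for zi, yi in zip(z, y) if -bandwidth < zi - cutoff < 0]
    above = [(zi - cutoff, yi) for zi, yi in zip(z, y) if 0 <= zi - cutoff < bandwidth]
    jump = boundary_fit(above, bandwidth) - boundary_fit(below, bandwidth)
    return jump, len(below) + len(above)

random.seed(4)
true_jump = 2.0
z = [random.uniform(-1, 1) for _ in range(4000)]
y = [1 + 0.5 * zi + (true_jump if zi >= 0 else 0.0) + random.gauss(0, 0.3) for zi in z]

base_bandwidth = 0.3          # assumed base bandwidth
results = {}
for percent in (50, 100, 200):
    h = base_bandwidth * percent / 100
    results[percent] = rd_jump(z, y, cutoff=0.0, bandwidth=h)
    print(percent, "% bandwidth:", round(h, 3),
          "estimate:", round(results[percent][0], 3),
          "observations used:", results[percent][1])
```

Estimates that move a lot across the three bandwidths are a warning sign; the narrow bandwidth uses fewer observations, which is the trade-off against sample size.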


Sample results in STATA - data

Contains data from votex.dta
  obs:           349                          102nd Congress
 vars:            20                          5 Nov 2007 17:02
 size:        39,437 (99.9% of memory free)

              storage  display     value
variable name   type   format      label      variable label
fips            byte   %8.0g       fips       State code
district        byte   %8.0g                  Congr district
d               double %10.0g                 Dem vote share minus .5
win             byte   %9.0g                  Dem Won Race
lne             float  %9.0g                  Log fed expenditure in district
i               byte   %9.0g                  Incumbent
votingpop       long   %12.0g                 Voting Age Population
votpop          double %10.0g                 Voting Age Population Share
populatn        long   %12.0g                 Population
black           double %12.0g                 Black Population Share
blucllr         double %12.0g                 Blue-collar Population Share
farmer          double %12.0g                 Farmer Population Share
fedwrkr         double %12.0g                 Fed Worker Population Share
forborn         double %12.0g                 Foreign Born Population Share
manuf           double %12.0g                 Manufactur Population Share
unemplyd        double %12.0g                 Unemp Population Share
union           float  %9.0g                  Unionized Population Share
urban           double %12.0g                 Urban Population Share
veterans        double %12.0g                 Veteran Population Share
ranwin          byte   %8.0g

Sorted by:  fips  district
     Note:  dataset has changed since last saved


Output from STATA

. use votex
(102nd Congress)

. rd lne d, gr mbw(100)
Two variables specified; treatment is
assumed to jump from zero to one at Z=0.

 Assignment variable Z is d
 Treatment variable X_T unspecified
 Outcome variable y is lne

Command used for graph: lpoly; Kernel used: triangle (default)
Bandwidth: .29287776; loc Wald Estimate: -.07739553
Estimating for bandwidth .29287775925349

   lne |     Coef.   Std. Err.      z    P>|z|    [95% Conf. Interval]
 lwald | -.0773955   .1056062   -0.73   0.464     -.28438    .1295889


Output from STATA - graph

[Graph: Log fed expenditure in district; axis ticks 20 to 23 and -.2 to .6; Bandwidth .29287775925349]


Output from STATA – "fuzzy" version

gen byte ranwin=cond(uniform()<.1,1-win,win)
rd lne ranwin d, mbw(25(25)300) bdep ox

[Graph: estimated effect (Est) with confidence interval (CI) against bandwidth, from 7.3e-02 to .88]


Quantile regressions

One last thing


A motivating story

[Chart: deciles 1 to 9 and the average, vertical axis from 0 zł to 4 500 zł, years 1992 to 2007]


Some basic "doubts" of an empirical economist…

Compare similar to similar
Keep statistical properties
Understand beyond "average x"
Understand (and be independent of) "outliers"


Robust estimators

First flavour of robust: regression with the robust option
– Helps if the problem is not systematic
– Does not help if the problem is the nature of the process (e.g. heterogeneity)

Second flavour of robust: nonparametric estimators
– Complex from a mathematical point of view
– Take longer to compute
– But veeeery flexible => Koenker (and his followers)
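Why quantile-based estimators are insensitive to outliers can be seen from the "check" loss that quantile regression minimises. The sketch below is a minimal Python illustration with made-up data (ten ordinary values and one extreme value); the function names are assumptions. Minimising the loss over a constant returns a sample quantile, which barely reacts to the outlier, while the mean does.

```python
def check_loss(residual, q):
    """Check loss: weight q on positive residuals, (1 - q) on negative ones."""
    return residual * q if residual >= 0 else residual * (q - 1.0)

def best_constant(data, q):
    """The data point that minimises the total check loss at quantile q."""
    return min(data, key=lambda c: sum(check_loss(y - c, q) for y in data))

# Made-up sample: ten ordinary values and one outlier.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1000]

mean_value = sum(data) / len(data)
median_fit = best_constant(data, 0.5)
q90_fit = best_constant(data, 0.9)

print("mean:", round(mean_value, 1))
print("check-loss minimiser at q = 0.5:", median_fit)
print("check-loss minimiser at q = 0.9:", q90_fit)
```

Replacing 1000 by any other value above 10 leaves both quantile fits unchanged, whereas the mean moves with it.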


How to do this in STATA?

Estimate at the median
– qreg y $in

Estimate at any other percentile
– qreg y $in, quantile(q), where q is your percentile

Estimate differences between different percentiles
– iqreg y $in, quantile(.25 .75) reps(100); additionally, one may bootstrap
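What these commands estimate can be sketched in Python for a single regressor. This is an illustration only and not how Stata computes it: with one regressor and an intercept, some line through two data points minimises the check loss, so the sketch simply tries every pair. The simulated data (noise that grows with x) and the function names are assumptions.

```python
import random
from itertools import combinations

def check_loss_sum(a, b, xs, ys, q):
    """Total check loss of the line y = a + b*x at quantile q."""
    total = 0.0
    for x, y in zip(xs, ys):
        r = y - (a + b * x)
        total += r * q if r >= 0 else r * (q - 1.0)
    return total

def quantile_line(xs, ys, q):
    """Brute-force quantile regression: best line through any two data points."""
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical pair does not define a line
        b = (ys[j] - ys[i]) / (xs[j] - xs[i])
        a = ys[i] - b * xs[i]
        loss = check_loss_sum(a, b, xs, ys, q)
        if best is None or loss < best[0]:
            best = (loss, a, b)
    return best[1], best[2]

# Simulated data (assumption): the spread of y grows with x, so the slope
# differs across quantiles even though the median slope is 2.
random.seed(5)
xs = [random.uniform(0, 1) for _ in range(120)]
ys = [1 + 2 * x + (0.2 + 2 * x) * random.gauss(0, 1) for x in xs]

a25, b25 = quantile_line(xs, ys, 0.25)
a50, b50 = quantile_line(xs, ys, 0.50)   # counterpart of: qreg y x
a75, b75 = quantile_line(xs, ys, 0.75)   # counterpart of: qreg y x, quantile(.75)

print("slope at q = 0.25:", round(b25, 3))
print("slope at q = 0.50:", round(b50, 3))
print("slope at q = 0.75:", round(b75, 3))
print("interquantile slope difference (.75 - .25):", round(b75 - b25, 3))
```

The last line is the quantity that iqreg reports for quantile(.25 .75); standard errors are not computed here.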


Output from STATA

Iteration 1: WLS sum of weighted deviations = 121.88268

Iteration 1: sum of abs. weighted deviations = 111
Iteration 2: sum of abs. weighted deviations = 110

Median regression                          Number of obs = 10
  Raw sum of deviations 157 (about 14)
  Min sum of deviations 110                Pseudo R2 = 0.2994

     y |  Coef.   Std. Err.     t    P>|t|    [95% Conf. Interval]
     x |     17   3.924233   4.33   0.003    7.950702    26.0493
 _cons |      3   2.774852   1.08   0.311    -3.39882    9.39882


Output from STATA

Iteration 1: WLS sum of weighted deviations = 80.060899

Iteration 1: sum of abs. weighted deviations = 80.66
Iteration 2: sum of abs. weighted deviations = 79.36
Iteration 3: sum of abs. weighted deviations = 78.66

.33 Quantile regression                    Number of obs = 10
  Raw sum of deviations 122.86 (about 3)
  Min sum of deviations 78.66              Pseudo R2 = 0.3598

     y |  Coef.   Std. Err.     t    P>|t|    [95% Conf. Interval]
     x |     18      4.608   3.91   0.005    7.373933   28.62607
 _cons |      1   3.258348   0.31   0.767   -6.513764   8.513764
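The Pseudo R2 in both outputs can be reproduced from the two sums of deviations, reading it as one minus the ratio of the minimised sum to the raw sum. The short Python check below uses only the numbers printed in the outputs; the function name is an assumption.

```python
def pseudo_r2(min_sum_dev, raw_sum_dev):
    """1 - (sum of deviations around the fit) / (sum of deviations around the raw quantile)."""
    return 1.0 - min_sum_dev / raw_sum_dev

median_fit = pseudo_r2(110, 157)        # median regression output
q33_fit = pseudo_r2(78.66, 122.86)      # .33 quantile regression output

print(round(median_fit, 4), round(q33_fit, 4))
```

The value says how much the regressor reduces the sum of weighted absolute deviations; it is not the share of explained variance known from OLS.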


Summarising all this crap

[Diagram labels: Confounding influence (environment); Treatment; Effect; Observables; Unobservables]


Problems

Sample
– size
– heterogeneity

Methods
– None is perfect
– The question is important
– Nonparametric methods (kernel in PSM or QR) are robust, but robust is not a synonym for miraculous