+Sharp Regression Discontinuity

Saler Axel
May 2013
Southern Methodist University
(faculty.smu.edu/kyler/courses/7312/presentations/2013/axel/axel.pdf)

+Presentation Outline

Overview
• History
• As a Research Design
• In Educational Research
• Statistical Analysis Overview
• Assumptions
• Models
• Conclusion

Analysis and Code in R
• Computing Regression Discontinuity in R
• No Statistical Significance
• Main Effect
• Interaction Effect
• Main and Interaction Effects

+History

• Ex Post Facto Experiment:
  • Analysis in which two groups, an experimental and a control group, are selected through matching to yield a quasi-experimental comparison.
  • Critique: One or more relevant matching variables have been inadequately controlled or overlooked
• Thistlethwaite and Campbell (1960)
  • Developed Regression Discontinuity Analysis
  • Used when researchers are unable to randomly assign subjects to treatment and control groups
  • Does not rely upon matching to equate experimental and control groups
  • Developed to control for selection bias

+Research Design Overview

• Pretest-Posttest program-comparison design
  • The same measure (or alternate forms) is administered before and after treatment.
  • Pretest observed variable (cutoff score) is used to assign treatment or comparison groups
• Quasi-Experimental Design
  • Participants are assigned to comparison groups using an observed variable
  • Uses a pre-program measure cut-off score.
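
The cutoff-based assignment described on this slide can be sketched in a few lines of R. The scores, the cutoff value of 50, and the rule that scores at or above the cutoff receive treatment are all assumptions made for illustration; they do not come from the presentation.

```r
# Hypothetical pretest scores and cutoff (illustrative values only)
pretest <- c(35, 42, 48, 50, 51, 57, 63, 70)
cutoff  <- 50

# Sharp design: the cutoff alone decides group membership.
# Assumed rule: scores at or above the cutoff receive treatment (z = 1).
z <- as.integer(pretest >= cutoff)

assignment <- data.frame(
  pretest = pretest,
  z       = z,
  group   = ifelse(z == 1, "treatment", "comparison")
)
print(assignment)
```

In a program that targets low scorers, the inequality would simply be reversed.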

+Research Design Benefits

• Does not assume that treatment and comparison groups begin at equal places before treatment
  • Assumes that the pre-post relationship would be equal for the two groups without treatment.
• Strong Internal Validity
  • Does not rely on differences in posttest averages between groups
  • Looks at changes in the pre-post relationship at the cutoff on the observed variable
• Ethical:
  • Randomized controlled trials can be unethical in educational research.
  • Example: Assigning highly proficient readers to a remedial Reading Recovery program solely for the purpose of research (Matthews, Peters, & Housand, 2012)

+In Educational Research

• Provides quicker treatment to those who most need or deserve it.
  • All sample groups begin treatment immediately
• Lacks a no-program comparison group
• Helpful when targeting a population of special learners
• Do not need to collect new data.
  • Can use previously collected assessment scores.
• Example of use:
  • Students who score below the observed variable (cut-off) on a formative assessment are provided supplemental resources to enhance achievement.

+Example in Educational Research

Total School Cluster Grouping as Measured by the Northwest Evaluation Association’s Measure of Academic Progress (MAP) Test (Matthews, Peters, & Housand, 2012)

• The MAP was administered as a pre-test.
• Treatment group: Scored > 90th percentile.
• Control group: Scored < 90th percentile.
• The MAP was re-administered as a post-test.

+Example in Educational Research

Total School Cluster Grouping as Measured by the Northwest Evaluation Association’s Measure of Academic Progress (MAP) Test (Matthews, Peters, & Housand, 2012)

• Both regression lines have the same slope.
• The intercepts at the cut score differ.
• A treatment effect: There is a clear difference in intercept between the parallel regression lines on either side of the cut score.

+Statistical Analysis Overview

• No matching of treatment and comparison groups.
• Comparisons come from a predicted regression line.
• A treatment effect exists if:
  • Posttest scores in the treatment group are better predicted by a new regression line than by the regression line of the comparison group
  • A discontinuity is found at the cutoff criterion
  • The discontinuity can be a statistically significant change in either slope or y-intercept
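
One informal way to picture the "predicted regression line" idea is to fit a line to the comparison group only, project it across the cutoff, and see how far the treatment group's observed scores sit from that projection. The sketch below uses made-up data with a built-in jump of 15 points; the variable names and values are assumptions for illustration, and this is not the formal test used later in the deck.

```r
# Assumed illustrative data: comparison group below a cutoff of 10.5,
# treatment group above it, with a built-in jump of 15 points.
noise <- rep(c(0.5, -0.5), 5)   # small deterministic wobble, not real data
comparison <- data.frame(pretest = 1:10,  posttest = 2 * (1:10) + noise)
treatment  <- data.frame(pretest = 11:20, posttest = 2 * (11:20) + 15 + noise)

# Regression line estimated from the comparison group only
comp_fit <- lm(posttest ~ pretest, data = comparison)

# Project that line across the cutoff and compare it with the observed scores
predicted <- predict(comp_fit, newdata = treatment)
gap <- mean(treatment$posttest - predicted)
print(gap)
```

A gap near zero would suggest the comparison-group line already predicts the treatment group well; a large gap points toward a discontinuity that the regression model with a dummy variable can then test.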

+Assumptions

Cutoff Criterion
• Observed variable remains constant
• Selection threat if the cutoff criterion changes

Pre-Post Distribution
• The pre-post relationship can be described by a polynomial function
• Estimates are biased if the relationship is better described by another kind of curve

Comparison Group Pretest Variance
• Variability in comparison group pretest values
• Adequate estimate of pre-post regression line

Continuous Pretest Distribution
• Both groups derive from one continuous pretest distribution
• Group division determined by the cutoff criterion

Program Implementation
• All recipients receive the same treatment in the same manner.
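
The assumption about the shape of the pre-post relationship can be probed by adding a higher-order term and comparing the nested fits. The sketch below does this with simulated data that are deliberately curved and contain no treatment effect; the cutoff of 20.5, the coefficients, and the noise level are assumptions chosen for illustration.

```r
set.seed(42)  # fixed seed so the sketch is reproducible
pretest <- 1:40
xc <- pretest - 20.5          # pretest centered at an assumed cutoff of 20.5
z  <- as.integer(xc > 0)      # 0 = comparison, 1 = treatment

# Assumed curved pre-post relationship with no treatment effect
posttest <- 50 + 1.5 * xc + 0.2 * xc^2 + rnorm(40, sd = 2)

linear    <- lm(posttest ~ xc + z)
quadratic <- lm(posttest ~ xc + I(xc^2) + z)

# F test for the nested models: a small p-value suggests the straight-line
# model misses curvature in the pre-post relationship.
comparison <- anova(linear, quadratic)
print(comparison)
p_curve <- comparison[["Pr(>F)"]][2]
```

Plotting the data first remains the simplest check; the F test only formalizes what the scatterplot usually shows.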

+True Model

$y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i$

• $y_i$: posttest outcome variable
• $\beta_0$: intercept
• $\beta_1$: effect of the independent variable
• $x_i$: pretest independent variable
• $\beta_2$: effect of the dummy variable
• $z_i$: dummy variable (0 = control, 1 = treatment)

+Exactly Specified Fitted Model: An Unbiased Estimate

$y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + e_i$

• $e_i$: error

+Over-Specified Fitted Model: An Inefficient Estimate

$y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + \beta_3 x_i z_i + e_i$

• $\beta_3 x_i z_i$: over-specified interaction effect
• Result: Unnecessary noise in the model.
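
It can help to write the interaction model out separately for each value of the dummy variable. Substituting $z_i = 0$ and $z_i = 1$ into the equation on this slide gives one line per group:

```latex
\begin{align*}
z_i = 0 \text{ (control)}:   \quad & y_i = \beta_0 + \beta_1 x_i + e_i \\
z_i = 1 \text{ (treatment)}: \quad & y_i = (\beta_0 + \beta_2) + (\beta_1 + \beta_3)\, x_i + e_i
\end{align*}
```

Read this way, $\beta_2$ is the difference in intercept between the two lines and $\beta_3$ is the difference in slope. When the true slopes are equal, $\beta_3 = 0$ and estimating it only adds noise.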

+Under-Specified Fitted Model: A Biased Estimate

If the true model is:

$y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + \beta_3 x_i z_i$

The under-specified model could be:

$y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + e_i$

Necessary terms excluded: the interaction $\beta_3 x_i z_i$.
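
A small noise-free example can show how leaving out a necessary interaction distorts the estimated treatment effect. Everything below is assumed for illustration: a cutoff of 15.5 (so the groups are unequal in size), a true jump of 5 at the cutoff, and a slope change of 3.

```r
# Assumed noise-free data: 15 control and 5 treatment observations
pretest <- 1:20
xc <- pretest - 15.5              # pretest centered at an assumed cutoff of 15.5
z  <- as.integer(xc > 0)          # 0 = control, 1 = treatment
posttest <- 20 + 2 * xc + 5 * z + 3 * xc * z   # true jump at the cutoff is 5

full  <- lm(posttest ~ xc * z)    # includes the interaction (matches the truth)
under <- lm(posttest ~ xc + z)    # interaction excluded (under-specified)

jump_full  <- unname(coef(full)["z"])
jump_under <- unname(coef(under)["z"])
print(c(full = jump_full, under = jump_under))
```

The fully specified fit recovers the jump of 5, while the under-specified fit attributes part of the slope change to the dummy variable and reports a noticeably different value.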

+Visualizing Regression Discontinuity: Creating a Cut Off

[Figure: plot with the cut off marked]

+Visualizing Regression Discontinuity: Main Effect, Change in Intercept

[Figure: cut off marked; change in intercept, same slope]

+Visualizing Regression Discontinuity: Interaction Effect, Change in Slope

[Figure: cut off marked; change in slope, same intercept]

+Visualizing Regression Discontinuity: Main and Interaction Effect, Change in Intercept and Slope

[Figure: cut off marked; change in intercept and slope]

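+No Statistical Significance R Code

library(lattice)

# Simulated data: one straight line runs through both groups
nostatsig <- data.frame(
  pretest  = 1:20,
  posttest = seq(2, 40, by = 2),
  interv1  = factor(rep(0:1, each = 10))
)

# Add random noise to the post-test scores (no seed is set, so results vary by run)
nostatsig$posttest <- jitter(nostatsig$posttest, factor = 10)

# Scatterplot with a separate regression line for each group
xyplot(posttest ~ pretest, nostatsig,
       xlab = "Pre Test", ylab = "Post Test", main = "No Statistical Significance",
       pch = c(19, 17), groups = interv1, col = c("blue", "red"),
       type = c("p", "r"), lwd = 2, lty = c(2, 1))

# Pretest centered at 10; the * expands to both main effects and the interaction
first <- lm(posttest ~ I(pretest - 10) * interv1, nostatsig)
summary(first)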
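
The `I(pretest - 10)` term in the model centers the pretest near the cut off. A minimal sketch of why that matters, using made-up noise-free data rather than the presenter's data (the cut off of 10, the jump of 5, and the slope change of 3 are all assumptions):

```r
# Illustrative, noise-free data (assumed values)
cutoff   <- 10
pretest  <- 1:20
z        <- as.integer(pretest > cutoff)   # 0 = control, 1 = treatment
posttest <- 10 + 2 * (pretest - cutoff) + 5 * z + 3 * (pretest - cutoff) * z

raw      <- lm(posttest ~ pretest * z)               # pretest left as is
centered <- lm(posttest ~ I(pretest - cutoff) * z)   # pretest centered at the cut off

# With centering, the coefficient on z is the gap between the two lines
# at the cut off; without it, the gap is evaluated at pretest = 0.
jump_raw      <- unname(coef(raw)["z"])
jump_centered <- unname(coef(centered)["z"])
print(c(raw = jump_raw, centered = jump_centered))
```

Both fits describe the same pair of lines; only the point at which the intercept difference is read off changes.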

+No Statistical Significance R Output

p-values > .05 for interv11 and the interaction term.

Call:
lm(formula = posttest ~ I(pretest - 10) * interv1, data = nostatsig)

Residuals:
    Min      1Q  Median      3Q     Max
-3.1834 -0.8581  0.3161  1.3056  3.1510

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)               18.8172     1.1597  16.225 2.34e-11 ***
I(pretest - 10)            1.7435     0.2172   8.026 5.32e-07 ***
interv11                   0.1424     1.7782   0.080    0.937
I(pretest - 10):interv11   0.3216     0.3072   1.047    0.311
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.973 on 16 degrees of freedom
Multiple R-squared: 0.9754, Adjusted R-squared: 0.9708
F-statistic: 211.8 on 3 and 16 DF, p-value: 4.371e-13

+No Statistical Significance Model

p-values > .05 for interv11 and the interaction term.

y = 18.82 + 1.74(pretest-10) + .14(interv1) + .32(pretest-10)(interv1)

Pre Cut Off Group
y = 18.82 + 1.74(pretest-10) + .14(0) + .32(pretest-10)(0)
y = 18.82 + 1.74(pretest-10)

Post Cut Off Group
y = 18.82 + 1.74(pretest-10) + .14(1) + .32(pretest-10)(1)
y = 18.96 + 2.06(pretest-10)
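
The substitution on this slide can also be done programmatically. The helper below is a hypothetical convenience function (its name and argument order are assumptions, not part of the presentation); it takes the four estimates in the order R reports them and returns the intercept and slope for each group.

```r
# Hypothetical helper: b holds the estimates in the order
# intercept, pretest slope, treatment dummy, interaction.
group_lines <- function(b) {
  rbind(
    pre_cutoff  = c(intercept = b[[1]],          slope = b[[2]]),
    post_cutoff = c(intercept = b[[1]] + b[[3]], slope = b[[2]] + b[[4]])
  )
}

# Estimates copied from the coefficient table for the no-significance model
est <- c(18.8172, 1.7435, 0.1424, 0.3216)
lines_nostatsig <- group_lines(est)
print(round(lines_nostatsig, 2))
```

With a fitted model object, `group_lines(coef(first))` would give the same result without retyping the estimates.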

+No Statistical Significance Plot

[Figure: cut off marked]

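+Main Effect R Code

library(lattice)

# Simulated data: same slope in both groups, with a large jump in posttest2 after pretest 10
maineffect <- data.frame(
  pretest2  = 1:20,
  posttest2 = c(seq(2, 20, by = 2), seq(110, 128, by = 2)),
  interv2   = factor(rep(0:1, each = 10))
)

# Add random noise to the post-test scores (no seed is set, so results vary by run)
maineffect$posttest2 <- jitter(maineffect$posttest2, factor = 10)

# Scatterplot with a separate regression line for each group
xyplot(posttest2 ~ pretest2, maineffect,
       xlab = "Pre Test", ylab = "Post Test", main = "Main Effect",
       pch = c(19, 17), groups = interv2, col = c("blue", "red"),
       type = c("p", "r"), lwd = 2, lty = c(2, 1))

second <- lm(posttest2 ~ I(pretest2 - 10) * interv2, maineffect)
summary(second)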

+Main Effect R Output

p-value < .05 for interv21; p-value > .05 for the interaction term.

Call:
lm(formula = posttest2 ~ I(pretest2 - 10) * interv2, data = maineffect)

Residuals:
    Min      1Q  Median      3Q     Max
-5.1999 -1.4300  0.5758  1.9021  2.9837

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)
(Intercept)                20.1134     1.4813  13.578 3.37e-10 ***
I(pretest2 - 10)            2.0396     0.2775   7.350 1.63e-06 ***
interv21                   87.2021     2.2713  38.394  < 2e-16 ***
I(pretest2 - 10):interv21   0.2818     0.3924   0.718    0.483
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.52 on 16 degrees of freedom
Multiple R-squared: 0.9983, Adjusted R-squared: 0.998
F-statistic: 3167 on 3 and 16 DF, p-value: < 2.2e-16
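
Rather than reading p-values off the printed table, they can be pulled from the summary object. The sketch below simulates its own data with a fixed seed (a common slope of 2 and a jump of 90 are assumptions loosely modeled on the main-effect example), so its numbers will not match the output shown on the slide.

```r
set.seed(1)  # fixed seed so this sketch is reproducible
d <- data.frame(pretest = 1:20, interv = factor(rep(0:1, each = 10)))

# Assumed data-generating process: common slope of 2, jump of 90 for the treated group
d$posttest <- 2 * d$pretest + 90 * (d$interv == "1") + rnorm(20, sd = 2)

fit <- lm(posttest ~ I(pretest - 10) * interv, data = d)

# summary()$coefficients is a matrix with one row per model term
coef_table  <- summary(fit)$coefficients
p_values    <- coef_table[, "Pr(>|t|)"]
significant <- p_values < 0.05

print(round(p_values, 4))
print(significant)
```

The `interv1` row corresponds to a change in intercept and the interaction row to a change in slope, matching the annotations on the output slides.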

+Main Effect Model

p-value < .05 for interv21; p-value > .05 for the interaction term.

y = 20.11 + 2.04(pretest2-10) + 87.2(interv2) + .28(pretest2-10)(interv2)

Pre Cut Off Group
y = 20.11 + 2.04(pretest2-10) + 87.2(0) + .28(pretest2-10)(0)
y = 20.11 + 2.04(pretest2-10)

Post Cut Off Group
y = 20.11 + 2.04(pretest2-10) + 87.2(1) + .28(pretest2-10)(1)
y = 107.31 + 2.32(pretest2-10)

+Main Effect Plot

[Figure: cut off marked; change in intercept]

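+Interaction Effect R Code

library(lattice)

# Simulated data: the post-test scores rise more steeply in the second group
inteffect <- data.frame(
  pretest3  = 1:20,
  posttest3 = c(84:98, seq(120, 128, by = 2)),
  interv3   = factor(rep(0:1, each = 10))
)

# Add random noise to the post-test scores (no seed is set, so results vary by run)
inteffect$posttest3 <- jitter(inteffect$posttest3, factor = 10)

# Scatterplot with a separate regression line for each group
xyplot(posttest3 ~ pretest3, inteffect,
       xlab = "Pre Test", ylab = "Post Test", main = "Interaction Effect",
       pch = c(19, 17), groups = interv3, col = c("blue", "red"),
       type = c("p", "r"), lwd = 2, lty = c(2, 1))

# Pretest centered at 10.5, midway between the two groups
third <- lm(posttest3 ~ I(pretest3 - 10.5) * interv3, inteffect)
summary(third)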

+Interaction Effect R Output

p-value < .05 for the interaction term; p-value > .05 for interv31.

Call:
lm(formula = posttest3 ~ I(pretest3 - 10.5) * interv3, data = inteffect)

Residuals:
    Min      1Q  Median      3Q     Max
-9.3752 -1.1839  0.2121  1.7872  6.8703

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)
(Intercept)                  93.8177     2.5205  37.223  < 2e-16 ***
I(pretest3 - 10.5)            1.1182     0.4371   2.558   0.0211 *
interv31                     -6.7534     3.5645  -1.895   0.0764 .
I(pretest3 - 10.5):interv31   3.4253     0.6182   5.541 4.47e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.97 on 16 degrees of freedom
Multiple R-squared: 0.9424, Adjusted R-squared: 0.9316
F-statistic: 87.32 on 3 and 16 DF, p-value: 3.918e-10

+Interaction Effect Model

p-value < .05 for the interaction term; p-value > .05 for interv31.

y = 93.82 + 1.12(pretest3-10.5) - 6.75(interv3) + 3.43(pretest3-10.5)(interv3)

Pre Cut Off Group
y = 93.82 + 1.12(pretest3-10.5) - 6.75(0) + 3.43(pretest3-10.5)(0)
y = 93.82 + 1.12(pretest3-10.5)

Post Cut Off Group
y = 93.82 + 1.12(pretest3-10.5) - 6.75(1) + 3.43(pretest3-10.5)(1)
y = 87.07 + 4.55(pretest3-10.5)

+Interaction Effect Plot

[Figure: cut off marked; change in slope]

+Main Effect and Interaction Effects R Code

library(lattice)

# Simulated data; the grouping column is named interv4 to match the model formula below
maininteffect <- data.frame(
  pretest4  = 1:20,
  posttest4 = c(seq(36, 64, by = 2), 68, 80, 82, 84, 86),
  interv4   = factor(rep(0:1, each = 10))
)

# Add random noise to the post-test scores (no seed is set, so results vary by run)
maininteffect$posttest4 <- jitter(maininteffect$posttest4, factor = 10)

# Scatterplot with a separate regression line for each group
xyplot(posttest4 ~ pretest4, maininteffect,
       xlab = "Pre Test", ylab = "Post Test", main = "Main Effects",
       pch = c(19, 17), groups = interv4, col = c("blue", "red"),
       type = c("p", "r"), lwd = 2, lty = c(2, 1))

fourth <- lm(posttest4 ~ I(pretest4 - 4) * interv4, maininteffect)
summary(fourth)

+Main Effect and Interaction Effect R Output

p-values < .05 for interv41 and the interaction term.

Call:
lm(formula = posttest4 ~ I(pretest4 - 4) * interv4, data = maininteffect)

Residuals:
    Min      1Q  Median      3Q     Max
-6.5819 -2.0496 -0.4671  1.9956  5.5435

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)               41.2661     1.1466  35.989  < 2e-16 ***
I(pretest4 - 4)            2.0676     0.3539   5.843  2.5e-05 ***
interv41                 -14.1485     4.3483  -3.254  0.00498 **
I(pretest4 - 4):interv41   1.5968     0.5004   3.191  0.00568 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.214 on 16 degrees of freedom
Multiple R-squared: 0.965, Adjusted R-squared: 0.9584
F-statistic: 147.1 on 3 and 16 DF, p-value: 7.383e-12

+Main Effect and Interaction Effect Model

p-values < .05 for interv41 and the interaction term.

y = 41.27 + 2.07(pretest4-4) - 14.15(interv4) + 1.6(pretest4-4)(interv4)

Pre Cut Off Group
y = 41.27 + 2.07(pretest4-4) - 14.15(0) + 1.6(pretest4-4)(0)
y = 41.27 + 2.07(pretest4-4)

Post Cut Off Group
y = 41.27 + 2.07(pretest4-4) - 14.15(1) + 1.6(pretest4-4)(1)
y = 27.12 + 3.67(pretest4-4)

+Main Effect and Interaction Effect Plot

[Figure: cut off marked]

+Regression Discontinuity Another View

[Figure: cut off marked]

+Conclusions

• Quasi-Experimental Research Design
• Beneficial when treatment assignment is done using a score system
• Uses a much larger sample size than a randomized controlled trial design
• Carefully check assumptions to avoid violations that threaten internal validity
• Analyze data visually and statistically
• Treatment effects exist when a new regression line better predicts post-test scores

+References

Lee, H., & Munk, T. (2008). Using regression discontinuity design for program evaluation. JSM. Retrieved from http://www.amstat.org/sections/srms/proceedings/y2008/Files/301149.pdf

Matthews, M. S., Peters, S. J., & Housand, A. M. (2012). Regression discontinuity design in gifted and talented education research. Gifted Child Quarterly, 56(2), 105-112. doi: 10.1177/0016986212444845

Thistlethwaite, D., & Campbell, D. (1960). Regression discontinuity analysis: An alternative to the ex post facto experiment. Journal of Educational Psychology, 51(6), 309-317.

Trochim, W. (2006). Regression-discontinuity analysis. Retrieved from http://www.socialresearchmethods.net/kb/statrd.php

Trochim, W. (2006). The regression-discontinuity design. Retrieved from http://www.socialresearchmethods.net/kb/quasird.php
