+Sharp Regression Discontinuity
Saler Axel
May 2013
Southern Methodist University
+Presentation Outline
Overview
• History
• As a Research Design
• In Educational Research
• Statistical Analysis Overview
• Assumptions
• Models
• Conclusion
Analysis and Code in R
• Computing Regression Discontinuity in R
• No Statistical Significance
• Main Effect
• Interaction Effect
• Main and Interaction Effects
+History
• Ex Post Facto Experiment:
  • Analysis in which two groups, an experimental group and a control group, are selected through matching to yield a quasi-experimental comparison.
  • Critique: One or more relevant matching variables have been inadequately controlled or overlooked.
• Thistlethwaite and Campbell (1960)
  • Developed Regression Discontinuity Analysis
  • Used when researchers are unable to randomly assign subjects to treatment and control groups
  • Does not rely upon matching to equate experimental and control groups
  • Developed to control for selection bias
+Research Design Overview
• Pretest-Posttest program-comparison design
  • The same measure (or alternate forms) is administered before and after treatment.
  • The pretest observed variable (cutoff score) is used to assign participants to treatment or comparison groups.
• Quasi-Experimental Design
  • Participants are assigned to comparison groups using an observed variable.
  • Uses a cutoff score on a pre-program measure.
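The assignment rule can be sketched in a few lines of R. The scores and the cutoff of 50 are hypothetical. Whether treatment goes to those above or below the cutoff depends on the program, and this sketch assumes it goes to scores above the cutoff.

```r
# Hypothetical pretest scores and cutoff, chosen only for illustration.
pretest <- c(12, 35, 48, 50, 51, 67, 90)
cutoff  <- 50

# Assumption: scores above the cutoff receive treatment (1);
# scores at or below it form the comparison group (0).
z <- as.integer(pretest > cutoff)

print(data.frame(pretest, z))
```

Because assignment depends only on the observed pretest score, the dummy variable z is fully determined once the cutoff is fixed.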
+Research Design Benefits
• Does not assume that treatment and comparison groups begin at equal places before treatment
  • Assumes that the pre-post relationship would be equal for the two groups without treatment.
• Strong Internal Validity
  • Does not reference changes in posttest averages between groups
  • Looks at changes in the pre-post relationship at the observed variable (cutoff)
• Ethical:
  • Randomized controlled trials can be unethical in educational research.
  • Example: Assigning highly proficient readers to a remedial Reading Recovery program solely for the purpose of research (Matthews, Peters, & Housand, 2012)
+In Educational Research
• Provides quicker treatment to those who most need or deserve it.
  • All sample groups begin treatment immediately
• Lacks a no-program comparison group
• Helpful when targeting a population of special learners
• Do not need to collect new data.
  • Can use previously collected assessment scores.
• Example of use:
  • Students who score below the observed variable (cut-off) on a formative assessment are provided supplemental resources to enhance achievement.
+Example in Educational Research
• The MAP was administered as a pre-test.
• Treatment group: Scored > 90th percentile.
• Control group: Scored < 90th percentile.
• The MAP was re-administered as a post-test.
[Figure: Total School Cluster Grouping as Measured by the Northwest Evaluation Association’s Measure of Academic Progress (MAP) Test (Matthews, Peters, & Housand, 2012)]
+Example in Educational Research
• Both regression lines have the same slope.
• The intercepts at the cut score differ.
• A treatment effect: There is a clear difference in intercept between the parallel regression lines on either side of the cut score.
[Figure: Total School Cluster Grouping as Measured by the Northwest Evaluation Association’s Measure of Academic Progress (MAP) Test (Matthews, Peters, & Housand, 2012)]
+Statistical Analysis Overview
• Treatment and comparison groups are not matched.
• Comparisons come from a predicted regression line.
• A treatment effect exists if:
  • Posttest scores in the treatment group are better predicted by a new regression line than by the regression line of the comparison group
  • A discontinuity is found at the cutoff criterion
  • The discontinuity can be a statistically significant change in either slope or y-intercept
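One way to check whether a new regression line predicts posttest scores better is to compare a single-line model against a model with separate lines on each side of the cutoff. The sketch below uses simulated data: the seed, the jump of 30, and the noise level are all assumptions made for illustration, and the F test via anova() is one possible approach rather than the method used in these slides.

```r
set.seed(42)  # hypothetical simulated data, for illustration only

pretest <- 1:20
cutoff  <- 10
treat   <- factor(as.integer(pretest > cutoff))  # 0 = comparison, 1 = treatment

# Assumed data-generating process: common slope of 2, jump of 30 at the cutoff
posttest <- 2 * pretest + 30 * (pretest > cutoff) + rnorm(20, sd = 2)

# One regression line for everyone (no treatment effect)
single_line <- lm(posttest ~ I(pretest - cutoff))

# Separate intercept and slope on each side of the cutoff
two_lines <- lm(posttest ~ I(pretest - cutoff) * treat)

# F test: do the extra intercept and slope terms improve prediction?
comparison <- anova(single_line, two_lines)
p_value <- comparison[["Pr(>F)"]][2]
print(comparison)
```

A small p-value here says only that the two-line model fits better. The individual coefficients still have to be inspected to tell a change in intercept from a change in slope.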
+Assumptions
Cutoff Criterion
• Observed variable remains constant
• Selection threat if the cutoff criterion changes
Pre-Post Distribution
• The pre-post distribution can be described by a polynomial function
• Estimates are biased if the distribution is better described by a different curve.
Comparison Group Pretest Variance
• Variability in comparison group pretest values
• Adequate estimate of pre-post regression line
Continuous Pretest Distribution
• Both groups derive from one continuous pretest distribution
• Group division determined by the cutoff criterion
Program Implementation
• All recipients receive the same treatment in the same manner.
+True Model
$y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i$
• $y_i$: Posttest outcome variable
• $\beta_0$: Intercept
• $\beta_1$: Effect of independent variable
• $x_i$: Pretest independent variable
• $\beta_2$: Effect of dummy variable
• $z_i$: Dummy variable (0 = Control, 1 = Treatment)
+Exactly Specified Fitted Model: An Unbiased Estimate
$y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + e_i$
• $e_i$: Error
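As a rough illustration of an exactly specified fit, the sketch below simulates data from a model with an intercept, a pretest effect, and a treatment dummy, then fits that same model with lm(). The parameter values, sample size, cutoff, and seed are all assumptions chosen for the example.

```r
set.seed(1)  # hypothetical simulated data, for illustration only

n      <- 200
cutoff <- 50
x <- runif(n, min = 0, max = 100)   # pretest
z <- as.integer(x > cutoff)         # 0 = control, 1 = treatment

# Assumed true parameters: beta0 = 10, beta1 = 0.5, beta2 = 5
y <- 10 + 0.5 * x + 5 * z + rnorm(n, sd = 1)

# The fitted model contains exactly the terms of the true model
exact <- lm(y ~ x + z)
print(coef(exact))
```

The estimated coefficient on z should land near the assumed treatment effect of 5, with the gap reflecting sampling error rather than misspecification.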
+Over-Specified Fitted Model: An Inefficient Estimate
$y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + \beta_3 x_i z_i + e_i$
• $\beta_3 x_i z_i$: Over-specified interaction effect
• Result: Unnecessary noise in the model.
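The loss of efficiency can be made visible by comparing the variance multiplier of the treatment coefficient with and without the unnecessary interaction term. This sketch uses simulated data with assumed parameter values; the quantity compared is the diagonal entry of the unscaled covariance matrix, $(X'X)^{-1}$, that summary() reports for an lm fit.

```r
set.seed(1)  # hypothetical simulated data, for illustration only

n      <- 200
cutoff <- 50
x <- runif(n, min = 0, max = 100)   # pretest
z <- as.integer(x > cutoff)         # 0 = control, 1 = treatment

# Assumed true model has no interaction term
y <- 10 + 0.5 * x + 5 * z + rnorm(n, sd = 1)

exact <- lm(y ~ x + z)   # exactly specified
over  <- lm(y ~ x * z)   # adds an unnecessary x:z term

# Variance multiplier for the z coefficient in each model
v_exact <- summary(exact)$cov.unscaled["z", "z"]
v_over  <- summary(over)$cov.unscaled["z", "z"]
print(c(exact = v_exact, over = v_over))
```

Adding a regressor can never shrink this multiplier, so the over-specified estimate of the treatment effect is at best as precise as the exactly specified one.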
+Under-Specified Fitted Model: A Biased Estimate
If the true model is:
$y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + \beta_3 x_i z_i$
The under-specified model could be:
$y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + e_i$
• Necessary term excluded: $\beta_3 x_i z_i$
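A small noise-free sketch can show the bias from leaving out a necessary interaction term. All values below are hypothetical: the pretest is centered at the cutoff, 15 scores fall below it and 5 above, and the assumed true model has a jump of 5 and a slope change of 1.

```r
# Hypothetical noise-free data, so any gap is bias rather than sampling error.
xc <- seq(-14.5, 4.5, by = 1)   # pretest centered at the cutoff (20 values)
z  <- as.integer(xc > 0)        # 0 = control, 1 = treatment

# Assumed true model: jump of 5 at the cutoff and a slope change of 1
y <- 10 + 0.5 * xc + 5 * z + 1 * xc * z

full  <- lm(y ~ xc * z)   # includes the interaction term
under <- lm(y ~ xc + z)   # omits the interaction term

# Estimated jump at the cutoff under each specification
effect_full  <- coef(full)[["z"]]
effect_under <- coef(under)[["z"]]
print(c(full = effect_full, under = effect_under))
```

The full model recovers the assumed jump of 5, while the under-specified model forces one common slope and pushes part of the slope change into the intercept difference.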
+Visualizing Regression Discontinuity: Creating a Cut Off
[Figure: cut off marked on the plot]
+Visualizing Regression Discontinuity: Main Effect (Change in Intercept)
[Figure: change in intercept, same slope, at the cut off]
+Visualizing Regression Discontinuity: Interaction Effect (Change in Slope)
[Figure: change in slope, same intercept, at the cut off]
+Visualizing Regression Discontinuity: Main and Interaction Effect (Change in Intercept and Slope)
[Figure: change in intercept and slope at the cut off]
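+No Statistical Significance R Code
library(lattice)

# Data on one straight line across the cut off (groups split after pretest 10)
nostatsig <- data.frame(
  pretest  = 1:20,
  posttest = c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40),
  interv1  = factor(rep(0:1, each = 10))
)

# Add random noise to the post-test scores
nostatsig$posttest <- jitter(nostatsig$posttest, factor = 10)

# Scatterplot with one regression line per group
xyplot(posttest ~ pretest, nostatsig, xlab = "Pre Test", ylab = "Post Test",
       main = "No Statistical Significance", pch = c(19, 17), groups = interv1,
       col = c("blue", "red"), type = c("p", "r"), lwd = 2, lty = c(2, 1))

# Pretest centered at 10; * fits the main effects and the interaction
first <- lm(posttest ~ I(pretest - 10) * interv1, nostatsig)
summary(first)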
+No Statistical Significance R Output
Call:
lm(formula = posttest ~ I(pretest - 10) * interv1, data = nostatsig)

Residuals:
    Min      1Q  Median      3Q     Max
-3.1834 -0.8581  0.3161  1.3056  3.1510

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)               18.8172     1.1597  16.225 2.34e-11 ***
I(pretest - 10)            1.7435     0.2172   8.026 5.32e-07 ***
interv11                   0.1424     1.7782   0.080    0.937
I(pretest - 10):interv11   0.3216     0.3072   1.047    0.311
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.973 on 16 degrees of freedom
Multiple R-squared:  0.9754, Adjusted R-squared:  0.9708
F-statistic: 211.8 on 3 and 16 DF,  p-value: 4.371e-13

p-values > .05 for interv11 and I(pretest - 10):interv11
+No Statistical Significance Model
$y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + \beta_3 x_i z_i + e_i$
y = 18.82 + 1.74(pretest-10) + .14(interv1) + .32(pretest-10)(interv1)

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)               18.8172     1.1597  16.225 2.34e-11 ***
I(pretest - 10)            1.7435     0.2172   8.026 5.32e-07 ***
interv11                   0.1424     1.7782   0.080    0.937
I(pretest - 10):interv11   0.3216     0.3072   1.047    0.311
p-values > .05 for interv11 and I(pretest - 10):interv11

Pre Cut Off Group
y = 18.82 + 1.74(pretest-10) + .14(0) + .32(pretest-10)(0)
y = 18.82 + 1.74(pretest-10)

Post Cut Off Group
y = 18.82 + 1.74(pretest-10) + .14(1) + .32(pretest-10)(1)
y = 18.96 + 2.06(pretest-10)
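The substitution of 0 and 1 for the dummy variable can be checked with a few lines of R. The coefficients are the rounded values from this slide; group_line() is a hypothetical helper name introduced only for this sketch.

```r
# Rounded coefficients copied from the fitted model on this slide
b0 <- 18.82  # intercept
b1 <- 1.74   # slope on (pretest - 10)
b2 <- 0.14   # interv1: change in intercept
b3 <- 0.32   # interaction: change in slope

# Hypothetical helper: intercept and slope for the group coded z (0 or 1)
group_line <- function(z) {
  c(intercept = b0 + b2 * z, slope = b1 + b3 * z)
}

pre_line  <- group_line(0)  # pre cut off group
post_line <- group_line(1)  # post cut off group
print(pre_line)
print(post_line)
```

The same helper applies to the later models once b0 through b3 are replaced with that model's coefficients.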
+No Statistical Significance Plot
[Figure: pre-test vs. post-test plot with the cut off marked]
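+Main Effect R Code
library(lattice)

# Same slope on both sides, but post-test scores jump after the cut off
maineffect <- data.frame(
  pretest2  = 1:20,
  posttest2 = c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128),
  interv2   = factor(rep(0:1, each = 10))
)

# Add random noise to the post-test scores
maineffect$posttest2 <- jitter(maineffect$posttest2, factor = 10)

# Scatterplot with one regression line per group
xyplot(posttest2 ~ pretest2, maineffect, xlab = "Pre Test", ylab = "Post Test",
       main = "Main Effect", pch = c(19, 17), groups = interv2,
       col = c("blue", "red"), type = c("p", "r"), lwd = 2, lty = c(2, 1))

# Pretest centered at 10; * fits the main effects and the interaction
second <- lm(posttest2 ~ I(pretest2 - 10) * interv2, maineffect)
summary(second)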
+Main Effect R Output
Call:
lm(formula = posttest2 ~ I(pretest2 - 10) * interv2, data = maineffect)

Residuals:
    Min      1Q  Median      3Q     Max
-5.1999 -1.4300  0.5758  1.9021  2.9837

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)
(Intercept)                20.1134     1.4813  13.578 3.37e-10 ***
I(pretest2 - 10)            2.0396     0.2775   7.350 1.63e-06 ***
interv21                   87.2021     2.2713  38.394  < 2e-16 ***
I(pretest2 - 10):interv21   0.2818     0.3924   0.718    0.483
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.52 on 16 degrees of freedom
Multiple R-squared:  0.9983, Adjusted R-squared:  0.998
F-statistic:  3167 on 3 and 16 DF,  p-value: < 2.2e-16

p-value < .05 for interv21
p-value > .05 for I(pretest2 - 10):interv21
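Rather than reading p-values off the printed table, they can be pulled from the fitted model directly. The sketch below rebuilds the main effect data and model without the plot. The seed is an assumption (the slides do not set one), so the jittered values and the estimates will differ from the output shown on the slide.

```r
set.seed(2013)  # assumed seed; the slides do not set one

maineffect <- data.frame(
  pretest2  = 1:20,
  posttest2 = c(seq(2, 20, by = 2), seq(110, 128, by = 2)),
  interv2   = factor(rep(0:1, each = 10))
)
maineffect$posttest2 <- jitter(maineffect$posttest2, factor = 10)

second <- lm(posttest2 ~ I(pretest2 - 10) * interv2, maineffect)

# coef(summary(...)) returns the coefficient table as a matrix;
# the "Pr(>|t|)" column holds the p-values
p_values    <- coef(summary(second))[, "Pr(>|t|)"]
significant <- p_values < 0.05

print(round(p_values, 4))
print(significant)
```

Setting a seed before jitter() makes the analysis reproducible, which is useful when the numbers need to match across a slide, a plot, and a written report.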
+Main Effect Model
$y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + \beta_3 x_i z_i + e_i$
y = 20.11 + 2.04(pretest2-10) + 87.2(interv2) + .28(pretest2-10)(interv2)

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)
(Intercept)                20.1134     1.4813  13.578 3.37e-10 ***
I(pretest2 - 10)            2.0396     0.2775   7.350 1.63e-06 ***
interv21                   87.2021     2.2713  38.394  < 2e-16 ***
I(pretest2 - 10):interv21   0.2818     0.3924   0.718    0.483
p-value < .05 for interv21
p-value > .05 for I(pretest2 - 10):interv21

Pre Cut Off Group
y = 20.11 + 2.04(pretest2-10) + 87.2(0) + .28(pretest2-10)(0)
y = 20.11 + 2.04(pretest2-10)

Post Cut Off Group
y = 20.11 + 2.04(pretest2-10) + 87.2(1) + .28(pretest2-10)(1)
y = 107.31 + 2.32(pretest2-10)
+Main Effect Plot
[Figure: pre-test vs. post-test plot showing a change in intercept at the cut off]
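+Interaction Effect R Code
library(lattice)

# Post-test scores whose slope changes across the groups
inteffect <- data.frame(
  pretest3  = 1:20,
  posttest3 = c(84:98, 120, 122, 124, 126, 128),
  interv3   = factor(rep(0:1, each = 10))
)

# Add random noise to the post-test scores
inteffect$posttest3 <- jitter(inteffect$posttest3, factor = 10)

# Scatterplot with one regression line per group
xyplot(posttest3 ~ pretest3, inteffect, xlab = "Pre Test", ylab = "Post Test",
       main = "Interaction Effect", pch = c(19, 17), groups = interv3,
       col = c("blue", "red"), type = c("p", "r"), lwd = 2, lty = c(2, 1))

# Pretest centered at 10.5; * fits the main effects and the interaction
third <- lm(posttest3 ~ I(pretest3 - 10.5) * interv3, inteffect)
summary(third)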
+Interaction Effect R Output
Call:
lm(formula = posttest3 ~ I(pretest3 - 10.5) * interv3, data = inteffect)

Residuals:
    Min      1Q  Median      3Q     Max
-9.3752 -1.1839  0.2121  1.7872  6.8703

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)
(Intercept)                  93.8177     2.5205  37.223  < 2e-16 ***
I(pretest3 - 10.5)            1.1182     0.4371   2.558   0.0211 *
interv31                     -6.7534     3.5645  -1.895   0.0764 .
I(pretest3 - 10.5):interv31   3.4253     0.6182   5.541 4.47e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.97 on 16 degrees of freedom
Multiple R-squared:  0.9424, Adjusted R-squared:  0.9316
F-statistic: 87.32 on 3 and 16 DF,  p-value: 3.918e-10

p-value > .05 for interv31
p-value < .05 for I(pretest3 - 10.5):interv31
+Interaction Effect Model
$y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + \beta_3 x_i z_i + e_i$
y = 93.82 + 1.12(pretest3-10.5) - 6.75(interv3) + 3.43(pretest3-10.5)(interv3)

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)
(Intercept)                  93.8177     2.5205  37.223  < 2e-16 ***
I(pretest3 - 10.5)            1.1182     0.4371   2.558   0.0211 *
interv31                     -6.7534     3.5645  -1.895   0.0764 .
I(pretest3 - 10.5):interv31   3.4253     0.6182   5.541 4.47e-05 ***
p-value > .05 for interv31
p-value < .05 for I(pretest3 - 10.5):interv31

Pre Cut Off Group
y = 93.82 + 1.12(pretest3-10.5) - 6.75(0) + 3.43(pretest3-10.5)(0)
y = 93.82 + 1.12(pretest3-10.5)

Post Cut Off Group
y = 93.82 + 1.12(pretest3-10.5) - 6.75(1) + 3.43(pretest3-10.5)(1)
y = 87.07 + 4.55(pretest3-10.5)
+Interaction Effect Plot
[Figure: pre-test vs. post-test plot showing a change in slope at the cut off]
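+Main Effect and Interaction Effects R Code
library(lattice)

# The grouping column is named interv4 so that it matches the lm() call below
maininteffect <- data.frame(
  pretest4  = 1:20,
  posttest4 = c(36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 68, 80, 82, 84, 86),
  interv4   = factor(rep(0:1, each = 10))
)

# Add random noise to the post-test scores
maininteffect$posttest4 <- jitter(maininteffect$posttest4, factor = 10)

# Scatterplot with one regression line per group
xyplot(posttest4 ~ pretest4, maininteffect, xlab = "Pre Test", ylab = "Post Test",
       main = "Main Effects", pch = c(19, 17), groups = interv4,
       col = c("blue", "red"), type = c("p", "r"), lwd = 2, lty = c(2, 1))

# Pretest centered at 4; * fits the main effects and the interaction
fourth <- lm(posttest4 ~ I(pretest4 - 4) * interv4, maininteffect)
summary(fourth)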
+Main Effect and Interaction Effect R Output
Call:
lm(formula = posttest4 ~ I(pretest4 - 4) * interv4, data = maininteffect)

Residuals:
    Min      1Q  Median      3Q     Max
-6.5819 -2.0496 -0.4671  1.9956  5.5435

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)               41.2661     1.1466  35.989  < 2e-16 ***
I(pretest4 - 4)            2.0676     0.3539   5.843  2.5e-05 ***
interv41                 -14.1485     4.3483  -3.254  0.00498 **
I(pretest4 - 4):interv41   1.5968     0.5004   3.191  0.00568 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.214 on 16 degrees of freedom
Multiple R-squared:  0.965, Adjusted R-squared:  0.9584
F-statistic: 147.1 on 3 and 16 DF,  p-value: 7.383e-12

p-values < .05 for interv41 and I(pretest4 - 4):interv41
+Main Effect and Interaction Effect Model
$y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + \beta_3 x_i z_i + e_i$
y = 41.27 + 2.07(pretest4-4) - 14.15(interv4) + 1.6(pretest4-4)(interv4)

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)               41.2661     1.1466  35.989  < 2e-16 ***
I(pretest4 - 4)            2.0676     0.3539   5.843  2.5e-05 ***
interv41                 -14.1485     4.3483  -3.254  0.00498 **
I(pretest4 - 4):interv41   1.5968     0.5004   3.191  0.00568 **
p-values < .05 for interv41 and I(pretest4 - 4):interv41

Pre Cut Off Group
y = 41.27 + 2.07(pretest4-4) - 14.15(0) + 1.6(pretest4-4)(0)
y = 41.27 + 2.07(pretest4-4)

Post Cut Off Group
y = 41.27 + 2.07(pretest4-4) - 14.15(1) + 1.6(pretest4-4)(1)
y = 27.12 + 3.67(pretest4-4)
+Main Effect and Interaction Effect Plot
[Figure: pre-test vs. post-test plot showing a change in intercept and slope at the cut off]
+Regression Discontinuity Another View
[Figure: alternative view showing a change in intercept and slope at the cut off]
+Conclusions
• Quasi-Experimental Research Design
• Beneficial when treatment assignment is done using a score system
• Requires a much larger sample size than a randomized controlled trial design
• Carefully check assumptions; violations threaten internal validity
• Analyze data visually and statistically
• Treatment effects exist when a new regression line better predicts post-test scores
+References
Lee, H., & Munk, T. (2008). Using regression discontinuity design for program evaluation. JSM. Retrieved from http://www.amstat.org/sections/srms/proceedings/y2008/Files/301149.pdf
Matthews, M. S., Peters, S. J., & Housand, A. M. (2012). Regression discontinuity design in gifted and talented education research. Gifted Child Quarterly, 56(2), 105-112. doi: 10.1177/0016986212444845
Thistlethwaite, D., & Campbell, D. (1960). Regression discontinuity analysis: An alternative to the ex post facto experiment. Journal of Educational Psychology, 51(6), 309-317.
Trochim, W. (2006). Regression-discontinuity analysis. Retrieved from http://www.socialresearchmethods.net/kb/statrd.php
Trochim, W. (2006). The regression-discontinuity design. Retrieved from http://www.socialresearchmethods.net/kb/quasird.php