
What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Page 1: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Replacement Cases Framework

overview

• Thresholds for inference and % bias to invalidate
• The counterfactual paradigm
• Internal validity example: kindergarten retention
• External validity example: Open Court curriculum
• Extensions of the framework

Abstract

We contribute to debate about causal inferences in educational research in two ways. First, we quantify how much bias there must be in an estimate to invalidate an inference. Second, we utilize Rubin’s causal model (RCM) to interpret the bias necessary to invalidate an inference in terms of sample replacement. We apply our analysis to an inference of a positive effect of Open Court Curriculum on reading achievement from a randomized experiment, and an inference of a negative effect of kindergarten retention on reading achievement from an observational study. We consider details of our framework, and then discuss how our approach informs judgment of inference relative to study design. We conclude with implications for scientific discourse.

Keywords: causal inference; Rubin’s causal model; sensitivity analysis; observational studies

Frank, K. A., Maroulis, S., Duong, M., and Kelcey, B. (2013). What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences. Educational Evaluation and Policy Analysis. Vol. 35, pp. 437–460. http://epa.sagepub.com/content/early/recent

Page 2: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Figure 1. Estimated Treatment Effects in Hypothetical Studies A and B Relative to a Threshold for Inference

[Bar chart. x-axis: Study (A, B). y-axis: Estimated Effect (0 to 9). Legend: above threshold, below threshold. A horizontal line marks the threshold, and a bracket marks the % bias necessary to invalidate the inference.]

Page 3: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Quantifying the Discourse: Formalizing Bias Necessary to Invalidate an Inference

$\delta$ = a population effect, $\hat{\delta}$ = the estimated effect, and $\delta^{\#}$ = the threshold for making an inference.

An inference is invalid if:

$$\hat{\delta} > \delta^{\#} > \delta. \tag{1}$$

That is, an inference is invalid if the estimate is greater than the threshold while the population value is less than the threshold.

Defining bias as $\hat{\delta} - \delta$, (1) implies an inference is invalid if and only if:

$$\text{bias} > \hat{\delta} - \delta^{\#}. \tag{2}$$

Expressed as a proportion of the estimate, the inference is invalid if:

$$\%\,\text{bias} > \frac{\hat{\delta} - \delta^{\#}}{\hat{\delta}} = 1 - \frac{\delta^{\#}}{\hat{\delta}}.$$
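As a minimal sketch, the proportion on this slide (one minus the threshold divided by the estimate) can be written as a small Python function. The name `pct_bias_to_invalidate` and its argument checks are illustrative assumptions, not part of the original slides or spreadsheet. The sketch assumes the estimate and threshold share a sign and that the estimate exceeds the threshold in magnitude.

```python
def pct_bias_to_invalidate(estimate: float, threshold: float) -> float:
    """Return 1 - threshold/estimate: the share of the estimate that
    would have to be due to bias to invalidate the inference.

    Hypothetical helper (not from the slides). Assumes the estimate and
    threshold have the same sign and |estimate| > |threshold|.
    """
    if estimate == 0:
        raise ValueError("estimate must be nonzero")
    if estimate * threshold < 0:
        raise ValueError("estimate and threshold must share a sign")
    if abs(estimate) <= abs(threshold):
        raise ValueError("estimate does not exceed the threshold")
    return 1.0 - threshold / estimate


# Illustrative values: an estimate of 6 against a threshold of 4.
example = pct_bias_to_invalidate(6.0, 4.0)
print(f"{example:.0%} of the estimate must be due to bias")
```

Because the function works with the ratio of threshold to estimate, it applies unchanged to negative effects, provided the threshold is entered with the same sign as the estimate.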

Page 4: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

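Figure 1. Estimated Treatment Effects in Hypothetical Studies A and B Relative to a Threshold for Inference

[Bar chart. x-axis: Study (A, B). y-axis: Estimated Effect (0 to 9). Legend: above threshold, below threshold. The threshold $\delta^{\#}$ is marked, and a bracket marks the % bias necessary to invalidate the inference.]

$$\%\,\text{bias to invalidate} = \frac{\hat{\delta} - \delta^{\#}}{\hat{\delta}} = 1 - \frac{\delta^{\#}}{\hat{\delta}} = 1 - \frac{4}{6} = \frac{1}{3} \approx 33\%$$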

Page 5: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Interpretation of % Bias to Invalidate an Inference

• % Bias is intuitive
• Relates to how we think about statistical significance
• Better than “highly significant” or “barely significant”
• But need a framework for interpreting

Page 6: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Framework for Interpreting % Bias to Invalidate an Inference: Rubin’s Causal Model and the Counterfactual

1) I have a headache
2) I take an aspirin (treatment)
3) My headache goes away (outcome)

Q) Is it because I took the aspirin?
A) We’ll never know for the individual, because it is counterfactual.

This is the Fundamental Problem of Causal Inference.

Page 7: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Definition of Replacement Cases as Counterfactual: Potential Outcomes

Definition of treatment effect for individual i:

$$\delta_i = Y_i^{t} - Y_i^{c}$$

$Y_i^{t}$ = value on outcome if unit received treatment
$Y_i^{c}$ = value on outcome if unit received control

The fundamental problem of causal inference is that we cannot simultaneously observe $Y_i^{t}$ and $Y_i^{c}$.

Page 8: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Fundamental Problem of Inference and Approximating the Counterfactual with Observed Data (Internal Validity)

[Diagram: values 9, 10, 11 and 3, 4, 5, with three entries marked “6?”.]

But how well does the observed data approximate the counterfactual?

Page 9: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Symbolic: Fundamental Problem of Inference and Approximating the Counterfactual with Observed Data (Internal Validity)

[Diagram: four cells labeled Yt|X=t, Yc|X=t, Yc|X=c, and Yt|X=c, with three entries marked “6?”.]

But how well does the observed data approximate the counterfactual?

Page 10: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

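Approximating the Counterfactual with Observed Data

[Diagram: observed control outcomes 3, 4, 5 (mean 4); counterfactual outcomes 8, 9, 10 (mean 9); case-level treatment effects 1, 1, 1; estimated effect 6.]

But how well does the observed data approximate the counterfactual?

The difference between the counterfactual values and the observed values for the control implies that the treatment effect of 1 is overestimated as 6 when the observed control cases, with a mean of 4, are used.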
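The arithmetic on this slide can be sketched in Python as follows. The treated outcomes 9, 10, 11 are taken from the other diagrams in this deck. Reading 8, 9, 10 as the counterfactual outcomes those treated cases would have had under control is an assumption based on the flattened slide text, and the variable names are illustrative.

```python
from statistics import mean

# Observed outcomes (values as they appear in the deck's diagrams).
treated_observed = [9, 10, 11]   # outcomes for treated cases
control_observed = [3, 4, 5]     # outcomes for observed control cases (mean 4)

# Assumed reading of the slide: what the treated cases would have
# scored had they received control (never observable in practice).
treated_counterfactual = [8, 9, 10]

# Estimate using the observed controls as a stand-in for the counterfactual.
estimated_effect = mean(treated_observed) - mean(control_observed)

# Effect using the (hypothetically known) counterfactual values.
true_effect = mean(treated_observed) - mean(treated_counterfactual)

# Bias is the gap between the estimate and the true effect.
bias = estimated_effect - true_effect

print(estimated_effect, true_effect, bias)
```

The point of the sketch is only that the observed controls and the counterfactual values differ by 5 on average, which is exactly the amount by which the estimate overstates the effect.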

Page 11: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

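Using the Counterfactual to Interpret % Bias to Invalidate the Inference

How many cases would you have to replace with zero effect counterfactuals to change the inference?

Assume the threshold is 4 ($\delta^{\#} = 4$): $1 - \delta^{\#}/\hat{\delta} = 1 - 4/6 = .33 = 1/3$

The inference would be invalid if you replaced 33% (or 1 case) with counterfactuals for which there was no treatment effect.

New estimate = (1 - % replaced) $\hat{\delta}$ + % replaced (no effect) = (1 - % replaced) $\hat{\delta}$ = (1 - 1/3) 6 = (2/3) 6 = 4

[Diagram: treated outcomes 9, 10, 11; observed control outcomes 3, 4, 5; case-level estimates 6, 6, 6 (mean 6.00); replacement cases with effect 0; the estimate falls from 6 to 4.]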
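The replacement logic can be checked with a short Python sketch. The function name `estimate_after_replacement` and the three case-level effects of 6 are illustrative assumptions that mirror the toy example on this slide. This is not the authors' spreadsheet.

```python
def estimate_after_replacement(estimate: float,
                               share_replaced: float,
                               replacement_effect: float = 0.0) -> float:
    """Weighted average of the original estimate and the effect in the
    replacement cases (zero effect by default). Hypothetical helper."""
    if not 0.0 <= share_replaced <= 1.0:
        raise ValueError("share_replaced must be between 0 and 1")
    return (1.0 - share_replaced) * estimate + share_replaced * replacement_effect


# Toy example: three cases, each with an estimated effect of 6.
case_effects = [6.0, 6.0, 6.0]

# Replace one case (one third of the sample) with a zero-effect case.
replaced_effects = [0.0] + case_effects[1:]
new_estimate = sum(replaced_effects) / len(replaced_effects)

# The closed-form expression gives the same answer.
closed_form = estimate_after_replacement(6.0, 1.0 / 3.0)

print(new_estimate, closed_form)
```

Replacing one of three cases brings the estimate down to the threshold of 4, which is why 33% is the share needed to invalidate the inference in this example.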

Page 12: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

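Figure 1. Estimated Treatment Effects in Hypothetical Studies A and B Relative to a Threshold for Inference

[Bar chart. x-axis: Study (A, B). y-axis: Estimated Effect (0 to 9). Legend: above threshold, below threshold. The threshold $\delta^{\#}$ is marked, and a bracket marks the % bias necessary to invalidate the inference.]

$$\%\,\text{bias to invalidate} = \frac{\hat{\delta} - \delta^{\#}}{\hat{\delta}} = 1 - \frac{\delta^{\#}}{\hat{\delta}} = 1 - \frac{4}{6} = \frac{1}{3} \approx 33\%$$

To invalidate the inference, replace 33% of cases with counterfactual data with zero effect.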

Page 13: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Fundamental Problem of Inference to an Unsampled Population (External Validity)

But how well does the observed data represent both populations?

[Diagram: values 9, 10, 11 and 3, 4, 5; values 8, 8, 8 and 6, 6, 6; values 6 and 4; one panel labeled “counterfactual”.]

Page 14: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

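Fundamental Problem of Inference and Approximating the Unsampled Population with Observed Data (External Validity)

[Diagram: sampled population p with treated outcomes Yt|Z=p of 9, 10, 11 and control outcomes Yc|Z=p of 3, 4, 5 (estimated effect 6); unsampled population p´ with Yt|Z=p´ = 6 and Yc|Z=p´ = 6 (effect 0); the estimate moves from 6 to 4.]

How many cases would you have to replace with cases with zero effect to change the inference?

Assume the threshold is $\delta^{\#} = 4$: $1 - \delta^{\#}/\hat{\delta} = 1 - 4/6 = .33 = 1/3$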

Page 15: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

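Figure 1. Estimated Treatment Effects in Hypothetical Studies A and B Relative to a Threshold for Inference

[Bar chart. x-axis: Study (A, B). y-axis: Estimated Effect (0 to 9). Legend: above threshold, below threshold. The threshold $\delta^{\#}$ is marked, and a bracket marks the % bias necessary to invalidate the inference.]

$$\%\,\text{bias to invalidate} = \frac{\hat{\delta} - \delta^{\#}}{\hat{\delta}} = 1 - \frac{\delta^{\#}}{\hat{\delta}} = 1 - \frac{4}{6} = \frac{1}{3} \approx 33\%$$

To invalidate the inference, replace 33% of cases with cases from unsampled population data with zero effect.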

Page 16: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Review & Reflection

Review of Framework
• Pragmatism: thresholds
• How much does an estimate exceed the threshold? % bias to invalidate the inference
• Interpretation: Rubin’s causal model
  – internal validity: % bias to invalidate corresponds to the number of cases that must be replaced with counterfactual cases (for which there is no effect)
  – external validity: % bias to invalidate corresponds to the number of cases that must be replaced with cases from the unobserved population (for which there is no effect)

Reflect
• Which part is most confusing to you?
• Is there more than one interpretation?
• Discuss with a partner or two

Page 17: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Example of Internal Validity from an Observational Study: The Effect of Kindergarten Retention on Reading and Math Achievement (Hong and Raudenbush 2005)

1. What is the average effect of kindergarten retention policy? (Example used here)
   Should we expect to see a change in children’s average learning outcomes if a school changes its retention policy?

Propensity based questions (not explored here)

2. What is the average impact of a school’s retention policy on children who would be promoted if the policy were adopted?
   Use principal stratification.

Hong, G. and Raudenbush, S. (2005). Effects of Kindergarten Retention Policy on Children’s Cognitive Growth in Reading and Mathematics. Educational Evaluation and Policy Analysis. Vol. 27, No. 3, pp. 205–224

Page 18: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Data

• Early Childhood Longitudinal Study, Kindergarten cohort (ECLS-K)
  – US National Center for Education Statistics (NCES)
• Nationally representative
• Kindergarten and 1st grade
  – observed Fall 1998, Spring 1998, Spring 1999
• Student
  – background and educational experiences
  – math and reading achievement (dependent variable)
  – experience in class
• Parenting information and style
• Teacher assessment of student
• School conditions
• Analytic sample (1,080 schools that do retain some children)
  – 471 kindergarten retainees
  – 10,255 promoted students

Page 19: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Effect of Retention on Reading Scores (Hong and Raudenbush)

Page 20: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Possible Confounding Variables (note they controlled for these)

• Gender
• Two Parent Household
• Poverty
• Mother’s level of Education (especially relevant for reading achievement)
• Extensive pretests
  – measured in the Spring of 1999 (at the beginning of the second year of school)
  – standardized measures of reading ability, math ability, and general knowledge;
  – indirect assessments of literature, math and general knowledge that include aspects of a child’s process as well as product;
  – teacher’s rating of the child’s skills in language, math, and science

Page 21: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Calculating the % Bias to Invalidate the Inference: Obtain spreadsheet

• From https://www.msu.edu/~kenfrank/research.htm#causal
• Choose spreadsheet for calculating indices
• Access spreadsheet

Page 22: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Calculating % Bias to Invalidate an Inference

Choose % bias to invalidate

Page 23: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Obtain t critical, estimated effect and standard error

Estimated effect ($\hat{\delta}$) = -9.01
Standard error = .68
n = 7168 + 471 = 7639; df > 500, t critical = -1.96

From: Hong, G. and Raudenbush, S. (2005). Effects of Kindergarten Retention Policy on Children’s Cognitive Growth in Reading and Mathematics. Educational Evaluation and Policy Analysis. Vol. 27, No. 3, pp. 205–224

Page 24: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Calculating the % Bias to Invalidate the Inference: Entering Values and Calculating

$\hat{\delta}$ = the estimated effect = -9.01
standard error = .68
t critical = -1.96

$\delta^{\#}$ = the threshold for making an inference = se × t critical (df > 230) = .68 × -1.96 = -1.33
[user can specify alternative threshold]

% Bias necessary to invalidate inference = $1 - \delta^{\#}/\hat{\delta}$ = 1 - (-1.33)/(-9.01) = 85%

85% of the estimate must be due to bias to invalidate the inference.
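The same calculation can be reproduced outside the spreadsheet. The Python sketch below uses the values reported on this slide; the variable names are illustrative, and the code is not the authors' spreadsheet logic. Note that the threshold carries the same negative sign as the estimate, so the ratio of the two is positive.

```python
# Values reported on the slide for the kindergarten retention example.
estimate = -9.01        # estimated effect of retention on reading
standard_error = 0.68
t_critical = -1.96      # negative, matching the direction of the effect

# Threshold for inference: the smallest estimate (in magnitude) that
# would still be statistically significant at this standard error.
threshold = standard_error * t_critical

# Share of the estimate that must be due to bias to invalidate the inference.
pct_bias = 1.0 - threshold / estimate

print(f"threshold = {threshold:.2f}")
print(f"% bias to invalidate = {pct_bias:.0%}")
```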

Page 26: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Example Replacement of Cases with Counterfactual Data to Invalidate Inference of an Effect of Kindergarten Retention

[Figure: distributions for Retained and Promoted students. Legend: original distribution; replacement counterfactual cases with zero effect. Annotations: “Counterfactual: promoted students, if they had been retained”; “Comparison in observed data”.]

To invalidate, 85% of promoted students would have to have had most (7.2) of their advantage (conditional on pretests, motivation, SES, etc.) if all had been retained.

Page 27: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Example Replacement of Cases with Counterfactual Data to Invalidate Inference of an Effect of Kindergarten Retention

[Figure: distributions for Retained and Promoted students. Legend: original distribution; original cases that were not replaced; replacement counterfactual cases with zero effect. Annotations: “Counterfactual: promoted students, if they had been retained”; “Comparison in observed data”.]

To invalidate, 85% of promoted students would have to have had most (7.2) of their advantage (conditional on pretests, motivation, SES, etc.) if all had been retained.

Page 28: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Interpretation

1) Consider test scores of a set of children who were retained that are considerably lower (9 points) than others who were candidates for retention but who were in fact promoted. No doubt some of the difference is due to advantages the comparable others had before being promoted. But now to believe that retention did not have an effect one must believe that 85% of those comparable others would have enjoyed most (7.2) of their advantages whether or not they had been retained.

This is even after controlling for differences on pretests, mother’s education, etc.

2) The replacement cases would come from the counterfactual condition for the observed outcomes. That is, 85% of the observed potential outcomes must be unexchangeable with the unobserved counterfactual potential outcomes such that it is necessary to replace those 85% with the counterfactual potential outcomes to make an inference in this sample. Note that this replacement must occur even after observed cases have been conditioned on background characteristics, school membership, and pretests used to define comparable groups.

Page 29: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Evaluation of % Bias Necessary to Invalidate Inference

Compare Bias Necessary to Invalidate Inference with Bias Accounted for by Background Characteristics
• 1% of estimated effect accounted for by background characteristics (including mother’s education), once controlling for pretests
• More than 85 times more unmeasured bias necessary to invalidate the inference

Compare with % Bias necessary to invalidate inference in other studies
• Use correlation metric
  – Adjusts for differences in scale

Page 30: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

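% Bias Necessary to Invalidate Inference Based on Correlation to Compare across Studies

$$r = \frac{t}{\sqrt{(n - q - 1) + t^{2}}} = \frac{13.25}{\sqrt{(7639 - 2) + (13.25)^{2}}} = .150$$

t is taken from the HLM: $t = \hat{\delta}/se = -9.01/.68 = -13.25$ (its absolute value is used above); n is the sample size; q is the number of parameters estimated.

$$\text{threshold} = r^{\#} = \frac{t_{critical}}{\sqrt{(n - q - 1) + t_{critical}^{2}}} = \frac{1.96}{\sqrt{(7639 - 2) + 1.96^{2}}} = .022$$

where $t_{critical}$ is the critical value for df > 200.

% bias to invalidate inference = 1 - .022/.150 = 85%

This accounts for changes in both the regression coefficient and the standard error, because t(r) = t(β).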
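A minimal Python sketch of the correlation-metric calculation follows. The helper name `t_to_r` is an illustrative assumption, and q = 1 is inferred from the slide's use of (7639 - 2) for n - q - 1 rather than stated directly. Absolute values of t are used so that the observed correlation and the threshold are compared on the same side of zero.

```python
import math


def t_to_r(t: float, n: int, q: int) -> float:
    """Convert a t-ratio to a correlation: r = t / sqrt((n - q - 1) + t^2).

    Hypothetical helper; n is the sample size and q the number of
    parameters estimated, following the slide's definitions.
    """
    return t / math.sqrt((n - q - 1) + t ** 2)


n = 7639            # sample size reported on the slide
q = 1               # assumption: inferred from (7639 - 2) = n - q - 1

t_observed = abs(-9.01 / 0.68)   # estimate / standard error, in magnitude
t_critical = 1.96

r_observed = t_to_r(t_observed, n, q)
r_threshold = t_to_r(t_critical, n, q)

# % bias to invalidate, now expressed in the correlation metric.
pct_bias = 1.0 - r_threshold / r_observed

print(f"r = {r_observed:.3f}, threshold = {r_threshold:.3f}, "
      f"% bias = {pct_bias:.0%}")
```

Working in the correlation metric gives essentially the same 85% as the raw-coefficient calculation here, while putting studies with different outcome scales on a common footing.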

Page 31: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Calculating % Bias to Invalidate in terms of Correlations to Compare Across Studies

Choose impact and replacement

Page 32: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Compare with Bias in Other Observational Studies

Page 34: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

% Bias to Invalidate Inference for Observational Studies On-line in EEPA, July 24-Nov 15, 2012

[Figure: % bias to invalidate the inference across studies; the kindergarten retention effect is labeled.]

Page 35: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Exercise 1: % Bias Necessary to Invalidate an Inference

• Take an example from an observational study in your own data or an article
• Calculate the % bias necessary to invalidate the inference
  – Interpret the % bias in terms of sample replacement
  – What are the possible sources of bias?
  – Would they all work in the same direction?
• Debate your inference with a partner

Page 36: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Application to Randomized Experiment: Effect of Open Court Curriculum on Reading Achievement

• Open Court “scripted” curriculum versus business as usual
• 917 elementary students in 49 classrooms
• Comparisons within grade and school
• Outcome measure: Terra Nova comprehensive reading score

Borman, G. D., Dowling, N. M., and Schneck, C. (2008). A multi-site cluster randomized field trial of Open Court Reading. Educational Evaluation and Policy Analysis, 30(4), 389-407.

Page 39: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Value of Randomization

• Few differences between groups
• But done at classroom level
  – Teachers might talk to each other
  – School level is expensive (Slavin, 2009)

Page 40: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

n=27+22=49

Page 41: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Obtaining # parameters estimated, t critical, estimated effect and standard error

Estimated effect ($\hat{\delta}$) = 7.95
Standard error = 1.83
3 parameters estimated
df = number of classrooms - number of parameters estimated = 49 - 3 = 46
t critical = t(.05, df = 46) = 2.013

Page 42: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Differences between Open Court and Business as Usual

• Difference across grades: about 10 units
• 7.95 using statistical model
• “Statistically significant”: unlikely (probability < 5%) to have occurred by chance alone if there were really no differences in the population
• But is the inference about Open Court valid in other contexts?

Page 43: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Quantifying the Discourse for Borman et al.: What would it take to change the inference?

$\delta$ = a population effect
$\hat{\delta}$ = the estimated effect = 7.95
$\delta^{\#}$ = the threshold for making an inference = $se \times t_{\text{critical},\,df=46}$ = 1.83 × 2.013 = 3.68

% bias necessary to invalidate the inference = $1 - \delta^{\#}/\hat{\delta}$ = 1 − 3.68/7.95 = 54%

54% of the estimate must be due to bias to invalidate the inference
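
The arithmetic on this slide can be reproduced in a few lines. The following is only a minimal sketch: the function names `threshold` and `pct_bias_to_invalidate` are hypothetical, and the critical value 2.013 is taken from the slides rather than computed.

```python
def threshold(se, t_critical):
    """Smallest estimate that would still reach significance (delta#)."""
    return se * t_critical


def pct_bias_to_invalidate(estimate, delta_sharp):
    """Share of the estimate that must be bias for it to fall to delta#."""
    return 1 - delta_sharp / estimate


# Values reported on the slides for the Open Court example.
n_classrooms = 27 + 22
df = n_classrooms - 3            # 3 parameters estimated
estimate, se, t_critical = 7.95, 1.83, 2.013

delta_sharp = threshold(se, t_critical)
pct_bias = pct_bias_to_invalidate(estimate, delta_sharp)
print(f"df = {df}, threshold = {delta_sharp:.2f}, % bias = {pct_bias:.0%}")
```

Passing a different second argument to `pct_bias_to_invalidate` corresponds to overriding the threshold with a user-specified value.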

Page 44: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Calculating the % Bias to Invalidate the Inference: Entering Values and Calculating

$\hat{\delta}$ = the estimated effect = 7.95
standard error = 1.83
t critical = 2.013

$\delta^{\#}$ = the threshold for making an inference = $se \times t_{\text{critical},\,df=46}$ = 1.83 × 2.013 = 3.68 [user can override to specify threshold]

% bias necessary to invalidate the inference = $1 - \delta^{\#}/\hat{\delta}$ = 1 − 3.68/7.95 = 54%

54% of the estimate must be due to bias to invalidate the inference.

Page 45: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

[Figure: % Exceeding Threshold for Open Court. Bar for OCR on an Estimated Effect axis running from 0 to 9, divided into the part below the threshold and the part above the threshold; $\hat{\delta}$ = 7.95, $\delta^{\#}$ = 3.68.]

54% above threshold = 1 − 3.68/7.95 = .54

54% of the estimate must be due to bias to invalidate the inference

Page 46: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Fundamental Problem of Inference to an Unsampled Population (External Validity)

But how well does the observed data represent both populations?

[Figure: case values shown on the slide: 9 10 11 3 4 5; 8 8 8 6 6 6; 6 4]

Page 47: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Fundamental Problem of Inference and Approximating the Counterfactual with Observed Data (External Validity)

[Figure: case values shown on the slide: 9 10 11 3 4 5; 6 6 6 6; 6 4]

How many cases would you have to replace with cases with zero effect to change the inference?
Assume the threshold is $\delta^{\#} = 4$: $1 - \delta^{\#}/\hat{\delta} = 1 - 4/6 = .33$ (1/3)

[Figure: cases after replacement: 6, 6, 0]
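
The replacement logic can be checked with a toy example. This is a sketch under stated assumptions: the three cases with an effect of 6 are inferred from the 6, 6, 0 shown on the slide, and the variable names are illustrative.

```python
from statistics import mean

effects = [6.0, 6.0, 6.0]   # assumed toy cases, each with an effect of 6
delta_sharp = 4.0           # threshold assumed on the slide

# Share of cases to replace with zero-effect cases: 1 - 4/6 = 1/3.
share_to_replace = 1 - delta_sharp / mean(effects)
n_replace = round(share_to_replace * len(effects))

# Swap that many cases for zero-effect cases and recompute the mean.
replaced = effects[: len(effects) - n_replace] + [0.0] * n_replace
print(n_replace, replaced, mean(replaced))
```

After one of the three cases is replaced, the mean effect sits exactly at the threshold of 4, so any further replacement would change the inference.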

Page 48: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Interpretation of Amount of Bias Necessary to Invalidate the Inference: Sample Representativeness

To invalidate the inference:

54% of the estimate must be due to sampling bias to invalidate Borman et al.’s inference
You would have to replace 54% of Borman et al.’s cases (about 30 classes) with cases in which Open Court had no effect to invalidate the inference

Are 54% of Borman et al.’s cases irrelevant for non-volunteer schools?
We have quantified the discourse about the concern of validity

Page 49: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Example Replacement of Cases from Non-Volunteer Schools to Invalidate Inference of an Effect of the Open Court Curriculum

[Figure: distributions for Open Court and Business as Usual. Legend: original volunteer cases that were not replaced; replacement cases from non-volunteer schools with no treatment effect; original distribution for all volunteer cases.]

Page 51: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

The Fundamental Problem of External Validity

Before a randomized experiment:
People believe they do not “know” what generally works
People choose treatments based on idiosyncratic conditions -- what they believe will work for them (Heckman, Urzua and Vytlacil, 2006)

After a randomized experiment:
People believe they know what generally works
People are more inclined to choose a treatment shown to generally work in a study because they believe “it works”

The population is fundamentally changed by the experimenter (Ben-David; Kuhn)

The fundamental problem of external validity:
The more influential a study, the more the pre- and post-experiment populations differ, and the less the results apply to the post-experimental population
All the more so if it is due to the design (Burtless, 1995)

Page 52: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Comparisons across Randomized Experiments (correlation metric)

Page 53: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Page 54: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Distribution of % Bias to Invalidate Inference for Randomized Studies EEPA: On-line Jul 24-Nov 5 2012

Page 55: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Review & Reflection

Review of applications

Concern about internal validity: Kindergarten retention (Hong and Raudenbush)

• 85% of cases must be replaced with counterfactual data (with no effect) to invalidate the inference of a negative effect of retention on reading achievement

– Comparison with other observational studies

Concern about external validity: Open Court Curriculum

• 54% of cases must be replaced with data from unobserved population to invalidate the inference of a positive effect of Open Court on reading achievement in non-volunteer schools

– Comparison with other randomized experiments

Reflect
Which part is most confusing to you?
Is there more than one interpretation?
Discuss with a partner or two

Page 56: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Exercise 2 : % Bias necessary to Invalidate an Inference

Take an example of a randomized experiment in your own data or an article
Calculate the % bias necessary to invalidate the inference

Interpret the % bias in terms of sample replacement
What are the possible sources of bias?
Would they all work in the same direction?

Debate your inference with a new partner

Page 57: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Extensions of the Framework

Ordered thresholds for decision-making
Alternative hypotheses and scenarios
Relationship to confidence intervals
Related techniques

Page 58: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Ordered Thresholds Relative to Transaction Costs

1. Changing beliefs, without a corresponding change in action.
2. Changing action for an individual (or family).
3. Increasing investments in an existing program.
4. Initial investment in a pilot program where none exists.
5. Dismantling an existing program and replacing it with a new program.

Definition of threshold: the point at which evidence from a study would make one indifferent to policy choices

Page 59: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Alternative Hypotheses and Scenarios

Non-zero null hypotheses (for kindergarten retention)
H0: $\delta > -6$
$se \times t_{\text{critical},\,df=7639}$ = .68 × (−1.645) = −1.12 (one-tailed test)
$\delta^{\#}$ = −6 − 1.12 = −7.12
$1 - \delta^{\#}/\hat{\delta}$ = 1 − (−7.12/−9) = .21
21% of the estimated effect would have to be due to bias to invalidate the inference for H0: $\delta > -6$.

Failure to reject the null hypothesis when in fact the null is false
Use $\delta^{\#} = -4$

Non-zero effect in the replacement (non-volunteer) population
The inference is invalid if $1 - \pi^{p} > (\delta^{p} - \delta^{\#})/(\delta^{p} - \delta^{p'})$.
If $\delta^{p'} = -2$, $\delta^{\#} = 3.68$ and $\delta^{p} = 7.95$ (both as in the initial example), the inference is invalid if $1 - \pi^{p} > (7.95 - 3.68)/(7.95 - (-2)) = .43$,
that is, if more than 43% of the sample were replaced with cases for which the effect of OCR was −2.
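
Both scenarios reduce to short calculations. The sketch below uses hypothetical function names and only the numbers from this slide. It treats the share in the last scenario as the fraction of cases replaced, which is the reading the 43% statement implies.

```python
def threshold_nonzero_null(null_value, se, t_critical):
    """Threshold delta# when the null hypothesis is a non-zero value."""
    return null_value + se * t_critical


def share_replaced_to_invalidate(estimate, delta_sharp, replacement_effect=0.0):
    """Fraction of cases that must be replaced (by cases with the given
    effect) for the combined estimate to fall to the threshold."""
    return (estimate - delta_sharp) / (estimate - replacement_effect)


# Kindergarten retention with H0: delta > -6 (one-tailed, t = -1.645).
thr_retention = threshold_nonzero_null(-6, 0.68, -1.645)
bias_retention = 1 - thr_retention / -9

# Open Court, replacement cases with an effect of -2 instead of 0.
share_ocr = share_replaced_to_invalidate(7.95, 3.68, replacement_effect=-2)

print(f"{thr_retention:.2f} {bias_retention:.2f} {share_ocr:.2f}")
```

With `replacement_effect=0.0` the second function reduces to the basic formula, 1 − δ#/estimate, so the zero-effect case is a special case of the general one.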

Page 60: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Relationship between the Confidence Interval and % Bias Necessary to Invalidate the Inference of an Effect of Open Court on Comprehensive Reading Score

[Figure: confidence interval for the estimated effect on an axis running from 0 to 13, with the threshold $\delta^{\#}$ marked]

Lower bound of confidence interval “far from 0”: estimate exceeds threshold by large amount
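
One way to see the relationship, assuming a null hypothesis of zero so that the threshold equals the margin of error: the lower bound of the confidence interval is the estimate minus δ#, so the % bias to invalidate is the lower bound expressed as a share of the estimate. A minimal sketch using the Open Court values from the earlier slides:

```python
estimate, se, t_critical = 7.95, 1.83, 2.013   # Open Court values from the slides

margin = se * t_critical                 # equals delta# when the null is zero
lower, upper = estimate - margin, estimate + margin

pct_bias = 1 - margin / estimate         # % bias to invalidate the inference
lower_as_share = lower / estimate        # lower bound relative to the estimate

print(f"CI = ({lower:.2f}, {upper:.2f}), % bias = {pct_bias:.0%}")
```

The two shares coincide, which is why a lower bound far from 0 corresponds to an estimate that exceeds the threshold by a large amount.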

Page 61: What Would It Take to Change an Inference? Using Rubin’s Causal Model to Interpret the Robustness of Causal Inferences

Related Techniques

Bounding (e.g., Altonji, Elder & Taber, 2005; Imbens 2003; Manski)
Lower bound: “if unobserved factors are as strong as observed factors, how small could the estimate be?”
• Focus on estimate
% robustness: “how strong would unobserved factors have to be to invalidate inference?”
• Focus on inference, policy & behavior

External validity based on propensity to be in a study (Hedges and O’Muircheartaigh)
They focus on estimate
We focus on comparison with a threshold

Other sensitivity (e.g., Rosenbaum or Robins)
Characteristics of variables needed to change inference
We focus on how sample must change.
• Can be applied to observational study or RCT

Other Sources of Bias

Violations of SUTVA
• Agent based models?

Measurement error
• Just another source of bias (minor concern for examples here)

Differential treatment effects
• Use propensity scores to differentiate, then apply indices