128
SIEF Kenya Impact Evaluation Workshop Difference-in-Difference Estimation May 6, 2015 Instructor: Pamela Jakiela University of Maryland, College Park, USA

Generating counterfactuals: Differences-in-differences

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Generating counterfactuals: Differences-in-differences

SIEF Kenya Impact Evaluation Workshop

Difference-in-Difference Estimation

May 6, 2015

Instructor: Pamela JakielaUniversity of Maryland, College Park, USA

Page 2: Generating counterfactuals: Differences-in-differences

Overview

• Review: false counterfactuals

• Difference-in-differences: the intuition

• Difference-in-differences: the Stata code

• Checking the common trends assumption

• A practical example

SIEF IE Workshop: Difference-in-Difference Estimation Slide 2

Page 3: Generating counterfactuals: Differences-in-differences

Motivation: False Counterfactuals

Page 4: Generating counterfactuals: Differences-in-differences

What Is an Impact Evaluation?

“An impact evaluation assesses changes in the well-being of individuals,households, communities or firms that can be attributed to a particular

project, program or policy. The central impact evaluation question iswhat would have happened to those receiving the intervention if they had

not in fact received the program. Since we cannot observe this groupboth with and without the intervention, the key challenge is to develop a

counterfactual — that is, a group which is as similar as possible (inobservable and unobservable dimensions) to those receiving the

intervention. This comparison allows for the establishment of definitivecausality — attributing observed changes in welfare to the program, while

removing confounding factors.”

SIEF IE Workshop: Difference-in-Difference Estimation Slide 4

Page 5: Generating counterfactuals: Differences-in-differences

What Is an Impact Evaluation?

Goal: measure causal impacts of policy on participants

• We did A; as a result, B happened

• A is a policy or intervention

• B is an outcome of interest (what we hope to impact)

• Examples:

I We gave out insecticide-treated bednets, and fewer children underthe age of 5 got sick with or died from malaria as a result

I We distributed free lunches in elementary schools, and schoolattendance and/or academic performance went up as a result

SIEF IE Workshop: Difference-in-Difference Estimation Slide 5

Page 6: Generating counterfactuals: Differences-in-differences

What Is an Impact Evaluation?

Goal: measure causal impacts of policy on participants

• We did A; as a result, B happened

• A is a policy or intervention

• B is an outcome of interest (what we hope to impact)

• Examples:

I We gave out insecticide-treated bednets, and fewer children underthe age of 5 got sick with or died from malaria as a result

I We distributed free lunches in elementary schools, and schoolattendance and/or academic performance went up as a result

SIEF IE Workshop: Difference-in-Difference Estimation Slide 5

Page 7: Generating counterfactuals: Differences-in-differences

What Is an Impact Evaluation?

Goal: measure causal impacts of policy on participants

• We did A; as a result, B happened

• A is a policy or intervention

• B is an outcome of interest (what we hope to impact)

• Examples:

I We gave out insecticide-treated bednets, and fewer children underthe age of 5 got sick with or died from malaria as a result

I We distributed free lunches in elementary schools, and schoolattendance and/or academic performance went up as a result

SIEF IE Workshop: Difference-in-Difference Estimation Slide 5

Page 8: Generating counterfactuals: Differences-in-differences

Establishing Causality

Goal: measure causal impacts of policy on participants

• We want to be able to say B happened because of A

I We need to rule out other possible causes of B

• If we can say this, then we can also say: if we did A again (inanother place), we think that B would happen there as well

In an ideal world (research-wise), we could clone each programparticipant and observe the impacts of our program on their lives

vs.

SIEF IE Workshop: Difference-in-Difference Estimation Slide 6

Page 9: Generating counterfactuals: Differences-in-differences

Establishing Causality

Goal: measure causal impacts of policy on participants

• We want to be able to say B happened because of A

I We need to rule out other possible causes of B

• If we can say this, then we can also say: if we did A again (inanother place), we think that B would happen there as well

In an ideal world (research-wise), we could clone each programparticipant and observe the impacts of our program on their lives

vs.

SIEF IE Workshop: Difference-in-Difference Estimation Slide 6

Page 10: Generating counterfactuals: Differences-in-differences

Establishing Causality

In an ideal world (research-wise), we could clone each programparticipant and observe the impacts of our program on their lives

vs.

What is the impact of giving Lisa a book on her test score?

• Impact = Lisa’s score with a book - Lisa’s score without a book

In the real world, we either observe Lisa with a book or without

• We never observe the counterfactual

SIEF IE Workshop: Difference-in-Difference Estimation Slide 7

Page 11: Generating counterfactuals: Differences-in-differences

Establishing Causality

In an ideal world (research-wise), we could clone each programparticipant and observe the impacts of our program on their lives

vs.

What is the impact of giving Lisa a book on her test score?

• Impact = Lisa’s score with a book - Lisa’s score without a book

In the real world, we either observe Lisa with a book or without

• We never observe the counterfactual

SIEF IE Workshop: Difference-in-Difference Estimation Slide 7

Page 12: Generating counterfactuals: Differences-in-differences

Establishing Causality

To measure the causal impact of giving Lisa a book on her test score, weneed to find a comparison group that did not receive a book

vs.

Our estimate of the impact of the book is then the difference in testscores between the treatment group and the comparison group

• Impact = Lisa’s score with a book - Bart’s score without a book

As this example illustrates, finding a good comparison group is hard

SIEF IE Workshop: Difference-in-Difference Estimation Slide 8

Page 13: Generating counterfactuals: Differences-in-differences

Establishing Causality

To measure the causal impact of giving Lisa a book on her test score, weneed to find a comparison group that did not receive a book

vs.

Our estimate of the impact of the book is then the difference in testscores between the treatment group and the comparison group

• Impact = Lisa’s score with a book - Bart’s score without a book

As this example illustrates, finding a good comparison group is hard

SIEF IE Workshop: Difference-in-Difference Estimation Slide 8

Page 14: Generating counterfactuals: Differences-in-differences

The Potential Outcomes Framework

Two potential outcomes for each individual, community, etc:

Potential outcome =

{Y0i Pi = 0

Y1i Pi = 1

The problem: we only observe one of Y1i and Y0i

• Each individual either participates in the program or not

• The causal impact of program (P) on i is: Y1i − Y0i

We observe i ’s actual outcome:

Yi = Y0i + (Y1i − Y0i )︸ ︷︷ ︸impact

Pi

SIEF IE Workshop: Difference-in-Difference Estimation Slide 9

Page 15: Generating counterfactuals: Differences-in-differences

The Potential Outcomes Framework

Two potential outcomes for each individual, community, etc:

Potential outcome =

{Y0i Pi = 0

Y1i Pi = 1

The problem: we only observe one of Y1i and Y0i

• Each individual either participates in the program or not

• The causal impact of program (P) on i is: Y1i − Y0i

We observe i ’s actual outcome:

Yi = Y0i + (Y1i − Y0i )︸ ︷︷ ︸impact

Pi

SIEF IE Workshop: Difference-in-Difference Estimation Slide 9

Page 16: Generating counterfactuals: Differences-in-differences

The Potential Outcomes Framework

Two potential outcomes for each individual, community, etc:

Potential outcome =

{Y0i Pi = 0

Y1i Pi = 1

The problem: we only observe one of Y1i and Y0i

• Each individual either participates in the program or not

• The causal impact of program (P) on i is: Y1i − Y0i

We observe i ’s actual outcome:

Yi = Y0i + (Y1i − Y0i )︸ ︷︷ ︸impact

Pi

SIEF IE Workshop: Difference-in-Difference Estimation Slide 9

Page 17: Generating counterfactuals: Differences-in-differences

Defining the Counterfactual

To estimate the impact of a program, we need to know what would havehappened to every participant i in the absence the program

• We call this the counterfactual

Of course, we can’t actually clone our participants and see what happensto the clones if they don’t participate in the program

• Instead, we estimate the counterfactual using a comparison group

The comparison group needs to:

• Look identical to the treatment group prior to the program

• Not be impacted by the program in anyway

YOU CANNOT HAVE A GOOD IMPACT EVALUATIONWITHOUT A CREDIBLE, CONVINCING COMPARISON GROUP

SIEF IE Workshop: Difference-in-Difference Estimation Slide 10

Page 18: Generating counterfactuals: Differences-in-differences

Defining the Counterfactual

To estimate the impact of a program, we need to know what would havehappened to every participant i in the absence the program

• We call this the counterfactual

Of course, we can’t actually clone our participants and see what happensto the clones if they don’t participate in the program

• Instead, we estimate the counterfactual using a comparison group

The comparison group needs to:

• Look identical to the treatment group prior to the program

• Not be impacted by the program in anyway

YOU CANNOT HAVE A GOOD IMPACT EVALUATIONWITHOUT A CREDIBLE, CONVINCING COMPARISON GROUP

SIEF IE Workshop: Difference-in-Difference Estimation Slide 10

Page 19: Generating counterfactuals: Differences-in-differences

Defining the Counterfactual

To estimate the impact of a program, we need to know what would havehappened to every participant i in the absence the program

• We call this the counterfactual

Of course, we can’t actually clone our participants and see what happensto the clones if they don’t participate in the program

• Instead, we estimate the counterfactual using a comparison group

The comparison group needs to:

• Look identical to the treatment group prior to the program

• Not be impacted by the program in anyway

YOU CANNOT HAVE A GOOD IMPACT EVALUATIONWITHOUT A CREDIBLE, CONVINCING COMPARISON GROUP

SIEF IE Workshop: Difference-in-Difference Estimation Slide 10

Page 20: Generating counterfactuals: Differences-in-differences

Defining the Counterfactual

To estimate the impact of a program, we need to know what would havehappened to every participant i in the absence the program

• We call this the counterfactual

Of course, we can’t actually clone our participants and see what happensto the clones if they don’t participate in the program

• Instead, we estimate the counterfactual using a comparison group

The comparison group needs to:

• Look identical to the treatment group prior to the program

• Not be impacted by the program in anyway

YOU CANNOT HAVE A GOOD IMPACT EVALUATIONWITHOUT A CREDIBLE, CONVINCING COMPARISON GROUP

SIEF IE Workshop: Difference-in-Difference Estimation Slide 10

Page 21: Generating counterfactuals: Differences-in-differences

The Moving Parts of an Impact Evaluation

A policy or program of interest (aka the “treatment”)

• Pi = 1 if individual/community i participated in the program

• Pi = 0 otherwise

• The treatment group: a group of people for whom Pi = 1

• The comparison group: a group of people for whom Pi = 0

The outcome of interest: the dependent variable in our analysis

• Something that we care about

• Something that we expect to be impacted by the treatment

An impact evaluation compares values of the outcome of interest in thetreatment group to values in the comparison group

• We attribute the difference to the impact of treatment

SIEF IE Workshop: Difference-in-Difference Estimation Slide 11

Page 22: Generating counterfactuals: Differences-in-differences

The Moving Parts of an Impact Evaluation

A policy or program of interest (aka the “treatment”)

• Pi = 1 if individual/community i participated in the program

• Pi = 0 otherwise

• The treatment group: a group of people for whom Pi = 1

• The comparison group: a group of people for whom Pi = 0

The outcome of interest: the dependent variable in our analysis

• Something that we care about

• Something that we expect to be impacted by the treatment

An impact evaluation compares values of the outcome of interest in thetreatment group to values in the comparison group

• We attribute the difference to the impact of treatment

SIEF IE Workshop: Difference-in-Difference Estimation Slide 11

Page 23: Generating counterfactuals: Differences-in-differences

The Moving Parts of an Impact Evaluation

A policy or program of interest (aka the “treatment”)

• Pi = 1 if individual/community i participated in the program

• Pi = 0 otherwise

• The treatment group: a group of people for whom Pi = 1

• The comparison group: a group of people for whom Pi = 0

The outcome of interest: the dependent variable in our analysis

• Something that we care about

• Something that we expect to be impacted by the treatment

An impact evaluation compares values of the outcome of interest in thetreatment group to values in the comparison group

• We attribute the difference to the impact of treatment

SIEF IE Workshop: Difference-in-Difference Estimation Slide 11

Page 24: Generating counterfactuals: Differences-in-differences

The Moving Parts of an Impact Evaluation

A policy or program of interest (aka the “treatment”)

• Pi = 1 if individual/community i participated in the program

• Pi = 0 otherwise

• The treatment group: a group of people for whom Pi = 1

• The comparison group: a group of people for whom Pi = 0

The outcome of interest: the dependent variable in our analysis

• Something that we care about

• Something that we expect to be impacted by the treatment

An impact evaluation compares values of the outcome of interest in thetreatment group to values in the comparison group

• We attribute the difference to the impact of treatment

SIEF IE Workshop: Difference-in-Difference Estimation Slide 11

Page 25: Generating counterfactuals: Differences-in-differences

False Counterfactuals

Two types of false counterfactuals:

• Before vs. After Comparisons

• Participant vs. Non-Participant Comparisons

Consider these false counterfactuals in context of a simple example:

• Problem: poor academic performance

• Program: extra training for teachers, materials for classrooms

• Outcome: student test scores

• Strategy: baseline and endline (before and after) data collection

SIEF IE Workshop: Difference-in-Difference Estimation Slide 12

Page 26: Generating counterfactuals: Differences-in-differences

False Counterfactuals

Two types of false counterfactuals:

• Before vs. After Comparisons

• Participant vs. Non-Participant Comparisons

Consider these false counterfactuals in context of a simple example:

• Problem: poor academic performance

• Program: extra training for teachers, materials for classrooms

• Outcome: student test scores

• Strategy: baseline and endline (before and after) data collection

SIEF IE Workshop: Difference-in-Difference Estimation Slide 12

Page 27: Generating counterfactuals: Differences-in-differences

Before vs. After Comparisons

Impact of the program: B − A?

Before vs. after analysis assumes test scores would not have changedbetween t = 0 and t = 1 in the absence of the program

SIEF IE Workshop: Difference-in-Difference Estimation Slide 13

Page 28: Generating counterfactuals: Differences-in-differences

Before vs. After Comparisons

Impact of the program: B − A?

Before vs. after analysis assumes test scores would not have changedbetween t = 0 and t = 1 in the absence of the program

SIEF IE Workshop: Difference-in-Difference Estimation Slide 13

Page 29: Generating counterfactuals: Differences-in-differences

Before vs. After Comparisons

Impact of the program: B − A?

Before vs. after analysis assumes test scores would not have changedbetween t = 0 and t = 1 in the absence of the program

SIEF IE Workshop: Difference-in-Difference Estimation Slide 14

Page 30: Generating counterfactuals: Differences-in-differences

Before vs. After Comparisons

Impact of the program: B − A?

Before vs. after analysis assumes test scores would not have changedbetween t = 0 and t = 1 in the absence of the program

SIEF IE Workshop: Difference-in-Difference Estimation Slide 14

Page 31: Generating counterfactuals: Differences-in-differences

Before vs. After Comparisons

What if the parents’ income, or students’ overall level of learning, or theteacher, or the weather, or some other thing(s) changed?

SIEF IE Workshop: Difference-in-Difference Estimation Slide 15

Page 32: Generating counterfactuals: Differences-in-differences

Before vs. After Comparisons

The perils of pre vs. post analysis should be obvious. . .

. . .as we all recall from reading the famous paper: “Does graduatingfrom college cause women get pregnant? A pre-vs-post analysis ofthe impacts of education on fertility”

A slightly more subtle example of the perils of pre vs. post analysis comesfrom the mid-term report evaluating the Millennium Villages

• The report highlights the fourfold increase in mobile phoneownership between 2005 and 2008 among households in Bar Sauri

SIEF IE Workshop: Difference-in-Difference Estimation Slide 16

Page 33: Generating counterfactuals: Differences-in-differences

Before vs. After Comparisons

The perils of pre vs. post analysis should be obvious. . .

. . .as we all recall from reading the famous paper: “Does graduatingfrom college cause women get pregnant? A pre-vs-post analysis ofthe impacts of education on fertility”

A slightly more subtle example of the perils of pre vs. post analysis comesfrom the mid-term report evaluating the Millennium Villages

• The report highlights the fourfold increase in mobile phoneownership between 2005 and 2008 among households in Bar Sauri

SIEF IE Workshop: Difference-in-Difference Estimation Slide 16

Page 34: Generating counterfactuals: Differences-in-differences

Before vs. After Comparisons

SIEF IE Workshop: Difference-in-Difference Estimation Slide 17

Page 35: Generating counterfactuals: Differences-in-differences

Before vs. After Comparisons

Clemens and Demombynes (2010) compare changes in mobile phoneownership in Bar Sauri (rectangles) to trends in Kenya (red), rural Kenya(green), and rural areas in Nyanza Province (blue)

• The problem is obvious: before vs. after analysis assumes that thereis no time trend in mobile phone ownership in Kenya

SIEF IE Workshop: Difference-in-Difference Estimation Slide 18

Page 36: Generating counterfactuals: Differences-in-differences

Participants vs. Non-Participants

What if we compare (post-intervention) test scores in program schools totest scores in nearby schools that did not participate in the program?

Can we estimate the impact of the program by calculating T − C?

SIEF IE Workshop: Difference-in-Difference Estimation Slide 19

Page 37: Generating counterfactuals: Differences-in-differences

Participants vs. Non-Participants

E [Yi |Pi = Z ] denotes the population (or large sample) average of theoutcome variable Y (test scores) in schools with Pi = 0 or Pi = 1

• E [Yi ] = average test score in school i

• Pi = 1 program school, Pi = 0 in nearby (comparison) school

• Average outcome in program schools: E [Yi |Pi = 1] = Y

• Average outcome in neighboring schools: E [Yi |Pi = 0] = Z

Our estimate of the impact of the program (P) is:

Impact = E [Yi |Pi = 1]− E [Yi |Pi = 0]

In a regression framework: E [Yi ] = α + β · Pi

• When we regress Y on an indicator, P: β = YPi=1 − YPi=0

SIEF IE Workshop: Difference-in-Difference Estimation Slide 20

Page 38: Generating counterfactuals: Differences-in-differences

Participants vs. Non-Participants

E [Yi |Pi = Z ] denotes the population (or large sample) average of theoutcome variable Y (test scores) in schools with Pi = 0 or Pi = 1

• E [Yi ] = average test score in school i

• Pi = 1 program school, Pi = 0 in nearby (comparison) school

• Average outcome in program schools: E [Yi |Pi = 1] = Y

• Average outcome in neighboring schools: E [Yi |Pi = 0] = Z

Our estimate of the impact of the program (P) is:

Impact = E [Yi |Pi = 1]− E [Yi |Pi = 0]

In a regression framework: E [Yi ] = α + β · Pi

• When we regress Y on an indicator, P: β = YPi=1 − YPi=0

SIEF IE Workshop: Difference-in-Difference Estimation Slide 20

Page 39: Generating counterfactuals: Differences-in-differences

Participants vs. Non-Participants

E [Yi |Pi = Z ] denotes the population (or large sample) average of theoutcome variable Y (test scores) in schools with Pi = 0 or Pi = 1

• E [Yi ] = average test score in school i

• Pi = 1 program school, Pi = 0 in nearby (comparison) school

• Average outcome in program schools: E [Yi |Pi = 1] = Y

• Average outcome in neighboring schools: E [Yi |Pi = 0] = Z

Our estimate of the impact of the program (P) is:

Impact = E [Yi |Pi = 1]− E [Yi |Pi = 0]

In a regression framework: E [Yi ] = α + β · Pi

• When we regress Y on an indicator, P: β = YPi=1 − YPi=0

SIEF IE Workshop: Difference-in-Difference Estimation Slide 20

Page 40: Generating counterfactuals: Differences-in-differences

Participants vs. Non-Participants

Wait!! Why weren’t the neighboring schools included in the program?

• Maybe they had low quality head teachers (who didn’t bother to fillout the paperwork to enroll in the program)

• Maybe they already had high test scores

• Those who aren’t eligible and those who choose not to participatemay have different outcomes in the absence of the program

• This is selection bias

Remember: the causal impact of program on i is: Y1i − Y0i

• Assuming that outcomes in program schools in the absence of theprogram would look like outcomes observed in the comparisonschools

SIEF IE Workshop: Difference-in-Difference Estimation Slide 21

Page 41: Generating counterfactuals: Differences-in-differences

Participants vs. Non-Participants

Wait!! Why weren’t the neighboring schools included in the program?

• Maybe they had low quality head teachers (who didn’t bother to fillout the paperwork to enroll in the program)

• Maybe they already had high test scores

• Those who aren’t eligible and those who choose not to participatemay have different outcomes in the absence of the program

• This is selection bias

Remember: the causal impact of program on i is: Y1i − Y0i

• Assuming that outcomes in program schools in the absence of theprogram would look like outcomes observed in the comparisonschools

SIEF IE Workshop: Difference-in-Difference Estimation Slide 21

Page 42: Generating counterfactuals: Differences-in-differences

Participants vs. Non-Participants

SIEF IE Workshop: Difference-in-Difference Estimation Slide 22

Page 43: Generating counterfactuals: Differences-in-differences

Participants vs. Non-Participants

Our estimate of the impact of a training program (P) is:

Impact = E [Yi |Pi = 1]− E [Yi |Pi = 0]

= E [Y1i |Pi = 1]− E [Y0i |Pi = 1]︸ ︷︷ ︸program impact

+E [Y0i |Pi = 1]− E [Y0i |Pi = 0]︸ ︷︷ ︸selection bias

When E [Y0i |Pi = 1]− E [Y0i |Pi = 0] 6= 0, we have a problem.

• The treatment and comparison groups would not have looked thesame in the absence of the program. Why might this occur?

SIEF IE Workshop: Difference-in-Difference Estimation Slide 23

Page 44: Generating counterfactuals: Differences-in-differences

Participants vs. Non-Participants

Our estimate of the impact of a training program (P) is:

Impact = E [Yi |Pi = 1]− E [Yi |Pi = 0]

= E [Y1i |Pi = 1]− E [Y0i |Pi = 1]︸ ︷︷ ︸program impact

+E [Y0i |Pi = 1]− E [Y0i |Pi = 0]︸ ︷︷ ︸selection bias

When E [Y0i |Pi = 1]− E [Y0i |Pi = 0] 6= 0, we have a problem.

• The treatment and comparison groups would not have looked thesame in the absence of the program. Why might this occur?

SIEF IE Workshop: Difference-in-Difference Estimation Slide 23

Page 45: Generating counterfactuals: Differences-in-differences

Summary: False Counterfactuals

Before vs. After Comparisons:

• Compares: same individuals/communities before and after program

• Drawback: things (besides the program) may happen over time

Participant vs. Non-Participant Comparisons:

• Compares: participants to those not in the program

• Drawback: selection bias — why aren’t they in the program?

SIEF IE Workshop: Difference-in-Difference Estimation Slide 24

Page 46: Generating counterfactuals: Differences-in-differences

Difference-in-Difference Estimation: Intuition

Page 47: Generating counterfactuals: Differences-in-differences

Difference-in-Difference Estimation

Difference-in-difference (or “diff-in-diff” or “DD”) impact evaluationscombine the pre vs. post and enrolled vs. not enrolled approaches

• This can sometimes overcome the twin problems of [1] selectionbias and [2] time trends in the outcome of interest

• The basic idea is to observe the treatment group and a comparisongroup (for example, the not enrolled) before and after the program

The diff-in-diff estimator is:

DD = Y treatmentpost − Y treatment

pre −(Y comparisonpost − Y comparison

pre

)

SIEF IE Workshop: Difference-in-Difference Estimation Slide 26

Page 48: Generating counterfactuals: Differences-in-differences

Difference-in-Difference Estimation

Difference-in-difference (or “diff-in-diff” or “DD”) impact evaluationscombine the pre vs. post and enrolled vs. not enrolled approaches

• This can sometimes overcome the twin problems of [1] selectionbias and [2] time trends in the outcome of interest

• The basic idea is to observe the treatment group and a comparisongroup (for example, the not enrolled) before and after the program

The diff-in-diff estimator is:

DD = Y treatmentpost − Y treatment

pre −(Y comparisonpost − Y comparison

pre

)

SIEF IE Workshop: Difference-in-Difference Estimation Slide 26

Page 49: Generating counterfactuals: Differences-in-differences

Difference-in-Difference Estimation

Treatment Comparison

Pre-Program Y treatmentpre Y comparison

pre

Post-Program Y treatmentpost Y comparison

post

Intuitively, diff-in-diff estimation is just a comparison of 4 cell-level means

SIEF IE Workshop: Difference-in-Difference Estimation Slide 27

Page 50: Generating counterfactuals: Differences-in-differences

Difference-in-Difference Estimation

Treatment Comparison

Pre-Program Y treatmentpre Y comparison

pre

Post-Program Y treatmentpost Y comparison

post

Only one of the 4 cells is treated (has received the program)

SIEF IE Workshop: Difference-in-Difference Estimation Slide 28

Page 51: Generating counterfactuals: Differences-in-differences

Difference-in-Difference Estimation

Treatment Comparison

Pre-Program Y treatmentpre Y comparison

pre

Post-Program Y treatmentpost Y comparison

post

Comparing treatment vs. comparison pre-program measures selection bias

SIEF IE Workshop: Difference-in-Difference Estimation Slide 29

Page 52: Generating counterfactuals: Differences-in-differences

Difference-in-Difference Estimation

Treatment Comparison

Pre-Program Y treatmentpre Y comparison

pre

Post-Program Y treatmentpost Y comparison

post

Only one of the 4 cells is treated (has received the program)

SIEF IE Workshop: Difference-in-Difference Estimation Slide 30

Page 53: Generating counterfactuals: Differences-in-differences

Difference-in-Difference Estimation

The assumption underlying diff-in-diff estimation is that, in the absenceof the program, individual i ’s outcome at time t is given by:

E [Yi |Pi = 0, t] = γi + λt

There are two implicit identifying assumptions here:

• Selection bias relates to fixed characteristics of individuals (γi )

I The magnitude of the selection bias term isn’t changing over time

• Time trend (λt) same for treatment and control groups

These two necessary conditions for identification in diff-in-diff estimationare often referred to (collectively) as the common trends assumption

SIEF IE Workshop: Difference-in-Difference Estimation Slide 31

Page 54: Generating counterfactuals: Differences-in-differences

Difference-in-Difference Estimation

The assumption underlying diff-in-diff estimation is that, in the absenceof the program, individual i ’s outcome at time t is given by:

E [Yi |Pi = 0, t] = γi + λt

There are two implicit identifying assumptions here:

• Selection bias relates to fixed characteristics of individuals (γi )

I The magnitude of the selection bias term isn’t changing over time

• Time trend (λt) same for treatment and control groups

These two necessary conditions for identification in diff-in-diff estimationare often referred to (collectively) as the common trends assumption

SIEF IE Workshop: Difference-in-Difference Estimation Slide 31

Page 55: Generating counterfactuals: Differences-in-differences

Difference-in-Difference Estimation

The assumption underlying diff-in-diff estimation is that, in the absenceof the program, individual i ’s outcome at time t is given by:

E [Yi |Pi = 0, t] = γi + λt

There are two implicit identifying assumptions here:

• Selection bias relates to fixed characteristics of individuals (γi )

I The magnitude of the selection bias term isn’t changing over time

• Time trend (λt) same for treatment and control groups

These two necessary conditions for identification in diff-in-diff estimationare often referred to (collectively) as the common trends assumption

SIEF IE Workshop: Difference-in-Difference Estimation Slide 31

Page 56: Generating counterfactuals: Differences-in-differences

Difference-in-Difference Estimation

In the absence of the program, i ’s outcome at time t is:

E [Y0i |Pi = 0, t] = γi + λt

Outcomes in the comparison group:

E [Y comparisonpre ] = E [Y0i |Pi = 0, t = 1] = E [γi |Pi = 0] + λ1

E [Y comparisonpost ] = E [Y0i |Pi = 0, t = 2] = E [γi |Pi = 0] + λ2

Time trend:

E [Y comparisonpost ]− E [Y comparison

pre ] = E [γi |Pi = 0] + λ2 − (E [γi |Pi = 0] + λ1)

= λ2 − λ1

SIEF IE Workshop: Difference-in-Difference Estimation Slide 32

Page 57: Generating counterfactuals: Differences-in-differences

Difference-in-Difference Estimation

In the absence of the program, i ’s outcome at time t is:

E [Y0i |Pi = 0, t] = γi + λt

Outcomes in the comparison group:

E [Y comparisonpre ] = E [Y0i |Pi = 0, t = 1] = E [γi |Pi = 0] + λ1

E [Y comparisonpost ] = E [Y0i |Pi = 0, t = 2] = E [γi |Pi = 0] + λ2

Time trend:

E [Y comparisonpost ]− E [Y comparison

pre ] = E [γi |Pi = 0] + λ2 − (E [γi |Pi = 0] + λ1)

= λ2 − λ1

SIEF IE Workshop: Difference-in-Difference Estimation Slide 32

Page 58: Generating counterfactuals: Differences-in-differences

Difference-in-Difference Estimation

In the absence of the program, i ’s outcome at time t is:

E [Y0i |Pi = 0, t] = γi + λt

Outcomes in the comparison group:

E [Y comparisonpre ] = E [Y0i |Pi = 0, t = 1] = E [γi |Pi = 0] + λ1

E [Y comparisonpost ] = E [Y0i |Pi = 0, t = 2] = E [γi |Pi = 0] + λ2

Time trend:

E [Y comparisonpost ]− E [Y comparison

pre ] = E [γi |Pi = 0] + λ2 − (E [γi |Pi = 0] + λ1)

= λ2 − λ1

SIEF IE Workshop: Difference-in-Difference Estimation Slide 32

Page 59: Generating counterfactuals: Differences-in-differences

Difference-in-Difference EstimationLet δ denote the true impact of the program:

δ = E [Y1i |Pi = 1, t]− E [Y0i |Pi = 1, t]

which does not depend on the time period or i ’s characteristics

Outcomes in the treatment group:

E [Y treatmentpre ] = E [Y0i |Pi = 1, t = 1] = E [γi |Pi = 1] + λ1

E [Y treatmentpost ] = E [Y1i |Pi = 1, t = 2] = E [γi |Pi = 1] + δ + λ2

If we were to calculate a pre-vs-post estimator, we’d have:

E [Y treatmentpost ]− E [Y treatment

pre ] = E [γi |Pi = 1] + δ + λ2 − (E [γi |Pi = 1] + λ1)

= δ + λ2 − λ1︸ ︷︷ ︸timetrend

SIEF IE Workshop: Difference-in-Difference Estimation Slide 33

Page 60: Generating counterfactuals: Differences-in-differences

Difference-in-Difference EstimationLet δ denote the true impact of the program:

δ = E [Y1i |Pi = 1, t]− E [Y0i |Pi = 1, t]

which does not depend on the time period or i ’s characteristics

Outcomes in the treatment group:

E [Y treatmentpre ] = E [Y0i |Pi = 1, t = 1] = E [γi |Pi = 1] + λ1

E [Y treatmentpost ] = E [Y1i |Pi = 1, t = 2] = E [γi |Pi = 1] + δ + λ2

If we were to calculate a pre-vs-post estimator, we’d have:

E [Y treatmentpost ]− E [Y treatment

pre ] = E [γi |Pi = 1] + δ + λ2 − (E [γi |Pi = 1] + λ1)

= δ + λ2 − λ1︸ ︷︷ ︸timetrend

SIEF IE Workshop: Difference-in-Difference Estimation Slide 33

Page 61: Generating counterfactuals: Differences-in-differences

Difference-in-Difference EstimationLet δ denote the true impact of the program:

δ = E [Y1i |Pi = 1, t]− E [Y0i |Pi = 1, t]

which does not depend on the time period or i ’s characteristics

Outcomes in the treatment group:

E [Y treatmentpre ] = E [Y0i |Pi = 1, t = 1] = E [γi |Pi = 1] + λ1

E [Y treatmentpost ] = E [Y1i |Pi = 1, t = 2] = E [γi |Pi = 1] + δ + λ2

If we were to calculate a pre-vs-post estimator, we’d have:

E [Y treatmentpost ]− E [Y treatment

pre ] = E [γi |Pi = 1] + δ + λ2 − (E [γi |Pi = 1] + λ1)

= δ + λ2 − λ1︸ ︷︷ ︸timetrend

SIEF IE Workshop: Difference-in-Difference Estimation Slide 33

Page 62: Generating counterfactuals: Differences-in-differences

Difference-in-Difference EstimationLet δ denote the true impact of the program:

δ = E [Y1i |Pi = 1, t]− E [Y0i |Pi = 1, t]

which does not depend on the time period or i ’s characteristics

Outcomes in the treatment group:

E [Y treatmentpre ] = E [Y0i |Pi = 1, t = 1] = E [γi |Pi = 1] + λ1

E [Y treatmentpost ] = E [Y1i |Pi = 1, t = 2] = E [γi |Pi = 1] + δ + λ2

If we were to calculate a pre-vs-post estimator, we’d have:

E [Y treatmentpost ]− E [Y treatment

pre ] = E [γi |Pi = 1] + δ + λ2 − (E [γi |Pi = 1] + λ1)

= δ + λ2 − λ1︸ ︷︷ ︸timetrend

SIEF IE Workshop: Difference-in-Difference Estimation Slide 33

Page 63: Generating counterfactuals: Differences-in-differences

Difference-in-Difference Estimation

If we calculated a treatment vs. comparison estimator, we’d have:

E [Y treatmentpost ]− E [Y comparison

post ] = E [γi |Pi = 1] + δ + λ2 − (E [γi |Pi = 0] + λ2)

= δ + E [γi |Pi = 1]− E [γi |Pi = 0]︸ ︷︷ ︸selectionbias

The diff-in-diff estimator removes the selection bias, time trend:

DD = Y treatmentpost − Y treatment

pre −(Y comparisonpost − Y comparison

pre

)

SIEF IE Workshop: Difference-in-Difference Estimation Slide 34

Page 64: Generating counterfactuals: Differences-in-differences

Difference-in-Difference Estimation

If we calculated a treatment vs. comparison estimator, we’d have:

E [Y treatmentpost ]− E [Y comparison

post ] = E [γi |Pi = 1] + δ + λ2 − (E [γi |Pi = 0] + λ2)

= δ + E [γi |Pi = 1]− E [γi |Pi = 0]︸ ︷︷ ︸selectionbias

The diff-in-diff estimator removes the selection bias, time trend:

DD = Y treatmentpost − Y treatment

pre −(Y comparisonpost − Y comparison

pre

)

SIEF IE Workshop: Difference-in-Difference Estimation Slide 34

Page 65: Generating counterfactuals: Differences-in-differences

Difference-in-Difference Estimation

Substituting in the terms from our model:

DD = E [Y1i |Pi = 1, t = 2]− E [Y1i |Pi = 1, t = 1]

−(E [Y1i |Pi = 0, t = 2]− E [Y1i |Pi = 0, t = 1]

)= E [γi |Pi = 1] + δ + λ2 − (E [γi |Pi = 1] + λ1)

−[E [γi |Pi = 0] + λ2 −

(E [γi |Pi = 0] + λ1

)]

= δ

the true impact of the program on participants

SIEF IE Workshop: Difference-in-Difference Estimation Slide 35

Page 66: Generating counterfactuals: Differences-in-differences

Example: Supply vs. Demand for Education

The supply side of education (provision of quality schools, teachers):

• Are there enough schools?

• Have teachers received enough training?

• Are teachers present in the classroom?

• Are class sizes too large?

Supply constraints are related to school quality

• Main problem: governments need to provide more, better schools

Research question: if the government builds more schools, how muchwill education levels, human capital increase?

SIEF IE Workshop: Difference-in-Difference Estimation Slide 36

Page 67: Generating counterfactuals: Differences-in-differences

Example: Supply vs. Demand for Education

The supply side of education (provision of quality schools, teachers):

• Are there enough schools?

• Have teachers received enough training?

• Are teachers present in the classroom?

• Are class sizes too large?

Supply constraints are related to school quality

• Main problem: governments need to provide more, better schools

Research question: if the government builds more schools, how muchwill education levels, human capital increase?

SIEF IE Workshop: Difference-in-Difference Estimation Slide 36

Page 68: Generating counterfactuals: Differences-in-differences

Example: Supply vs. Demand for Education

The supply side of education (provision of quality schools, teachers):

• Are there enough schools?

• Have teachers received enough training?

• Are teachers present in the classroom?

• Are class sizes too large?

Supply constraints are related to school quality

• Main problem: governments need to provide more, better schools

Research question: if the government builds more schools, how muchwill education levels, human capital increase?

SIEF IE Workshop: Difference-in-Difference Estimation Slide 36

Page 69: Generating counterfactuals: Differences-in-differences

Example: Supply vs. Demand for Education

The demand for education: would parents send their kids to school in theabsence of compulsory schooling laws? Would kids exert sufficient effort?

• How large is the return to education?

I Increase in wages resulting from an additional year of school

• Do parents understand the return to education?

• Can HHs afford to pay for children to go to school?

I What is the opportunity cost of education?

• Do HHs need children on the farm, working at home, etc?

Demand constraints are likely to be critical determinants of educationaloutcomes if the return to education (in terms of wages) are relatively low

SIEF IE Workshop: Difference-in-Difference Estimation Slide 37

Page 70: Generating counterfactuals: Differences-in-differences

Example: Supply vs. Demand for Education

The demand for education: would parents send their kids to school in theabsence of compulsory schooling laws? Would kids exert sufficient effort?

• How large is the return to education?

I Increase in wages resulting from an additional year of school

• Do parents understand the return to education?

• Can HHs afford to pay for children to go to school?

I What is the opportunity cost of education?

• Do HHs need children on the farm, working at home, etc?

Demand constraints are likely to be critical determinants of educationaloutcomes if the return to education (in terms of wages) are relatively low

SIEF IE Workshop: Difference-in-Difference Estimation Slide 37

Page 71: Generating counterfactuals: Differences-in-differences

A “Natural” Experiment in Education

In a famous paper in the American Economic Review, Esther Dufloexamines the impacts of a large wave of school construction in Indonesia

SIEF IE Workshop: Difference-in-Difference Estimation Slide 38

Page 72: Generating counterfactuals: Differences-in-differences

A “Natural” Experiment in Education

The Sekolar Dasar INPRES program (1974–1978):

• Oil crisis creates large windfall for Indonesia

• Suharto uses oil money to fund school construction

• Close to 62,000 schools built by national gov’t

I Approximately 1 school built per 500 school-age children

• More schools built in areas which started with less

• Schools intended to promote national identity

SIEF IE Workshop: Difference-in-Difference Estimation Slide 39

Page 73: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

Strategy: difference-in-difference estimation

• Data on children born before and after program (pre vs. post)

• Data on children born in communities where many schools werebuild (treatment), those where few schools were built (comparison)

• Difference-in-difference estimate of program impact compares prevs. post differences in treatment vs. comparison communities

Intuitively, difference-in-difference estimation asks:

After controlling for time trends and unchanging differences betweentreatment and control communities, do children who were born into areaswith more newly built INPRES schools get more education?

SIEF IE Workshop: Difference-in-Difference Estimation Slide 40

Page 74: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

In practice, the difference-in-difference estimator is:

DD = Y treatmentpost − Y treatment

pre −(Y comparisonpost − Y comparison

pre

)

Dependent Variable: Years of Schooling

Many Schools Built Few Schools Built Difference

Over 11 in 1974 8.02 9.40 -1.38

Under 7 in 1974 8.49 9.76 -1.27

Difference 0.47 0.36 0.12

Younger children (reached school age after INPRES) in areas where theprogram built a large number of schools are the treatment group

• Who is the comparison group?

SIEF IE Workshop: Difference-in-Difference Estimation Slide 41

Page 75: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

In practice, the difference-in-difference estimator is:

DD = Y treatmentpost − Y treatment

pre −(Y comparisonpost − Y comparison

pre

)

Dependent Variable: Years of Schooling

Many Schools Built Few Schools Built Difference

Over 11 in 1974 8.02 9.40 -1.38

Under 7 in 1974 8.49 9.76 -1.27

Difference 0.47 0.36 0.12

Younger children (reached school age after INPRES) in areas where theprogram built a large number of schools are the treatment group

• Who is the comparison group?

SIEF IE Workshop: Difference-in-Difference Estimation Slide 41

Page 76: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

In practice, the difference-in-difference estimator is:

DD = Y treatmentpost − Y treatment

pre −(Y comparisonpost − Y comparison

pre

)

Dependent Variable: Years of Schooling

Many Schools Built Few Schools Built Difference

Over 11 in 1974 8.02 9.40 -1.38

Under 7 in 1974 8.49 9.76 -1.27

Difference 0.47 0.36 0.12

Pre vs. post analysis does not control for time trends

• Indonesia is getting wealthier over time, so younger children (thoseentering school after the program) may get more education anyway

SIEF IE Workshop: Difference-in-Difference Estimation Slide 42

Page 77: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

In practice, the difference-in-difference estimator is:

DD = Y treatmentpost − Y treatment

pre −(Y comparisonpost − Y comparison

pre

)

Dependent Variable: Years of Schooling

Many Schools Built Few Schools Built Difference

Over 11 in 1974 8.02 9.40 -1.38

Under 7 in 1974 8.49 9.76 -1.27

Difference 0.47 0.36 0.12

Pre vs. post analysis does not control for time trends

• Indonesia is getting wealthier over time, so younger children (thoseentering school after the program) may get more education anyway

SIEF IE Workshop: Difference-in-Difference Estimation Slide 42

Page 78: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

In practice, the difference-in-difference estimator is:

DD = Y treatmentpost − Y treatment

pre −(Y comparisonpost − Y comparison

pre

)

Dependent Variable: Years of Schooling

Many Schools Built Few Schools Built Difference

Over 11 in 1974 8.02 9.40 -1.38

Under 7 in 1974 8.49 9.76 -1.27

Difference 0.47 0.36 0.12

Treatment vs. comparison analysis does not control for selection bias

• More schools were built in those areas that were initially laggingbehind — poorer, more remote, less developed communities

SIEF IE Workshop: Difference-in-Difference Estimation Slide 43

Page 79: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

In practice, the difference-in-difference estimator is:

DD = Y treatmentpost − Y treatment

pre −(Y comparisonpost − Y comparison

pre

)

Dependent Variable: Years of Schooling

Many Schools Built Few Schools Built Difference

Over 11 in 1974 8.02 9.40 -1.38

Under 7 in 1974 8.49 9.76 -1.27

Difference 0.47 0.36 0.12

Treatment vs. comparison analysis does not control for selection bias

• More schools were built in those areas that were initially laggingbehind — poorer, more remote, less developed communities

SIEF IE Workshop: Difference-in-Difference Estimation Slide 43

Page 80: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

In practice, the difference-in-difference estimator is:

DD = Y treatmentpost − Y treatment

pre −(Y comparisonpost − Y comparison

pre

)

Dependent Variable: Years of Schooling

Many Schools Built Few Schools Built Difference

Over 11 in 1974 8.02 9.40 -1.38

Under 7 in 1974 8.49 9.76 -1.27

Difference 0.47 0.36 0.12

Difference-in-difference estimation compares the change in years ofschooling (i.e. the pre vs. post estimate) in treatment, control areas

• Program areas increased faster than comparison areas

SIEF IE Workshop: Difference-in-Difference Estimation Slide 44

Page 81: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

In practice, the difference-in-difference estimator is:

DD = Y treatmentpost − Y treatment

pre −(Y comparisonpost − Y comparison

pre

)

Dependent Variable: Years of Schooling

Many Schools Built Few Schools Built Difference

Over 11 in 1974 8.02 9.40 -1.38

Under 7 in 1974 8.49 9.76 -1.27

Difference 0.47 0.36 0.12

Difference-in-difference estimation compares the change in years ofschooling (i.e. the pre vs. post estimate) in treatment, control areas

• Program areas increased faster than comparison areas

SIEF IE Workshop: Difference-in-Difference Estimation Slide 44

Page 82: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

In practice, the difference-in-difference estimator is:

DD = Y treatmentpost − Y treatment

pre −(Y comparisonpost − Y comparison

pre

)

Dependent Variable: Years of Schooling

Many Schools Built Few Schools Built Difference

Over 11 in 1974 8.02 9.40 -1.38

Under 7 in 1974 8.49 9.76 -1.27

Difference 0.47 0.36 0.12

Diff-in-diff estimate suggests program increased educational attainmentby 0.12 years

SIEF IE Workshop: Difference-in-Difference Estimation Slide 45

Page 83: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

In practice, the difference-in-difference estimator is:

DD = Y treatmentpost − Y treatment

pre −(Y comparisonpost − Y comparison

pre

)

Dependent Variable: Years of Schooling

Many Schools Built Few Schools Built Difference

Over 11 in 1974 8.02 9.40 -1.38

Under 7 in 1974 8.49 9.76 -1.27

Difference 0.47 0.36 0.12

Diff-in-diff estimate suggests program increased educational attainmentby 0.12 years

SIEF IE Workshop: Difference-in-Difference Estimation Slide 45

Page 84: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

In practice, the difference-in-difference estimator is:

DD = Y treatmentpost − Y treatment

pre −(Y comparisonpost − Y comparison

pre

)

Dependent Variable: Log (Wages)

Many Schools Built Few Schools Built Difference

Over 11 in 1974 6.87 7.02 -0.15

Under 7 in 1974 6.61 6.73 -0.12

Difference -0.26 -0.29 0.026

Diff-in-diff estimate suggests program increased (log) adult wages by0.026

SIEF IE Workshop: Difference-in-Difference Estimation Slide 46

Page 85: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

In practice, the difference-in-difference estimator is:

DD = Y treatmentpost − Y treatment

pre −(Y comparisonpost − Y comparison

pre

)

Dependent Variable: Log (Wages)

Many Schools Built Few Schools Built Difference

Over 11 in 1974 6.87 7.02 -0.15

Under 7 in 1974 6.61 6.73 -0.12

Difference -0.26 -0.29 0.026

Diff-in-diff estimate suggests program increased (log) adult wages by0.026

SIEF IE Workshop: Difference-in-Difference Estimation Slide 46

Page 86: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

SIEF IE Workshop: Difference-in-Difference Estimation Slide 47

Page 87: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

Another way of looking at the data:

• Interact year of birth with schools built per 1000 students (intensity)

• We observe impacts for children under 10 when program started

SIEF IE Workshop: Difference-in-Difference Estimation Slide 48

Page 88: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

How much additional schooling did impacted children complete?

SIEF IE Workshop: Difference-in-Difference Estimation Slide 49

Page 89: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

• Total education attainment and adult wages grew faster in areaswhere more schools were built as part of the INPRES program

I Schools caused an increase in education

I Increases in education caused an increase in wages

• Results suggest that each additional year of primary schooling leadsto about an 8 percentage point increase in adult wages

• Returns to education are large, supply side interventions can work!

SIEF IE Workshop: Difference-in-Difference Estimation Slide 50

Page 90: Generating counterfactuals: Differences-in-differences

The Return to Education in Indonesia

• Total education attainment and adult wages grew faster in areaswhere more schools were built as part of the INPRES program

I Schools caused an increase in education

I Increases in education caused an increase in wages

• Results suggest that each additional year of primary schooling leadsto about an 8 percentage point increase in adult wages

• Returns to education are large, supply side interventions can work!

SIEF IE Workshop: Difference-in-Difference Estimation Slide 50

Page 91: Generating counterfactuals: Differences-in-differences

Diff-in-Diff in a Regression Framework

Page 92: Generating counterfactuals: Differences-in-differences

Diff-in-Diff in a Regression Framework

To implement diff-in-diff in a regression framework, we estimate:

Yi = α + βPi + ζLatert + δ (Pi ∗ Latert) + εi

where:

• Lateri is an indicator equal to 1 if t = 2

• δ is the coefficient of interest (the treatment effect)

• α = E [γi |Pi = 0] + λ1 — pre-program mean in comparison group

• β = E [γi |Pi = 1]− E [γi |Pi = 0] — selection bias

• ζ = λ2 − λ1 — time trend

SIEF IE Workshop: Difference-in-Difference Estimation Slide 52

Page 93: Generating counterfactuals: Differences-in-differences

Diff-in-Diff in a Regression Framework

To implement diff-in-diff in a regression framework, we estimate:

Yi = α + βPi + ζLatert + δ (Pi ∗ Latert) + εi

where:

• Lateri is an indicator equal to 1 if t = 2

• δ is the coefficient of interest (the treatment effect)

• α = E [γi |Pi = 0] + λ1 — pre-program mean in comparison group

• β = E [γi |Pi = 1]− E [γi |Pi = 0] — selection bias

• ζ = λ2 − λ1 — time trend

SIEF IE Workshop: Difference-in-Difference Estimation Slide 52

Page 94: Generating counterfactuals: Differences-in-differences

Diff-in-Diff in a Regression Framework

To implement diff-in-diff in a regression framework, we estimate:

Yi = α + βPi + ζLatert + δ (Pi ∗ Latert) + εi

where:

• Lateri is an indicator equal to 1 if t = 2

• δ is the coefficient of interest (the treatment effect)

• α = E [γi |Pi = 0] + λ1 — pre-program mean in comparison group

• β = E [γi |Pi = 1]− E [γi |Pi = 0] — selection bias

• ζ = λ2 − λ1 — time trend

SIEF IE Workshop: Difference-in-Difference Estimation Slide 52

Page 95: Generating counterfactuals: Differences-in-differences

Practice Problems 1

The data set SIEFIE DD data1.dta contains observations of studenttest scores (score) for students in Standard 7, normalized to be out of 100percent. The data set includes observations from two years, 2008 and2009, for two schools. In 2009, school 1 was selected to receive a packageof educational inputs (textbooks, flipcharts, and study guides) from alocal NGO. School 2 was not selected to participate in the program.

First steps:

• Open Stata

• Open SIEFIE DD data1.dta

• Open the do file diff-in-diff-part1-problems.do

SIEF IE Workshop: Difference-in-Difference Estimation Slide 53

Page 96: Generating counterfactuals: Differences-in-differences

Practice Problems 1

The data set SIEFIE DD data1.dta contains observations of studenttest scores (score) for students in Standard 7, normalized to be out of 100percent. The data set includes observations from two years, 2008 and2009, for two schools. In 2009, school 1 was selected to receive a packageof educational inputs (textbooks, flipcharts, and study guides) from alocal NGO. School 2 was not selected to participate in the program.

First steps:

• Open Stata

• Open SIEFIE DD data1.dta

• Open the do file diff-in-diff-part1-problems.do

SIEF IE Workshop: Difference-in-Difference Estimation Slide 53

Page 97: Generating counterfactuals: Differences-in-differences

Practice Problems 1

Dependent Variable: Test Scores

Program School Comparison School Difference

2008 63.93 66.59 -2.66

2009 70.21 67.75 2.46

Difference 6.28 1.16 5.16

Diff-in-diff estimate suggests program improved test scores

SIEF IE Workshop: Difference-in-Difference Estimation Slide 54

Page 98: Generating counterfactuals: Differences-in-differences

Common Trends Assumption

Page 99: Generating counterfactuals: Differences-in-differences

Common Trends Assumption

Your fantastic research assistant has stumbled across test score data forboth schools from 2005 through 2007. This data is stored in the data setSIEFIE data2.dta.

What does this additional data buy us?

• Additional data allow us to plot pre-program trends by school

• Does the common trends assumption hold?

• How do we check?

• We can append the new data to find out

SIEF IE Workshop: Difference-in-Difference Estimation Slide 56

Page 100: Generating counterfactuals: Differences-in-differences

Common Trends Assumption

Your fantastic research assistant has stumbled across test score data forboth schools from 2005 through 2007. This data is stored in the data setSIEFIE data2.dta.

What does this additional data buy us?

• Additional data allow us to plot pre-program trends by school

• Does the common trends assumption hold?

• How do we check?

• We can append the new data to find out

SIEF IE Workshop: Difference-in-Difference Estimation Slide 56

Page 101: Generating counterfactuals: Differences-in-differences

Common Trends Assumption

SIEF IE Workshop: Difference-in-Difference Estimation Slide 57

Page 102: Generating counterfactuals: Differences-in-differences

Common Trends Assumption

Diff-in-diff estimation goes wrong when treatment and comparisongroups were not on the same trajectory prior to the program

• This is the common trends assumption

Remember the assumptions underlying diff-in-diff estimation:

• Selection bias relates to fixed characteristics of individuals (γi )

• Time trend (λt) same for treatment and control groups

These assumptions guarantee that the common trends assumption issatisfied, but they cannot be tested directly — we have to trust!

• As with any identification strategy, it is important to think carefullyabout whether it checks out both econometrically and intuitively

SIEF IE Workshop: Difference-in-Difference Estimation Slide 58

Page 103: Generating counterfactuals: Differences-in-differences

Common Trends Assumption

Diff-in-diff estimation goes wrong when treatment and comparisongroups were not on the same trajectory prior to the program

• This is the common trends assumption

Remember the assumptions underlying diff-in-diff estimation:

• Selection bias relates to fixed characteristics of individuals (γi )

• Time trend (λt) same for treatment and control groups

These assumptions guarantee that the common trends assumption issatisfied, but they cannot be tested directly — we have to trust!

• As with any identification strategy, it is important to think carefullyabout whether it checks out both econometrically and intuitively

SIEF IE Workshop: Difference-in-Difference Estimation Slide 58

Page 104: Generating counterfactuals: Differences-in-differences

Common Trends Assumption

Diff-in-diff estimation goes wrong when treatment and comparisongroups were not on the same trajectory prior to the program

• This is the common trends assumption

Remember the assumptions underlying diff-in-diff estimation:

• Selection bias relates to fixed characteristics of individuals (γi )

• Time trend (λt) same for treatment and control groups

These assumptions guarantee that the common trends assumption issatisfied, but they cannot be tested directly — we have to trust!

• As with any identification strategy, it is important to think carefullyabout whether it checks out both econometrically and intuitively

SIEF IE Workshop: Difference-in-Difference Estimation Slide 58

Page 105: Generating counterfactuals: Differences-in-differences

Testing the Common Trends Assumption

Evidence of different pre-treatment differences in trends means that theassumptions underlying diff-in-diff are not reasonable

With enough data, we can:

• Test for pre-program trend differences

I Trends may, for example, exist in levels but not logs

• Include separate linear trends for treatment, comparison groups

• Include controls for other factors that might be driving differences inpre-existing trends (in treatment, comparison groups)

SIEF IE Workshop: Difference-in-Difference Estimation Slide 59

Page 106: Generating counterfactuals: Differences-in-differences

Testing the Common Trends Assumption

Evidence of different pre-treatment differences in trends means that theassumptions underlying diff-in-diff are not reasonable

With enough data, we can:

• Test for pre-program trend differences

I Trends may, for example, exist in levels but not logs

• Include separate linear trends for treatment, comparison groups

• Include controls for other factors that might be driving differences inpre-existing trends (in treatment, comparison groups)

SIEF IE Workshop: Difference-in-Difference Estimation Slide 59

Page 107: Generating counterfactuals: Differences-in-differences

Diff-in-Diff in Practice

Page 108: Generating counterfactuals: Differences-in-differences

Malaria Eradication as a Natural Experiment

Malaria kills about 800,000 people per year

• Most are African children

• Repeated bouts of malaria may also reduce overall child health

• Countries with malaria are substantially poorer than other countries,but it is not clear whether malaria is the cause or the effect

SIEF IE Workshop: Difference-in-Difference Estimation Slide 61

Page 109: Generating counterfactuals: Differences-in-differences

Malaria Eradication as a Natural Experiment

Organized efforts to eradicate malaria are a natural experiment

• First the US (1920s) and then many Latin American countries(1950s) launched major (and successful) eradication campaigns

• Compare trends in adult income by birth cohort in regions which did,did not see major reductions in malaria because of campaigns

SIEF IE Workshop: Difference-in-Difference Estimation Slide 62

Page 110: Generating counterfactuals: Differences-in-differences

Malaria Eradication as a Natural Experiment

SIEF IE Workshop: Difference-in-Difference Estimation Slide 63

Page 111: Generating counterfactuals: Differences-in-differences

Malaria Eradication as a Natural Experiment

Colombia’s malaria eradication campaign began in in the late 1950s. . .

. . . and led to a huge decline in malaria morbidity

SIEF IE Workshop: Difference-in-Difference Estimation Slide 64

Page 112: Generating counterfactuals: Differences-in-differences

Malaria Eradication as a Natural Experiment

Colombia’s malaria eradication campaign began in in the late 1950s. . .

. . . and led to a huge decline in malaria morbidity

SIEF IE Workshop: Difference-in-Difference Estimation Slide 64

Page 113: Generating counterfactuals: Differences-in-differences

Malaria Eradication as a Natural Experiment

Areas with highest pre-program prevalence saw largest declines in malaria

SIEF IE Workshop: Difference-in-Difference Estimation Slide 65

Page 114: Generating counterfactuals: Differences-in-differences

Estimation Strategy

In this framework, treatment is a continuous variable

• Areas with higher pre-intervention malaria prevalence were, inessence “treated” more intensely by the eradication program

• Malaria-free areas should not benefit from eradication

• They can be used (implicitly) to measure the time trend

Exposure (during childhood) also depends on one’s year of birth

• Colombians born after 1957 were fully exposed to program

I Did not suffer from chronic malaria in their early childhood

I Did not miss school because of malaria

• Colombians born before 1940 were adults by the time theeradication campaign began, serve as the comparison group

SIEF IE Workshop: Difference-in-Difference Estimation Slide 66

Page 115: Generating counterfactuals: Differences-in-differences

Estimation Strategy

In this framework, treatment is a continuous variable

• Areas with higher pre-intervention malaria prevalence were, inessence “treated” more intensely by the eradication program

• Malaria-free areas should not benefit from eradication

• They can be used (implicitly) to measure the time trend

Exposure (during childhood) also depends on one’s year of birth

• Colombians born after 1957 were fully exposed to program

I Did not suffer from chronic malaria in their early childhood

I Did not miss school because of malaria

• Colombians born before 1940 were adults by the time theeradication campaign began, serve as the comparison group

SIEF IE Workshop: Difference-in-Difference Estimation Slide 66

Page 116: Generating counterfactuals: Differences-in-differences

Estimation Strategy

Regression specification:

Yj,post − Yj,pre = α + βMj,pre + δXj,pre + εj

where

• Yj,t is an outcome of interest (eg literacy)

• Mj,pre is pre-eradication malaria prevalence

• Xj,pre is a vector of region-level controls

• εipt is the noise term

As we saw in the practice problems, this specification (“in changes”) isequivalent to a standard diff-in-diff regression specification in levels

SIEF IE Workshop: Difference-in-Difference Estimation Slide 67

Page 117: Generating counterfactuals: Differences-in-differences

Estimation Strategy

Regression specification:

Yj,post − Yj,pre = α + βMj,pre + δXj,pre + εj

where

• Yj,t is an outcome of interest (eg literacy)

• Mj,pre is pre-eradication malaria prevalence

• Xj,pre is a vector of region-level controls

• εipt is the noise term

As we saw in the practice problems, this specification (“in changes”) isequivalent to a standard diff-in-diff regression specification in levels

SIEF IE Workshop: Difference-in-Difference Estimation Slide 67

Page 118: Generating counterfactuals: Differences-in-differences

The Impact of Childhood Exposure to Malaria

Higher pre-eradication malaria exposure (on the x-axis) is associated withlarger increases in income across birth cohorts (on the y-axis)

SIEF IE Workshop: Difference-in-Difference Estimation Slide 68

Page 119: Generating counterfactuals: Differences-in-differences

The Impact of Childhood Exposure to Malaria

Regression specification:

Yj,post − Yj,pre = α + βMj,pre + δXj,pre + εipt

Open the do file diff-in-diff-bleakley1.do to replicate the analysis

SIEF IE Workshop: Difference-in-Difference Estimation Slide 69

Page 120: Generating counterfactuals: Differences-in-differences

The Impact of Childhood Exposure to Malaria

Regression specification:

Yj,post − Yj,pre = α + βMj,pre + δXj,pre + εipt

Open the do file diff-in-diff-bleakley1.do to replicate the analysis

SIEF IE Workshop: Difference-in-Difference Estimation Slide 69

Page 121: Generating counterfactuals: Differences-in-differences

Exploiting (More) Variation by Birth CohortWe gain statistical power by exploiting all of the variation in childhoodexposure to treatment (eradication) across regions and birth cohorts

SIEF IE Workshop: Difference-in-Difference Estimation Slide 70

Page 122: Generating counterfactuals: Differences-in-differences

Exploiting (More) Variation by Birth CohortWe gain statistical power by exploiting all of the variation in childhoodexposure to treatment (eradication) across regions and birth cohorts

Exposure, potential impact in areas with malaria

SIEF IE Workshop: Difference-in-Difference Estimation Slide 71

Page 123: Generating counterfactuals: Differences-in-differences

Exploiting (More) Variation by Birth CohortWe gain statistical power by exploiting all of the variation in childhoodexposure to treatment (eradication) across regions and birth cohorts

Exposure, potential impact in areas with malaria

Exposure, potential impact in areas without malaria

SIEF IE Workshop: Difference-in-Difference Estimation Slide 72

Page 124: Generating counterfactuals: Differences-in-differences

Panel Data Analysis

We gain statistical power by exploiting all of the variation in childhoodexposure to treatment (eradication) across regions and birth cohorts

• Between 0 and 18 years of childhood post-eradication

• Interact exposure with pre-program malaria prevalence

• Treatment impacts should be larger for birth cohorts who spentmore years of childhood malaria-free, areas with more initial malaria

Treat data set as a (YOB×location) panel

• Control for region, YOB fixed effects

SIEF IE Workshop: Difference-in-Difference Estimation Slide 73

Page 125: Generating counterfactuals: Differences-in-differences

Panel Data Analysis

We gain statistical power by exploiting all of the variation in childhoodexposure to treatment (eradication) across regions and birth cohorts

• Between 0 and 18 years of childhood post-eradication

• Interact exposure with pre-program malaria prevalence

• Treatment impacts should be larger for birth cohorts who spentmore years of childhood malaria-free, areas with more initial malaria

Treat data set as a (YOB×location) panel

• Control for region, YOB fixed effects

SIEF IE Workshop: Difference-in-Difference Estimation Slide 73

Page 126: Generating counterfactuals: Differences-in-differences

Panel Data Analysis

Regression specification:

Yjkt = β (Mj × EXPk) + δk + δj + δt + εjkt

where

• Mj is pre-eradication malaria prevalence (by region)

• EXPk is proportion of childhood post-eradication (by YOB)

• δk is a YOB fixed effect

• δj is a region of birth effect

• δt is a census year fixed effect

• εjkt is a conditionally mean-zero error term

SIEF IE Workshop: Difference-in-Difference Estimation Slide 74

Page 127: Generating counterfactuals: Differences-in-differences

Panel Data Analysis

We can also look at the relationship between log (adult) wages andpre-eradication malaria rates separately by birth cohort

SIEF IE Workshop: Difference-in-Difference Estimation Slide 75

Page 128: Generating counterfactuals: Differences-in-differences

Summary and Review

Diff-in-diff estimation can overcome the twin problems of [1] selectionbias and [2] time trends in the outcomes of interest

• Is a credible impact evaluation strategy when the common trendsassumption can be tested (i.e. you have a long panel)

• Outcomes can be defined in levels or transformed levels

• Treatment need not be binary: often involves an interaction betweenpre-intervention conditions (eg malaria) and intervention timing

SIEF IE Workshop: Difference-in-Difference Estimation Slide 76