1 Program Evaluation and Panel Discrete Data Models – Some Considerations on Methodology and...

Preview:

Citation preview

1

Program Evaluation andPanel Discrete Data Models – Some Considerations on

Methodology and Applications

Cheng Hsiao

2

1. Essential Issues for Program Evaluation

2. Panel Discrete Choice Models

3. General Principle of Estimating Structural Parameters in the Presence of Incidental Parameters

4. Parametric Approach

5. Semi-Parametric Approach

6. Bias-Reduced Approach

7. Concluding Remarks

3

Definition of Treatment Effect

Let denote the potential outcomes of

the ith individual in the untreated and treated state. Then, the treatment effects on the ith individual are just

),( *1

*0 yy ii

yy iii*0

*1

1. Essential Issues for Program Evaluation

4

Two measures of treatment effects are of interest to policy makers:

The average treatment effects (ATE)

and the treatment effects on the treated (TT)

)( *0

*1 yyEATE ii

)group treatment |( *0

*1 iyyETT ii

5

The ATE is of interest if one is interested in the effect of treatment for a randomly assigned individual or population mean response to treatment.

The TT is of interest if the same selection rule for treatment continues in the future.

6

The observed data is in the form of ,

i = 1,…,N, where if the ith individual receives treatment and if not, and

, where i = 1,…,N

In other words, we do not simultaneously observe

and . An individual is either observed with or

. Using

as a measure of ATE could be subject to two sources of bias: selection on observables and selection on unobservables.

),( dy ii

1d i0d i

ydydy iiiii*0

*1 )1(

y i*1

y i*0 y i

*1

y i*0

N

iiiN

ii

N

iiiN

ii

dyd

dyd 1

1

1

1

)1()1(

11(1)

7

If selection into the treatment is random,

(1) can be used to estimate ATE (≡ TT).

),,( *0

*1 yyd iii

then

)()1|()1|( *1

*1 yEdyEdyE iiiii

)()0|()0|( *0

*0 yEdyEdyE iiiii

8

If

(1) will be a biased estimate of ATE or TT.

),,( *0

*1 yyd iii

then

)()1|()1|( *1

*1 yEdyEdyE iiiii

)()0|()0|( *0

*0 yEdyEdyE iiiii

9

Suppose we can decompose the outcomes in terms of the effects of observables and the effects of unobservables as

where denotes the effects of observable factors , denotes the effects of unobservable factors, j = 0,1, and .

Then

and

iii xgy 1~

1*1 )(

iii xgy 0~

0*0 )(

)(~xg ij

~xi ji

0)|(~xE iji

)]()]()([ 01~

0~

1 iiiii xgxg

)]()([~

0~

1 xgxgEATE

10

Selection on observables

Selection on unobservables)1|()( 11 dEE iii

)0|()( 00 dEE iii

11

Conditional Independence Assumption – selection is ignorable after controlling a set of observable confounders x

~

*0

*1 |),( xyyd iiii

),|(),|(~

*1

~

*1 xyExdyE iiiii

),|(),|(~

*0

~

*0 xyExdyE iiiii

)|()|(

),0|(),1|(

)|()(

~~

*0

~~

*1

~~~~

~~

*0

*1

~

xxyExxyE

xxdyExxdyE

xxyyExATE

iiii

iiiiii

iii

then

12

Adjustment Methods

1. If there is no selection on unobservables

(i.e. where j = 0,1)

1a. Propensity Score Matching Method (Rosenbaum and Rubin (1983))

Let , then

(i)

(ii)

0)|()( jdEE ijiji

1)()|1(0~~ xpxdp iii

)(|),(|),(

~

*0

*1

~

*0

*1 xpdyyxdyy iiiiiiii

)(|~~xpdx iii

13

(ii) →

If a subclass of units or a matched treatment-control pair is homogenous in , the treated and control units in that subclass or matched pair will have the same distribution of , then, at any value of a propensity score, the difference between the treatment and the control means is an unbiased estimate of the ATE at that value of the propensity score (treatment assignment is ignorable)

)(~xp

x~

))(|())(,0|())(,1|(~~~~~~xpxfxpdxfxpdxf

14

)0),(|()1),(|()( dxpyEdxpyEEATE iiiixp

))(|())(|( *0

*1)( xpyExpyEE iixp

)( *0

*1 yyE ii

)0),(|()1),(|(1|)( dxpyEdxpyEETT iiiid ixp

15

Issues of this approach

(i) Conditional independence assumption is not a testable hypothesis

(ii) Estimates sensitive to how blocks of

are constructed

)(xp

16

Decriminalization and

Marijuana Smoking Prevalence:

Evidence from Australia

Kannika Damrongplasit, Cheng Hsiao and Xueyan Zhao

17

The global sale of illegal drugs – US $150 billion (2001)US drugs policy costs $30-40 billion a year

18

Decriminalized States Non-decriminalized states

South Australia 1987

ACT 1992

Northern Territory 1996

Western Australia 2003

New South Wales

Queensland

Victoria

Tasmania

19

What is Decriminalization Policy?

• Reduction of penalties for possession and cultivation of marijuana for personal consumption (i.e. minor possession)

• In Australia, it is called “Expiation System”

- Still an offence to use or grow marijuana

- The offence is expiable by payment of a fine with no imprisonment and no criminal record if the fine is paid.

20

Debates on Marijuana Decriminalization

• Supporting Arguments

- Criminal offence from marijuana possession is too severe

- Allow separation of marijuana market from other harder drugs

- Reduce law enforcement and criminal justice resources

• Opposing Arguments

- Increase marijuana smoking prevalence

- Greater use of other illicit drugs

21

Sources of Data

(i) 2001 National Drug Strategy Household survey (NDSHS)- Nationally representative survey of non-institutionalized civilian population aged 14 and above- 26744 total observations- 14008 resulting samples after delete missing data- Treatment group = 2968, Control group = 11040

(ii) Australian Illicit Drug Report

(iii) Australia Bureau of Statistics

22

Variable Description

y

Decrim

PMAR

Income

Age1419

Age2024

Age2529

Age3034

Age3539

Age4069

Age70

Male

Married

Divorce

Widow

Never Married

# Depchild

Degree

Working Status

Aboriginal

Unemployment rate

1 if using marijuana in the last 12 months, otherwise 0

1 if residing in decriminalized states

ln(real price of marijuana)

ln(real household annual income before tax)

1 if age is 14 to 19 years old

1 if age is 20 to 24 years old

1 if age is 25 to 29 years old

1 if age is 30 to 34 years old

1 if age is 35 to 39 years old

1 if age is 40 to 69 years old

1 if age is 70 years old and above, it is a reference category and is omitted from the

estimation

1 for male and 0 for female

1 if married

1 if divorce

1 if widow

1 if single, it is a reference category and omitted from the estimation

Number of dependent children aged 14 or below in the household

1 if university degree

1 if respondent is unemployed, and 0 otherwise

1 if Aboriginal or Torres Strait Islander

State Unemployment rate (%)

23

Summary Statistics

Variable

All Data (N = 14008) Treatment (N = 2968) Control (N = 11040)

Mean S.D. Mean S.D. Mean S.D.

y

Decrim

PMAR

Income

Age1419

Age2024

Age2529

Age3034

Age3539

Age4069

Male

Married

Divorce

Widow

# Depchild

Degree

Working Status

Aboriginal

Unemployment rate

0.157

0.212

5.876

10.445

0.054

0.077

0.100

0.122

0.126

0.455

0.476

0.619

0.115

0.038

0.594

0.260

0.028

0.013

6.986

0.364

0.409

0.237

0.757

0.227

0.267

0.300

0.327

0.331

0.498

0.499

0.486

0.319

0.191

0.895

0.438

0.164

0.111

1.375

0.181

1

5.929

10.554

0.047

0.075

0.103

0.121

0.130

0.466

0.488

0.621

0.124

0.033

0.583

0.292

0.019

0.017

6.331

0.385

0

0.036

0.709

0.211

0.263

0.304

0.326

0.337

0.499

0.500

0.485

0.330

0.178

0.880

0.455

0.137

0.127

1.386

0.151

0

5.862

10.416

0.056

0.078

0.099

0.122

0.124

0.452

0.472

0.618

0.112

0.039

0.597

0.251

0.030

0.012

7.162

0.358

0

0.264

0.767

0.231

0.268

0.299

0.327

0.330

0.498

0.499

0.486

0.315

0.195

0.899

0.434

0.170

0.107

1.318

24

Non-parametric Model: Propensity Score Stratification Matching

• Propensity score is )|1Pr()( wdwp iii

• Under the assumptions,

1)|1Pr()(0 wdwp iii and wdyy iiii |, *0

*1

then

)(|),( *0

*1 wpdyy iiii and )(| wpdw iii

25

2

19

66

240

526

252

51

18

27 10

15

35

84

188

214

152

39

26 2532

37

101

170

253

222

103

34 32

61 1 2 3

01

00

200

300

400

500

Fre

qu

ency

0 .025.05.075 .1 .125.15.175 .2 .225.25.275 .3 .325.35.375 .4 .425.45.475 .5 .525.55.575 .6 .625.65.675 .7 .725.75.775 .8 .825.85.875 .9 .925.95.975 1p(w)

63

403

264

109

345

837

602

150

5423

6094

163

321

423

373

285

154

198

316

534

794

1084

1237

997

660

253

94

5640

11126 5 6 8 4 1 1

05

00

100

01

50

0

Fre

qu

ency

0 .025.05.075 .1 .125.15.175 .2 .225.25.275 .3 .325.35.375 .4 .425.45.475 .5 .525.55.575 .6 .625.65.675 .7 .725.75.775 .8 .825.85.875 .9 .925.95.975 1p(w)

26

Range of Estimated Propensity Score

Number of Treatment

observations

Number of Control observations

ATE ATET

0.05 – 0.45

Length of interval 0.025

2810 10301 0.092***

(0.026)

-0.032

(0.053)

0.05 – 0.45

With STATA interval

2810 10301 0.096***

(0.025)

-0.069*

(0.038)

0.075 – 0.425

Length of interval 0.025

2432 9509 0.074***

(0.018)

-0.056**

(0.025)

0.075 – 0.425

With STATA interval

2432 9509 0.086***

(0.020)

-0.056**

(0.025)

0.05 – 0.4

Length of interval 0.05

2116 10295 0.098***

(0.017)

-0.024†

(0.014)

0.05 – 0.4

With STATA interval

2116 10295 0.112***

(0.026)

-0.026**

(0.012)

0.1 – 0.35

Length of interval 0.05

1909 8263 0.067***

(0.016)

-0.021†

(0.015)

0.1 – 0.35

With STATA interval

1909 8263 0.059***

(0.020)

-0.022†

(0.015)

27

• Violation of )(| wpdw iii could happen because of

(1) Our sample size is not large enough to perform a reliable non-parametric estimation

- For many ranges of propensity score, there is no

overlapping observations

- Within overlapping range, t-squared tests unambiguously reject the balancing condition

or

(2) Conditional independence assumption is violated

28

Non-parametric Approach Parametric Approach

Advantages - Do not impose any

distributional assumption

-Take account of both selection on observables and unobservables- Can estimate the impact of other explanatory variables on smoking outcome in addition to the effect of decriminalization policy

Disadvantages - Conditional independence assumption is the maintained hypothesis- Only take account of selection on observables

- Need to impose both functional form and distributional assumptions

29

• Endogenous Probit Switching Model (Model 1)

iii xy 111*1

iii xy 000*0

iii zd *

1

1

1

,

0

0

0

~

01

010

110

0

1

N

i

i

i

30

Average Treatment Effect

Model 1 & 2:

If sample is randomly drawn,

Model 3 & 4:

)]ˆˆ()ˆˆˆ([1

1xx i

n

iin

)]()([1

ˆˆˆˆ 001

11 xx in

ii

n

dxxfxx )()]()([ 0011

31

Coefficient estimates for marijuana smoking equation

Variable Binary Probit Bivariate Probit

Two-part:

Treatment

Two-part:

Control

Switching:

Treatment

Switching:

Control

Decrim

PMAR

Income

Age1419

Age2024

Age2529

Age3034

Age3539

Age4069

Male

0.173***

(0.033)

-0.338***

(0.057)

-0.008

(0.021)

1.952***

(0.205)

2.105***

(0.202)

2.113***

(0.200)

1.908***

(0.200)

1.848***

(0.200)

1.212***

(0.197)

0.238***

(0.028)

0.187

(0.130)

-0.341***

(0.061)

-0.009

(0.022)

1.952***

(0.205)

2.105***

(0.202)

2.112***

(0.200)

1.908***

(0.200)

1.848***

(0.200)

1.211***

(0.197)

0.238***

(0.028)

-5.305***

(0.810)

-0.016

(0.049)

1.962***

(0.418)

2.084***

(0.409)

2.023***

(0.404)

1.854***

(0.404)

1.894***

(0.402)

1.113***

(0.398)

0.201***

(0.060)

-0.315***

(0.057)

-0.007

(0.024)

1.961***

(0.240)

2.109***

(0.237)

2.133***

(0.235)

1.920***

(0.234)

1.822***

(0.234)

1.237***

(0.232)

0.250***

(0.032)

-5.324***

(0.829)

-0.018

(0.052)

1.963***

(0.418)

2.084***

(0.409)

2.022***

(0.405)

1.853***

(0.404)

1.893***

(0.402)

1.113***

(0.398)

0.201***

(0.060)

-0.409***

(0.060)

-0.022

(0.024)

1.915***

(0.237)

2.048***

(0.235)

2.068***

(0.233)

1.869***

(0.232)

1.774***

(0.231)

1.206***

(0.228)

0.236***

(0.032)

32

Variable Binary Probit Bivariate Probit

Two-part:

Treatment

Two-part:

Control

Switching:

Treatment

Switching:

Control

Married

Divorce

Widow

Degree

Working Status

Aboriginal

Constant

ρ1υ= ρ0υ

ρ1υ

ρ0υ

-0.470***

(0.037)

-0.001

(0.053)

-0.441***

(0.127)

-0.034

(0.034)

0.256***

(0.074)

0.282**

(0.107)

-0.468

(0.429)

-0.470***

(0.037)

-0.001

(0.053)

-0.441***

(0.127)

-0.034

(0.034)

0.257***

(0.075)

0.281**

(0.107)

-0.451

(0.455)

-0.008

(0.077)

-0.429***

(0.080)

-0.051

(0.112)

-0.561†

(0.310)

-0.262***

(0.073)

0.306

(0.187)

-0.304

(0.218)

29.338***

(4.832)

-0.493***

(0.043)

0.008

(0.061)

-0.435***

(0.140)

0.047

(0.039)

0.249***

(0.081)

0.438***

(0.124)

-0.650

(0.457)

-0.428***

(0.080)

-0.053

(0.112)

-0.561†

(0.310)

-0.262***

(0.073)

0.307

(0.188)

-0.308

(0.222)

29.486***

(5.021)

-0.012

(0.109)

-0.470***

(0.044)

-0.005

(0.059)

-0.422***

(0.136)

0.042

(0.038)

0.266***

(0.080)

0.377***

(0.123)

-0.011

(0.475)

-0.469**

(0.181)

33

Average Treatment Effect and Marginal Effect

Binary Probit Bivariate Probit

Two-part:

Treatment

Two-part:

Control

Switching:

Treatment

Switching:

Control

ATE

Marginal Effect

Decrim

PMAR

Income

Age1419

Age2024

Age2529

Age3034

Age3539

Age4069

Male

Married

Divorce

Widow

Degree

Working Status

Aboriginal

0.037***

(0.0002)

0.067***

-0.127***

-0.003

0.352***

0.598***

0.598***

0.578***

0.570***

0.443***

0.085***

-0.157***

-0.0004

-0.149***

-0.013

0.099***

0.110**

0.040***

(0.0002)

0.072

-0.128***

-0.003

0.351***

0.599***

0.599***

0.578***

0.571***

0.443***

0.085***

-0.157***

-0.001

-0.148***

-0.013

0.100***

0.109**

0.137***

(0.002)

-2.101***

-0.006

0.433***

0.523***

0.520***

0.507***

0.510***

0.388***

0.078***

-0.161***

-0.020

-0.204†

-0.101***

0.121

-0.116

-0.118***

-0.003

0.351***

0.599***

0.601***

0.580***

0.567***

0.450***

0.089***

-0.163***

0.003

-0.147***

0.018

0.097***

0.172***

0.163***

(0.002)

-2.112***

-0.007

0.439***

0.518***

0.514***

0.502***

0.505***

0.385**

0.078***

-0.161***

-0.021

-0.206

-0.101***

0.122

-0.118

-0.145***

-0.008

0.304***

0.628***

0.631***

0.604***

0.588***

0.451***

0.078***

-0.144***

-0.002

-0.131***

0.015

0.099***

0.143***

34

Panel Data – Allow better control of selection on observables and unobservables

Difference-in-Difference Method

- outcome of the jth individual after the treatment

- outcome of the jth individual before the treatment

- outcome of the ith individual who did not receive treatment at time t

- outcome of the ith individual at time s

)()( yyyyE isitjsjt

y jty jsyit

yis

35

Cross-Section vs Panel Discrete Modeling

iii xy ~~

'*

0 if ,0

0 if ,1*

*

y

yy

i

ii

)|1Prob(

)Prob(

)Prob(*0)Prob(*1)|(

~

~~

'

~~

'

~~

'

~

xy

x

xxxyE

)|()|(~~~~xxyExxyE jjii

36

37

38

39

40

6. Bias-Reduced Estimator

Mean Square Error =

Suppose

)()(~~

^ '

~~

^ E

)()()()(~~

^ '

~~

^

~

^

~

^ '

~

^

~

^ EEEEE

)()()( '

~

^

biasbiasVar

n

Ob 2~~~

^ 1)( plim

,~~

^

~

^

bR

n

OR 2~~

^ 1)( plim

41

Consider the log-likelihood function of N cross-sectional units observed over T time periods,

where denotes the likelihood function of the T-time series observations for the ith individual.

For instance, consider a binary choice model of the form,

),,(),...,,(~1

1~

i

N

iiN

i

i.i.d.~~

'* , uuxy ititiitit

.0 if ,0

,0 if ,1*

*

y

yy

it

itit

42

Then

The MLE is obtained by simultaneously solving for

),|1(Prob)(),|(~~~

'

~ iitititiitiitit xyFxFxyE

T

tititititii FyFy

1~)1log()1(log),(

),...,,(^

1

^

~

^

N from

0

~

^1

~~

N

i

i(1)

.,...,1 ,0^ Nii

i

i

(2)

43

The MLE of can also be derived by first obtaining

from (2) as a function of , ,

substituting into the likelihood function

to form the concentrated log-likelihood function,

~

^

i ~

)(~

^

i

)(~

^

i

)),(,())(),...,(,(1 ~

^

~

*

~

^

~

^

1~

*

N

iiN

N

ii

N

i

ii

1^

*

1~

~

^

~

*

~

*

0))(,(

i

(3)

(4)

then solving

44

When T is finite

0))(,(

1

1~

~

^

~

*

~

N

i

i

NE

Expanding the score of the concentrated log-likelihoodaround , and evaluating it at : i

))((),())(,(^

*

~

*

~

^

~

* iiiiiii

)())((2

1 2/1^ 2* TO piiiii

(5)

45

)(),(

)()(

)(

)(

2/1*

2/1^

*

*

TO

TO

TO

TO

ii

ii

piii

pii

46

)(~

^

i

0),(),(

)(^

~)~

(^

~

iiii

ii

ii

Riii

ii

i

iii

ii

iii

))((),(^

),(2

2

is derived by solving

47

iiiiiii1^

))((

iiiiiiiE )(

iiiiiii E

iiiE

iiiiiii

E 11

11

...1 2

21

iiiE

i

iiiE

iiii

E

48

Monte carlo experiments conducted by Carro (2006) have shown that when T = 8, the bias of modified MLE for dynamic probit and logit models are negligible. Another advantage of the Arellano-Carro approach is its generality. For instance, a dynamic logit model with time dummy explanatory variable cannot meet the Honore and Kyriazidou (2000) conditions for generating consistent estimator, but can still be estimated by the modified MLE with good finite sample properties.

49

Advantages of Carro (2006) produce:

1. No need to transform the parameters of interest into (information) orthogonal parameters as done in Cox and Reid (1987, JRSS B) or Arellano (2003).

2. No need to impose any conditions on the observed data as in the case of Honore and Kyriazidou (2000). In other words, all observed

data can be utilized to obtain .^

~

M

50

Issues:

1. may not have a closed form solution. Neither is the evaluation of expectation term trivial. Hence, computationally can be tedious.

e.g. in the case of logit model,

)(~

^

i

e

exy

xiti

xiti

iitit

'

'

1),|1(Prob

N

i

T

tiitit

N

i

T

tiit xyx

1 1

'

1 1

' )()(exp1log L log

51

01

log

1

T

titixit

ixit

i

i

i

ye

eL

)(^^^

MLEMLER g

It will be useful to derive a bias-reduced estimatorthat has the form

52

7. Concluding Remarks

Issues:

(i) Is there a simultaneous equation framework for discrete data?

uxyy 111211

uxyy 222122

53

If and are continuous, one can find a

particular realized value of ( u1 , u2 ) satisfying

observed ( , ) or one can consider that

there exist independent shocks ( u1 , u2 ) such

that there exist ( , ) satisfying the model.

y1 y2

y1 y2

y1 y2

54

But if is dichotomous, and exogenous,

then

y1 y2

)1121(121 )(),|1(Prob

xy

duufxyy

)( 1121 xyF

and realized u1 takes value of (1-F) and (-F) while F depends on y2. If y2 is endogenous, then u1 cannot be independent shocks

55

(ii) Is there an equivalent limited information framework for discrete data simultaneous equation model

(iii) Cross-Sectional dependence

Recommended