Propensity Models Versus Weighting Cell Approaches to Nonresponse Adjustment: A Methodological Comparison

Peter H. Siegel, James R. Chromy*, Elizabeth Copello

Joint Statistical Meetings, Minneapolis, MN, August 7-11, 2005

*Presenter

RTI International is a trade name of Research Triangle Institute
3040 Cornwallis Road ¦ P.O. Box 12194 ¦ Research Triangle Park, North Carolina, USA 27709
Phone 919-541-5923 ¦ e-mail [email protected] ¦ Fax 919-541-6416


Acknowledgment

§ The authors were supported in the conduct of this research by the National Center for Education Statistics through the Education Statistical Services Institute.

§ The authors remain responsible for all conclusions, including errors.


Outline of Presentation

§ Introduction and Background

§ Methods

§ Results

§ Summary


Survey Weight Components

§ Design-based weights

§ Nonresponse adjustments

§ Poststratification

§ Control of extremes


Focus of This Research

§ Nonresponse adjustment alternatives

§ Motivation

§ Empirical comparisons on person level weights


Methods Studied

§ Weighting class adjustments

§ Raking (iterative proportional fitting)

§ Logistic regression model

§ Generalized exponential model (GEM)
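As a rough illustration of the raking method listed above, the following minimal sketch runs iterative proportional fitting on two margins. The function name `rake`, the toy respondent weights, and the control totals are hypothetical and are not taken from the study.

```python
def rake(weights, rows, cols, row_totals, col_totals, tol=1e-10, max_iter=100):
    """Scale weights alternately to match row and column control totals."""
    w = list(weights)
    for _ in range(max_iter):
        max_change = 0.0
        for labels, totals in ((rows, row_totals), (cols, col_totals)):
            # Current weighted total in each category of this margin
            sums = {}
            for wk, lab in zip(w, labels):
                sums[lab] = sums.get(lab, 0.0) + wk
            # Ratio-adjust every unit so the margin hits its control total
            for i, lab in enumerate(labels):
                factor = totals[lab] / sums[lab]
                max_change = max(max_change, abs(factor - 1.0))
                w[i] *= factor
        if max_change < tol:
            break
    return w

# Hypothetical respondents: design weights and two classifiers
weights = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
gender = ["F", "F", "F", "M", "M", "M"]
race = ["A", "B", "A", "B", "A", "B"]
gender_totals = {"F": 40.0, "M": 50.0}   # assumed full-sample weighted totals
race_totals = {"A": 45.0, "B": 45.0}

raked = rake(weights, gender, race, gender_totals, race_totals)
```

Only the two margins are controlled here; the interior cell totals are whatever the iteration converges to.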


Deville-Särndal Calibration (JASA 1992)

$$a_k(\lambda) = \frac{l(u-1) + u(1-l)\,e^{A x_k'\lambda}}{(u-1) + (1-l)\,e^{A x_k'\lambda}}$$

$$\sum_{\text{respondents}} x_k d_k a_k(\lambda) = T_x$$
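To make the calibration idea concrete, here is a minimal sketch that evaluates the bounded adjustment factor and solves the calibration equation for a single positive auxiliary variable by bisection. Several things are assumptions for illustration: the constant A is taken as (u - l)/((1 - l)(u - 1)), which is the Folsom-Singh constant with c = 1; the auxiliary is scalar rather than a vector; and the names `ds_factor` and `calibrate` and all data are hypothetical.

```python
import math

def ds_factor(t, lower, upper):
    """Bounded adjustment factor for linear predictor t; equals 1 at t = 0."""
    A = (upper - lower) / ((1.0 - lower) * (upper - 1.0))
    e = math.exp(A * t)
    return (lower * (upper - 1.0) + upper * (1.0 - lower) * e) / (
        (upper - 1.0) + (1.0 - lower) * e
    )

def calibrate(d, x, total, lower, upper, lo=-20.0, hi=20.0, iters=200):
    """Find scalar lambda so that sum(d_k * x_k * a_k(lambda)) equals total."""
    def gap(lam):
        return sum(
            dk * xk * ds_factor(xk * lam, lower, upper) for dk, xk in zip(d, x)
        ) - total

    # With x_k > 0 the weighted total increases in lambda, so bisection applies
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            hi = mid
        else:
            lo = mid
    lam = 0.5 * (lo + hi)
    return lam, [ds_factor(xk * lam, lower, upper) for xk in x]

# Hypothetical respondent design weights and a positive auxiliary variable
d = [100.0, 120.0, 80.0, 150.0]
x = [1.0, 2.0, 1.0, 3.0]
lam, factors = calibrate(d, x, total=1000.0, lower=0.5, upper=3.0)
```

A production solver would handle a vector of auxiliaries, typically with Newton-type iterations; bisection is used here only because it keeps the sketch short.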


Folsom-Singh GEM (JSM 2000)

$$a_k(\lambda) = \frac{l_k(u_k-c_k) + u_k(c_k-l_k)\,e^{A_k x_k'\lambda}}{(u_k-c_k) + (c_k-l_k)\,e^{A_k x_k'\lambda}}$$

$$A_k = \frac{u_k-l_k}{(u_k-c_k)(c_k-l_k)}$$
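The GEM factor differs from the Deville-Särndal form in that the lower bound, centering constant, and upper bound may vary by unit. The sketch below evaluates the factor for two hypothetical units that share a linear predictor but have different upper bounds; the function name `gem_factor` and all numbers are illustrative assumptions.

```python
import math

def gem_factor(t, l, c, u):
    """GEM adjustment factor for linear predictor t, with bounds l < c < u."""
    A = (u - l) / ((u - c) * (c - l))
    e = math.exp(A * t)
    return (l * (u - c) + u * (c - l) * e) / ((u - c) + (c - l) * e)

# Hypothetical units: same predictor, but the second has a tighter upper bound,
# as one might set for a unit whose design weight is already large
units = [
    {"t": 0.8, "l": 1.0, "c": 1.15, "u": 3.0},
    {"t": 0.8, "l": 1.0, "c": 1.15, "u": 1.5},
]
factors = [gem_factor(**unit) for unit in units]
```

At t = 0 the factor equals the centering constant c, and for any t it stays strictly between l and u, which is how bounds on individual adjustments can be enforced within the model.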


GEM Special Cases

GEM 1: $l \to 0,\ u \to \infty,\ c = 1:\quad a_k(\lambda) \to e^{x_k'\lambda}$

GEM 2: $l \to 1,\ u \to \infty,\ c = 2:\quad a_k(\lambda) \to 1 + e^{x_k'\lambda}$
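The two limits can be checked by substituting the stated values of l and c into the GEM factor and letting u grow. The short derivation below is a sketch of that check, assuming common bounds for all units so that the k subscripts on l, c, and u can be dropped.

```latex
\begin{align*}
% GEM 1: l = 0, c = 1, so A = u/(u-1) \to 1 as u \to \infty
a_k(\lambda) &= \frac{u\,e^{A x_k'\lambda}}{(u-1) + e^{A x_k'\lambda}}
              = \frac{e^{A x_k'\lambda}}{1 - \tfrac{1}{u} + \tfrac{1}{u}\,e^{A x_k'\lambda}}
              \;\longrightarrow\; e^{x_k'\lambda} \\[6pt]
% GEM 2: l = 1, c = 2, so A = (u-1)/(u-2) \to 1 as u \to \infty
a_k(\lambda) &= \frac{(u-2) + u\,e^{A x_k'\lambda}}{(u-2) + e^{A x_k'\lambda}}
              = \frac{1 - \tfrac{2}{u} + e^{A x_k'\lambda}}{1 - \tfrac{2}{u} + \tfrac{1}{u}\,e^{A x_k'\lambda}}
              \;\longrightarrow\; 1 + e^{x_k'\lambda}
\end{align*}
```

Under GEM 2 the reciprocal of the factor is 1/(1 + e^{x_k'λ}), a logistic function of -x_k'λ, so the adjustment has the form of an inverse logistic response propensity.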


Other Comparative Studies

§ Folsom and Witt (JSM 1994)

§ Rizzo, Kalton, Brick, and Petroni (JSM 1994)

§ Kalton and Flores-Cervantes (JOS 2003)


Data

§ Education Longitudinal Study of 2002 (ELS:2002) base-year data were used

§ Public school component

§ Student level nonresponse only

§ 87 percent student response

§ 12,039 respondents out of 13,882


Comparison Criteria

§ Relative root mean square difference was used to evaluate the differences between weights across methods.

§ Evaluated the mean, minimum, median, and maximum of the adjustment factors and the weights.

§ Evaluated the unequal weighting effects (UWEs)
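For readers who want to reproduce the last criterion, here is a minimal sketch of an unequal weighting effect calculation. It assumes the common Kish definition, UWE = n Σw² / (Σw)², which equals one plus the squared coefficient of variation of the weights; the slides do not state which definition was used, and the function name `uwe` is hypothetical.

```python
def uwe(weights):
    """Unequal weighting effect under the Kish definition: n * sum(w^2) / sum(w)^2."""
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return n * total_sq / (total * total)

equal = uwe([250.0, 250.0, 250.0, 250.0])   # equal weights: no variance inflation
unequal = uwe([1.0, 3.0])                   # 2 * (1 + 9) / 16
```

Equal weights give a value of 1, and the value rises as the weights become more variable.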


Root Mean Squared Difference

$$\mathrm{RMSD} = \frac{1}{\bar{w}} \sqrt{\frac{\sum_{k=1}^{n} (w_{1k} - w_{2k})^2}{n}}$$
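A small sketch of the relative root mean squared difference between two sets of weights follows. It assumes the normalizer is the mean of the first weight set; the exact normalizer used in the slides is not certain from the text, and the function name `relative_rmsd` and the toy weights are hypothetical.

```python
import math

def relative_rmsd(w1, w2):
    """Root mean squared difference between two weight sets, over the mean of w1."""
    n = len(w1)
    mean_sq_diff = sum((a - b) ** 2 for a, b in zip(w1, w2)) / n
    mean_w1 = sum(w1) / n
    return math.sqrt(mean_sq_diff) / mean_w1

# Hypothetical final weights for the same respondents under two methods
method_a = [100.0, 200.0, 300.0]
method_b = [110.0, 190.0, 300.0]
rmsd_ab = relative_rmsd(method_a, method_b)
```

Because the result is divided by the mean weight, a value such as 0.01 reads as a typical difference of about one percent of the average weight.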


One-Variable Model

§ Illustrated with gender (2 levels)

§ All methods gave the same results
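With a single categorical variable, the weighting class adjustment is the ratio of the full-sample weighted total to the respondent weighted total within each class. The sketch below computes those factors for a hypothetical gender split; the function name and data are illustrative only.

```python
from collections import defaultdict

def weighting_class_factors(design_weights, classes, responded):
    """Per-class factor: weighted sample total / weighted respondent total."""
    sample_total = defaultdict(float)
    resp_total = defaultdict(float)
    for d, c, r in zip(design_weights, classes, responded):
        sample_total[c] += d
        if r:
            resp_total[c] += d
    # A class with no respondents raises ZeroDivisionError; in practice
    # such cells would have to be collapsed with others first
    return {c: sample_total[c] / resp_total[c] for c in sample_total}

# Hypothetical sample of six students
design_weights = [100.0, 100.0, 150.0, 150.0, 200.0, 200.0]
gender = ["F", "F", "F", "M", "M", "M"]
responded = [True, True, False, True, False, True]

factors = weighting_class_factors(design_weights, gender, responded)
```

A model with one indicator per class has as many parameters as classes, so a saturated fit reproduces these class-level ratios, which is consistent with all methods agreeing in the one-variable case.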


Two-Variable Model

§ Gender

§ Race/ethnicity (4 levels)

§ Marginal controls only

§ Fully interacted model (8 cells)


Two Variables (Marginal Controls)

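RMSD between each pair of methods, with UWE and mean weight by method:

| Method       | Raking (IPF) | GEM Case 1 | Logistic RP | GEM Case 2 |
|--------------|--------------|------------|-------------|------------|
| Raking (IPF) |              |            |             |            |
| GEM Case 1   | 0.000000     |            |             |            |
| Logistic RP  | 0.001623     | 0.001623   |             |            |
| GEM Case 2   | 0.001619     | 0.001619   | 0.000067    |            |
| UWE          | 1.5695       | 1.5695     | 1.5692      | 1.5692     |
| Mean weight  | 263.8699     | 263.8699   | 263.8740    | 263.8699   |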


Four-Variable Model

§ Region (4 levels)

§ Metropolitan status (3 levels)

§ Cell model (96 cells) no longer feasible without collapsing cells


Four Variables (Marginal Controls)

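RMSD between each pair of methods, with UWE and mean weight by method:

| Method       | Raking (IPF) | GEM Case 1 | Logistic RP | GEM Case 2 |
|--------------|--------------|------------|-------------|------------|
| Raking (IPF) |              |            |             |            |
| GEM Case 1   | 0.005106     |            |             |            |
| Logistic RP  | 0.008619     | 0.010184   |             |            |
| GEM Case 2   | 0.008862     | 0.010337   | 0.001455    |            |
| UWE          | 1.5953       | 1.5971     | 1.5944      | 1.5956     |
| Mean weight  | 263.8699     | 263.8699   | 263.8793    | 263.8699   |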


Six Variables (Marginal Controls)

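RMSD between each pair of methods, with UWE and mean weight by method:

| Method       | Raking (IPF) | GEM Case 1 | Logistic RP | GEM Case 2 |
|--------------|--------------|------------|-------------|------------|
| Raking (IPF) |              |            |             |            |
| GEM Case 1   | 0.008217     |            |             |            |
| Logistic RP  | 0.021364     | 0.023303   |             |            |
| GEM Case 2   | 0.022165     | 0.024069   | 0.003165    |            |
| UWE          | 1.5952       | 1.5961     | 1.6020      | 1.6025     |
| Mean weight  | 263.8699     | 263.8699   | 263.8395    | 263.8699   |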


Eight Variables (Marginal Controls)

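RMSD between each pair of methods, with UWE and mean weight by method:

| Method       | Raking (IPF) | GEM Case 1 | Logistic RP | GEM Case 2 |
|--------------|--------------|------------|-------------|------------|
| Raking (IPF) |              |            |             |            |
| GEM Case 1   | 0.017508     |            |             |            |
| Logistic RP  | 0.020650     | 0.025503   |             |            |
| GEM Case 2   | 0.022043     | 0.027410   | 0.004712    |            |
| UWE          | 1.6135       | 1.6120     | 1.6120      | 1.6138     |
| Mean weight  | 263.8699     | 263.8699   | 263.8073    | 263.8699   |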


Conclusions—Marginal Controls

§ For the cases studied, differences among methods are small.

§ GEM special case 1 approximates the results of raking.

§ GEM special case 2 approximates the results of logistic propensity modeling.

§ Logistic response propensity modeling does not force the weights to marginal totals, so its mean weight differs slightly.


Comments on Weighting Class

§ Simplest; works well with small samples.

§ Quickly leads to empty cells

§ Collapsing is then required, so results are no longer comparable with the other methods

§ Judgment intervenes

§ CHAID or other clustering algorithms useful

§ Unequal weighting effects tend to increase with number of cells.


General Comments

§ GEM provides a single framework for nonresponse adjustment and poststratification

§ Raking can be implemented as a special case.

§ Weighting class can be implemented as a special case, often after considerable collapsing of cells.

§ Procedures for trimming extreme weights (not discussed here) are built into the process.

§ Does not eliminate all judgment calls


General Comments (Cont’d)

§ Users may wish to collapse some dimensions

§ Pushing the method to its limits will require user judgments

§ Unequal weighting effects will increase as the number of controls increases.


More Information

§ www.rti.org/jsm

§ E-mail: [email protected]