Multilevel Models in Survey Error Estimation

Joop Hox, Utrecht University

Page 1: Multilevel Models in Survey Error Estimation

Multilevel Models in Survey Error Estimation

Joop Hox, Utrecht University

mlsurvey

Page 2: Multilevel Models in Survey Error Estimation

Multilevel Modeling; some terminology/distinctions

• Two broad classes of multilevel models
  • Multilevel regression analysis (HLM, MLwiN, SAS Proc Mixed, SPSS Mixed)
  • Multilevel structural equation analysis (Lisrel 8.5, EQS 6, Mplus)
• Which are merging (Mplus, GLLAMM)

Page 3: Multilevel Models in Survey Error Estimation

Multilevel Modeling; some terminology/distinctions

• Multilevel modeling = a statistical model that allows specifying and estimating relationships between variables that have been observed at different levels of a hierarchical data structure
• Here mostly examples from multilevel regression modeling

Page 4: Multilevel Models in Survey Error Estimation

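Multilevel Regression Model

• Lowest (individual) level:
  $Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + e_{ij}$
• Second (group) level:
  $\beta_{0j} = \gamma_{00} + \gamma_{01} Z_j + u_{0j}$
  $\beta_{1j} = \gamma_{10} + \gamma_{11} Z_j + u_{1j}$
• Combining:
  $Y_{ij} = \gamma_{00} + \gamma_{10} X_{ij} + \gamma_{01} Z_j + \gamma_{11} Z_j X_{ij} + u_{1j} X_{ij} + u_{0j} + e_{ij}$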
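A minimal sketch of the substitution step on this slide: plugging the two group-level equations into the individual-level equation gives the same value as the combined single-equation form. All coefficient values and the function names `two_step` and `combined` are hypothetical, chosen only for illustration.

```python
# Hypothetical fixed coefficients (gamma_00, gamma_01, gamma_10, gamma_11).
g00, g01, g10, g11 = 2.0, 0.5, 1.5, -0.3

def two_step(x, z, u0, u1, e):
    """Evaluate the group-level equations first, then the individual-level one."""
    b0 = g00 + g01 * z + u0      # group-specific intercept
    b1 = g10 + g11 * z + u1      # group-specific slope
    return b0 + b1 * x + e

def combined(x, z, u0, u1, e):
    """Single-equation form: fixed part plus the complex error term."""
    fixed = g00 + g10 * x + g01 * z + g11 * z * x
    random_part = u1 * x + u0 + e
    return fixed + random_part

y_a = two_step(x=1.2, z=0.7, u0=0.1, u1=-0.2, e=0.05)
y_b = combined(x=1.2, z=0.7, u0=0.1, u1=-0.2, e=0.05)
print(abs(y_a - y_b) < 1e-12)
```

The cross-level interaction term (Z times X) and the error term that depends on X both fall out of the substitution; neither has to be specified separately.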

Page 5: Multilevel Models in Survey Error Estimation

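The Intercept-Only Model

• Intercept-only model (null model, baseline model)
• Contains only the intercept and corresponding error terms:
  $Y_{ij} = \gamma_{00} + u_{0j} + e_{ij}$
• Gives the intraclass correlation $\rho$:
  $\rho = \sigma^2_{u_0} / (\sigma^2_e + \sigma^2_{u_0})$
• Design effect:
  $\mathit{deff} = 1 + (n - 1)\hat{\rho}$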
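As a rough illustration, the intraclass correlation (group-level variance over total variance) and the design effect (1 plus cluster size minus 1, times rho) can be computed directly from the two variance components of the intercept-only model. The variance values and the cluster size below are hypothetical.

```python
def intraclass_correlation(var_u0, var_e):
    """Share of the total variance that lies at the group level."""
    return var_u0 / (var_u0 + var_e)

def design_effect(rho, n_per_cluster):
    """Variance inflation from cluster sampling with equal cluster sizes."""
    return 1 + (n_per_cluster - 1) * rho

# Hypothetical variance components and cluster size.
rho = intraclass_correlation(var_u0=0.5, var_e=4.5)
deff = design_effect(rho, n_per_cluster=21)
print(rho, deff)
```

Even a modest intraclass correlation leads to a sizeable design effect once clusters are large, which is why ignoring clustering understates standard errors.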

Page 6: Multilevel Models in Survey Error Estimation

The Fixed Model

• Only fixed effects for explanatory variables
• Slopes do not vary across groups:
  $Y_{ij} = \gamma_{00} + \gamma_{10} X_{1ij} + \dots + \gamma_{p0} X_{pij} + u_{0j} + e_{ij}$
• Intercept variance ($u_{0j}$) across groups
• Variance component model
• Maximum Likelihood estimation, correct standard errors for clustered data

Page 7: Multilevel Models in Survey Error Estimation

Using the Fixed Model in Survey Research?

• Multiple regression (including logistic) is a powerful analysis system
  (Jacob Cohen (1968). Multiple regression as a general data-analytic system. Psychological Bulletin, 70, 426-443.)
  $Y_{ij} = \gamma_{00} + \gamma_{10} X_{1ij} + \dots + \gamma_{p0} X_{pij} + u_{0j} + e_{ij}$
• Multiple regression model, but with correct standard errors for clustered data
• But most multilevel software does not correctly handle weights or stratification

Page 8: Multilevel Models in Survey Error Estimation

Using the Fixed Model in Survey Research?

• Multilevel regression in survey data analysis: a niche product
• Individuals within groups
  • Interviewer & Survey Organization effects
• Groups consisting of individuals
  • Ratings & Measures of Contexts
• Occasions within individuals
  • Longitudinal & Panel data

Page 9: Multilevel Models in Survey Error Estimation

Individuals within groups

• Interviewer & Organization effects
• Potentially a three-level structure
  • Respondents within Interviewers within Organizations
  $Y_{ijk} = \gamma_{000} + \gamma_{001} X_{ijk} + \gamma_{010} Z_{jk} + \gamma_{100} W_k + u_{0k} + u_{0jk} + e_{ijk}$
• Variance components model

Page 10: Multilevel Models in Survey Error Estimation

Interviewers in organizations

• “I am not selling anything”
  • Split-run experiment on adding ‘not selling’ argument to standard telephone intro
  • Multisite study: 10 market research organizations agreed to run experiment in their standard surveys
  • Data from 101625 cases in 29 surveys within 10 organizations
• Predict cooperation rate
  • Survey level: experiment, saliency, special pop., nationwide, interview duration, length of intro before ‘not selling’
  • Organization level: no predictors, just variance component
  $P_{ij} = \gamma_{00} + \gamma_{01}\,\mathrm{Exp/Con}_{ij} + \gamma_{02} X_{1ij} + \dots + \gamma_{06} X_{6ij} + u_{0j}\ (+\ e_{ij})$

De Leeuw/Hox (2004). I am not selling anything: 29 experiments in telephone introductions. IJPOR, 16, 464-473.

Page 11: Multilevel Models in Survey Error Estimation

Interviewers in organizations across countries

• International cooperation on interviewer effects on nonresponse
  • Data from 3064 interviewers, employed in 32 survey organizations, in nine countries
  • Interviewer response rate, cooperation rate
  • Standardized interviewer questionnaire (translated by organizations)
• Standardizing interviewer questionnaire across countries
  • Not multilevel but multigroup SEM
  • Confirmatory Factor Analysis shows comparable factors in (translated) questionnaires

Hox/de Leeuw (2002). The influence of interviewers' attitude and behavior on household survey nonresponse: an international comparison. In Groves, Dillman, Eltinge & Little (Eds.) Survey Nonresponse. New York: Wiley.

Page 12: Multilevel Models in Survey Error Estimation

Predicting response rate

Final multilevel model for interviewer response rates

Predictor / Model         Null Model    Final Model
constant                  1.25 (.30)    .80 (.40)
age                                     .01 (.001)
sex                                     .05 (.02)
experience                              .01 (.001)
soc.val.                                -.02 (.01)
foot in door                            .01 (.01) ns
persuasion                              .10 (.01)
voluntariness                           -.02 (.01)
send other                              -.01 (.005)
$\sigma^2_{country}$      .59 (.37)     .58 (.36)
$\sigma^2_{survey}$       .41 (.13)     .39 (.12)

Page 13: Multilevel Models in Survey Error Estimation

Multilevel analysis of Interviewer & Organization Effects

• Useful for methodological research
• Standard multilevel regression
• Response rates: logistic regression
  • Estimation issues
  • Discussed in Goldstein (2003), Raudenbush & Bryk (2004), Hox (2002)
• Currently best method
  • Hox, de Leeuw & Kreft 1991; Hox & de Leeuw 2002; Pickery & Loosveldt 1998, 1999; Campanelli & O’Muircheartaigh 1999, 2002; Schräpler 2004

Page 14: Multilevel Models in Survey Error Estimation

Groups consisting of individuals

• Measuring contextual characteristics
• Aggregation: characterizing groups by summarizing the scores of individuals in these groups
• Contextual measurement: let individuals within groups rate group or environment characteristics
  • What are the qualities of such ratings?

Page 15: Multilevel Models in Survey Error Estimation

Measuring contextual characteristics

• Example: use pupils in schools to rate characteristics of the school manager
  • 854 pupils from 96 schools rate 48 male + 48 female managers
  • Variables: six seven-point items on leadership style
• Two levels: pupils within schools
  • Pupils are informants on school manager
  • Pupil level exists, but is not important

Page 16: Multilevel Models in Survey Error Estimation

Measuring contextual characteristics

• Pupils in schools rate school managers
• Two levels: pupils within schools
• Analysis options
  • Treat as two-level multivariate problem: multilevel SEM (Mplus, Lisrel, EQS)
  • Treat as three-level problem with levels variables, pupils, schools: multilevel regression (HLM, MLwiN)

Page 17: Multilevel Models in Survey Error Estimation

Measuring the context with multilevel regression

• Three levels: variables, pupils, schools
• Intercept-only model:
  $Y_{hij} = \gamma_{000} + u_{0hij} + u_{0ij} + u_{0j}$
• Estimates: intercept 2.57, $\sigma^2_{school} = 0.179$, $\sigma^2_{pupil} = 0.341$, $\sigma^2_{item} = 0.845$

Page 18: Multilevel Models in Survey Error Estimation

Measuring the context: Interpretation of estimates

• Intercept 2.57
  • Item mean across items, pupils, schools
• $\sigma^2_{school} = 0.179$
  • Variation of item means across schools
• $\sigma^2_{pupil} = 0.341$
  • Variation of item means across pupils
• $\sigma^2_{item} = 0.845$
  • Item variation (inconsistency)

Page 19: Multilevel Models in Survey Error Estimation

Measuring the context: Reliability of measurement

• Decomposition of total variance over item, pupil & school level
• Pupil level reliability
  • Consistency of pupils across items
  • Idiosyncratic responses, unique experience
  $\alpha_{pupil} = \sigma^2_{pupil} / (\sigma^2_{pupil} + \sigma^2_{item} / k)$
  $\alpha_{pupil} = 0.71$

Page 20: Multilevel Models in Survey Error Estimation

Measuring the context: Reliability of measurement

• Decomposition of total variance over item, pupil & school level
• School level reliability
  • Consistency of pupils about manager
  $\alpha_{school} = \sigma^2_{school} / (\sigma^2_{school} + \sigma^2_{pupil} / n_j + \sigma^2_{item} / (k\,n_j))$
  $\alpha_{school} = 0.77$
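A small sketch that recomputes the pupil-level and school-level reliabilities from the three variance estimates reported for the intercept-only model (school 0.179, pupil 0.341, item 0.845) with k = 6 items. The number of pupils per school is an assumption here: the sketch uses the average 854 / 96, since the slides do not state which value was used.

```python
# Variance estimates reported for the three-level intercept-only model.
var_school, var_pupil, var_item = 0.179, 0.341, 0.845

k = 6             # six items on leadership style
n_j = 854 / 96    # assumption: average number of pupils per school

# Pupil level: consistency of a pupil's answers across the k items.
alpha_pupil = var_pupil / (var_pupil + var_item / k)

# School level: consistency of the school mean over n_j pupils and k items.
alpha_school = var_school / (var_school + var_pupil / n_j + var_item / (k * n_j))

print(round(alpha_pupil, 2), round(alpha_school, 2))
```

Under these assumptions the rounded values agree with the 0.71 and 0.77 reported on the slides.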

Page 21: Multilevel Models in Survey Error Estimation

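Measuring the Context: Increasing reliability

• School level reliability depends on
  • Mean correlation between items $\bar{r}$
  • Intraclass correlation for school $\rho_I$
  • Number of items $k$
  • Number of pupils $n_j$
  $\alpha_{school} = \dfrac{k\,n_j\,\rho_I\,\bar{r}}{k\,n_j\,\rho_I\,\bar{r} + [(k - 1)\,\bar{r} + 1](1 - \rho_I)}$
• $\alpha_{school}$ goes up fastest with increasing $n_j$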
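A hedged sketch of how school-level reliability responds to more pupils versus more items. It assumes the form: k times n_j times rho_I times r-bar, divided by that same product plus ((k - 1) r-bar + 1)(1 - rho_I). It further assumes that rho_I is the school variance over school plus pupil variance, and that r-bar is the pupil variance over pupil plus item variance. These definitions are assumptions made so that the formula agrees with the variance-component version of the school-level reliability; the function name is hypothetical.

```python
def alpha_school(k, n_j, rho_i, r_bar):
    """School-level reliability from items k, pupils n_j, ICC rho_i, mean item correlation r_bar."""
    signal = k * n_j * rho_i * r_bar
    noise = ((k - 1) * r_bar + 1) * (1 - rho_i)
    return signal / (signal + noise)

# Assumed definitions, using the variance estimates reported earlier.
var_school, var_pupil, var_item = 0.179, 0.341, 0.845
rho_i = var_school / (var_school + var_pupil)
r_bar = var_pupil / (var_pupil + var_item)

k, n_j = 6, 854 / 96   # n_j: average pupils per school (assumption)

base = alpha_school(k, n_j, rho_i, r_bar)
more_pupils = alpha_school(k, 2 * n_j, rho_i, r_bar)   # double the pupils
more_items = alpha_school(2 * k, n_j, rho_i, r_bar)    # double the items
print(round(base, 2), round(more_pupils, 2), round(more_items, 2))
```

Doubling the number of pupils raises the reliability more than doubling the number of items, in line with the remark that it goes up fastest with increasing n_j.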

Page 22: Multilevel Models in Survey Error Estimation

Measuring the context: Combining information

• Assume school managers are rated on these 7 items by pupils and themselves
• Three levels: items, pupils, schools
• Two dummy variables that indicate pupil & self ratings
• Variances: item (1), pupil (1), school (2 + cov)
  • Item variance (error)
  • Pupil variance (bias)
  • Manager variance (systematic)
  • Rating covariance (validity)

Page 23: Multilevel Models in Survey Error Estimation

Example: Measuring neighborhood characteristics

• Neighborhoods & Violent Crime
• Assessment of neighborhoods
  • 343 neighborhoods
  • ± 25 respondents per neighborhood interviewed & rated own neighborhood (respondent level)
  • Ratings aggregated to neighborhood level
  • Census information on neighborhood added

Sampson/Raudenbush/Earls (1997). Neighborhoods and violent crime: A multilevel study of collective efficacy. Science, 277, 918-924.

Page 24: Multilevel Models in Survey Error Estimation

Example: Measuring neighborhood characteristics

• Ratings aggregated to neighborhood level
• At lowest level, demographic variables of respondents added to control for rating bias due to different subsamples
• Neighborhood ratings aggregated conditional on respondent characteristics
  $Y_{ijk} = \gamma_{000} + \gamma_{001} X_{ijk} + u_{0k} + u_{0jk} + e_{ijk}$
• Intercept-only + individual covariates

Page 25: Multilevel Models in Survey Error Estimation

Occasions within individuals

• Six persons on up to four occasions
• Lowest level: occasion; second: person
• Mix time variant (occasion level) and time invariant (person level) predictors
• Time: trend covariate (1, 2, 3…) or occasion dummies (0/1)
• Missing occasions are no problem
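One way to see why missing occasions are no problem: in the occasions-within-persons layout, every observed occasion is one row, so a missed occasion just contributes no row and the person stays in the data. The sketch below restructures hypothetical wide-format scores (one list per person, `None` for a missed occasion) into that long format; the data and names are made up for illustration.

```python
# Hypothetical wide-format data: one list of scores per person, None = missed occasion.
wide = {
    "p1": [3.0, 3.5, None, 4.1],
    "p2": [2.2, None, None, 2.9],
    "p3": [1.0, 1.4, 1.9, 2.3],
}

# Long format: one row per observed occasion, with a trend covariate 1, 2, 3, ...
long_rows = [
    {"person": pid, "occasion": t, "y": y}
    for pid, scores in wide.items()
    for t, y in enumerate(scores, start=1)
    if y is not None   # a missed occasion contributes no row
]

for row in long_rows:
    print(row)
```

Person p2 keeps two rows (occasions 1 and 4) instead of being dropped from the analysis.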

Page 26: Multilevel Models in Survey Error Estimation

Longitudinal data: Occasion level

• Occasion level, time indicator T:
  $Y_{ti} = \pi_{0i} + \pi_{1i} T_{ti} + e_{ti}$
• Intercept and slope coefficients vary across the persons
• They are the starting points and rates of change for the different persons
• Use $\pi$ for occasion level coefficients, and t for the occasion subscript
• On person level we have again $\beta$ and i

Page 27: Multilevel Models in Survey Error Estimation

Longitudinal data: Multilevel model

• Occasion level: time varying covariates
  $Y_{ti} = \pi_{0i} + \pi_{1i} T_{ti} + \pi_{2i} X_{ti} + e_{ti}$
• Person level: time invariant covariates
  $\pi_{0i} = \beta_{00} + \beta_{01} Z_i + u_{0i}$
  $\pi_{1i} = \beta_{10} + \beta_{11} Z_i + u_{1i}$
  $\pi_{2i} = \beta_{20} + \beta_{21} Z_i + u_{2i}$
• T time-points, at most T-1 time varying predictors
• Or T time varying predictors and no intercept

Page 28: Multilevel Models in Survey Error Estimation

Longitudinal data: NLSY Example

• Subset of National Longitudinal Survey of Youth (NLSY)
  • 405 children within 2 years of entering elementary school
• 4 repeated measurement occasions
  • Child’s antisocial behavior and reading recognition skills
• 1 single measure at 1st occasion
  • Mother’s emotional support and cognitive stimulation

Page 29: Multilevel Models in Survey Error Estimation

NLSY Example: Linear Trend

• Multilevel regression model for longitudinal NLSY data
• No ‘intercept-only’ model, start with a model that includes time
• Occasion fixed:
  $\mathrm{Antisoc}_{ti} = \beta_{00} + \beta_{10}\,\mathrm{Occ}_{ti} + u_{0i} + e_{ti}$
• Occasion random:
  $\mathrm{Antisoc}_{ti} = \beta_{00} + \beta_{10}\,\mathrm{Occ}_{ti} + u_{1i}\,\mathrm{Occ}_{ti} + u_{0i} + e_{ti}$
  • Different individual trends over time

Page 30: Multilevel Models in Survey Error Estimation

NLSY Example: Results linear trend

                                   Linear, Fixed    Linear, Random
Intercept                          1.58 (.11)       1.56 (.10)
Occasion                           0.14 (.03)       0.15 (.04)
$\sigma^2_{intercept}$             1.84 (.17)       0.96 (.31)
$\sigma^2_{occasion}$              -                0.10 (.04)
$\sigma_{intercept,occasion}$      -                .09 (.10)
$\sigma^2_e$                       1.91 (.09)       1.74 (.10)
Deviance                           5356.82          5318.12

Page 31: Multilevel Models in Survey Error Estimation

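Complex Covariance Structures

• Standard model for longitudinal data
• Occasion random:
  $\mathrm{Antisoc}_{ti} = \beta_{00} + \beta_{10}\,\mathrm{Occ}_{ti} + u_{1i}\,\mathrm{Occ}_{ti} + u_{0i} + e_{ti}$
• Variance components: $\sigma^2_e$ and $\sigma^2_{00}$
• Assumes a very simple error structure
  • Variance at any occasion equal to $\sigma^2_e + \sigma^2_{00}$
  • Covariance between any two occasions equal to $\sigma^2_{00}$
• Thus, matrix of covariances between occasions is

$$\Sigma(Y) = \begin{pmatrix}
\sigma^2_e + \sigma^2_{00} & \sigma^2_{00} & \sigma^2_{00} & \sigma^2_{00} \\
\sigma^2_{00} & \sigma^2_e + \sigma^2_{00} & \sigma^2_{00} & \sigma^2_{00} \\
\sigma^2_{00} & \sigma^2_{00} & \sigma^2_e + \sigma^2_{00} & \sigma^2_{00} \\
\sigma^2_{00} & \sigma^2_{00} & \sigma^2_{00} & \sigma^2_e + \sigma^2_{00}
\end{pmatrix}$$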
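A minimal sketch of this covariance structure for a model whose only person-level random term is the intercept: the same variance on every diagonal cell and the same covariance in every off-diagonal cell. For illustration it plugs in the occasion-level and intercept variance estimates reported for the "Linear, Fixed" model (1.91 and 1.84); the function name is hypothetical.

```python
def compound_symmetry(var_e, var_00, n_occasions):
    """Covariance matrix implied by a random intercept plus occasion-level error."""
    return [
        [var_00 + (var_e if row == col else 0.0) for col in range(n_occasions)]
        for row in range(n_occasions)
    ]

# Estimates from the "Linear, Fixed" results: occasion-level and intercept variance.
sigma = compound_symmetry(var_e=1.91, var_00=1.84, n_occasions=4)

# Implied correlation between any two occasions, the same for every pair.
implied_corr = sigma[0][1] / sigma[0][0]

for row in sigma:
    print([round(v, 2) for v in row])
print(round(implied_corr, 2))
```

The implied correlation does not depend on how far apart two occasions are, which is the restriction that the more complex covariance structures relax.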

Page 32: Multilevel Models in Survey Error Estimation

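Complex Covariance Structures

• Multivariate multilevel model
  • No intercept, include 6 dummies for 6 occasions
  • No variance component at occasion level
  • All dummies random at individual level
  • Equivalent to Manova approach to repeated measures
• Covariance matrix:

$$\Sigma(Y) = \begin{pmatrix}
\sigma^2_{11} & \sigma_{12} & \sigma_{13} & \sigma_{14} \\
\sigma_{21} & \sigma^2_{22} & \sigma_{23} & \sigma_{24} \\
\sigma_{31} & \sigma_{32} & \sigma^2_{33} & \sigma_{34} \\
\sigma_{41} & \sigma_{42} & \sigma_{43} & \sigma^2_{44}
\end{pmatrix}$$

• Add occasion, fixed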

Page 33: Multilevel Models in Survey Error Estimation

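Complex Covariance Structures

• Restricted model for longitudinal data
  • Specific constraints on covariance matrix between occasions
  • Example: assume that autocorrelations between adjacent time points are higher than between other time points (simplex model)
  • Example: assume that autocorrelations follow the model $e_t = \rho\,e_{t-1} + \varepsilon_t$

$$\Sigma(Y) = \frac{\sigma^2_e}{1 - \rho^2} \begin{pmatrix}
1 & \rho & \rho^2 & \rho^3 \\
\rho & 1 & \rho & \rho^2 \\
\rho^2 & \rho & 1 & \rho \\
\rho^3 & \rho^2 & \rho & 1
\end{pmatrix}$$

• Add occasion, fixed or random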
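A short sketch of the first-order autocorrelation structure: each covariance equals a common scale, the error variance divided by (1 - rho squared), times rho raised to the distance between the two occasions. The parameter values are hypothetical and the function name is made up for illustration.

```python
def ar1_covariance(var_e, rho, n_occasions):
    """Covariance matrix for errors following e_t = rho * e_(t-1) + new error."""
    scale = var_e / (1 - rho ** 2)
    return [
        [scale * rho ** abs(row - col) for col in range(n_occasions)]
        for row in range(n_occasions)
    ]

# Hypothetical values for the error variance and the autocorrelation.
sigma = ar1_covariance(var_e=1.0, rho=0.5, n_occasions=4)

for row in sigma:
    print([round(v, 3) for v in row])
```

Only two parameters describe the whole matrix, and covariances shrink as occasions lie further apart, unlike the equal covariances of the standard model.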

Page 34: Multilevel Models in Survey Error Estimation

NLSY Example: Linear trend, Complex covariance structure

• Occasion fixed, unrestricted covariance matrix across occasions
• Occasion fixed, covariance matrix autocorrelation structure
• Occasion random, covariance matrix autocorrelation structure

Page 35: Multilevel Models in Survey Error Estimation

NLSY Example: Results linear trend, fixed part

             Fixed, Unconstrained    Fixed, Autocorrelation    Random, Autocorrelation
Intercept    1.55 (.10)              1.54 (.13)                1.54 (.13)
Occasion     0.14 (.04)              0.15 (.05)                .15 (.05)
Deviance     5303.95                 5401.65                   5401.65

• Linear trend + random slope model: deviance 5318.12 with 8 fewer parameters, $\chi^2 = 14.2$, df = 8, p = 0.08
• Autocorrelation models: far worse than unconstrained model, $\chi^2 = 97.7$, df = 8, p < 0.0001
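The two chi-square comparisons can be retraced from the reported deviances. The sketch below takes each deviance difference against the unconstrained model (5303.95) and evaluates the upper-tail probability with a closed form that holds for even degrees of freedom only, so no statistics library is needed. The helper name is hypothetical.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(chi-square with df degrees of freedom > x); closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i       # (x/2)^i / i!
        total += term
    return math.exp(-half) * total

dev_unconstrained = 5303.95   # occasion fixed, unconstrained covariance matrix
dev_random_slope = 5318.12    # linear trend + random slope
dev_autocorr = 5401.65        # autocorrelation structure

chi2_slope = dev_random_slope - dev_unconstrained
chi2_auto = dev_autocorr - dev_unconstrained
p_slope = chi2_upper_tail_even_df(chi2_slope, 8)
p_auto = chi2_upper_tail_even_df(chi2_auto, 8)

print(round(chi2_slope, 1), round(p_slope, 2))
print(round(chi2_auto, 1), p_auto < 0.0001)
```

With df = 8 in both comparisons, the random slope model is not significantly worse than the unconstrained model at the 5% level, while the autocorrelation models clearly are.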

Page 36: Multilevel Models in Survey Error Estimation

NLSY Example: Results linear trend, random part

                    Fixed, Unconstrained                                Fixed, Autocorrelation                               Random, Autocorrelation
Occasion linear     -                                                   -                                                    Aliased out (redundant)
Occasion dummies    Full covariance matrix, all elements significant    Diagonal variance, autocorr. rho both significant    Diagonal variance, autocorr. rho both significant

Page 37: Multilevel Models in Survey Error Estimation

Advantages of Multilevel Modeling Longitudinal Data

• Missing occasion data are no problem
  • Manova = listwise deletion, which wastes data
  • Manova = Missing Completely At Random (MCAR)
  • Multilevel model = Missing At Random (MAR)
• Can be used for panel & growth models
  • Rate of change may differ across persons, and be predicted by person characteristics
• Easy to extend to more levels (groups)

Page 38: Multilevel Models in Survey Error Estimation

References for Multilevel Analysis

J.J. Hox (1995). Applied Multilevel Analysis. (http://www.fss.uu.nl/ms/jh) (introductory)

J.J. Hox (2002). Multilevel Analysis. Techniques and Applications. Hillsdale, NJ: Erlbaum. (intermediate)

T.A.B. Snijders & R.J. Bosker (1999). Multilevel Analysis. Thousand Oaks, CA: Sage. (more technical)

H. Goldstein (2003). Multilevel Statistical Models. London: Arnold Publishers. (very technical)
