Multilevel and Longitudinal Modeling Using Stata Volume I: Continuous Responses

Third Edition

SOPHIA RABE-HESKETH University of California, Berkeley; Institute of Education, University of London

ANDERS SKRONDAL Norwegian Institute of Public Health

A Stata Press Publication StataCorp LP College Station, Texas

Contents

List of Tables xvii

List of Figures xix

Preface xxv

Multilevel and longitudinal models: When and why? 1

I Preliminaries 9

1 Review of linear regression 11

1.1 Introduction 11

1.2 Is there gender discrimination in faculty salaries? 11

1.3 Independent-samples t test 12

1.4 One-way analysis of variance 17

1.5 Simple linear regression 19

1.6 Dummy variables 27

1.7 Multiple linear regression 30

1.8 Interactions 36

1.9 Dummy variables for more than two groups 42

1.10 Other types of interactions 48

1.10.1 Interaction between dummy variables 48

1.10.2 Interaction between continuous covariates 50

1.11 Nonlinear effects 52

1.12 Residual diagnostics 54

1.13 ••• Causal and noncausal interpretations of regression coefficients 56

1.13.1 Regression as conditional expectation 56

1.13.2 Regression as structural model 57

1.14 Summary and further reading 59

1.15 Exercises 60

II Two-level models 71

2 Variance-components models 73

2.1 Introduction 73

2.2 How reliable are peak-expiratory-flow measurements? 74

2.3 Inspecting within-subject dependence 75

2.4 The variance-components model 77

2.4.1 Model specification 77

2.4.2 Path diagram 78

2.4.3 Between-subject heterogeneity 79

2.4.4 Within-subject dependence 80

Intraclass correlation 80

Intraclass correlation versus Pearson correlation 81

2.5 Estimation using Stata 82

2.5.1 Data preparation: Reshaping to long form 83

2.5.2 Using xtreg 84

2.5.3 Using xtmixed 85

2.6 Hypothesis tests and confidence intervals 87

2.6.1 Hypothesis test and confidence interval for the population mean 87

2.6.2 Hypothesis test and confidence interval for the between-cluster variance 88

Likelihood-ratio test 88

••• Score test 89

F test 92

Confidence intervals 92

2.7 Model as data-generating mechanism 93

2.8 Fixed versus random effects 95

2.9 Crossed versus nested effects 97

2.10 Parameter estimation 99

2.10.1 Model assumptions 99

Mean structure and covariance structure 100

Distributional assumptions 101

2.10.2 Different estimation methods 101

2.10.3 Inference for β 103

Estimate and standard error: Balanced case 103

Estimate: Unbalanced case 105

2.11 Assigning values to the random intercepts 106

2.11.1 Maximum "likelihood" estimation 106

Implementation via OLS regression 107

Implementation via the mean total residual 108

2.11.2 Empirical Bayes prediction 109

2.11.3 Empirical Bayes standard errors 113

Comparative standard errors 113

Diagnostic standard errors 114


2.13 Exercises 116

3 Random-intercept models with covariates 123

3.1 Introduction 123

3.2 Does smoking during pregnancy affect birthweight? 123

3.2.1 Data structure and descriptive statistics 125

3.3 The linear random-intercept model with covariates 127


3.3.2 Model assumptions 128

3.3.3 Mean structure 130

3.3.4 Residual variance and intraclass correlation 130

3.3.5 Graphical illustration of random-intercept model 131

3.4 Estimation using Stata 131

3.4.1 Using xtreg 132

3.4.2 Using xtmixed 133

3.5 Coefficients of determination or variance explained 134

3.6 Hypothesis tests and confidence intervals 138

3.6.1 Hypothesis tests for regression coefficients 138

Hypothesis tests for individual regression coefficients 138

Joint hypothesis tests for several regression coefficients 139

3.6.2 Predicted means and confidence intervals 140

3.6.3 Hypothesis test for random-intercept variance 142

3.7 Between and within effects of level-1 covariates 142

3.7.1 Between-mother effects 143

3.7.2 Within-mother effects 145

3.7.3 Relations among estimators 147

3.7.4 Level-2 endogeneity and cluster-level confounding 149

3.7.5 Allowing for different within and between effects 152

3.7.6 Hausman endogeneity test 157

3.8 Fixed versus random effects revisited 158

3.9 Assigning values to random effects: Residual diagnostics 160

3.10 More on statistical inference 164

3.10.1 ••• Overview of estimation methods 164

3.10.2 Consequences of using standard regression modeling for clustered data 167

3.10.3 ••• Power and sample-size determination 168


3.12 Exercises 172

4 Random-coefficient models 181


4.2 How effective are different schools? 181

4.3 Separate linear regressions for each school 182

4.4 Specification and interpretation of a random-coefficient model 188

4.4.1 Specification of a random-coefficient model 188

4.4.2 Interpretation of the random-effects variances and covariances 191

4.5 Estimation using xtmixed 194

4.5.1 Random-intercept model 194

4.5.2 Random-coefficient model 196

4.6 Testing the slope variance 197

4.7 Interpretation of estimates 198

4.8 Assigning values to the random intercepts and slopes 200

4.8.1 Maximum "likelihood" estimation 200


4.8.3 Model visualization 203

4.8.4 Residual diagnostics 204

4.8.5 Inferences for individual schools 207

4.9 Two-stage model formulation 210

4.10 Some warnings about random-coefficient models 213

4.10.1 Meaningful specification 213

4.10.2 Many random coefficients 213

4.10.3 Convergence problems 214

4.10.4 Lack of identification 214


4.12 Exercises 216

III Models for longitudinal and panel data 225

Introduction to models for longitudinal and panel data (part III) 227

5 Subject-specific effects and dynamic models 247


5.2 Conventional random-intercept model 248

5.3 Random-intercept models accommodating endogenous covariates 250

5.3.1 Consistent estimation of effects of endogenous time-varying covariates 250

5.3.2 Consistent estimation of effects of endogenous time-varying and endogenous time-constant covariates 253

5.4 Fixed-intercept model 257

5.4.1 Using xtreg or regress with a differencing operator 259

5.4.2 ••• Using anova 262

5.5 Random-coefficient model 265

5.6 Fixed-coefficient model 267

5.7 Lagged-response or dynamic models 269

5.7.1 Conventional lagged-response model 269

5.7.2 ••• Lagged-response model with subject-specific intercepts 273

5.8 Missing data and dropout 278

5.8.1 ••• Maximum likelihood estimation under MAR: A simulation 279


5.10 Exercises 283

6 Marginal models 293


6.2 Mean structure 293

6.3 Covariance structures 294

6.3.1 Unstructured covariance matrix 298

6.3.2 Random-intercept or compound symmetric/exchangeable structure 303

6.3.3 Random-coefficient structure 305

6.3.4 Autoregressive and exponential structures 308

6.3.5 Moving-average residual structure 311

6.3.6 Banded and Toeplitz structures 313

6.4 Hybrid and complex marginal models 316

6.4.1 Random effects and correlated level-1 residuals 316

6.4.2 Heteroskedastic level-1 residuals over occasions 317

6.4.3 Heteroskedastic level-1 residuals over groups 318

6.4.4 Different covariance matrices over groups 321

6.5 Comparing the fit of marginal models 322

6.6 Generalized estimating equations (GEE) 325

6.7 Marginal modeling with few units and many occasions 327

6.7.1 Is a highly organized labor market beneficial for economic growth? 328

6.7.2 Marginal modeling for long panels 329

6.7.3 Fitting marginal models for long panels in Stata 329


6.9 Exercises 333

7 Growth-curve models 343


7.2 How do children grow? 343

7.2.1 Observed growth trajectories 344

7.3 Models for nonlinear growth 345

7.3.1 Polynomial models 345

Fitting the models 346

Predicting the mean trajectory 349

Predicting trajectories for individual children 351

7.3.2 Piecewise linear models 353

Fitting the models 354

Predicting the mean trajectory 357

7.4 Two-stage model formulation 358

7.5 Heteroskedasticity 360

7.5.1 Heteroskedasticity at level 1 360

7.5.2 Heteroskedasticity at level 2 362

7.6 How does reading improve from kindergarten through third grade? 364

7.7 Growth-curve model as a structural equation model 364

7.7.1 Estimation using sem 366

7.7.2 Estimation using xtmixed 371


7.9 Exercises 376

IV Models with nested and crossed random effects 383

8 Higher-level models with nested random effects 385


8.2 Do peak-expiratory-flow measurements vary between methods within subjects? 386

8.3 Inspecting sources of variability 388

8.4 Three-level variance-components models 389

8.5 Different types of intraclass correlation 392

8.6 Estimation using xtmixed 393

8.7 Empirical Bayes prediction 394

8.8 Testing variance components 395

8.9 Crossed versus nested random effects revisited 397

8.10 Does nutrition affect cognitive development of Kenyan children? 399

8.11 Describing and plotting three-level data 400

8.11.1 Data structure and missing data 400

8.11.2 Level-1 variables 401

8.11.3 Level-2 variables 402

8.11.4 Level-3 variables 403

8.11.5 Plotting growth trajectories 404

8.12 Three-level random-intercept model 405

8.12.1 Model specification: Reduced form 405

8.12.2 Model specification: Three-stage formulation 405


8.13 Three-level random-coefficient models 409

8.13.1 Random coefficient at the child level 409

8.13.2 Random coefficient at the child and school levels 411

8.14 Residual diagnostics and predictions 413


8.16 Exercises 419

9 Crossed random effects 433


9.2 How does investment depend on expected profit and capital stock? 434

9.3 A two-way error-components model 435


9.3.2 Residual variances, covariances, and intraclass correlations 436

Longitudinal correlations 436

Cross-sectional correlations 436


9.3.4 Prediction 441

9.4 How much do primary and secondary schools affect attainment at age 16? 443

9.5 Data structure 444

9.6 Additive crossed random-effects model 446

9.6.1 Specification 446


9.7 Crossed random-effects model with random interaction 448


9.7.2 Intraclass correlations 448


9.7.4 Testing variance components 451

9.7.5 Some diagnostics 453

9.8 A trick requiring fewer random effects 456


9.10 Exercises 460

A Useful Stata commands 471

References 473

Author index 485

Subject index 491

Multilevel and Longitudinal Modeling Using Stata Volume II: Categorical Responses, Counts, and Survival

Third Edition

SOPHIA RABE-HESKETH University of California, Berkeley; Institute of Education, University of London

ANDERS SKRONDAL Norwegian Institute of Public Health

A Stata Press Publication StataCorp LP College Station, Texas

Contents

List of Tables xvii

List of Figures xix

V Models for categorical responses 499

10 Dichotomous or binary responses 501


10.2 Single-level logit and probit regression models for dichotomous responses 501

10.2.1 Generalized linear model formulation 502

10.2.2 Latent-response formulation 510

Logistic regression 512

Probit regression 512

10.3 Which treatment is best for toenail infection? 515

10.4 Longitudinal data structure 515

10.5 Proportions and fitted population-averaged or marginal probabilities 517

10.6 Random-intercept logistic regression 520


Reduced-form specification 520

Two-stage formulation 522

10.7 Estimation of random-intercept logistic models 523

10.7.1 Using xtlogit 523

10.7.2 Using xtmelogit 527

10.7.3 Using gllamm 527

10.8 Subject-specific or conditional vs. population-averaged or marginal relationships 529

10.9 Measures of dependence and heterogeneity 532

10.9.1 Conditional or residual intraclass correlation of the latent responses 532

10.9.2 Median odds ratio 533

10.9.3 ••• Measures of association for observed responses at median fixed part of the model 533

10.10 Inference for random-intercept logistic models 535

10.10.1 Tests and confidence intervals for odds ratios 535

10.10.2 Tests of variance components 536

10.11 Maximum likelihood estimation 537

10.11.1 ••• Adaptive quadrature 537

10.11.2 Some speed and accuracy considerations 540

Advice for speeding up estimation in gllamm 542

10.12 Assigning values to random effects 543

10.12.1 Maximum "likelihood" estimation 544

10.12.2 Empirical Bayes prediction 545

10.12.3 Empirical Bayes modal prediction 546

10.13 Different kinds of predicted probabilities 548

10.13.1 Predicted population-averaged or marginal probabilities 548

10.13.2 Predicted subject-specific probabilities 549

Predictions for hypothetical subjects: Conditional probabilities 549

Predictions for the subjects in the sample: Posterior mean probabilities 551

10.14 Other approaches to clustered dichotomous data 557

10.14.1 Conditional logistic regression 557

10.14.2 Generalized estimating equations (GEE) 559


10.16 Exercises 563

11 Ordinal responses 575


11.2 Single-level cumulative models for ordinal responses 575

11.2.1 Generalized linear model formulation 575

11.2.2 Latent-response formulation 576

11.2.3 Proportional odds 580

11.2.4 ••• Identification 582

11.3 Are antipsychotic drugs effective for patients with schizophrenia? 585

11.4 Longitudinal data structure and graphs 585

11.4.1 Longitudinal data structure 586

11.4.2 Plotting cumulative proportions 587

11.4.3 Plotting cumulative sample logits and transforming the time scale 588

11.5 A single-level proportional odds model 590


11.5.2 Estimation using Stata 591

11.6 A random-intercept proportional odds model 594



11.6.3 Measures of dependence and heterogeneity 595

Residual intraclass correlation of latent responses 595

Median odds ratio 596

11.7 A random-coefficient proportional odds model 596


11.7.2 Estimation using gllamm 596


11.8.1 Predicted population-averaged or marginal probabilities 599

11.8.2 Predicted subject-specific probabilities: Posterior mean 602

11.9 Do experts differ in their grading of student essays? 606

11.10 A random-intercept probit model with grader bias 606



11.11 Including grader-specific measurement error variances 608



11.12 Including grader-specific thresholds 611


11.12.2 Estimation using gllamm 611

11.13 ••• Other link functions 616

Cumulative complementary log-log model 616

Continuation-ratio logit model 616

Adjacent-category logit model 618

Baseline-category logit and stereotype models 618


11.15 Exercises 620

12 Nominal responses and discrete choice 629


12.2 Single-level models for nominal responses 630

12.2.1 Multinomial logit models 630

12.2.2 Conditional logit models 638

Classical conditional logit models 639

Conditional logit models also including covariates that vary only over units 645

12.3 Independence from irrelevant alternatives 648

12.4 Utility-maximization formulation 649

12.5 Does marketing affect choice of yogurt? 651

12.6 Single-level conditional logit models 653

12.6.1 Conditional logit models with alternative-specific intercepts 654

12.7 Multilevel conditional logit models 659

12.7.1 Preference heterogeneity: Brand-specific random intercepts 659

12.7.2 Response heterogeneity: Marketing variables with random coefficients 663

12.7.3 ••• Preference and response heterogeneity 666

Estimation using gllamm 667

Estimation using mixlogit 669

12.8 Prediction of random effects and response probabilities 672


12.10 Exercises 677

VI Models for counts 685

13 Counts 687


13.2 What are counts? 687

13.2.1 Counts versus proportions 687

13.2.2 Counts as aggregated event-history data 688

13.3 Single-level Poisson models for counts 689

13.4 Did the German health-care reform reduce the number of doctor visits? 691

13.5 Longitudinal data structure 691

13.6 Single-level Poisson regression 692



13.7 Random-intercept Poisson regression 696




Using xtpoisson 697

Using xtmepoisson 699

Using gllamm 700

13.8 Random-coefficient Poisson regression 701



Using xtmepoisson 702

Using gllamm 704

13.8.3 Interpretation of estimates 705

13.9 Overdispersion in single-level models 706

13.9.1 Normally distributed random intercept 706

13.9.2 Negative binomial models 707

Mean dispersion or NB2 708

Constant dispersion or NB1 709

13.9.3 Quasilikelihood 709

13.10 Level-1 overdispersion in two-level models 711

13.11 Other approaches to two-level count data 713

13.11.1 Conditional Poisson regression 713

13.11.2 Conditional negative binomial regression 715

13.11.3 Generalized estimating equations 715

13.12 Marginal and conditional effects when responses are MAR 716

Simulation 717

13.13 Which Scottish counties have a high risk of lip cancer? 720

13.14 Standardized mortality ratios 721

13.15 Random-intercept Poisson regression 723



13.15.3 Prediction of standardized mortality ratios 725

13.16 Nonparametric maximum likelihood estimation 727

13.16.1 Specification 727


13.16.3 Prediction 732

13.17 Summary and further reading 732

13.18 Exercises 733

VII Models for survival or duration data 741

Introduction to models for survival or duration data (part VII) 743

14 Discrete-time survival 749


14.2 Single-level models for discrete-time survival data 749

14.2.1 Discrete-time hazard and discrete-time survival 749

14.2.2 Data expansion for discrete-time survival analysis 752

14.2.3 Estimation via regression models for dichotomous responses 754

14.2.4 Including covariates 758

Time-constant covariates 758

Time-varying covariates 762

14.2.5 Multiple absorbing events and competing risks 767

14.2.6 Handling left-truncated data 772

14.3 How does birth history affect child mortality? 773

14.4 Data expansion 774

14.5 Proportional hazards and interval-censoring 776

14.6 Complementary log-log models 777

14.7 A random-intercept complementary log-log model 781


14.7.2 Estimation using Stata 782

14.8 ••• Population-averaged or marginal vs. subject-specific or conditional survival probabilities 784


14.10 Exercises 789

15 Continuous-time survival 797


15.2 What makes marriages fail? 797

15.3 Hazards and survival 799

15.4 Proportional hazards models 805

15.4.1 Piecewise exponential model 807

15.4.2 Cox regression model 815

15.4.3 Poisson regression with smooth baseline hazard 819

15.5 Accelerated failure-time models 823

15.5.1 Log-normal model 824

15.6 Time-varying covariates 829

15.7 Does nitrate reduce the risk of angina pectoris? 832

15.8 Marginal modeling 835

15.8.1 Cox regression 835

15.8.2 Poisson regression with smooth baseline hazard 838

15.9 Multilevel proportional hazards models 841

15.9.1 Cox regression with gamma shared frailty 841

15.9.2 Poisson regression with normal random intercepts 845

15.9.3 Poisson regression with normal random intercept and random coefficient 847

15.10 Multilevel accelerated failure-time models 849

15.10.1 Log-normal model with gamma shared frailty 849

15.10.2 Log-normal model with log-normal shared frailty 850

15.11 A fixed-effects approach 851

15.11.1 Cox regression with subject-specific baseline hazards 851

15.12 Different approaches to recurrent-event data 853

15.12.1 Total time 854

15.12.2 Counting process 858

15.12.3 Gap time 859


15.14 Exercises 862

VIII Models with nested and crossed random effects 871

16 Models with nested and crossed random effects 873


16.2 Did the Guatemalan immunization campaign work? 873

16.3 A three-level random-intercept logistic regression model 875



Types of residual intraclass correlations of the latent responses 876

Types of median odds ratios 877

16.3.3 Three-stage formulation 877

16.4 Estimation of three-level random-intercept logistic regression models 878



16.5 A three-level random-coefficient logistic regression model 886

16.6 Estimation of three-level random-coefficient logistic regression models 887



16.7 Prediction of random effects 892


16.7.2 Empirical Bayes modal prediction 893


16.8.1 Predicted population-averaged or marginal probabilities: New clusters 894

16.8.2 Predicted median or conditional probabilities 895

16.8.3 Predicted posterior mean probabilities: Existing clusters 896

16.9 Do salamanders from different populations mate successfully? 897

16.10 Crossed random-effects logistic regression 900


16.12 Exercises 908

A Syntax for gllamm, eq, and gllapred: The bare essentials 915

B Syntax for gllamm 921

C Syntax for gllapred 933

D Syntax for gllasim 937

References 941

Author index 955

Subject index 963
