Flexible Modelling of Serial Correlation in GLMM

Modellazione flessibile della struttura di correlazione seriale nei GLMM

Mariangela Sciandra

Dipartimento di Scienze Statistiche e Matematiche “Silvio Vianelli”, Università degli Studi di Palermo

e-mail: [email protected]

Keywords: longitudinal data, glmm, serial correlation, fractional polynomials

1. Introduction

Mixed models have rapidly become a very useful tool for modelling longitudinal data because their flexibility makes them well suited to many complex situations, especially when extensions such as Generalized Linear or Nonlinear Mixed Models are used. However, existing work on Generalized Linear Mixed Models (GLMMs) assumes that the correlation between measurements on the same observational unit is induced only by the random effects, while conditionally on them the repeated measurements are assumed to be independent. Yet part of an individual's observed profile may be a response to time-varying stochastic processes operating within that unit. This type of random variation results in a correlation between pairs of measurements on the same subject, called serial correlation, which is usually a decreasing function of the time separation between the measurements. In the presence of serially correlated observations, the covariance structure should therefore be specified through two components simultaneously: the covariance structure induced by the random effects, which is typically non-stationary, and a “residual” component which accounts for deviations of the observations from the individual profiles. Standard GLMMs cannot deal with serially correlated observations: by assuming conditional independence they ignore correlation at the residual level, which results in invalid inferences for the parameters in the mean structure of the model.

In this work a general approach is proposed, which extends to GLMMs the procedure proposed by Lesaffre et al. (2000) in the context of Linear Mixed Models (LMMs). It provides a flexible model for the GLMM covariance structure by combining a parametric model for the random-effects part with a flexible model for the serial correlation component based on Fractional Polynomials.

2. GLMMs for serially correlated observations

Let $Y_{ij}$ be the $j$th measurement for subject $i$, $i = 1, \ldots, m$, $j = 1, \ldots, n_i$, and let $Y_i$ be the $n_i$-dimensional vector of all measurements taken on subject $i$. A GLMM assumes that, conditionally on the $q$-dimensional random effects $b_i$, the elements $Y_{ij}$ of $Y_i$ are independent. The basis for the development of a Generalized Linear Mixed Model with autocorrelation is the decomposition

$$Y_{ij} = \mu_{ij} + \epsilon_{ij} = h(x_{ij}^T \beta + z_{ij}^T b_i) + \epsilon_{ij}$$

where $h(\cdot)$ is the inverse of the link function $g(\cdot)$, and the error terms have an appropriate distribution with variance $\mathrm{Var}(Y_{ij} \mid b_i) = \phi v(\mu_{ij})$, with $v(\cdot)$ the usual variance function of the exponential family. Using a Penalized Quasi-Likelihood (PQL) approach, the GLMM is estimated by defining at each step a working variate $Y_i^*$, obtained from a first-order Taylor expansion of the link function applied to the response around the conditional mean, evaluated at the current estimates of the fixed effects $\beta$ and the random effects $b_i$. The working variate $Y_i^*$ of the specified GLMM is known (Schall, 1991) to be

$$Y_i^* \equiv g(\mu_i) + g'(\mu_i)(Y_i - \mu_i) = X_i\beta + Z_i b_i + g'(\mu_i)(Y_i - \mu_i) \approx X_i\beta + Z_i b_i + \epsilon_i^*$$

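where $\epsilon_i^*$ is equal to $g'(\mu_i)\epsilon_i$, which also has mean zero for all $i = 1, 2, \ldots, m$. Then $E[Y^* \mid b] = X\beta + Zb$ and $\mathrm{var}(Y^* \mid b) = \Delta \Sigma \Delta^T$, where $\Delta$ is the diagonal matrix with diagonal entries $g'(\mu_i)$ and $\Sigma = \mathrm{var}(Y - \mu)$. We can express the variance matrix $\Sigma$ so that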

$$\mathrm{Var}(\epsilon_i) = \Phi^{1/2} A_i^{1/2} H_i A_i^{1/2} \Phi^{1/2} \qquad (1)$$

where $\Phi$ is a diagonal matrix with the overdispersion parameters along the diagonal, $H_i$ is the correlation matrix and $A_i$ is a diagonal matrix containing the variances that follow from the generalized linear model specification of $Y_{ij}$ given the random effects $b_i$. Using equation (1), and writing $D$ for the covariance matrix of the random effects, we have the following expression for the marginal variance-covariance matrix:

$$\mathrm{var}(Y^*) = V = Z D Z^T + \Phi^{1/2} A_i^{1/2} H_i A_i^{1/2} \Phi^{1/2}.$$

That is, the working variate $Y^*$ takes the form of a weighted LMM with diagonal weight matrix $W = A_\mu^{-1} [g'(\mu)]^{-2}$. An iterative algorithm like that proposed by Schall (1991), in which an LMM is fitted at each step to obtain estimates of $\beta$ and $b$, is then used. This estimation algorithm allows us either to introduce autocorrelation at the level of the linear predictor, by modelling the pseudo-variable $Y_i^*$ at each step through an LMM, or to define an autocorrelation function on the residuals evaluated at each step of the iterative process. In doing so, we can take advantage of the well-established correlation structures for LMMs. In particular, our proposal consists in modelling the serial correlation part using Fractional Polynomials (Royston and Altman, 1994), which allow the autocorrelation function $g(\cdot)$ in $H_i$ (not to be confused with the link function) to have a parametric form flexible enough to assume several shapes.
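As an illustration of how a fractional-polynomial correlation function could enter the covariance structure, the following Python sketch builds $H_i$ and the marginal matrix $V$ for one subject. The exact parameterisation is not given here, so the form $g(u) = \exp\{-\sum_k \theta_k u^{p_k}\}$ is an assumption. So are the restriction to powers $0 < p_k \le 2$ with $\theta_k \ge 0$ (which keeps $H_i$ a valid correlation matrix), the scalar overdispersion $\Phi = \phi I$, and all function and argument names.

```python
import numpy as np

def fp_serial_correlation(lags, powers, thetas):
    """Assumed form g(u) = exp(-sum_k thetas[k] * u**powers[k]), so g(0) = 1."""
    lags = np.asarray(lags, dtype=float)
    if any(p <= 0 or p > 2 for p in powers) or any(t < 0 for t in thetas):
        raise ValueError("sketch restricted to 0 < power <= 2 and theta >= 0")
    exponent = np.zeros_like(lags)
    for p, t in zip(powers, thetas):
        exponent += t * lags ** p
    return np.exp(-exponent)

def marginal_covariance(times, Z, D, cond_var, phi, powers, thetas):
    """V = Z D Z^T + Phi^{1/2} A^{1/2} H A^{1/2} Phi^{1/2}, with Phi = phi * I (assumption)."""
    times = np.asarray(times, dtype=float)
    lags = np.abs(times[:, None] - times[None, :])   # time separations |t_j - t_k|
    H = fp_serial_correlation(lags, powers, thetas)  # serial correlation matrix H_i
    A_half = np.diag(np.sqrt(np.asarray(cond_var, dtype=float)))  # A_i^{1/2}
    serial = phi * (A_half @ H @ A_half)
    return Z @ D @ Z.T + serial, H

# Hypothetical subject: four visits, random intercept and slope, made-up variances.
times = np.array([0.0, 1.0, 2.0, 4.0])
Z = np.column_stack([np.ones_like(times), times])
D = np.array([[1.0, 0.2], [0.2, 0.5]])
cond_var = np.array([1.5, 2.0, 2.5, 3.0])
V, H = marginal_covariance(times, Z, D, cond_var, phi=1.0,
                           powers=(0.5, 1.0), thetas=(0.3, 0.1))
```

Changing `powers` and `thetas` changes how quickly the correlation decays with the time separation, which is the kind of flexibility the fractional-polynomial specification is meant to provide.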

This approach is very general and applies to all link functions. It is important to stress, however, that the estimate of the autocorrelation parameter obtained in this way must be regarded only as an approximation to the underlying correlation structure of the observed data, since it is the correlation of the working variate that is estimated.
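The link function enters the procedure only through its inverse $h(\cdot)$ and its derivative $g'(\cdot)$, which is why the same steps carry over from one link to another. A minimal Python sketch of the working variate and of the diagonal of $W$ for three common links follows. The dictionary of links, the function names and the choice $A_\mu = v(\mu)$ with Poisson and Bernoulli variance functions are illustrative assumptions rather than part of the proposal.

```python
import numpy as np

# link name -> (inverse link h, derivative g' of the link as a function of mu)
LINKS = {
    "identity": (lambda eta: eta, lambda mu: np.ones_like(mu)),
    "log": (np.exp, lambda mu: 1.0 / mu),
    "logit": (lambda eta: 1.0 / (1.0 + np.exp(-eta)),
              lambda mu: 1.0 / (mu * (1.0 - mu))),
}

def working_variate(y, eta, link):
    """Y* = g(mu) + g'(mu) (Y - mu), using g(mu) = eta at the current estimates."""
    h, g_prime = LINKS[link]
    y = np.asarray(y, dtype=float)
    eta = np.asarray(eta, dtype=float)
    mu = h(eta)
    return eta + g_prime(mu) * (y - mu)

def working_weights(eta, link, variance):
    """Diagonal of W = A_mu^{-1} [g'(mu)]^{-2}, assuming A_mu = v(mu)."""
    h, g_prime = LINKS[link]
    mu = h(np.asarray(eta, dtype=float))
    return 1.0 / (variance(mu) * g_prime(mu) ** 2)

# Hypothetical counts with a log link and Poisson variance v(mu) = mu.
y = np.array([2.0, 0.0, 5.0])
eta = np.log(np.array([2.0, 1.0, 4.0]))   # current linear predictor
y_star = working_variate(y, eta, "log")
w = working_weights(eta, "log", variance=lambda mu: mu)
```

With the log link and Poisson variance the weights reduce to $\mu$ itself, and with the identity link the working variate is simply the response, so the LMM case is recovered.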

References

Lesaffre E., Todem D., Verbeke G. (2000) Flexible modelling of the covariance matrix in a linear random effects model, Biometrical Journal, 42, 807–822.

Royston P., Altman D. G. (1994) Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling, Applied Statistics, 43, 429–468.

Schall R. (1991) Estimation in generalized linear models with random effects, Biometrika, 78, 719–727.