DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS WITH R Youngjo Lee, Lars Rönnegård & Maengseok Noh

Page 1: DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS

DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS WITH R

Youngjo Lee, Lars Rönnegård & Maengseok Noh

Page 2: DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS

What did John Nelder achieve in the 1970s?

Page 5: DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS

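Gaussian

$$L = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(y_i - \mu)^2}{2\sigma^2}\right), \qquad \mu = X\beta$$

Poisson

$$L = \prod_{i=1}^{n} \frac{\mu^{y_i} e^{-\mu}}{y_i!}, \qquad \log(\mu) = X\beta$$

Binomial

$$L = \prod_{i=1}^{n} \binom{n}{y_i} \mu^{y_i} (1 - \mu)^{n - y_i}, \qquad \operatorname{logit}(\mu) = X\beta$$

Gamma

$$L = \prod_{i=1}^{n} \frac{1}{\Gamma(k)\,\theta^{k}}\, y_i^{k-1} e^{-y_i/\theta}, \qquad k\theta \equiv \mu; \quad 1/\mu = X\beta$$

GLM: Common estimation algorithm using iterative regression
- Fast and easy to implement
- Linear regression model checking tools!!!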
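The "iterative regression" idea can be illustrated with iteratively reweighted least squares (IRLS). The following is a minimal sketch for a Poisson GLM with log link, not the authors' code: the function name `irls_poisson`, the simulated data, and the true coefficients 0.5 and 1.2 are all assumptions made for illustration. Each iteration is an ordinary weighted linear regression of a working response on X, which is why linear regression tools carry over.

```r
# Minimal IRLS sketch for a Poisson GLM with log link (illustrative only).
# Data are simulated; nothing here comes from the slides.
set.seed(1)
n <- 200
x <- runif(n)
X <- cbind(1, x)                          # design matrix with intercept
y <- rpois(n, lambda = exp(0.5 + 1.2 * x))

irls_poisson <- function(X, y, tol = 1e-10, max_iter = 50) {
  beta <- rep(0, ncol(X))                 # start from beta = 0 (mu = 1)
  for (iter in seq_len(max_iter)) {
    eta <- drop(X %*% beta)               # linear predictor
    mu  <- exp(eta)                       # inverse log link
    z   <- eta + (y - mu) / mu            # working response
    w   <- mu                             # working weights for Poisson/log
    # Weighted least squares step: solve (X' W X) beta = X' W z
    beta_new <- drop(solve(crossprod(X, w * X), crossprod(X, w * z)))
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  beta
}

beta_irls <- irls_poisson(X, y)
# Reference fit from base R for comparison
beta_glm <- unname(coef(glm(y ~ x, family = poisson(link = "log"))))
print(cbind(irls = beta_irls, glm = beta_glm))
```

The two columns should agree to several decimal places, since `glm()` uses the same kind of iteration internally.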

Page 6: DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS

Lee Y. & Nelder J. A. (1996) "Hierarchical generalized linear models" JRSS B 58, 619-678

GLM approach for fitting
• Linear mixed models
• Generalized linear mixed models (Laplace approximation)
• Mixed models with non-Gaussian random effects
• Above models + dispersion modelling
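For orientation, the fitting approach of Lee and Nelder (1996) is built on the hierarchical likelihood (h-likelihood). The symbols below (y for the response, v for the random effects, β for fixed effects, φ and λ for dispersion parameters) are notation assumed here and do not appear on the slides:

$$h = \log f(y \mid v;\, \beta, \phi) + \log f(v;\, \lambda)$$

The first term is the GLM log-likelihood of the data given the random effects, and the second is the log-density of the random effects, so both parts can be handled with GLM-type iterative regression.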

Page 7: DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS

Coming out July 2017

Page 9: DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS

• Linear model
• Generalized linear model (GLM)
• Joint GLM: Generalized linear model including dispersion model with fixed effects
• Linear mixed model (LMM)
• Generalized linear mixed model (GLMM): Generalized linear model including Gaussian random effects
• Hierarchical GLM (HGLM): Generalized linear model including Gaussian and/or non-Gaussian random effects. Dispersion can be modelled using fixed effects.
• Double HGLM (DHGLM): HGLM including dispersion model with both fixed and random effects
• Frailty HGLM: HGLMs for survival analysis including competing risk models
• HGLMs with correlated random effects: Including spatial, temporal correlations, splines, GAM.
• Structural Equation Models (SEM)
• Factor analysis

Page 10: DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS

Why use HGLMs?

• Fast deterministic algorithms for complex models

• All parts of the model are checkable

• Predictions of unobservables

Page 11: DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS

Crack growth data

Hudak et al. (1978) presented data from an experiment where crack lengths are measured on a compact tension steel test.

• There are 21 metallic specimens with the crack lengths recorded every 10^4 cycles.
• y = increment of crack length
• Covariate for the mean part of the model: crack length previously recorded (crack0)
• Covariate for the dispersion part of the model: number of cycles (cycle)

Page 12: DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS

Page 16: DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS

## GLM ##
# Gamma GLM with log link, no random effects
res_glm <- glm(y ~ crack0, family = Gamma(link = "log"), data = data_crack_growth)

## GLMM ##
# Adds a random intercept per specimen
library(hglm)
res_glmm1 <- hglm2(y ~ crack0 + (1 | specimen), family = Gamma(link = "log"),
                   data = data_crack_growth)

## HGLM I ##
# Inverse-gamma random intercept in the mean model, constant dispersion
library(dhglm)
model_mu <- DHGLMMODELING(Model = "mean", Link = "log",
                          LinPred = y ~ crack0 + (1 | specimen),
                          RandDist = "inverse-gamma")
model_phi <- DHGLMMODELING(Model = "dispersion")
res_hglm1 <- dhglmfit(RespDist = "gamma", DataMain = data_crack_growth,
                      MeanModel = model_mu, DispersionModel = model_phi)

## HGLM II ##
# Same mean model, dispersion now depends on the number of cycles
model_mu <- DHGLMMODELING(Model = "mean", Link = "log",
                          LinPred = y ~ crack0 + (1 | specimen),
                          RandDist = "inverse-gamma")
model_phi <- DHGLMMODELING(Model = "dispersion", Link = "log",
                           LinPred = phi ~ cycle)
res_hglm2 <- dhglmfit(RespDist = "gamma", DataMain = data_crack_growth,
                      MeanModel = model_mu, DispersionModel = model_phi)

Variance and dispersion model: $V(y) = \phi\mu^2$, $\log(\phi) = Xb$

Page 17: DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS

## DHGLM I ##
# Adds a Gaussian random intercept per specimen to the dispersion model
library(dhglm)
model_mu <- DHGLMMODELING(Model = "mean", Link = "log",
                          LinPred = y ~ crack0 + (1 | specimen),
                          RandDist = "inverse-gamma")
model_phi <- DHGLMMODELING(Model = "dispersion", Link = "log",
                           LinPred = phi ~ cycle + (1 | specimen),
                           RandDist = "gaussian")
res_dhglm1 <- dhglmfit(RespDist = "gamma", DataMain = data_crack_growth,
                       MeanModel = model_mu, DispersionModel = model_phi)
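Read as a statistical model, the DHGLM I call appears to correspond to the two-level specification below. This is a sketch inferred from the function arguments, not a formula taken from the slides: the indices i (specimen) and j (measurement occasion), the coefficient names β and γ, the symbols v, u, a and α, and the assumption that the mean-model random effect enters on the log scale as v = log(u) are all assumptions.

$$
\begin{aligned}
y_{ij} \mid v_i, a_i &\sim \text{Gamma}, \qquad E(y_{ij} \mid v_i, a_i) = \mu_{ij}, \qquad \operatorname{Var}(y_{ij} \mid v_i, a_i) = \phi_{ij}\,\mu_{ij}^{2} \\
\log(\mu_{ij}) &= \beta_0 + \beta_1\,\text{crack0}_{ij} + v_i, \qquad v_i = \log(u_i), \qquad u_i \sim \text{inverse-gamma} \\
\log(\phi_{ij}) &= \gamma_0 + \gamma_1\,\text{cycle}_{ij} + a_i, \qquad a_i \sim N(0, \alpha)
\end{aligned}
$$

Under this reading, dropping the random effect a_i from the dispersion predictor gives HGLM II, and additionally dropping the cycle term (constant dispersion) gives HGLM I.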

Page 18: DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS

[Figure: hglm1]

Page 19: DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS

[Figure: hglm2]

Page 20: DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS

[Figure: dhglm1]

Page 21: DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS

Use HGLMs!

• Fast deterministic algorithms for complex models

• All parts of the model are checkable

• Predictions of unobservables

Page 22: DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS

[email protected]

www.larsronnegard.se
