DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS

Preview:

Citation preview

DATA ANALYSIS USINGHIERARCHICAL GENERALIZED

LINEAR MODELS WITH R

Youngjo Lee, Lars RΓΆnnegΓ₯rd & Maengseok Noh

What did John Nelder achieve in the 1970’s?

Gaussian

Poisson Binomial

Gamma

𝐿𝐿 = �𝑖𝑖=1

𝑛𝑛12πœ‹πœ‹πœŽπœŽ

𝑒𝑒12𝜎𝜎2 π‘¦π‘¦π‘–π‘–βˆ’πœ‡πœ‡ 2

πœ‡πœ‡ = 𝑋𝑋𝑋𝑋

𝐿𝐿 = �𝑖𝑖=1

π‘›π‘›πœ‡πœ‡π‘¦π‘¦π‘–π‘–π‘’π‘’βˆ’πœ‡πœ‡π‘¦π‘¦π‘–π‘–!

log(πœ‡πœ‡) = 𝑋𝑋𝑋𝑋

𝐿𝐿 = �𝑖𝑖=1

𝑛𝑛𝑛𝑛𝑦𝑦𝑖𝑖 πœ‡πœ‡π‘¦π‘¦π‘–π‘–(1 βˆ’ πœ‡πœ‡)π‘›π‘›βˆ’π‘¦π‘¦π‘–π‘–

𝑙𝑙𝑙𝑙𝑙𝑙𝑙𝑙𝑙𝑙(πœ‡πœ‡) = 𝑋𝑋𝑋𝑋

𝐿𝐿 = �𝑖𝑖=1

𝑛𝑛1

Ξ“(π‘˜π‘˜) πœƒπœƒπ‘˜π‘˜π‘¦π‘¦π‘–π‘–π‘˜π‘˜βˆ’1𝑒𝑒

βˆ’π‘¦π‘¦π‘–π‘–πœƒπœƒ

kπœƒπœƒ ≑ πœ‡πœ‡; 1/πœ‡πœ‡ = 𝑋𝑋𝑋𝑋

Gaussian

Poisson Binomial

Gamma

𝐿𝐿 = �𝑖𝑖=1

𝑛𝑛12πœ‹πœ‹πœŽπœŽ

𝑒𝑒12𝜎𝜎2 π‘¦π‘¦π‘–π‘–βˆ’πœ‡πœ‡ 2

πœ‡πœ‡ = 𝑋𝑋𝑋𝑋

𝐿𝐿 = �𝑖𝑖=1

π‘›π‘›πœ‡πœ‡π‘¦π‘¦π‘–π‘–π‘’π‘’βˆ’πœ‡πœ‡π‘¦π‘¦π‘–π‘–!

log(πœ‡πœ‡) = 𝑋𝑋𝑋𝑋

𝐿𝐿 = �𝑖𝑖=1

𝑛𝑛𝑛𝑛𝑦𝑦𝑖𝑖 πœ‡πœ‡π‘¦π‘¦π‘–π‘–(1 βˆ’ πœ‡πœ‡)π‘›π‘›βˆ’π‘¦π‘¦π‘–π‘–

𝑙𝑙𝑙𝑙𝑙𝑙𝑙𝑙𝑙𝑙(πœ‡πœ‡) = 𝑋𝑋𝑋𝑋

𝐿𝐿 = �𝑖𝑖=1

𝑛𝑛1

Ξ“(π‘˜π‘˜) πœƒπœƒπ‘˜π‘˜π‘¦π‘¦π‘–π‘–π‘˜π‘˜βˆ’1𝑒𝑒

βˆ’π‘¦π‘¦π‘–π‘–πœƒπœƒ

kπœƒπœƒ ≑ πœ‡πœ‡; 1/πœ‡πœ‡ = 𝑋𝑋𝑋𝑋

GLM

Gaussian

Poisson Binomial

Gamma

𝐿𝐿 = �𝑖𝑖=1

𝑛𝑛12πœ‹πœ‹πœŽπœŽ

𝑒𝑒12𝜎𝜎2 π‘¦π‘¦π‘–π‘–βˆ’πœ‡πœ‡ 2

πœ‡πœ‡ = 𝑋𝑋𝑋𝑋

𝐿𝐿 = �𝑖𝑖=1

π‘›π‘›πœ‡πœ‡π‘¦π‘¦π‘–π‘–π‘’π‘’βˆ’πœ‡πœ‡π‘¦π‘¦π‘–π‘–!

log(πœ‡πœ‡) = 𝑋𝑋𝑋𝑋

𝐿𝐿 = �𝑖𝑖=1

𝑛𝑛𝑛𝑛𝑦𝑦𝑖𝑖 πœ‡πœ‡π‘¦π‘¦π‘–π‘–(1 βˆ’ πœ‡πœ‡)π‘›π‘›βˆ’π‘¦π‘¦π‘–π‘–

𝑙𝑙𝑙𝑙𝑙𝑙𝑙𝑙𝑙𝑙(πœ‡πœ‡) = 𝑋𝑋𝑋𝑋

𝐿𝐿 = �𝑖𝑖=1

𝑛𝑛1

Ξ“(π‘˜π‘˜) πœƒπœƒπ‘˜π‘˜π‘¦π‘¦π‘–π‘–π‘˜π‘˜βˆ’1𝑒𝑒

βˆ’π‘¦π‘¦π‘–π‘–πœƒπœƒ

kπœƒπœƒ ≑ πœ‡πœ‡; 1/πœ‡πœ‡ = 𝑋𝑋𝑋𝑋

GLMCommon estimation algorithm using iterativeregression - Fast and easy to implement- Linear regression model checking tools!!!

Lee Y. & Nelder J. A. (1996) ”Hierarchicalgeneralized linear models” JRSS B 619-678

GLM approach for fittingβ€’ Linear mixed modelsβ€’ Generalized linear mixed models (Laplace approximation)

β€’ Mixed models with non-Gaussian random effectsβ€’ Above models + dispersion modelling

Coming out July 2017

Linear model

Generalized linear model (GLM)

Joint GLM Generalized linear model

including dispersion model with fixed effects

Linear mixed model (LMM)

Generalized linear mixedmodel (GLMM)

Generalized linear model including Gaussian random effects

Hierarchical GLM (HGLM)Generalized linear model including Gaussian

and/or non-Gaussian random effects. Dispersion can be modelled using fixed

effects.

Linear model

Generalized linear model (GLM)

Joint GLM Generalized linear model

including dispersion model with fixed effects

Linear mixed model (LMM)

Generalized linear mixedmodel (GLMM)

Generalized linear model including Gaussian random effects

Double HGLM (DHGLM)HGLM including dispersion model with both fixed and random effects

Frailty HGLMHGLMs for survival analysis including

competing risk models

Structural Equation Models (SEM)

Hierarchical GLM (HGLM)Generalized linear model including Gaussian

and/or non-Gaussian random effects. Dispersion can be modelled using fixed

effects.

HGLMs with correlated random effects

Including spatial, temporal correlations, splines, GAM.

Factor analysis

Why use HGLMs?

β€’ Fast deterministic algorithms for complex models

β€’ All parts of the model are checkableβ€’ Predictions of unobservables

Crack growth data

Hudak et al. (1978) presented data from an experiment where crack lengths are measured on a compact tension steel test.

β€’ There are 21 metallic specimens with the crack lengths recorded every 104 cycles.

β€’ y = increment of crack lengthβ€’ Covariate for the mean part of the model: crack length

previously recorded (crack0)β€’ Covariate for the dispersion part of the model: number of cycles

(cycle)

## GLM ##res_glm <- glm(y ~ crack0, family= Gamma(link=log), data=data_crack_growth)

## GLM ##res_glm <- glm(y ~ crack0, family= Gamma(link=log), data=data_crack_growth)

library(hglm)## GLMM ##res_glmm1 <- hglm2(y ~ crack0 + (1|specimen), family = Gamma(link=log) , data = data_crack_growth )

## GLM ##res_glm <- glm(y ~ crack0, family= Gamma(link=log), data=data_crack_growth)

library(hglm)## GLMM ##res_glmm1 <- hglm2(y ~ crack0 + (1|specimen), family = Gamma(link=log) , data = data_crack_growth )

library(dhglm)## HGLM I ##model_mu <- DHGLMMODELING(Model="mean", Link="log",

LinPred = y ~ crack0 + (1|specimen), RandDist = "inverse-gamma")model_phi <- DHGLMMODELING(Model="dispersion")res_hglm1 <- dhglmfit(RespDist="gamma", DataMain=data_crack_growth,

MeanModel=model_mu, DispersionModel=model_phi)

## GLM ##res_glm <- glm(y ~ crack0, family= Gamma(link=log), data=data_crack_growth)

library(hglm)## GLMM ##res_glmm1 <- hglm2(y ~ crack0 + (1|specimen), family = Gamma(link=log) , data = data_crack_growth )

library(dhglm)## HGLM I ##model_mu <- DHGLMMODELING(Model="mean", Link="log",

LinPred = y ~ crack0 + (1|specimen), RandDist = "inverse-gamma")model_phi <- DHGLMMODELING(Model="dispersion")res_hglm1 <- dhglmfit(RespDist="gamma", DataMain=data_crack_growth,

MeanModel=model_mu, DispersionModel=model_phi)## HGLM II ##model_mu <- DHGLMMODELING(Model="mean", Link="log",

LinPred = y ~ crack0 + (1|specimen), RandDist="inverse-gamma")model_phi <- DHGLMMODELING(Model = "dispersion", Link = "log",

LinPred = phi ~ cycle)res_hglm2 <- dhglmfit(RespDist = "gamma", DataMain = data_crack_growth,

MeanModel = model_mu, DispersionModel = model_phi)

V(y)= πœ™πœ™πœ‡πœ‡2log(πœ™πœ™)=Xb

## GLM ##res_glm <- glm(y ~ crack0, family= Gamma(link=log), data=data_crack_growth)

library(hglm)## GLMM ##res_glmm1 <- hglm2(y ~ crack0 + (1|specimen), family = Gamma(link=log) , data = data_crack_growth )

library(dhglm)## HGLM I ##model_mu <- DHGLMMODELING(Model="mean", Link="log",

LinPred = y ~ crack0 + (1|specimen), RandDist = "inverse-gamma")model_phi <- DHGLMMODELING(Model="dispersion")res_hglm1 <- dhglmfit(RespDist="gamma", DataMain=data_crack_growth,

MeanModel=model_mu, DispersionModel=model_phi)## HGLM II ##model_mu <- DHGLMMODELING(Model="mean", Link="log",

LinPred = y ~ crack0 + (1|specimen), RandDist="inverse-gamma")model_phi <- DHGLMMODELING(Model = "dispersion", Link = "log",

LinPred = phi ~ cycle)res_hglm2 <- dhglmfit(RespDist = "gamma", DataMain = data_crack_growth,

MeanModel = model_mu, DispersionModel = model_phi)## DHGLM I ##model_mu <- DHGLMMODELING(Model="mean", Link="log",

LinPred = y ~ crack0 + (1|specimen), RandDist="inverse-gamma")model_phi <- DHGLMMODELING(Model="dispersion", Link="log",

LinPred = phi ~ cycle + (1|specimen), RandDist="gaussian")res_dhglm1 <- dhglmfit(RespDist = "gamma", DataMain = data_crack_growth,

MeanModel = model_mu, DispersionModel = model_phi)

hglm1

hglm2

dhglm1

Use HGLMs!

β€’ Fast deterministic algorithms for complex models

β€’ All parts of the model are checkableβ€’ Predictions of unobservables

LRN@DU.SE

www.larsronnegard.se

LRN@DU.SE

DATA ANALYSIS USINGHIERARCHICAL GENERALIZED

LINEAR MODELS WITH R

Lee, RΓΆnnegΓ₯rd, Noh

www.larsronnegard.se

Recommended