Randomized Quantile Residual for Assessing Generalized Linear Mixed Models with Application to Zero-inflated Microbiome Data Longhai Li Department of Mathematics and Statistics University of Saskatchewan Saskatoon, SK, CANADA 5 June 2018 Annual Meeting of Statistical Society of Canada McGill University, Montreal, Canada


Acknowledgements

This talk is based on the results of the M.Sc. thesis project undertaken by Wei Bai, co-supervised with Cindy X. Feng (U of S).

Thanks to Prof. Wei Xu (U of T) for providing the microbiome data for this research.

Thanks to NSERC and CFI for providing grants for my research.

Thanks to the ICSA Canada Chapter, particularly Prof. Changbao Wu (U of Waterloo), for organizing and sponsoring this session.


Outline

1 Introduction

2 Zero-inflated/modified Generalized Linear Mixed Models

3 Randomized Quantile Residual

4 Simulation Studies
    Description of Data Generating Process
    Assessing Models for Datasets Simulated from ZMP Model
    Assessing Models for Datasets Simulated from ZMNB Model

5 Application to a Twin Study OTU Dataset

6 Conclusions and Discussions


Section 1

Introduction


Introduction

The operational taxonomic unit (OTU) counts in microbiome datasets are characterized by zero-inflation and over-dispersion. Various generalized linear mixed models have been proposed to fit such data.

Correct model specification plays an extremely important role in statistical inference, for example in calculating p-values/q-values for selecting OTUs that are related to a phenotype.

Pearson and deviance residuals are often used in practice without justification. However, when applied to count data, the distributions of these residuals are far from normal.

The randomized quantile residual (RQR) was originally proposed by Dunn and Smyth (1996) as an alternative to Pearson and deviance residuals. However, it has NOT been used much by statisticians.

We investigate the performance of RQR in checking zero-inflated/modified generalized linear mixed models (GLMMs) using simulated and real datasets.


Section 2

Zero-inflated/modified Generalized Linear Mixed Models


Generalized Linear Mixed Model

As an example of a GLMM, the negative binomial (NB) mixed model is described as follows:

A probability distribution for the response ($y_i$) given a mean $\mu_i$ and other parameters, e.g.

$$y_i \mid \mu_i \sim \text{Negative-Binomial}(\mu_i, k)$$

A link function linking the mean $\mu_i$ to a linear function of fixed factors ($X_i$) and random factors ($Z_i$), e.g.

$$\log(\mu_i) = X_i\beta + Z_i u$$

A penalty (often a normal distribution) is imposed on $u$.


Zero-inflated Poisson (ZIP) Model: I

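The zero-inflated Poisson distribution with parameters $\mu_i$ and $p_i$, denoted by ZIP($\mu_i$, $p_i$), is defined as:

$$y_i \sim \begin{cases} 0, & \text{with probability } p_i \\ \text{Poisson}(\mu_i), & \text{with probability } 1 - p_i. \end{cases} \tag{1}$$

The following link functions are often used, with a separate set of coefficients for the zero-inflation part:

$$\log(\mu_i) = \text{offset}_i + X_i\beta + Z_i u \tag{2}$$

$$\log\left(\frac{p_i}{1 - p_i}\right) = \text{offset}_i + X_i\tilde{\beta} + Z_i\tilde{u} \tag{3}$$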


Zero-inflated Poisson (ZIP) Model: II

The PMF and CDF of the ZIP distribution:

$$\text{dzip}(y_i = 0) = p_i + (1 - p_i)\, e^{-\mu_i} \tag{4}$$

$$\text{dzip}(y_i = j) = (1 - p_i)\, \frac{e^{-\mu_i}\mu_i^{j}}{j!}, \quad \text{for } j > 0 \tag{5}$$

$$\text{pzip}(y_i = J; \mu_i, p_i) = p_i + (1 - p_i)\, \text{ppois}(J, \mu_i). \tag{6}$$

The mean and variance of a ZIP random variable can be calculated by

$$E(y_i) = (1 - p_i)\, \mu_i \tag{7}$$

$$V(y_i) = (1 - p_i)\left(\mu_i + p_i \mu_i^2\right). \tag{8}$$
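The ZIP formulas above can be checked numerically. The following is a minimal sketch, assuming illustrative values $\mu = 3$ and $p = 0.4$ that are not from the talk; the helpers `dpois`, `ppois`, `dzip`, and `pzip` are small re-implementations named after the notation on the slide, not calls into any library.

```python
import math

def dpois(j, mu):
    """Poisson PMF at integer j >= 0, computed on the log scale."""
    return math.exp(-mu + j * math.log(mu) - math.lgamma(j + 1))

def ppois(j, mu):
    """Poisson CDF at integer j >= 0 by direct summation."""
    return sum(dpois(i, mu) for i in range(j + 1))

def dzip(j, mu, p):
    """ZIP PMF, equations (4)-(5): extra mass p at zero."""
    return (p if j == 0 else 0.0) + (1 - p) * dpois(j, mu)

def pzip(j, mu, p):
    """ZIP CDF, equation (6)."""
    return p + (1 - p) * ppois(j, mu)

mu, p = 3.0, 0.4        # illustrative values (assumption)
support = range(200)    # truncation point; the tail beyond it is negligible here

mean = sum(j * dzip(j, mu, p) for j in support)
var = sum(j * j * dzip(j, mu, p) for j in support) - mean ** 2

print("mean by summation:", mean, " formula (7):", (1 - p) * mu)
print("var by summation: ", var, " formula (8):", (1 - p) * (mu + p * mu ** 2))
```

Summing the PMF directly is slow for large means but keeps the check independent of the closed-form expressions it is meant to confirm.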


Zero-inflated Negative-Binomial (ZINB) Model: I

Zero-inflated NB (ZINB) can be defined similarly to ZIP, with Poisson replaced by NB.

The PMF and CDF of the ZINB distribution:

$$\text{dzinb}(y_i = 0) = p_i + (1 - p_i)\left(\frac{k}{k + \mu_i}\right)^{k} \tag{9}$$

$$\text{dzinb}(y_i = j) = (1 - p_i)\, \text{dnb}(j, \mu_i, k), \quad \text{for } j > 0 \tag{10}$$

$$\text{pzinb}(y_i; \mu_i, k, p_i) = p_i + (1 - p_i)\, \text{pnb}(y_i, \mu_i, k) \tag{11}$$

The mean and variance of a ZINB random variable can be calculated by

$$E(y_i) = (1 - p_i)\, \mu_i \tag{12}$$

$$V(y_i) = (1 - p_i)\left(\mu_i + \frac{\mu_i^2}{k}\right) + \mu_i^2\left(p_i - p_i^2\right) \tag{13}$$


Zero-modified Poisson (ZMP): I

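Zero-Modified Model: Zero-modified models are also called hurdle models. A logistic regression is used for the zero indicator ($Z_i$):

$$\Pr(Z_i = z) = \begin{cases} \pi_i, & z = 0 \\ 1 - \pi_i, & z = 1. \end{cases} \tag{14}$$

Given $Z_i$, the conditional probability mass function for $Y_i$ is:

$$\begin{aligned} \Pr(Y_i = y_i \mid Z_i = 0) &= I(y_i = 0) \\ \Pr(Y_i = y_i \mid Z_i = 1) &= \frac{\text{dpois}(y_i)}{1 - \text{dpois}(0)}\, I(y_i > 0). \end{aligned} \tag{15}$$

The unconditional probability mass function for $Y_i$ is

$$\Pr(Y_i = y_i) = \begin{cases} \pi_i, & \text{if } y_i = 0 \\ (1 - \pi_i)\, \dfrac{\text{dpois}(y_i)}{1 - \text{dpois}(0)}, & \text{if } y_i > 0. \end{cases} \tag{16}$$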


Zero-modified Poisson (ZMP): II

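We often use the log link for the non-zero count mean $\mu_i$ and the logit link for $\pi_i$, with a separate set of coefficients for the zero part:

$$\log(\mu_i) = \text{offset}_i + X_i\beta + Z_i u \tag{17}$$

$$\log\left(\frac{\pi_i}{1 - \pi_i}\right) = \text{offset}_i + X_i\tilde{\beta} + Z_i\tilde{u} \tag{18}$$

The PMF and CDF of the ZMP distribution:

$$\text{dzmp}(y_i = 0) = \pi_i \tag{19}$$

$$\text{dzmp}(y_i = j) = (1 - \pi_i)\, \frac{\text{dpois}(j)}{1 - \text{ppois}(0)}, \quad \text{for } j > 0 \tag{20}$$

$$\text{pzmp}(y_i; \mu_i, \pi_i) = \pi_i + (1 - \pi_i)\, \frac{\text{ppois}(y_i; \mu_i) - \text{ppois}(0)}{1 - \text{ppois}(0)} \tag{21}$$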
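As a quick consistency check of the ZMP PMF and CDF in (19)-(21), the sketch below sums the PMF and compares the result with the CDF. The values $\mu = 2.5$ and $\pi = 0.3$ are illustrative assumptions, and `dpois`, `ppois`, `dzmp`, and `pzmp` are hand-written helpers named after the slide's notation.

```python
import math

def dpois(j, mu):
    """Poisson PMF at integer j >= 0."""
    return math.exp(-mu + j * math.log(mu) - math.lgamma(j + 1))

def ppois(j, mu):
    """Poisson CDF at integer j >= 0 by direct summation."""
    return sum(dpois(i, mu) for i in range(j + 1))

def dzmp(j, mu, pi):
    """ZMP (hurdle Poisson) PMF, equations (19)-(20)."""
    if j == 0:
        return pi
    return (1 - pi) * dpois(j, mu) / (1 - ppois(0, mu))

def pzmp(j, mu, pi):
    """ZMP CDF, equation (21)."""
    return pi + (1 - pi) * (ppois(j, mu) - ppois(0, mu)) / (1 - ppois(0, mu))

mu, pi = 2.5, 0.3   # illustrative values (assumption)

cdf_by_sum = sum(dzmp(j, mu, pi) for j in range(6))
print("sum of PMF over 0..5:", cdf_by_sum)
print("pzmp(5):             ", pzmp(5, mu, pi))
print("P(Y = 0):            ", pzmp(0, mu, pi))
```

Unlike the ZIP model, the probability of zero here is exactly $\pi_i$, whatever the value of $\mu_i$, which is what distinguishes a hurdle model from a zero-inflated one.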


Zero-modified NB (ZMNB)

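The ZMNB model can be defined analogously. The PMF and CDF of the ZMNB distribution:

$$\text{dzmnb}(y_i = j) = \pi_i\, I(j = 0) + (1 - \pi_i)\, \frac{\text{dnb}(j)}{1 - \text{pnb}(0)}\, I(j > 0) \tag{22}$$

$$\text{pzmnb}(y_i; \mu_i, k, \pi_i) = \pi_i + (1 - \pi_i)\, \frac{\text{pnb}(y_i) - \text{pnb}(0)}{1 - \text{pnb}(0)}. \tag{23}$$

The same link functions as in ZMP are used for ZMNB.

The mean and variance of ZMNB, where $p_0 = \text{pnb}(0)$ is the NB probability of zero:

$$E(y_i) = \frac{1 - \pi_i}{1 - p_0}\, \mu_i \tag{24}$$

$$V(y_i) = \frac{1 - \pi_i}{1 - p_0}\left(\mu_i + \mu_i^2 + \frac{\mu_i^2}{k}\right) - \left(\frac{1 - \pi_i}{1 - p_0}\, \mu_i\right)^2. \tag{25}$$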
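The ZMNB mean and variance in (24)-(25) can be confirmed by summing over the PMF (22). This is a sketch under two assumptions: $p_0$ is taken to be the NB probability of zero, $(k/(k+\mu))^k$, and the values $\mu = 4$, $k = 1.5$, $\pi = 0.35$ are chosen only for illustration. The functions `dnb` and `dzmnb` are hand-written helpers, not library calls.

```python
import math

def dnb(j, mu, k):
    """NB PMF with mean mu and size k, so that dnb(0) = (k / (k + mu)) ** k."""
    log_pmf = (math.lgamma(j + k) - math.lgamma(k) - math.lgamma(j + 1)
               + k * math.log(k / (k + mu)) + j * math.log(mu / (k + mu)))
    return math.exp(log_pmf)

def dzmnb(j, mu, k, pi):
    """ZMNB PMF, equation (22)."""
    if j == 0:
        return pi
    return (1 - pi) * dnb(j, mu, k) / (1 - dnb(0, mu, k))

mu, k, pi = 4.0, 1.5, 0.35   # illustrative values (assumption)
p0 = dnb(0, mu, k)           # NB probability of zero
support = range(2000)        # truncation point; the tail beyond it is negligible here

mean_sum = sum(j * dzmnb(j, mu, k, pi) for j in support)
var_sum = sum(j * j * dzmnb(j, mu, k, pi) for j in support) - mean_sum ** 2

scale = (1 - pi) / (1 - p0)
mean_formula = scale * mu                                                   # (24)
var_formula = scale * (mu + mu ** 2 + mu ** 2 / k) - (scale * mu) ** 2      # (25)

print("mean:", mean_sum, mean_formula)
print("var: ", var_sum, var_formula)
```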


Section 3

Randomized Quantile Residual


Problems with Pearson and Deviance Residuals

In regression models for discrete outcomes, the residuals are far from normal and cluster on lines according to the distinct response values, which poses great challenges for visual inspection. Therefore, residual plots for models with discrete outcome variables give very limited meaningful information for model diagnosis.

The Pearson $\chi^2$ statistic is written as $X^2 = \sum_{i=1}^{n} r_i^2$, and the deviance ($\chi^2$ statistic) is written as $D = \sum_{i=1}^{n} d_i^2$. The asymptotic distribution of $D$ and $X^2$ under the true model is often assumed to be $\chi^2_{n-p}$. However, the use of this asymptotic distribution for both $X^2$ and $D$ lacks theoretical underpinning.


A First Look at Three Residuals

[Figure: residuals plotted against x in three panels: Pearson, Deviance, and Randomized Quantile.]

A simulated dataset is checked against the true generating model. Even so, the Pearson and deviance residuals exhibit trends and cluster in lines.

In addition, the often-used $\chi^2$ tests are not well calibrated.


Definition of Randomized Quantile Residual

Predictive p-value for continuous $y_i$:

$$F(y_i; \mu_i, \phi) = P(Y_i \le y_i \mid \mu_i, \phi)$$

Randomized predictive p-value: if $F$ is discrete, the estimated lower tail probability is randomized into a uniform random number,

$$F^*(y_i; \mu_i, \phi, u_i) = F(y_i-; \mu_i, \phi) + u_i\, P(y_i; \mu_i, \phi), \tag{26}$$

where $u_i$ is drawn from the uniform distribution on $(0, 1]$ and $F(y_i-; \mu_i, \phi) = \sup_{y < y_i} F(y; \mu_i, \phi)$ is the lower limit of the "gap" of $F(\cdot; \mu_i, \phi)$ at $y_i$.

Randomized quantile residual:

$$q_i = \Phi^{-1}(F^*(y_i; \mu_i, \phi, u_i)) \tag{27}$$

where $\Phi^{-1}$ is the quantile function of the standard normal distribution.
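To make the definition concrete, here is a minimal sketch of (26)-(27) for a Poisson model. The Poisson choice, the function name `rqr_poisson`, and the clamping constant are assumptions made for illustration; `NormalDist().inv_cdf` from the Python standard library plays the role of $\Phi^{-1}$.

```python
import math
import random
from statistics import NormalDist

def dpois(y, mu):
    """Poisson PMF at integer y >= 0."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def ppois(y, mu):
    """Poisson CDF; returns 0 for y < 0 so that F(0-) = 0."""
    return sum(dpois(i, mu) for i in range(y + 1)) if y >= 0 else 0.0

def rqr_poisson(y, mu, rng):
    """Randomized quantile residual, equations (26)-(27), under Poisson(mu)."""
    lower = ppois(y - 1, mu)            # F(y-): the CDF just below y
    u = 1.0 - rng.random()              # uniform on (0, 1]
    f_star = lower + u * dpois(y, mu)   # randomized lower tail probability
    # Keep f_star strictly inside (0, 1) so the normal quantile is finite.
    f_star = min(max(f_star, 1e-12), 1 - 1e-12)
    return NormalDist().inv_cdf(f_star)

rng = random.Random(1)
q = rqr_poisson(2, 1.5, rng)
print("RQR for y = 2 under Poisson(1.5):", q)
```

For a given observation the residual always falls between $\Phi^{-1}(F(y_i-))$ and $\Phi^{-1}(F(y_i))$; the random draw only decides where inside that gap it lands.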


An Illustrative Example for RQR: I

The true model: we simulate a response variable of size $n = 1000$ from a Poisson model with

$$\log(\mu_i) = -1 + 2\sin(2x_i),$$

where $\mu_i$ is the expected count for the $i$th subject and $x_i \sim \text{Uniform}(0, 2\pi)$, $i = 1, \dots, n$.

A wrong model: a Poisson model with mean structure

$$\log(\mu_i) = \beta_0 + \beta_1 x_i,$$

with $x_i$ as a predictor with a linear effect.
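The example can be reproduced in outline as follows. This is a sketch, not the code behind the talk: the seed, the inversion sampler `rpois`, the hand-written Newton-Raphson fit `fit_loglinear_poisson`, and the summary `trend` (mean RQR where $\sin(2x) > 0$ minus mean RQR elsewhere) are all assumptions introduced here, and the "true model" residuals use the true means rather than fitted ones.

```python
import math
import random
from statistics import NormalDist, mean

def dpois(y, mu):
    """Poisson PMF."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def ppois(y, mu):
    """Poisson CDF (0 for y < 0)."""
    return sum(dpois(i, mu) for i in range(y + 1)) if y >= 0 else 0.0

def rpois(mu, rng):
    """One Poisson(mu) draw by CDF inversion; adequate for the small means here."""
    u, y = rng.random(), 0
    cdf = dpois(0, mu)
    while u > cdf and y < 1000:
        y += 1
        cdf += dpois(y, mu)
    return y

def rqr(y, mu, rng):
    """Randomized quantile residual of y under Poisson(mu)."""
    f_star = ppois(y - 1, mu) + (1.0 - rng.random()) * dpois(y, mu)
    return NormalDist().inv_cdf(min(max(f_star, 1e-12), 1 - 1e-12))

def fit_loglinear_poisson(x, y, n_iter=25):
    """Newton-Raphson maximum likelihood for log(mu) = b0 + b1 * x."""
    b0, b1 = math.log(mean(y)), 0.0
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        s0 = sum(yi - mi for yi, mi in zip(y, mu))             # score for b0
        s1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))  # score for b1
        i00 = sum(mu)                                          # information matrix
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        b0 += (i11 * s0 - i01 * s1) / det
        b1 += (i00 * s1 - i01 * s0) / det
    return b0, b1

rng = random.Random(2018)   # arbitrary seed (assumption)
n = 1000
x = [rng.uniform(0, 2 * math.pi) for _ in range(n)]
mu_true = [math.exp(-1 + 2 * math.sin(2 * xi)) for xi in x]
y = [rpois(m, rng) for m in mu_true]

b0, b1 = fit_loglinear_poisson(x, y)
mu_wrong = [math.exp(b0 + b1 * xi) for xi in x]

q_true = [rqr(yi, m, rng) for yi, m in zip(y, mu_true)]
q_wrong = [rqr(yi, m, rng) for yi, m in zip(y, mu_wrong)]

def trend(q):
    """Mean RQR where sin(2x) > 0 minus mean RQR where sin(2x) <= 0."""
    hi = [qi for qi, xi in zip(q, x) if math.sin(2 * xi) > 0]
    lo = [qi for qi, xi in zip(q, x) if math.sin(2 * xi) <= 0]
    return mean(hi) - mean(lo)

print("trend under true model :", round(trend(q_true), 2))
print("trend under wrong model:", round(trend(q_wrong), 2))
```

Under the true model the two groups of residuals should have nearly equal means, while the linear model leaves a clear pattern in $x$ that the residuals pick up.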


An Illustrative Example for RQR: II

[Figure: F* plotted against x in two panels, "true model" and "wrong model", with a point legend keyed 0 to 7.]

Normality of Randomized Quantile Residual (RQR)

Theorem

Suppose a continuous random variable $Y$ has the CDF $F(y)$. Then $F(Y)$ is uniformly distributed on $(0, 1]$.

Theorem

Suppose the true distribution of $Y_i$ given $X_i$ has the CDF $F(y_i; \mu_i, \phi)$ and PMF $P(y_i; \mu_i, \phi)$, where $\mu_i$ is a function of $X_i$ involving the model parameters. The randomized lower tail probability $F^*(y_i; \mu_i, \phi, u_i)$ is defined as $F(y_i-; \mu_i, \phi) + u_i\, P(y_i; \mu_i, \phi)$, as in (26). Suppose $U_i$ is uniformly distributed on $(0, 1]$. Then we have

$$F^*(Y_i; \mu_i, \phi, U_i) \sim \text{Uniform}((0, 1]), \tag{28}$$

and

$$q_i = \Phi^{-1}(F^*(Y_i; \mu_i, \phi, U_i)) \sim N(0, 1). \tag{29}$$


Proof of Normality of RQR

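For any interval $B \subseteq (0, 1]$,

$$P(F^*(Y_i; \mu_i, \phi, U_i) \in B \mid Y_i = k^{(j)}) = \frac{\text{length}(F^{(j)} \cap B)}{p^{(j)}},$$

where length(·) is the length of an interval, $k^{(j)}$ is the $j$th possible value of $Y_i$, $p^{(j)} = P(Y_i = k^{(j)})$, and $F^{(j)} = (F(k^{(j)}-), F(k^{(j)})]$ is the interval of length $p^{(j)}$ over which $F^*$ is spread when $Y_i = k^{(j)}$. By the law of total probability,

\begin{align}
& P(F^*(Y_i; \mu_i, \phi, U_i) \in B) \tag{30} \\
&= \sum_{j=1}^{\infty} P(F^*(Y_i; \mu_i, \phi, U_i) \in B \mid Y_i = k^{(j)}) \times P(Y_i = k^{(j)}) \tag{31} \\
&= \sum_{j=1}^{\infty} \frac{\text{length}(F^{(j)} \cap B)}{p^{(j)}} \times p^{(j)} \tag{32} \\
&= \sum_{j=1}^{\infty} \text{length}(F^{(j)} \cap B) \tag{33} \\
&= \text{length}\left(\cup_{j=1}^{\infty} F^{(j)} \cap B\right) = \text{length}(B), \tag{34}
\end{align}

where the last line holds because the intervals $F^{(j)}$ are disjoint and together cover $(0, 1]$.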
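The key step of the proof, that the gaps of a discrete CDF tile $(0, 1]$ so that their overlaps with $B$ add up to length($B$), can be checked numerically. The sketch below assumes a Poisson(3) distribution and the interval $B = (0.2, 0.7]$ purely for illustration; `covered_length` is a hypothetical helper, and the support values are indexed from 0 rather than 1.

```python
import math

def dpois(j, mu):
    """Poisson PMF at integer j >= 0."""
    return math.exp(-mu + j * math.log(mu) - math.lgamma(j + 1))

def covered_length(a, b, mu, n_terms=200):
    """Sum over support values j of length(F_j ∩ (a, b]) for a Poisson(mu) CDF.

    F_j = (F(j-), F(j)] is the gap of the CDF at j; its length is dpois(j, mu).
    The sum is truncated at n_terms, beyond which the gaps are negligible.
    """
    total, lower = 0.0, 0.0
    for j in range(n_terms):
        upper = lower + dpois(j, mu)
        # Length of the overlap between (lower, upper] and (a, b], if any.
        total += max(0.0, min(b, upper) - max(a, lower))
        lower = upper
    return total

a, b = 0.2, 0.7   # illustrative interval B (assumption)
print("sum of overlaps:", covered_length(a, b, mu=3.0))
print("length(B):      ", b - a)
```

The two printed values should agree up to floating-point rounding, for any interval inside $(0, 1]$ and any discrete distribution substituted for the Poisson.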


Section 4

Simulation Studies


Subsection 1

Description of Data Generating Process


General Form of Microbiome Dataset

| | OTU1 ... OTUm (Y1 ... Ym) | Total Reads (Offset) | Host Factors (Fixed Factors) | Sample Variables (Random Factors) |
|---|---|---|---|---|
| sample 1 | Y11 ... Y1m | T1 | X11 ... X1s | Z11 ... Z1t |
| ... | ... | ... | ... | ... |
| sample n | Yn1 ... Ynm | Tn | Xn1 ... Xns | Zn1 ... Znt |


Link Functions and Parameters in Data Generation

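Link Functions

$$\log(\mu_i) = \log(T_i) + \beta_0 + \beta^{(1)}_{X_{i1}} + \dots + \beta^{(s)}_{X_{is}} + u^{(1)}_{Z_{i1}} + \dots + u^{(t)}_{Z_{it}},$$

$$\log\left(\frac{\pi_i}{1 - \pi_i}\right) = \beta_0 + \tilde{\beta}^{(1)}_{X_{i1}} + \dots + \tilde{\beta}^{(s)}_{X_{is}} + \tilde{u}^{(1)}_{Z_{i1}} + \dots + \tilde{u}^{(t)}_{Z_{it}}$$

Parameters:

| Parameter | Generator |
|---|---|
| $\beta_0$ | -0.2 |
| $\beta$, $\tilde{\beta}$ | $N(0, 0.1^2)$ |
| $u$, $\tilde{u}$ | $N(0, 2^2)$ |
| $k$ (ZMNB) | Unif(1, 2) |
| $T_i$ | Poisson($3 \times 10^5$) |

Other Settings: m = 3000, s = 3, t = 3; each fixed factor has 5 levels and each random factor has 10 levels.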


Steps to Generate OTUs with ZMP/ZMNB Model

Step 1: Randomly generate the matrix of fixed and random factors and the total reads $T_i$ (used for all $Y_j$).

For each response $Y_j$:

Step 2: Compute $\pi_{ij}$ and $\mu_{ij}$ using the link functions with randomly generated parameters.

Step 3: Generate a count indicator $Z_{ij}$ as a Bernoulli random variable:

$$Z_{ij} = \begin{cases} 0, & \text{with probability } \pi_{ij} \\ 1, & \text{with probability } 1 - \pi_{ij}. \end{cases} \tag{35}$$

Step 4: If the indicator $Z_{ij} = 0$, then $Y_{ij} = 0$. If the indicator $Z_{ij} = 1$, then $Y_{ij}$ follows a truncated Poisson or NB model, e.g.,

$$Y_{ij} \sim \text{Truncated-Poisson}(\mu_{ij}).$$
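Steps 3 and 4 can be sketched for a single pair $(\mu, \pi)$ as follows. This is an illustration under stated assumptions, not the simulation code behind the talk: the values $\mu = 2$ and $\pi = 0.3$ are far smaller than the means implied by total reads of order $3 \times 10^5$, and the inversion sampler `rtrunc_pois` is only practical for small means. The names `rtrunc_pois` and `rzmp` are hypothetical.

```python
import math
import random

def dpois(y, mu):
    """Poisson PMF at integer y >= 0."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def rtrunc_pois(mu, rng):
    """Draw from a zero-truncated Poisson(mu) by CDF inversion on y >= 1."""
    norm = 1.0 - dpois(0, mu)            # P(Y > 0) under the untruncated Poisson
    u, y = rng.random(), 1
    cdf = dpois(1, mu) / norm
    while u > cdf and y < 10_000:
        y += 1
        cdf += dpois(y, mu) / norm
    return y

def rzmp(mu, pi, rng):
    """Steps 3-4: Bernoulli count indicator, then a truncated Poisson count."""
    z = 0 if rng.random() < pi else 1    # Z = 0 with probability pi, as in (35)
    return 0 if z == 0 else rtrunc_pois(mu, rng)

rng = random.Random(35)                  # arbitrary seed (assumption)
mu, pi, n = 2.0, 0.3, 20_000             # illustrative values (assumption)
draws = [rzmp(mu, pi, rng) for _ in range(n)]

zero_fraction = sum(d == 0 for d in draws) / n
sample_mean = sum(draws) / n
expected_mean = (1 - pi) * mu / (1 - math.exp(-mu))   # Poisson analogue of (24)

print("fraction of zeros:", zero_fraction, " target pi:", pi)
print("sample mean:      ", sample_mean, " expected:", expected_mean)
```

For means as large as those in the simulation study, a rejection or normal-approximation sampler would be needed instead of inversion from $y = 1$.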


Subsection 2

Assessing Models for Datasets Simulated from ZMP Model


Checking One Yj : RQR plot vs Fitted Values

[Figure: randomized quantile residuals plotted against fitted values for one Yj, in four panels: ZMP, ZIP, ZMNB, and Poisson. The residual axis spans roughly -3 to 2 in the ZMP, ZIP, and ZMNB panels and -6 to 6 in the Poisson panel.]

Checking One Yj : QQ-plot of RQR

[Figure: normal QQ-plots (sample quantiles against theoretical quantiles) of the randomized quantile residuals for one Yj, in four panels: ZMP, ZIP, ZMNB, and Poisson. The sample quantile axis spans roughly -3 to 2 in the ZMP, ZIP, and ZMNB panels and -6 to 6 in the Poisson panel.]

Checking One Yj : Pearson Residual vs Fitted Values

[Figure: Pearson residuals plotted against fitted values for one Yj, in four panels: ZMP, ZIP, ZMNB, and Poisson. The residual axis spans roughly -6 to 2 in the ZMP, ZIP, and ZMNB panels and -1000 to 1000 in the Poisson panel.]

Checking One Yj : QQ-plot of Pearson Res.

[Figure: normal QQ-plots (sample quantiles against theoretical quantiles) of the Pearson residuals for one Yj, in four panels: ZMP, ZIP, ZMNB, and Poisson. The sample quantile axis spans roughly -6 to 2 in the ZMP, ZIP, and ZMNB panels and -1000 to 1000 in the Poisson panel.]

Checking All Yj's: 3000 Shapiro-Wilk P-values of RQR

[Figure: histograms (frequency against p-value) of the Shapiro-Wilk p-values of the randomized quantile residuals, in four panels: ZMP, ZIP, ZMNB, and Poisson. The p-value axis spans 0 to 1 in the ZMP, ZIP, and ZMNB panels and 0 to 6e-18 in the Poisson panel.]

Checking All Yj's: 3000 Shapiro-Wilk P-values of Pearson

[Figure: histograms (frequency against p-value) of the Shapiro-Wilk p-values of the Pearson residuals, in four panels: ZMP, ZIP, ZMNB, and Poisson. In every panel the p-value axis spans at most 0 to 2.5e-05.]

Type 1 Error Rates and Power

Using 0.05 as the cutoff, the probabilities of rejecting the fitted models with RQRs and Pearson residuals in 3000 Yj's are shown as follows:

Table 1: Using Randomized Quantile Residual

Sample size   ZMP     ZIP     ZMNB    Poisson
200           0.142   0.139   0.145   1.000
400           0.074   0.090   0.102   0.999
800           0.068   0.068   0.082   1.000
1600          0.060   0.061   0.059   1.000
3200          0.051   0.051   0.063   1.000

Table 2: Using Pearson Residual

Sample size   ZMP     ZIP     ZMNB    Poisson
200           0.984   0.986   0.983   0.997
400           1.000   1.000   1.000   1.000
800           1.000   1.000   1.000   1.000
1600          1.000   1.000   1.000   1.000
3200          1.000   1.000   1.000   1.000
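
The contrast between the two tables can be reproduced in miniature. The sketch below is not the simulation behind the slides: it assumes a plain Poisson model with a known mean (no random effects, no zero modification), and it swaps the Shapiro-Wilk test for the Jarque-Bera test, whose asymptotic p-value has a closed form, because the Python standard library has no Shapiro-Wilk implementation. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def poisson_pmf(k, lam):
    """P(Y = k) for Y ~ Poisson(lam)."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def poisson_cdf(k, lam):
    """P(Y <= k); the empty sum gives 0 for k < 0."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate (Knuth's method; fine for small lam)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def rqr_poisson(y, lam, rng):
    """Randomized quantile residual: Phi^{-1}(F(y - 1) + U * P(Y = y))."""
    p = poisson_cdf(y - 1, lam) + rng.random() * poisson_pmf(y, lam)
    p = min(max(p, 1e-12), 1 - 1e-12)  # keep inv_cdf's argument inside (0, 1)
    return STD_NORMAL.inv_cdf(p)

def jarque_bera_pvalue(x):
    """Asymptotic Jarque-Bera p-value (assumed stand-in for Shapiro-Wilk)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)
    return math.exp(-jb / 2.0)  # survival function of chi-square with 2 df

def rejection_rates(n=200, lam=1.0, reps=300, alpha=0.05, seed=1):
    """Share of simulated datasets whose residuals fail the normality test."""
    rng = random.Random(seed)
    rej_rqr = rej_pearson = 0
    for _ in range(reps):
        ys = [sample_poisson(lam, rng) for _ in range(n)]
        rqr = [rqr_poisson(y, lam, rng) for y in ys]
        pearson = [(y - lam) / math.sqrt(lam) for y in ys]
        rej_rqr += jarque_bera_pvalue(rqr) < alpha
        rej_pearson += jarque_bera_pvalue(pearson) < alpha
    return rej_rqr / reps, rej_pearson / reps

rqr_rate, pearson_rate = rejection_rates()
print(f"RQR rejection rate:     {rqr_rate:.3f}")
print(f"Pearson rejection rate: {pearson_rate:.3f}")
```

Under the true model the randomized residuals are exactly standard normal, so their rejection rate should sit near the nominal level, while the Pearson residuals of a low-mean count model stay discrete and skewed and are rejected almost always even though the model is correct.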


Subsection 3

Assessing Models for Datasets Simulated from ZMNB Model


Checking One Yj: RQR vs Fitted Values

[Figure: four scatter plots of randomized quantile residuals (y-axis) against fitted values (x-axis), one per fitted model: ZMNB, ZINB, NB, ZMP. The residuals range over roughly -3 to 4 for ZMNB and ZINB, -3 to 2 for NB, and -4 to 6 for ZMP.]


Checking One Yj: QQ plot of RQR

[Figure: four normal QQ plots of randomized quantile residuals (x-axis: theoretical quantiles, -3 to 3; y-axis: sample quantiles), one per fitted model: ZMNB, ZINB, NB, ZMP. The sample quantiles range over roughly -3 to 4 for ZMNB and ZINB, -3 to 2 for NB, and -4 to 6 for ZMP.]


Checking One Yj: Pearson Residuals vs Fitted Values

[Figure: four scatter plots of Pearson residuals (y-axis) against fitted values (x-axis), one per fitted model: ZMNB, ZINB, NB, ZMP. The residuals range over roughly 0 to 8 for ZMNB, -0.5 to 2.5 for ZINB, 0 to 3 for NB, and -500 to 2000 for ZMP.]


Checking One Yj: QQ plot of Pearson Residuals

[Figure: four normal QQ plots of Pearson residuals (x-axis: theoretical quantiles, -3 to 3; y-axis: sample quantiles), one per fitted model: ZMNB, ZINB, NB, ZMP. The sample quantiles range over roughly 0 to 8 for ZMNB, -0.5 to 2.5 for ZINB, 0 to 3 for NB, and -500 to 2000 for ZMP.]


Checking All Yj's: 3000 Shapiro-Wilk P-values of RQR

[Figure: four histograms of the 3000 Shapiro-Wilk p-values of randomized quantile residuals, one per fitted model (x-axis: p-value; y-axis: frequency). In the ZMNB and ZINB panels the p-values spread over 0 to 1; the NB panel runs from 0 to about 0.02, and the ZMP panel from 0 to about 1.5e-47.]


Checking All Yj's: 3000 Shapiro-Wilk P-values of Pearson

[Figure: four histograms of the 3000 Shapiro-Wilk p-values of Pearson residuals, one per fitted model: ZMNB, ZINB, NB, ZMP (x-axis: p-value; y-axis: frequency). All p-values are tiny: the x-axes run from 0 to at most about 6e-05.]


Type 1 Error Rates and Power

Using 0.05 as the cutoff, the probabilities of rejecting the fitted models with RQRs and Pearson residuals in 3000 Yj's are shown as follows:

Table 3: Using Randomized Quantile Residuals

Sample size   ZMNB    ZINB    NB      ZMP
200           0.067   0.153   0.957   1.000
400           0.057   0.063   0.883   1.000
800           0.053   0.049   0.759   1.000
1600          0.047   0.055   0.928   1.000
3200          0.040   0.042   1.000   1.000

Table 4: Using Pearson Residuals

Sample size   ZMNB    ZINB    NB      ZMP
200           1.000   1.000   1.000   1.000
400           1.000   1.000   1.000   1.000
800           1.000   1.000   1.000   1.000
1600          1.000   1.000   1.000   1.000
3200          1.000   1.000   1.000   1.000


Section 5

Application to a Twin Study OTU Dataset


Data Description

We use twin-study OTU data at the genus level. There are m = 14 different genera (14 Yj's) measured on n = 287 samples in total.

We fit the six different models proposed before to this twin-study OTU data and use randomized quantile residuals to test the goodness of fit for all OTUs.

We take ancestry and obesity as host factors, and age and family as random factors.

At the genus level, the dataset does not have many zeros. However, the ordinary NB and Poisson models do not fit the dataset well (to be shown).

For 10 genera, we combine OTU counts smaller than 10 into a bin called “zero”; for the other 4 genera, we use larger thresholds (less than 150).
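
The binning step in the last point can be sketched as follows. This is only an illustration under stated assumptions: the counts are made up, the function names are hypothetical, and counts at or above the threshold are left unchanged, which the slides do not spell out.

```python
def collapse_small_counts(counts, threshold):
    """Send every count below `threshold` to the "zero" bin; keep the rest."""
    return [0 if c < threshold else c for c in counts]

def zero_fraction(counts):
    """Share of observations sitting in the zero bin."""
    return sum(1 for c in counts if c == 0) / len(counts)

# Hypothetical OTU counts for one genus (not taken from the twin study).
otu_counts = [0, 3, 9, 10, 47, 250, 1800, 5, 0, 620]

binned = collapse_small_counts(otu_counts, threshold=10)
print(binned)
print(zero_fraction(otu_counts), zero_fraction(binned))
```

Raising the threshold enlarges the “zero” bin, which is what gives the zero-inflated and zero-modified models a point mass to fit even when exact zeros are rare.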


Histograms of OTUs of 4 Genera

[Figure: four histograms of OTU counts (x-axis: OTU value; y-axis: frequency), one per genus: Bacteroides (counts up to about 6000), Lachnospiraceae..g (up to about 4000), Roseburia (up to about 2500), Faecalibacterium (up to about 3000).]


Histograms of Randomized Predictive P-values for “Euba”

[Figure: eight histograms of randomized predictive p-values for the genus “Euba” (x-axis: p-value, 0 to 1; y-axis: frequency), one per fitted model: Poisson1, NB1, Poisson, NB, ZIP, ZMP, ZMNB, ZINB.]


RQR vs Fitted Values for “Euba”

[Figure: eight scatter plots of randomized quantile residuals (y-axis) against fitted values (x-axis, log scale), one per fitted model: Poisson1, NB1, Poisson, NB, ZIP, ZMP, ZINB, ZMNB. The residuals range over roughly -6 to 6 for Poisson1 and Poisson, -2 to 6 for ZIP and ZMP, -3 to 2 for NB1, NB and ZMNB, and -2 to 2 for ZINB.]


QQ-plot of RQR for “Euba”

[Figure: eight normal QQ plots of randomized quantile residuals (x-axis: theoretical quantiles, -3 to 3; y-axis: sample quantiles), one per fitted model: Poisson1, NB1, Poisson, NB, ZIP, ZMP, ZINB, ZMNB. The sample quantiles range over roughly -6 to 6 for Poisson1 and Poisson, -2 to 6 for ZIP and ZMP, and -3 to 2 for NB1, NB, ZINB and ZMNB.]


Shapiro-Wilk P-values for all 14 Genera

Table 5: P-values for the Shapiro-Wilk test of randomized quantile residuals for the twin study OTU data, sorted by ZMNB

Genus     ZMNB    ZINB    ZMP       ZIP       NB        Poisson   NB1       Poisson1
Bact      0.052   0.034   <10^-19   <10^-19   <10^-16   <10^-18   <10^-8    <10^-17
Lach..g   0.072   0.074   <10^-16   <10^-15   <10^-3    <10^-11   0.005     <10^-4
Faec      0.083   0.107   <10^-17   <10^-18   <10^-17   <10^-15   <10^-10   <10^-13
Rumi      0.232   0.285   <10^-19   <10^-19   <10^-6    <10^-12   0.04      <10^-5
Rumi.1    0.238   0.366   <10^-16   <10^-16   <10^-10   <10^-11   <10^-5    <10^-10
Blau      0.251   0.104   <10^-10   <10^-10   0.087     <10^-12   0.182     <10^-12
Erys      0.344   0.258   <10^-16   <10^-17   <10^-4    <10^-7    0.314     <10^-5
Alis      0.344   0.352   <10^-16   <10^-16   <10^-9    <10^-7    0.003     <10^-6
Euba      0.461   0.539   <10^-15   <10^-15   <10^-10   <10^-6    0.006     <10^-4
Lach      0.521   0.358   <10^-9    <10^-10   <10^-10   <10^-5    0.003     0.051
Oscil     0.535   0.606   <10^-15   <10^-15   <10^-9    <10^-5    0.006     <10^-4
Prev      0.605   0.269   <10^-17   <10^-17   <10^-4    <10^-12   0.002     <10^-12
Rose      0.627   0.613   <10^-13   <10^-14   <10^-6    <10^-13   0.749     <10^-13
Copr      0.752   0.721   <10^-13   <10^-14   <10^-8    <10^-6    0.245     <10^-6


Conclusions and Discussions

Our studies show that RQR performs very well for checking GLMMs. RQRs are normally distributed under the true model. In the GOF test, the type 1 error rates of RQR approach the nominal level of 0.05 as the sample size grows, and the statistical power of RQR in rejecting wrong models is very good.

We have applied RQR to assess models for a real human microbiome dataset at the genus level and found that ZMNB and ZINB are good models for the dataset, while simpler models (such as NB and Poisson) are not adequate to describe the extraordinarily small and large OTU counts.

We have developed generic functions for computing RQRs from the fitted output of the R package glmmTMB. They will be released in Wei Bai’s M.Sc. thesis.
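
To make the normality claim concrete, here is a minimal sketch of a randomized quantile residual computed from any discrete CDF, applied to a zero-modified (hurdle) Poisson. It is not the glmmTMB-based code mentioned above: the function names, the parameter values p0 = 0.4 and lam = 3.0, and the use of true rather than fitted parameters are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean, stdev

def zmp_cdf(k, p0, lam):
    """P(Y <= k) for a zero-modified (hurdle) Poisson; zero for k < 0."""
    if k < 0:
        return 0.0
    pois_zero = math.exp(-lam)
    pois_cdf = sum(math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
                   for i in range(k + 1))
    # Positive counts follow a zero-truncated Poisson, weighted by 1 - p0.
    return p0 + (1.0 - p0) * (pois_cdf - pois_zero) / (1.0 - pois_zero)

def rqr(y, cdf, rng):
    """Draw uniformly between F(y - 1) and F(y), then map through Phi^{-1}."""
    lower, upper = cdf(y - 1), cdf(y)
    p = lower + rng.random() * (upper - lower)
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep inv_cdf's argument inside (0, 1)
    return NormalDist().inv_cdf(p)

def sample_zmp(p0, lam, rng):
    """Zero with probability p0, otherwise a zero-truncated Poisson draw."""
    if rng.random() < p0:
        return 0
    while True:  # rejection sampling: redraw until the Poisson count is positive
        limit, k, prod = math.exp(-lam), 0, rng.random()
        while prod > limit:
            k += 1
            prod *= rng.random()
        if k > 0:
            return k

rng = random.Random(2018)
p0, lam = 0.4, 3.0  # hypothetical parameter values

def true_cdf(k):
    return zmp_cdf(k, p0, lam)

residuals = [rqr(sample_zmp(p0, lam, rng), true_cdf, rng) for _ in range(5000)]
print(f"mean = {fmean(residuals):.3f}, sd = {stdev(residuals):.3f}")
```

Because `rqr` only needs a CDF, the same function works for any of the count models once its CDF is supplied; with the true CDF the residuals should have mean near 0 and standard deviation near 1.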
