Upload
ben-bolker
View
256
Download
0
Embed Size (px)
Citation preview
De�nitions Statistics Computation Sociological Conclusions References
General-purpose toolsfor generalized linear mixed models
Ben Bolker
McMaster University, Mathematics & Statistics and Biology
13 September 2013
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Outline
1 De�nitions and context
2 Statistical challenges
3 Computational challenges
4 Sociological challenges
5 Conclusions
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Outline
1 De�nitions and context
2 Statistical challenges
3 Computational challenges
4 Sociological challenges
5 Conclusions
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Generalized linear mixed models
GLMMs: a statistical modeling framework incorporating:
Linear combinations of categorical and continuouspredictors, and interactions
Response distributions in the exponential family
(binomial, Poisson, and extensions)
Any smooth, monotonic link function
(e.g. logistic, exponential models)
Flexible combinations of blocking factors
(clustering; random e�ects)
Applications in ecology, neurobiology, behaviour, epidemiology, realestate, . . .
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Generalized linear mixed models
GLMMs: a statistical modeling framework incorporating:
Linear combinations of categorical and continuouspredictors, and interactions
Response distributions in the exponential family
(binomial, Poisson, and extensions)
Any smooth, monotonic link function
(e.g. logistic, exponential models)
Flexible combinations of blocking factors
(clustering; random e�ects)
Applications in ecology, neurobiology, behaviour, epidemiology, realestate, . . .
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Generalized linear mixed models
GLMMs: a statistical modeling framework incorporating:
Linear combinations of categorical and continuouspredictors, and interactions
Response distributions in the exponential family
(binomial, Poisson, and extensions)
Any smooth, monotonic link function
(e.g. logistic, exponential models)
Flexible combinations of blocking factors
(clustering; random e�ects)
Applications in ecology, neurobiology, behaviour, epidemiology, realestate, . . .
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Generalized linear mixed models
GLMMs: a statistical modeling framework incorporating:
Linear combinations of categorical and continuouspredictors, and interactions
Response distributions in the exponential family
(binomial, Poisson, and extensions)
Any smooth, monotonic link function
(e.g. logistic, exponential models)
Flexible combinations of blocking factors
(clustering; random e�ects)
Applications in ecology, neurobiology, behaviour, epidemiology, realestate, . . .
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Technical de�nition
Yi︸︷︷︸response
∼
conditionaldistribution︷︸︸︷Distr (g−1(ηi )︸ ︷︷ ︸
inverselink
function
, φ︸︷︷︸scale
parameter
)
η︸︷︷︸linear
predictor
= Xβ︸︷︷︸�xede�ects
+ Zb︸︷︷︸randome�ects
b︸︷︷︸conditionalmodes
∼ MVN(0, Σ(θ)︸ ︷︷ ︸variance-covariancematrix
)
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Outline
1 De�nitions and context
2 Statistical challenges
3 Computational challenges
4 Sociological challenges
5 Conclusions
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Estimation
Maximum likelihood estimation
L(Yi |θ,β)︸ ︷︷ ︸likelihood
=
∫· · ·
∫L(Yi |θ,β′)︸ ︷︷ ︸
data|random e�ects
×L(β′|Σ(θ))︸ ︷︷ ︸random e�ects
dβ′
deterministic: precision vs. computational cost:penalized quasi-likelihood, Laplace approximation, adaptiveGauss-Hermite quadrature (Breslow, 2004) . . .
Monte Carlo: frequentist and Bayesian (Booth and Hobert,1999; Ponciano et al., 2009; Sung, 2007)
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Estimation: example (McKeon et al., 2012)
Log−odds of predation−6 −4 −2 0 2
Symbiont
Crab vs. Shrimp
Added symbiont
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
GLM (fixed)GLM (pooled)PQLLaplaceAGQ
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Inference
Big problem.
Inferential tools: either asymptotic
or taken from classical linear
models
boundary solutions (Stram andLee, 1994)
the great p-value/degrees offreedom debate
small numbers of clusters
solutions: computationaland/or Bayesian(parametric bootstrap, MCMC)
True p value
Infe
rred
p v
alue
0.02
0.04
0.06
0.08
0.02 0.06
Osm Cu
H2S
0.02 0.06
0.02
0.04
0.06
0.08
Anoxia
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Outline
1 De�nitions and context
2 Statistical challenges
3 Computational challenges
4 Sociological challenges
5 Conclusions
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Sparse matrix algorithms
repeated decomposition oflarge, matrices (especially Z )
�ll-reducing permutation toimprove sparsity pattern
further improvements possible:better matrix representation,parallelization?
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Bounded optimization
Parameterizevariance-covariance matrix Σ(θ)(Pinheiro and Bates, 1996)
Positive de�nite or onlysemi-de�nite?
Disadvantages of transformingto unconstrain
(Disadvantages of boundarysolutions)
raw log
0
10
20
30
0 1 2 3 −3 −2 −1 0
devi
ance
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Outline
1 De�nitions and context
2 Statistical challenges
3 Computational challenges
4 Sociological challenges
5 Conclusions
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Sociological issues
The curse of neophilia
Wide user base:
As usual when software for complicated statistical
inference procedures is broadly disseminated, there is
potential for abuse and misinterpretation.
(Breslow, 2004)
What if there is no good answer?�do no harm� vs. �better me than someone else�
Diagnostics and warning messages
End users vs. downstream developers
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Outline
1 De�nitions and context
2 Statistical challenges
3 Computational challenges
4 Sociological challenges
5 Conclusions
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Next steps
Alternative platforms/languages
Flexible correlation structures:spatial, temporal, phylogenetic . . .
Improved MCMC methods?
Simulation tests of inferential tools (sigh)
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Is it science?
Science is what we
understand well enough to
explain to a computer. Art
is everything else we do.
(Donald Knuth)
10
20
30
4050
2006 2008 2010 2012Date
artic
les
per
mon
th
key
glmm
lme4
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Acknowledgments
lme4: Doug Bates, MartinMächler, Steve Walker
Data: Adrian Stier (UBC/OSU),Sea McKeon (Smithsonian),David Julian (UF)
NSERC (Discovery)
SHARCnet
Ben Bolker
GLMMs
De�nitions Statistics Computation Sociological Conclusions References
Booth, J.G. and Hobert, J.P., 1999. Journal of the Royal Statistical Society. Series B, 61(1):265�285.doi:10.1111/1467-9868.00176.
Breslow, N.E., 2004. In D.Y. Lin and P.J. Heagerty, editors, Proceedings of the second Seattlesymposium in biostatistics: Analysis of correlated data, pages 1�22. Springer. ISBN 0387208623.
McKeon, C.S., Stier, A., et al., 2012. Oecologia, 169(4):1095�1103. ISSN 0029-8549.doi:10.1007/s00442-012-2275-2.
Pinheiro, J.C. and Bates, D.M., 1996. Statistics and Computing, 6(3):289�296.doi:10.1007/BF00140873.
Ponciano, J.M., Taper, M.L., et al., 2009. Ecology, 90(2):356�362. ISSN 0012-9658.
Stram, D.O. and Lee, J.W., 1994. Biometrics, 50(4):1171�1177.
Sung, Y.J., 2007. The Annals of Statistics, 35(3):990�1011. ISSN 0090-5364.doi:10.1214/009053606000001389.
Ben Bolker
GLMMs