Applied Bayesian Inference, KSU, April 29, 2012
§❻ Hierarchical (Multi-Stage) Generalized Linear Models
Robert J. Tempelman
Introduction
• Some inferential problems require non-classical approaches, e.g.,
  – Heterogeneous variances and covariances across environments.
  – Different distributional forms (e.g., heavy-tailed or mixtures for residual/random effects).
  – High-dimensional variable selection models.
• Hierarchical Bayesian modeling provides some flexibility for such problems.
Heterogeneous variance models(Kizilkaya and Tempelman, 2005)
• Consider a study involving different subclasses (e.g., herds).
  – Mean responses are different.
  – But suppose residual variances are different too.
• Let's discuss this in the context of the LMM (linear mixed model).
Recall linear mixed model
• Given:
  y = Xβ + Zu + e,  e ~ N(0, R(ξ)),
  so that p(y | β, u, ξ) = N(y | Xβ + Zu, R(ξ)).
• R(ξ) has a certain “heteroskedastic” specification.
• ξ determines the nature of the heterogeneous residual variances.
Modeling Heterogeneous Variances
• Suppose e = (e′_11, e′_12, …, e′_st)′, with
  e_kl ~ N(0, R(ξ) = I_{n_kl} σ²_{e_kl}) and
  σ²_{e_kl} = σ²_e g_k v_l;  k = 1, 2, …, s;  l = 1, 2, …, t,
  – with σ²_e as a “fixed” intercept residual variance,
  – g_k > 0 the kth fixed scaling effect,
  – v_l > 0 the lth random scaling effect.
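This multiplicative specification is easy to simulate. The sketch below (the subclass sizes and effect values are illustrative assumptions, not the study's settings) draws residuals with variance σ²_e g_k v_l and checks the empirical variance:

```python
import random
import statistics

random.seed(1)

s2_e = 1.0                     # "intercept" residual variance
g = {1: 2.0, 2: 1.0}           # fixed scaling effects (corner: g_s = 1)
v = {1: 0.5, 2: 1.0, 3: 1.8}   # random scaling effects (e.g., herds)

def draw_residuals(k, l, n):
    """Draw n residuals e_ikl ~ N(0, s2_e * g_k * v_l)."""
    sd = (s2_e * g[k] * v[l]) ** 0.5
    return [random.gauss(0.0, sd) for _ in range(n)]

e = draw_residuals(k=1, l=3, n=200_000)
emp_var = statistics.pvariance(e)   # theory: 1.0 * 2.0 * 1.8 = 3.6
```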
Subjective and Structural Priors
• “Intercept” variance σ²_e: subjective flat or conjugate vague inverted-gamma (IG) prior p(σ²_e).
• Invoke typical constraints for “fixed effects”:
  – Corner parameterization: g_s = 1.
  – Flat or vague IG prior p(g_k); k = 1, 2, …, s.
• Structural prior for “random effects”:
  – i.e., v_l ~ IG(α_e, α_e − 1):
    p(v_l | α_e) = [(α_e − 1)^{α_e} / Γ(α_e)] v_l^{−(α_e + 1)} exp(−(α_e − 1)/v_l)
  – E(v_l) = 1;  Var(v_l) = 1/(α_e − 2);  CV(v_l | α_e) = 1/√(α_e − 2).
• α_e functions like a “variance component” for residual variances → hyperparameter.
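The stated moments of the structural prior v_l ~ IG(α_e, α_e − 1) can be verified by Monte Carlo; a minimal sketch, drawing IG variates as reciprocals of Gamma variates:

```python
import random
import statistics

random.seed(2)

def draw_v(alpha_e, n):
    """n draws from IG(alpha_e, alpha_e - 1), via reciprocal Gamma draws."""
    scale = 1.0 / (alpha_e - 1.0)   # Gamma scale = 1/rate
    return [1.0 / random.gammavariate(alpha_e, scale) for _ in range(n)]

alpha_e = 15.0
v = draw_v(alpha_e, 200_000)
mean_v = statistics.fmean(v)          # theory: E(v_l) = 1
cv_v = statistics.pstdev(v) / mean_v  # theory: 1/sqrt(alpha_e - 2) ~ 0.277
```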
Remaining priors
• “Classical” random effects: p(u | φ) = N(u | 0, G(φ)).
• “Classical” fixed effects: p(β).
• “Classical” random effects VC: p(φ).
• Hyperparameter (Albert, 1988): p(α_e) = 1/(1 + α_e)².
  – SAS PROC MCMC doesn’t seem to handle this: the prior can’t be written as a function of the corresponding parameter.
What was the last prior again???
p(α_e) = 1/(1 + α_e)² is exactly the prior induced by placing a Uniform(0,1) prior on 1/(1 + α_e), or equivalently on α_e/(1 + α_e).

Different diffuse priors can have different impacts on posterior inferences if the data information is poor! (Rosa et al., 2004)
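A quick way to see the equivalence: draw u ~ Uniform(0,1) and transform to α_e = u/(1 − u); the induced density is 1/(1 + α_e)², whose median is 1 and which puts probability 3/4 on α_e < 3. A sketch:

```python
import random
import statistics

random.seed(3)

u = [random.random() for _ in range(200_000)]   # u ~ Uniform(0, 1)
alpha = [ui / (1.0 - ui) for ui in u]           # alpha = u / (1 - u)

median_alpha = statistics.median(alpha)                   # theory: 1.0
frac_below_3 = sum(a < 3.0 for a in alpha) / len(alpha)   # theory: 0.75
```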
Joint Posterior Density
• LMM:
  p(β, u, γ, v, φ, α_e, σ²_e | y)
    ∝ p(y | β, u, γ, v, σ²_e) p(β) p(u | φ) p(φ)
      × [∏_{k=1}^{s} p(g_k)] [∏_{l=1}^{t} p(v_l | α_e)] p(α_e) p(σ²_e)
Details on FCD
• All provided by Kizilkaya and Tempelman (2005).
  – All are recognizable except that for α_e:
    p(α_e | β, u, φ, γ, v, y) ∝ [(α_e − 1)^{α_e} / Γ(α_e)]^t [∏_{l=1}^{t} v_l]^{−(α_e + 1)} exp(−(α_e − 1) ∑_{l=1}^{t} 1/v_l) p(α_e)
  – Use a Metropolis-Hastings random walk on log(α_e), using a normal proposal density.
• For MH, it is generally a good idea to transform parameters so that the parameter space is the entire real line, but don't forget to include the Jacobian of the transform.
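A minimal sketch of such an MH step, under the assumption (my choice for this illustration, since IG(α_e, α_e − 1) needs α_e > 1) that we transform to η = log(α_e − 1) so the proposal lives on the whole real line; the Jacobian dα_e/dη = α_e − 1 enters the acceptance ratio on the log scale. The v's are simulated here.

```python
import math
import random

random.seed(4)

# Simulated "data": t subclass scalers v_l ~ IG(alpha_true, alpha_true - 1)
alpha_true, t = 10.0, 200
v = [1.0 / random.gammavariate(alpha_true, 1.0 / (alpha_true - 1.0))
     for _ in range(t)]
sum_log_v = sum(math.log(x) for x in v)
sum_inv_v = sum(1.0 / x for x in v)

def log_post(alpha):
    """log of prod_l IG(v_l | alpha, alpha - 1) times p(alpha) = (1+alpha)^-2."""
    return (t * (alpha * math.log(alpha - 1.0) - math.lgamma(alpha))
            - (alpha + 1.0) * sum_log_v
            - (alpha - 1.0) * sum_inv_v
            - 2.0 * math.log(1.0 + alpha))

eta = 0.0                      # eta = log(alpha - 1); start at alpha = 2
draws = []
for _ in range(20_000):
    eta_new = eta + random.gauss(0.0, 0.3)   # normal random-walk proposal
    # log-Jacobian of alpha = exp(eta) + 1 is eta itself
    log_ratio = (log_post(math.exp(eta_new) + 1.0) + eta_new
                 - log_post(math.exp(eta) + 1.0) - eta)
    if math.log(random.random()) < log_ratio:
        eta = eta_new
    draws.append(math.exp(eta) + 1.0)

post_mean = sum(draws[5_000:]) / len(draws[5_000:])
```

With t = 200 subclasses the posterior concentrates near the simulating value, so the posterior mean lands in the vicinity of 10.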
Small simulation study
• Two different levels of heterogeneity:
  – α_e = 5 vs. α_e = 15, i.e., CV(v_l | α_e) = 1/√(α_e − 2) ≈ 0.58 vs. 0.28.
  – σ²_e = 1.
• Two different average random subclass sizes:
  – n_e = 10 vs. n_e = 30.
  – 20 subclasses (habitats) in total.
• Also modeled fixed effects:
  – Sex (2 levels) for location and dispersion (g_1 = 2, g_2 = 1).
• Additional set of random effects:
  – 30 levels (e.g., sires) cross-classified with habitats.
PROC MIXED code
• “Fixed” effects models for residual variances:
  – REML estimates of “herd” variances expressed relative to the average.

proc mixed data=phenotype;
  class sireid habitatid sexid;
  model y = sexid;
  random intercept / subject = habitatid;
  random intercept / subject = sireid;
  repeated / local = exp(sexid habitatid);
  ods output covparms=covparms;
run;

• Models σ²_{e_kl} = σ²_e g_k v_l; k = 1, 2, …, s; l = 1, 2, …, t, but treats v_l as a fixed effect.
MCMC analyses (code available online): posterior summaries of α_e.

α_e = 15; n_e = 10:
  Mean    Median  Std Dev  1st Pctl  99th Pctl
  58.84   20.36   99.72    3.755     562.6

α_e = 5; n_e = 10:
  Mean    Median  Std Dev  1st Pctl  99th Pctl
  4.531   3.416   3.428    2.073     22.24

α_e = 5; n_e = 30:
  Mean    Median  Std Dev  1st Pctl  99th Pctl
  3.683   3.382   1.302    2.081     8.006

α_e = 15; n_e = 30:
  Mean    Median  Std Dev  1st Pctl  99th Pctl
  67.24   41.25   85.30    7.918     487.5
[Figure: MCMC (o) and REML (•) estimates of subclass residual variances plotted against the truth (v_l) for each of the four scenarios (α_e = 15, n_e = 10; α_e = 5, n_e = 10; α_e = 5, n_e = 30; α_e = 15, n_e = 30), spanning high- to low-shrinkage situations.]
Heterogeneous variances for ordinal categorical data
• Suppose we had a situation where residual variances were heterogeneous on the underlying latent scale, i.e., a greater frequency of extreme vs. intermediate categories in some subclasses.

[Figure: densities of liability for Herd 1, Herd 2, and Herd 3, illustrating different latent-scale variances.]
Heterogeneous variances for ordinal categorical data?
• On the liability scale:
  ℓ = Xβ + Zu + e,  e ~ N(0, R(ξ)),
  so that p(ℓ | β, u, ξ) = N(ℓ | Xβ + Zu, R(ξ)).
• R(ξ) has a certain “heteroskedastic” specification.
• ξ determines the nature of the heterogeneous variances.
Cumulative probit mixed model (CPMM)
• For CPMM, the liability ℓ_i maps to Y_i:
  Y_i = 1 if ℓ_i < τ_1,
  Y_i = 2 if τ_1 ≤ ℓ_i < τ_2,
  ⁞
  Y_i = C if ℓ_i ≥ τ_{C−1};
  so that
  p(y | ℓ, τ) = ∏_{k=1}^{s} ∏_{l=1}^{t} ∏_{i=1}^{n_kl} ∏_{j=1}^{C} [1(τ_{j−1} ≤ ℓ_ikl < τ_j)]^{1(y_ikl = j)}.
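The threshold mapping is just a binning rule; a sketch with illustrative thresholds τ_1 = −1, τ_2 = 1.5 (so C = 3):

```python
import bisect

tau = [-1.0, 1.5]   # thresholds tau_1 < tau_2, so C = 3 categories

def category(liability):
    """Return j such that tau_{j-1} <= liability < tau_j (1-based category)."""
    return bisect.bisect_right(tau, liability) + 1

cats = [category(x) for x in (-2.0, 0.0, 3.0)]   # -> [1, 2, 3]
```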
Modeling Heterogeneous Variances in CPMM
• Suppose e = (e′_11, e′_12, …, e′_st)′, with
  e_kl ~ N(0, R(ξ) = I_{n_kl} σ²_{e_kl}) and
  σ²_{e_kl} = σ²_e g_k v_l;  k = 1, 2, …, s;  l = 1, 2, …, t,
  – with σ²_e as a “fixed” reference residual variance,
  – g_k > 0 the kth fixed scaling effect,
  – v_l > 0 the lth random scaling effect,
  – all other priors the same as with the LMM.
Joint Posterior Density in CPMM
• CPMM:
  p(ℓ, τ, β, u, γ, v, φ, α_e, σ²_e | y)
    ∝ p(y | ℓ, τ) p(ℓ | β, u, γ, v, σ²_e) p(β) p(u | φ) p(φ)
      × [∏_{k=1}^{s} p(g_k)] [∏_{l=1}^{t} p(v_l | α_e)] p(α_e) p(σ²_e)
Another small simulation study
• Two different levels of heterogeneity:
  – α_e = 5 vs. α_e = 15.
• Average random subclass size: n_e = 30.
  – 20 subclasses (habitats) in total.
• Also modeled fixed effects:
  – Sex (2 levels) for location and dispersion.
• Additional set of random effects:
  – 30 levels (e.g., sires) cross-classified with habitats.
• Thresholds: τ_1 = −1, τ_2 = 1.5.
α_e = 15; n_e = 30 (ESS = 391):
  Mean    Median  Std Dev  1st Pctl  99th Pctl
  49.44   23.21   75.31    5.018     404.7

α_e = 5; n_e = 30 (ESS = 1422):
  Mean    Median  Std Dev  1st Pctl  99th Pctl
  5.018   4.344   2.118    2.125     11.56
[Figure: posterior means of subclass residual variances vs. truth (v_l), for α_e = 5, n_e = 30 and α_e = 15, n_e = 30.]
• No PROC GLIMMIX counterpart.
• Another alternative: heterogeneous thresholds!!! (Varona and Hernandez, 2006)
Additional extensions
• PhD work by Fernando Cardoso:
  – Heterogeneous residual variances as functions of multiple fixed effects and multiple random effects, e.g.,
    σ²_{e_j} = (σ²_e ∏_{k=1}^{K} g_k^{p_jk} ∏_{l=1}^{L} v_l^{q_jl}) / w_j
  – Heterogeneous t-error (Cardoso et al., 2007):
    w_j | ν ~ Gamma(ν/2, ν/2), so that the t-error is outlier-robust.
  – Helps separate the effects of outliers from those of high-variance subclasses.
  – Other candidates for the distribution of w_j lead to alternative heavy-tailed specifications (Rosa et al., 2004).
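The Gamma scale-mixture construction of t-error can be checked directly: with w_j ~ Gamma(ν/2, ν/2) and e_j | w_j ~ N(0, σ²_e/w_j), marginally e_j/σ_e is Student-t with ν degrees of freedom (variance ν/(ν − 2) for ν > 2). A sketch:

```python
import random
import statistics

random.seed(6)

nu, s2_e = 6.0, 1.0

def draw_e():
    """One heavy-tailed residual: e | w ~ N(0, s2_e/w), w ~ Gamma(nu/2, nu/2)."""
    w = random.gammavariate(nu / 2.0, 2.0 / nu)   # (shape, scale); rate nu/2
    return random.gauss(0.0, (s2_e / w) ** 0.5)

e = [draw_e() for _ in range(300_000)]
emp_var = statistics.pvariance(e)   # theory for t_6: nu/(nu - 2) = 1.5
```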
Posterior densities of breed-group heritabilities in multibreed Brazilian cattle (Fernando Cardoso)

[Figure, panel a: Gaussian homoskedastic model; posterior densities of heritability (0 to 0.5) for Nelore, Hereford, F1, and A38, based on homogeneous residual variance (Cardoso and Tempelman, 2004).]
[Figure, panel c: Gaussian heteroskedastic model; posterior densities of heritability (0 to 0.5) for the same breed groups, based on heterogeneous residual variances (Fixed: breed additive and dominance, sex; Random: CG) (Cardoso et al., 2005).]
• Some of the most variable herds were exclusively Herefords.
• Estimated CV of CG-specific σ²_e: 0.72 ± 0.06.
• F1 σ²_e = 0.70 ± 0.16 × purebred σ²_e.
Heterogeneous G-side scale parameters
• Could be accommodated in a similar manner.
• In fact, the borrowing of information across subclasses in estimating subclass-specific random effects variances is even more critical.
  – Low information per subclass? REML estimates will converge to zero.
Heterogeneous bivariate G-side and R-side inferences!
• Bello et al. (2010, 2012) investigated the herd-level and cow-level relationship between 305-day milk production and calving interval (CI) as a function of various factors:
  milk_j = (milk fixed effects) + z′_{milk,j} u_milk + e_{milk,j}
  CI_j = (CI fixed effects) + z′_{CI,j} u_CI + e_{CI,j}
  – u_milk, u_CI: CG (herd-year) effects.
  – e_{milk,j}, e_{CI,j}: residual (cow) effects.
Herd-Specific and Cow-Specific (Co)variances

• Cow j:
  Var(e_j) = [ σ²_{e_milk,j}      σ_{e_milk,CI,j} ]
             [ σ_{e_milk,CI,j}    σ²_{e_CI,j}     ]
• Herd k:
  Var(u_k) = [ σ²_{u_milk,k}      σ_{u_milk,CI,k} ]
             [ σ_{u_milk,CI,k}    σ²_{u_CI,k}     ]
• Let b_{u_k} = σ_{u_milk,CI,k} / σ²_{u_milk,k} and b_{e_j} = σ_{e_milk,CI,j} / σ²_{e_milk,j}.
Rewrite this

• Cow j:
  σ_{e_milk,CI,j} = b_{e_j} σ²_{e_milk,j}
  σ²_{e_CI,j} = σ²_{e_CI|milk,j} + b²_{e_j} σ²_{e_milk,j}
• Herd k:
  σ_{u_milk,CI,k} = b_{u_k} σ²_{u_milk,k}
  σ²_{u_CI,k} = σ²_{u_CI|milk,k} + b²_{u_k} σ²_{u_milk,k}
• Model each of these terms (the regressions b_{e_j} and b_{u_k} and the variances) as functions of fixed and random effects, in addition to the classical β and u!
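This rewrite is the regression (square-root-free Cholesky) factorization of a 2×2 covariance matrix; a sketch with arbitrary illustrative numbers shows that the factorization reconstructs the original (co)variances exactly:

```python
# Illustrative (made-up) cow-level (co)variances
var_milk = 4.0        # sigma^2_{e_milk,j}
cov_milk_ci = 1.2     # sigma_{e_milk,CI,j}
var_ci = 2.0          # sigma^2_{e_CI,j}

b = cov_milk_ci / var_milk                    # regression of CI on milk
var_ci_given_milk = var_ci - b * cov_milk_ci  # conditional (residual) variance

# Reconstruct the original parameterization from (var_milk, b, var_ci_given_milk)
recon_cov = b * var_milk
recon_var_ci = var_ci_given_milk + b ** 2 * var_milk
```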
bST effect on Herd-Level Association (b_{u_k}) between Milk Yield and Calving Interval

[Figure: bar chart of days of CI per 100 kg milk yield by % of herd on bST supplementation: 0%: 0.01a; <50%: 0.07a; ≥50%: −1.37b. Means with different letters differ, P < 0.0001.]
• bST: bovine somatotropin.
Number of times milking/day on Cow-Level Association (b_{e_j}) between Milk Yield and Calving Interval

[Figure: bar chart of days of CI per 100 kg milk yield by daily milking frequency: 2X: 0.57a; 3+X: 0.45b. Means with different letters differ, P < 0.0001.]
• Overall antagonism: 0.51 ± 0.01 days longer CI per 100 kg increase in cumulative 305-d milk yield.
Variability between Herds for b_{e_j} (Random effects)
• DIC_M0 − DIC_M1 = 243.
• Estimated variance of the herd-year-specific regressions: 0.030 ± 0.005.
• Expected range between extreme herd-years (Ott and Longnecker, 2001): ±2 SD around the overall 0.51, i.e., about 0.16 to 0.86 days of CI per 100 kg herd milk yield, a range of ≈ 0.7 d/100 kg.
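The "expected range" is empirical-rule arithmetic: roughly 95% of herd-year slopes fall within ±2 SD of the overall regression of 0.51, with SD = √0.030:

```python
import math

overall_b = 0.51   # overall days of CI per 100 kg milk (posterior mean)
var_b = 0.030      # estimated variance of herd-year-specific regressions

sd_b = math.sqrt(var_b)
low, high = overall_b - 2.0 * sd_b, overall_b + 2.0 * sd_b
expected_range = high - low   # = 4 * sd_b, about 0.7 d per 100 kg
```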
Whole Genome Selection (WGS)
• Model: y_i = (fixed effects, e.g., age, parity) + z′_i g + u_i + e_i;  i = 1, 2, …, n, where
  – y_i: phenotype;
  – z_i = (z_i1, z_i2, …, z_im)′: genotypes;
  – g = (g_1, g_2, …, g_m)′: SNP allelic substitution effects, capturing LD (linkage disequilibrium);
  – u = {u_i} ~ N(0, A σ²_u): polygenic effects;
  – e = {e_i} ~ N(0, I σ²_e): residual effects.
• Note that m >>> n.
Typical WGS specifications
• Random effects specification on g (Meuwissen et al., 2001):
  – BLUP: g ~ N(0, I σ²_g).
  – BayesA/B: g ~ N(0, diag{σ²_{g_j}}), with
    σ²_{g_j} ~ scaled inverse chi-square(ν, S²) with probability (1 − π), and
    σ²_{g_j} = 0 with probability π.
  – BayesA = BayesB with π = 0.
  – “Random effects/Bayes” modeling allows m >> n:
    • borrowing of information across genes.
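A sketch of drawing SNP effects under this BayesB-style spike-slab prior (the hyperparameter values π, ν, S² below are illustrative assumptions; the scaled inverse chi-square is drawn as a reciprocal Gamma):

```python
import random

random.seed(9)

pi = 0.9            # assumed probability that a SNP has zero effect
nu, S2 = 4.2, 0.01  # illustrative slab hyperparameters

def draw_g():
    """One draw of a SNP effect g_j under the BayesB prior."""
    if random.random() < pi:
        return 0.0
    # scaled inverse chi-square(nu, S2) == 1/Gamma(shape=nu/2, rate=nu*S2/2)
    s2_gj = 1.0 / random.gammavariate(nu / 2.0, 2.0 / (nu * S2))
    return random.gauss(0.0, s2_gj ** 0.5)

g = [draw_g() for _ in range(50_000)]
frac_zero = sum(gj == 0.0 for gj in g) / len(g)  # should be close to pi
```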
First-order antedependence specifications (Yang and Tempelman, 2012)
• Instead of independence, specify first-order antedependence on the SNP marker genetic effects:
  SNP 1: g_1 = δ_1,
  SNP 2: g_2 = t_21 g_1 + δ_2,
  SNP 3: g_3 = t_32 g_2 + δ_3,
  ⁞
  SNP m: g_m = t_{m,m−1} g_{m−1} + δ_m.
• Priors:
  – t_{j,j−1} ~ N(μ_t, σ²_t).
  – Ante-BayesB: δ ~ N(0, diag{σ²_{δ_j}}), with σ²_{δ_j} ~ scaled inverse chi-square(ν, S²) with probability (1 − π) and σ²_{δ_j} = 0 with probability π.
  – Ante-BayesA = Ante-BayesB with π = 0.
• The t_{j,j−1} induce nonzero correlations between effects of adjacent SNPs, with corr(g_j, g_{j−1}) a function f(t_{j,j−1}).
• Random effects modeling facilitates borrowing of information across SNP intervals.
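The antedependence construction is easy to simulate; the sketch below uses a constant t (an illustrative simplification of the SNP-specific t_{j,j−1}) and standard normal δ's, and checks the induced correlation between adjacent SNP effects, t/√(1 + t²):

```python
import random
import statistics

random.seed(10)

t_coef = 0.8   # illustrative constant antedependence parameter
n_rep = 100_000

g1, g2 = [], []
for _ in range(n_rep):
    a = random.gauss(0.0, 1.0)                # g_1 = delta_1
    b = t_coef * a + random.gauss(0.0, 1.0)   # g_2 = t * g_1 + delta_2
    g1.append(a)
    g2.append(b)

m1, m2 = statistics.fmean(g1), statistics.fmean(g2)
cov = sum((x - m1) * (y - m2) for x, y in zip(g1, g2)) / n_rep
corr_12 = cov / (statistics.pstdev(g1) * statistics.pstdev(g2))
# theory: corr(g1, g2) = t / sqrt(1 + t^2) = 0.8 / sqrt(1.64)
```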
Results from a simulation study
• The advantage of Ante-BayesA/B over conventional BayesA/B increases with increasing marker density (LD = linkage disequilibrium).

[Figure: accuracy of genomic EBV (0.70 to 1.00) vs. LD level (r², 0.18 to 0.32) for BayesA, Ante-BayesA, BayesB, and Ante-BayesB; P < .001 for BayesA/B vs. Ante-BayesA/B.]
Other examples of multi-stage hierarchical modeling?
• Spatial variability in agronomy using t-error (Besag and Higdon, 1999).
• Ecology (Cressie et al., 2009).
• Conceptually, one could model heterogeneous and spatially correlated overdispersion parameters in Poisson/binomial GLMMs as well!
What I haven’t covered in this workshop
• Model choice criteria:
  – Bayes factors (generally too challenging to compute).
  – DIC (deviance information criterion).
• Bayesian model averaging:
  – Advantage over conditioning on one model (e.g., for multiple regression involving many covariates).
• Posterior predictive checks:
  – Great for diagnostics.
• Residual diagnostics based on latent residuals for GLMM (Johnson and Albert, 1999).
Some closing comments/opinions
• Merit of Bayesian inference:
  – Marginal for LMM with classical assumptions.
    • GLS with REML seems to work fine.
  – Of greater benefit for GLMM.
    • Especially binary data with complex error structures.
  – Greatest benefit for multi-stage hierarchical models.
    • Larger datasets are nevertheless required than with more classical (homogeneous) assumptions.
Implications
• Increased programming capabilities/skills are needed.
  – Cloud/cluster computing wouldn't hurt.
• Don't go in blind with canned Bayesian software.
  – Watch the diagnostics (e.g., trace plots) like a hawk!
• Don't go on autopilot.
  – WinBUGS/PROC MCMC work nicely for the simpler stuff.
  – Highly hierarchical models require statistical/algorithmic insights; do recognize limitations in parameter identifiability (Cressie et al., 2009).
National Needs PhD Fellowships, Michigan State University
Focus: integrated training in quantitative, statistical, and molecular genetics, and breeding of food animals.
Features:
• Research in animal genetics/genomics with a collaborative faculty team
• Industry internship experience
• Public policy internship in Washington, DC
• Statistical consulting center experience
• Teaching or Extension/outreach learning opportunities
• Optional affiliation with inter-departmental programs in Quantitative Biology, Genetics, and others
Faculty team: C. Ernst, J. Steibel, R. Tempelman, R. Bates, H. Cheng, T. Brown, B. Alston-Mills.
Eligibility is open to citizens and nationals of the US. Women and underrepresented groups are encouraged to apply.
Thank You!!!
• Any more questions???
http://actuary-info.blogspot.com/2011/05/homo-actuarius-bayesianes.html