34
2019 SISG Bayesian Statistics for Genetics R Notes: Multinomial Sampling Jon Wakefield Departments of Statistics and Biostatistics, University of Washington 2019-07-15

2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

2019 SISG Bayesian Statistics for GeneticsR Notes: Multinomial Sampling

Jon WakefieldDepartments of Statistics and Biostatistics, University of

Washington

2019-07-15

Page 2: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Hardy-Weinberg via Fisher’s exact test

library(hwde)n1 <- 88n2 <- 10n3 <- 2exact <- hwexact(n1, n2, n3)exact## [1] 0.06544427

We obtain a p-value of 0.07

Page 3: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Displaying samples from a dirichlet(1,1,1)

library(VGAM) # To access the rdiric functionnsim <- 1000q <- rdiric(nsim, c(1, 1, 1))# Univariate marginal representationspar(mfrow = c(2, 3))hist(q[, 1], xlab = expression(q[1]), main = "", cex.lab = 1.5,

xlim = c(0, 1))hist(q[, 2], xlab = expression(q[2]), main = "", cex.lab = 1.5,

xlim = c(0, 1))hist(q[, 3], xlab = expression(q[3]), main = "", cex.lab = 1.5,

xlim = c(0, 1))# Bivariate representationsplot(q[, 1], q[, 2], xlim = c(0, 1), ylim = c(0, 1),

xlab = expression(q[1]), ylab = expression(q[2]),cex.lab = 1.5)

plot(q[, 1], q[, 3], xlim = c(0, 1), ylim = c(0, 1),xlab = expression(q[1]), ylab = expression(q[3]),cex.lab = 1.5)

plot(q[, 2], q[, 3], xlim = c(0, 1), ylim = c(0, 1),xlab = expression(q[2]), ylab = expression(q[3]),cex.lab = 1.5)

Page 4: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Displaying samples from a dirichlet(1,1,1)

q1

Fre

quen

cy

0.0 0.4 0.8

050

150

q2

Fre

quen

cy

0.0 0.4 0.8

050

150

q3

Fre

quen

cy

0.0 0.4 0.8

050

150

0.0 0.4 0.8

0.0

0.4

0.8

q1

q 2

0.0 0.4 0.8

0.0

0.4

0.8

q1

q 3

0.0 0.4 0.80.

00.

40.

8q2

q 3

Page 5: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Bayes analysis of (88,10,2) data

n1 <- 88n2 <- 10n3 <- 2p1 <- 88/100 + 0.5 * 10/100 # Estimated allele frequenciesp2 <- 2/100 + 0.5 * 10/100 # for A1 and A2v1 <- v2 <- v3 <- 1nsim <- 5000q <- rdiric(nsim, c(n1 + v1, n2 + v2, n3 + v3)) # The posteriorpar(mfrow = c(1, 3))hist(q[, 1], xlab = expression(q[1]), main = expression(paste("Posterior for ",

q[1])))abline(v = n1/(n1 + n2 + n3), col = "red")abline(v = p1^2, col = "blue")hist(q[, 2], xlab = expression(q[2]), main = expression(paste("Posterior for ",

q[2])))abline(v = n2/(n1 + n2 + n3), col = "red")abline(v = 2 * p1 * p2, col = "blue")hist(q[, 3], xlab = expression(q[3]), main = expression(paste("Posterior for ",

q[3])))abline(v = n3/(n1 + n2 + n3), col = "red")abline(v = p2^2, col = "blue")

Page 6: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Bayes analysis of (88,10,2) data

Univariate posterior distributions: blue lines are the MLEs in the fullmodel, red lines under the HWE model

Posterior for q1

q1

Fre

quen

cy

0.75 0.90

040

010

00

Posterior for q2

q2

Fre

quen

cy

0.05 0.20

040

010

00

Posterior for q3

q3

Fre

quen

cy

0.00 0.08

040

010

00

Page 7: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Bayes analysis of (88,10,2) data

par(mfrow = c(1, 3))plot(q[, 2] ~ q[, 1], xlab = expression(q[1]), ylab = expression(q[2]),

col = "grey")points(n1/(n1 + n2 + n3), n2/(n1 + n2 + n3), col = "red",

pch = 20, cex = 2)points(p1^2, 2 * p1 * p2, col = "blue", pch = 4, cex = 2)legend("topright", legend = c("MLE", "HWE"), col = c("red",

"blue"), pch = c(20, 20), bty = "n")plot(q[, 3] ~ q[, 1], xlab = expression(q[1]), ylab = expression(q[3]),

col = "grey")points(n1/(n1 + n2 + n3), n3/(n1 + n2 + n3), col = "red",

pch = 20, cex = 2)points(p1^2, p2^2, col = "blue", pch = 4, cex = 2)plot(q[, 3] ~ q[, 2], xlab = expression(q[2]), ylab = expression(q[3]),

col = "grey")points(n2/(n1 + n2 + n3), n3/(n1 + n2 + n3), col = "red",

pch = 20, cex = 2)points(2 * p1 * p2, p2^2, col = "blue", pch = 4, cex = 2)

Page 8: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Bayes analysis of (88,10,2) data

Bivariate posterior distributions: blue lines are the MLEs in the fullmodel, red lines under the HWE model

0.75 0.85 0.95

0.05

0.15

0.25

q1

q 2

MLEHWE

0.75 0.85 0.95

0.00

0.04

0.08

0.12

q1

q 3

0.05 0.15 0.25

0.00

0.04

0.08

0.12

q2

q 3

Page 9: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Functions of interest: implied priorsWe assume a “dirichlet(1,1,1)“‘ distribution

v1 <- v2 <- v3 <- 1nsim <- 2000samps <- rdiric(nsim, c(v1, v2, v3))q1 <- samps[, 1]q2 <- samps[, 2]q3 <- samps[, 3]p1 <- q1 + q2/2p2 <- q3 + q2/2f <- (q1 - p1^2)/(p1 * p2)D <- q1 - p1^2psi <- q2^2/(p1 * p2)## Functions of interestcat("Prior prob f>0: ", sum(f > 0)/nsim, "\n")## Prior prob f>0: 0.672cat("Prior prob D>0: ", sum(D > 0)/nsim, "\n")## Prior prob D>0: 0.672

Page 10: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Functions of interest

Examine prior summaries for different functions of interest.

par(mfrow = c(1, 3))hist(q1, main = "", xlab = expression(q[1]), cex.lab = 1.5)hist(q2, main = "", xlab = expression(q[2]), cex.lab = 1.5)hist(q3, main = "", xlab = expression(q[3]), cex.lab = 1.5)par(mfrow = c(1, 3))hist(p1, main = "", xlab = expression(p[1]), cex.lab = 1.5)hist(p2, main = "", xlab = expression(p[2]), cex.lab = 1.5)hist(D, main = "", xlab = expression(D), cex.lab = 1.5)par(mfrow = c(1, 2))hist(f, main = "", xlab = "f", cex.lab = 1.5)hist(psi, main = "", xlab = expression(psi), cex.lab = 1.5)

Page 11: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Functions of interest: prior margins on q1, q2, q3.

q1

Fre

quen

cy

0.0 0.4 0.8

010

020

030

040

0

q2

Fre

quen

cy

0.0 0.4 0.8

010

020

030

040

0

q3F

requ

ency

0.0 0.4 0.8

010

020

030

0

Page 12: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Functions of interest: priors on p1, p2,D

p1

Fre

quen

cy

0.0 0.4 0.8

010

020

030

0

p2

Fre

quen

cy

0.0 0.4 0.8

010

020

030

0

DF

requ

ency

−0.2 0.1

050

100

150

200

250

300

350

Page 13: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Functions of interest: priors on f , ψ.

f

Fre

quen

cy

−1.0 0.0 1.0

010

020

030

0

ψF

requ

ency

0 1 2 3 4

040

080

0

Page 14: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Inference for f

The MLE is f̂ = 0.23 with asymptotic standard error 0.17.

Hence, a 95% confidence interval is

(0.23 − 1.96 × 0.17, 0.23 + 1.96 × 0.17) = (−0.1032, 0.5632).

The posterior median and 95% credible interval are given below.

# Bayesian posterior quantiles arequantile(f, c(0.025, 0.5, 0.975))## 2.5% 50% 97.5%## -0.7005448 0.2681385 0.9550541

Subsequent figures give posterior distributions on functions of interest.

Page 15: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Dirichlet Posterior Distribution

q1

Fre

quen

cy

0.75 0.90

010

020

030

040

050

0

q2

Fre

quen

cy

0.05 0.20

010

020

030

040

050

0

q3F

requ

ency

0.00 0.06 0.12

010

020

030

040

050

0

Page 16: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Posterior summaries

p1

Fre

quen

cy

0.84 0.92

010

020

030

0

p2

Fre

quen

cy

0.04 0.12

010

020

030

0

DF

requ

ency

0.00 0.06

010

020

030

040

050

060

0

Page 17: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Posterior summaries

f

Fre

quen

cy

0.0 0.4

050

150

250

ψF

requ

ency

0.0 0.2 0.4

020

040

060

0

Page 18: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Non-conjugate analysis

library(HWEBayes)nsim <- 50000diricvec <- c(1, 1)Prior <- LambdaOptim(nsim, diricvec, 0, 0.5, 0.5, 0.95,

init = c(-1, log(1)))## lambda mu and lambda sd = -1.255008 1.034792nvec <- c(88, 10, 2)nsamp <- 10000Check <- SinglefPrior(nsamp, diricvec, Prior$lambdamu,

Prior$lambdasd)quantile(Check$f, p = c(0.5, 0.95))## 50% 95%## -0.0004591304 0.5012976488

Page 19: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Non-conjugate analysis

Out <- SinglefReject(nsamp, diricvec, Prior$lambdamu,Prior$lambdasd, nvec)

## Allele Marginal prob: 1 0.93## Allele Marginal prob: 2 0.07## Convergence = 0 (0 is successful convergence)## Probs and f at max and fmin:## 0.9299904 0.07000956 0.2318971 -0.07527987## nsim norm constant (se) 95% interval:## 10000 0.001026753 ( 7.392827e-06 ) 0.001012263 0.001041243quantile(Out$fsamp)## 0% 25% 50% 75% 100%## -0.0776904 0.1215269 0.2016118 0.2947268 0.7232500sum(Out$fsamp > 0)/nsamp## [1] 0.9806quantile(Out$fsamp)## 0% 25% 50% 75% 100%## -0.0776904 0.1215269 0.2016118 0.2947268 0.7232500

Page 20: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Non-conjugate analysis

0.0 0.4 0.8

−1.

00.

5

p1

f

p1

Fre

quen

cy

0.0 0.4 0.8

030

0

f

Fre

quen

cy

−1.0 0.0 1.0

080

0

0.0 0.4 0.8

−1.

00.

5

p1

f

p1

Fre

quen

cy

0.0 0.4 0.8

015

00

fF

requ

ency

−1.0 0.0 1.0

010

00

Page 21: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Non-conjugate analysis

bvec0 <- c(1, 1)bvec1 <- c(1, 1, 1)PrnH0 <- DirichNormHWE(nvec, bvec0)PrnH1sat <- DirichNormSat(nvec, bvec1)BFH0H1sat <- PrnH0/PrnH1satcat("H0 norm = ", PrnH0, "\n")## H0 norm = 0.0002993684cat("H1 (sat) norm = ", PrnH1sat, "\n")## H1 (sat) norm = 0.0001941371cat("Conjugate Bayes factor = ", BFH0H1sat, "\n")## Conjugate Bayes factor = 1.542047BFH0H1NonConj <- PrnH0/Out$PrnH1cat("Non-Conjugate Bayes factor = ", BFH0H1NonConj,

"\n")## Non-Conjugate Bayes factor = 0.2915681

Page 22: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Non-conjugate analysis: Increasing Sample Sizes by 2exact2 <- hwexact(2 * n1, 2 * n2, 2 * n3)exact2## [1] 0.008886848nvec2 <- 2 * nvecPrnH02 <- DirichNormHWE(nvec2, bvec0)PrnH1sat2 <- DirichNormSat(nvec2, bvec1)BFH0H1sat2 <- PrnH02/PrnH1sat2cat("2x Conjugate Bayes factor = ", BFH0H1sat2, "\n")## 2x Conjugate Bayes factor = 0.4033526Out2 <- SinglefReject(nsamp, diricvec, Prior$lambdamu,

Prior$lambdasd, nvec2)## Allele Marginal prob: 1 0.93## Allele Marginal prob: 2 0.07## Convergence = 0 (0 is successful convergence)## Probs and f at max and fmin:## 0.9299904 0.07000956 0.2318971 -0.07527987## nsim norm constant (se) 95% interval:## 10000 0.0002860046 ( 2.047767e-06 ) 0.000281991 0.0002900182sum(Out2$fsamp > 0)/nsamp## [1] 0.9978BFH0H1NonConj2 <- PrnH02/Out2$PrnH1cat("2x Non-Conjugate Bayes factor = ", BFH0H1NonConj2,

"\n")## 2x Non-Conjugate Bayes factor = 0.06946953

Page 23: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Non-conjugate analysis: Increasing Sample Sizes by 5exact5 <- hwexact(5 * n1, 5 * n2, 5 * n3)exact5## [1] 3.583063e-05nvec5 <- 5 * nvecPrnH05 <- DirichNormHWE(nvec5, bvec0)PrnH1sat5 <- DirichNormSat(nvec5, bvec1)BFH0H1sat5 <- PrnH05/PrnH1sat5cat("5x Conjugate Bayes factor = ", BFH0H1sat5, "\n")## 5x Conjugate Bayes factor = 0.003945016Out5 <- SinglefReject(nsamp, diricvec, Prior$lambdamu,

Prior$lambdasd, nvec5)## Allele Marginal prob: 1 0.93## Allele Marginal prob: 2 0.07## Convergence = 0 (0 is successful convergence)## Probs and f at max and fmin:## 0.9299904 0.07000956 0.2318971 -0.07527987## nsim norm constant (se) 95% interval:## 10000 4.874536e-05 ( 3.465192e-07 ) 4.806618e-05 4.942454e-05sum(Out5$fsamp > 0)/nsamp## [1] 1BFH0H1NonConj5 <- PrnH05/Out5$PrnH1cat("5x Non-Conjugate Bayes factor = ", BFH0H1NonConj5,

"\n")## 5x Non-Conjugate Bayes factor = 0.0006435822

Page 24: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

Non-conjugate analysis: Increasing Sample Sizes by 10exact10 <- hwexact(10 * n1, 10 * n2, 10 * n3)exact10## [1] 5.305573e-09nvec10 <- 10 * nvecPrnH010 <- DirichNormHWE(nvec10, bvec0)PrnH1sat10 <- DirichNormSat(nvec10, bvec1)BFH0H1sat10 <- PrnH010/PrnH1sat10cat("10x Conjugate Bayes factor = ", BFH0H1sat10, "\n")## 10x Conjugate Bayes factor = 1.154053e-06Out10 <- SinglefReject(nsamp, diricvec, Prior$lambdamu,

Prior$lambdasd, nvec10)## Allele Marginal prob: 1 0.93## Allele Marginal prob: 2 0.07## Convergence = 0 (0 is successful convergence)## Probs and f at max and fmin:## 0.9299904 0.07000956 0.2318971 -0.07527987## nsim norm constant (se) 95% interval:## 10000 1.265505e-05 ( 8.997236e-08 ) 1.247871e-05 1.28314e-05sum(Out10$fsamp > 0)/nsamp## [1] 1BFH0H1NonConj10 <- PrnH010/Out10$PrnH1cat("10x Non-Conjugate Bayes factor = ", BFH0H1NonConj10,

"\n")## 10x Non-Conjugate Bayes factor = 1.818402e-07

Page 25: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

HWE analysis via Stan

library(rstan)stanexample <- stan("HWEexample.stan", data = list(y = c(88,

10, 2)), iter = 2000, chains = 1, seed = 1234)

Page 26: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

HWE analysis via Stan

print(stanexample)## Inference for Stan model: HWEexample.## 1 chains, each with iter=2000; warmup=1000; thin=1;## post-warmup draws per chain=1000, total post-warmup draws=1000.#### mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat## p 0.93 0.00 0.02 0.89 0.91 0.93 0.94 0.96 375 1## theta[1] 0.86 0.00 0.03 0.79 0.83 0.86 0.88 0.92 372 1## theta[2] 0.14 0.00 0.03 0.08 0.12 0.13 0.16 0.20 369 1## theta[3] 0.01 0.00 0.00 0.00 0.00 0.01 0.01 0.01 409 1## lp__ -47.00 0.04 0.70 -48.97 -47.18 -46.74 -46.56 -46.50 395 1#### Samples were drawn using NUTS(diag_e) at Mon Jul 15 16:46:30 2019.## For each parameter, n_eff is a crude measure of effective sample size,## and Rhat is the potential scale reduction factor on split chains (at## convergence, Rhat=1).

Page 27: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

HWE analysis via Stantraceplot(stanexample)

theta[2] theta[3]

p theta[1]

1000 1250 1500 1750 2000 1000 1250 1500 1750 2000

1000 1250 1500 1750 2000 1000 1250 1500 1750 20000.75

0.80

0.85

0.90

0.95

0.000

0.005

0.010

0.015

0.860.880.900.920.940.960.98

0.05

0.10

0.15

0.20

chain

1

Page 28: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

HWE analysis via Stan

stan_hist(stanexample)

theta[2] theta[3]

p theta[1]

0.05 0.10 0.15 0.20 0.000 0.005 0.010 0.015

0.875 0.900 0.925 0.950 0.975 0.75 0.80 0.85 0.90 0.95

Page 29: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

HWE analysis via Stan

stan_dens(stanexample)

theta[2] theta[3]

p theta[1]

0.05 0.10 0.15 0.20 0.000 0.005 0.010 0.015

0.86 0.88 0.90 0.92 0.94 0.96 0.980.75 0.80 0.85 0.90 0.95

Page 30: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

HWE analysis via Stan

Now run a second example with

stanexample2 <- stan("HWEexampleSaturated.stan",data = list(y = c(88, 10, 2), alpha = c(1,

1, 1)), iter = 1000, chains = 3,seed = 1234)

Page 31: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

HWE analysis via Stansummary(stanexample2)## $summary## mean se_mean sd 2.5% 25%## theta[1] 0.8633544 0.0010045618 0.03498693 0.789542970 0.83968277## theta[2] 0.1076363 0.0009255907 0.03169066 0.054685539 0.08424069## theta[3] 0.0290093 0.0004488522 0.01582794 0.007154844 0.01693602## lp__ -49.2615358 0.0373789415 1.05505684 -52.158783402 -49.64434800## 50% 75% 97.5% n_eff Rhat## theta[1] 0.8656829 0.88833494 0.92408452 1212.9929 0.9998235## theta[2] 0.1051841 0.12731235 0.17729402 1172.2619 0.9990886## theta[3] 0.0262926 0.03791406 0.06671865 1243.4900 1.0012594## lp__ -48.9534534 -48.52920059 -48.24461367 796.7053 1.0050182#### $c_summary## , , chains = chain:1#### stats## parameter mean sd 2.5% 25% 50%## theta[1] 0.86397143 0.03550858 0.78356519 0.84219709 0.86557530## theta[2] 0.10690626 0.03184192 0.05431209 0.08331108 0.10564239## theta[3] 0.02912231 0.01600151 0.00724567 0.01756700 0.02570899## lp__ -49.25777274 1.07351457 -52.21710677 -49.60770560 -48.92836748## stats## parameter 75% 97.5%## theta[1] 0.88867438 0.92270225## theta[2] 0.12656615 0.17568097## theta[3] 0.03725044 0.07123144## lp__ -48.53207886 -48.24217761#### , , chains = chain:2#### stats## parameter mean sd 2.5% 25% 50%## theta[1] 0.86260902 0.03325881 0.795550682 0.84040847 0.86702856## theta[2] 0.10908913 0.03106211 0.056939754 0.08683016 0.10513279## theta[3] 0.02830185 0.01426715 0.007800011 0.01748461 0.02599113## lp__ -49.13710842 0.91100176 -51.307164846 -49.53318322 -48.86485832## stats## parameter 75% 97.5%## theta[1] 0.8861003 0.92002478## theta[2] 0.1289670 0.17744713## theta[3] 0.0365145 0.06167385## lp__ -48.4824981 -48.25320063#### , , chains = chain:3#### stats## parameter mean sd 2.5% 25% 50%## theta[1] 0.86348273 0.03618286 0.789949328 0.83930741 0.8647971## theta[2] 0.10691354 0.03217118 0.051111315 0.08241574 0.1053986## theta[3] 0.02960374 0.01709115 0.006202672 0.01567480 0.0271636## lp__ -49.38972624 1.15305916 -52.341023338 -49.74036306 -49.0689327## stats## parameter 75% 97.5%## theta[1] 0.8909974 0.92966584## theta[2] 0.1264944 0.17673805## theta[3] 0.0407884 0.06752815## lp__ -48.5898718 -48.24437467

Page 32: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

HWE analysis via Stan

traceplot(stanexample2, pars = c("theta[1]"), inc_warmup = TRUE)

0.6

0.7

0.8

0.9

0 250 500 750 1000

thet

a[1]

chain

1

2

3

Page 33: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

HWE analysis via Stanplot(stanexample2, color = "green")

theta[1]

theta[2]

theta[3]

0.00 0.25 0.50 0.75 1.00

Page 34: 2019 SISG Bayesian Statistics for Genetics R Notes ...faculty.washington.edu/kenrice/sisgbayes/2019-SISG-JW-Multinomial.pdfhist(q1,main ="",xlab = expression(q[1]),cex.lab =1.5) hist(q2,main

HWE analysis via Stanstan_scat(stanexample2, pars = c("theta[1]", "theta[2]"),

color = "blue", size = 2)

0.05

0.10

0.15

0.20

0.75 0.80 0.85 0.90 0.95

theta[1]

thet

a[2]