Markov chain Monte Carlo I
Models for Socio-Environmental Data
Chris Che-Castaldo, Mary B. Collins, N. Thompson Hobbs
December 12, 2020
What is this course about?
You can understand it.
- Rules of probability
- Conditioning and independence
- Law of total probability
- Factoring joint probabilities
- Distribution theory
- Markov chain Monte Carlo
The Bayesian method
[Figure: flowchart of the Bayesian modeling process]

A. Design: use existing theory, scientific objectives, and intuition to write a deterministic model of the process and to design or choose observations.

B. Model specification: diagram the relationships between observed and unobserved quantities; write out the posterior and joint distributions using general probability notation; choose appropriate probability distributions.

C. Model implementation: write the full-conditional distributions; write an MCMC sampling algorithm, or write code for MCMC software; implement MCMC on simulated data.

D. Model evaluation and inference: posterior predictive checks; probabilistic inference from a single model; model selection and model averaging; implement MCMC on real data.
The MCMC algorithm
- Why MCMC?
- Some intuition about how it works for a single-parameter model
- MCMC for multiple-parameter models
  - Full-conditional distributions (today)
  - Gibbs sampling (today)
  - Metropolis-Hastings algorithm (optional lecture notes and reading)
- MCMC software (JAGS, today and tomorrow)
MCMC learning outcomes
1. Develop a big-picture understanding of how MCMC allows us to approximate the marginal posterior distributions of parameters and latent quantities.
2. Understand and be able to code a simple MCMC algorithm.
3. Appreciate the different methods that can be used within MCMC algorithms to make draws from the posterior distribution:
   3.1 Metropolis
   3.2 Metropolis-Hastings
   3.3 Gibbs
4. Understand the concepts of burn-in and convergence.
5. Understand and be able to write full-conditional distributions.
Remember the marginal distribution of the data
We have simple solutions for the posterior for simple models:

[φ | y] = beta(α + y, β + n − y),

where the prior parameters α and β are updated to the new parameters α + y and β + n − y.
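A quick sketch of this update in R, using the numbers from the coal-plant example later in the deck (y = 3 non-compliant plants out of n = 12) and a flat beta(1, 1) prior:

a0 <- 1; b0 <- 1                        # flat beta(1, 1) prior
y <- 3; n <- 12                         # data
a_new <- a0 + y                         # new alpha = 4
b_new <- b0 + n - y                     # new beta = 10
curve(dbeta(x, a_new, b_new), 0, 1,
      xlab = "phi", ylab = "[phi | y]") # exact posterior density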
Problems of high dimension do not have simple solutions:
[θ1, θ2, θ3, θ4, z | y, u] = (∏_{i=1}^{n} [yi | θ1, zi][ui | θ2, zi][zi | θ3, θ4] [θ1][θ2][θ3][θ4]) / (∫⋯∫ ∏_{i=1}^{n} [yi | θ1, zi][ui | θ2, zi][zi | θ3, θ4] [θ1][θ2][θ3][θ4] dz dθ1 dθ2 dθ3 dθ4)
What are we doing in MCMC?
Recall that the posterior distribution is proportional to the joint, because the marginal distribution of the data, ∫[y | θ][θ] dθ, is a constant after the data have been observed.
[θ | y] ∝ [y, θ]    (1)    (posterior ∝ joint)

[θ | y] ∝ [y | θ] [θ]    (2)    (posterior ∝ likelihood × prior)
Factoring the joint distribution into a product of probability distributions using the chain rule of probability is where we start all Bayesian modeling. The factored joint distribution provides the basis for MCMC.
What are we doing in MCMC?
[Figure: the likelihood profile [y | θ] (left) and the posterior distribution [θ | y] (right), each plotted against θ.]
What are we doing in MCMC?
[Figure: histograms of n = 100 samples, not normalized (frequency scale, left) and normalized (density scale, right).]
What are we doing in MCMC?
- The posterior distribution is unknown, but the likelihood is known as a likelihood profile, and we know the priors.
- We want to accumulate many, many values that represent random samples proportionate to their density in the marginal posterior distribution.
- MCMC generates these samples using the likelihood and the priors to decide which samples to keep and which to throw away.
- We can then use these samples to calculate statistics describing the distribution: means, medians, variances, credible intervals, etc. (see the sketch below).
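For example, given a vector of post-convergence samples, these summary statistics are one-liners in R (the rbeta() draws here are a hypothetical stand-in for MCMC output):

samples <- rbeta(10000, 4, 10)         # hypothetical stand-in for MCMC samples
mean(samples)                          # posterior mean
median(samples)                        # posterior median
var(samples)                           # posterior variance
quantile(samples, c(0.025, 0.975))     # equal-tailed 95% credible interval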
What are we doing in MCMC?
The marginal posterior distribution of each unobserved quantity is approximated by the samples accumulated in the chain.
[Figure: histograms of n = 100,000 samples of θ, not normalized (frequency scale, left) and normalized (density scale, right).]
What are we doing in MCMC?
[Figure: panels A-D tracing how proposals θ* accumulate in the chain and build up a density over θ. The joint distribution [y | θ][θ] determines the frequency with which proposals are kept in the chain.]
Metropolis updates
We keep the more probable members of the posterior distribution by comparing a proposal with the current value in the chain.
k                     1      2                  3                  4                  ...   K
Proposal (θ*_{k+1})          θ*_2               θ*_3               θ*_4               ...
Test                         P(θ*_2) > P(θ_1)   P(θ_2) > P(θ*_3)   P(θ_3) > P(θ*_4)   ...
Chain (θ_k)           θ_1    θ_2 = θ*_2         θ_3 = θ_2          θ_4 = θ_3          ...
Metropolis updates
[θ*_{k+1} | y] = [y | θ*_{k+1}] [θ*_{k+1}] / ∫[y | θ][θ] dθ    (likelihood × prior over the marginal)

[θ_k | y] = [y | θ_k] [θ_k] / ∫[y | θ][θ] dθ

R = [θ*_{k+1} | y] / [θ_k | y]
The cunning idea behind Metropolis updates
The marginal distribution of the data, ∫[y | θ][θ] dθ, appears in both the numerator and the denominator of R, so it cancels. We never need to compute the integral:

R = ([y | θ*_{k+1}] [θ*_{k+1}]) / ([y | θ_k] [θ_k])
When do we keep the proposal?
P_R = min(1, R)

Keep θ*_{k+1} as the next value in the chain with probability P_R, and keep θ_k with probability 1 − P_R.
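In R, a single keep/reject step might look like this minimal sketch (the values of R, the proposal, and the current value are hypothetical stand-ins for quantities computed inside a sampler):

R <- 0.7                               # hypothetical ratio of posteriors
theta_star <- 0.25; theta_k <- 0.30    # hypothetical proposal and current value
P_R <- min(1, R)                       # probability of keeping the proposal
theta_next <- if (runif(1) < P_R) theta_star else theta_k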
When do we keep the proposal?
1. Calculate R based on the likelihoods and priors.
2. Draw a random number U from a uniform distribution on (0, 1). If R > U, we keep the proposal θ*_{k+1} as the next value in the chain.
3. Otherwise, we retain θ_k as the next value.
A simple example for one parameter
- Mary is interested in estimating the proportion of coal-fired power plants that fail to meet regulations for emissions of lead.
- She is not very ambitious, so she only checks 12 plants, 3 of which are non-compliant. She assumes there is no prior knowledge of this proportion.
- How could she calculate the parameters of the posterior distribution of the proportion of non-compliant plants on the back of a cocktail napkin?
The model
[φ | y] ∝ binomial(y | n, φ) beta(φ | 1, 1)
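A minimal Metropolis sampler for this model in R, assuming a symmetric normal random-walk proposal with an arbitrary tuning standard deviation of 0.1 (the slides do not specify one):

set.seed(1)
y <- 3; n <- 12                        # Mary's data
K <- 100000                            # chain length
phi <- numeric(K); phi[1] <- 0.5       # initial value

# log of the joint distribution binomial(y | n, phi) * beta(phi | 1, 1)
log_joint <- function(p) {
  if (p <= 0 || p >= 1) return(-Inf)   # zero density outside the support of phi
  dbinom(y, n, p, log = TRUE) + dbeta(p, 1, 1, log = TRUE)
}

for (k in 2:K) {
  p_star <- rnorm(1, phi[k - 1], 0.1)  # symmetric proposal
  logR <- log_joint(p_star) - log_joint(phi[k - 1])
  phi[k] <- if (log(runif(1)) < logR) p_star else phi[k - 1]  # keep with probability min(1, R)
}

hist(phi[-(1:1000)], freq = FALSE)               # simulated posterior, burn-in discarded
curve(dbeta(x, 1 + y, 1 + n - y), add = TRUE)    # calculated beta(4, 10) posterior

Comparing the histogram with the calculated beta(4, 10) density mirrors the simulated-versus-calculated figures on the following slides.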
Sampling from the posterior
[Figure: simulated and calculated posterior distributions P(φ | data), iterations = 100.]
[Figure: simulated and calculated posterior distributions P(φ | data), iterations = 1,000.]
[Figure: simulated and calculated posterior distributions P(φ | data), iterations = 10,000.]
[Figure: simulated and calculated posterior distributions P(φ | data), iterations = 100,000.]
The chain has converged when adding more samples does not change the shape of the posterior distribution. We throw away samples that were accumulated before convergence (burn-in).
Intuition for MCMC for multi-parameter models
[DAG: β0 and β1 (through the deterministic model g), σ², and the covariate xi → yi]

g(β0, β1, xi) = β0 + β1 xi

[β0, β1, σ² | yi] ∝ [β0, β1, σ², yi]

Factoring the right-hand side using the DAG:

[β0, β1, σ² | yi] ∝ [yi | g(β0, β1, xi), σ²][β0][β1][σ²]

Joint for all data:

[β0, β1, σ² | y] ∝ ∏_{i=1}^{n} [yi | g(β0, β1, xi), σ²][β0][β1][σ²]

Choose specific distributions:

[β0, β1, σ² | y] ∝ ∏_{i=1}^{n} normal(yi | g(β0, β1, xi), σ²)
    × normal(β0 | 0, 10000) normal(β1 | 0, 10000)
    × uniform(σ² | 0, 500)
Intuition for MCMC for multi-parameter models
[β0, β1, σ² | y] ∝ ∏_{i=1}^{n} normal(yi | g(β0, β1, xi), σ²)
    × normal(β0 | 0, 10000) normal(β1 | 0, 10000) × uniform(σ² | 0, 100)
1. Set initial values for β0, β1, σ².
2. Assume β1 and σ² are known and constant. Make a draw for β0. Store the draw.
3. Assume β0 and σ² are known and constant. Make a draw for β1. Store the draw.
4. Assume β0 and β1 are known and constant. Make a draw for σ². Store the draw.
5. Do this many times. The stored values for each parameter approximate its marginal posterior distribution after convergence. (A runnable sketch of this cycle follows below.)
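To make the cycle concrete, here is a minimal Metropolis-within-Gibbs sketch in R. Everything not on the slide is an assumption: the data x and y are simulated stand-ins, the proposal standard deviations are arbitrary tuning choices, and the 10000 in normal(β | 0, 10000) is treated as a variance (sd = 100).

set.seed(1)
n <- 50
x <- runif(n, 0, 10)
y <- rnorm(n, 1 + 0.5 * x, 2)          # hypothetical data

# Unnormalized log posterior from the slide above
log_post <- function(b0, b1, s2) {
  if (s2 <= 0 || s2 >= 100) return(-Inf)            # uniform(0, 100) prior on sigma^2
  sum(dnorm(y, b0 + b1 * x, sqrt(s2), log = TRUE)) +
    dnorm(b0, 0, 100, log = TRUE) +                 # normal(0, 10000) prior, sd = 100
    dnorm(b1, 0, 100, log = TRUE)
}

# One Metropolis draw from a full-conditional using a symmetric proposal
metropolis <- function(current, log_fc, tune) {
  star <- rnorm(1, current, tune)
  if (log(runif(1)) < log_fc(star) - log_fc(current)) star else current
}

K <- 10000
b0 <- b1 <- s2 <- numeric(K)
b0[1] <- 0; b1[1] <- 0; s2[1] <- 1                  # step 1: initial values

for (k in 2:K) {
  # step 2: draw beta_0, treating beta_1 and sigma^2 as known
  b0[k] <- metropolis(b0[k - 1], function(v) log_post(v, b1[k - 1], s2[k - 1]), 0.3)
  # step 3: draw beta_1, treating beta_0 and sigma^2 as known
  b1[k] <- metropolis(b1[k - 1], function(v) log_post(b0[k], v, s2[k - 1]), 0.1)
  # step 4: draw sigma^2, treating beta_0 and beta_1 as known
  s2[k] <- metropolis(s2[k - 1], function(v) log_post(b0[k], b1[k], v), 0.5)
}

# step 5: post-burn-in samples approximate the marginal posteriors
quantile(b1[-(1:2000)], c(0.025, 0.5, 0.975))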
Implementing MCMC for multiple parameters

- Write an expression for the posterior and joint distribution, using a DAG as a guide. Always.
- If you are using MCMC software (e.g., JAGS), use the expression for the posterior and joint distribution as a template for writing code. You are done.
- If you are writing your own MCMC sampler, or you simply want to understand what JAGS is doing for you:
  - Decompose the expression for the multivariate joint distribution into a series of univariate distributions called full-conditional distributions.
  - Choose a sampling method for each full-conditional distribution.
  - Cycle through each unobserved quantity, sampling from its full-conditional distribution, treating the others as if they were known and constant.
  - The accumulated samples approximate the marginal posterior distribution of each unobserved quantity.
  - Note that this takes a complex, multivariate problem and turns it into a series of simple, univariate problems that we solve, as in the example above, one at a time.
Definition of full-conditional distribution
Let θ be a vector of length k containing all of the unobserved quantities we seek to understand. Let θ_{−j} be a vector of length k − 1 that contains all of the unobserved quantities except θj. The full-conditional distribution of θj is

[θj | y, θ_{−j}],

which we notate as

[θj | ·].

It is the posterior distribution of θj conditional on all of the other parameters and the data, which we assume are known.
Writing full-conditional distributions
- You will have one full-conditional for each unobserved quantity in the posterior.
- For each unobserved quantity, write the distributions where it appears.
- Ignore the other distributions.
- Simple.
Example
- Clark (2003) considered the problem of modeling fecundity of spotted owls and the implications of individual variation in fecundity for population growth rate.
- Data were the number of offspring produced per pair of owls, with sample size n = 197.
Clark, J. S. 2003. Uncertainty and variability in demography and population growth: A hierarchical approach. Ecology 84:1370-1381.
Example
[DAG: α and β → λi → yi]

[λ, α, β | y] ∝ ∏_{i=1}^{n} Poisson(yi | λi) gamma(λi | α, β)
    × gamma(α | .001, .001) gamma(β | .001, .001)
Full-conditionals
[DAG: α and β → λi → yi]

We use the multivariate joint distribution to find univariate full-conditional distributions for all unobserved quantities. How many full-conditionals are there? (One for each λi, plus one each for α and β: n + 2 in all.)

[λ, α, β | y] ∝ ∏_{i=1}^{n} Poisson(yi | λi) gamma(λi | α, β) gamma(β | .001, .001) gamma(α | .001, .001)
Writing full-conditional distributions
- You will have one full-conditional for each unobserved quantity in the posterior.
- For each unobserved quantity, write the distributions (including products) where it appears.
- Ignore the other distributions.
- Simple.
Full-conditional for each λi
Writing the full-conditional distribution for λi, starting from the joint:

[λ, α, β | y] ∝ ∏_{i=1}^{n} Poisson(yi | λi) gamma(λi | α, β) gamma(β | .001, .001) gamma(α | .001, .001)

[λi | ·] ∝ Poisson(yi | λi) gamma(λi | α, β)
Full-conditional for β
Writing the full-conditional distribution for β:

[λ, α, β | y] ∝ ∏_{i=1}^{n} Poisson(yi | λi) gamma(λi | α, β) gamma(β | .001, .001) gamma(α | .001, .001)

[β | ·] ∝ ∏_{i=1}^{n} gamma(λi | α, β) gamma(β | .001, .001)
Full-conditional for α
Writing the full-conditional distribution for α:

[λ, α, β | y] ∝ ∏_{i=1}^{n} Poisson(yi | λi) gamma(λi | α, β) gamma(β | .001, .001) gamma(α | .001, .001)

[α | ·] ∝ ∏_{i=1}^{n} gamma(λi | α, β) gamma(α | .001, .001)
Full-conditionals for the model
Posterior and joint:

[λ, α, β | y] ∝ ∏_{i=1}^{n} Poisson(yi | λi) gamma(λi | α, β)
    × gamma(α | .001, .001) gamma(β | .001, .001)

Full-conditionals:

[λi | ·] ∝ Poisson(yi | λi) gamma(λi | α, β)

[β | ·] ∝ ∏_{i=1}^{n} gamma(λi | α, β) gamma(β | .001, .001)

[α | ·] ∝ ∏_{i=1}^{n} gamma(λi | α, β) gamma(α | .001, .001)
Implementing MCMC for multiple parameters and latent quantities

- Write an expression for the posterior and joint distribution, using a DAG as a guide. Always.
- If you are using MCMC software (e.g., JAGS), use the expression for the posterior and joint as a template for writing code.
- If you are writing your own MCMC sampler:
  - Decompose the expression for the multivariate joint distribution into a series of univariate distributions called full-conditional distributions.
  - Choose a sampling method for each full-conditional distribution.
  - Cycle through each unobserved quantity, sampling from its full-conditional distribution, treating the others as if they were known and constant.
  - Note that this takes a complex, multivariate problem and turns it into a series of simple, univariate problems that we solve, as in the example above, one at a time.
Choosing a sampling method
1. Accept-reject:
   1.1 Metropolis: requires a symmetric proposal distribution (e.g., normal, uniform). This is what we used above in the example for one parameter.
   1.2 Metropolis-Hastings: allows asymmetric proposal distributions (e.g., beta, gamma, lognormal). A minor modification of Metropolis. See the optional notes.
2. Gibbs: accepts all proposals because they are especially well chosen. Requires conjugates. In lab today.
Why do you need to understand conjugate priors?
- An easy way to find the parameters of posterior distributions for simple problems.
- Critical to understanding Gibbs updates in Markov chain Monte Carlo, as you are about to learn.
What are conjugate priors?
Assume we have a likelihood and a prior:

[θ | y] = [y | θ] [θ] / [y].

If the form of the posterior distribution [θ | y] is the same as the form of the prior distribution [θ], then the likelihood [y | θ] and the prior [θ] are said to be conjugate, and the prior is called a conjugate prior for θ.
Gibbs updates
When priors and likelihoods are conjugate, we know all but one of the parameters of the full-conditional because they are assumed to be known at each iteration. We make a draw of the single unknown directly from its posterior distribution, as if the other parameters were fixed.
Wickedly clever.
Gamma-Poisson conjugate relationship for λ
The conjugate prior distribution for a Poisson likelihood is gamma(λ | α, β). Given n observations yi of new data, the posterior distribution of λ is

[λ | y] = gamma(α0 + ∑_{i=1}^{n} yi, β0 + n),    (3)

where α0 and β0 are the prior parameters and α0 + ∑ yi and β0 + n are the new ones.
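A quick numeric sketch of equation (3) in R, with a made-up prior and data set:

y <- c(1, 0, 2, 1, 3)              # hypothetical counts
a0 <- 2; b0 <- 1                   # hypothetical gamma(2, 1) prior
a_new <- a0 + sum(y)               # new alpha = 9
b_new <- b0 + length(y)            # new beta = 6
curve(dgamma(x, a_new, b_new), 0, 4,
      xlab = "lambda", ylab = "[lambda | y]")  # exact posterior for lambda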
Sampling from the Poisson-gamma conjugate:
Full-conditional:

[λi | ·] ∝ Poisson(yi | λi) gamma(λi | α, β)    (4)

Gibbs sample:

[λi^k | yi] = gamma(α^{k−1} + yi, β^{k−1} + 1),    (5)

where α^{k−1} and β^{k−1} are the current values in the chain.
In R, this would be:

shape[k] = alpha[k - 1] + y[i]               # the current alpha plus y_i
rate[k] = beta[k - 1] + 1                    # the current beta plus one observation
lambda[k, i] = rgamma(1, shape[k], rate[k])  # draw directly from the full-conditional
Gamma-gamma conjugate relationship
The conjugate prior distribution for the β parameter (rate) in a gamma likelihood gamma(yi | α, β) is a gamma distribution gamma(β | α0, β0). Given n observations yi of new data, the posterior distribution of β (assuming that α, the shape, is known) is given by:

[β | y] = gamma(α0 + nα, β0 + ∑_{i=1}^{n} yi).    (6)

We can substitute any "known" quantity for y, e.g., λ in MCMC.
Sampling from the gamma-gamma conjugate:
The full-conditional:

[β | ·] ∝ ∏_{i=1}^{n} gamma(λi | α, β) gamma(β | .001, .001)

Gibbs sample:

β^k ∼ gamma(.001 + n α^{k−1}, .001 + ∑_{i=1}^{n} λi^k)
In R, this would be:

shape[k] = .001 + length(y) * alpha[k - 1]  # prior alpha plus n times the current alpha
rate[k] = .001 + sum(lambda[k, ])           # prior beta plus the sum of the current lambdas
beta[k] = rgamma(1, shape[k], rate[k])      # draw directly from the full-conditional
MCMC algorithm
1. At iteration k, iterate over i = 1, ..., 197, making a draw for each λi from

λi^k ∼ gamma(α^{k−1} + yi, β^{k−1} + 1)    (7)

(a Gibbs update using the gamma-Poisson conjugate for each λi).

2. Then, once per iteration, update β and α:

β^k ∼ gamma(.001 + α^{k−1} n, .001 + ∑_{i=1}^{n} λi^k)    (8)

(a Gibbs update using the gamma-gamma conjugate for β), and

[α^k | ·] ∝ ∏_{i=1}^{n} gamma(λi^k | α^{k−1}, β^k) gamma(α^{k−1} | .001, .001)    (9)

(there is no conjugate for α, so use a Metropolis-Hastings update).

3. Repeat for k = 1, ..., K iterations, storing λi^k, α^k, and β^k. Store the value of each parameter at each iteration in a vector. (A runnable sketch follows below.)
Inference from MCMC
Make inference on each unobserved quantity using the elements of its vector stored after convergence. These post-convergence vectors (i.e., the "rug" described above) approximate the marginal posterior distributions of the unobserved quantities.