
Week 7: Bayesian inference

Genome 570

February, 2016


Bayes’ Theorem

The conditional probability of a hypothesis given the data is:

\[ \mathrm{Prob}(H \mid D) \;=\; \frac{\mathrm{Prob}(H \,\&\, D)}{\mathrm{Prob}(D)} \]

Since \( \mathrm{Prob}(H \,\&\, D) = \mathrm{Prob}(H)\,\mathrm{Prob}(D \mid H) \), substituting this in:

\[ \mathrm{Prob}(H \mid D) \;=\; \frac{\mathrm{Prob}(H)\,\mathrm{Prob}(D \mid H)}{\mathrm{Prob}(D)} \]

The denominator \( \mathrm{Prob}(D) \) is the sum of the numerators over all possible hypotheses H, giving

\[ \mathrm{Prob}(H \mid D) \;=\; \frac{\mathrm{Prob}(H)\,\mathrm{Prob}(D \mid H)}{\sum_{H} \mathrm{Prob}(H)\,\mathrm{Prob}(D \mid H)} \]
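As a quick numerical illustration of the last formula, here is a sketch with made-up priors and likelihoods (not from the slides), just to show the normalization over hypotheses:

```python
priors = [0.5, 0.3, 0.2]         # Prob(H) for hypotheses H1, H2, H3
likelihoods = [0.1, 0.4, 0.7]    # Prob(D | H) for each hypothesis

numerators = [pr * lk for pr, lk in zip(priors, likelihoods)]
prob_D = sum(numerators)         # the denominator: sum over all hypotheses
posteriors = [num / prob_D for num in numerators]

print(posteriors)                # [0.161..., 0.387..., 0.451...]
```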


A visual example of Bayes’ Theorem

[Figure: prior probabilities P(H1), P(H2), P(H3) and likelihoods P(D | H1), P(D | H2), P(D | H3) for three hypotheses, combining to give the posterior probability

\[ P(H_2 \mid D) \;=\; \frac{P(H_2)\,P(D \mid H_2)}{P(H_1)\,P(D \mid H_1) + P(H_2)\,P(D \mid H_2) + P(H_3)\,P(D \mid H_3)} \] ]


A dramatic, if not real, example

Example: “Space probe photos show no Little Green Men on Mars!”

[Figure: prior probabilities, likelihoods, and posterior probabilities, each on a scale from 0 to 1, for the hypotheses "yes" (there are Little Green Men) and "no" (there aren't).]


Calculations for that example

Using the odds-ratio form of Bayes' Theorem:

\[ \underbrace{\frac{\mathrm{Prob}(H_1 \mid D)}{\mathrm{Prob}(H_2 \mid D)}}_{\text{posterior odds ratio}} \;=\; \underbrace{\frac{\mathrm{Prob}(D \mid H_1)}{\mathrm{Prob}(D \mid H_2)}}_{\text{likelihood ratio}} \;\times\; \underbrace{\frac{\mathrm{Prob}(H_1)}{\mathrm{Prob}(H_2)}}_{\text{prior odds ratio}} \]

For the odds favoring their existence, the calculation is, for the optimist about Little Green Men (prior odds 4 : 1):

\[ \frac{4}{1} \times \frac{1/3}{1} \;=\; \frac{4/3}{1} \;=\; 4 : 3 \]

while for the pessimist (prior odds 1 : 4) it is

\[ \frac{1}{4} \times \frac{1/3}{1} \;=\; \frac{1/12}{1} \;=\; 1 : 12 \]


With repeated observation the prior matters less

If we send 5 space probes, and all fail to see LGMs, the probability of this observation is \((1/3)^5\) if there are LGMs and 1 if there aren't, so we get for the optimist about Little Green Men:

\[ \frac{4}{1} \times \frac{(1/3)^5}{1} \;=\; \frac{4/243}{1} \;=\; 4 : 243 \]

while for the pessimist about Little Green Men:

\[ \frac{1}{4} \times \frac{(1/3)^5}{1} \;=\; \frac{1/972}{1} \;=\; 1 : 972 \]
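A small sketch reproducing these numbers (the function name is ours): as probes accumulate, both observers' posterior odds become overwhelmingly against LGMs, so the difference in priors matters less and less.

```python
from fractions import Fraction

def posterior_odds(prior_odds, likelihood_ratio, n):
    """Posterior odds after n independent probes, each contributing the
    same likelihood ratio, via the odds-ratio form of Bayes' theorem."""
    return prior_odds * likelihood_ratio ** n

# Per probe: Prob(no LGMs seen | LGMs) = 1/3, Prob(no LGMs seen | none) = 1.
lr = Fraction(1, 3)

for n in (1, 5, 20):
    print(n,
          posterior_odds(Fraction(4, 1), lr, n),   # optimist, prior odds 4:1
          posterior_odds(Fraction(1, 4), lr, n))   # pessimist, prior odds 1:4
# n = 5 gives 4/243 and 1/972, as above.
```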


A coin tossing example

[Figure: the prior, the likelihood function, and the posterior for the heads probability p (each plotted for p from 0 to 1), shown for 11 tosses with 5 heads and for 44 tosses with 20 heads.]
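A sketch of the calculation behind panels like these, assuming a flat prior on p (the slide's prior may differ): the posterior is proportional to the prior times the binomial likelihood, and with more tosses the peak sharpens.

```python
import numpy as np

p = np.linspace(0.001, 0.999, 999)
dp = p[1] - p[0]

def posterior(h, n):
    """Posterior on a grid: flat prior times p^h (1-p)^(n-h), normalized
    to integrate to 1 (a Beta(h+1, n-h+1) density)."""
    post = p**h * (1 - p)**(n - h)
    return post / (post.sum() * dp)

for h, n in [(5, 11), (20, 44)]:
    post = posterior(h, n)
    print(f"{h} heads in {n}: mode near p = {p[np.argmax(post)]:.3f}, "
          f"peak density = {post.max():.2f}")   # higher peak for 44 tosses
```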

Markov Chain Monte Carlo sampling

To draw trees from a distribution whose probabilities are proportional to f(t), we can use the Metropolis algorithm (a code sketch follows the steps):

1. Start at some tree. Call this Ti.
2. Pick a tree that is a neighbor of this tree in the graph of trees. Call this the proposal Tj.
3. Compute the ratio of the probabilities (or probability density functions) of the proposed new tree and the old tree: R = f(Tj) / f(Ti)
4. If R ≥ 1, accept the new tree as the current tree.
5. If R < 1, draw a uniform random number (a random fraction between 0 and 1). If it is less than R, accept the new tree as the current tree.
6. Otherwise reject the new tree and continue with tree Ti as the current tree.
7. Return to step 2.
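Here is a minimal runnable sketch of those steps. States stand in for trees; the neighbor and target functions are toy examples, not a real tree proposal mechanism.

```python
import random

def metropolis(start, neighbors, f, n_steps, seed=0):
    """A minimal Metropolis sampler on a discrete graph of states.
    neighbors(state) must propose symmetrically (each neighbor equally
    likely, and the neighbor relation symmetric); f(state) need only be
    proportional to the target probability."""
    rng = random.Random(seed)
    current = start
    samples = []
    for _ in range(n_steps):
        proposal = rng.choice(neighbors(current))   # step 2
        R = f(proposal) / f(current)                # step 3
        if R >= 1 or rng.random() < R:              # steps 4-5: accept
            current = proposal
        samples.append(current)   # on reject (step 6), Ti stays current
    return samples

# Toy target on states 0..9 with f(i) proportional to i + 1:
chain = metropolis(0,
                   lambda i: [max(i - 1, 0), min(i + 1, 9)],
                   lambda i: i + 1,
                   100_000)
# Visit frequencies approximate f normalized: state 9 about 10x state 0.
```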


Two programs demonstrating MCMC sampling

Two excellent programs exist to demonstrate how MCMC finds peaks in a two-dimensional space. The user can place the peaks and can run both regular and "heated" chains to explore the space.

- Paul Lewis's Windows program MCRobot, which is available at http://www.eeb.uconn.edu/people/plewis/software.php
- John Huelsenbeck's similar Mac OS X program McmcApp (also called iMCMC), which is available at http://cteg.berkeley.edu/software.html (hint: to place a peak on the run window, before running move the cursor on the window while holding down the mouse button. The peak is much wider than the initial ellipse, so keep that smallish).

We will demonstrate one of these in class.


Bayesian MCMC

We try to sample from the posterior

\[ \mathrm{Prob}(T \mid D) \;=\; \mathrm{Prob}(T)\,\mathrm{Prob}(D \mid T)\,/\,(\text{denominator}) \]

and this turns out to need the acceptance ratio

\[ R \;=\; \frac{\mathrm{Prob}(T_{\mathrm{new}})\,\mathrm{Prob}(D \mid T_{\mathrm{new}})}{\mathrm{Prob}(T_{\mathrm{old}})\,\mathrm{Prob}(D \mid T_{\mathrm{old}})} \]

(the denominators are the same and cancel out. This is a great convenience, as we often cannot evaluate the denominator, but we can usually evaluate the numerators).

Note that we could also have a prior on model parameters, and as we move through tree space we could also be moving through parameter space.
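Because Prob(D) cancels in R, the accept/reject test needs only the two numerators; in practice their logs are compared to avoid numerical underflow on large datasets. A sketch with assumed argument names, not any particular package's API:

```python
import math, random

def accept(log_prior_new, log_lik_new, log_prior_old, log_lik_old,
           rng=random):
    # log R = log of the numerator ratio; the denominators have cancelled.
    log_R = (log_prior_new + log_lik_new) - (log_prior_old + log_lik_old)
    return log_R >= 0 or math.log(rng.random()) < log_R
```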


Using MrBayes on the primates data

[Figure: tree for the primates data set inferred with MrBayes, with taxa Bovine, Lemur, Tarsier, Crab E. Mac, Rhesus Mac, Jpn Macaq, Barb Macaq, Orang, Chimp, Human, Gorilla, Gibbon, Squir Monk, and Mouse, and the frequencies of partitions (posterior clade probabilities) on its branches: 0.416, 0.899, 1.00, 0.999, 1.00, 0.997, 0.986, 0.905, 0.949, 0.938, and 0.856.]


Issues to think about with Bayesian inference

Where do you get your prior from?

- Are you assuming each branch has a length drawn independently from a distribution? How wide a distribution?
- Or is the tree drawn from a birth-death process? If so, what are the rates of birth and death?
- Or is there also a stage where the species studied are chosen from all extant species of the group? How do you model that? Are you modelling biologists' decision-making processes?

Is your prior the same as your reader's prior? If not, what do you do?

- Use several priors?
- Just give the reader the likelihood surface and let them provide their own prior?

An example

Suppose we have two species with a Jukes-Cantor model, so that the estimation of the (unrooted) tree is simply the estimation of the branch length between the two species.

We can express the result either as branch length t, or as the net probability of base change

\[ p \;=\; \frac{3}{4}\left(1 - e^{-\frac{4}{3}t}\right) \]
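A small sketch of this relation and its inverse in code (the test values are the MLEs that appear on the likelihood-curve slides below):

```python
import math

def p_from_t(t):
    """Jukes-Cantor net probability of base change for branch length t."""
    return 0.75 * (1.0 - math.exp(-4.0 * t / 3.0))

def t_from_p(p):
    """Inverse of p_from_t; defined only for p < 3/4."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

print(p_from_t(0.383112))   # ~0.3
print(t_from_p(0.3))        # ~0.383112
```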


A flat prior on p

[Figure: a flat prior density on p, over the interval from 0 to 0.75.]

The corresponding prior on t

[Figure: the prior density on t (plotted for t from 0 to 5) that corresponds to the flat prior on p.]

So which is the "flat prior"?
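The answer is that they cannot both be flat. A short change-of-variables sketch makes the prior on t induced by the flat prior on p explicit:

```latex
% If p is uniform on (0, 3/4), its density there is f_p(p) = 4/3.
% With p = (3/4)(1 - e^{-4t/3}) we have dp/dt = e^{-4t/3}, so
\[
  f_t(t) \;=\; f_p\bigl(p(t)\bigr)\,\left|\frac{dp}{dt}\right|
        \;=\; \frac{4}{3}\,e^{-4t/3}, \qquad t > 0,
\]
% an exponential density with rate 4/3 -- decidedly not flat.
```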

Flat prior for t between 0 and 5

[Figure: a flat prior density on t, over the interval from 0 to 5.]

The corresponding prior on p

[Figure: the prior density on p (plotted for p from 0 to 0.75) that corresponds to the flat prior on t between 0 and 5.]

The invariance of the ML estimator to scale change

[Figure: the likelihood curve for t when 3 sites differ out of 10; ln L is plotted against t from 0 to 5 (ln L from -14 to -6), with its maximum at t̂ = 0.383112 (p̂ = 0.3).]

The invariance of the ML estimator to scale change

[Figure: the likelihood curve for p when 3 sites differ out of 10; ln L is plotted against p from 0 to 0.8 (ln L from -14 to -6), with its maximum at p̂ = 0.3 (t̂ = 0.383112).]
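A numerical sketch of this invariance (our parameterization of the per-site likelihood; other constant factors would shift ln L but not the argmax): maximizing over p and over t gives corresponding estimates and the same maximum height.

```python
import math
import numpy as np

def log_lik_p(p):
    # Jukes-Cantor: 3 sites differ (each specific different base has
    # probability p/3), 7 sites match (probability 1 - p each).
    return 3 * math.log(p / 3) + 7 * math.log(1 - p)

p_grid = np.linspace(0.001, 0.749, 100_000)
t_grid = np.linspace(0.001, 5.0, 100_000)

lp = np.array([log_lik_p(p) for p in p_grid])
# Reparameterize: evaluate the same likelihood at p(t) = (3/4)(1 - e^{-4t/3}).
lt = np.array([log_lik_p(0.75 * (1 - math.exp(-4 * t / 3))) for t in t_grid])

print(p_grid[lp.argmax()], lp.max())   # ~0.3,   maximum ln L
print(t_grid[lt.argmax()], lt.max())   # ~0.383, same maximum ln L
```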

When t has a wide flat prior

[Figure: the 95% two-tailed credible interval for t with various truncation points T of a flat prior on t; T runs on a log scale from 1 to 1000, the interval spans the vertical axis from 0 to 5, and the MLE is marked for comparison.]

What is going on in that case is ...

[Figure: the posterior for t under a flat prior truncated at T, where T is so large that the region shown near the peak is less than 2.5% of the total area.]
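A rough numerical sketch of this effect, under the same Jukes-Cantor setup as the earlier slides (3 of 10 sites differing, flat prior on t truncated at T): because the likelihood flattens out to a nonzero level as t grows, the tail accumulates area as T increases and drags the credible interval away from the MLE near t = 0.383.

```python
import numpy as np

def lik(t):
    """Jukes-Cantor likelihood for 3 differing sites out of 10."""
    p = 0.75 * (1 - np.exp(-4 * t / 3))
    return (p / 3) ** 3 * (1 - p) ** 7

for T in (1, 10, 100, 1000):
    t = np.linspace(1e-6, T, 400_000)
    cdf = np.cumsum(lik(t))          # unnormalized posterior CDF on the grid
    cdf /= cdf[-1]
    lo, hi = t[np.searchsorted(cdf, [0.025, 0.975])]
    print(f"T = {T:4d}: 95% credible interval ({lo:.3f}, {hi:.1f})")
```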
