
Week 7: Bayesian inference, Testing trees, Bootstraps

Genome 570

February, 2012

Week 7: Bayesian inference, Testing trees, Bootstraps – p.1/65

Page 2: Week 7: Bayesian inference, Testing trees, Bootstraps · To draw trees from a distribution whose probabilities are proportional to f(t), we can use the Metropolis algorithm: 1. Start

Bayes’ Theorem

Conditional probability of hypothesis given data is:

Prob (H | D) = Prob (H & D) / Prob (D)

Since Prob (H & D) = Prob (H) Prob (D | H), substituting this in:

Prob (H | D) = Prob (H) Prob (D | H) / Prob (D)

The denominator Prob (D) is the sum of the numerators over all possible hypotheses H:

Prob (H | D) = Prob (H) Prob (D | H) / Σ_H Prob (H) Prob (D | H)
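The formula above can be checked with a small sketch: for a finite set of hypotheses, multiply priors by likelihoods and normalize. The two-coin example below is illustrative, not from the slides.

```python
# Posterior over discrete hypotheses: Prob(H|D) ∝ Prob(H) * Prob(D|H).
def posterior(priors, likelihoods):
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)              # Prob(D), summed over all hypotheses
    return [j / total for j in joint]

# Illustrative case: two equally likely coins, one fair and one that
# gives heads 90% of the time; we observe a single head.
print(posterior([0.5, 0.5], [0.5, 0.9]))
```

The posterior shifts toward the biased coin, since it made the observation more probable.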


A visual example of Bayes’ Theorem

[Figure: Bayes' Theorem as areas — bars whose widths are the prior probabilities P(H1), P(H2), P(H3) and whose heights are the likelihoods P(D | H1), P(D | H2), P(D | H3); the posterior probability of a hypothesis is its area over the total:

P(H2 | D) = P(H2) P(D | H2) / ( P(H1) P(D | H1) + P(H2) P(D | H2) + P(H3) P(D | H3) ) ]


A dramatic, if not real, example

Example: “Space probe photos show no Little Green Men on Mars!”

[Figure: for the optimist and the pessimist, bar charts (on a 0-to-1 scale) of the prior probabilities, the likelihoods, and the resulting posterior probabilities for the two hypotheses "yes" (there are Little Green Men) and "no" (there aren't).]


Calculations for that example

Using the Odds Ratio form of Bayes' Theorem:

Prob (H1 | D) / Prob (H2 | D) = [ Prob (D | H1) / Prob (D | H2) ] × [ Prob (H1) / Prob (H2) ]

(posterior odds ratio = likelihood ratio × prior odds ratio)

For the odds favoring their existence, the calculation is, for the optimist about Little Green Men:

(4 / 1) × ( (1/3) / 1 ) = 4/3, that is, 4 : 3

While for the pessimist it is

(1 / 4) × ( (1/3) / 1 ) = 1/12, that is, 1 : 12
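The odds-ratio arithmetic above is simple enough to sketch directly; the numbers (prior odds 4 : 1 and 1 : 4, likelihood ratio 1/3) are those of the slide.

```python
# Odds-ratio Bayes: posterior odds = likelihood ratio × prior odds.
def posterior_odds(prior_odds, likelihood_ratio):
    return prior_odds * likelihood_ratio

lr = (1 / 3) / 1                   # P(photo shows no LGMs | LGMs) / P(... | none)
print(posterior_odds(4 / 1, lr))   # optimist: 4/3, i.e. 4 : 3
print(posterior_odds(1 / 4, lr))   # pessimist: 1/12, i.e. 1 : 12
```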


With repeated observation the prior matters less

If we send 5 space probes, and all fail to see LGMs, since the probability of this observation is (1/3)^5 if there are LGMs, and 1 if there aren't, we get for the optimist about Little Green Men:

(4 / 1) × (1/3)^5 = 4/243, that is, 4 : 243

while for the pessimist about Little Green Men:

(1 / 4) × (1/3)^5 = 1/972, that is, 1 : 972
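The same calculation, looped over the number of probes, shows how the data swamp the prior: after a few probes both the optimist and the pessimist end up heavily favoring "no LGMs".

```python
# With n independent probes all failing to see LGMs, the likelihood
# ratio is (1/3)**n, so the prior odds are overwhelmed as n grows.
def odds_after(prior_odds, n):
    return prior_odds * (1 / 3) ** n

for n in range(6):
    print(n, odds_after(4, n), odds_after(1 / 4, n))
# at n = 5: optimist 4/243, pessimist 1/972
```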


A coin tossing example

[Figure: the prior (flat on p), the likelihood function, and the posterior for the heads probability p, each plotted on 0 ≤ p ≤ 1, shown for 11 tosses with 5 heads and for 44 tosses with 20 heads; the posterior narrows as the number of tosses grows.]
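With a flat prior and a binomial likelihood, the posterior for p is a standard result: a Beta(h + 1, n − h + 1) density. The sketch below evaluates that density and the posterior mean for the two sample sizes in the figure; the specific function names are mine.

```python
import math

# Flat prior on p + binomial likelihood (h heads in n tosses)
# gives a Beta(h+1, n−h+1) posterior density.
def beta_posterior(h, n, p):
    a, b = h + 1, n - h + 1
    const = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return const * p ** (a - 1) * (1 - p) ** (b - 1)

# Posterior means (h+1)/(n+2): near 5/11 and 20/44, pulled toward 1/2.
print((5 + 1) / (11 + 2), (20 + 1) / (44 + 2))
print(beta_posterior(5, 11, 0.5), beta_posterior(20, 44, 0.5))
```

The density at p = 0.5 is higher for the larger sample, reflecting the narrower posterior.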


Markov Chain Monte Carlo sampling

To draw trees from a distribution whose probabilities are proportional to f(t), we can use the Metropolis algorithm:

1. Start at some tree. Call this Ti.

2. Pick a tree that is a neighbor of this tree in the graph of trees. Call this the proposal Tj.

3. Compute the ratio of the probabilities (or probability density functions) of the proposed new tree and the old tree:

R = f(Tj) / f(Ti)

4. If R ≥ 1, accept the new tree as the current tree.

5. If R < 1, draw a uniform random number (a random fraction between 0 and 1). If it is less than R, accept the new tree as the current tree.

6. Otherwise reject the new tree and continue with tree Ti as the current tree.

7. Return to step 2.
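The seven steps above can be sketched on any discrete state space with a symmetric neighbor structure; here a three-state ring stands in for the graph of trees, with target weights 1 : 2 : 4 (my toy example, not a tree space).

```python
import random

# Metropolis sampler: visit states in proportion to an unnormalized f,
# using a symmetric neighbor proposal.
def metropolis(f, neighbors, start, steps, rng=random.Random(0)):
    current, counts = start, {}
    for _ in range(steps):
        proposal = rng.choice(neighbors(current))
        r = f(proposal) / f(current)           # step 3
        if r >= 1 or rng.random() < r:         # steps 4 and 5
            current = proposal                 # accept
        counts[current] = counts.get(current, 0) + 1
    return counts

f = {0: 1.0, 1: 2.0, 2: 4.0}.get               # target ∝ 1 : 2 : 4
ring = lambda s: [(s - 1) % 3, (s + 1) % 3]    # two neighbors per state
counts = metropolis(f, ring, 0, 100_000)
print({s: round(c / 100_000, 3) for s, c in sorted(counts.items())})
```

The visit frequencies settle near 1/7, 2/7, 4/7, i.e. proportional to f without ever normalizing it.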


Reversibility (again)

[Figure: detailed balance ("reversibility") — the probability flow π_i P_ij from state i to state j equals the reverse flow π_j P_ji.]


Does it achieve the desired equilibrium distribution?

If f(Ti) > f(Tj), then Prob (Ti | Tj) = 1 and Prob (Tj | Ti) = f(Tj)/f(Ti), so that

Prob (Tj | Ti) / Prob (Ti | Tj) = f(Tj) / f(Ti)

(the same formula can be shown to hold when f(Ti) < f(Tj)). Then in both cases

f(Ti) Prob (Tj | Ti) = Prob (Ti | Tj) f(Tj)

Summing over all Ti, the right-hand side sums up to f(Tj), so

Σ_Ti f(Ti) Prob (Tj | Ti) = f(Tj)

This shows that, applying this algorithm, if we start in the desired distribution we stay in it. It is also true under most conditions that if you start in any other distribution, you converge to this one.
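The balance argument above can be verified numerically on the same three-state toy target (weights 1 : 2 : 4, my example): detailed balance holds term by term, and one step of the chain leaves the distribution proportional to f unchanged.

```python
# Numerical check of detailed balance for Metropolis on 3 states,
# with uniform proposal among the other two states.
f = [1.0, 2.0, 4.0]
n = len(f)

def P(i, j):                                 # transition probability i → j
    if i == j:
        return 1 - sum(P(i, k) for k in range(n) if k != i)
    return 0.5 * min(1.0, f[j] / f[i])       # propose j w.p. 1/2, accept min(1, R)

for i in range(n):
    for j in range(n):
        assert abs(f[i] * P(i, j) - f[j] * P(j, i)) < 1e-12   # detailed balance

pi = [x / sum(f) for x in f]                 # stationary distribution ∝ f
new = [sum(pi[i] * P(i, j) for i in range(n)) for j in range(n)]
print(new)                                   # one step preserves pi
```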


Bayesian MCMC

We try to achieve the posterior

Prob (T) Prob (D | T) / (denominator)

and this turns out to need the acceptance ratio

R = [ Prob (Tnew) Prob (D | Tnew) ] / [ Prob (Told) Prob (D | Told) ]

(the denominators are the same and cancel out. This is a great convenience, as we often cannot evaluate the denominator, but we can usually evaluate the numerators).

Note that we could also have a prior on model parameters too, and as we move through tree space we could also be moving through parameter space.


Mau and Newton’s proposal mechanism

[Figure: Mau and Newton's proposal mechanism, shown as a sequence of six rearranged drawings of a clocklike tree on the species A–F.]


Using MrBayes on the primates data

[Figure: consensus tree from MrBayes for the 14-species primate mitochondrial data set (Bovine, Mouse, Lemur, Tarsier, Squirrel Monkey, Crab-Eating Macaque, Rhesus Macaque, Japanese Macaque, Barbary Macaque, Gibbon, Orang, Gorilla, Chimp, Human), with posterior clade probabilities ranging from 0.416 to 1.00 marked on the branches.]

Frequencies of partitions (posterior clade probabilities)

Issues to think about with Bayesian inference

Where do you get your prior from?

Are you assuming each branch has a length drawn independently from a distribution? How wide a distribution?

Or is the tree drawn from a birth-death process? If so, what are the rates of birth and death?

Or is there also a stage where the species studied are selected from all extant species of the group? How do you model that? Are you modelling biologists' decision-making processes?

Is your prior the same as your reader's prior?

If not, what do you do? Use several priors? Just give the reader the likelihood surface and let them provide their own prior?


An example

Suppose we have two species with a Jukes-Cantor model, so that the estimation of the (unrooted) tree is simply the estimation of the branch length between the two species.

We can express the result either as branch length t, or as the net probability of base change

p = (3/4) ( 1 − exp(−(4/3) t) )
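The formula and its inverse are easy to sketch; the inverse t(p) = −(3/4) ln(1 − (4/3) p) follows by solving the equation above for t.

```python
import math

# Jukes-Cantor: net probability of base change p as a function of
# branch length t, and its inverse.
def p_of_t(t):
    return 0.75 * (1 - math.exp(-4 * t / 3))

def t_of_p(p):
    return -0.75 * math.log(1 - 4 * p / 3)

print(p_of_t(0.3831))   # ≈ 0.3
print(t_of_p(0.3))      # ≈ 0.3831
```

Note that p is bounded above by 3/4 as t grows without limit, which is why a flat prior on one scale cannot be flat on the other.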


A flat prior on p

[Figure: a flat prior density on p, for 0 ≤ p ≤ 3/4.]


The corresponding prior on t

[Figure: the prior density on t implied by a flat prior on p, plotted for 0 ≤ t ≤ 5; it declines as t increases.]

So which is the "flat prior"?


Flat prior for t between 0 and 5

[Figure: a flat prior density on t, for 0 ≤ t ≤ 5.]


The corresponding prior on p

[Figure: the prior density on p implied by a flat prior on t between 0 and 5; it rises steeply as p approaches 3/4.]


The invariance of the ML estimator to scale change

[Figure: likelihood curve for t when 3 sites differ out of 10, plotting ln L (from −14 to −6) against t for 0 ≤ t ≤ 5; the maximum is at t̂ = 0.383112, corresponding to p̂ = 0.3.]


The invariance of the ML estimator to scale change

[Figure: likelihood curve for p when 3 sites differ out of 10, plotting ln L (from −14 to −6) against p for 0 ≤ p ≤ 0.8; the maximum is at p̂ = 0.3, corresponding to t̂ = 0.383112.]
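The invariance in the two figures can be checked directly: the binomial log-likelihood for 3 differing sites out of 10, maximized over p or over t (through the Jukes-Cantor map), gives corresponding estimates. This grid-search sketch is mine.

```python
import math

# Log-likelihood when 3 of 10 sites differ, as a function of p
# (the probability that a site differs), up to a constant.
def lnL_p(p):
    return 3 * math.log(p) + 7 * math.log(1 - p)

def lnL_t(t):   # the same curve, reparameterized through p(t)
    p = 0.75 * (1 - math.exp(-4 * t / 3))
    return lnL_p(p)

# Grid maximization: p̂ = 0.3, and t̂ is the t that maps to p = 0.3.
p_hat = max((i / 1000 for i in range(1, 750)), key=lnL_p)
t_hat = max((i / 10000 for i in range(1, 50000)), key=lnL_t)
print(p_hat, t_hat)   # ≈ 0.3 and ≈ 0.3831
```

Because the map between p and t is monotone, the maximum on one scale transforms exactly to the maximum on the other: the ML estimator is invariant to the change of scale.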


When t has a wide flat prior

[Figure: the 95% two-tailed credible interval for t (vertical axis, 0 to 5) under flat priors on t truncated at various points T (horizontal axis, 1 to 1000, log scale), with the MLE marked; as T grows the interval moves away from the MLE.]

The 95% two-tailed credible interval for t with various truncation points on a flat prior for t


What is going on in that case is ...

[Figure: the posterior density of t under a flat prior truncated at T; the likelihood flattens out at large t, so when T is large the region near the peak contains less than 2.5% of the area.]


The likelihood curve is nearly a normal distribution

for large amounts of data

[Figure: log-likelihood curves for θ near the estimate derived from t(x), the "sufficient statistic", at the value for our data set; the curves are nearly normal in shape.]

If we have large amounts of data, the values of parameters we need to try are all very similar, and the shape of the distribution (which is nearly normal) will not be too different for these values.


Curvatures and covariances of ML estimates

ML estimates have covariances computable from curvatures of the expected log-likelihood:

Var[θ̂] ≃ −1 / ( d²E(log L) / dθ² )

The same is true when there are multiple parameters:

Var[θ̂] ≃ V ≃ −C⁻¹

where

C_ij = E( ∂² log L / ∂θ_i ∂θ_j )
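The curvature formula can be checked numerically on the one-parameter binomial example used earlier (3 differing sites out of 10, my choice of example): a finite-difference second derivative at p̂ = 0.3 recovers the familiar binomial variance p(1 − p)/n.

```python
import math

# Variance of the MLE from the curvature of the log-likelihood,
# Var ≈ −1 / (d² ln L / dθ²), for p̂ = 0.3 with n = 10 sites.
def lnL(p):
    return 3 * math.log(p) + 7 * math.log(1 - p)

h = 1e-5
p_hat = 0.3
curv = (lnL(p_hat + h) - 2 * lnL(p_hat) + lnL(p_hat - h)) / h ** 2
var = -1 / curv
print(var)   # ≈ 0.3 * 0.7 / 10 = 0.021
```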


With large amounts of data, asymptotically

When the true value of θ is θ0,

(θ̂ − θ0) / √v ∼ N(0, 1)

Since 1/v is the negative of the curvature of the log-likelihood:

ln L(θ0) = ln L(θ̂) − (θ0 − θ̂)² / (2v)

so that twice the difference of log-likelihoods is the square of a normal:

2 ( ln L(θ̂) − ln L(θ0) ) ∼ χ²_1


Corresponding results for multiple parameters

ln L(θ0) ≃ ln L(θ̂) − (1/2) (θ0 − θ̂)ᵀ C (θ0 − θ̂)

(θ̂ − θ0)ᵀ C (θ̂ − θ0) ∼ χ²_p

so that the log-likelihood difference is:

2 ( ln L(θ̂) − ln L(θ0) ) ∼ χ²_p

When q of the p parameters are constrained:

2 ( ln L(θ̂) − ln L(θ0) ) ∼ χ²_q


Likelihood ratio interval for a parameter

[Figure: ln L (from −2640 to −2620) plotted against the transition/transversion ratio, on a log scale from 5 to 200.]

Inferring the transition/transversion ratio for an F84 model with the 14-species primate mitochondria data set.


LRT of a molecular clock – how many parameters?

[Figure: a five-species rooted tree on A, B, C, D, E with branch lengths v1–v8, together with the equations constraining those branch lengths so the tree satisfies a molecular clock; one equation is shown in red.]

How does each equation constrain the branch lengths in the unrooted tree? What about the red equation?


Likelihood Ratio Test for a molecular clock

Using the 7-species mitochondrial DNA data set (the great apes plus Bovine and Mouse), we get with Ts/Tn = 30 and an F84 model:

Tree         ln L
No clock     −1372.77620
Clock        −1414.45053
Difference      41.67433

Chi-square statistic: 2 × 41.674 = 83.35, with n − 2 = 5 degrees of freedom — highly significant.
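The test statistic is quick to recompute from the two log-likelihoods on the slide; 11.07 is the standard 5% critical value of a chi-square with 5 degrees of freedom.

```python
# Likelihood ratio test of the molecular clock: twice the difference
# in log-likelihood, compared to chi-square with n − 2 = 5 df.
lnL_noclock = -1372.77620
lnL_clock = -1414.45053
stat = 2 * (lnL_noclock - lnL_clock)
print(round(stat, 2))   # 83.35
print(stat > 11.07)     # True: far beyond the 5% critical value of chi²_5
```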


Model selection using the LRT

[Figure: a lattice of nested models — Jukes-Cantor; K2P with T = 2; K2P with T estimated; F81; F84 with T = 2; F84 with T estimated — plotted by their numbers of parameters (25 to 29) and connected by the possible likelihood ratio tests.]

The problem with using likelihood ratio tests is the multiplicity of tests and the multiple routes to the same hypotheses.


The Akaike Information Criterion

Compare between hypotheses −2 ln L + 2p (the same as reducing the log-likelihood by the number of parameters):

Model            ln L           Number of parameters   AIC
Jukes-Cantor     −3068.29186    25                     6186.58
K2P, R = 2.0     −2953.15830    25                     5956.32
K2P, R̂ = 1.889   −2952.94264    26                     5957.89
F81              −2935.25430    28                     5926.51
F84, R = 2.0     −2680.32982    28                     5416.66
F84, R̂ = 28.95   −2616.3981     29                     5290.80
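The AIC column can be regenerated from the other two: AIC = −2 ln L + 2p, smaller being better.

```python
# AIC = −2 ln L + 2p for each model in the table above.
models = [
    ("Jukes-Cantor", -3068.29186, 25),
    ("K2P, R=2.0",   -2953.15830, 25),
    ("K2P, R=1.889", -2952.94264, 26),
    ("F81",          -2935.25430, 28),
    ("F84, R=2.0",   -2680.32982, 28),
    ("F84, R=28.95", -2616.3981,  29),
]
aics = {name: -2 * lnL + 2 * p for name, lnL, p in models}
for name, aic in aics.items():
    print(f"{name:13s} {aic:8.2f}")
# F84 with R estimated is preferred, at AIC 5290.80
```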


Can we test trees using the LRT?

[Figure: profile log-likelihood curves (ln L from −200 to −192) for two three-species clocklike trees, ((A,B),C) and ((A,C),B), plotted against the interior branch length t from 0 to 0.2; each curve has its own peak.]

If so, how many degrees of freedom for the comparison of the two peaks?

These are three-species clocklike trees (shown here plotted in a "profile log-likelihood plot", plotting the highest likelihood for each value of the interior branch length).


The bootstrap

[Figure: the (unknown) true distribution, with its (unknown) true value of θ; a sample of 150 data points, giving the empirical distribution and an estimate of θ; and bootstrap replicates (each 150 draws from the empirical distribution), giving the distribution of estimates of the parameter.]

An example with mixed normal distributions. Draw from the empirical distribution 150 times if there are 150 data points. With replacement!


The bootstrap for phylogenies

[Figure: the original data matrix (sequences × sites) yields an estimate of the tree; bootstrap sample #1 and bootstrap sample #2 — each sampling the same number of sites, with replacement — yield bootstrap estimates of the tree #1 and #2 (and so on).]

Drawing columns of the data matrix, with replacement.
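Resampling columns is a one-liner once the alignment is in memory; the tiny alignment below is a made-up example, and the tree estimation step is omitted.

```python
import random

# Bootstrap for phylogenies: resample alignment columns (sites) with
# replacement, keeping every sequence's character at each chosen site.
def bootstrap_sample(alignment, rng=random.Random(42)):
    n_sites = len(alignment[0])
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[c] for c in cols) for seq in alignment]

alignment = ["ACGTACGT", "ACGAACGA", "TCGAACGT"]
boot = bootstrap_sample(alignment)
print(boot)   # same shape as the original; columns drawn with replacement
```

Each bootstrap data set is then analyzed exactly as the original was, and the resulting trees are tallied.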


A partition defined by a branch in the first tree

[Figure: the five bootstrap trees on species A–F; one interior branch of the first tree defines a partition of the species, and a tally of 1 for that partition is entered in the table of partition counts.]


Another partition from the first tree

[Figure: the five bootstrap trees again; a second interior branch of the first tree defines another partition, whose tally is added to the table.]


The third partition from that tree

[Figure: the five bootstrap trees again; the third interior branch of the first tree defines a third partition, whose tally is added to the table.]


Partitions from the second tree

[Figure: the partitions defined by the second tree's interior branches are tallied; partitions shared with the first tree have their counts incremented.]


Partitions from the third tree

[Figure: the partitions defined by the third tree's interior branches are added to the tally.]


Partitions from the fourth tree

[Figure: the partitions defined by the fourth tree's interior branches are added to the tally.]


Partitions from the fifth tree

[Figure: the partitions defined by the fifth tree's interior branches are added, completing the tally.]


The table of partitions from all trees

[Figure: the five bootstrap trees on species A–F.]

How many times each partition of species is found:

AE | BCDF     3
ACE | BDF     3
ACEF | BD     1
AC | BDEF     1
AEF | BCD     1
ADEF | BC     2
ABDF | EC     1
ABCE | DF     3


The majority-rule consensus tree

Trees: [Figure: the five sampled trees on species A–F.]

How many times each partition of species is found:

AE | BCDF    3
ACE | BDF    3
ACEF | BD    1
AC | BDEF    1
AEF | BCD    1
ADEF | BC    2
ABDF | EC    1
ABCE | DF    3

[Figure: the majority-rule consensus tree, containing the three partitions found in more than half (3 of 5) of the trees (AE | BCDF, ACE | BDF, and ABCE | DF), each labeled with its percentage of occurrence, 60.]
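The majority-rule computation above takes only a few lines. A minimal sketch in Python, with the partition counts copied from the table (each partition written as the side of the split containing species A, then the other side):

```python
from collections import Counter

# Counts of each partition across the five sampled trees (from the table).
counts = Counter({
    "AE|BCDF": 3, "ACE|BDF": 3, "ACEF|BD": 1, "AC|BDEF": 1,
    "AEF|BCD": 1, "ADEF|BC": 2, "ABDF|EC": 1, "ABCE|DF": 3,
})
n_trees = 5

# Majority rule: keep every partition found in more than half of the trees.
majority = {p for p, c in counts.items() if c > n_trees / 2}
for p in sorted(majority):
    print(p, 100 * counts[p] // n_trees)  # each at 60, matching the labels
```

The three surviving partitions are exactly the branches of the consensus tree labeled 60.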


How do we know that the MR consensus tree will be a tree?

Suppose that for each partition in a tree we construct a (fake) morphological character with 0 for one set in the partition and 1 for the other.

Such a character is compatible with a tree if (and only if) it contains that partition.

If two of these characters both occur in more than 50% of the trees, they must co-occur in at least one tree: each is absent from fewer than half of the trees, so the trees lacking one or the other cannot account for all of the trees.

Thus the set of these "characters" that occur in more than 50% of the trees are all pairwise compatible.

By the Pairwise Compatibility Theorem (remember that?) they must then be jointly compatible.

So there must be a tree that contains them all.
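The pairwise check in this argument is easy to state in code. A minimal sketch, using the standard four-intersection test for two bipartitions of the same taxon set, with the partitions from the running example (function and variable names are mine):

```python
TAXA = frozenset("ABCDEF")

def compatible(side1, side2):
    """Two bipartitions of TAXA, each given by one side, are compatible
    iff at least one of the four intersections of their sides is empty."""
    a, b = set(side1), set(side2)
    return any(not s for s in (a & b, a - b, b - a, TAXA - (a | b)))

# The three majority partitions from the example are pairwise compatible...
majority_sides = ["AE", "ACE", "ABCE"]
print(all(compatible(x, y) for x in majority_sides for y in majority_sides))
# ...while two conflicting partitions (AE|BCDF vs. AC|BDEF) are not:
print(compatible("AE", "AC"))
```

The nested sides {A,E} ⊂ {A,C,E} ⊂ {A,B,C,E} make every pairwise test succeed, which is why all three fit on one consensus tree.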


The MR tree with 14-species primate mtDNA data

[Figure: the majority-rule consensus tree for the 14-species primate mtDNA data (Bovine, Mouse, Squirrel Monkey, Chimp, Human, Gorilla, Orang, Gibbon, Rhesus Macaque, Japanese Macaque, Crab-Eating Macaque, Barbary Macaque, Tarsier, Lemur), with bootstrap support values on the interior branches ranging from 35 to 100.]


Potential problems with the bootstrap

1. Sites may not evolve independently.

2. Sites may not come from a common distribution (but we can consider them sampled from a mixture of possible distributions).

3. If we do not know which branch is of interest at the outset, a "multiple-tests" problem means that P values are overstated.

4. P values are biased (too conservative).

5. Bootstrapping does not correct biases in phylogeny methods.


Other resampling methods

Delete-half jackknife. Sample a random 50% of the sites, without replacement.

Delete-1/e jackknife (Farris et al., 1996) (too little deletion from a statistical viewpoint).

Reweighting characters by choosing weights from an exponential distribution. In fact, reweighting them by any exchangeable weights having a coefficient of variation of 1 behaves similarly.

Block-bootstrapping – sample n/b blocks of b adjacent sites (to correct for correlation among adjacent sites) (Künsch, 1989).

Parametric bootstrap – simulate data sets of this size assuming the estimate of the tree is the truth.
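The site-resampling schemes in this list are each a few lines. A minimal sketch of the index generators (function names are mine); an analysis would use the returned indices to build the resampled alignment:

```python
import random

def bootstrap(n_sites, rng):
    """Ordinary bootstrap: n sites drawn with replacement."""
    return [rng.randrange(n_sites) for _ in range(n_sites)]

def delete_half_jackknife(n_sites, rng):
    """A random 50% of the sites, drawn without replacement."""
    return rng.sample(range(n_sites), n_sites // 2)

def block_bootstrap(n_sites, b, rng):
    """Künsch-style block bootstrap: n/b blocks of b adjacent sites,
    to preserve correlation among neighboring sites."""
    sites = []
    for _ in range(n_sites // b):
        start = rng.randrange(n_sites - b + 1)
        sites.extend(range(start, start + b))
    return sites

rng = random.Random(570)
print(len(bootstrap(100, rng)),
      len(delete_half_jackknife(100, rng)),
      len(block_bootstrap(100, 10, rng)))  # 100 50 100
```

The exponential-weights variant would instead assign each site a weight drawn from an exponential distribution rather than an integer count.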


With the delete-half jackknife

[Figure: the same 14-species primate mtDNA tree, with delete-half jackknife support values on the interior branches ranging from 32 to 100.]


Bootstrap versus jackknife in a simple case

Exact computation of the effects of the deletion fraction for the jackknife: there are n characters, of which n1 favor group 1 and n2 favor group 2 (suppose 1 and 2 are conflicting groups). The bootstrap resamples n characters with replacement; the jackknife samples n(1−δ) characters without replacement. Let m1 and m2 be the numbers of characters favoring each group in the resample.

We can compute for various n's the probabilities of getting more evidence for group 1 than for group 2. A typical result is for n1 = 10, n2 = 8, n = 100:

                                      Bootstrap   Jackknife    Jackknife
                                                  (δ = 1/2)    (δ = 1/e)
Prob(m1 > m2)                          0.6384      0.5923       0.6441
Prob(m1 ≥ m2)                          0.7230      0.7587       0.8040
Prob(m1 > m2) + ½ Prob(m1 = m2)        0.6807      0.6755       0.7240
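The bootstrap column can be reproduced exactly by summing trinomial probabilities, assuming (as the resampling scheme implies) that each of the n drawn characters independently favors group 1, group 2, or neither with probabilities n1/n, n2/n, and (n − n1 − n2)/n. A sketch:

```python
from math import comb

def evidence_probs(n1, n2, n):
    """Exact Prob(m1 > m2), Prob(m1 >= m2), and the tie-splitting
    Prob(m1 > m2) + (1/2) Prob(m1 = m2), under bootstrap resampling."""
    p1, p2, p0 = n1 / n, n2 / n, (n - n1 - n2) / n
    p_gt = p_eq = 0.0
    for m1 in range(n + 1):
        for m2 in range(n - m1 + 1):
            # Trinomial probability of seeing (m1, m2) in n draws.
            pr = (comb(n, m1) * comb(n - m1, m2)
                  * p1**m1 * p2**m2 * p0**(n - m1 - m2))
            if m1 > m2:
                p_gt += pr
            elif m1 == m2:
                p_eq += pr
    return p_gt, p_gt + p_eq, p_gt + p_eq / 2

p_gt, p_geq, p_mid = evidence_probs(10, 8, 100)
print(f"{p_gt:.4f} {p_geq:.4f} {p_mid:.4f}")  # compare with the table above
```

The jackknife columns would replace the trinomial with a multivariate hypergeometric distribution, since the n(1−δ) characters are drawn without replacement.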


Probability of a character being omitted from a bootstrap

  N   (1 − 1/N)^N     N   (1 − 1/N)^N     N   (1 − 1/N)^N      N    (1 − 1/N)^N
  1   0              11   0.35049        25   0.36040        100    0.36603
  2   0.25           12   0.35200        30   0.36166        150    0.36665
  3   0.29630        13   0.35326        35   0.36256        200    0.36696
  4   0.31641        14   0.35434        40   0.36323        250    0.36714
  5   0.32768        15   0.35526        45   0.36375        300    0.36727
  6   0.33490        16   0.35607        50   0.36417        500    0.36751
  7   0.33992        17   0.35679        60   0.36479       1000    0.36770
  8   0.34361        18   0.35742        70   0.36524          ∞    0.36788
  9   0.34644        19   0.35798        80   0.36557
 10   0.34868        20   0.35849        90   0.36583
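Each entry is just (1 − 1/N)^N: each of the N independent draws misses a given character with probability 1 − 1/N, and the product approaches 1/e as N grows. A quick check:

```python
import math

# Probability that a particular character is omitted from a bootstrap
# resample of N characters.
for N in (2, 10, 100, 1000):
    print(N, round((1 - 1 / N) ** N, 5))
print("limit 1/e =", round(math.exp(-1), 5))  # compare with the table
```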


A toy example to examine bias of P values

[Figure: a number line for the sample mean, with 0 dividing "Topology" II (left) from "Topology" I (right) and the true value of the mean marked; curves show the distribution of individual values of x, the true distribution of sample means, and estimated distributions of sample means.]

Assuming a normal distribution, we try to infer whether the mean is above 0, when the mean is unknown and the variance is known to be 1.
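This toy model is easy to simulate. A sketch (the true mean and sample size are arbitrary choices of mine) comparing the P value computed at the true mean with the average of the P values computed from repeated samples:

```python
import random
from statistics import NormalDist

phi = NormalDist().cdf
rng = random.Random(42)

mu, n = 0.5, 10            # unknown true mean; variance known to be 1
true_P = phi(mu * n**0.5)  # P for "mean above 0", computed at the true mean

# P computed from each sample's mean, averaged over many repeated samples.
reps = 100_000
avg_P = sum(phi(n**0.5 * rng.gauss(mu, 1 / n**0.5))
            for _ in range(reps)) / reps

print(f"true P = {true_P:.3f}, average P = {avg_P:.3f}")
```

The average P comes out closer to 1/2 than the true P, illustrating the next slide's point that the true P is more extreme than the average of the P's.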


Bias in the P values

[Figure: the distribution of the estimate of the "phylogeny" around the true mean, with 0 separating topology II from topology I and the tail area P shaded. Note that the true P is more extreme than the average of the P's.]


How much bias in the P values?

[Figure: average P plotted against true P, both running from 0 to 1; the curve departs from the diagonal, showing the bias.]


Bias in the P values with different priors

[Figure: probability of correct topology plotted against P for the expectation of µ, for priors with nσ² = 0.1, 1.0, and 2.0.]


The parametric bootstrap

[Flowchart: the original data yield an estimate of the tree; computer simulation on that estimated tree produces data sets #1, #2, #3, ..., #100; estimation of the tree from each simulated data set yields trees T1, T2, T3, ..., T100.]
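The same flow, with a toy model standing in for tree estimation (here the "tree" is just a normal mean and "estimating" it is taking the sample mean; a sketch, not phylogenetic code):

```python
import random
from statistics import mean, stdev

rng = random.Random(7)

# Original data -> estimate (stand-in for estimating the tree).
original = [rng.gauss(1.0, 1.0) for _ in range(50)]
estimate = mean(original)

# Simulate data sets of the same size assuming the estimate is the truth,
# then re-estimate from each simulated data set (the analogs of T1..T100).
replicates = [mean(rng.gauss(estimate, 1.0) for _ in range(50))
              for _ in range(100)]

print(f"estimate = {estimate:.3f}, "
      f"spread of re-estimates = {stdev(replicates):.3f}")
```

The spread of the re-estimates around the original estimate plays the role that the variation among the trees T1, ..., T100 plays in the phylogenetic setting.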


The parametric bootstrap with the primates data

[Figure: the 14-species primate mtDNA tree with parametric bootstrap support values on the interior branches, ranging from 78 to 98.]


Goldman’s test using simulation

[Flowchart: from the data, estimate a tree T with no clock (log-likelihood l) and a tree Tc with a clock (log-likelihood lc), giving the observed l − lc. Simulate many data sets on the clock tree; estimate clocklike and nonclocklike trees from each simulated data set and compute its l − lc; then compare the observed l − lc with this simulated distribution. (Related to the "parametric bootstrap".)]
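A sketch of the same logic on a toy model: the "clock" constraint is replaced by constraining a normal mean to 0 (unit variance), where the log-likelihood difference has the closed form n·x̄²/2. Purely illustrative; the model and numbers are mine:

```python
import random
from statistics import mean

def loglik_diff(data):
    """l - lc for the toy model: free mean vs. mean constrained to 0,
    unit variance; maximizing the likelihood gives n * xbar**2 / 2."""
    return len(data) * mean(data) ** 2 / 2

rng = random.Random(3)
observed_data = [rng.gauss(0.4, 1.0) for _ in range(30)]
observed = loglik_diff(observed_data)

# Simulate data sets under the constrained ("clock") model, recompute
# l - lc for each, and ask how often it reaches the observed value.
sims = [loglik_diff([rng.gauss(0.0, 1.0) for _ in range(30)])
        for _ in range(1000)]
p_value = sum(s >= observed for s in sims) / len(sims)
print(f"observed l - lc = {observed:.2f}, simulation p = {p_value:.3f}")
```

The fraction of simulated differences at or above the observed one is the Monte Carlo P value for rejecting the constrained (clock) model.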
