Center for Evolutionary Medicine and Informatics at Arizona State University
Dr. Battistuzzi's presentation during the 2011 CEMI MPA Workshop.
Center for Evolutionary Medicine and Informatics
MPAW: Estimation of divergence times
Fabia U. Battistuzzi
Two dimensions of evolution
[Figure: timetree of bacterial lineages, from Staphylococcaceae to Spirochaetaceae, on a time axis with ticks from 0 to 3000 in steps of 500. Annotations: Lineage Relations, Time frame, Evolutionary Rate.]
Molecular clocks – brief overview
[Figure: sequence change plotted against time]
Timeline with marks at 1962, 1968, 1972, 1976, 1984, 1989, 1997, and 2006 (Kumar, Nature Reviews Genetics, 2005)
Milestones labeled on the timeline:
• 1st protein clock
• Neutral theory
• Rate tests
• Deut.-Prot. divergence
• Rate autocorrelation
• Local rates
• Autocorrelated clocks
• Uncorrelated clocks
[Figure: rate autocorrelation between ancestor and descendant branches, on a slower-to-faster scale]
What can we do with molecular clocks?
• Species divergence
• Phylogeography
• Epidemiology
• Rate estimations
Eastern fox squirrel (Sciurus niger) lacks phylogeographic structure: recent range expansion and phenotypic differentiation
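The simplest version of the idea behind these applications is a strict clock: a calibrated pair of lineages gives a rate, and that rate converts other distances into times. The sketch below is a minimal illustration under that assumption; the function names and all numbers are hypothetical and are not taken from the presentation.

```python
def clock_rate(distance, calibration_time):
    """Substitutions per site per unit time along one lineage.

    Two lineages that split `calibration_time` ago have accumulated
    `distance` between them, so each lineage contributes half of it.
    """
    return distance / (2.0 * calibration_time)


def divergence_time(distance, rate):
    """Time since two lineages split, given a per-lineage rate."""
    return distance / (2.0 * rate)


# Hypothetical numbers, for illustration only.
calib_distance = 0.20   # substitutions per site between a calibrated pair
calib_time = 100.0      # their known divergence time (millions of years)
rate = clock_rate(calib_distance, calib_time)

query_distance = 0.05   # distance between a pair of unknown age
print("rate per site per million years:", rate)
print("estimated divergence time:", divergence_time(query_distance, rate))
```

Relaxed-clock methods replace the single `rate` with a rate per branch, which is why the choice of rate model matters in the packages discussed next.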
Molecular clock packages available
• BEAST (Drummond & Rambaut): uncorrelated rates
• MCMCTree (Yang): uncorrelated and autocorrelated rates
• MultiDivTime (Thorne & Kishino): autocorrelated rates between ancestor and descendant
• PATHd8 (Britton et al.): autocorrelation between sister groups
• PhyloBayes (Lartillot et al.): uncorrelated and autocorrelated rates
• r8s (Sanderson): strict, local, and relaxed clocks
Molecular clock packages available
• BEAST: uncorrelated rates
• MCMCTree: uncorrelated and autocorrelated rates
• MultiDivTime: autocorrelated rates
Basic functionality
1. Bayesian methods: based on priors and data, estimate posteriors (divergence times and credibility intervals)
2. analyze partitioned data (codon positions, genes)
3. calibration points
4. estimate phylogeny and/or branch lengths
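To make item 1 concrete, here is a minimal sketch of a Bayesian estimate of a single divergence time on a grid. It is not how any of the three packages works internally (they use MCMC over many parameters); the Poisson likelihood, the fixed rate, the uniform calibration prior, and all numbers are assumptions chosen for illustration.

```python
import math


def grid_posterior(times, prior, log_likelihood):
    """Normalized posterior weights on a grid of candidate times."""
    log_post = [math.log(p) + log_likelihood(t) if p > 0.0 else -math.inf
                for t, p in zip(times, prior)]
    peak = max(log_post)                       # subtract the peak for stability
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


def credibility_interval(times, posterior, mass=0.95):
    """Equal-tailed interval read off the cumulative posterior."""
    tail = (1.0 - mass) / 2.0
    cumulative, lower, upper = 0.0, None, None
    for t, p in zip(times, posterior):
        cumulative += p
        if lower is None and cumulative >= tail:
            lower = t
        if upper is None and cumulative >= 1.0 - tail:
            upper = t
    return lower, upper


# Hypothetical data: 100 differences over 1000 sites, known per-lineage rate.
RATE = 0.001        # substitutions per site per million years (assumed known)
SITES = 1000
DIFFERENCES = 100


def log_likelihood(t):
    """Poisson log-likelihood of the observed differences at divergence time t."""
    expected = 2.0 * RATE * t * SITES
    return (DIFFERENCES * math.log(expected) - expected
            - math.lgamma(DIFFERENCES + 1))


times = [float(t) for t in range(1, 201)]
# Minimum-maximum calibration: uniform prior between 20 and 120.
prior = [1.0 if 20.0 <= t <= 120.0 else 0.0 for t in times]

posterior = grid_posterior(times, prior, log_likelihood)
posterior_mean = sum(t * p for t, p in zip(times, posterior))
lower, upper = credibility_interval(times, posterior)
print("posterior mean:", round(posterior_mean, 1))
print("95% credibility interval:", lower, "to", upper)
```

The prior only needs to be known up to a constant here, because the posterior is renormalized over the grid.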
Calibration priors
Calibration types: minimum only, maximum only, minimum-maximum
Prior densities plotted over time: lognormal, uniform, exponential, normal
[Figure: prior density curves over time; legend: 95% probability]
Hedges and Kumar, Trends in Genetics (2004)
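The "95% probability" legend can be made concrete by asking, for each prior shape, which ages hold 95% of the prior mass. The sketch below does that with closed-form quantiles. Which density goes with which calibration type is an assumption here (offset exponential and offset lognormal for minimum-only calibrations, uniform and normal for minimum-maximum), and the function names and parameter values are hypothetical.

```python
import math
from statistics import NormalDist

Z_ONE_SIDED = NormalDist().inv_cdf(0.95)    # about 1.645
Z_TWO_SIDED = NormalDist().inv_cdf(0.975)   # about 1.960


def exponential_95(minimum, mean):
    """Offset exponential: 95% of the mass lies between minimum and the returned upper age."""
    return minimum, minimum - mean * math.log(0.05)


def lognormal_95(minimum, mu, sigma):
    """Offset lognormal with log-scale mean mu and log-scale sd sigma."""
    return minimum, minimum + math.exp(mu + Z_ONE_SIDED * sigma)


def normal_95(mean, sd):
    """Normal prior: central 95% interval."""
    return mean - Z_TWO_SIDED * sd, mean + Z_TWO_SIDED * sd


def uniform_95(minimum, maximum):
    """Uniform prior: central 95% of the allowed range."""
    width = maximum - minimum
    return minimum + 0.025 * width, maximum - 0.025 * width


# Hypothetical calibration: a fossil sets a minimum age of 10 time units.
print("exponential:", exponential_95(10.0, 5.0))
print("lognormal:  ", lognormal_95(10.0, 1.0, 0.5))
print("normal:     ", normal_95(100.0, 10.0))
print("uniform:    ", uniform_95(0.0, 100.0))
```

Checking these bounds before a run is one way to see whether a prior says what the fossil evidence is meant to say.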
BEAUti & BEAST
• NEXUS file (input to BEAUti)
• XML file (written by BEAUti, read by BEAST)
Phylogeny specification
<newick id="startingTree">(((((Ssc:0.65,Bta:0.65):0.16,((Cfa:0.46,Fca:0.46):0.28,Eca:0.74):0.07):0.11,
(((Rno:0.20,Mmu:0.20):0.65,Ocu:0.85):0.05,(((Hsa:0.05,Ptr:0.05):0.05,Ppy:0.10):0.13,Mml:0.23):0.67):0.02):0.81,Tvu1:1.73):1.37,Gga:3.10);</newick>
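A starting tree for a clock analysis is normally expected to be ultrametric, meaning every tip is the same distance from the root. The sketch below checks that for the Newick string above with a small hand-written parser. It is an illustration only: the parser handles just the plain `name:length` syntax used here, and `tip_heights` and `is_ultrametric` are hypothetical helper names, not part of BEAST.

```python
STARTING_TREE = (
    "(((((Ssc:0.65,Bta:0.65):0.16,((Cfa:0.46,Fca:0.46):0.28,Eca:0.74):0.07):0.11,"
    "(((Rno:0.20,Mmu:0.20):0.65,Ocu:0.85):0.05,(((Hsa:0.05,Ptr:0.05):0.05,Ppy:0.10):0.13,"
    "Mml:0.23):0.67):0.02):0.81,Tvu1:1.73):1.37,Gga:3.10);"
)


def tip_heights(newick):
    """Map each tip name to its root-to-tip distance."""
    text = "".join(newick.split()).rstrip(";")
    pos = 0

    def read_length():
        """Read an optional ':length' at the current position."""
        nonlocal pos
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            return float(text[start:pos])
        return 0.0

    def read_subtree():
        """Return {tip: distance from the parent of this subtree}."""
        nonlocal pos
        if text[pos] == "(":
            pos += 1                          # consume "("
            below = {}
            while True:
                below.update(read_subtree())
                if text[pos] == ",":
                    pos += 1                  # another child follows
                    continue
                pos += 1                      # consume ")"
                break
        else:
            start = pos
            while pos < len(text) and text[pos] not in ":,()":
                pos += 1
            below = {text[start:pos]: 0.0}
        length = read_length()                # branch above this subtree
        return {tip: d + length for tip, d in below.items()}

    return read_subtree()


def is_ultrametric(heights, tolerance=1e-6):
    """True when all root-to-tip distances agree within the tolerance."""
    values = list(heights.values())
    return max(values) - min(values) <= tolerance


heights = tip_heights(STARTING_TREE)
print(len(heights), "tips; ultrametric:", is_ultrametric(heights))
print("root height:", round(max(heights.values()), 2))
```

The same `heights` dictionary can be used to confirm that node ages in the starting tree do not contradict the calibration bounds before the run starts.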
Strict clock & Relaxed clock
Priors
Operators
• remove “Tree” operator for fixed phylogeny
Generations
• Convergence & ESS values
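The effective sample size (ESS) says how many independent draws a correlated MCMC trace is worth. The sketch below uses a simple textbook estimator, n / (1 + 2 × sum of autocorrelations), truncated at the first non-positive autocorrelation. This is an assumption made for illustration and may not match the estimator used by trace-analysis tools; the two traces are simulated, not BEAST output.

```python
import random


def effective_sample_size(samples):
    """Simple ESS estimate: n / (1 + 2 * sum of positive-lag autocorrelations)."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    variance = sum(c * c for c in centered) / n
    if variance == 0.0:
        return float("nan")      # a constant trace carries no mixing information
    rho_sum = 0.0
    for lag in range(1, n):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = cov / variance
        if rho <= 0.0:           # stop at the first non-positive autocorrelation
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


rng = random.Random(0)
n = 1000

# Well-mixed trace: independent draws.
independent = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Poorly mixed trace: each state stays close to the previous one (AR(1), phi = 0.9).
sticky = [0.0]
for _ in range(n - 1):
    sticky.append(0.9 * sticky[-1] + rng.gauss(0.0, 1.0))

ess_independent = effective_sample_size(independent)
ess_sticky = effective_sample_size(sticky)
print("ESS, independent trace:", round(ess_independent))
print("ESS, sticky trace:     ", round(ess_sticky))
```

A commonly quoted rule of thumb is to aim for an ESS above 200 for every parameter; a low ESS usually means the chain needs more generations or better-tuned operators.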
BEAST running…
A fuzzy caterpillar
MCMCTree
seqfile = exampleseqs.phy
treefile = example.tre
outfile = exampleseqs_3.out
(((((Ssc,Bta),((Cfa,Fca)'B(0.45,0.47)',Eca)),(((Rno,Mmu),Ocu),(((Hsa,Ptr),Ppy)'B(0.09,0.11)',Mml))),Tvu1),Gga);
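(((((Ssc,Bta),((Cfa,Fca)'L(0.35,0.1,0.5,0.025)',Eca)),(((Rno,Mmu),Ocu),(((Hsa,Ptr),Ppy),Mml))),Tvu1),Gga);
Lower-bound calibration L(tL, p, c, pL): tL is the minimum age (here 0.35), p and c set the location and scale of the truncated Cauchy prior (here 0.1 and 0.5), and pL is the probability that the true age is younger than tL (here 0.025).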
ndata = 1
usedata = 3 * 0: no data; 1: seq like; 2: use in.BV; 3: out.BV
clock = 3 * 1: global clock; 2: independent rates; 3: correlated rates
RootAge = < 3.0 * safe constraint on root age, used if no fossil for root.
[Figure: rate change from ancestor to descendant on a slower-to-faster scale, shown for uncorrelated and autocorrelated rate models]
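The difference between the two rate models (clock = 2 versus clock = 3) can be seen by simulating rates down a single line of descent. In the sketch below, an autocorrelated rate is drawn around the parent branch's rate, while an uncorrelated rate is drawn around a shared overall rate. The lognormal form and the parameter values are simplifying assumptions for illustration and are not the exact parameterization used by MCMCTree.

```python
import math
import random


def simulate_lineage_rates(n_branches, root_rate, sigma, autocorrelated, rng):
    """Rates along successive ancestor-to-descendant branches."""
    rates = []
    parent = root_rate
    for _ in range(n_branches):
        # Autocorrelated: center on the parent's rate.
        # Uncorrelated: always center on the overall rate.
        center = parent if autocorrelated else root_rate
        rate = center * math.exp(rng.gauss(0.0, sigma))
        rates.append(rate)
        parent = rate
    return rates


def lag1_correlation(values):
    """Correlation between each value and the next one in the sequence."""
    x, y = values[:-1], values[1:]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(1)
auto = simulate_lineage_rates(2000, 1.0, 0.1, True, rng)
unc = simulate_lineage_rates(2000, 1.0, 0.1, False, rng)

corr_auto = lag1_correlation([math.log(r) for r in auto])
corr_unc = lag1_correlation([math.log(r) for r in unc])
print("ancestor-descendant correlation, autocorrelated:", round(corr_auto, 2))
print("ancestor-descendant correlation, uncorrelated:  ", round(corr_unc, 2))
```

Under the autocorrelated model a descendant's rate is informative about its ancestor's rate; under the uncorrelated model it is not, which is the assumption that separates the packages listed earlier.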
MCMCTree
F(c, TL)
model = 4 * 0:JC69, 1:K80, 2:F81, 3:F84, 4:HKY85
alpha = 0 * alpha for gamma rates at sites
ncatG = 5 * No. categories in discrete gamma
cleandata = 0 * remove sites with ambiguity data (1:yes, 0:no)?
BDparas = 2 2 0.1 * birth, death, sampling
kappa_gamma = 6 2 * gamma prior for kappa
alpha_gamma = 1 1 * gamma prior for alpha
rgene_gamma = 1 7.13 * gamma prior for overall rates for genes
sigma2_gamma = 1 1.15 * gamma prior for sigma^2 (for clock=2 or 3)
rgene: prior on the rate parameter
sigma2: prior on rate heterogeneity
F(BL, TL)
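Each of the `*_gamma` lines above gives two numbers for a gamma prior. Assuming they are the shape and rate of the distribution (so the mean is shape / rate), a quick calculation shows what each prior implies before any data are seen. The helper name below is hypothetical; the parameter values are the ones in the control file above.

```python
import math


def gamma_mean_sd(shape, rate):
    """Mean and standard deviation of a gamma(shape, rate) distribution."""
    return shape / rate, math.sqrt(shape) / rate


# Values copied from the control file; shape/rate reading is an assumption.
priors = {
    "rgene_gamma": (1.0, 7.13),
    "sigma2_gamma": (1.0, 1.15),
    "kappa_gamma": (6.0, 2.0),
    "alpha_gamma": (1.0, 1.0),
}

for name, (shape, rate) in priors.items():
    mean, sd = gamma_mean_sd(shape, rate)
    print(f"{name}: mean = {mean:.3f}, sd = {sd:.3f}")
```

Under this reading, `rgene_gamma = 1 7.13` corresponds to a prior mean rate of about 0.14 substitutions per site per time unit of the tree, so the prior should be rescaled whenever the time unit of the calibrations changes.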
TimeTrees
[Figure: timetree; x-axis: Time (millions of years), from 300 to 0]
[Figure: timetree of Ssc, Bta, Cfa, Fca, Eca, Rno, Mmu, Ocu, Hsa, Ptr, Ppy, Mml, Tvu1, and Gga; x-axis: Time (millions of years), from 0 to 200]
[Figure: estimated time plotted against true time, both axes from 0 to 100; series: Model Match, Model Violation; label: BEAST]
BEAST
[Figure: histogram of % difference from true time (x-axis from -45 to 65) against frequency (y-axis from 0 to 70); series: Model Match, Model Violation]
MCMCTree
[Figure: histogram of % difference from true time (x-axis from -45 to 65) against frequency (y-axis from 0 to 60); series: Model Match, Model Violation (autocorrelation)]
[Figure: histogram of % difference from true time (x-axis from -45 to 65) against frequency (y-axis from 0 to 60); series: Model Match, Model Violation (uncorrelation)]
MultiDivTime
[Figure: histogram of % difference from true time (x-axis from -45 to 65) against frequency (y-axis from 0 to 70); series: Model Match, Model Violation]
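The "% difference from true time" histograms for the three packages can be reproduced from pairs of estimated and simulated (true) node times. The sketch below assumes the difference is signed, (estimated - true) / true, and that the axis ticks from -45 to 65 are the centers of 10-point bins; both are guesses about the plots, and the input pairs are hypothetical.

```python
def percent_difference(estimated, true):
    """Signed difference of an estimate from the true time, in percent."""
    return 100.0 * (estimated - true) / true


def histogram(values, first_center=-45, last_center=65, width=10):
    """Count values per bin; bins are [center - width/2, center + width/2)."""
    centers = list(range(first_center, last_center + 1, width))
    counts = {c: 0 for c in centers}
    for v in values:
        for c in centers:
            if c - width / 2 <= v < c + width / 2:
                counts[c] += 1
                break                  # values outside all bins are ignored
    return counts


# Hypothetical (estimated, true) node times.
pairs = [(48.0, 50.0), (55.0, 50.0), (50.0, 50.0), (20.0, 50.0)]
differences = [percent_difference(est, true) for est, true in pairs]
counts = histogram(differences)

for center, count in counts.items():
    print(f"{center:>4}: {'#' * count}")
```

A histogram centered on zero indicates unbiased estimates; a shift to one side under model violation indicates systematic over- or underestimation.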
95% Credibility intervals
[Figure: four pie charts of Success versus Failure under Model Match and Model Violation, with splits of 16%/84%, 6%/94%, 4%/96%, and 12%/88%]
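A success rate of this kind can be computed by counting how often a 95% credibility interval contains the true (simulated) time. The sketch below assumes that this is what "Success" means on the slide; the function name and the example intervals are hypothetical.

```python
def interval_coverage(true_times, intervals):
    """Fraction of intervals (lower, upper) that contain the matching true time."""
    hits = sum(1 for t, (lower, upper) in zip(true_times, intervals)
               if lower <= t <= upper)
    return hits / len(true_times)


# Hypothetical true node times and estimated 95% credibility intervals.
true_times = [10.0, 25.0, 40.0, 60.0, 85.0]
intervals = [(8.0, 13.0), (20.0, 24.0), (35.0, 47.0), (52.0, 66.0), (80.0, 95.0)]

coverage = interval_coverage(true_times, intervals)
print(f"success: {100 * coverage:.0f}%, failure: {100 * (1 - coverage):.0f}%")
```

With well-calibrated intervals the success rate should be close to 95%; a clearly lower value under model violation signals intervals that are too narrow or shifted away from the truth.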
Things to remember
• Check priors for calibrations, substitution rate, rate model, etc.
• Repeat every analysis at least twice to check for convergence
• Look for the “fuzzy caterpillar” for all parameters
• Test assumptions’ effects using multiple methods and priors (Bayes factors)
• Credibility intervals are a conservative estimate of divergences
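For the Bayes factor comparison mentioned in the list, the arithmetic is short once each model's log marginal likelihood has been estimated. The sketch below assumes those estimates are already available (obtaining them is a separate step that depends on the package) and labels the result with the widely used Kass and Raftery (1995) scale. The function names and the two log marginal likelihoods are hypothetical.

```python
def two_ln_bayes_factor(log_marginal_a, log_marginal_b):
    """2 ln(BF) for model A over model B from log marginal likelihoods."""
    return 2.0 * (log_marginal_a - log_marginal_b)


def evidence_label(two_ln_bf):
    """Kass and Raftery (1995) wording for the strength of evidence."""
    favored = "A" if two_ln_bf >= 0.0 else "B"
    size = abs(two_ln_bf)
    if size < 2.0:
        strength = "not worth more than a bare mention"
    elif size < 6.0:
        strength = "positive"
    elif size < 10.0:
        strength = "strong"
    else:
        strength = "very strong"
    return favored, strength


# Hypothetical log marginal likelihoods for two rate models.
log_ml_autocorrelated = -12345.6    # model A
log_ml_uncorrelated = -12350.1      # model B

value = two_ln_bayes_factor(log_ml_autocorrelated, log_ml_uncorrelated)
favored, strength = evidence_label(value)
print(f"2 ln BF = {value:.1f}: evidence for model {favored} is {strength}")
```

The same comparison can be repeated for alternative calibration priors, which is one way to test how much a single assumption drives the estimated times.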
Questions?