46
Assessing Model Fit

Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Assessing Model Fit

Page 2: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Evidence for Systematic Error (Bias)

The maximum-likelihood analysis recovered strong support forctenophores as sister to all other metazoan lineages (BS = 93)(Fig. S5D). However, Bayesian inference (Fig. S5E) recoveredsponges as sister to all other metazoans, but support for this andother deep nodes were low (PP ≤ 90).

Systematic Biases and Their Effect on Phylogenetic Inference. Long-branch (LB) scores (28), a measurement for identifying taxa andOGs that could cause LBA, were calculated for each species andOG with TreSpEx (25). In total, we identified six “long-branched”taxa, all nonmetazoans (Fig. S6A and Table S2), and 28 OGswith high LB scores compared with other OGs (Fig. S6 B and C).We found complete congruence in relationships among basalmetazoan phyla in trees inferred with (datasets 1, 2, 8, 12, and 18in Fig. 2) and without (datasets 3–7, 9–11, 13–17, 19–21, and 22–25 in Fig. 2) taxa and genes that had high LB scores, and nodalsupport for critical nodes showed little variation among analyses(Fig. 3 and Figs. S1–S5). Removing OGs with high amino acidcompositional heterogeneity (datasets 7–11, 17–21, 23, and 25 inFig. 2) also had no effect on branching order (Fig. 3 and Figs. S2A–E, S3 E and F, S4 A–E, and S5A). Topologies inferred withonly the slowest evolving half of OGs assembled here (datasets 6and 16 in Fig. 2) (i.e., least saturated and least prone to homo-plasy; see Fig. S7 for saturation plots) recovered high support forctenophores sister to all other animals and sponge monophylywith both maximum-likelihood (BS = 100) (Fig. 3 and Figs. S1Fand S3D) and Bayesian inference using the CAT-GTR model(PP = 1) (Fig. 3 and Fig. S5 B and C). Importantly, our datasetsof the slowest evolving half of OGs were of a broad range of

protein classes (SI Methods; figshare), rather than consisting of amajority of ribosomal proteins (7, 9).Inaccurate orthology assignment can also introduce systematic

error into phylogenomic analyses. Although relationships amongbasal lineages were unaffected, removal of paralogs as identifiedby TreSpEx appeared to have the greatest effect on support forsome critical nodes. For example, most topologies with bothcertain and uncertain paralogs removed had strong support forsponge monophyly (i.e., ≥ 95% BS) (datasets 12–14 and 18–20 inFigs. 2 and 3 and Figs. S2F, S3 A, B, and F, and S4 A and B), butfour analyses with only certain paralogs removed recovered lowsupport (< 90% BS) for sponge monophyly (datasets 5, 7, 9, and10 in Figs. 2 and 3 and Figs. S1E and S2 A, D, and E).Because outgroup sampling has the potential to influence

rooting of the animal tree, we explored outgroup sampling aswell. When all outgroups except two choanoflagellates were re-moved (datasets 5, 11, 15, and 21 in Fig. 2), inferred nonbilaterianrelationships were identical as in analyses we performed with fulloutgroup sampling (datasets 5, 11, 15, and 21 in Figs. 2 and 3 andFigs. S1E, S2E, S3C, and S4C), but support for sponge mono-phyly decreased. In these analyses the leaf-stability indices forhomoscleromorph and calcareous sponges were less than 0.94,but in all other analyses they were greater than 0.97 (Fig. S5 F andG). Regardless, when choanoflagellates were the only outgroup,ctenophores were still recovered as the deepest split within theanimal tree with 100% BS support. Analyses with all outgrouptaxa removed (datasets 22–25 in Fig. 2) recovered identical re-lationships among major metazoan lineages as other analyses(Figs. S4 D–F and S5A). However, we observed low support forrelationships among ctenophores, sponges, and placozoans inthese analyses. This resulted from the long placozoan branchbeing attracted to ctenophores in the absence of outgroup taxa asindicated by bootstrap tree topologies and leaf-stability index forTrichoplax of less than 0.92, whereas leaf-stability indices weregreater than 0.99 in all other analyses (Fig. S5 F and G).

DiscussionPlacement of Ctenophores Sister to all Remaining Animals Is NotSensitive to Systematic Errors. Every analysis conducted hereinstrongly supported the ctenophore-sister hypothesis (Fig. 3 andTable 1). A major hurdle to wide acceptance of ctenophores assister to other animals has been that different analyses haveyielded conflicting hypotheses of early animal phylogeny (2–9).Sensitivity to the selected model of molecular evolution has beenespecially problematic (2–9). In contrast, both maximum-likeli-hood analyses using data partitioning and Bayesian analysesusing the CAT-GTR model of our datasets resulted in identicalbranching patterns among ctenophores, sponges, placozoans,cnidarians, and bilaterians. Past critiques of studies that foundctenophores to be sister to all other animals have emphasized theCAT model as the most appropriate model for deep phyloge-nomics because it is an infinite mixture model that accounts forsite-heterogeneity (7, 8, 29). Notably, when the CAT-GTR modelwas used here (datasets 6 and 16 in Fig. 2), we recovered cteno-phores-sister to all other metazoans (Fig. 3 and Fig. S5 B and C).The argument for LBA (7–10) or saturated datasets (7, 8) as the

reason past studies found ctenophores to be sister to all other an-imals seems to have been overstated. The recovered position ofctenophores was identical in analyses with (datasets 1, 2, 8, 12, and18 in Fig. 2 and Figs. S1 A and B, S2 B and F, and S3F) and without(datasets 3–7, 9–11, 13–17, and 19–25 in Fig. 2, and Figs. S1 C–F, S2A and C–E, S3 A–E, S4, and S5 A–C) taxa and genes with high LBscores, and analyses with the slowest evolving genes (datasets 6 and16 in Fig. 2 and Fig. S7) also recovered ctenophores sister to allother animals (Fig. 3 and Figs. S1F, S3D, and S5 B and C). Fur-thermore, despite the long internal branch leading to the cteno-phore clade, the position of this lineage did not change in anyanalysis including those when outgroups were removed (datasets 5,11, 15, 21, and 22–25 in Fig. 2 and Figs. S1E, S2E, S3C, and S4 C–F). If this branch was being artificially attracted toward outgroups,then employment of different outgroup schemes would be expected

Aurelia aurita

Hormathia digitata

Rossella fibulata

Beroe abyssicola

Drosophila melanogaster

Craseoa lathetica

Petromyzon marinus

Hydra oligactis

Spongilla alba

Strongylocentrotus purpuratus

Abylopsis tetragona

Ministeria vibrans

Danio rerio

Pleurobrachia bachei

Lottia gigantea

Eunicella verrucosa

Hydra viridissima

Tethya wilhelma

Mortierella verticillata

Nanomia bijuga

Sycon coactum

Oscarella carmela

Vallicula sp.

Spizellomyces punctatus

Mnemiopsis leidyi

Aiptasia pallida

Allomyces macrogynus

Mertensiidae sp.

Petrosia ficiformisPseudospongosorites suberitoides

Rhizopus oryzae

Amoebidium parasiticum

Capitella teleta

Crella elegans

Monosiga ovata

Acropora digitifera

Aphrocallistes vastus

Bolinopsis infundibulum

Capsaspora owczarzakiSalpingoeca rosetta

Tubulanus polymorphus

Agalma elegans

Corticium candelabrum

Hyalonema populiferum

Hydra vulgaris

Euplokamis dunlapae

Lithobius forficatus

Bolocera tuediae

Pleurobrachia sp.

Amphimedon queenslandica

Hemithiris psittacea

Ircinia fasciculata

Dryodora glandiformis

Kirkpatrickia variolosa

Sphaeroforma arctica

Daphnia pulex

Homo sapiens

Platygyra carnosa

Ephydatia muelleriChondrilla nucula

Saccoglossus kowalevskii

Periphylla periphylla

Trichoplax adhaerens

Sympagella nux

Latrunculia apicalis

Coeloplana astericola

Physalia physalis

Sycon ciliatum

Nematostella vectensis

Priapulus caudatus

96

95

61

9965

94

79

0.2

Beroe abyssicolaa

Pleurobrachia bacheiVallicula sp.

Pl

Mnemiopsis leidyii

Mertensiidae sp.M i t

Bolinopsis infundibulumM t iid

Euplokamis dunlapae

Pleurobrachia ssp.Dryodora glandifoormisB b i lb

p

Coeloplana asterricolaV lli l

pp

99

Rosse

Spongilla albaTethya wilhelma

S

Sycon coactum

Oscarella carmela

Petrosia ficiformisPseudospongosorites su

Crella elegans

Aphroc

Corticium candelabrum

H

Amphimedon queenAi fi if i

p

Ircinia fasciculata

Kirkpatrickia variolosagg

Ephydatia muellerip gp g

Chondrilla nuculayy

p

SympaA hh

Latrunculia apicalisC ll l

Sycon ciliatumll l

96

61

y gTrichoplax adhaerens

Aurelia aurita

Hormathia digitata

Craseoa lathetica

Hydra oligactis

Abylopsis tetragonaAl th ti

Eunicella verrucosa

Hydra viridissima

Nanomia bijuga

Aiptasia pallidaH thi di ithi

Acropora digitifera

Agalma elegans

Hydra vulgarisH d li tiH d li

yy

Bolocera tuediaegg

Platygyr

Periphylla periphyllaAb l ib l

Physalia physalisH d i idi id i idi

Nematostella vectensis

94

ella fifibulata

uberittoides

callisttes vastus

Hyaloonema populiferum

nslanndica

agellaa nuxlli tl t

raa carnosa

Fig. 3. Reconstructed maximum-likelihood topology of metazoan relation-ships inferred with dataset 10. Maximum likelihood and Bayesian topologiesinferred with other datasets (Fig. 2) have identical basal branching patterns(Figs. S1–S5). Nodes are supported with 100% bootstrap support unless oth-erwise noted. Support, as inferred from each dataset (Fig. 2), for nodes cov-ered by black boxes are in Table 1.

Whelan et al. PNAS | May 5, 2015 | vol. 112 | no. 18 | 5775

EVOLU

TION

Ctenophores

Whelan et al., PNAS, 2015

strongly supported Porifera-sister instead (Fig. 1 A–C). In otherwords, under the better-fitting site-heterogeneous model, cteno-phores emerge as sister to all other animals only when the mostdistantly related outgroup, Fungi, is included, suggesting Cteno-phora-sister most likely represents a long-branch attraction artifact.Repeating the analyses under CAT-GTR also gave preliminarysupport for Porifera-sister, but we were unable to run this analysisto convergence within the time frame of this study (Fig. S1D).

Analysis of the Moroz et al. Phylogenomic Datasets. In the Pleuro-brachia bachei genome study (5), the Ctenophora-sister hy-pothesis was obtained from the analysis of two datasets, one ofwhich was constructed to maximize the number of species and theother to maximize the number of proteins. Whereas the datasetemphasizing protein sampling was broadly comparable to thedataset of Ryan et al. (4), the dataset emphasizing species sampling(Moroz-3D; Methods) was unique because it included the largestnumber of ctenophores sampled thus far. Given that the sameauthors have now assembled new datasets (6) that supersede theprotein-rich datasets of Moroz et al. (5) (discussed in the nextsection), we only analyzed the species-rich dataset Moroz-3D.The analysis of Moroz et al. (5) was conducted under the site-

homogeneous Whelan and Goldman (WAG) model (20), whichgave a tree congruent with the Ctenophora-sister hypothesis,albeit with weak statistical support. However, analyzing theMoroz-3D dataset using the similar but generally better-fittingsite-homogeneous Le and Gascuel (LG) model (44), we found adifferent tree with a better likelihood score (Fig. S2A). This treeunited demosponges and glass sponges as the sister group of allother animals, followed by ctenophores and then by calcareousand homoscleromorph sponges. Although statistical support for

this branching order is very low (Fig. S2A), the same is true forthe tree found by Moroz et al. (5). Finally, an analysis of thisdataset using the better-fitting site-heterogeneous CAT-GTRmodel (45) supported demosponges, glass sponges, and homo-scleromorphs as the sister group of all other animals, followed byctenophores. However, in this tree, the calcareous sponges aredeeply nested within cnidarians (Fig. S2B), and, furthermore,this analysis did not converge. The high dissimilarity betweenthese three trees and the uniformly low support obtained acrossall analyses suggest the phylogenetic signal in this dataset is veryweak. This weakness of signal might, among other factors, be re-lated to massive amounts of missing data, which reach 98% for thecalcareous sponges, the most unstable lineage in this dataset.Furthermore, Moroz et al. (5) reported that using a subset of theirdata consisting only of the most conserved proteins, they wereunable to resolve relationships of the major animal lineages andcould not reject Porifera-sister with statistical tests. Accordingly, weconclude the Moroz-3D dataset does not provide sufficient signalfor resolving the position of Ctenophora.

Analysis of the Whelan et al. Phylogenomic Datasets. Whelan et al.(6) assembled 25 datasets differing in protein and species selec-tion, and recovered Ctenophora-sister with strong support from allof them. Although they pointed out the importance of using site-heterogeneous substitution models, as well as the impact of out-group composition, they did not examine the combined effect ofthese factors. That is, all of the outgroup-subsampled datasetswere analyzed exclusively using site-homogeneous substitutionmodels, whereas the analyses using the better-fitting site-heterogeneousmodel were exclusively performed using the full set of outgroups, whichincluded distantly related Fungi.

0.98

0.88

0.99

0.77

0.99

0.3 0.3

0.99

0.99

0.98

Porifera

Ctenophora

Cnidaria

Bilateria

Choanoflagellata

Placozoa

Demospongiae

Homoscleromorpha

Calcarea

Hexactinellida

A B

Fig. 1. (A) Phylogeny inferred from Ryan-Choano (4) using the site-heterogeneous CAT model. (B) Phylogeny inferred from Whelan-D16-Choano (6) usingthe site-heterogeneous CAT-GTR model. For both analyses, we used the site-heterogeneous model implemented by the original study and limited theoutgroups to include only choanoflagellates (the closest living relatives of animals) (details and justifications are provided in Addressing Biases in PhylogeneticReconstruction and Methods). Major groups are summarized, and full phylogenies illustrated are in Figs. S1 and S4C. Nodes with maximal statistical supportare marked with a circle. Most silhouettes from organisms are from Phylopic (phylopic.org/).

15404 | www.pnas.org/cgi/doi/10.1073/pnas.1518127112 Pisani et al.

Pisani et al., PNAS, 2015

Who are the earliest diverging animals?

Sponges

Page 3: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Evidence for Systematic Error (Bias)Backbone Tree for All Birds

modern birds have been notoriously diffi-cult to resolve2–4. The problem for evolution-ary ornithologists is that ancient divergences over short periods are exceedingly difficult to tease apart, and this has limited their ability to make robust inferences about early bird evo-lution. Prum and colleagues use a genomic sequen cing technique called anchored hybrid enrichment5 to sample highly conserved (slowly evolving) regions of the genome and faster-evolving flanking regions that together are particularly well suited to teasing apart rapid, but ancient, radiations.

Prum and colleagues’ phylogeny differs dramatically from another analysis reported last year, by Jarvis et al.2, of an exceptionally large data set of more than 40 million base pairs of nucleo tide sequence data from 48 avian genomes. Not surprisingly, the conflict is focused on the earliest branching events that separate major non-passerine taxa (Fig. 1). For example, a clade including hum-mingbirds, swifts and nightjars is shown to be sister to the rest of the Neoaves — a clade that includes all living bird species except for Pal-aeognathae (such as ostriches and kiwis), Gal-liformes (landfowl) and Anseriformes (ducks and geese) — rather than sister to grebes and flamingos. And an entirely new clade, called Aequorlitornithes, is identified with strong support and consists of the majority of Neo-avian groups of waterbirds.

Why does the topology of some parts of the two trees differ so fundamentally despite the use of exceptionally large genomic data sets in both studies? One possibility is that an explosion of speciation after the Cretaceous–Palaeo gene extinction saw all major lineages of birds branch off near-simultaneously. Indeed, the early diversification of birds may have been so rapid that it resembles a network, or bush, rather than a beautifully bifurcating tree of life. Recent support for this idea comes from the finding6 that a process called incomplete lineage sorting (ILS) was rampant when the major lineages of birds diversified. The effect of ILS is that different parts of the genome yield different evolutionary relationships and produce a pattern akin to tangled roots, rather than a tree. ILS is usually identified only in recent species radiations, but large genomic data sets allow for detailed tests that delve deeper into the evolutionary past. Although the difference between the phylo genies could simply result from the two data sets having sampled different parts of the genome that happen to be incongruent as a result of ILS, this seems unlikely given the vast amount of genomic data sequenced in each study.

Instead, Prum et al. explore an alternative and perhaps more likely cause for discrepan-cies between phylogenetic hypotheses — a phenomenon called long-branch attraction. Long-branch attraction occurs when distant evolutionary relatives are incorrectly inferred to be close relatives; this can arise if evolution

has proceeded at an exceptionally high rate, or when one lineage has no close relatives. Prum et al. deal with long-branch attraction by sam-pling both deeply (the number of nucleotides sequenced) and broadly (the number of spe-cies). This strategy is rooted in well-established systematic theory, which shows that sampling more species can break long branches. When the 198-species data set of Prum et al. is reduced to include only the 48 species in Jarvis and colleagues’ data, Prum and colleagues’ phylogeny breaks down because many rela-tionships change fundamentally. The breadth of sampling is critical.

A well-resolved phylogeny is the basis of robust dating. Congruence between dates from molecular phylogenies and the fossil record is a rare thing, and for birds it is likely to prove controversial7,8. Prum and colleagues’ conclu-sion of an explosive radiation after the Creta-ceous–Palaeogene mass extinction is markedly different from the conclusions of many previ-ous molecular studies, which typically suggest that most major avian orders and many fami-lies originated further back in the Cretaceous period (for examples, see refs 9–11).

Although genomic-scale data can add pre-cision to dating estimates, accuracy relies on the quality of the fossil material and our ability to place it in the correct evolutionary context. Phylogenetic trees are calibrated using the fossil record with all its inherent imperfec-tions. But the fossil record of birds is patchy, incomplete and geographically biased. For most divergences no fossil evidence is avail-able, and for others the fossil record probably

underestimates the true age of origination, because the fossils discovered for a particu-lar group are likely to be younger than the age of divergence. The age of the root of the avian tree is particularly contentious, with different calibrations placing the explosive radiation of Neoaves either before7 or after8 the Cretaceous–Palaeogene mass extinction. In the absence of a perfect fossil record, the best we can do is experiment with different cali-bration dates and levels of uncertainty around those dates. The new genomic data sets await extensive experimentation of this type, and so, as compelling as the historical narrative may be, it is perhaps for now best treated with hopeful caution. ■

Gavin H. Thomas is in the Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, UK.e-mail: [email protected] 1. Prum, R. O. et al. Nature 526, 569–573 (2015).2. Jarvis, E. D. et al. Science 346, 1320–1331 (2014).3. Hackett, S. J. et al. Science 320, 1763–1768 (2008).4. McCormack, J. E. et al. PLoS ONE 8, e54848 (2013).5. Lemmon, A. R., Emme, S. & Lemmon, E. C. System.

Biol. 61, 721–744 (2012).6. Suh, A., Smeds, L. & Ellegren, H. PLoS Biol. 13,

e1002224 (2015).7. Mitchell, K. J., Cooper, A. & Phillips, M. J. Science

349, 1460 (2015).8. Cracraft, J. et al. Science 349, 1460 (2015).9. Jetz, W., Thomas, G. H., Joy, J. B., Hartmann, K. &

Mooers, A. O. Nature 491, 444–448 (2012).10. Pacheco, M. A. et al. Mol. Biol. Evol. 28, 1927–1942

(2011).11. Brown, J. W., Rest, J. S., García-Moreno, J., Sorenson, M. D.

& Mindell, D. P. BMC Biol. 6, 6 (2008).

This article was published online on 7 October 2015.

Prum et al. Jarvis et al.

Hummingbirds, swifts and nightjars

Turacos

Bustards

Cuckoos

Sandgrouse

Mesites

Pigeons

Cranes

Flamingos and grebes

Waders

Tropicbirds and sunbitterns

Penguins

Tubenoses

Pelicans

Cormorants

Herons

Hoatzins

Owls to woodpeckers

Seriemas

Songbirds, parrots and falcons

Eagles and New World vultures

Divers

Ibises

Figure 1 | A comparison of avian phylogenies. The phylogenetic relationships presented by Prum et al.1

and Jarvis et al.2 have here been distilled down to major bird lineages. The comparison reveals several differences between the postulated evolutionary relationships of taxa.

2 2 O C T O B E R 2 0 1 5 | V O L 5 2 6 | N A T U R E | 5 1 7

NEWS & VIEWS RESEARCH

© 2015 Macmillan Publishers Limited. All rights reserved

Thomas 2015.

These are enormous datasets, yet they

conflict strongly for early divergences.

Page 4: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Evidence for Systematic Error (Bias)

These are two high-profile examples, but there are many others (we’ll talk about turtles later).

When conflict is this strong, stochastic error is not a plausible explanation.

Data is no longer limiting. We are now limited by our ability to accurately extract information from the data.

Page 5: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

The Standard Approach

{3.4, 2.1, 5.4, ...}

(1) Collect Data (D)

Page 6: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

The Standard Approach

{3.4, 2.1, 5.4, ...}

(1) Collect Data (D) (2) Define ModelsM1

M2

M3

D

D

D

Page 7: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

The Standard Approach

{3.4, 2.1, 5.4, ...}

(1) Collect Data (D) (2) Define ModelsM1

M2

M3

D

D

D

(3) Fit ModelsM1

M2

M3

D

D

D

: L1

: L2

: L3

Page 8: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

The Standard Approach

{3.4, 2.1, 5.4, ...}

(1) Collect Data (D) (2) Define ModelsM1

M2

M3

D

D

D

(3) Fit ModelsM1

M2

M3

D

D

D

: L1

: L2

: L3

(4) Compare Models &Choose “Best”

AICBICBFLRT

L1

L2

L3}M2 > M1 > M3

Page 9: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

The Standard Approach

{3.4, 2.1, 5.4, ...}

(1) Collect Data (D) (2) Define ModelsM1

M2

M3

D

D

D

(3) Fit ModelsM1

M2

M3

D

D

D

: L1

: L2

: L3

(4) Compare Models &Choose “Best”

AICBICBFLRT

L1

L2

L3}M2 > M1 > M3

(5) Report Inferences from “Best” Model

NatureThe Shapes of our DataIntro

2012

Methods

Results

Blah blah blah blah. Blah blah blah blah blah blah. Blah blah blah important blah blah blah.Science blah blah blah blah blah.

Blah blah blah blah. Blah blah blah liklihood blah 3 models. Blah blah blah AIC blah blah blah.Science blah blah blah blah.

M2 !Conclusions

Both squares and circles are important.

Page 10: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

The Standard Approach

{3.4, 2.1, 5.4, ...}

(1) Collect Data (D) (2) Define ModelsM1

M2

M3

D

D

D

(3) Fit ModelsM1

M2

M3

D

D

D

: L1

: L2

: L3

(4) Compare Models &Choose “Best”

AICBICBFLRT

L1

L2

L3}M2 > M1 > M3

(5) Report Inferences from “Best” Model

NatureThe Shapes of our DataIntro

2012

Methods

Results

Blah blah blah blah. Blah blah blah blah blah blah. Blah blah blah important blah blah blah.Science blah blah blah blah blah.

Blah blah blah blah. Blah blah blah liklihood blah 3 models. Blah blah blah AIC blah blah blah.Science blah blah blah blah.

M2 !Conclusions

Both squares and circles are important.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

The guinea-pig is not a rodentD Erchia, Anna Maria;Gissi, Carmela;Pesole, Graziano;Saccone, Cecilia;Arnason, UlfurNature; Jun 13, 1996; 381, 6583; Agricultural & Environmental Science Databasepg. 597

Page 11: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

The Next Step - Assessing Fit

{3.4, 2.1, 5.4, ...}

(2) Define ModelsM1

M2

M3

(1) Collect Data (D)D

D

D

(3) Fit ModelsM1

M2

M3

D

D

D

: L1

: L2

: L3

(4) Compare Models &Choose “Best”

AICBICBFDTLRT

L1

L2

L3}M2 > M1 > M3

(5) Report Inferences from “Best” Model

NatureThe Shapes of our DataIntro

2012

Methods

Results

Blah blah blah blah. Blah blah blah blah blah blah. Blah blah blah important blah blah blah.Science blah blah blah blah blah.

Blah blah blah blah. Blah blah blah liklihood blah 3 models. Blah blah blah AIC blah blah blah.Science blah blah blah blah.

M2 !Conclusions

Both squares and circles are important.

We know that none of our models is really true. Can we be sure that the chosen model captures the salient features of the evolutionary process and provides reliable inferences?

Page 12: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

The Next Step - Assessing Fit

{3.4, 2.1, 5.4, ...}

(2) Define ModelsM1

M2

M3

(1) Collect Data (D)D

D

D

(3) Fit ModelsM1

M2

M3

D

D

D

: L1

: L2

: L3

(4) Compare Models &Choose “Best”

AICBICBFDTLRT

L1

L2

L3}M2 > M1 > M3

(5) Report Inferences from “Best” Model

NatureThe Shapes of our DataIntro

2012

Methods

Results

Blah blah blah blah. Blah blah blah blah blah blah. Blah blah blah important blah blah blah.Science blah blah blah blah blah.

Blah blah blah blah. Blah blah blah liklihood blah 3 models. Blah blah blah AIC blah blah blah.Science blah blah blah blah.

M2 !Conclusions

Both squares and circles are important.

We know that none of our models is really true. Can we be sure that the chosen model captures the salient features of the evolutionary process and provides reliable inferences?

(6) Check Fit of Model to Data

M2

}D*1D*2D*3D*4D*5

D

Page 13: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

The Next Step - Assessing Fit

{3.4, 2.1, 5.4, ...}

(2) Define ModelsM1

M2

M3

(1) Collect Data (D)D

D

D

(3) Fit ModelsM1

M2

M3

D

D

D

: L1

: L2

: L3

(4) Compare Models &Choose “Best”

AICBICBFDTLRT

L1

L2

L3}M2 > M1 > M3

(5) Report Inferences from “Best” Model

NatureThe Shapes of our DataIntro

2012

Methods

Results

Blah blah blah blah. Blah blah blah blah blah blah. Blah blah blah important blah blah blah.Science blah blah blah blah blah.

Blah blah blah blah. Blah blah blah liklihood blah 3 models. Blah blah blah AIC blah blah blah.Science blah blah blah blah.

M2 !Conclusions

Both squares and circles are important.

(6) Check Fit of Model to Data

M2

}D*1D*2D*3D*4D*5

D

Journal of Mammalian Evolution, Vol. 4, No. 2, 1997

Are Guinea Pigs Rodents? The Importance of AdequateModels in Molecular Phylogenetics

Jack Sullivan1'2 and David L. Swofford1

The monophyly of Rodentia has repeatedly been challenged based on several studies of molec-ular sequence data. Most recently, D'Erchia et al. (1996) analyzed complete mtDNA sequencesof 16 mammals and concluded that rodents are not monophyletic. We have reanalyzed thesedata using maximum-likelihood methods. We use two methods to test tor significance of dif-ferences among alternative topologies and show that (1) models that incorporate variation inevolutionary rates across sites fit the data dramatically better than models used in the originalanalyses, (2) the mtDNA data fail to refute rodent monophyly, and (3) the original interpretationof strong support for nonmonophyly results from systematic error associated with an oversim-plified model of sequence evolution. These analyses illustrate the importance of incorporatingrecent theoretical advances into molecular phylogenetic analyses, especially when results ofthese analyses conflict with classical hypotheses of relationships.

KEY WORDS: inconsistency; maximum likelihood; molecular systematics; rodents; rate het-erogeneity.

INTRODUCTION

The assertions made in several molecular phylogenetic studies (Graur et al., 1991;Li et al., 1992; Ma et al., 1993) have led to the growing acceptance of the conclusionthat the order Rodentia is not monophyletic, in spite of the facts that these data setsessentially provide no significant refutation of the classical hypothesis (e.g., Hasegawaet al., 1992; Cao et al., 1994), and other molecular studies actually support rodentmonophyly (Martignetti and Brosius, 1993; Porter et al., 1996). Recently, D'Erchia etal. (1996) suggested that their phylogenetic analyses of complete mtDNA sequences of16 mammalian species firmly establish that the guinea pig is not a rodent, based on itsplacement as a sister taxon to a clade containing Lagomorpha, Carnivore, Primates,Perissodactyla, and Artiodactyla (including cetaceans), rather than in a clade with mouseand rat. They claim that this placement both is consistent across phylogenetic reconstruc-tion methodologies and is supported by "very significant" bootstrap values. Becausenonmonophyly of the rodents would imply a remarkable amount of convergence in mor-

1 Laboratory of Molecular Systematics, MSC, Smithsonian Institution, MRC-534, Washington, DC, 20560.2To whom correspondence should be addressed at Department of Biological Sciences, University of Idaho,Moscow, Idaho 83844.

77

l064-7554/97/0600-0077$I2.50;0 © 1997 Plenum Publishing Corporation

Page 14: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

How might we assess fit?(1) Use our prior knowledge to ask if the data are reasonable.

(2) Use our prior knowledge to ask if inferences are reasonable.

Above are “gut checks”. Very useful, but perhaps subjective. Also difficult to have strong priors for complicated data and models.

(3) Use your data (all or part) to make a prediction and see if your prediction matches what you’ve seen.

(Posterior Prediction and Cross Validation)

Page 15: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Posterior Prediction

Could have come from ?P ( , ✓| )

Could the model and priors plausibly have given rise to the data?

Page 16: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Posterior Prediction

θ θ θ θ θ θ θ θ θ 1

1 2 3 4 5 6 7 8 9

2 3 4 5 6 7 8 9

P("""""",θ|###""")#

1

2

3

4

5

6

7

8

9

Page 17: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Posterior Prediction

1

2

3

4

5

6

7

8

9

Page 18: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Posterior Prediction

T( ) T( )

Good Model

Poor Model

1

2

3

4

5

6

7

8

9 T( ) T( ) T( ) T( )

T( ) T( ) T( )

T( )

Page 19: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Posterior Prediction

Previously proposed statistics based on the data:

Multinomial Likelihood (based on frequencies of site patterns) Number of Unique Site Patterns Frequency of Invariant Sites Heterogeneity of Base Frequencies Number of parsimony-inferred “parallel” sites

Page 20: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Posterior Prediction

“We do not like to ask, ‘Is our model true or false?’, since most probability models in most analyses will not be perfectly true...

The more relevant question is, ‘Do the model’s deficiencies have a noticeable effect on the substantive inferences?’ “

- Gelman, Carlin, Stern, and RubinBayesian Data Analysis

Page 21: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Posterior Prediction

What about using the inferences provided by our data as a test statistic(s)?

Page 22: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Posterior PredictionP("""""|###""")#

P("""""|###""")##…# P("""""|###""")##…#

1

2

3

4

5

6

7

8

9

1 5 9 P("""""|###""")#

Page 23: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Topology Test Statistics

Tree Space Tree Space

Page 24: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Topology Test Statistics

Tree Space Tree Space

Page 25: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Topology Test Statistics

Tree Space Tree Space

Frequency

05

1015

2025

3035

Frequency

02

46

810

12

Tree-to-Tree Distance Tree-to-Tree Distance

Page 26: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Topology Test StatisticsP

oste

rior P

roba

bilit

y

1

0.5

0 P

oste

rior P

roba

bilit

y

1

0.5

0

Higher Entropy Lower Entropy

1 2 3 1 2 3

Page 27: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Branch-Specific Test StatisticsP

oste

rior P

roba

bilit

y

1

0.5

0 P

oste

rior P

roba

bilit

y

1

0.5

0

Higher Entropy Lower Entropy

AB|CD No AB|CD AB|CD No AB|CD

(not yet in RevBayes)

Page 28: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Branch-Specific Test Statistics

Species B

Species A

Species E

0.78

0.99

0.95

0.89Species C

Species D

Species F

Posterior Predictive p-values

0 10.5

(not yet in RevBayes)

Page 29: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Branch-length Test Statistics

Marginalizing across topologies

Mean Tree Length = 3.15 Variance in Tree Length = 2.30

Page 30: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Motivating Results - Simulation

Complex

P("""""|###""")####�Correct�#Posteriors#

P("""""|###""")#

Complex Simple

Complex Simple

Model Adequacy P-value

1x 10x 50x 50 Each

Page 31: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Motivating Results - Simulation

Topological Error (True - Incorrect)

Mea

n P

-val

ue

-20% -10% 0% 10% 20%

00.

20.

40.

60.

81

0/150

0/6

4/64

15/3218/25

8/12 7/7 1/1 2/2

-40% -20% 0% 20% 40%Tree-length Error (True - Incorrect)

6/6 44/44

0/50

48/50

TopologyTree Length

Reliable

Unreliable

Page 32: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Motivating Results - Empirical

microbewiki.kenyon.eduphylopic.org

Yeast343 orthologs

18 taxa Hess & Goldman (2011)

Amniotes1,145 orthologs

10 taxa Crawford et al. (2012)

Yeast!343!orthologs!

18!taxa!Hess!&!Goldman!(2011)!

Amniotes!1145!UCEs!10!taxa!

Crawford,!Faircloth,!McCormack,!!Brumfield,!Winker!&!Glenn!(2012)!

!

Lowest!!10%!

Highest!!10%!

DECILE 134 Genes

DECILE 334 Genes

DECILE 234 Genes

DECILE 434 Genes

DECILE 534 Genes

DECILE 634 Genes

DECILE 734 Genes

DECILE 834 Genes

DECILE 934 Genes

DECILE 1034 Genes

Doyle et al., 2015, Syst Biol

Page 33: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Motivating Results - Empirical

Tree$Distance$(to$a$reference$tree$or$to$each$other)$

Freq

uency$ DECILE 1

34 GenesDECILE 334 Genes

DECILE 234 Genes

DECILE 434 Genes

DECILE 534 Genes

DECILE 634 Genes

DECILE 734 Genes

DECILE 834 Genes

DECILE 934 Genes

DECILE 1034 Genes

1"

DECILE 134 Genes

DECILE 334 Genes

DECILE 234 Genes

DECILE 434 Genes

DECILE 534 Genes

DECILE 634 Genes

DECILE 734 Genes

DECILE 834 Genes

DECILE 934 Genes

DECILE 1034 Genes

2"

DECILE 134 Genes

DECILE 334 Genes

DECILE 234 Genes

DECILE 434 Genes

DECILE 534 Genes

DECILE 634 Genes

DECILE 734 Genes

DECILE 834 Genes

DECILE 934 Genes

DECILE 1034 Genes

3"

DECILE 134 Genes

DECILE 334 Genes

DECILE 234 Genes

DECILE 434 Genes

DECILE 534 Genes

DECILE 634 Genes

DECILE 734 Genes

DECILE 834 Genes

DECILE 934 Genes

DECILE 1034 Genes

4"

DECILE 134 Genes

DECILE 334 Genes

DECILE 234 Genes

DECILE 434 Genes

DECILE 534 Genes

DECILE 634 Genes

DECILE 734 Genes

DECILE 834 Genes

DECILE 934 Genes

DECILE 1034 Genes

5"

DECILE 134 Genes

DECILE 334 Genes

DECILE 234 Genes

DECILE 434 Genes

DECILE 534 Genes

DECILE 634 Genes

DECILE 734 Genes

DECILE 834 Genes

DECILE 934 Genes

DECILE 1034 Genes

6"

DECILE 134 Genes

DECILE 334 Genes

DECILE 234 Genes

DECILE 434 Genes

DECILE 534 Genes

DECILE 634 Genes

DECILE 734 Genes

DECILE 834 Genes

DECILE 934 Genes

DECILE 1034 Genes

7"

DECILE 134 Genes

DECILE 334 Genes

DECILE 234 Genes

DECILE 434 Genes

DECILE 534 Genes

DECILE 634 Genes

DECILE 734 Genes

DECILE 834 Genes

DECILE 934 Genes

DECILE 1034 Genes

8"

DECILE 134 Genes

DECILE 334 Genes

DECILE 234 Genes

DECILE 434 Genes

DECILE 534 Genes

DECILE 634 Genes

DECILE 734 Genes

DECILE 834 Genes

DECILE 934 Genes

DECILE 1034 Genes

9"

DECILE 134 Genes

DECILE 334 Genes

DECILE 234 Genes

DECILE 434 Genes

DECILE 534 Genes

DECILE 634 Genes

DECILE 734 Genes

DECILE 834 Genes

DECILE 934 Genes

DECILE 1034 Genes

10"

What"might"we"expect"from"ideal"filtering"approaches?"Perfect$associa6on$between$decile$membership$and$tree$distance$

rho$(rs)$=$1$

Doyle et al., 2015, Syst Biol

Page 34: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Motivating Results - EmpiricalFreq

uenc

y

1.0 1.2 1.4 1.6

010

0030

001 23 4

5 768 910

Freq

uenc

y

0.8 1.0 1.2 1.4 1.6

010

0030

00

1 2 3 45 6 7 8 9 10

Freq

uenc

y

1.0 1.2 1.4 1.6

010

0030

00

12 7 3 10

45 6

89

Mean%Conflic+ng%Splits%

Mean%Conflic+ng%Splits%

Mean%Conflic+ng%Splits%

Clock%Filtering%

Posterior%Predic+ve%Filtering%

Rate%Filtering%

rs=0.711,%P=0.01055%%

rs=0.964,%P=2.2x10D16%%

rs=0.482,%P=0.0793%%

Amniote(UCEs(–(Conflic0ng(with(Reference(

Doyle et al., 2015, Syst Biol

Page 35: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Motivating Results - EmpiricalMean conflicting splits

Freq

uenc

y

1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0

010

0020

0030

00 12 34 5 6 7

8 9 10

Mean conflicting splits

Freq

uenc

y

1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0

010

0020

0030

00 12 34 5 6 78910

Mean conflicting splits

Freq

uenc

y

1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0

010

0020

0030

00 12345 678 9

Mean%Conflic+ng%Splits%

Mean%Conflic+ng%Splits%

Mean%Conflic+ng%Splits%

Clock%Filtering%

Posterior%Predic+ve%Filtering%

Rate%Filtering%

rs=0.957,%P=6.9x10B6%%

rs=0.600,%P=0.03656%%

rs=0.103,%P=0.3925%%

Yeast&Orthologs&–&Conflic3ng&with&Reference&

Doyle et al., 2015, Syst Biol

Page 36: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Motivating Results - Barcodes

Barley & Thomson 2016 Mol Ecol

Barcode data2,083 alignments

269,929 sequences from ibol.org

Page 37: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Motivating Results - Barcodes

Barley & Thomson 2016 Mol Ecol

Barcode data2,083 alignments

269,929 sequences from ibol.org

Page 38: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Active Development!

Our current inference-based statistics are computationally intense (lots of MCMC). We are:

working on faster approximations for inference statistics

conducting baseline simulation studies to establish power

making the workflow easier and faster (including HPC)

Page 39: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Thoughts on Interpretation

Assessing model fit is probably most useful with big data

Not meant to be a hypothesis test. We can always reject the fit of a model in a strict sense. All models are abstractions.

Based on the aspects of our model that don’t fit well, think about how to structure new models. Remember, with RevBayes you can design your own new models!

Page 40: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

TutorialAssessing Phylogenetic Reliability Using RevBayes and P

Data and Inference versions Assess adequacy of JC and GTR on example data

3

Page 41: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia
Page 42: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Model Selection v Absolute Fit

M1 M2

M3 M4

M5

Page 43: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Model Selection v Absolute Fit

M1 M2

M3 M4

M5

Page 44: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Model Selection v Absolute Fit

M1 M2

M3 M4

M5

P ( , ✓| ,M4)

Page 45: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

Model Selection v Absolute Fit

M1 M2

M3 M4

M5

P ( , ✓| ,M4)

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

The guinea-pig is not a rodentD Erchia, Anna Maria;Gissi, Carmela;Pesole, Graziano;Saccone, Cecilia;Arnason, UlfurNature; Jun 13, 1996; 381, 6583; Agricultural & Environmental Science Databasepg. 597

Page 46: Assessing Model Fittreethinkers.org/wp-content/uploads/2019/05/Bob...Sycon ciliatum Nematostella vectensis Priapulus caudatus 96 95 61 99 65 94 79 0.2 Beroe abyssicolaa Pleurobrachia

θ θ θ θ θ θ θ θ θ 1

1 2 3 4 5 6 7 8 9

2 3 4 5 6 7 8 9

P("""""",θ|###""")#

1

2

3

4

5

6

7

8

9

Model Selection v Absolute FitM1 M2

M3 M4

M5

P ( , ✓| ,M4)