Marsupial herbivore evolution and the failure of morphological algorithmic phylogenetics
Centre for Macroevolution & Macroecology, Research School of Biology, Australian National University
New Guinea forest wallaby
“Phylogenetics is concerned with the problem of reconstructing the past evolutionary history of extant organisms from present day molecular data”
– Phylomania 2010 website
Darwin’s (the first?) phylogenetic tree
Horse evolution & Macroevolutionary theory, e.g. Cope’s rule
Molecular data:
Invaluable for phylogenetic inference
Morphological studies had left us with 1.99 × 10^21 possible relationships among the 29 orders
Molecular studies now leave us with ≈405 possible relationships
Phillips & Penny (2010)
Molecular data:
Is molecular phylogeny above the species level a pursuit of diminishing returns (for theoreticians)?
Remaining uncertainty involves lineage sorting: genomic retroposons better than species-tree methods for assigning ancestry
In either case, the interesting question of individual gene ancestry is defeated by stochastic error
99 mya (83-116 HPD)
Cretaceous Period
21 kb of nuclear genes for 57 marsupial & placental mammals
BEAST relaxed clock (lognormal dist. branch rates), 13 FR calibration priors – unconstrained, 20 lineages originate in the Cretaceous
Work with Kate Loynes
Loynes & Phillips (in prep)
Rates of DNA substitution (subs/site/Ma) on individual branches
Dark blue: unconstrained
Light blue: 4 placental lineages in Cretaceous
Red: no placentals cross into Cretaceous
Cretaceous Tertiary
82 mya (73-93 HPD)
Cretaceous Period
21 kb of nuclear genes for 57 marsupial & placental mammals
BEAST relaxed clock (as previous) – now constraining ≤4 placental lineages to originate in the Cretaceous
Values shown on the tree figure: 92, 65, 86, 86, 67
Cut-out section of the placental mammal tree, with putative relationships of fossils from close to or before the K/T boundary
More fossils confidently assigned to branches on the modern tree could immediately solve the K/T boundary problem
For the overall evolutionary timescale, this also reduces reliance on assumptions about how rates vary among branches
Kulbeckia
All these fossils may be stem placentals
Meredith et al. (MPE, 2009)
Foraging height
Ancestral state reconstruction
Arboreality inferred at all deep nodes
Include 5 extinct sub-families
But megafaunal extinction was biased towards large/terrestrial
Palorchestes
Lineage through time analysis
Penny & Phillips (Nature, 2006)
Null hypothesis of constant net diversification (speciation minus extinction): the plot of ln(accumulated branching events) against time is linear
“Pull of the recent” peaks
Turnover associated with recent biotic/abiotic events overwrites more ancient signals
[Plot axes: x = Million years ago; y = Ln(accumulated branching events)]
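The linear null expectation can be sketched as follows. This is an illustration only, not the analysis from Penny & Phillips (2006): the function names, the net diversification rate of 0.05 per Ma and the 60 Ma root age are all hypothetical. The example node ages are built so that the lineage count grows exactly exponentially, which is what makes ln(lineages) a straight line.

```python
import math

def ltt_points(branching_times_mya):
    """Return (age, ln(lineages)) points for a lineage-through-time plot.

    branching_times_mya: ages (Ma) of the internal nodes of a dated,
    bifurcating tree. The root split gives 2 lineages; each later
    split adds one more.
    """
    ordered = sorted(branching_times_mya, reverse=True)  # oldest first
    return [(age, math.log(i + 2)) for i, age in enumerate(ordered)]

def least_squares_slope(points):
    """Slope of ln(lineages) against time elapsed since the root."""
    root_age = points[0][0]
    xs = [root_age - age for age, _ in points]
    ys = [y for _, y in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical node ages: lineage count k is reached ln(k/2)/rate Ma after the root.
RATE = 0.05      # assumed net diversification rate per Ma
ROOT_AGE = 60.0  # assumed root age in Ma
example_ages = [ROOT_AGE - math.log(k / 2) / RATE for k in range(2, 11)]
example_slope = least_squares_slope(ltt_points(example_ages))
```

Under these assumptions the fitted slope recovers the assumed rate. Real trees depart from the line, for example through the recent peaks described above.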
Marsupial divergence times
Hurdles for morphological phylogenetics: progress is being made in some areas
• Long branch attraction – A serious problem when MP is standard
ML models (e.g. Mk or Mkv of Lewis (Syst Biol, 2001)) outperform MP
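For reference, the Mk model treats a character with k states as changing between any two states at an equal rate. The sketch below gives its transition probabilities in the standard equal-rates form, assuming branch length t is measured in expected changes per character; the function name is hypothetical. Mkv additionally conditions the likelihood on only variable characters having been scored, which is not shown here.

```python
import math

def mk_transition_prob(k, t, same_state):
    """Mk transition probability over a branch of length t.

    k: number of character states (k >= 2)
    t: branch length in expected changes per character (assumed unit)
    same_state: True for P(i -> i), False for P(i -> j) with j != i
    """
    decay = math.exp(-k * t / (k - 1))
    if same_state:
        return 1.0 / k + (k - 1) / k * decay
    return 1.0 / k - decay / k

# With k = 4 this has the same form as the Jukes-Cantor model for nucleotides.
p_stay = mk_transition_prob(4, 0.1, same_state=True)
p_change = mk_transition_prob(4, 0.1, same_state=False)
```

Because multiple changes along a long branch are modelled explicitly, two long branches sharing a state by chance are not automatically read as evidence of relationship, which is the weakness of MP noted above.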
[Histogram of frequency against trait score, with regions labelled State 1 and State 2]
If no clear pattern or unimodal, exclude or score as constant
Other problems include:
• Developmental correlations (e.g. upper/lower molars)
• Outgroup attraction of ecological long branches (e.g. turtles)
• Objectivity in character state discrimination
Functional/ecological correlations
Emu
Pigeon
Chicken
Ducks
Not really three characters providing a strong phylogenetic signal
Evolutionarily non-independent, associated with parenting strategy
• Babies cute/ugly
• Wing development slow/rapid
• Leg development rapid/slow
Galah
Marsupials arrived in Australasia 55-70 mya from S.America, via Antarctica
Diprotodontia “Polyprotodontia”
Microbiotheria
Diprotodontia: The most ecologically diverse mammal order
Diprotodon optatum ~2500 kg
Thylacoleo carnifex 110 kg
Terrestrial herbivores, arboreal insectivores and a multitude of niches in between
Diprotodontia: 10 extant families (≈ 120 species)
Phascolarctidae = koala (Arboreal folivores)
Vombatidae = wombats (Burrowing grazers)
Burramyidae = pygmy possums (Mostly terrestrial to mostly arboreal granivores and generalized omnivores)
Macropodidae = kangaroos and potoroos (Bipedal hopping browsers/grazers and semi-fossorial root/fungi feeders)
Hypsiprymnodontidae = musky rat-kangaroo (Terrestrial, bounding frugivore-omnivore)
Tarsipedidae = honey possum (Arboreal nectarivore)
Acrobatidae = feathertail possums (Gliding/arboreal omnivores)
Pseudocheiridae = ringtail possums (Arboreal folivores)
Phalangeridae = brushtail possums and cuscuses (Scansorial to arboreal frugivores-folivores)
Petauridae = gliders and trioks (Gliding gummivores and arboreal insectivores)
Diprotodontian consensus phylogeny: Cardillo et al. (J. Zool, 2004)
Vombatidae (wombats)
Phascolarctidae (koala)
Burramyidae (pygmy possums)
Pseudocheiridae (ringtail possums)
Phalangeridae (cuscuses and brushtail possums)
Tarsipedidae (honey possum)
Petauridae (gliders, striped possums)
Macropodidae (kangaroos and potoroos)
Acrobatidae (feathertail possums)
Vombatiformes
“Core” Petauroidea
Hypsiprymnodontidae (musky rat-kangaroo)
Macropodoidea
Phillips and Pratt (MPE, 2008): mitochondrial (mt) genomes
Beck (J. Mammalogy, 2008): several mt & nuclear genes
Meredith et al. (MPE, 2009): 5 nuclear genes
Vombatidae
Phascolarctidae
Pseudocheiridae
Burramyidae
Phalangeridae
Macropodidae
Acrobatidae
Tarsipedidae
Petauridae
Hypsiprymnodontidae
Molecular “supermatrix”: 26 marsupials, 20,654 nucleotides
Complete mt genome protein/RNA coding sequences & 5 nuclear genes (RAG1, BRCA1, IRBP, vWF, APOB)
• Analysed as 13 separately modelled process partitions
• Mitochondrial protein 3rd codons RY-coded to reduce saturation and compositional non-stationarity
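RY-coding collapses the purines (A, G) to R and the pyrimidines (C, T) to Y, so that only transversions remain visible at the recoded sites. A minimal sketch for the third codon positions of an in-frame coding sequence follows; the function name is hypothetical, and gaps or ambiguity codes are simply left unchanged here.

```python
RY_MAP = {"A": "R", "G": "R", "C": "Y", "T": "Y"}

def ry_code_third_positions(coding_seq):
    """RY-code every third codon position of an in-frame coding sequence."""
    recoded = []
    for index, base in enumerate(coding_seq.upper()):
        if index % 3 == 2:
            # Third codon position: purine -> R, pyrimidine -> Y.
            recoded.append(RY_MAP.get(base, base))
        else:
            # First and second positions are left as nucleotides.
            recoded.append(base)
    return "".join(recoded)

example_recoded = ry_code_third_positions("ATGCCATTC")
```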
wombats
koala
pygmy possums
ringtail possums
honey possum
gliders
kangaroos
feathertail possums
musky rat-kangaroo
cuscuses
bandicoots
marsupial mole
dasyurids
All nodes MrBayes BPP = 1.00 and RAxML BP > 95% (except where noted)
0.97 / 72
Diprotodontia
“Polyprotodontia”
Previous work on the family-level phylogeny of Diprotodontia
Algorithmic morphology (MP): MRP supertree summary
Informal-comparative morphology: MRP supertree
Mt sequence analyses: MRP supertree summary
Albumin M’CF: Baverstock et al. 1990 (review)
Single nuclear genes: MRP supertree summary
DNA hybridization: Kirsch et al. 1997 (review)
Algorithmic morphology morphol352 (MP): morphol352
Algorithmic morphology morphol352 (ML, Bayesian): morphol352
Differences between informal-comparative and algorithmic morphology
Algorithmic
• Selection criterion: MP, ML etc.
• Character analysis: Homology, otherwise biology-free
• Hypothesis testing: Many and varied (inc. bootstrap)
Informal-comparative
• Selection criterion: vague
• Character analysis: Homology, untangling functional/developmental correlation from phylogenetic signal
• Hypothesis testing: Non-statistical
How do these data / methods perform?
One test would be whether or not they reject the molecular consensus, but this is not helpful: hypothesis testing is difficult with distance methods like DNA hybridization and impossible with informal-comparative morphology
Alternative: Likelihood disadvantage on the 20,654 nucleotide molecular matrix for a fairer comparison of data / methods
Example:
–lnL(consensus) = 121,316.3
–lnL(DNA hybridization tree) = 121,438.2
lnL disadvantage = 121.9
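The arithmetic in this example is the difference between the two negative log-likelihoods, both evaluated on the same molecular matrix. A minimal sketch using the two values quoted above (the function and variable names are hypothetical):

```python
def lnl_disadvantage(neg_lnl_consensus, neg_lnl_alternative):
    """Log-likelihood units by which the alternative tree fits worse than the consensus."""
    return round(neg_lnl_alternative - neg_lnl_consensus, 1)

neg_lnl_consensus = 121316.3          # -lnL(consensus)
neg_lnl_dna_hybridization = 121438.2  # -lnL(DNA hybridization tree)

dna_hybridization_disadvantage = lnl_disadvantage(
    neg_lnl_consensus, neg_lnl_dna_hybridization
)
```

A larger value means the tree proposed by that data source or method is a poorer explanation of the molecular data.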
Likelihood disadvantages (lnL units on the 20,654 nucleotide molecular matrix)
Algorithmic morphology (MP): 690.1
Informal-comparative morphology: 71.7
Mt sequence analyses: 84.4
Albumin M’CF: 182.4
Single nuclear genes: 96.0
DNA hybridization: 121.9
Algorithmic morphology morphol352 (MP/ML): 594.6
Algorithmic morphology morphol352 (Bayes): 617.5
Do the algorithmic analyses just suffer from stochastic blindness?
• Scaled the molecular-dated marsupial tree to the treelength of the morphol352 ML tree
• Simulated a 60,000 character “pseudomorphological” dataset in Seq-gen (JC, equivalent to Mk4), then drew 1000 bootstrap replicates of 352 characters each (Sim352)
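The resampling step can be sketched as below, under one reading of the slide: each of the 1000 replicates is 352 characters drawn at random from the 60,000 simulated characters. Whether the draw was with or without replacement is not stated; sampling with replacement is an assumption here, and the function and parameter names are hypothetical. The toy matrix stands in for the Seq-gen output.

```python
import random

def subsample_characters(matrix, n_chars, n_reps, seed=0):
    """Draw replicate datasets of n_chars columns from a character matrix.

    matrix: dict mapping taxon name -> string of character states,
            all strings the same length.
    Columns are sampled with replacement (an assumption), and the same
    columns are taken for every taxon so characters stay aligned.
    """
    rng = random.Random(seed)
    n_columns = len(next(iter(matrix.values())))
    replicates = []
    for _ in range(n_reps):
        columns = [rng.randrange(n_columns) for _ in range(n_chars)]
        replicates.append(
            {taxon: "".join(states[c] for c in columns)
             for taxon, states in matrix.items()}
        )
    return replicates

# Toy stand-in for the simulated matrix (two taxa, four characters).
toy_matrix = {"taxonA": "0123", "taxonB": "1230"}
toy_replicates = subsample_characters(toy_matrix, n_chars=3, n_reps=5)
```

For the scale described on the slide the call would use n_chars=352 and n_reps=1000 on the full simulated matrix.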
Vombatidae
Phascolarctidae
Pseudocheiridae
Burramyidae
Phalangeridae
Macropodidae
Acrobatidae
Tarsipedidae
Petauridae
Hypsiprymnodontidae
Node support values shown on the tree: 95, 100, 87, 100, 73, 61, 53, 74
*
5 outgroup taxa
Vombatidae
Phascolarctidae
Pseudocheiridae
Burramyidae
Phalangeridae
Macropodidae
Acrobatidae
Tarsipedidae
Petauridae
Molecular consensus
Phascolarctidae
Pseudoch’idae
Phalangeridae
Acrobatidae
Tarsipedidae
Vombatidae
Burramyidae
Macropodidae
Petauridae
Algorithmic morphology
Can we mimic the real morphological data by combining molecular phylogenetic and ecological signals?
Vombatidae
Phascolarctidae
Pseudocheiridae
Burramyidae
Phalangeridae
Macropodidae
Acrobatidae
Tarsipedidae
Petauridae
5 outgroup taxa
60,000 characters
Sim352
phylogenetic signal as per the molecular dated tree, scaled to morphol352 treelength
Size (ordered states)
0 = <50 g
1 = 50-200 g
2 = 200-800 g
3 = 800 g-3 kg
4 = 3-12 kg
5 = >12 kg
Diet (ordered states)
0 = herb
1 = sub-herb
2 = omniv
3 = sub-carn
4 = carn
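The two ecological characters can be scored mechanically, as in the sketch below. The bin edges and diet labels are taken from the slide. The slide does not say which state a mass exactly on a boundary receives, so assigning it to the upper bin is an assumption, and the function names are hypothetical. For ordered states, a change between states i and j costs |i - j| parsimony steps.

```python
from bisect import bisect_right

# Upper edges (grams) of size states 0-4; anything above the last edge is state 5.
SIZE_BOUNDS_G = [50, 200, 800, 3000, 12000]

DIET_STATES = {"herb": 0, "sub-herb": 1, "omniv": 2, "sub-carn": 3, "carn": 4}

def size_state(mass_g):
    """Ordered size state (0-5) for a body mass in grams.

    A mass exactly on a bin edge goes to the upper bin (assumption).
    """
    return bisect_right(SIZE_BOUNDS_G, mass_g)

def ordered_steps(state_a, state_b):
    """Parsimony cost of a change between two states of an ordered character."""
    return abs(state_a - state_b)

# Example: a 10 g taxon versus a 2500 kg taxon differ by five steps in size.
size_steps_example = ordered_steps(size_state(10), size_state(2_500_000))
```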
[Plot: % MP tree length disadvantage (y axis) against % ecological contribution to MP tree length (x axis)]
Optimum fit to the molecular consensus tree (0% ecological contribution)
Optimum fit to the algorithmic morphology tree (9% ecol. cont.)
Vombatidae
Phascolarctidae
Pseudocheiridae
Burramyidae
Phalangeridae
Macropodidae
Acrobatidae
Tarsipedidae
Petauridae
Molecular consensus
Phylogenetic randomization test P-value = 0.00016
Phylo-ecol sim
Phascolarctidae
Pseudoch’idae
Phalangeridae
Acrobatidae
Tarsipedidae
Vombatidae
Burramyidae
Macropodidae
Petauridae
Algorithmic morphology
Tempting next move: Reverse engineered phylogeny
If the algorithmic morphology (morphol352) data is effectively 91% phylogenetic signal, 9% ecological signal … what if we subtract the 9% ecological signal from the observed signal?
Tree lengths on: Alg. Morphol. tree | “True” tree
a. MP on morphol352: 741 steps | 782 steps
b. MP on diet+size: 14 steps | 26 steps
c. Ave. over 9% “true” TL: 61.8 steps | 114.7 steps
d. Rev Eng Phylogeny (a-c): 679.2 steps | 667.3 steps
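The subtraction in the last row can be checked directly. The sketch below uses only the step counts for rows a and c quoted above; the dictionary and function names are hypothetical, and how row c is derived from row b is not reproduced here.

```python
# Row a: MP tree length of morphol352 on each candidate tree.
observed_steps = {"algorithmic": 741.0, "true": 782.0}
# Row c: steps attributed to ecological signal on each candidate tree.
ecological_steps = {"algorithmic": 61.8, "true": 114.7}

def reverse_engineered_length(observed, ecological):
    """Tree length left after subtracting the ecological contribution (a - c)."""
    return round(observed - ecological, 1)

corrected_steps = {
    tree: reverse_engineered_length(observed_steps[tree], ecological_steps[tree])
    for tree in observed_steps
}

# Under parsimony the shorter tree is preferred.
preferred_before = min(observed_steps, key=observed_steps.get)
preferred_after = min(corrected_steps, key=corrected_steps.get)
```

Before the correction the algorithmic morphology tree is shorter (741 vs 782 steps); after it the “true” tree is shorter (667.3 vs 679.2 steps).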
Improvements
Co-inferring the relative weightings of the ecological correlates simultaneously with the relative apparent contributions of phylogenetic and ecological signal
Searching tree space for the reverse engineered phylogeny - current phylogenetic programs are well set up for addition of log-likelihoods (e.g. for partitioned data), but not for subtraction
Molecular tree is employed in the discrimination of apparent phylogenetic and ecological signals - so has some influence on the reverse engineered phylogeny.
However, the ultimate aim here is the placement of fossils. The correction for ecological signal (inferred with extant taxa) can be employed for fossil taxa, independent of their DNA
Eomaia
Microraptor
>99% of all species are extinct
Their fossils provide the only direct evidence for answering many key questions in macroecology and macroevolution, and for calibrating molecular timescales
Acknowledgements
• Kate Loynes (ANU, PhD student)
• Emily Lake (ANU, Honours student)
• Australian Research Council