Hexapoda Origins: Monophyletic, Paraphyletic or Polyphyletic? Rob King and Matt Kretz



Outline

  Review

  Hexapod Origins

  Response to Hexapod Origins

  How the same data = different trees

  Arthropod Origins

  The Enigmatic Insect


Phylogenetics

  Seeks to explain relatedness in terms of evolutionary history

  Darwin’s Trees of Life

  Modern Tree of Life


Taxonomy Review

  Monophyletic

  Paraphyletic

  Polyphyletic

  Apomorphies

  Autapomorphies

  Synapomorphies

  Plesiomorphies

Taxonomy Review

  Kingdom: Animalia

  Phylum: Arthropoda

  Subphylum: Mandibulata

  Superclass: Hexapoda

  Class: Insecta


Methods of Reconstructing Relationships

Paleontology Morphology Developmental Biology

Comparative Morphology Molecular Phylogenetics


Hexapod Origins

  4 mitochondrial protein sequences:

    Tricholepidion gertschi (Zygentoma)

    Gomphiocephalus hodgsoni (Collembola)

    Other arthropods

  Results: Collembola (springtails) are a separate evolutionary lineage that adapted independently to life on land

    Conclusion: Hexapoda is polyphyletic


Implications

  Hexapoda now polyphyletic

  Collembolans fall outside the Crustacea + Insecta clade

  Supports Pancrustacea (Hexapods + Crustaceans)

which could potentially cause artefactual attraction (22) in this analysis. Such sequences are usually removed from phylogenetic analyses owing to their evidently incorrect placement and disturbance to the reconstruction. To recognize and exclude from the analysis those sequences whose placement in the phylogenetic tree could be influenced by such anomalies in the mechanism of evolution, rather than by the true historical process, we performed a detailed statistical test (19) to select a subset of sequences with homogeneous modes of evolution and whose rate of evolution is compatible with that of Gomphiocephalus and Tricholepidion. The placement of these two taxa is key to assessing the monophyly of the Hexapoda, so it is especially important that the taxa compared are compatible with them. The methods of analysis outlined above, applied to this reduced data set, produced the two trees shown in Fig. 2, which differ only in the placement of Ostrinia with respect to the remaining Holometabola. Again, strong support is obtained for the Pancrustacea, with Tricholepidion basal to the remaining pterygotan insects, and the two collembolans placed outside the Crustacea + Insecta clade. The trees also show monophyly of Crustacea, although with a lower level of support. Limulus is recovered as the sister group of the Pancrustacea, in contrast with the analysis based on the 35-taxon data set, but again with very low support. The resulting trees do not seem to be sensitive to the taxa included (fig. S2).

The most interesting result produced by this study is certainly the nonmonophyly of Hexapoda, that is, the position of the two collembolans outside the Crustacea + Insecta clade, agreed upon by all analyses and with high levels of support. To test the relative positioning of Crustacea, Collembola, and Insecta in more detail, we compared two alternative topologies using analytical tests. The hypothesis of Crustacea external to a monophyletic Hexapoda (here, Insecta + Collembola) is strongly rejected (Table 1) in favor of the proposed nonmonophyly of Hexapoda. We also applied the same tests to the problem of the basal trichotomy between Chelicerata, Myriapoda, and Pancrustacea. A sister group relationship between Pancrustacea and Myriapoda (= Mandibulata) is strongly rejected (Table 1), and no significant difference in support was found for the other two possible hypotheses. This accords with the low levels of support found in all trees at this node.

It has been generally accepted that the taxon Hexapoda, including the basal apterygotan orders, is monophyletic. This conclusion is strongly supported by similarities in their body organization (composed of head, thorax, and abdomen), as well as other morphological characters including eye and leg structure and the absence of limbs in one of the cephalic segments (9). On the other hand, the interpretation of such characters also depends on which is the closest relative of the Hexapoda, and even on the basal splitting of the latter taxon (9). Nevertheless, apterygotan taxa, including Collembola, show a number of peculiar features that at least complicate the analysis of their affinities with the Insecta sensu stricto (9, 23) and leave some room to question these affinities altogether. The acceptance of nonmonophyly of Hexapoda implies that the tripartite and six-legged body plan typical of Hexapoda would be a convergent acquisition of collembolans and the “true insects.”

Fig. 1. Maximum-likelihood [ProtML (24)] phylogenetic reconstruction, complete data set. Numerals at each node show local bootstrap probability values. Branch lengths are drawn proportionally to maximum-likelihood estimates.

Fig. 2. Maximum-likelihood [ProtML (24) and MrBayes (25)] phylogenetic reconstructions, reduced data set. Alternative placement of Ostrinia follows MrBayes reconstruction. Numerals above each node show local bootstrap probability values (ProtML), and numerals below each node indicate posterior probabilities (MrBayes). Branch lengths are proportionate to maximum-likelihood estimates produced by ProtML.

Table 1. Tests of significance for competing hypotheses. Statistical tests of significance were conducted for different competing phylogenetic hypotheses within Pancrustacea and within arthropod classes. au, approximately unbiased test; kh, Kishino-Hasegawa test; sh, Shimodaira-Hasegawa test (26).

Tree                                        -ln L (ProtML)   au      kh      sh
(Collembola, (Crustacea, Insecta))          19723.73         0.991   0.979   0.979   Best
(Crustacea, (Collembola, Insecta))          19744.96         0.009   0.021   0.021
(Myriapoda, (Chelicerata, Pancrustacea))    19723.73         0.509   0.496   0.649   Best
((Myriapoda, Chelicerata), Pancrustacea)    19723.97         0.509   0.504   0.626
(Chelicerata, (Myriapoda, Pancrustacea))    19739.90         0.006   0.032   0.084
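The first two topologies in Table 1 differ in whether Hexapoda (here, Insecta + Collembola) forms a clade. A minimal sketch of that check follows, assuming rooted trees written as nested Python tuples. The function names `leaves` and `is_monophyletic` are made up for this illustration, and the sketch only inspects tree shape; it is not the au/kh/sh testing used in the paper.

```python
# Illustrative sketch only: checks tree shape, not statistical support.
# Trees are assumed to be rooted and written as nested tuples of taxon names.

def leaves(tree):
    """Return the set of taxon names found under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def is_monophyletic(tree, group):
    """True if some subtree (clade) contains exactly the taxa in `group`."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# The two competing topologies from the first block of Table 1.
nardi_best = ("Collembola", ("Crustacea", "Insecta"))
alternative = ("Crustacea", ("Collembola", "Insecta"))
hexapoda = {"Collembola", "Insecta"}

print(is_monophyletic(nardi_best, hexapoda))   # False
print(is_monophyletic(alternative, hexapoda))  # True
```

On the best-scoring topology the hexapod taxa do not form a clade, which is the nonmonophyly result discussed in the text; on the rejected alternative they do.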

REPORTS

21 MARCH 2003 VOL 299 SCIENCE www.sciencemag.org 1888


Interesting Outcomes

  What’s odd about this tree?

  Artifacts:

    Apis mellifera (honey bee)

    Heterodoxus (dog louse)


Long Branch Attraction

  A phenomenon in which rapidly evolving lineages are inferred to be closely related, regardless of their true evolutionary relationships

  How are the results interpreted?

  Homoplasy seen as ___________? Answer: synapomorphy


Reduced Data Set

  What are the costs and benefits of a reduced data set?

  15 vs. 35 taxa?


Technical Comment on Hexapod Origins

  Same four mitochondrial genes (cox1, cox2, cox3, cytb), analyzed as nucleotides rather than amino acids

  Different tree

  What’s new?

Comment on “Hexapod Origins: Monophyletic or Paraphyletic?”

Nardi et al. (1) suggested, rather cautiously, that hexapods (insects plus collembolans in their data set) might be a diphyletic rather than a monophyletic group. According to their interpretation, collembolans evolved separately from other insects and emerged before crustaceans. This unexpected result has huge consequences for the interpretation of both morphological and developmental evolution in arthropods (2) and therefore deserves further scrutiny, especially from a methodological standpoint.

Nardi et al. drew their conclusions from maximum likelihood and Bayesian analyses at the amino acid level of four of the 13 mitochondrial proteins for both the original 35-taxon data set and a 15-taxon subset. However, phylogenetic analyses of amino acids carry several potential caveats. First, the currently available models of mitochondrial amino acid substitution are based on empirically deduced matrices from mammalian-dominated sequence databases. Second, the maximum likelihood analysis used in (1) does not model the variation of rate across sites, which is known to be one of the most important parameters of the likelihood model (3). Third, bias in nucleotide composition also affects the amino acid composition of the gene product, thereby causing potential problems for phylogenetic reconstruction (4).

Some of these pitfalls might be avoided by analyzing nucleotide sequences for which more realistic models of sequence evolution and powerful reconstruction methods are available. In particular, we have recently shown that in the case of mammalian complete mitochondrial genomes (5), it is possible to deal with saturation and base composition heterogeneity by recoding nucleotides as purines (R) and pyrimidines (Y). This approach provided a solution to long-standing controversies concerning the position of the root of the mammalian tree (5).

Applying this strategy to nucleotides from the original Nardi et al. data set strongly suggests that by correcting for different artefacts it is possible to extract a useful historical signal. Unlike Nardi et al., we were able to place the honeybee (Apis) and louse (Heterodoxus) within Insecta (Fig. 1). The artefactual position of these taxa as sister-groups of ticks in (1) was explained as being a consequence of high shared AT nucleotide composition in the mitochondrial genome sequences. From our results, base composition heterogeneity seems to be more easily accommodated in phylogenetic reconstructions using nucleotides. More importantly, our analysis conforms to classical views of arthropod phylogeny: Collembolans are a sister group of insects, and these monophyletic hexapods group with crustaceans into Pancrustacea (Fig. 1). One remaining problem with this tree concerns the paraphyly of myriapods induced by the nesting of the centipede (Lithobius) inside chelicerates.

As noted in (1), the phylogenetic analysis performed on the 35-taxon data set indicates uneven rates of evolution among taxa, making it difficult to draw firm conclusions about relationships between lineages. To test the collembolan position further, Nardi et al. reduced the data set to 15 taxa with more homogeneous evolution rates and amino acid compositions. Despite their conservative analysis, they still reported collembolans outside both insects and crustaceans, rendering hexapods diphyletic. However, such a reduced data set is particularly prone to systematic biases from low taxon sampling (6). Although deleting taxa with anomalous rates and base composition can be helpful, care must be taken not to delete taxa that could leave isolated branches and lead to a “long branches attract” phenomenon (7). More specifically, the inclusion of a single outgroup can have a strong impact on phylogenetic reconstruction, even in the absence of rate heterogeneity (8). In the case of

[Fig. 1 tree image. Scale bar: 0.1. Taxa, in extraction order: Katharina, Loligo, Albinaria, Lumbricus, Platynereis, Narceus, Thyropygus, Ixodes, Rhipicephalus, Limulus, Lithobius, Artemia, Daphnia, Pagurus, Penaeus, Gomphiocephalus, Tetrodontophora, Tricholepidion, Locusta, Triatoma, Crioceris, Tribolium, Apis, Heterodoxus, Bombyx mandarina, Bombyx mori, Ostrinia furnacalis, Ostrinia nubilalis, Anopheles quadrimaculatus, Anopheles gambiae, Chrysomya, Cochliomyia, Ceratitis, Drosophila yakuba, Drosophila melanogaster. Clade labels: ARTHROPODA, PANCRUSTACEA, HEXAPODA, INSECTA, PTERYGOTA. Group labels: Diptera, Lepidoptera, Coleoptera, Hymenoptera, Phthiraptera, Hemiptera, Orthoptera, Zygentoma, COLLEMBOLA, CRUSTACEA, MYRIAPODA, CHELICERATA.]

Fig. 1. Bayesian 50% majority rule consensus tree with associated branch lengths obtained using nucleotide sequences of cox1, cox2, cox3, and cytb (3750 sites) corresponding to the 35-taxon data set in (1). The first and third codon positions were RY-coded, whereas second codon positions were kept as nucleotides. MrBayes version 3.0b4 (12) was used to perform a partitioned-likelihood Bayesian search in which three independent substitution models were attributed to each codon position: a two-state substitution model + I + Γ for RY-coded first and third codon positions, and a GTR + I + Γ model for second codon position nucleotides. Four incrementally heated Metropolis-coupled Markov chain Monte Carlo (MCMCMC) were run for 500,000 generations, sampling trees and parameters every 10 generations. The consensus tree was obtained from the 35,000 trees sampled after the initial burn-in period. Values at nodes indicate Bayesian posterior probabilities (* = 1.00). Note that the terminal branch lengths leading to the bee (Apis) and louse (Heterodoxus) have been reduced by a factor of three. Underlined taxa are not included in the 25-taxon data set.
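The caption gives the run length, the sampling interval, and the number of trees kept, but not the burn-in itself. The arithmetic below derives it from those three figures. It assumes the starting tree is not counted as a sample and that all trees come from a single run, neither of which the caption states; the variable names are illustrative.

```python
# Values taken from the Fig. 1 caption.
generations = 500_000   # MCMC generations run
sample_every = 10       # sampling interval, in generations
kept_trees = 35_000     # trees used for the consensus

# Derived quantities (assumption: the initial state is not counted as a sample).
total_samples = generations // sample_every         # trees sampled in total
burnin_samples = total_samples - kept_trees         # trees discarded as burn-in
burnin_generations = burnin_samples * sample_every  # burn-in length in generations
burnin_fraction = burnin_samples / total_samples    # share of the run discarded

print(total_samples, burnin_samples, burnin_generations, burnin_fraction)
```

Under these assumptions the burn-in works out to 15,000 sampled trees, that is, the first 150,000 generations or 30% of the run.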

TECHNICAL COMMENT

www.sciencemag.org SCIENCE VOL 301 12 SEPTEMBER 2003 1482d


Technical Comment on Hexapod Origins

  25-taxon reduced data set

  Why is this tree better than Nardi’s 15-taxon tree?

placental mammal mitogenomics, taxon sampling has been shown to be a major source of phylogenetic error (9), and we found that increasing the number and diversity of taxa produced excellent agreement between nuclear and mitochondrial sequence data (10).

To maximize taxon sampling, we constructed a well-balanced 25-taxon data set designed to break isolated long branches (especially in the outgroup) without adding strong rate heterogeneity. Phylogenetic analyses of this nucleotide data set, including RY-coded third codon positions, produced a tree in which Arthropoda, Pancrustacea, Hexapoda, Insecta, and Pterygota all appear as monophyletic groups, though with variable support (Fig. 2). Moreover, this topology is much more compatible with current views of arthropod phylogeny (11). The probability of randomly selecting a topology compatible with this prior hypothesis is so small (10) that it provides strong evidence in favor of its veracity. Obviously, additional complete mitochondrial genomes are needed to strengthen the tree further. However, with the data and methods currently available, the hypothesis of a common ancestry for extant hexapods cannot be rejected.

Frederic Delsuc
Matthew J. Phillips
David Penny
The Allan Wilson Center for Molecular Ecology and Evolution
Institute of Molecular BioSciences, Science Tower D
Massey University
Post Office Box 11-222
Palmerston North, New Zealand
E-mail: [email protected]

References and Notes

1. F. Nardi et al., Science 299, 1887 (2003).
2. R. H. Thomas, Science 299, 1854 (2003).
3. J. Sullivan, D. L. Swofford, Syst. Biol. 50, 723 (2001).
4. P. G. Foster, D. A. Hickey, J. Mol. Evol. 48, 284 (1999).
5. M. J. Phillips, D. Penny, Mol. Phylogenet. Evol. 28, 171 (2003).
6. D. J. Zwickl, D. M. Hillis, Syst. Biol. 51, 588 (2002).
7. M. D. Hendy, D. Penny, Syst. Zool. 38, 297 (1989).
8. B. R. Holland, D. Penny, M. D. Hendy, Syst. Biol. 52, 229 (2003).
9. H. Philippe, J. Mol. Evol. 45, 712 (1997).
10. Y.-H. Lin et al., Mol. Biol. Evol. 19, 2060 (2002).
11. G. Giribet, G. D. Edgecombe, W. C. Wheeler, Nature 413, 157 (2001).
12. F. Ronquist, J. P. Huelsenbeck, Bioinformatics 19, 1572 (2003).
13. D. L. Swofford, PAUP* version 4.0b10 (Sinauer Associates, Sunderland, MA, 2002).
14. D. Posada, K. A. Crandall, Bioinformatics 14, 817 (1998).
15. F. Nardi and colleagues kindly sent us their amino acid data set. E. Douzery provided helpful comments. Our data sets are available at http://awcmee.massey.ac.nz/downloads.htm. This work was supported by a Lavoisier Postdoctoral Grant from the French Ministry of Foreign Affairs to F.D. and by the New Zealand Marsden Fund.

7 May 2003; accepted 15 August 2003

[Fig. 2 tree image. Scale bar: 0.1. Taxa, in extraction order: Katharina, Loligo, Lumbricus, Platynereis, Limulus, Narceus, Thyropygus, Pagurus, Penaeus, Daphnia, Artemia, Gomphiocephalus, Tetrodontophora, Tricholepidion, Triatoma, Crioceris, Tribolium, Locusta, Anopheles, Ostrinia, Bombyx, Ceratitis, Drosophila, Chrysomya, Cochliomyia. Group labels: ANNELIDA, MOLLUSCA, Myriapoda, Chelicerata, Crustacea. Clade labels: ARTHROPODA, PANCRUSTACEA, HEXAPODA, INSECTA, PTERYGOTA.]

Fig. 2. Maximum likelihood (ML) phylogram obtained using nucleotide sequences of cox1, cox2, cox3, and cytb for a 25-taxon data set (3777 sites). The third codon positions were RY-coded, whereas first and second codon positions were kept as nucleotides. PAUP* (13) was used to perform a ML heuristic search under the best fitting GTR + I + Γ model and associated ML estimates of parameters as determined by Modeltest version 3.06 (14). A partitioned-likelihood Bayesian search was carried out with MrBayes (12) using a GTR + I + Γ model for first and second codon position nucleotides and a two-state substitution model + I + Γ for the RY-coded third codon positions, with the same parameter settings as in Fig. 1. Values at nodes indicate ML bootstrap proportions (100 replications)/Bayesian posterior probabilities. The two collembolans are figured in bold.


Nardi vs. Delsuc

  How did Delsuc get a different tree from Nardi using the same data?

  What are the implications of phylogenetic analysis of amino acids?

  Empirically deduced matrices from mammalian-dominated sequence databases

  Does not model the variation of rate across sites

  Bias in nucleotide composition


Variation and Position in Codon Sites

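  RY recoding: a computer program replaces each nucleotide at the chosen codon positions (here, the third) with R for a purine (A or G) or Y for a pyrimidine (C or T), which reduces the effect of saturation and base-composition bias.

  Why the third codon position?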
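RY recoding can be sketched in a few lines. The version below assumes the sequence is in frame, starting at codon position 1, and leaves gaps and ambiguity codes untouched. The names `RY_MAP` and `ry_recode` are made up for this illustration and are not taken from any program used in the papers.

```python
# Purines (A, G) become R; pyrimidines (C, T, and U for RNA) become Y.
RY_MAP = {"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"}


def ry_recode(seq, positions=(3,)):
    """Recode the given codon positions (1-based) as R/Y; leave the rest unchanged.

    Assumes `seq` is in frame, so index 0 is codon position 1.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        codon_pos = i % 3 + 1
        if codon_pos in positions:
            # Gaps and ambiguity codes are passed through as-is.
            out.append(RY_MAP.get(base, base))
        else:
            out.append(base)
    return "".join(out)


print(ry_recode("ATGGCATTC"))                    # ATRGCRTTY
print(ry_recode("ATGGCATTC", positions=(1, 3)))  # RTRRCRYTY
```

The default recodes third positions only, as in the 25-taxon analysis; `positions=(1, 3)` matches the coding described for the 35-taxon tree.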

Page 22

Bias in Nucleotide Composition

  Some taxa share a high AT bias
    Apis (84%)
    Rhipicephalus (78%)
    Heterodoxus (79%)

  Why would there be a high AT nucleotide concentration within the mitochondrial genome?
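AT percentages like those listed above are simple to compute from a sequence. The sketch below is only an illustration: the function name `at_content` and the toy sequences are hypothetical and are not real mitochondrial data.

```python
def at_content(seq: str) -> float:
    """Fraction of A and T among unambiguous nucleotides (A, C, G, T) in seq."""
    seq = seq.upper()
    counted = sum(seq.count(b) for b in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous nucleotides")
    return (seq.count("A") + seq.count("T")) / counted


# Hypothetical toy sequences, for illustration only.
toy = {"taxon_AT_rich": "ATTATAATTAAT", "taxon_balanced": "ATGCGCATGCTA"}
for name, seq in toy.items():
    print(f"{name}: {at_content(seq):.0%}")
```

When unrelated taxa share a similarly extreme AT content, a tree-building method can group them because of composition rather than shared ancestry, which is one reason to compare this value across taxa before analysis.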

Page 23

Arthropod Relationships Revealed

  Used 41 kilobases of DNA from 62 single-copy nuclear protein-coding genes (wow)
    75 arthropods
    Mandibulata (Myriapoda + Hexapoda + Crustacea)
    Subphylum

  Results:
    Hexapods nest within Mandibulata and are monophyletic
    Pancrustacea is monophyletic
    Crustacea is paraphyletic
    Entognatha recovered as sister group to Ectognatha

Page 24

Hexapoda in Mandibulata

Page 25

Monophyly & Paraphyly

Hexapoda = monophyletic

Pancrustacea = monophyletic

Crustacea = paraphyletic
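The difference between these labels can be checked mechanically on a rooted tree: a group is monophyletic if some node has exactly that group as its set of tips. The sketch below uses a deliberately simplified, hypothetical topology written as nested tuples. It is not the published tree, and the names `leaves`, `is_monophyletic`, and `toy_tree` are illustrative only. It tests monophyly alone and does not distinguish paraphyly from polyphyly.

```python
def leaves(node):
    """Return the set of tip labels below node (a label string or a tuple of children)."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def is_monophyletic(tree, group):
    """True if some node in the rooted tree has exactly `group` as its tip set."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    # Only descend into children that still contain the whole group.
    return any(is_monophyletic(child, group) for child in tree if group <= leaves(child))


# Hypothetical, simplified topology in which hexapods nest inside the crustaceans.
toy_tree = ("Chelicerata",
            ("Myriapoda",
             ("Ostracoda",
              ("Branchiopoda",
               ("Collembola", "Insecta")))))

hexapoda = {"Collembola", "Insecta"}
crustacea = {"Ostracoda", "Branchiopoda"}
pancrustacea = hexapoda | crustacea

print("Hexapoda monophyletic:", is_monophyletic(toy_tree, hexapoda))
print("Pancrustacea monophyletic:", is_monophyletic(toy_tree, pancrustacea))
print("Crustacea monophyletic:", is_monophyletic(toy_tree, crustacea))
```

In this toy tree Crustacea fails the test because the smallest clade containing both crustacean tips also contains the hexapods, which is the pattern the slide calls paraphyletic.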

Page 26

Entognatha - Ectognatha

Page 27

The Enigmatic Insect

Page 28

Strepsiptera

  “strepsi” = twisted, “ptera” = wing: twisted-winged parasites

  10 families with 600 species

  Endoparasitic in bees, wasps, leafhoppers, silverfish, and cockroaches

Page 29

Classifying the Enigma

Page 30

Possibility 1

[Cladogram figure. Tip labels: Neuroptera, Megaloptera, Raphidioptera, Coleoptera, Diptera, Mecoptera, Siphonaptera, Trichoptera, Lepidoptera, Hymenoptera]

Page 31

Possibility 2

[Cladogram figure. Tip labels: Neuroptera, Megaloptera, Raphidioptera, Diptera, Mecoptera, Siphonaptera, Trichoptera, Lepidoptera, Hymenoptera, Coleoptera]

Page 32

Possibility 3

[Cladogram figure. Tip labels: Hymenoptera, Lepidoptera, Trichoptera, Siphonaptera, Mecoptera, Diptera, Coleoptera, Raphidioptera, Megaloptera, Neuroptera, Hemiptera, Thysanoptera, Phthiraptera, Psocoptera. Clade labels: Neoptera, Endopterygota, Paraneoptera]

Page 33

What the scientists think…

Whiting et al. 1997

Page 34

What the scientists think…

Wiegmann et al. 2009

Page 35

[Figure panels: Possibility 1, Possibility 2, Possibility 3]

Page 36

Characters:

Possibility 1 – Sister group to Coleoptera
Similar characteristics
  1. Absence of thoracic notopleural suture (groove on side of prothorax)
  2. Modification of forewings into elytra
  3. Posteromotorism (hind wing powered flight)

Possibility 2 – Sister group to Diptera
Similar characteristics
  1. 18S rDNA support; placement of halteres on the forewing as a possible homeotic mutation
  2. Modification of forewings into halteres

Possibility 3 – Sister group to the holometabolous insects
Similar characteristics
  1. Exopterygota development (external wing bud formation)
  2. Possession of larval stemmata (lateral ocelli)