Could agreement among gene trees be used as evidence of common ancestry?
Jessica Clarke and Flor Rodriguez
March 21st, 2006
Arguments for common ancestry
the genetic code is a “frozen accident”
when life first arises it alters the environment so as to make subsequent start-ups much less probable
species with a common ancestor are more likely to exhibit congruence in character state patterns than species that originated separately
Hypothesis of common ancestry by Penny et al. (1982)
Prediction: orthologous genes should lead to similar trees because they are expected to share the same evolutionary history
developed an algorithm guaranteed to find all minimal-length trees
implemented a tree-comparison metric to measure closeness
calculated the expected distribution of this metric
Conclusion: the theory of evolution leads to quantitative predictions that are testable and falsifiable
Measuring the difference
T1-T11 complete data set
T12-T17 cytochrome c
T18 fibrinopeptide A
T19-T26 fibrinopeptide B
T27-T32 haemoglobin
T33-T39 haemoglobin
Measuring the difference
The symmetric difference metric on two trees counts the number of edges that occur in one, but not both, trees
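The edge-counting idea can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by Penny et al.: it assumes trees are written as nested tuples of taxon names (a hypothetical input format), treats each internal edge as the split of the taxa it induces, and counts the splits found in exactly one of the two trees. The function names `splits` and `symmetric_difference` are made up for this example.

```python
def splits(tree):
    """Return the non-trivial splits (internal edges) of a tree.

    `tree` is a nested tuple of taxon names, an assumed input format.
    Each split is stored as the side that does NOT contain a fixed
    reference taxon, so the same edge always gets the same label.
    """
    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        return frozenset().union(*(leaves(child) for child in node))

    all_taxa = leaves(tree)
    reference = min(all_taxa)
    found = set()

    def walk(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        side = clade if reference not in clade else all_taxa - clade
        # Skip the root and edges leading to a single taxon.
        if 1 < len(side) < len(all_taxa) - 1:
            found.add(side)
        for child in node:
            walk(child)

    walk(tree)
    return found


def symmetric_difference(tree_a, tree_b):
    """Count the edges that occur in one tree but not in both."""
    return len(splits(tree_a) ^ splits(tree_b))


# Two hypothetical five-taxon trees that disagree about one grouping.
tree_1 = (("A", "B"), ("C", ("D", "E")))
tree_2 = (("A", "C"), ("B", ("D", "E")))
distance = symmetric_difference(tree_1, tree_2)
```

Both example trees share the edge separating D and E from the rest, and each has one edge the other lacks, so only the disagreeing edges contribute to the count.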
Critique by Sober and Steel 2002
Common ancestry might be untestable
$I(X, Z) \le 4k \max_i \{ n_i e^{-4 r t_i} \}$
$I(X, Y) \le n e^{-4rt}$
n = 1000, t = 20 million years, r = 1 in 2 million years
n = 100, t = 20 million years, r = 1 in 100 million years
K = 10 000K = 10 000
No method can infer X from Y with a probability that is any better than simply ignoring Y and blindly guessing X
No method can reliably determine from this data how these four groups are related historically
Long ages of time might have erased the pertinent evidence
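To see how quickly the two-sequence bound $n e^{-4rt}$ shrinks, the arithmetic can be checked directly. The sketch below makes two assumptions that the slide does not state: that the first parameter set (n = 1000, t = 20 million years, r = 1 in 2 million years) belongs to the bound on I(X, Y), and that r is a per-site rate per year. The function name is hypothetical.

```python
import math


def information_bound(n_sites, rate_per_year, years):
    """Upper bound n * exp(-4 r t) on I(X, Y), as written on the slide."""
    return n_sites * math.exp(-4.0 * rate_per_year * years)


# Assumed pairing of parameters with the I(X, Y) bound (see lead-in).
bound = information_bound(n_sites=1000, rate_per_year=1 / 2e6, years=20e6)
print(f"bound on I(X, Y): {bound:.2e}")
```

With these values the exponent is -4rt = -40, so the bound is on the order of 10^-15: under this reading, Y carries essentially no information about X.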
Response from Penny et al. 2003
Methods of tree construction based on parsimony assume common ancestry
Methods other than parsimony can be used, and should be favored if they give more consistent results when analyzing and comparing different data sets
Response from Penny et al. 2003
The hypothesis of common ancestry (CA) might be untestable
Some alternatives to the theory of common ancestry can be formulated, tested and rejected
The theory of influenza viruses from outer space
The theory that every species was created separately (ID)
Influenza viruses continue to arrive from outer space via comets
Hoyle and Wickramasinghe 1984, 1986
under the theory of descent a linear tree is expected
if each epidemic was carried on different comets, a correlation between their order of arrival and their phylogeny is not expected
Test 1:
Probability of sequences occurring on a linear tree in the same order as the year of appearance
$P < 10^{-6}$ that the linear tree (observed order) occurs by chance
The theory of descent was not rejected
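The slide does not say how the probability in Test 1 was computed or how many sequences were involved. As a rough intuition only, one can assume a null model in which every ordering of the sequences along the tree is equally likely, so that a single specific order has probability 1/n!. The sketch below uses that assumed null model (it is not the published test) to find how many sequences are needed before a chance match drops below a given threshold. Both function names are hypothetical.

```python
import math


def chance_of_exact_order(n_sequences):
    """Probability of one specific order under a uniform random ordering.

    Assumed null model for illustration only.
    """
    return 1 / math.factorial(n_sequences)


def smallest_n_below(threshold):
    """Smallest number of sequences whose chance match is below threshold."""
    n = 1
    while chance_of_exact_order(n) >= threshold:
        n += 1
    return n


needed = smallest_n_below(1e-6)
```

Under this simple model, ten sequences appearing in exactly the order of their years of isolation is already enough to fall below one in a million.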
Influenza viruses from space
Test 2:
Steiner tree (binary tree) was not rejected
It is not necessary that all possible alternatives to a model MUST be rejected simultaneously
[Figure: a binary tree and a star tree, each with tips t1, t2, t3, t4; label: 1 in 1064]
Intelligent design
Theory of descent vs. theory of individual creation
Example: photosynthetic enzymes from plants living in hot-dry environments and those living in a moist-temperate lawn
correct prediction
Theory of descent leads to testable predictions
Agreement Between Gene Trees
Evidence for common descent... or NOT?
History of Life
3.5 byo - oldest prokaryotic fossils
1.7 byo - oldest eukaryotic fossils
545-525 myo - Cambrian explosion
475 myo - first land plants
400 myo - origin of vascular tissue
300 myo - origin of seed plants
130 myo - origin of flowering plants
Campbell, 1999
Main Sources
www.talkorigins.org
www.trueorigins.org
Main Arguments
Trees do not match
Design not ruled out
Evolution is not falsifiable
Molecules do not evolve according to predictions
Predictions Violated
Common ancestry predicts agreement among trees.
Trees do not agree perfectly.
Therefore, the common ancestry claim is rejected.
Response
$N_R = (2n-3)!! = \frac{(2n-3)!}{2^{n-2}(n-2)!}$
Number of Taxa          Number of Possible Trees
Rooted    Unrooted
2         3             1
3         4             3
4         5             15
5         6             105
6         7             945
7         8             10,395
8         9             135,135
9         10            2,027,025
10        11            34,459,425
20        21            ≈ 8,200,794,532,637,890,000,000
Theobald, 2006
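The tree counts follow from the double factorial (2n-3)!!, which is easy to check numerically. The sketch below is only a verification aid with hypothetical function names: it multiplies the odd numbers up to 2n-3 for rooted trees, uses the standard fact that an unrooted tree on n taxa corresponds to a rooted tree on n-1 taxa, and cross-checks against the closed form with factorials.

```python
import math


def rooted_tree_count(n_taxa):
    """Number of rooted bifurcating trees on n taxa: (2n-3)!!."""
    if n_taxa < 2:
        raise ValueError("need at least two taxa")
    count = 1
    for odd in range(3, 2 * n_taxa - 2, 2):  # 3, 5, ..., 2n-3
        count *= odd
    return count


def unrooted_tree_count(n_taxa):
    """Unrooted trees on n taxa equal rooted trees on n-1 taxa."""
    if n_taxa < 3:
        raise ValueError("need at least three taxa")
    return rooted_tree_count(n_taxa - 1)


def rooted_tree_count_closed_form(n_taxa):
    """Same count via (2n-3)! / (2^(n-2) * (n-2)!)."""
    return math.factorial(2 * n_taxa - 3) // (
        2 ** (n_taxa - 2) * math.factorial(n_taxa - 2)
    )


for n in (2, 3, 4, 5, 10, 20):
    print(n, n + 1, f"{rooted_tree_count(n):,}")
```

Python integers have arbitrary precision, so the count for 20 taxa is printed exactly rather than rounded.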
Design Not Rejected
Anatomy and biochemistry are not independent.
Organisms that are similar anatomically are similar biochemically, and vice versa.
Thus, gene agreement could reflect design.
Brand 1997
Response
There is no biological reason, besides common descent, that similar morphologies should have similar biochemistry.
Besides, we can use neutral genes, and genes with vastly different functions, to construct trees.
Theobald, 2006
Not Falsifiable / Not Science
Evolutionary predictions are shown false
Evolution is not falsified.
Thus, evolution is not falsifiable, and is not science.
Possible examples
horizontal transfer
hybridization
Predictions Violated
Evolution predicts that divergence between lineages is proportional to evolutionary distance (constant rate of evolution).
The number of bp changes between lineages does not match predictions
Therefore, claim is false (& molecular data are bunk).
Camp, 2001
[Figure: cytochrome c comparisons. Turtle, human, rattlesnake: labels 14 bp and 22 bp. Kangaroo, human, horse: labels 10 bp and 12 bp.]
Response
Common ancestry does not predict uniform rates.
Even given uniform rates, events are stochastic, and thus should not match predictions exactly.
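The point about stochastic events can be made concrete with a simple model. The sketch below assumes, purely for illustration, that substitutions along a lineage accumulate as a Poisson process with a fixed expected count; the expected value of 12 is a made-up number, not a measurement from the slides, and the function names are hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events when `mean` events are expected."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def prob_within(mean, low, high):
    """Probability that the observed count falls in low..high inclusive."""
    return sum(poisson_pmf(k, mean) for k in range(low, high + 1))


expected = 12  # hypothetical expected number of differences
p_exact = poisson_pmf(expected, expected)
p_close = prob_within(expected, 9, 15)
print(f"P(exactly {expected}) = {p_exact:.3f}")
print(f"P(9 to 15) = {p_close:.3f}")
```

Even with a perfectly constant rate, the observed count equals the expected count only about one time in nine under this model, so modest deviations between lineages are what a constant-rate process predicts.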
Distribution of genetic distances between human and mouse genes. The histogram is the actual data from 2,019 human and mouse genes. The solid curve shows the expected distribution of genetic distances assuming only a constant rate of background mutation (~$10^{-9}$ substitutions per site per year) (reproduced from Figure 3a in Kumar and Subramanian 2002).
Theobald, 2006
References
Brand, Leonard. 1997. Faith, Reason, and Earth History. Andrews University Press, Berrien Springs, MI.
Camp, Ashby. 2001. A critique of Douglas Theobald’s “29 Evidences for Evolution”. 09 March, 2006. www.trueorigin.org/theobald1a.asp
Campbell, N., Reece J., Mitchell, L. 1999. Biology, fifth edition. Benjamin/Cummings, Menlo Park, CA.
Kumar, S., and Subramanian, S. 2002. Mutation rates in mammalian genomes. Proc Natl Acad Sci. 99: 803-808.
Penny D., Hendy M., Zimmer E. and R. Hamby. 1990. Trees from sequences: Panacea or Pandora’s box? Aus. Syst. Bot. 3:21-38.
Penny D., Hendy M. and M. Steel. 1991. Testing the theory of descent. In: Phylogenetic analysis of DNA sequences. 155-183.
Penny D., Foulds L. and M. Hendy. 1982. Testing the theory of evolution by comparing phylogenetic trees constructed from five different protein sequences. Nature. 297:197-200.
Penny D., Hendy M. and A. Poole. 2003. Testing fundamental evolutionary hypotheses. J. Theor. Biol. 223:377-385.
Robinson D. and L. Foulds. 1981. Comparison of phylogenetic trees. Math. Biosc. 53:131-147.
Rokas A. and S. Carroll. 2005. More genes or more taxa? The relative contribution of gene number and taxon number to phylogenetic accuracy. Mol. Biol. Evol. 22(5):1337-1344.
Rokas A., Williams B., King N. and S. Carroll. 2003. Genome-scale approaches to resolving incongruence in molecular phylogenies. Nature. 425:798-804.
Sober E. and M. Steel. 2002. Testing the hypothesis of common ancestry. J. Theor. Biol. 218:395-408.
Theobald, Douglas L. “29+ Evidences for Macroevolution: The Scientific Case for Common Descent.” The Talk.Origins Archive. Vers. 2.85. 8 Jan, 2006. http://www.talkorigins.org/faqs/comdesc/
Theobald, Douglas. “29+ Evidences for Macroevolution: A Response to Ashby Camp’s ‘Critique’”. 21 March, 2002. www.talkorigins.org/faqs/comdesc/camp.html
The competing hypotheses
Ho: CA-1 single ancestral origin
Ha: CA-i, i >1 separate origination events
The competing hypotheses
Simplest model
A, all traits follow the same rules
B, each trait follows the same rules on all branches
C, all the changes that a single character can experience on a given branch must have the same probability
Most complex model
-A, allow traits to follow different rules
-B, allow a single trait to follow different rules on different branches
-C, allow each possible change of a single trait on a single branch to have its own probability
Ho: CA-1 single ancestral origin
Ha: CA-i, i > 1 separate origination events
The competing hypotheses
Graph G_i, process model M_j
G_i & M_j, estimate parameters in the model: L(G_i & M_j)
One cannot compare different topologies that have different process models attached to them
L(G_i & M_j)
L(G_i & M_k)   L(G_i & M_j)
L(G_k & M_j)   LRT does not apply
Akaike information criterion (AIC)
AIC is based on a theorem that describes how the predictive accuracy of a model M containing adjustable parameters can be estimated
L(M) is the hypothesis obtained from M by assigning values to adjustable parameters that maximize the probability of the data
Good fit-to-data increases predictive accuracy
Penalty for complexity
Applies to nested and non-nested models
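The trade-off between fit and complexity can be illustrated with the standard textbook form of the criterion, AIC = 2k - 2 ln L, where k is the number of adjustable parameters and ln L is the maximized log-likelihood; lower scores are preferred. This form is an assumption here, since the slides describe AIC only in words, and the two models and their log-likelihoods below are hypothetical numbers chosen for illustration.

```python
def aic(log_likelihood, n_params):
    """Standard AIC score: penalty for parameters minus twice the fit."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical candidates: (maximized log-likelihood, adjustable parameters).
# The models need not be nested for the comparison to be meaningful.
candidates = {
    "simple model": (-1050.0, 3),
    "complex model": (-1046.5, 12),
}

scores = {name: aic(loglik, k) for name, (loglik, k) in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: AIC = {score:.1f}")
print("preferred:", best)
```

In this made-up case the complex model fits slightly better, but the improvement does not pay for its nine extra parameters, so the simpler model gets the lower score.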
Advantages
scope or domain of a character
range of evolutionary rates
large number of characters
mechanism of evolution
easier data handling
expectation of useful characters
cost of obtaining data
Penny et al. 1990
Limitations
Sampling errors:
- sequences too short
- unrepresentative sequences
Methodological problems:
- large number of possible trees
- incomplete use of information
- converging to an incorrect tree
- deviations from the standard model
Human error:
- errors in data and programming
- misreading the tree
Trees from sequences
The limits to phylogeny reconstruction depend on the model
A good method for reconstructing trees should have the properties of being
fast
consistent
efficient
robust
falsifiable
Results from current methods should be treated as hypotheses for future testing