Questions
• What specifies the differences between us and rodents, or us and chimps? Are we tail-less, slightly more intelligent / less hirsute rats?
• Do all genes evolve at the same rate?
• How can gene function contribute to behaviour?
• Do all tissues & organs evolve at the same rate?
• Where do we fit in the tree of life?
• Can we understand sequence variation among humans?
• Where do new genes come from?
Rapid Evolutionary Change
• What is the big deal?
• How does it happen?
• Can we measure it … accurately?
• What examples are there?
• Why does it happen?
Hedges, SB. The origin and evolution of model organisms. Nature Reviews Genetics 3, 838-849 (2002)
Human and Crown Group Species
Genomes Studied
Thomas et al., Nature 424, 788 - 793
[Figure: phylogenetic tree of the genomes studied: Homo sapiens, Pan troglodytes, Papio cynocephalus anubis, Felis catus, Canis familiaris, Bos taurus, Sus scrofa, Mus musculus, Rattus norvegicus. Scale bar: 0.05 substitutions per site. Divergence times marked on the tree: 14-21 Mya and 75-110 Mya.]
Tempo and Mode of Protein Evolution
• Pseudogenisation
• De novo creation
• Rapid sequence change
• Gene duplication
• Gene fusion / fission
Rapid Sequence Change
• Do some genes evolve faster than others?
• Is this adaptive or random?
• Compare substitutions in corresponding genes in different species: Orthologs
[Figure: aligned orthologous sequences with 10 sites which can have substitutions; sites are marked as substituted or identical, and regions as conserved or divergent.]
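As a minimal sketch of the comparison above, the following Python counts the substituted sites between two aligned orthologous sequences and reports the fraction that differ. The two 10-site sequences and the function names are made up for illustration; they are not taken from the slides.

```python
def substituted_sites(seq_a, seq_b):
    """Return the 0-based positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def proportion_substituted(seq_a, seq_b):
    """Fraction of aligned sites that are substituted (the p-distance)."""
    return len(substituted_sites(seq_a, seq_b)) / len(seq_a)


# Hypothetical 10-site alignment of the same gene in two species.
species_1 = "ATGGCTAAAC"
species_2 = "ATGGCCAAGC"

print(substituted_sites(species_1, species_2))       # [5, 8]
print(proportion_substituted(species_1, species_2))  # 0.2
```

A raw count like this says nothing yet about selection; it only gives the observed divergence that later measures build on.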
Rapid Sequence Change
• Do some genes evolve faster than others?
• Compare the substitutions under natural selection to random changes due to genetic drift
– Purifying selection
– Neutral evolution (no selection)
– Positive selection
[Figure: aligned sequences; sites marked as substituted or identical, and changes marked as under selection or random.]
KS: synonymous changes
• Redundant genetic code, e.g. GCA, GCC, GCG, GCT → Alanine
• Third base of a codon “wobbles” without changing the translated amino acid
• KS measures the neutral mutation rate in coding regions without selection
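The redundancy of the code can be checked directly. The sketch below builds the standard genetic code from its usual compact string form and tests whether a third-base change is synonymous. The helper names (`codons_for`, `is_synonymous`, `third_base_is_free`) are illustrative assumptions, not part of the original slides.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid):
    """All codons that translate to the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def is_synonymous(codon_a, codon_b):
    """True if both codons translate to the same amino acid."""
    return CODON_TABLE[codon_a] == CODON_TABLE[codon_b]


def third_base_is_free(codon):
    """True if every third-base change leaves the amino acid unchanged."""
    return all(is_synonymous(codon, codon[:2] + base) for base in BASES)


print(codons_for("A"))            # ['GCA', 'GCC', 'GCG', 'GCT']
print(third_base_is_free("GCA"))  # True
print(third_base_is_free("GAA"))  # False: GAT and GAC encode Asp, not Glu
```

Alanine is a fully degenerate case; for codons such as GAA or ATG only some, or none, of the third-base changes are synonymous, which is why methods that estimate KS count synonymous sites codon by codon.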
KA/KS (dN/dS, ω)
KA: non-synonymous (amino acid changing) changes; KS: synonymous changes
<< 1: purifying selection
> 1: positive diversifying selection
[Scale from 0.0 through 1.0: ← conserving, diversifying →]
• compares sequence changes under selection with the background rate
• a measure of selection pressure averaged across the entire gene
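One way to make the ratio concrete is a simplified counting estimate in the spirit of the Nei-Gojobori method. The slides do not say which estimator was used, so this is a sketch under stated assumptions: the two sequences are aligned, gap-free, in frame, contain no stop codons, and are closely related. Unlike the published method, it does not exclude pathways through stop codons. All function names are illustrative.

```python
from itertools import permutations, product
from math import log

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_sites(codon):
    """Expected number of synonymous sites (0 to 3) in one codon."""
    same = 0
    for pos in range(3):
        for base in BASES:
            mutant = codon[:pos] + base + codon[pos + 1:]
            if base != codon[pos] and CODON_TABLE[mutant] == CODON_TABLE[codon]:
                same += 1
    return same / 3


def count_differences(codon_a, codon_b):
    """Synonymous and non-synonymous differences between two codons,
    averaged over every order in which the differing positions could change."""
    positions = [i for i in range(3) if codon_a[i] != codon_b[i]]
    if not positions:
        return 0.0, 0.0
    syn = non = 0
    paths = list(permutations(positions))
    for path in paths:
        current = codon_a
        for pos in path:
            step = current[:pos] + codon_b[pos] + current[pos + 1:]
            if CODON_TABLE[step] == CODON_TABLE[current]:
                syn += 1
            else:
                non += 1
            current = step
    return syn / len(paths), non / len(paths)


def jukes_cantor(p):
    """Correct a proportion of differences for multiple hits (p < 0.75)."""
    return -0.75 * log(1 - 4 * p / 3) if p > 0 else 0.0


def ka_ks(seq_a, seq_b):
    """Return (KA, KS) for two aligned, gap-free coding sequences."""
    codons_a = [seq_a[i:i + 3] for i in range(0, len(seq_a), 3)]
    codons_b = [seq_b[i:i + 3] for i in range(0, len(seq_b), 3)]
    # Average the site counts of the two sequences.
    syn_sites = sum(synonymous_sites(c) for c in codons_a + codons_b) / 2
    non_sites = 3 * len(codons_a) - syn_sites
    syn_diff = non_diff = 0.0
    for a, b in zip(codons_a, codons_b):
        s, n = count_differences(a, b)
        syn_diff += s
        non_diff += n
    return jukes_cantor(non_diff / non_sites), jukes_cantor(syn_diff / syn_sites)


# Hypothetical toy alignment: one synonymous (GCT>GCC) and one
# non-synonymous (GCT>GAT, Ala>Asp) difference.
ka, ks = ka_ks("GCTGCTGCTGCT", "GCCGCTGATGCT")
print(ka / ks < 1)  # True: the toy pair looks like purifying selection
```

Real analyses use far longer sequences and established software; with only a handful of codons the estimate is dominated by sampling noise, and the sketch fails outright if there are no synonymous sites or no synonymous differences.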
Most genes are under purifying selection
[Figure: histogram of KA/KS values. x-axis: KA/KS (0.00 to 0.70); y-axis: % of protein sequences (0.0% to 12.5%).]
Do some protein classes have high KA/KS?
Protein domains with high KA/KS

Domain   KA/KS   Function
CLECT    0.150   Immunity (Ly49)
IG       0.151   Immunity (immunoglobulins)
SR       0.156   Immunity (scavenger receptors)
TNFR     0.167   Immunity (CD30)
P450     0.174   Immunity (metabolism of toxic compounds)
CCP      0.181   Immunity (CD21)
SCY      0.252   Immunity (CXC chemokines)
KRAB     0.279   Transcription (ZNF 133)

N.B. Each is present in > 20 mouse and > 20 human.
Paralog sequences are diverging faster
[Figure: distribution of KA/KS values. x-axis: KA/KS (0.0 to 2.0); y-axis: % of protein sequences (0% to 100%).]
Mouse Paralogues: Median KA/KS: 0.522
Human Mouse Orthologues: Median KA/KS: 0.115
Paralog sequences are diverging faster
[Figure: KA/KS (0.0 to 2.5) plotted for paralog clusters.]
• What genes are duplicating?
Examine local gene clusters
Olfactory receptors
Adapted from: Zhang and Firestein Nature Neuroscience 5 (2002) 124-133
Recent Gene Duplication in Mouse

Reproduction Clusters
Gene cluster                               Genes   Function
Odorant binding proteins / aphrodisin      8       Aphrodisiac hormone.
Hydroxysteroid dehydrogenase               7       Biosynthesis of hormonal steroids.
Class CYP4A Cytochromes P450               7       Oxidation of compounds.
Seminal vesicle-antigen (SVA)              4       Suppression of spermatozoa motility.
Submandibular gland secretory proteins     9       Expression is androgen-dependent.
Obox, homeobox proteins                    6       Homeobox proteins.
Androgen-binding protein-α                 9       Mate selection.
Prolactin related proteins                 17      Placentation.
Cathepsin J-like enzymes                   6       Placentation.
Cystatins / Stefins                        7       Placentation.
HOX cluster                                8       Placentation.
Class CYP2D Cytochromes P450               5       Regulated by androgens.

Immunity/Defense Clusters
Gene cluster                               Genes   Function
MHC class I                                8       Immunity / Mate selection ?
Elafin, eppin, and antileukoproteinase 1   7       Anti-microbial.
Beta-defensin proteins (2 clusters)        5/5     Anti-microbial.
Eosinophil-associated ribonuclease         11      Pathogen response.
Median Mouse / Human / Rat divergence
[Figure: tree labelled with median divergence per lineage. Human: 0.15; Rodent: 0.31; Mouse: 0.093; Rat: 0.102. Divergence times marked: 75-110 Mya and 14-21 Mya.]
Phylogenetic trees confirm lineage-specific duplications
[Figure: phylogenetic tree of one gene family.
Human genes: ENSG00000163739, ENSG00000163734, ENSG00000081041, ENSG00000163737, ENSG00000109272, ENSG00000163735, ENSG00000124875, ENSG00000163736.
Mouse genes: ENSMUSG00000029372, ENSMUSG00000029373, ENSMUSG00000029371, ENSMUSG00000029380, ENSMUSG00000029379.]
Gene duplications
Recent duplications: Mammalian themes
• Chemosensation
• Immunity
• Reproduction
Why are immunity, reproduction and chemosensation over-represented?
Our hypothesis: Darwinian evolution. Duplications are adaptive.
Hypothesis: Darwinian evolution
Competition:
– Inter-specific (pathogens, predators)
– Con-specific
» mating
» sub-speciation / kin-selection
» gender conflict
Site-specific KA/KS analysis
• Per-gene KA/KS is an average over the whole sequence
[Figure: KA/KS (0 to 1) plotted against position along the sequence (0 to 160).]
Site-specific KA/KS analysis
• With multiple sequences, predict
– distribution of broad categories of KA/KS sites
– which residues fall into which part of the distribution
– which residues have high KA/KS / are positively selected
[Figure: KA/KS (0 to 1) plotted against position along the sequence (0 to 160).]
Codon analysis confirms adaptation
[Figure: sites subject to positive selection are marked in blue.]
Richard Emes and Scott Beatson
Adaptive evolution
What we know:
• Per-site KA/KS: direct evidence for positive selection
• Adaptive mutations have spread faster through the population than random changes
What we don’t know:
• Adaptive advantage of these molecular changes for individuals of the species
Caveats: Use commensurate analyses
Problems:
1) Alternative transcripts
2) Mis-predictions
[Figure: comparison across Species A, Species B and Species C.]
Caveats: “Violations” of inheritance
• Recombination
• Gene conversion
• Polymorphism
Caveats: Use appropriate data
[Figure: aligned sequences; sites marked as substituted or identical, and changes marked as under selection or random.]
• Evolutionary distance too large:
– Saturation: most sites have had more than one substitution
– Phylogenetic tree samples are too far apart
– Need more species to fill in the gaps and make average distances smaller
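Saturation can be illustrated with the Jukes-Cantor correction, a standard textbook model that the slides do not name and that is used here only as an assumption for the sketch. It converts the observed proportion of differing sites into an estimated number of substitutions per site; as the observed proportion approaches 0.75, the estimate grows without bound, so small counting errors translate into very large errors in distance.

```python
from math import log


def jukes_cantor_distance(p):
    """Estimated substitutions per site from the observed proportion p
    of differing sites, under the Jukes-Cantor model."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75); at 0.75 the sites are saturated")
    if p == 0:
        return 0.0
    return -0.75 * log(1 - 4 * p / 3)


# Observed differences understate true change more and more as p grows.
for p in (0.05, 0.30, 0.60, 0.70, 0.74):
    d = jukes_cantor_distance(p)
    print(f"observed {p:.2f} -> estimated {d:.2f} substitutions per site")
```

Adding intermediate species shortens each branch, which keeps the pairwise comparisons in the range where the correction is still small and reliable.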
Caveats: Use appropriate data
[Figure: aligned sequences; sites marked as substituted or identical, and changes marked as under selection or random.]
• Evolutionary distance too small:
– Insufficient data to distinguish selection from random changes
– Need more species or population data (may not be possible)
Caveats: Use appropriate data
[Figure: aligned sequences; sites marked as substituted or identical, and changes marked as under selection or random.]
• Sufficiently sampled phylogenetic tree
• Sufficient number of sequences
• Sufficient length of sequences
Caveats: Beware of statistical certainty
• With 30,000 genes each tested at a p-value threshold of 0.01, expect about 300 false positives (errors) by chance alone!
• Ensure predictions make biological sense
• Be careful if lineage-specific predictions contradict general trends
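The arithmetic behind the first caveat, and two common ways of guarding against it, can be sketched as follows. The Bonferroni and Benjamini-Hochberg corrections are standard techniques brought in here for illustration; the slides do not say which correction, if any, was applied, and the function names are assumptions.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests passing the threshold by chance alone."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, alpha):
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which tests are significant
    while controlling the false discovery rate at `fdr`."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    # Find the largest rank k whose p-value is at most (k / m) * fdr.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


print(expected_false_positives(30_000, 0.01))  # about 300, as on the slide
print(bonferroni_threshold(30_000, 0.01))      # much stricter per-gene cutoff

# Hypothetical p-values for four genes.
print(benjamini_hochberg([0.001, 0.04, 0.03, 0.5]))  # [True, False, False, False]
```

A statistical correction only controls the error rate; the remaining two bullets still apply, since a gene that passes the cutoff may still make no biological sense.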
What defines our species?
• Comparative genomics: ignore similarities and look for differences
• Gene duplication: raw material
• Fast sequence change
• Adaptation has shaped genomes since lineage divergence. Is gene duplication itself adaptive?
• Per-site KA/KS: direct evidence for adaptation
The Human Genome
• Reproduction, Chemosensation and Immunity genes evolving rapidly across entire mammalian clade
• What are human lineage-specific changes rather than mammal-specific changes?
• What in our genome marks us as Human?
Gene Loss?