Molecular Evolution AGAMOUS; transcription factor [ Arabidopsis thaliana ] What information can DNA sequences give us? Evaluating the role of drift/demography vs. selection on trait divergence. Identify function. Looking at genes whose evolutionary history was shared. www.ncbi.nlm.nih.gov Biol336-12
Molecular evolution
Part I: The evolution of macromolecules.
Part II: The reconstruction of the evolutionary history of genes and organisms.
Molecular evolution encompasses two areas of study:
The evolution of macromolecules: the rates and patterns of change in the genetic material (DNA sequences) and in the encoded products (proteins).
The evolutionary history of genes and organisms.

AGAMOUS; transcription factor [Arabidopsis thaliana]
What information can DNA sequences give us?
Evaluating the role of drift/demography vs. selection on trait divergence.
Identifying function.
Looking at genes whose evolutionary history was shared.

3D protein structure of human proinsulin (Munte et al., FEBS J)
1952: Frederick Sanger and coworkers determine the complete amino acid sequence of insulin.
MALWMRLLPLLALLALWGPDPAAAFVNQHLCG
This field has its roots in two separate disciplines: population genetics and molecular biology. Population genetics provides the theoretical background and molecular biology provides the empirical data. The first complete sequence of a protein (insulin) was determined in 1952 by F. Sanger and colleagues.

How and why have molecular sequences evolved to be the way they are?

Learning Objectives:
Variability within a population
Substitution rates
Neutral Theory
Detecting selection at the DNA level

REVIEW: NUCLEOTIDE SUBSTITUTIONS
            Lys  Ala  Leu  Val  Leu  Leu
AgAt145     AAG  GCA  CTG  GTC  CTG  TTG
AgAt134     AAA  GCA  CTG  GTC  CTC  TTG
SepAt145    AGG  GCA  CTG  GTC  CTG  GTG
SepAt134    AGG  GCA  CTG  GTC  CTG  GTG
CalAt145    AAG  ---  CTG  TTC  CTG  TTG
CalAt134    AAG  ---  CTG  TTC  CTG  TTG

Here are some sequences taken from two Arabidopsis thaliana individuals, 145 and 134. These are MADS-box genes, which are important for flower development. On the left is the identifier (name of gene, species, individual). Bolded in black is the reference individual; we will compare this sequence to the others.

Between AgAt145 and AgAt134 there are two changes, both in the third codon position. Both are synonymous changes because the amino acid stays the same. The first is a transition, G->A (a change from one purine to another); the second, G->C (from a purine to a pyrimidine), is a transversion.

Comparing AgAt145 to SepAt145 and SepAt134, we see A->G (a transition) and T->G (a transversion). This time lysine changes to arginine in the first case and leucine changes to valine in the second; both are nonsynonymous changes because the amino acid changes.

Comparing AgAt145 to CalAt145 and CalAt134, we see the deletion of a codon, creating a gap. There is also a G->T change (a transversion) that is nonsynonymous: valine changes to phenylalanine.
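
The comparisons in the nucleotide substitution review can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: the function names are hypothetical, the codon table covers only the codons in the alignment (standard genetic code), and the position of the gap in the Cal sequences is an assumption inferred from which codons line up with the reference.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Only the codons that appear in the alignment (standard genetic code).
CODON_TO_AA = {
    "AAG": "Lys", "AAA": "Lys", "AGG": "Arg", "GCA": "Ala",
    "CTG": "Leu", "CTC": "Leu", "TTG": "Leu",
    "GTC": "Val", "GTG": "Val", "TTC": "Phe",
}


def base_change_type(ref_base, alt_base):
    """Transition = purine<->purine or pyrimidine<->pyrimidine; otherwise transversion."""
    pair = {ref_base, alt_base}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def compare(ref_codons, alt_codons):
    """List (codon number, base change, change type, effect) for each differing base.

    Codons containing a gap are skipped. The effect is judged per codon,
    which is enough here because no codon differs at more than one base.
    """
    results = []
    for index, (ref, alt) in enumerate(zip(ref_codons, alt_codons), start=1):
        if ref == alt or "-" in ref or "-" in alt:
            continue
        same_aa = CODON_TO_AA[ref] == CODON_TO_AA[alt]
        effect = "synonymous" if same_aa else "nonsynonymous"
        for ref_base, alt_base in zip(ref, alt):
            if ref_base != alt_base:
                change = ref_base + "->" + alt_base
                results.append((index, change, base_change_type(ref_base, alt_base), effect))
    return results


ag145 = "AAG GCA CTG GTC CTG TTG".split()   # reference individual
ag134 = "AAA GCA CTG GTC CTC TTG".split()
sep145 = "AGG GCA CTG GTC CTG GTG".split()
cal145 = "AAG --- CTG TTC CTG TTG".split()  # gap placement is an assumption

for name, codons in [("AgAt134", ag134), ("SepAt145", sep145), ("CalAt145", cal145)]:
    print(name, compare(ag145, codons))
```

Each tuple reports the codon number, the base change relative to AgAt145, whether it is a transition or transversion, and whether the amino acid changes.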

What happens after a mutation arises in the DNA sequence at a locus?
Polymorphism: the mutant allele is one of several present in the population.
Substitution: the mutant allele fixes in the population. (New mutations at other nucleotides may occur later.)
Mutations occur. Sometimes they result from errors in DNA replication or DNA repair that change single nucleotides; sometimes the changes are larger, creating deletions and insertions. When we ask HOW and WHY molecular sequences have evolved the way they have, we are really asking what happens to a mutation once it arises. Does it remain in the population as one of several alleles? In that case we consider it a polymorphism. If, on the other hand, it fixes in the population, then the change is considered a substitution.

Time (generations)  L1    L2    L3    L4    L5    L6    L7
0                   aaat  aaat  aaat  aaat  aaat  aaat  aaat
10                  aaat  aaat  aaat  aaat  acat  aaat  aaat
20                  aaat  aaat  acat  aaat  acat  acat  acat
30                  acat  acat  acat  acat  acat  acat  acat
40                  acat  acat  actt  acat  acat  acat  acat
Generation 10: new mutation (polymorphism)
Generation 30: mutation fixed (substitution)
Generation 40: new mutation (polymorphism)

At the start (generation 0), everyone in the population is aaat. As time passes, a mutation arises in individual 5. Now there are two different sequences segregating in the population: a polymorphism. This polymorphism is present in several individuals within the population and persists from generation 10 to about generation 29. Finally, in generation 30, it is fixed: every individual in the population has a c at the second nucleotide. This is the same as the allele frequency p reaching 1. In generation 40 a new mutation arises and we have a polymorphism again.

Imagine that five sequences are obtained from each of two species, and that the sequences are related to each other as shown here.
Any mutation that happens on a red branch will appear as a polymorphism within species 1.
Any mutation that happens on a blue branch will appear as a polymorphism within species 2.
Any mutation that happens on the green branch will appear as a fixed difference between the species.
Here the phylogeny is divided into two parts: between-species branches and within-species branches. Within-species branches connect all the alleles within each species to their most recent common ancestor. Between-species branches connect these common ancestors to the common ancestor of the whole phylogeny. A mutation on a between-species branch will appear in all the descendant alleles and thus will be a fixed difference between species. A mutation on a within-species branch will be a polymorphism within a species.

Substitution rate: the rate at which mutant alleles rise to fixation within a lineage.
By comparing DNA sequences from different organisms, we can estimate the rate at which mutations appear and fix, causing base-pair substitutions.

How many selectively neutral mutants reach fixation per unit time?
Neutral mutations occur at a rate μ per locus per generation.
In a diploid population there are 2N alleles at a particular locus.
The number of mutants arising every generation at a given locus in a diploid population of size N is 2Nμ.
What is the probability of fixation of a selectively neutral allele? A new mutant arising as a single copy in a diploid population of size N has an initial frequency of 1/2N. If only drift is acting, the probability of fixation for that neutral allele is 1/2N.
Thus, the substitution rate for neutral alleles is (1/2N)(2Nμ) = μ.
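
The 1/2N fixation probability for a new neutral mutant can be checked by simulation. The sketch below is an illustration, not part of the original slides. It assumes a simple Wright-Fisher model (each of the 2N gene copies in the next generation is drawn at random from the current generation), and the function name and parameter values are hypothetical.

```python
import random


def neutral_fixation_fraction(n_diploid, replicates, seed=0):
    """Fraction of replicate populations in which a single new neutral mutant fixes.

    Assumes Wright-Fisher sampling with no selection and no further mutation.
    """
    rng = random.Random(seed)
    copies_total = 2 * n_diploid
    fixed = 0
    for _ in range(replicates):
        count = 1  # the new mutant starts as a single copy (frequency 1/2N)
        while 0 < count < copies_total:
            p = count / copies_total
            # Binomial sampling of the next generation, one gene copy at a time.
            count = sum(1 for _ in range(copies_total) if rng.random() < p)
        if count == copies_total:
            fixed += 1
    return fixed / replicates


N = 10
estimate = neutral_fixation_fraction(N, replicates=5000)
print("simulated fixation fraction:", estimate)
print("expected 1/2N:", 1 / (2 * N))
```

A small N keeps the run short; the simulated fraction should fall close to 1/2N, with some sampling noise.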

What is the substitution rate for neutral alleles?
What is the substitution rate for beneficial alleles (s>0)?
What is the substitution rate for deleterious alleles?

Fixation probability for a beneficial allele
The substitution rate is the probability of fixation of an advantageous allele multiplied by the number of new mutants arising every generation: (2s)(2Nμ) = 4Nsμ.
For positive values of s, when N is large, the probability of fixation is 2s. If the absolute value of s is small, the probability of fixation is 2s/(1 - exp(-4Ns)).
For deleterious alleles, the substitution rate is close to zero.

Consider a numerical example:
A new mutant arises in a population of 1000 individuals.
If it is neutral, the probability it will fix is 1/2N = 1/(2*1000) = 0.0005 (0.05%).
If it confers a selective advantage of s=0.01, the probability it will fix is 2s = 0.02 (2%).
If it has a selective disadvantage of s=-0.001, the probability it will fix is 0.004%.
These last two results are noteworthy because they mean that advantageous mutations do not always fix in a population. For an advantageous mutation with s=0.01, the probability it will fix is 2%, which also means that 98% of all mutations with a selective advantage of 0.01 are lost. On the other hand, even slightly deleterious mutations have a finite (albeit small) chance of fixing in a population.

If the population size is very large, the probability of fixation for an advantageous mutation converges to 2s.
Given s=0.01, N=1000, P(fixation) = 0.02 or 2%.
Given s=0.01, N=100, P(fixation) =
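
The fixation probabilities in the numerical example can be computed directly from the approximation quoted in the notes, 2s/(1 - exp(-4Ns)), with 1/2N used for the neutral case. This is a minimal sketch: the function name is hypothetical, and the formula is the small-|s| approximation from the slides rather than an exact result.

```python
import math


def fixation_probability(s, n_diploid):
    """Approximate fixation probability of a new mutant at initial frequency 1/2N.

    Uses 2s / (1 - exp(-4Ns)) for s != 0 and the neutral value 1/2N for s == 0.
    """
    if s == 0:
        return 1.0 / (2 * n_diploid)
    return 2 * s / (1 - math.exp(-4 * n_diploid * s))


# The three cases from the numerical example (N = 1000).
for s in (0.0, 0.01, -0.001):
    p_fix = fixation_probability(s, 1000)
    print(f"s={s:+.3f}, N=1000: P(fixation) = {p_fix:.6f}")
```

Calling the same function with smaller N shows how drift dominates in small populations: as N shrinks, the value for a slightly deleterious allele approaches the neutral value 1/2N.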

What about slightly deleterious mutations?
s= , N=1000, P(fixation)=
s=-0.001, N=100, P(fixation)=
s=-0.001, N=10, P(fixation)=

Are most substitutions (fixed changes) due to drift or natural selection?
Agreed:
Most mutations are deleterious and are removed.
Some mutations are favourable and are fixed.
In dispute:
Are most replacement mutations that fix beneficial or neutral?
Is observed polymorphism due to selection or drift?
The 1960s witnessed a revolution in population genetics. The introduction of electrophoresis into population genetic studies soon led to the discovery of large amounts of genetic variability in natural populations, such as humans and Drosophila. In 1968 Kimura postulated that the majority of the molecular changes in evolution were due to the random fixation of neutral or nearly neutral mutations. This created a dispute between neutralists and selectionists. The dispute essentially concerns the distribution of fitness values of mutant alleles.

Silent (or synonymous) mutations, where the amino acid remains unchanged, are more likely to be neutral.
Replacement (or non-synonymous) mutations, which cause an amino acid change, are more likely to experience selection.
The form and strength of selection depend on the gene and its function.

Mammalian genes     Non-synonymous substitution rate   Synonymous substitution rate
                    (per site per 10^9 years)          (per site per 10^9 years)
Histone 4           0.00                               4.52
Histone 3                                              3.94
Myosin              0.10                               2.15
Insulin             0.20                               3.03
Growth hormone      1.34                               3.79
Immunoglobulin k    2.03                               5.56

From this table, it is clear that the rate of nonsynonymous substitution is variable among different genes, ranging from zero to about 2 x 10^-9 substitutions per nonsynonymous site per year. Histones have an unusually low replacement substitution rate. Look at the column describing the rate of synonymous substitution: it also varies, though not as much as the rate of nonsynonymous substitution.

Histones seem to have an unusually low replacement substitution rate.
This suggests that mutations causing amino acid changes in histones are deleterious. WHY?

Looking at H3 and H4, it is clear that there is some interaction with both the DNA and other histones. Histones are DNA-binding proteins around which DNA is coiled to form chromatin. Many positions within the protein interact with the DNA or with other histones.

Most amino acid changes in histone proteins may have negative or even lethal consequences.
Histone proteins have strong functional constraints.

Active sites (antigen-binding sites) of immunoglobulins often have higher substitution rates than silent sites.
Immunoglobulins are proteins found in the blood or other bodily fluids of vertebrates and are used by the immune system to identify and neutralize foreign objects. The small region at the tip of the protein is extremely variable. Each variant can bind a different target, or antigen. The huge diversity in this region allows the immune system to recognize an equally wide diversity of antigens.

It could be that selection favours mutations in these regions, thereby increasing the diversity among antibodies produced by the body and improving the immune response.

How and why have molecular sequences evolved to be the way they are?

To infer that selection has acted within a genome, one must reject the null hypothesis that no selection has acted.
Null hypothesis: describes the pattern of sequence evolution under the forces of mutation and drift.
Remember from neutral theory: the rate at which one nucleotide is replaced by another nucleotide throughout a population (substitution) equals the rate of mutation (μ) at that site.

How do we detect selection at DNA sequences?
Comparing intra-species polymorphism to inter-species differences (McDonald-Kreitman test).
Linked/neighbouring neutral markers.
Examine genes for Dn/Ds ratios.

Molecular Evolution: The McDonald-Kreitman Test
Kreitman and Hudson (1991) sequenced a 4750 base-pair region near the alcohol dehydrogenase (ADH) gene from 11 individuals of D. melanogaster and found higher than expected levels of polymorphism.
There is only one amino acid polymorphism (AdhF/AdhS) within this region, which occurs at site 1490.
Selection may be maintaining this polymorphism at or near this site.
ADH (alcohol dehydrogenase) is an enzyme that breaks down ethanol. Flies carrying the AdhF allele survive better when their food is spiked with ethanol than do flies carrying the AdhS allele (Cavener and Clegg 1981). Nonetheless, the factor that maintains the AdhF/AdhS polymorphism remains unknown.

How and why have molecular sequences evolved to be the way they are?
How do we explain the patterns of variation observed in ADH DNA sequences?
Recall the phylogeny of five sequences from each of two species: a mutation on a within-species branch appears as a polymorphism within that species, and a mutation on the between-species branch appears as a fixed difference between the species.

Some abbreviations:
Within species
Ps = number of synonymous polymorphisms
Pn = number of non-synonymous polymorphisms
Between species
Ds = number of synonymous substitutions
Dn = number of non-synonymous substitutions

Remember that we have divided nucleotide substitutions in a coding region into two types: replacement (non-synonymous) and silent (synonymous). For a particular phylogeny and mutation rate, if mutations occur randomly over time and if the chance that a mutation does or does not cause an amino acid change remains constant, then the ratio of replacement to silent changes should be the same along any of these branches.

If mutations are neutral, any of these mutations has an equal chance of persisting. So the ratio of replacement to silent polymorphisms within a species (Pn/Ps) should be the same as the ratio of replacement to silent differences fixed between species (Dn/Ds):
Pn/Ps = Dn/Ds

The McDonald-Kreitman Test:
H0: If all changes are neutral, the ratio of replacement to silent changes at polymorphic sites (within species) should equal the ratio among fixed differences (between species).
H1: If replacement mutations are advantageous, they fix rapidly, causing a higher replacement to silent ratio between species and a lower replacement to silent ratio within species.
H2: If replacement mutations are deleterious, they rarely fix. Thus there will be a lower ratio of replacement to silent changes between species and a higher replacement to silent ratio within species.
H3: If replacement mutations are subject to heterozygote advantage or frequency-dependent selection, they rarely fix, causing a lower replacement to silent ratio between species and a higher replacement to silent ratio within species.

Null: all changes are neutral (drift).
H1: changes are advantageous (positive selection).
H2: changes are deleterious (purifying selection).
H3: replacement changes rarely fix because of heterozygote advantage.

ADH gene       Fixed differences     Polymorphisms
               (between species)     (within species)
Replacement    7                     2
Silent         17                    42
Between species: ratio of replacement to silent = 7/17 = 0.41
Within species: ratio of replacement to silent = 2/42 = 0.05
FIXED > POLYMORPHISM

Using a chi-square (χ²) test, the null hypothesis that selection is absent is statistically rejected for ADH.
The excess of replacement differences between species suggests that mutations have been positively favoured.
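
The chi-square comparison for the ADH counts can be reproduced in a few lines. This is a sketch rather than the analysis used in the original study: it assumes a plain Pearson chi-square on the 2x2 table with no continuity correction, and the function name is hypothetical. One expected cell count is below 5, so an exact test or G-test may be preferable in practice.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    statistic = numerator / denominator
    # With 1 degree of freedom, the upper-tail p-value is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# ADH counts: rows = replacement, silent; columns = fixed, polymorphic.
dn, pn = 7, 2
ds, ps = 17, 42

statistic, p_value = chi_square_2x2(dn, pn, ds, ps)
print(f"Dn/Ds = {dn / ds:.2f}, Pn/Ps = {pn / ps:.2f}")
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

A statistic above 3.84 (the 5% critical value for 1 degree of freedom) rejects the null hypothesis that the replacement to silent ratio is the same within and between species.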

Assumes:
All synonymous mutations are neutral (codon bias can violate this).
All non-synonymous mutations are either strongly deleterious, neutral or strongly advantageous.
Levels of polymorphism are governed by the neutral mutation rate.
Within a species, advantageous mutations contribute little to polymorphism but can contribute to divergence between species.
Problems with this test:
A failure to reject the null hypothesis could be because both purifying and directional selection have taken place.
Not all synonymous changes are in fact neutral. In some organisms, some codons are preferentially used.

How else might you detect selection in the genome, in particular the presence of selective sweeps?

Molecular Evolution: Neighbouring marker sites
If a beneficial mutation appears and sweeps through a population, what will happen to the level of polymorphism present at neighbouring DNA sites?
Genetic hitchhiking will decrease variation.
In the case of Plasmodium falciparum, diversity at neighbouring marker loci decreased (Wootton et al. 2002, Nature).

If there is overdominance at a nucleotide site, what will happen to the level of polymorphism at neighbouring sites?
Variation at linked sites is more likely to be maintained.

If there is directional selection to remove a particular mutant allele (purifying selection), what will happen to the marker allele that happens to be on the same chromosome?
It will decrease in frequency as a result of this association. This is called background selection.

So what is the evidence for natural selection shaping DNA sequences?
Nielsen et al. (2005) PLoS Biology
H0: neutral
H1: positive

How can you detect the signature of selection?
Comparing intra-species polymorphism to inter-species differences (McDonald-Kreitman test).
Linked/neighbouring neutral markers.
Examine genes for Dn/Ds ratios.

Zayed and Whitfield (2008) PNAS
If drift and demography are important, then the effects will be seen across the whole genome.
If selection is important, then the effects will be seen in specific regions of the genome.