
Topic 14. Lecture 20. Positive, negative, and balancing selection in natural populations


We considered the five factors of Microevolution - mutation, selection, mode of reproduction, population structure, and genetic drift - separately. Of course, in nature they act together, and now we need to understand how this happens. Natural selection is the key factor of Microevolution, as far as adaptive evolution of phenotypes is concerned. Thus, the presentation is structured around different modes of selection. Still, other factors are also important.

In particular, we need to understand the relationship between selection - systematic differences in fitness, and genetic drift - random differences in fitness. Which one prevails? There is a simple answer to this question.

The strength of selection acting on a particular pair of alleles, A and a (there can be no selection on a single allele!), is characterized by the coefficient of selection, or the selective advantage of A over a: s = 1 - w_a/w_A.

The strength of genetic drift is characterized by the effective population size N_e, the size of an equivalent Wright-Fisher population (with some exceptions, N_e is approximately the same for all loci in the genome).

The key fact: if, at some locus, N_e s >> 1, selection rules. In particular, the most fit allele will eventually be fixed, and will never be lost after this. In contrast, if, at some locus, N_e s << 1, random drift rules. In particular, evolution will be reversible in this case - no allele will be fixed forever.
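One way to see how the product N_e s separates the two regimes is Kimura's diffusion approximation for the fixation probability of a single new mutation. This formula is not part of the lecture itself; the sketch below assumes a diploid Wright-Fisher population whose census size equals N_e, and the function name is illustrative only.

```python
import math

def fixation_probability(s, n_e):
    """Approximate fixation probability of a single new mutation.

    Assumes Kimura's diffusion approximation for a diploid Wright-Fisher
    population with census size equal to n_e (an assumption of this sketch),
    so the initial frequency is 1 / (2 * n_e). Intended for s >= 0.
    """
    p0 = 1.0 / (2.0 * n_e)
    if s == 0:
        return p0  # strictly neutral: fixation probability equals initial frequency
    # u = (1 - exp(-4 * n_e * s * p0)) / (1 - exp(-4 * n_e * s))
    return math.expm1(-4.0 * n_e * s * p0) / math.expm1(-4.0 * n_e * s)

# One case with N_e * s >> 1 and one with N_e * s << 1.
for n_e, s in [(10**6, 0.01), (10**3, 1e-6)]:
    u = fixation_probability(s, n_e)
    print(f"N_e*s = {n_e * s:g}: fixation probability {u:.3g}, "
          f"neutral value {1 / (2 * n_e):.3g}, 2s = {2 * s:.3g}")
```

When N_e s >> 1 the result is close to 2s and far above the neutral value; when N_e s << 1 it is practically indistinguishable from the neutral value 1/(2N_e), which is what "drift rules" means here.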


Also, effective population size determines the level of genetic variation at selectively neutral loci. At such a locus, virtual heterozygosity H = 4N_e μ, where μ is the mutation rate at this locus. Thus, knowledge of μ makes it possible to estimate N_e for natural populations from easily observable levels of genetic heterogeneity. Some estimates of N_e in nature are:

humans - 10,000 (not today!)

whales - 35,000

fruit flies - 1,000,000

worms - 100,000 - 1,000,000

marine invertebrates - 1,000,000

ciliates - 10,000,000 (or even more)

bacteria - 10,000,000 (only!)

Effective sizes of most natural populations are much smaller than their actual head counts, because of high variation in the number of offspring per individual. Thus, the minimal strength of selection that still matters must be at least ~10^-6 - and in some populations only much stronger selection can affect evolution.
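The estimation step described above amounts to solving H = 4N_e μ for N_e. A minimal sketch follows; the input values are illustrative assumptions, not data from the lecture, and the function name is hypothetical.

```python
def effective_size_from_heterozygosity(h, mu):
    """Solve H = 4 * N_e * mu for N_e (neutral locus, diploid population)."""
    return h / (4.0 * mu)

# Illustrative inputs only (not taken from the lecture):
# H = 0.001 per site and mu = 2.5e-8 per site per generation.
n_e = effective_size_from_heterozygosity(0.001, 2.5e-8)
print(f"estimated N_e: {n_e:,.0f}")

# Weakest selection that still matters, taking N_e * s ~ 1 as the boundary.
print(f"threshold s ~ 1/N_e = {1.0 / n_e:.0e}")
```

The second line of output shows why the largest N_e values in the list set the floor for the strength of selection that can matter.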

Let us now consider:

1) Selection that promotes changes (positive)

2) Selection that prevents changes (negative and balancing)

3) Weak or absent selection


1) Selection that promotes changes

a) the complete story of one allele replacement:

Replacements of old, inferior alleles by new, superior alleles, driven by positive selection, are the most important process in Microevolution, responsible for the evolution of adaptations. We already considered the last phase of this process. However, the whole story consists of 3 phases, the first two of which are affected by stochasticity, due to mutation and to genetic drift, respectively:

i) a beneficial mutation appears - the population has to wait for this to happen.

ii) the mutation survives unavoidable initial genetic drift.

iii) the mutation takes over: d[A]/dt = s[A](1-[A]).


Quantitative analysis of an allele replacement (rough, but fair):

i) a beneficial mutation appears - population has to wait. For how long?

A particular mutation appears once every 1/(Nμ) generations, where N is the number of breeding individuals and μ is the per generation rate of this mutation. Thus, a typical waiting time may be 100 generations for a simple nucleotide substitution (if μ = 10^-8, N = 10^6), or 100,000 generations for a 3-nucleotide insertion (why 3?), or forever for a complex event (evolution is mutation-limited to a large extent).

ii) the mutation survives unavoidable initial genetic drift. With what probability?

A mutation will survive with probability ~2s, after which the number of its carriers will be lifted up to ~1/(2s). Thus, if a typical strength of selection for a new, beneficial allele is 0.01, only ~1/50 of such alleles are not lost initially. In other words, the waiting time must be multiplied by ~50 (maybe more). This initial drift takes ~1/(2s) generations, after which a mutation is either lost or out of danger.

iii) the mutation takes over: d[A]/dt = s[A](1-[A]). How fast?

Deterministic propagation of the mutation, once the number of its carriers becomes larger than ~100, goes fast: it takes on the order of 10/s generations for [A] to grow from ~0.0001 to 0.9999 (the equation above gives about 18/s). Although our equation was derived for asexuals, sex, even with diploid selection, does not change much, as long as w_AA > w_Aa > w_aa.
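The three phases above can be put side by side with a few lines of arithmetic. This is a rough sketch using the lecture's own approximations: waiting time 1/(Nμ), survival probability ~2s (about 1 in 50 for s = 0.01), and the closed-form solution of d[A]/dt = s[A](1-[A]). The function names are illustrative.

```python
import math

def wait_for_appearance(n, mu):
    """Phase i: generations between occurrences of a particular mutation, 1 / (N * mu)."""
    return 1.0 / (n * mu)

def survival_probability(s):
    """Phase ii: rough chance that a new beneficial mutation escapes initial drift, ~2s."""
    return 2.0 * s

def takeover_time(s, p_start, p_end):
    """Phase iii: time for [A] to go from p_start to p_end under d[A]/dt = s[A](1 - [A]).

    The logistic equation integrates to
    t = (ln(p_end / (1 - p_end)) - ln(p_start / (1 - p_start))) / s.
    """
    def logit(p):
        return math.log(p / (1.0 - p))
    return (logit(p_end) - logit(p_start)) / s

n, mu, s = 10**6, 1e-8, 0.01  # values used in the text
appear = wait_for_appearance(n, mu)
established = appear / survival_probability(s)
sweep = takeover_time(s, 0.0001, 0.9999)

print(f"wait for one occurrence:          {appear:.0f} generations")
print(f"wait for a surviving occurrence:  {established:.0f} generations")
print(f"deterministic takeover:           {sweep:.0f} generations ({sweep * s:.1f}/s)")
```

With these numbers, waiting for a mutation that survives initial drift takes longer than the deterministic takeover itself, which illustrates why evolution is described as mutation-limited.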


ii) more on the unavoidable initial genetic drift

Drift is a fair process. Thus, after a neutral mutation appears, the expected number of its copies is always 1, so that if the mutation has not been lost after some generations (which happens with probability, say, 1%), the expected number of mutants, given that it survived, is 100. During the short time when this branching process matters, the selective advantage of the new beneficial allele does not matter much. If a beneficial mutation is lost, the population has to wait for its next occurrence.
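The "fair process" argument can be checked with a simple branching-process calculation. The sketch below assumes that each copy leaves a Poisson-distributed number of copies in the next generation (this offspring distribution is an assumption made here, not something stated in the lecture); the probability of loss is then obtained by iterating the generating function, with no random sampling needed.

```python
import math

def survival_by_generation(mean_offspring, generations):
    """Probability that a lineage started by one copy is still present.

    Assumes a Poisson-distributed number of offspring copies per copy
    (an assumption of this sketch). The extinction probability q_t obeys
    q_(t+1) = exp(mean_offspring * (q_t - 1)), starting from q_0 = 0.
    """
    q = 0.0
    for _ in range(generations):
        q = math.exp(mean_offspring * (q - 1.0))
    return 1.0 - q

# Neutral mutation: the expected number of copies stays 1, so the expected
# number of copies given survival is 1 / P(survival).
neutral = survival_by_generation(1.0, 200)
print(f"neutral, 200 generations: survival {neutral:.4f}, "
      f"expected copies if not lost {1.0 / neutral:.0f}")

# Beneficial mutation with s = 0.01: long-run survival, to compare with ~2s.
beneficial = survival_by_generation(1.01, 5000)
print(f"s = 0.01, long run: survival {beneficial:.4f} (compare 2s = 0.02)")
```

Under this assumption a neutral mutation survives 200 generations with probability of about 1%, in which case it is expected to be present in about 100 copies, matching the example in the text.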


iii) more on how a mutation takes over after becoming frequent

Is dominance/recessivity important for Darwinian evolution? Not much, at least as a rule. If the beneficial allele is partially recessive, its initial expansion is retarded, and if it is partially dominant, the final stage of its expansion is retarded. And without dominance (w_Aa^2 = w_AA x w_aa), sex changes nothing.


Still, this is not yet a complete story of an adaptive allele replacement. Indeed, a replacement affects genetic variation at other loci, due to a phenomenon known as hitch-hiking. In asexuals, when a unique beneficial mutation reaches fixation, it "accidentally" drives to fixation all those variants that happened to be in the genotype in which it occurred.


In this respect, sex makes a substantial difference: due to recombination, it limits the impact of hitch-hiking to only a relatively small region of the chromosome. A replacement driven by positive selection produces a region of very low variation, flanked by regions with some high-frequency derived alleles.

A beneficial mutation (red) in a population with many segregating neutral (green) and slightly deleterious (blue) variants.

Half-way towards fixation, the beneficial mutation carries with it the close-by variants.

Some of these variants become detached, due to crossing-over, by the time of the fixation.

This process of removal of genetic variation close to the site of an advantageous allele replacement is called a selective sweep. At the boundaries of the region of a sweep, initially rare variants can reach high frequencies, but not fixation.


The size of the segment of a chromosome that is swept of genetic variation due to an adaptive allele replacement increases with s, the coefficient of selection in favor of the new allele, and declines as N_e and r, the probability of recombination between two nucleotides, increase. If selection is strong, the allele replacement occurs fast, and a larger segment of the genome will be swept. If the population is infinite and/or recombination is extremely fast, the effect of hitch-hiking would disappear.


b) Overlapping selection-driven allele replacements:

Under reasonable strength of selection, a selection-driven allele replacement can take from ~100 to ~10,000 generations. Thus, successive replacements in a population would not overlap in time, if there is, on average, much less than 1 of them per 100-10,000 generations. It seems that some populations accumulate beneficial alleles at faster rates. If so, different selection-driven allele replacements must overlap in time. How can this happen?

Here, the role of sex is critical: in a population of a realistic size, overlapping adaptive allele replacements can happen only with sex.

Asexual population: beneficial alleles at different loci that emerged in different individuals compete with each other. A replacement sweeps the whole genome!

Sexual population: beneficial alleles at different loci that emerged in different individuals can find their way into the same genotype, due to recombination.


Still, even with sex, overlapping adaptive allele replacements may signify a problem for the population. Indeed, they imply that the fitness landscape changes fast; if this happens, can the population survive?

When a population follows a rapidly moving fitness peak, it lags behind it substantially - and without a big lag there would be no overlapping adaptive replacements. The reduction of the mean population fitness W relative to the optimal fitness w_max that corresponds to the top of the fitness peak is called the lag genetic load: L = (w_max - W)/w_max.

Haldane's dilemma: If a population has to follow a fitness peak that moves too rapidly, and, thus, tries to accumulate too many adaptive allele replacements per unit time, it may suffer from too high a lag load and go extinct. Remember that, in order to sustain the population with L = 0.8, individuals of the optimal genotype must produce, on average, at least 5 (or 10 with sex) offspring.

Suppose that, at a given moment, 100 adaptive allele replacements are occurring. Thus, an average individual lags behind the optimum by 50 alleles. Would such an individual be viable?


Is Haldane's dilemma real? If positive selection favors new, advantageous alleles independently, the fitness of an individual with k such alleles, each with advantage s, is (1+s)^k ~ e^(ks). Then, the average individual may have fitness that is way below the fitness of the optimal genotype and the lag load would be too high if many replacements occur at the same time. However, epistatic selection can abolish this problem: if selection is soft and 50% of the population is left to reproduce, the per individual number of good alleles increases by ~1 standard deviation of the number of beneficial alleles per individual - which can be a lot. We do not know what kind of selection - hard and exponential or soft and epistatic - is responsible for adaptive evolution. Depending on the answer, the rate of adaptive evolution is either limited or not limited by the lag load.
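To get a feeling for the numbers in the independent (exponential) case, the lag load of an individual that lacks 50 beneficial alleles can be computed directly. This is a minimal sketch under the multiplicative fitness model described above; the values of s are illustrative, and the offspring calculation covers only the asexual case.

```python
def lag_load_multiplicative(k_missing, s):
    """Lag load of an individual lacking k_missing beneficial alleles.

    Assumes independent (multiplicative) selection: fitness relative to the
    optimal genotype is (1 + s) ** (-k_missing).
    """
    return 1.0 - (1.0 + s) ** (-k_missing)

def offspring_needed(load):
    """Offspring the optimal genotype must produce for the population to
    replace itself (asexual case): 1 / (1 - L)."""
    return 1.0 / (1.0 - load)

for s in (0.01, 0.05):
    load = lag_load_multiplicative(50, s)
    print(f"s = {s}: lag load {load:.3f}, "
          f"optimal genotype needs {offspring_needed(load):.1f} offspring")

# The example from the text: L = 0.8 requires about 5 offspring without sex.
print(f"L = 0.8: {offspring_needed(0.8):.1f} offspring")
```

The load grows quickly with s, which is why the answer to the dilemma depends so strongly on whether selection is really exponential.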


c) Selection-driven allele replacements in spatially structured populations

Replacement of an old, inferior allele with the new, beneficial allele can be substantially slowed down by spatial structure of the population. Propagation of a beneficial allele follows "traveling wave" dynamics, with the velocity of propagation 2(ms)^(1/2), where m is the rate of (localized) migration and s is the selective advantage of the new allele. Occasional long-distance leaps of some individuals can speed up this process substantially.

Without epistasis, waves of propagation of different alleles proceed approximately independently within sexual populations.


Global Spread of Chloroquine-Resistant Strains of Plasmodium falciparum.

Microevolutionary theory obviously needs to take into account spatial structure when the spread of an advantageous genotype is considered. However, it does not radically alter the outcome of evolution under strong selection, with the only exception of speciation.


2) Selection that prevents changes

Two forms of selection prevent changes - negative selection and balancing selection.

Selection that prevents changes is much less important, from the point of view of evolutionary biology, than positive selection. The human eye and the peacock's tail evolved due to positive selection! Still, selection that prevents changes cannot be ignored completely, because negative selection is the most common form of selection in natural populations. In other words, a vast majority of mutations that lead to a substantial change of the phenotype are deleterious.

One more example of negative selection:

deleterious mutations in human rhodopsin.

ADRP: autosomal dominant retinitis pigmentosa.

ARRP: autosomal recessive retinitis pigmentosa.

CSNB: congenital stationary night blindness.


Under negative selection, population suffers from genetic load, which can be referred to as mutation load (abolish mutation, and this load would disappear).

Let us consider the simplest case of one locus with two alleles, A and a, assuming asexual reproduction or sex with selection in the haploid phase. Fitnesses of alleles A and a are 1 and 1-s, respectively. Deleterious mutations A -> a occur with rate μ. Mutation and selection lead to the following changes in allele frequencies (assuming that a is rare, because s >> μ):

        mutation                selection
[a]_t ------------> [a]_t + μ ------------> ([a]_t + μ)(1-s) = [a]_(t+1)

At equilibrium:

([a] + μ)(1-s) = [a]

[a] + μ - s[a] - sμ = [a]

Ignoring the term -sμ (a product of two small numbers), we obtain

[a]_eq = μ/s.

What is the value of genetic load under mutation-selection equilibrium?

L = 1 - W/w_max; w_max = 1; W = 1[A]_eq + (1-s)[a]_eq; [a]_eq = μ/s.

Thus, L = 1 - 1(1 - μ/s) - (1-s)μ/s = μ.
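The derivation above can be checked numerically by iterating the same recursion, in which the frequency of a first increases by the mutation rate and is then multiplied by (1-s). This is only a sketch: the mutation rate (written mu in the code) and the values of s are illustrative, and the function names are not from the lecture.

```python
def equilibrium_frequency(mu, s, generations=50000):
    """Iterate [a]_(t+1) = ([a]_t + mu) * (1 - s), the recursion used in the text."""
    a = 0.0
    for _ in range(generations):
        a = (a + mu) * (1.0 - s)
    return a

def mutation_load(a, s):
    """L = 1 - W with w_max = 1 and W = 1 * (1 - a) + (1 - s) * a."""
    mean_fitness = (1.0 - a) + (1.0 - s) * a
    return 1.0 - mean_fitness

mu = 1e-6  # illustrative mutation rate
for s in (0.001, 0.01):
    a_eq = equilibrium_frequency(mu, s)
    print(f"s = {s}: [a]_eq = {a_eq:.3e} (mu/s = {mu / s:.3e}), "
          f"load = {mutation_load(a_eq, s):.3e} (mu = {mu:.3e})")
```

The equilibrium frequency changes tenfold between the two values of s, while the load stays within about 1% of the mutation rate; the small deviation is the neglected product of s and the mutation rate.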


This remarkable fact, L = μ, is known as the Haldane-Muller principle: mutation load is equal to mutation rate, and does not depend on the strength of selection against mutations ("one mutation - one genetic death"). Of course, this is true only if selection removes mutations one-by-one. What other situations are possible?

1) Recessive mutations at one locus. Consider two alleles at one locus of sexual diploids, with fitnesses 1 (AA), 1-hs (Aa), and 1-s (aa), where h characterizes dominance of the deleterious allele a. When a is recessive, mutation load is two times lower: if deleterious alleles are removed only as homozygotes, each genetic death removes two alleles.


2) Truncation or similar epistatic selection against mutations at many loci. Exponential selection removes mutations, in a sense, one-by-one, but under truncation one genetic death can remove many mutations, reducing the mutation load (only with sex).

Both recessivity and truncation are forms of synergistic epistasis between different deleterious alleles; when present together, mutations reinforce deleterious effects of each other. How important this phenomenon is in nature remains a matter of debate.
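The claim in case 1 above, that a recessive deleterious allele produces half the load, can be illustrated by iterating selection and mutation at one diploid locus with fitnesses 1 (AA), 1-hs (Aa), and 1-s (aa). The sketch assumes random mating and one-way mutation A -> a; the parameter values are illustrative, and mu in the code denotes the mutation rate.

```python
def equilibrium_load(mu, s, h, generations=50000):
    """Mutation load at one diploid locus with fitnesses 1, 1 - h*s, 1 - s.

    Assumes random mating (Hardy-Weinberg proportions before selection) and
    one-way mutation A -> a at rate mu; both are assumptions of this sketch.
    """
    q = 0.0  # frequency of the deleterious allele a
    for _ in range(generations):
        p = 1.0 - q
        mean_w = p * p + 2.0 * p * q * (1.0 - h * s) + q * q * (1.0 - s)
        # Frequency of a after selection, then after mutation.
        q_sel = (p * q * (1.0 - h * s) + q * q * (1.0 - s)) / mean_w
        q = q_sel + mu * (1.0 - q_sel)
    p = 1.0 - q
    mean_w = p * p + 2.0 * p * q * (1.0 - h * s) + q * q * (1.0 - s)
    return 1.0 - mean_w  # load relative to w_max = 1

mu, s = 1e-5, 0.1  # illustrative values
load_semidominant = equilibrium_load(mu, s, h=0.5)
load_recessive = equilibrium_load(mu, s, h=0.0)
print(f"h = 0.5: load / mu = {load_semidominant / mu:.3f}")
print(f"h = 0:   load / mu = {load_recessive / mu:.3f}")
```

In diploids each genetic death of a heterozygote removes one mutant allele out of two alleles carried, so the load is about twice the mutation rate; with full recessivity each death removes two mutant alleles and the load drops to about the mutation rate.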


In addition to negative selection, changes of the population can also be prevented by balancing selection, which, however, keeps the population variable. One form of balancing selection is the direct dependence of fitnesses of genotypes on allele frequencies, with rare genotypes having an advantage. Interactions of selection with Mendelian segregation lead to another curious form of balancing selection, due to advantage of heterozygotes.

Consider a population of sexual diploids with two alleles, A and a, at one locus. Fitnesses of the 3 possible genotypes, AA, Aa, and aa, are wAA, wAa, and waa, respectively. If wAA < wAa > waa (advantage of heterozygotes), selection protects variation.

Indeed, due to Hardy-Weinberg law, a rare allele is mostly exposed to selection in heterozygous state and, thus, advantage of heterozygotes leads to a higher fitness of rare alleles.

Frequencies of the 3 possible genotypes, AA, Aa, and aa, are [A]^2, 2[A][a], and [a]^2. If, for example, A is rare, [A]^2 is small relative to 2[A][a], so that rare A will be mostly present in heterozygotes.
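That heterozygote advantage protects variation can be seen by iterating selection from both extremes: whichever allele starts rare, it increases. The sketch below assumes random mating and no mutation or drift; the fitness values are illustrative only.

```python
def next_frequency(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one generation of selection under random mating."""
    q = 1.0 - p
    mean_w = p * p * w_AA + 2.0 * p * q * w_Aa + q * q * w_aa
    return p * (p * w_AA + q * w_Aa) / mean_w

def long_run_frequency(p0, w_AA, w_Aa, w_aa, generations=2000):
    """Iterate selection starting from frequency p0 of allele A."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, w_AA, w_Aa, w_aa)
    return p

w_AA, w_Aa, w_aa = 0.9, 1.0, 0.8  # illustrative fitnesses with w_AA < w_Aa > w_aa
for p0 in (0.01, 0.99):
    p_final = long_run_frequency(p0, w_AA, w_Aa, w_aa)
    print(f"start [A] = {p0}: long-run [A] = {p_final:.4f}")

# Analytical equilibrium for comparison.
p_eq = (w_Aa - w_aa) / (2.0 * w_Aa - w_AA - w_aa)
print(f"predicted equilibrium [A] = {p_eq:.4f}")
```

Both starting points converge to the same intermediate frequency, so neither allele can be lost by selection alone.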


A few examples of balancing selection are known, but this mode of selection is rare.


3) Weak or absent selection

The simplest case of strict selective neutrality is particularly important, because there are many neutral nucleotide sites, at least in large genomes. In this case

1. Equilibrium virtual heterozygosity is H = 4N_e μ in a diploid population. Derivation is easy, but we will not consider it.

2. Rate of evolution, the per generation frequency of allele replacements, equals the mutation rate μ.

The probability of occurrence of a new mutation is Nμ per generation (say, 0.001). A new mutation will then be fixed with probability 1/N, because it has the same probability of eventually taking over the population as any other allele (selection is absent). This gives us Nμ x 1/N = μ allele replacements per generation.


If selection is not totally absent (s ≠ 0), it can be regarded as "weak" if N_e s < 20. In this case the superior allele is not fixed permanently. In the simplest case of symmetric mutation, the rate of evolution and the level of variation are maximal when selection is absent, and decline very rapidly when selection gets stronger.

Locus A with two alleles A1 and A2, symmetric mutation with rate μ, such that 4N_e μ << 1, so that most of the time either allele is fixed. Rate of evolution is the frequency of switches between A1 and A2 fixations.


Detecting natural selection

We reviewed the very basics of the direct theory of Microevolution, which tells us how all its factors, working together, affect genetic variation within populations. However, this theory is useful only if we know the actual parameters of factors of Microevolution. This can be accomplished either by direct measurements, for example of the mutation rate (by parent-offspring comparisons), or through inverse theory of Microevolution, which infers, from patterns in genetic variation and allele replacement, the parameters of these factors.

We already saw how this works for measuring genetic drift: theory predicts that without selection H = 4N_e μ; thus, if we know μ and can measure H, we can recover N_e (which is almost impossible to observe directly).

Now we will consider the key issue of measuring natural selection. Indeed, measuring fitnesses directly is very difficult (it is essentially impossible to measure fitness of a multicellular organism with an error less than 1-3%), and the results obtained in the laboratory cannot be applied to wild populations. Thus, indirect methods based on inverse theory are crucial.


1) Detecting negative selection

This is a relatively easy task - because negative selection is very common. Negative selection affects evolving sequences in two ways:

1) it reduces the probability of fixation of a mutation with s < 0

2) it reduces the time until elimination of a mutation with s < 0

As a result, negative selection leaves two kinds of footprints:

1) reduced rate of evolution and the level of within-population variation

Reduced relative to what? - to the rate of evolution at selectively neutral sites. According to the fundamental theorem of neutral evolution, neutral sites evolve at the mutation rate (this is intuitively obvious). Practically, negative selection is detected by comparing the amount of interspecies divergence or within-population polymorphism to that at plausibly neutral sequence sites.

Can we detect negative selection at individual sites or only at sequence segments? This depends on the depth of the alignment.

Alignment of orthologous regulatory regions of 4 mammals. A transcription factor-binding site with low divergence is marked in blue. If the alignment includes only a few sequences, we can only detect substantial segments with reduced divergence rates (never call them mutation rates!) - for example, using the Hidden Markov Model technique.


A typical segment of an alignment of orthologous proteins from different species. Here the number of sequences makes it possible to detect negative selection even at individual sites.

Data on within-population variation usually allow us only to detect negative selection in wide classes of sites, for example to show that non-synonymous coding sites are under stronger selection than synonymous sites. However, with high H making inferences about individual sites may become possible. We badly need 100 genotypes of Ciona savignyi.


2) An excess of rare alleles

Distribution of allele (nucleotide) frequencies in Arabidopsis thaliana. PLoS Biology 3, 1289-1299, 2005.

At non-synonymous sites the excess of rare alleles, relative to the neutral expectation, is larger. Of course, here we cannot make inferences about individual sites.

However, we can make inferences about the strength of negative selection - because only alleles with small s are observed as rare polymorphisms.

In contrast, reduced rate of evolution tells us very little about the strength of selection: s = -0.001 is enough to stop evolution.


2) Detecting positive selection

This is a difficult and important problem - because positive selection is rare, relative to negative selection (this was proposed in 1935 by Ivan Schmalhausen) and because positive selection is the only driving force of adaptive evolution.

Positive selection affects evolving sequences in two ways:

1) it increases the probability of fixation of a mutation with s > 0

2) it reduces the time until fixation of a mutation with s > 0

The footprint of positive selection looks rather different depending on its age.

1) Positive selection accomplished a long time ago - interspecies comparisons

In contrast to negative selection, positive selection accelerates evolution (not the rate of mutation!). Thus, it makes sites or segments evolve faster than neutrally. As a result, we can detect positive selection only by comparing relatively close species, such that the number of accepted substitutions between them per neutral site, K_neu, is ~1-3. Ancient actions of positive selection, which occurred more than 1/μ generations ago (μ is the per nucleotide mutation rate), could never be detected.


Distribution of amino acid replacements along the Neisseria gonorrhoeae transmembrane porin sequence. Each dot represents one replacement. Obviously, sequence segments exposed outside the cell evolve much faster, probably due to positive selection. Molecular Biology and Evolution 17, 423-436, 2000.

So, if we have a large number of close enough sequences, even individual sites where K > K_neu (K_neu is measured for sites that are probably under no selection) can be detected. This approach works well for pathogens, with multiple moderately different strains.


Positive selection in HIV-1 protease, detected on samples from 40,000 patients. For each codon site, the ratio of the rate of the most common allele replacement over the neutral rate is shown (Journal of Virology 78, 3722-3732, 2004).


However, there are two problems with this approach:

1) Positive selection can act only within one clade, with negative selection acting at the same site in the rest of the phylogeny. Then, overall K will be low at the site.

2) There may be not enough species to measure K for individual sites. If so, all probably important sites are treated together, and their average per site number of changes, K_imp, is calculated. The trouble is that sites under positive selection are generally scattered among numerous sites under negative selection, leading to K_imp < K_neu. Only very rarely are there long enough segments with a majority of sites under positive selection.

Positive selection acting in one clade, on a sparse phylogenetic tree.

Sophisticated statistical methods can be used to analyze such data - but, in my opinion, they reliably detect positive selection only if it acts on a substantial fraction of sites, leading to K_imp > K_neu at least within a large clade - and this is generally very rare. Most "important" sites are, most of the time, under negative, and not positive, selection.


A clever idea of McDonald and Kreitman can offer some help. They realized that the condition K_imp > K_neu (or K_imp/K_neu > 1) can be relaxed. If negative selection is strong, "important" sites under it will not be polymorphic in the population. Sites under positive selection also make only a minimal contribution to polymorphism (because polymorphism in the course of an allele replacement is very short-lived). Thus, instead of asking for

K_imp/K_neu > 1

as a signature of positive selection it is enough to ask for

K_imp/K_neu > H_imp/H_neu

H_imp/H_neu can be as low as 0.2-0.3 (due to a large fraction of sites under negative selection among "important" sites), so this is a much less stringent condition.

One problem with this approach is that slightly deleterious variants with -s ~ 1/N_e can segregate within the population, but are only rarely fixed, and thus inflate H_imp/H_neu. A possible way of dealing with this problem is to ignore rare variants.

Some applications of the McDonald-Kreitman test to Drosophila species suggest that as many as 50% of allele replacements in fly evolution were driven by positive selection, because

K_imp/K_neu = 2H_imp/H_neu

In contrast, in mammals K_imp/K_neu < H_imp/H_neu, suggesting no positive selection. The reasons for such contrast are unclear. Anyway, the MK test could never establish identities of individual sites under positive selection.
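The step from "K_imp/K_neu is twice H_imp/H_neu" to "50% of replacements were adaptive" can be made explicit with a commonly used estimator, alpha = 1 - (H_imp/H_neu)/(K_imp/K_neu). The estimator is not spelled out in the lecture, so treat the sketch below as one reasonable reading of the argument; all input numbers are hypothetical.

```python
def adaptive_fraction(k_imp, k_neu, h_imp, h_neu):
    """Estimated fraction of replacements at "important" sites driven by positive selection.

    Uses alpha = 1 - (H_imp / H_neu) / (K_imp / K_neu). A negative value means
    K_imp/K_neu < H_imp/H_neu, i.e. no evidence of positive selection.
    """
    return 1.0 - (h_imp / h_neu) / (k_imp / k_neu)

# Hypothetical numbers chosen so that K_imp/K_neu = 2 * H_imp/H_neu (the fly case).
fly_like = adaptive_fraction(k_imp=0.05, k_neu=0.10, h_imp=0.0025, h_neu=0.01)
print(f"fly-like example: alpha = {fly_like:.2f}")

# Hypothetical numbers with K_imp/K_neu < H_imp/H_neu (the mammal case).
mammal_like = adaptive_fraction(k_imp=0.02, k_neu=0.10, h_imp=0.003, h_neu=0.01)
print(f"mammal-like example: alpha = {mammal_like:.2f}")
```

The idea behind the estimator is that H_imp/H_neu measures the fraction of "important" mutations that are effectively neutral, so any divergence at "important" sites in excess of that fraction is attributed to positive selection.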


2) Positive selection accomplished recently - within-population variation

A recent allele replacement driven by positive selection produces a region of very low variation, flanked by regions with some high-frequency derived alleles. Such a scar of an allele replacement is due to an effect called hitch-hiking, and it remains visible for ~N_e << 1/μ generations, where N_e is the effective population size and μ is the per nucleotide mutation rate.

A beneficial mutation (red) in a population with many segregating neutral (green) and slightly deleterious (blue) variants.

Half-way towards fixation, the beneficial mutation carries with it the close-by variants.

Some of these variants become detached, due to crossing-over, by the time of the fixation.


Reduced levels of genetic variation around the site of recent positive selection-driven allele replacement (selective sweep) in human populations from Africa (a), Europe (b), and East Asia (c) (Nature Genetics 39, 218 - 225, 2007).

There are several definite known cases of recently accomplished selective sweeps.


(a) Kenyan and Tanzanian C-14010 lactase-persistent (red) and non-persistent G-14010 (blue) homozygosity tracts. (b) European and Asian T-13910 lactase-persistent (green) and C-13910 non-persistent (orange) homozygosity tracts. Positions are relative to the start codon of lactase locus (Nature Genetics 39, 31 - 40, 2006).

3) Ongoing positive selection - within-population variation

One must be lucky to study the right population at the right time. Still, there are some definite cases of ongoing allele replacements driven by strong positive selection. One of them is the parallel acquisition of the ability of adults to digest milk (due to persistent expression of lactase) in Africans and non-Africans. These ongoing sweeps left clear-cut signatures.


4) A different approach - detecting positive selection by bursts of substitutions

Suppose that at a codon site fitness landscape was suddenly changed. The new optimal amino acid may not be reachable from the old one by a single nucleotide substitution. Then, a clump of two or even three non-synonymous substitutions may follow. Such clumps were observed in evolution of mammals and HIV-1 (PNAS 103, 19396-19401, 2006).

Clumping of nonsynonymous substitutions is the strongest in conservative regions of proteins, where the 1:1 situations occur only in ~20% of codons. Indeed - if an important amino acid is replaced, this must be beneficial. This approach reveals a number of slowly-evolving sites that occasionally undergo positive selection.

Amino acid sites inferred to be under positive selection in HIV-1 gp120. Left: rapidly evolving sites previously inferred to be under positive selection. Right: conservative sites with strongly clumped substitutions.


3) Detecting balancing selection

Balancing selection, which requires changing fitness landscapes, favors rare alleles. It prevents fixations and losses of the alleles involved, leading to durable polymorphisms.

In the extreme case this can lead to transspecies polymorphisms, persisting from the time of species divergence. This is the case for the csd (complementary sex determination) locus in bees. Females must be heterozygous at this locus, and homozygotes develop into sterile males, causing strong selection against common alleles (Genome Res. 16, 1366-1375, 2006).


Quiz:

Suppose that there is a genome segment with low genetic variation within a population. This can be due to either negative selection or a recent selective sweep within this segment. What additional data can be used to distinguish between these two explanations?