19
Balancing Selection Balancing selection occurs when selection acts to maintain different aleles in a population. This can occur through several different mechanisms. Heterozygote advantage/Overdominance - heterozygotes have higher fitness than homozygotes. Coevolutionary dynamics - alleles that are common in one species may become less fit because they are exploited by opposing species (pathogens, predators). Negative assortative mating - individuals preferentially mate with individuals that carry different alleles. Spatially and temporally fluctuating selection - different alleles may be favored at different times or in different locations. What these all have in common is that they lead to negative frequency-dependent selection: as an allele becomes more common, its fitness decreases below the mean fitness of the population. Jay Taylor (ASU) Balancing Selection 23 Feb 2017 1 / 19

Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

Balancing Selection

Balancing selection occurs when selection acts to maintain different aleles in apopulation. This can occur through several different mechanisms.

Heterozygote advantage/Overdominance - heterozygotes have higher fitnessthan homozygotes.

Coevolutionary dynamics - alleles that are common in one species may becomeless fit because they are exploited by opposing species (pathogens, predators).

Negative assortative mating - individuals preferentially mate with individualsthat carry different alleles.

Spatially and temporally fluctuating selection - different alleles may be favoredat different times or in different locations.

What these all have in common is that they lead to negative frequency-dependentselection: as an allele becomes more common, its fitness decreases below the meanfitness of the population.

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 1 / 19

Page 2: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

Example: Sickle cell anemia

The sickle cell mutation in the β-globin gene results in a single amino acidsubstitution that causes hemoglobin to form long rod-like structures in βSβS

homozygotes.

Heterozygotes βAβS are less susceptible to P. falciparum malaria and have higherfitness than either homozygote in areas of high malaria incidence.

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 2 / 19

Page 3: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

Example: Self-incompatibility Loci

Many plants avoid inbreeding through self-incompatibility loci.

SI is mediated through pollen-pistil interactions.

Fertilization can occur only if the pollen and pistil carry different SI alleles

Common SI alleles are at selective disadvantage because they are more likely toexperience incompatibility.

This leads to strong balancing selection on SI loci, which can harbor tens tohundreds of distinct haplotypes.

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 3 / 19

Page 4: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

Example: Social polymorphism in White-throated Sparrows

WTSP’s come in two morphs (white-striped and tan-striped) that differ inplumage, behavior and mate preference.

Controlled by a 104 Mb supergene with two divergent haplotypes 2 and 2m.

Maintained by strong (> 96%) negative assortative mating.

Recombination is suppressed by inversions.

2/2 = tan-striped

2/2m = white-striped

2m/2m = lethal

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 4 / 19

Page 5: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

Overdominance and Balancing Selection

If the fitness of the heterozygote is greater than the fitness of either homozygote,i.e., if σ12 > σ11, σ22, then the heterozygote is said to be overdominant. In this case,there is an intermediate frequency

p̄ =σ12 − σ22

2σ12 − σ11 − σ22

such that

A1 is more fit when p < p̄;

A2 is more fit when p > p̄;

Both alelles are equally fit whenp = p̄.

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 5 / 19

Page 6: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

Overdominance in Finite Populations

In a finite population, balancing selection will interact with genetic drift and the allelefrequencies will fluctuate around their equilibrium values.

0 5000 100000

0.5

1

NeutralityN=1000,µ=0.0002

Generation

p

0 5000 100000

0.5

1

Directional Selections =0.01

Generation

p

0 5000 100000

0.5

1

BalancingSelections =0.01*(0.5−p)

Generation

p

This produces a very different pattern than directional selection, but it may be difficultto distinguish weak or moderate balancing selection from neutral evolution.

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 6 / 19

Page 7: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

The tendency of balancing selection to maintain variation can also be seen in thedensity of the stationary distribution for this diffusion:

π(p) =1

Cp2µ1−1q2µ2−1e(2σ12−σ11)p(2p̄−p).

Symmetric Balancing Selection: In the following histograms, σ11 = σ22 = 0,2σ12 = 4Ns, and 4Nµ = 0.1.

0.05 0.2 0.35 0.5 0.65 0.8 0.95

p

0.00

0.05

0.10

0.15

0.20

Ns = 2.5

0.05 0.2 0.35 0.5 0.65 0.8 0.95

p

0.00

0.05

0.10

0.15

0.20

Ns = 10

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 7 / 19

Page 8: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

Genealogical Consequences of Balancing Selection

Balancing selection tends to increase the average coalescent time between haplotypescarrying distinct selected alleles.

This occurs because the alleles themselves can be maintained at high frequenciesfor long times.

Elevated coalescent times can also be seen at linked neutral loci.

This leads to elevated nucleotide diversity in the vicinity of the selected locus.

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 8 / 19

Page 9: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

Changes to the ancestral process can occurthrough the following events:

Two A1 lineages can coalesce.

Two A2 lineages can coalesce.

Each lineage can migrate between

backgrounds, through:

mutation at the selected locus;recombination between the selectedand marker loci.

The allele frequencies at the selected locuschange as we go backwards in time.

p

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.1

0.2

0.3

0.4

0.5

past%

present%

Common%ancestor%

coal%

coal%

coal%

rec%

mut%

1% 1% 1% 2%

2%

2% 2%

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 9 / 19

Page 10: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

The impact of balancing selection on linked marker loci is reduced both by high mutationrates at the selected locus and by recombination between the selected and marker loci.

1124 N. H. Barton and A. M. Etheridge

Figure 10.—The effect of balancing selection on the meanFigure 11.—The mean coalescence time, !, as a functioncoalescence time, E[!]. The strength of balancing selection,

of allele frequency; U " 0.5, p " 0.5, R " 0, Sb " 32, p0 "S b, increases to the right; negative values correspond to disrup-0.7. The thin solid curve gives the exact solution from Equa-tive selection; p0 " 0.7. The two curves are for mutation ratetions 5. This is compared with the prediction from EquationU " 0.5 (thick curve) and 0.25 (thin curve). There is no15 (dashed line), which assumes that allele frequency is fixed;recombination (R " 0). The straight lines are the predictionsthis prediction is independent of selection. The thick curveassuming that allele frequency is fixed at p0.shows the stationary distribution.

No movement between backgrounds: If there is noall mean, E[!], can be estimated either by integratingrecombination or mutation between alleles (U, R " 0)over the stationary distribution or by fixing p at its de-then the three identities given by Equations 1 changeterministic equilibrium value (Neuhauser and Kroneindependently. The between-class identity f1,1 clearly1997).tends to zero, while the within-class identities are those

Figure 10 shows the mean coalescence time as a func-of two separate populations of size p, q, which fluctuatetion of the strength of balancing selection. Here, U "according to a diffusion process. (Strong frequency de-N, # " 0.5 or 0.25, and so effects are not large: balancingpendence is required to prevent loss of variation at theselection has a substantial effect on genealogies onlyselected locus in the absence of mutation: near thewhen mutation is low enough that genes rarely moveboundaries, S → p$% as p → 0, % & 1.) However, evenbetween backgrounds. The mean coalescence time doesthis uncoupling into a set of single-variable equationsapproach the prediction from Equation 15 for large Sdoes not lead to an explicit solution. The identities(right side of figure), but balancing selection must becould be found by a change of timescale in the standardextremely strong for the deterministic limit to be accu-coalescent, but this does not lead to closed-form solu-rate. That is, weak random fluctuations in allele fre-tions (Donnelly and Kurtz 1999b).quency can substantially reduce coalescence times. NoteWeak random drift: If random drift is weak relativethat weak disruptive selection (e.g., underdominance)to mutation, recombination, and selection, then allelereduces coalescence times slightly, because allele fre-frequencies will be close to a deterministic limit. Manyquencies tend to sweep back and forth between alterna-treatments of the effects of selection on genealogiestive alleles (see left of graph). However, strong under-assume this limit and apply the standard structured coa-dominance has little effect, because the population islescent (e.g., Kaplan et al. 1988, 1989; Hudson andthen near fixation.Kaplan 1995; Wakeley 2001). For the model presented

Figure 11 shows an example in which fluctuationshere, this limit corresponds to dropping the diffusionsignificantly reduce mean coalescence time despiteterms (i.e., those terms involving !p or !p,p from Equationstrong selection. With S " Ns " 32, allele frequencies1) and thereby assuming that the population has alwayscluster around the equilibrium of p0 " 0.7 (bell-shapedhad the same allele frequency. This approximation iscurve). The coalescence time is almost independent ofnot explicitly dependent on the strength of selection,allele frequency, simply because populations away frombut does depend on both mutation and recombinationequilibrium are recently derived from populations closerates, which determine the rate of mixing between al-to equilibrium (thin solid curve). In contrast, the deter-lelic classes. Under this approximation, the mean pair-ministic prediction (dashed line) ignores the diffusionwise coalescence time isof populations between different allele frequencies andso depends more strongly on allele frequency. Taking! "

pqR ' p2q2(1 ' R) ' 4(Uc2 ' pqR)2 ' U(c3 ' 3p2q2)(Uc2 ' pqR) ' 4U 2(c1c2 $ pqpq) ' 8URc2pq ' 4p2q2R 2

,the value of this approximation at p0 " 0.7 gives a sub-stantial overestimate of mean coalescence time, even

where cj " (pq j ' qp j) . (15) under such strong selection.Figures 12 and 13 show similar results, but for muta-A similar expression can be obtained for the identity

in allelic state, which allows calculation of higher mo- tion/selection balance rather than for balancing selec-tion. The lines to the right in Figure 12 show the pre-ments of the distribution of coalescence times. The over-

1126 N. H. Barton and A. M. Etheridge

Figure 14.—The effect of balancing selection on mean Figure 15.—The effect of purifying selection on mean co-coalescence time, plotted against recombination rate, R; U ! alescence time, plotted against recombination rate, R; U !0.05. The vertical axis shows increases over the neutral value, 0.5, p ! 0.5. The vertical axis shows decreases from the neutralE["] # 1, on a logarithmic scale. The top thick curve shows value, 1 # E["], on a logarithmic scale. The top thick curvethe deterministic limit, in which allele frequency is assumed shows the deterministic limit for S ! 8, in which allele fre-to be fixed at p0 ! 0.7 (Equation 15). The thin solid curves quency is assumed to be fixed at p0 ! 1 # (U/S) (Equationare for Sb ! 4, 8, 16, 32 (bottom to top), calculated using 15). The thin solid curves are for S ! 0.5, 1, 2, 4, 8 (rightEquations 1 and 11. The dashed curves show the high recombi- side, bottom to top), calculated using Equations 1 and 11.nation limit (Equation 19). The dashed curves show the high recombination limit (Equa-

tion 19).

coefficient S ! S b(p0 # p) is zero at p ! p0. However,allele frequency fluctuates around this expectation with which is never the case for U ! 0.5, even when selectionvariance var(p) ! 1/4S b for large S b. Hence, E[S 2pq/ is strong. This approximation is expected to be accurateR 2] ! S bp0q0/4R 2. Note that this is not the same as the only for U $ 1. The large R approximation of Equa-limit of Equation 15 as R → ∞, which is 1/4R. Allele tion 19 is more accurate (dashed lines), but systemati-frequency fluctuations have a significant effect that can- cally underestimates the effect by !18%. Examinationnot be neglected even for strong selection. of the differences in mean coalescence time between

We examine the accuracy of these approximations backgrounds, %", &", shows that the approximation ofby considering the effect of increasing recombination. Equations 17 breaks down near the margins (q !Figure 14 shows the increase in mean coalescence time (1/R)). Since the stationary density is appreciable incaused by balancing selection, plotted against recombi- this region for R ! 10, much larger recombination ratesnation rate. Mutation is set to a small positive value would be needed for accuracy to be improved. As for(U ! 0.05), to ensure that a stationary distribution balancing selection, this approximation is accurate only

in cases where the effect on coalescence time is small.exists. However, mutation has a negligible effect on theresults unless linkage is tight (R ! U). The increase For tight linkage, the effect of purifying selection at

first increases with selection, but then decreases (seeover the neutral value, E["] # 1, is plotted on a logscale, because when R is large, small effects must be Figure 15, left side). This can be seen more clearly in

Figure 16, which shows the mean coalescence time asdiscerned. As selection increases (S b ! 4, 8, 16, 32,bottom to top), mean coalescence time converges to the a function of S ! Ns, with complete linkage. As selection

becomes very strong, the rare allele is driven out of thedeterministic limit (thick line; Equation 15) However,convergence is slow for large R. There, the large R population, and so the genealogy returns toward its

form under the neutral coalescent.approximation of Equation 19 is accurate (dashed linesto right). However, the approximation is good only for Large genealogies: We have concentrated on numeri-

cal examples for pairwise coalescence time partly be-R ' 10, in which case mean coalescence time is in-creased by at most 2.5%. The large R approximation is cause computations are then much faster, but also be-

cause the effects of fluctuations in allele frequency, andnot helpful for parameters that give a large effect.Figure 15 shows a similar plot for the decrease in hence of selection, are primarily on the deeper parts

of the genealogy, when there are just a few ancestralmean coalescence time caused by purifying selection.Now, mutation is set to a moderately high level (U ! lineages. Here, we use Equation 5 to consider the effect

of purifying selection on the expected total length of a0.5); with weak mutation, the population would almostalways be fixed, and effects on linked variation would larger genealogy. Figure 17 shows how the expected

length depends on allele frequency and on the composi-be negligible. The deterministic limit of Equation 15(top thick line) now performs poorly. This is because tion of the sample. If five genes of type P are sampled,

then the genealogy is much shorter when that allele isit is based on the assumption that allele frequency isclose to the deterministic equilibrium of 1 # (U/S), rare (line rising steeply from left to right); similarly, if

Source: Barton & Etheridge (2004)

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 10 / 19

Page 11: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

Balancing Selection and Transpecific Polymorphism

When balancing selection is both sufficiently strong and consistent, it can maintainancestral polymorphisms in multiple descendant species.

This is best documented for severalMHC loci, which exhibit TSP in bothprimates and rodents.

Other suspected examples involve lociinvolved in adaptive immunity.

Proving TSP is difficult because oneneeds to rule out convergent molecularevolution and horizontal exchange.

Source: Azevedo et al. (2015)

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 11 / 19

Page 12: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

Fluctuating Selection

In some cases, the fitness of an allele may depend on environmental conditions thatchange over time. How such loci evolve will then depend on how rapidly theenvironmental fluctuations occur relative to the generation time of the population.

If the environment fluctuates rapidly within the lifespan of an individual, then thefitness of each allele will be approximately the same from generation and classicaltheory will apply.

If environmental fluctuations take place over a small number of generations, thenthe fitness of each allele will change rapidly in relationship to genetic drift.

If environmental fluctuations take place over a large number of generations, thenthe locus may experience successive selective sweeps when a previously deleteriousallele becomes beneficial.

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 12 / 19

Page 13: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

A Model of Fluctuating Selection in a Finite Population

We can incorporate fluctuating selection into the bi-allelic Wright-Fisher model byallowing the fitnesses of the allleles A1 and A2 to vary at random from one generation tothe next. Here we will consider two scenarios.

Haploid model: Suppose that in generation t each allele Ai and A2 has fitness

wi (t) = 1 + s̄i + si (t),

where 1 + s̄i is the mean fitness of that allele (averaged over environments) and(s1(t), s2(t)) are IID random vectors.

Additive diploid model: Suppose that in generation t each genotype AiAj hasfitness

wij(t) = 1 + s̄ij + (si (t) + sj(t)),

where 1 + s̄ij is the mean fitness of that genotype (averaged over environments)and (s1(t), s2(t)) are IID random vectors.

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 13 / 19

Page 14: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

The effects of the environmental fluctuations on the allelic fitnesses are captured by therandom vectors (s1(t), s2(t)), which we will assume are independent andidentically-distributed (IID) with the following properties:

E[si (t)] = 0

E[si (t)sj(t)] = σij

The first condition is purely for convenience, since we have already explicitly specifiedthe mean fitness of each allele or genotype as s̄i or s̄ij . This then allows us to interpretthe quantities σij given in the second equation as the variances (σ11, σ22) or covariances(σij) of the environmental components of fitness.

Caveat: The results that follow also assume that the higher order moments E[σ4ij ] are

small relative to 1/N.

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 14 / 19

Page 15: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

Fluctuating selection affects the distribution of allele frequencies (p + q = 1) at theselected locus in three ways. First, it increases the variance of the frequency of A1, ascan be seen in the following identity:

Var(∆p) = v1p(1 − p) + v2p2(1 − p)2

where

v1 =1

N(haploid) or

1

2N(diploid)

v2 = Var(s1(t) − s2(t)) = σ11 − 2σ12 + σ22

Remarks:

This result shows that fluctuating selection can mimic genetic drift in the sensethat it increases the rate at which the allele frequencies will fluctuate from onegeneration to the next.

Like genetic drift, this effect will tend to reduce variation at the selected locus.

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 15 / 19

Page 16: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

In addition, fluctuating selection can also give rise to a mix of directional and balancingselection on the alleles. Specifically,

E[∆p] = d1p(1 − p) + d2p(1 − p)(1 − 2p)

where

d1 =

s̄1 − s̄2 + 1

2(σ22 − σ11) (haploid)

12(s̄11 − s̄22) + (σ22 − σ11) (diploid).

and

d2 =

12v2 (haploid)

v2 + s̄12 − 12(s̄11 + s̄22) (diploid).

Here, d1 is the strength of directional selection for or against A1, while d2 is the strengthof balancing selection. Notice that d1 is affected by differences in mean fitness and inthe variance of fitness.

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 16 / 19

Page 17: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

Qualitative impacts of fluctuating selection:

Fluctuating selection can both increase and reduce the amount of variationmaintained in a population.

Increased variation can be maintained by the balancing selection that occurs in afluctuating environment. The strength of this effect depends on the variance ofthe environmental fluctuations, as measured by d2 (which also accounts foroverdominance in the diploid model).

Variation will tend to be reduced both by the directional component of fluctuatingselection (measured by d1) and by the increased variance of the allele frequencies(measured by v2).

Whether variation will be increased or reduced depends on the relative sizes of v2,d1 and d2.

Alleles can be favored both because they have higher mean fitness or because theyhave lower fitness variance. The latter effect can be thought of as selection for bethedging.

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 17 / 19

Page 18: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

Sample paths under neutrality, fluctuating selection and balancing selection

All cases: v1 = 1, µ1 = µ2 = 0.01,d1 = 0 (no directional selection)

(A) haploid fluctuating: d2 = 0.5v2 = 25

(B) neutral: v2 = 0

(C) diploid fluctuating: d2 = v2 = 50

(D) balancing d2 = 5.53

(E) diploid fluctuating: d2 = 1.5v2 = 75

(F) balancing: d2 = 10.1

The strength of balancing selection in (D) and (F)was chosen to match the expected heterozygositymaintained by fluctuating selection in (C) and (E).

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 18 / 19

Page 19: Balancing Selectionjtaylor/teaching/Spring2017/BIO545/lectures/balancing.pdfBalancing Selection Balancing selection occurs when selection acts to maintain di erent aleles in a population

Fluctuating Selection and Genealogies

Figures show the distribution of the Tmrca of asample of 20 chromosomes at a neutral marker locuslinked to a locus under fluctuating (or balancing)selection in a diploid population.

All cases: v1 = 1, µ1 = µ2 = 0.01, d1 = 0 (nodirectional selection)

(A): d2 = v2 = 50

(B): d2 = 1.5v2 = 75

(C): d2 = 2v2 = 100

In each case, the strength of balancing selection waschosen to match the expected heterozygosity of thefluctuating selection model.

−40 −20 0 20 400

1

2

3

4

−40 −20 0 20 400

1

2

3

4

5

−10 −5 0 5 100

2

4

6

8

10

12

Jay Taylor (ASU) Balancing Selection 23 Feb 2017 19 / 19