neutrality - Arizona State Universityjtaylor/teaching/Spring... · Jukes (1969) ! Shift to using neutrality as a null hypothesis in ... 2/[p(1-p)] - P and σ p 2 are ... CV = Standard

Neutrality Test

•  First suggested by Kimura (1968) and King and Jukes (1969)

•  Shift to using neutrality as a null hypothesis in tests for positive selection and selective sweeps

•  Positive selection occurs when a new advantageous trait is segregating in a population

•  A selective sweep is the reduction of diversity at neutral alleles linked to a selected locus as the selected allele goes to fixation

Neutrality Test

•  Neutrality tests allow us to:
   -  Identify causes of species-specific phenotype differences
   -  Identify regions currently under selection
   -  Form hypotheses on function from genome data

•  Challenges in neutrality tests
   -  Extracting the data
   -  Identifying the loci under selection

Neutrality Test

•  Two main classes of neutrality test
   -  Allelic distribution and/or level of variability
   -  Comparisons of divergence/variability between different mutation classes within a locus

•  The former relies on major assumptions on population demographics

Single-Locus Test

•  Ewens Sampling Formula
   -  Sampling probability under the infinite allele model
   -  Ewens-Watterson Test

•  Compare the expected homozygosity with the observed homozygosity

•  If the observed homozygosity is larger than a threshold value, reject the null hypothesis
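A minimal sketch of the two quantities on this slide, assuming the standard form of the Ewens sampling formula and taking homozygosity to be the sum of squared sample allele frequencies. The function names and the representation of the sample (a list `a` where `a[j-1]` is the number of alleles seen exactly j times) are illustrative choices, not taken from the slides.

```python
from math import factorial

def ewens_probability(a, theta):
    """Probability of an allelic partition under the Ewens sampling formula.

    a[j-1] is the number of alleles represented exactly j times in the sample,
    so the sample size is n = sum(j * a_j).
    """
    n = sum(j * a_j for j, a_j in enumerate(a, start=1))
    # Rising factorial theta * (theta + 1) * ... * (theta + n - 1)
    rising = 1.0
    for i in range(n):
        rising *= theta + i
    prob = factorial(n) / rising
    for j, a_j in enumerate(a, start=1):
        prob *= theta ** a_j / (j ** a_j * factorial(a_j))
    return prob

def observed_homozygosity(allele_counts):
    """Sum of squared sample allele frequencies (the Ewens-Watterson statistic)."""
    n = sum(allele_counts)
    return sum((c / n) ** 2 for c in allele_counts)
```

In an actual Ewens-Watterson test the observed homozygosity would be compared with its distribution under the formula, conditional on the sample size and the number of alleles; that step is not shown here.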

Single-Locus Test

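•  Tajima's D-Test
   -  Nucleotide data
   -  $D = (\hat{\theta}_\pi - \hat{\theta}_\omega) / S_{\hat{\theta}_\pi - \hat{\theta}_\omega}$
   -  D is the scaled difference between two estimates of $\theta = 4N_e\mu$
   -  $\hat{\theta}_\pi$ is an estimator of θ based on the average number of pairwise differences
   -  $\hat{\theta}_\omega$ is an estimator of θ based on the number of segregating sites
   -  $S_{\hat{\theta}_\pi - \hat{\theta}_\omega}$ is an estimate of the standard error of the difference of the two estimates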
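The statistic can be sketched in a few lines, assuming the variance constants from Tajima's original formulation (a1, a2, b1, b2, c1, c2, e1, e2 below) and an input of equal-length aligned sequences without gaps. The function name and the returned tuple are illustrative choices.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Return (pi, S, D) for a list of aligned, equal-length sequences."""
    n = len(seqs)
    length = len(seqs[0])
    # S: number of segregating (polymorphic) sites
    S = sum(1 for i in range(length) if len({s[i] for s in seqs}) > 1)
    if n < 4 or S == 0:
        # The variance term is zero for n < 4, and D is undefined when S = 0
        raise ValueError("need at least 4 sequences and 1 segregating site")
    # pi: average number of pairwise differences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(x != y for x, y in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    theta_s = S / a1  # estimator based on segregating sites
    D = (pi - theta_s) / sqrt(e1 * S + e2 * S * (S - 1))
    return pi, S, D
```

A sample in which every variant is a singleton gives a negative D, because the pairwise-difference estimate falls below the segregating-sites estimate.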

Single-Locus Test

•  D-Test
   -  Difficulty in interpreting significant results
   -  Useful for detecting bottlenecks and subdivision as well as selective sweeps

Multiple-Loci Test

•  Lewontin-Krakauer test
   -  Data from diallelic loci from multiple populations
   -  $F = \sigma_p^2 / [\bar{p}(1 - \bar{p})]$
   -  $\bar{p}$ and $\sigma_p^2$ are the mean and variance of allele frequencies across populations
   -  If F is too large, the neutral hypothesis is rejected
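As a rough illustration, F for one diallelic locus can be computed from the allele frequency observed in each population. The slide does not say whether the variance is the population or the sample variance; the sketch below assumes the population variance, and the function name is hypothetical.

```python
from statistics import mean, pvariance

def lewontin_krakauer_f(freqs):
    """F = variance / [mean * (1 - mean)] of one allele's frequency across populations."""
    p_bar = mean(freqs)
    if not 0 < p_bar < 1:
        raise ValueError("mean allele frequency must lie strictly between 0 and 1")
    # Assumption: population variance (divide by the number of populations)
    return pvariance(freqs, mu=p_bar) / (p_bar * (1 - p_bar))
```

For frequencies 0.2, 0.4, 0.6 and 0.8 in four populations, the mean is 0.5, the variance is 0.05, and F = 0.05 / 0.25 = 0.2.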

Multiple-Loci Test

•  HKA Test
   -  Variability between and within species is compared for two or more loci
   -  Assumes that under neutrality
      •  Expected number of segregating sites within species and expected number of fixed differences between species are proportional to mutation rate
      •  Ratio of the two expectations is constant among loci
   -  Therefore if the divergence:polymorphism ratio at a locus is too high relative to the other loci, selection is at work

Multiple-Loci Test

•  HKA Test
   -  Challenge: Variance in segregating sites highly depends on demographics
   -  Example: Immigration from unknown population

•  M = Immigration rate

•  CV = Standard deviation divided by mean of the number of segregating sites

Multiple-Loci Test

•  Assumptions and Challenges
   -  Selection will contrast target alleles/loci
   -  Selection can be seen if there is a significant difference in adherence to the neutral model between loci

•  Challenges
   -  Our expected value and variance of D depend heavily on the demographic model

Multiple-Loci Test Example

Comparing Variability in Different Classes of Mutations

McDonald-Kreitman (MK) Type Tests

•  Traditionally used to detect and measure the amount of adaptive evolution within a species by determining whether adaptive evolution has occurred, and the proportion of substitutions that resulted from positive selection.

•  In general, the MK test compares the amount of polymorphism within a species and the divergence (substitutions) between species at neutral and non-neutral sites (advantageous or deleterious).

McDonald-Kreitman cont.

Setting up a MK test

•  Set up a two-way contingency table as shown below

•  Term clarification:

•  Synonymous: a point mutation that is silent (does not change the encoded amino acid); often used as a control

•  Nonsynonymous: a point mutation that changes the encoded amino acid

                 Fixed   Polymorphic
Synonymous       Ds      Ps
Nonsynonymous    Dn      Pn

Ds: the number of synonymous substitutions per gene

Dn: the number of nonsynonymous substitutions per gene

Ps: the number of synonymous polymorphisms per gene

Pn: the number of nonsynonymous polymorphisms per gene
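One way to test the 2x2 table for independence is Fisher's exact test, sketched here in pure Python. This is an illustrative choice of test rather than something the slides specify, the function names are hypothetical, and the counts in the usage note are made up.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

def mk_test(ds, ps, dn, pn):
    """P-value for the MK table: rows synonymous/nonsynonymous, columns fixed/polymorphic."""
    return fisher_exact_two_sided(ds, ps, dn, pn)
```

Calling `mk_test(20, 40, 10, 4)` with these hypothetical counts returns the probability of a table at least this unbalanced if the fixed:polymorphic split were the same for both site classes.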

McDonald-Kreitman cont.

First applied in 1991 to the Adh gene in Drosophila. The test provides a method to estimate the proportion of substitutions that are fixed by positive selection rather than by genetic drift.

Under neutrality, the ratio of ns. to s. variation within a species is expected to equal the ratio of ns. to s. variation between species:

Dn/Ds = Pn/Ps

McDonald-Kreitman cont.

When positive or negative selection influences ns. variation, the ratios will no longer be equal.

The ratio of ns. to s. between species is lower than the ratio of ns. to s. within species when negative selection is high and deleterious alleles strongly affect polymorphism:

Dn/Ds < Pn/Ps

The ratio of ns. to s. within species is lower than the ratio of ns. to s. between species when positive selection is high.

Dn/Ds > Pn/Ps

Advantageous mutations do not necessarily contribute to polymorphism but have an effect on divergence.
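The direction of the inequality can be read off the four counts by cross-multiplying, which avoids dividing by a zero count. This is only a sketch with a hypothetical function name; the direction alone says nothing about statistical significance.

```python
def mk_direction(ds, ps, dn, pn):
    """Report which of Dn/Ds and Pn/Ps is larger, by cross-multiplication."""
    left = dn * ps   # proportional to Dn/Ds
    right = pn * ds  # proportional to Pn/Ps
    if left > right:
        return "Dn/Ds > Pn/Ps"  # pattern expected under positive selection
    if left < right:
        return "Dn/Ds < Pn/Ps"  # pattern expected under negative selection
    return "Dn/Ds = Pn/Ps"      # pattern expected under neutrality
```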

McDonald-Kreitman cont.

Possible shortcoming of the MK type tests:

•  It’s not always clear what type of selection is acting upon a gene

•  Ex: changes in pop size combined with weak selection against slightly deleterious mutations may either increase or decrease the number of ns polymorphisms

•  An increase in pop size will lead to excessive ns polymorphisms

•  Significant results from MK cannot be interpreted directly as evidence for positive selection

The Genomic Rate of Adaptive Evolution—Smith and Eyre-Walker

Additional work with MK tests by Smith and Eyre-Walker:

α = 1 – (DsPn)/(DnPs)

•  In the above equation, α = proportion of substitutions driven by positive selection.

•  See research handouts.
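A short sketch of the α calculation from the four MK counts. The function name and the counts in the usage note are illustrative, not data from the slides.

```python
def alpha(ds, ps, dn, pn):
    """Proportion of substitutions driven by positive selection: 1 - (Ds*Pn)/(Dn*Ps)."""
    if dn == 0 or ps == 0:
        raise ValueError("Dn and Ps must be non-zero")
    return 1 - (ds * pn) / (dn * ps)
```

With hypothetical counts Ds = 20, Ps = 40, Dn = 10, Pn = 4, this gives α = 1 - 80/400 = 0.8. The estimate is negative whenever Dn/Ds < Pn/Ps.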

Test Based on Allelic Distribution in ns and s Sites

Some tests are done by examining different types of sites (non-protein coding sites)

•  Differences in the allelic distributions (frequency spectra) between s and ns polymorphisms.

•  Used for genomic data sets in which a large number of polymorphisms can be obtained.
   •  Microsat data?

•  Nielsen and Weinreich (1999) performed frequency spectra analysis in the human genome.
   •  Differences in the average age of ns and s mutations provided evidence for selection.

Tests Based on the dN/dS Ratio or ω

The most direct method for showing the presence of positive selection is to demonstrate that the number of ns substitutions per ns site (dN) is much larger than the number of s substitutions per s site (dS)

Definitions…The dN

dN (alternatively designated Ka) is a measure of the degree to which two homologous coding sequences differ with respect to amino-acid content.

•  Specifically, it indicates the degree to which two sequences differ at ns sites (substitution that changes the aa).

•  dN is the average number of nucleotide differences between the sequences per ns site.

More Definitions…The dS

dS (alternatively designated Ks) is a measure of the degree to which two homologous coding sequences differ with respect to silent nucleotide substitutions (substitutions that do not cause an amino-acid substitution).

•  It indicates the degree to which two sequences differ at s sites (substitution that does not change the aa).

•  dS is the average number of nucleotide differences between sequences per synonymous site.

Tests Based on the dN/dS Ratio or ω

A value of dN > dS implies that ns mutations are fixed with a higher probability than neutral ones due to positive selection.

Testing H0: ω ≤ 1 (dN ≤ dS) for an entire gene is a very conservative test of neutrality. Purifying selection must occur frequently in functional genes to preserve function.

Therefore, the average dN is expected to be much less than the average dS, even if positive selection is occurring at some sites.
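A rough sketch of how ω could be computed for a pair of sequences once the differences and sites have been counted. It assumes the counts are already available (counting them, for example with the Nei-Gojobori method, is not shown) and applies a Jukes-Cantor correction for multiple hits, which is one common choice rather than something the slides specify. All names and numbers are illustrative.

```python
from math import log

def jukes_cantor(p):
    """Convert a proportion of differing sites into substitutions per site."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75)")
    return -0.75 * log(1 - 4 * p / 3)

def dn_ds(ns_diffs, ns_sites, s_diffs, s_sites):
    """Return (dN, dS, omega) from counted differences and sites."""
    d_n = jukes_cantor(ns_diffs / ns_sites)
    d_s = jukes_cantor(s_diffs / s_sites)
    if d_s == 0:
        raise ValueError("dS is zero, so omega is undefined")
    return d_n, d_s, d_n / d_s
```

With hypothetical counts of 5 ns differences over 100 ns sites and 10 s differences over 50 s sites, ω comes out well below 1, the pattern expected for a gene dominated by purifying selection.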

Differences in MK and H0:ω≤1

Rejecting H0: ω ≤ 1 in favor of ω > 1 is to date the only direct method available for detecting positive selection from DNA sequence data.

Limitations: these methods assume no recombination, and the effect of strong codon bias on them has not been systematically explored (2001).

**Have the above limitations been investigated yet?