Intro to population genetics
Shamil Sunyaev
Broad Institute of M.I.T. and Harvard
Forces responsible for genetic change
Mutation
Selection s
Drift Ne
Population structure FST
Mutations
Mutation rate in humans and flies
Per nt: 2.5x10^-8 (Nachman & Crowell), 1.8x10^-8 (Kondrashov)
Per genome: ~10^2 changes
Other events: indels (10^-9)
repeat extensions/contractions (10^-5)
large events (?)
NGS estimates: ~1.2x10^-8 per nt
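The per-nucleotide rates and the ~10^2 changes per genome are linked by simple arithmetic. The sketch below is only an illustration: the genome size (about 3.2x10^9 nt haploid, doubled for a diploid genome) is an assumption that does not appear on the slides, and the function name is hypothetical.

```python
# Assumption (not from the slides): approximate human genome size.
HAPLOID_GENOME_NT = 3.2e9
DIPLOID_GENOME_NT = 2 * HAPLOID_GENOME_NT


def new_mutations_per_genome(rate_per_nt, genome_nt=DIPLOID_GENOME_NT):
    """Expected number of new point mutations per genome per generation."""
    return rate_per_nt * genome_nt


# Per-nucleotide rates quoted on the slide.
rates = {
    "Nachman & Crowell": 2.5e-8,
    "Kondrashov": 1.8e-8,
    "NGS": 1.2e-8,
}

for source, mu in rates.items():
    print(f"{source}: ~{new_mutations_per_genome(mu):.0f} new changes per diploid genome")
```

Under this assumed genome size, all three rates give a count on the order of 10^2 per genome.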
Mutation rate is variable along the genome
Regional variation of mutation rate
Context dependence of mutation rate
Replication fidelity, DNA damage, DNA repair, CpG deamination
Genetic drift
Drift is a random change of allele frequencies
Drift depends on population size
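A minimal Wright-Fisher sketch can illustrate why drift depends on population size. The model choice (binomial resampling of 2N gene copies each generation, no selection or mutation) and all names and parameter values here are assumptions made for illustration, not taken from the slides.

```python
import numpy as np


def wright_fisher(n_diploid, p0, generations, n_reps, seed=0):
    """Neutral Wright-Fisher drift: return final allele frequencies of n_reps replicates."""
    rng = np.random.default_rng(seed)
    copies = 2 * n_diploid                      # number of gene copies in the population
    freq = np.full(n_reps, p0, dtype=float)
    for _ in range(generations):
        # Each generation is a binomial sample of the previous generation's alleles.
        freq = rng.binomial(copies, freq) / copies
    return freq


small = wright_fisher(n_diploid=50, p0=0.5, generations=20, n_reps=2000)
large = wright_fisher(n_diploid=5000, p0=0.5, generations=20, n_reps=2000)

print("variance of frequency, N=50:  ", small.var())
print("variance of frequency, N=5000:", large.var())
```

After the same number of generations, replicate frequencies in the small population are spread much more widely around the starting value than in the large one.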
Demographic history
Selection
Neutral, Deleterious, Advantageous
New mutation: Functional or Nonfunctional
Selection indicates functional mutations, whether or
not the tested trait is under selection
Selective effect of mutation
Most functional mutations are deleterious
Methods of mathematical population genetics
Dynamics of allelic substitution
[Figure: allele frequency (0 to 1) over time]
Mathematically, allele frequency change in a population
follows a one-dimensional random walk
Diffusion approximation
Random walk that does not jump long distances can be
approximated by a diffusion process
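$$
\frac{\partial f(x,p,t)}{\partial t} = -\frac{\partial \left[ M f(x,p,t) \right]}{\partial x} + \frac{1}{2}\,\frac{\partial^{2} \left[ V f(x,p,t) \right]}{\partial x^{2}}
$$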
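In the diffusion equation above, M and V are the mean and variance of the per-generation change in allele frequency. The slides do not give their form; as an assumption, the sketch below uses the standard Wright-Fisher values for genic selection, M = s x (1 - x) and V = x (1 - x) / (2N), and checks them against one simulated generation. Function and variable names are hypothetical.

```python
import numpy as np


def one_generation_moments(x, s, n_diploid, n_reps=200_000, seed=1):
    """Simulate one Wright-Fisher generation; return mean and variance of the frequency change."""
    rng = np.random.default_rng(seed)
    copies = 2 * n_diploid
    # Assumed selection model: carriers have relative fitness 1 + s (genic selection).
    expected = x * (1 + s) / (1 + s * x)
    new_freq = rng.binomial(copies, expected, size=n_reps) / copies
    delta = new_freq - x
    return delta.mean(), delta.var()


x, s, n_diploid = 0.3, 0.01, 500
mean_sim, var_sim = one_generation_moments(x, s, n_diploid)

M = s * x * (1 - x)                 # assumed drift coefficient (mean change)
V = x * (1 - x) / (2 * n_diploid)   # assumed diffusion coefficient (variance of change)

print(f"mean change: simulated {mean_sim:.5f}, M = {M:.5f}")
print(f"variance:    simulated {var_sim:.6f}, V = {V:.6f}")
```

The agreement is approximate: the diffusion coefficients drop terms of order s^2, which is reasonable only when selection is weak and the population is large.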
Coalescent theory
Instead of modeling a population, we can model our sample
Time goes backwards!
[Figure: time axis t]
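Modeling the sample backwards in time can be sketched with the standard neutral coalescent. The sketch assumes, beyond what the slide states, that while k lineages remain the waiting time to the next coalescence is exponential with rate k(k-1)/2, with time measured in units of 2N generations. Names and parameter values are hypothetical.

```python
import numpy as np


def simulate_tmrca(sample_size, n_reps, seed=2):
    """Time to the most recent common ancestor of a sample, in units of 2N generations."""
    rng = np.random.default_rng(seed)
    total = np.zeros(n_reps)
    # Going backwards in time, lineages merge pairwise: k -> k - 1 until one remains.
    for k in range(sample_size, 1, -1):
        rate = k * (k - 1) / 2                  # number of lineage pairs that can coalesce
        total += rng.exponential(1 / rate, size=n_reps)
    return total


sample_size = 10
tmrca = simulate_tmrca(sample_size, n_reps=50_000)
expected = 2 * (1 - 1 / sample_size)            # sum over k of 2 / (k (k - 1))

print(f"simulated mean TMRCA: {tmrca.mean():.3f}, expected: {expected:.3f}")
```

Only the sampled lineages are tracked, so the cost depends on the sample size and not on the population size.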
Natural selection in protein coding regions
Effect of new missense mutations
Computer simulations
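[Figure; axis: time]
$$
\frac{\partial f(x,p,t)}{\partial t} = -\frac{\partial \left[ M f(x,p,t) \right]}{\partial x} + \frac{1}{2}\,\frac{\partial^{2} \left[ V f(x,p,t) \right]}{\partial x^{2}}
$$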
Demographic history
Natural selection
• Can we find additional evidence in sequence data?
• Is there any information beyond frequency? Can we
tell alleles under selection from neutral alleles if they
are of the same frequency?
Maruyama effect (1974): at any given frequency, advantageous or
deleterious alleles are younger than neutral alleles
[Figure: allele trajectories rising from frequency 0% to frequency x over time]
At a given frequency, deleterious and advantageous alleles are
younger than neutral alleles
[Figure: two trajectories from frequency 0% to frequency x over time]
Longer trajectory: 6 jumps
Shorter trajectory: 4 jumps
Intuition: shorter trajectories require
fewer lucky jumps
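The Maruyama effect can be checked with a small forward simulation. This is only a sketch under stated assumptions: a Wright-Fisher population of 2N = 100 gene copies (kept small for speed), genic selection with relative fitness 1 + s, each allele starting as a single new copy, and age measured by recording the generation number every time a trajectory sits at the target count. None of these names or parameter values come from the slides.

```python
import numpy as np


def mean_age_at_count(n_diploid, s, target_count, n_reps=100_000, max_gen=5_000, seed=3):
    """Mean age (in generations) of alleles observed at target_count copies."""
    rng = np.random.default_rng(seed)
    copies = 2 * n_diploid
    counts = np.ones(n_reps, dtype=np.int64)    # every allele starts as one new copy
    age_sum, n_obs = 0.0, 0
    for gen in range(1, max_gen + 1):
        x = counts / copies
        p = x * (1 + s) / (1 + s * x)           # frequency after selection
        counts = rng.binomial(copies, p)        # drift: binomial resampling
        counts = counts[(counts > 0) & (counts < copies)]   # drop lost or fixed alleles
        if counts.size == 0:
            break
        hits = np.count_nonzero(counts == target_count)
        age_sum += gen * hits
        n_obs += hits
    return age_sum / n_obs


N = 50                                          # assumed diploid population size
neutral = mean_age_at_count(N, s=0.0, target_count=10)
deleterious = mean_age_at_count(N, s=-0.1, target_count=10)    # 2Ns = -10
advantageous = mean_age_at_count(N, s=0.1, target_count=10)    # 2Ns = +10

for label, age in [("neutral", neutral), ("deleterious", deleterious), ("advantageous", advantageous)]:
    print(f"{label:12s} mean age at 10% frequency: {age / (2 * N):.3f} (2N generations)")
```

Both selected classes should come out younger than the neutral class at the same frequency, in line with the intuition that a selected allele can only be caught at that frequency if it got there by a short trajectory.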
[Figure: allele frequency over time]
Neutral alleles: equal time at each frequency
Selected alleles: faster through higher frequencies
Idea: low accumulation of mutations at linked
sites indicates selection
Diffusion theory: deleterious alleles pass
fast through higher frequencies
[Figure 1, panel a: population frequency (%) against time (generations before present, in 2N units); alleles: neutral, deleterious. Panel b: mean age (2N generations) against selection coefficient 2Ns; population frequencies 7%, 5%, 3%. Panel c: mean sojourn time (2N generations) against intermediate allele frequency (%); selection coefficients (2Ns): 0 (neutral), −2 (weakly deleterious), −10 (deleterious).]
Figure 1. Simulation and theoretical results for allelic age and sojourn times. a. Example trajectories for a neutral and deleterious allele with current population frequencies 3% (indicated by an arrow). The shaded areas indicate sojourn times at frequencies above 5%. b. Mean ages for neutral and deleterious alleles at a given population frequency (lines show theoretical predictions, dots show simulation results with standard error bars). The graph shows that deleterious alleles at a given frequency are younger than neutral alleles, and that the effect is greater for more strongly selected alleles. c. Mean sojourn times for neutral and deleterious alleles. Vertical line denotes the current population frequency of the variant (3%). Mean sojourn times have been computed in bins of 1%. Line connects theoretical predictions for each frequency bin. Dots show simulation results. The graph illustrates that deleterious alleles spend much less time than neutral alleles at higher population frequencies in the past even if they have the same current frequency.
Neighborhood clock
(fuzzy clock)
Neighborhood clock is consistent
with Maruyama-effect expectations
Data: pilot Genome of Netherlands dataset