A General Framework for Modeling Effective Sizes and Gene Differentiation of Structured Populations
Ola Hossjer, Dept. of Mathematics, Stockholm University¹
Coalescence Workshop, Montreal, October 2013
¹ Joint work with Fredrik Olsson, Linda Laikre and Nils Ryman.
Conservation biology
- Keep biodiversity and prevent species extinction
- Protect habitat
  - temperature, supply of prey animals, food, water, soil, ...
- Protect genetic diversity in order to
  - Prevent inbreeding
  - Keep species viable
  - Improve environmental adjustment of species
Outline
- General class of structured populations with subpopulations.
- Census and effective local subpopulation sizes may
  - differ
  - vary with time (local bottlenecks, extinction, cyclic changes, ...)
- Use matrix analytic methods to
  - compute predictor $G^*_{ST}$ of subpopulation differentiation
  - quantify effective sizes $N_e$
- as functions of
  - time (to capture transience)
  - weighting schemes (uniform, size, reproduction, sampled populations, non-ghost populations)
- Unify various types of effective sizes
Structured population with s = 5 subpopulations
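[Figure: migration network of the five subpopulations, with local sizes N1 = 200, N2 = 400, N3 = 50, N4 = 400, N5 = 100. The numeric labels on the migration arrows are not recoverable from the extraction.]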
At generations t = 0 (present) and t = 1, 2, . . . (future)
vector of local census sizes = $(N_{ti},\ i = 1, \ldots, s)$ = (200, 400, 50, 400, 100)
vector of local effective sizes = $(N_{eti},\ i = 1, \ldots, s)$ = (200, 400, 50, 400, 100)
Forward and backward migration matrices
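Between generations $t-1$ and $t$:

$$M_t = (M_{tki}) = \begin{pmatrix}
0.94 & 0.025 & 0 & 0.01 & 0 \\
0.025 & 0.9825 & 0.0125 & 0 & 0 \\
0 & 0.04 & 0.82 & 0.06 & 0 \\
0.005 & 0 & 0.01 & 0.9875 & 0.0125 \\
0 & 0 & 0 & 0 & 0.95
\end{pmatrix},$$

$$B_t = (B_{tik}) = \begin{pmatrix}
0.94 & 0.05 & 0 & 0.01 & 0 \\
0.0125 & 0.9825 & 0.005 & 0 & 0 \\
0 & 0.1 & 0.82 & 0.08 & 0 \\
0.005 & 0 & 0.0075 & 0.9875 & 0 \\
0 & 0 & 0 & 0.05 & 0.95
\end{pmatrix},$$

$$B_{tik} = M_{tki} N_{t-1,k} / N_{ti}.$$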
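The relation $B_{tik} = M_{tki} N_{t-1,k}/N_{ti}$ can be checked numerically against the two matrices of the example. The following is a minimal NumPy sketch, assuming constant sizes between generations (so the previous and current size vectors coincide). The function name `backward_migration` is illustrative and not taken from the talk.

```python
import numpy as np

# Local census sizes and forward migration matrix M[k, i] from the example.
N = np.array([200.0, 400.0, 50.0, 400.0, 100.0])
M = np.array([
    [0.94,  0.025,  0.0,    0.01,   0.0],
    [0.025, 0.9825, 0.0125, 0.0,    0.0],
    [0.0,   0.04,   0.82,   0.06,   0.0],
    [0.005, 0.0,    0.01,   0.9875, 0.0125],
    [0.0,   0.0,    0.0,    0.0,    0.95],
])

def backward_migration(M, N_prev, N_curr):
    """Return B with B[i, k] = M[k, i] * N_prev[k] / N_curr[i]."""
    return M.T * N_prev[np.newaxis, :] / N_curr[:, np.newaxis]

B = backward_migration(M, N, N)
print(np.round(B, 4))
print(B.sum(axis=1))  # each row of B sums to 1 (up to rounding)
```

The rows of `B` sum to one because each gene in postmigration pool $i$ has exactly one source pool; the rows of `M` need not sum to one.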
Reproduction cycle, generations t − 1→ t
Breeding: $2N_{e,t-1,i}$ genes of $N_{e,t-1,i} \le N_{t-1,i}$ breeders contribute to premigration gamete pool $i$
Migration: Postmigration pool $i$ receives fraction $B_{tik}$ from premigration pool $k$
Fertilization: Draw $2N_{ti}$ genes from postmigration pool $i$
Predicted gene diversities
Define in generation $t$

$a \in \{1, \ldots, n_t\}$ = all alleles,
$P_{tia}$ = frequency of allele $a$ in subpopulation $i$,
$H_{tij} = \sum_{a \neq b} P_{tia} P_{tjb}$ = gene diversity² between subpopulations $i$ and $j$,
$h_{tij} = E_0(H_{tij})$ = predicted gene diversity given present ($t = 0$),
$h_t = \mathrm{vec}(h_{tij},\ 1 \le i, j \le s)$ = $s^2 \times 1$ column vector of predicted gene diversities.

² See Nei (1973).
Recursion for predicted gene diversities
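Let $\mu$ = mutation probability. Then

$$h_t = (1-\mu)^2 A_t h_{t-1} + \left(1 - (1-\mu)^2\right)(\mathbf{1} - \delta_t),$$

where $A_t = (A_{tij,kl})$ is an $s^2 \times s^2$ matrix with elements

$$A_{tij,kl} = \left(1 - \frac{1}{2N_{ti}}\right)^{1_{\{i=j\}}} B_{tik} B_{tjl} \left(\frac{1 - \frac{1}{2N_{e,t-1,k}}}{1 - \frac{1}{2N_{t-1,k}}}\right)^{1_{\{k=l\}}},$$

and $\mathbf{1}$ and $\delta_t = (\delta_{tij})$ are $s^2 \times 1$ column vectors with elements 1 and

$$\delta_{tij} = 1_{\{i=j\}} / (2N_{ti})$$

respectively.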
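A minimal NumPy sketch of this recursion follows. It assumes that the pair $(i, j)$ is stored at flat index $i \cdot s + j$, so that the Kronecker product `np.kron(B, B)` supplies the factor $B_{tik} B_{tjl}$. The function names `build_A` and `step`, and the value of `mu`, are illustrative choices and not part of the talk.

```python
import numpy as np

def build_A(B, N_curr, N_prev, Ne_prev):
    """s^2 x s^2 matrix A_t; pair (i, j) is stored at flat index i*s + j."""
    s = B.shape[0]
    left = np.ones((s, s))
    np.fill_diagonal(left, 1.0 - 1.0 / (2.0 * N_curr))       # factor for i == j
    right = np.ones((s, s))
    np.fill_diagonal(right, (1.0 - 1.0 / (2.0 * Ne_prev))
                            / (1.0 - 1.0 / (2.0 * N_prev)))   # factor for k == l
    # np.kron(B, B)[i*s + j, k*s + l] == B[i, k] * B[j, l]
    return left.ravel()[:, None] * np.kron(B, B) * right.ravel()[None, :]

def step(h_prev, A, N_curr, mu):
    """h_t = (1-mu)^2 A_t h_{t-1} + (1 - (1-mu)^2)(1 - delta_t)."""
    delta = np.diag(1.0 / (2.0 * N_curr)).ravel()
    keep = (1.0 - mu) ** 2
    return keep * (A @ h_prev) + (1.0 - keep) * (1.0 - delta)

N = np.array([200.0, 400.0, 50.0, 400.0, 100.0])
B = np.array([
    [0.94,   0.05,   0.0,    0.01,   0.0],
    [0.0125, 0.9825, 0.005,  0.0,    0.0],
    [0.0,    0.1,    0.82,   0.08,   0.0],
    [0.005,  0.0,    0.0075, 0.9875, 0.0],
    [0.0,    0.0,    0.0,    0.05,   0.95],
])

A = build_A(B, N, N, N)            # effective sizes equal census sizes here
delta0 = np.diag(1.0 / (2.0 * N)).ravel()
h0 = 1.0 - delta0                  # assumption: all alleles different at t = 0
h1 = step(h0, A, N, mu=1e-5)       # mu is an arbitrary illustrative value
print(h1.reshape(5, 5).round(5))
```

With $\mu = 1$ the recursion returns $\mathbf{1} - \delta_t$, which is a quick way to check the implementation.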
Subpopulation weighting and sampling
Let $w_i$ = non-negative weight of subpopulation $i$, with $\sum_i w_i = 1$.

Sample two genes randomly from the population according to:

Scheme T: 1. Draw pair $i, j$ of subpop. with probabilities $w_i w_j$
          2. Draw one gene from each of $i$ and $j$
Scheme S: 1. Draw a subpopulation $i$ with probabilities $w_i$
          2. Draw two genes from $i$

Genes from same subpopulation can be drawn
- with replacement
- without replacement
Possible weighting schemes
Scheme               $w_i$
Uniform              $1/s$
Size proportional    $N_{ti} / \sum_j N_{tj}$
Reproductive³        $\gamma_i$

Truncated scheme

$$w_i \leftarrow w_i 1_{\{i \in S\}} \Big/ \sum_{j \in S} w_j,$$

where $S$ = set of sampled/non-ghost subpopulations.

³ $\gamma = (\gamma_i)$ is the asymptotic backward distribution (if $B_t = B$, $\gamma = \gamma B$; Nagylaki, 1980).
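The three weighting schemes and the truncation step can be sketched in NumPy as follows. The reproductive weights are computed here as the left eigenvector of a constant backward matrix $B$ for the eigenvalue 1, which is one possible way to obtain $\gamma = \gamma B$. All function names are illustrative, and subpopulation indices are 0-based in the code.

```python
import numpy as np

def uniform_weights(s):
    return np.full(s, 1.0 / s)

def size_weights(N):
    N = np.asarray(N, dtype=float)
    return N / N.sum()

def reproductive_weights(B):
    """Stationary distribution gamma of the backward matrix: gamma = gamma B."""
    vals, vecs = np.linalg.eig(B.T)        # right eigenvectors of B' = left of B
    k = np.argmin(np.abs(vals - 1.0))      # eigenvalue closest to 1
    gamma = np.real(vecs[:, k])
    return gamma / gamma.sum()

def truncate(w, S):
    """Keep only the subpopulations in S (0-based indices) and renormalize."""
    mask = np.zeros(len(w))
    mask[list(S)] = 1.0
    return w * mask / (w * mask).sum()

N = np.array([200.0, 400.0, 50.0, 400.0, 100.0])
B = np.array([
    [0.94,   0.05,   0.0,    0.01,   0.0],
    [0.0125, 0.9825, 0.005,  0.0,    0.0],
    [0.0,    0.1,    0.82,   0.08,   0.0],
    [0.005,  0.0,    0.0075, 0.9875, 0.0],
    [0.0,    0.0,    0.0,    0.05,   0.95],
])

gamma = reproductive_weights(B)
print(uniform_weights(5), size_weights(N), gamma.round(4))
print(truncate(uniform_weights(5), {0, 1}))
```

In this example subpopulation 5 receives reproductive weight zero, because no other subpopulation draws genes from it (the fifth column of $B$ is zero off the diagonal).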
Coefficient of gene differentiation
Gene diversity of total population and within subpopulations:

$$H_{Tt} = \sum_{i,j=1}^{s} w_i w_j H_{tij} = W_T H_t,$$

$$H_{St} = \sum_{i=1}^{s} w_i H_{tii} = W_S H_t,$$

where

$H_t = \mathrm{vec}(H_{tij};\ 1 \le i, j \le s)$,
$W_T = \mathrm{vec}(w_i w_j;\ 1 \le i, j \le s)'$,
$W_S = \mathrm{vec}(w_i 1_{\{i=j\}};\ 1 \le i, j \le s)'$.

The coefficient of gene differentiation⁴ of generation $t$:

$$G_{ST,t} = \frac{H_{Tt} - H_{St}}{H_{Tt}} = \frac{W_T H_t - W_S H_t}{W_T H_t}.$$

Predicted coefficient of gene differentiation:

$$G^*_{ST,t} = \frac{E_0(H_{Tt}) - E_0(H_{St})}{E_0(H_{Tt})} = \frac{h_{Tt} - h_{St}}{h_{Tt}} = \frac{W_T h_t - W_S h_t}{W_T h_t}.$$

⁴ Nei (1973, 1977).
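As a small worked example of these definitions, the sketch below computes $G_{ST}$ from a matrix of allele frequencies. It uses the identity $\sum_{a \neq b} P_{ia} P_{jb} = 1 - \sum_a P_{ia} P_{ja}$, which holds when each row of frequencies sums to one. The function names and the two toy frequency tables are illustrative assumptions.

```python
import numpy as np

def gene_diversity(P):
    """H[i, j] = sum_{a != b} P[i, a] P[j, b] = 1 - sum_a P[i, a] P[j, a].

    P has one row per subpopulation; each row must sum to 1.
    """
    P = np.asarray(P, dtype=float)
    return 1.0 - P @ P.T

def gst(w, H):
    """Coefficient of gene differentiation (H_T - H_S) / H_T."""
    w = np.asarray(w, dtype=float)
    W_T = np.outer(w, w).ravel()       # vec(w_i w_j)'
    W_S = np.diag(w).ravel()           # vec(w_i 1{i=j})'
    H_vec = np.asarray(H, dtype=float).ravel()
    H_T = W_T @ H_vec
    H_S = W_S @ H_vec
    return (H_T - H_S) / H_T

w = [0.5, 0.5]
fixed = gene_diversity([[1.0, 0.0], [0.0, 1.0]])   # fixed for different alleles
same = gene_diversity([[0.5, 0.5], [0.5, 0.5]])    # identical frequencies
print(gst(w, fixed), gst(w, same))
```

The two toy cases are the extremes: complete differentiation gives $G_{ST} = 1$ and identical allele frequencies give $G_{ST} = 0$.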
Predicted gene diversity & subpop. differentiation ($\mu = 0$)

$$t \to h_{Tt} \overset{\mu=0}{=} W_T A_t \cdots A_1 h_0$$

$$t \to h_{St} \overset{\mu=0}{=} W_S A_t \cdots A_1 h_0$$

$$t \to G^*_{ST,t}$$

[Figure, left panel: $h_{Tt}$ and $h_{St}$ against time (0 to 100 generations) for uniform, size proportional and reproductive weights. Right panel: $G^*_{ST,t}$ against time for the same three weighting schemes.]

All alleles different at $t = 0$.
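The time profiles can be sketched by iterating $h_t = A h_{t-1}$ for the five-subpopulation example. This is a minimal sketch under stated assumptions: constant sizes with $N_e = N$ (so the ratio factor in $A$ equals 1), $\mu = 0$, all alleles different at $t = 0$ with genes drawn with replacement, and uniform weights. The function name `trajectories` is illustrative.

```python
import numpy as np

N = np.array([200.0, 400.0, 50.0, 400.0, 100.0])
B = np.array([
    [0.94,   0.05,   0.0,    0.01,   0.0],
    [0.0125, 0.9825, 0.005,  0.0,    0.0],
    [0.0,    0.1,    0.82,   0.08,   0.0],
    [0.005,  0.0,    0.0075, 0.9875, 0.0],
    [0.0,    0.0,    0.0,    0.05,   0.95],
])
s = len(N)

# Left factor of A: 1 - 1/(2 N_i) for pairs with i == j, else 1.
left = np.ones((s, s))
np.fill_diagonal(left, 1.0 - 1.0 / (2.0 * N))
left_vec = left.ravel()

A = left_vec[:, None] * np.kron(B, B)   # constant A, with Ne = N assumed
h0 = left_vec.copy()                    # h0[i, j] = 1 - 1{i=j}/(2 N_i)

def trajectories(w, T=100):
    """Return arrays h_Tt, h_St and G*_ST,t for t = 0, ..., T."""
    W_T = np.outer(w, w).ravel()
    W_S = np.diag(w).ravel()
    h = h0.copy()
    hT, hS = [W_T @ h], [W_S @ h]
    for _ in range(T):
        h = A @ h
        hT.append(W_T @ h)
        hS.append(W_S @ h)
    hT, hS = np.array(hT), np.array(hS)
    return hT, hS, (hT - hS) / hT

hT, hS, G = trajectories(np.full(s, 1.0 / s))
print(hT[[0, 50, 100]], G[[0, 50, 100]])
```

Other weighting schemes only change the vector `w` passed to `trajectories`.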
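Inbreeding effective size $N_{eI}([0,t])$ with replacement
Size of Wright-Fisher population with the same predicted decline of gene diversity between 0 and $t$:

$$\frac{h_{Tt}}{h_{T0}} = \left(1 - \frac{1}{2N_{eI}([0,t])}\right)^t.$$

On the other hand,

$$\frac{h_{Tt}}{h_{T0}} \overset{\mu=0}{=} \frac{W_T A_t \cdots A_1 h_0}{W_T h_0}.$$

This gives

$$N_{eI}([0,t]) = \begin{cases} \dfrac{1}{2\left(1 - \left(\dfrac{W_T A_t \cdots A_1 h_0}{W_T h_0}\right)^{1/t}\right)}, & h_{Tt} < h_{T0}, \\ \text{not defined}, & h_{Tt} \ge h_{T0}. \end{cases}$$

Variance effective size⁵ $N_{eV}([0,t])$ equivalent if:
- No subpopulation differentiation at $t = 0$ $\Longrightarrow$ $h_0 = h_0 \mathbf{1}$ (vector on the left, scalar on the right)

⁵ Crow (1954) for time interval [0, 1].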
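The formula can be sketched as a short function that takes the matrices $A_1, \ldots, A_t$, the weight vector $W_T$ and $h_0$. As a sanity check, a single Wright-Fisher population of size $N$ (one subpopulation, $B = (1)$) should return $N$. The function name `ne_gene_diversity` and the check values are illustrative assumptions.

```python
import numpy as np

def ne_gene_diversity(A_list, W_T, h0):
    """N_eI([0, t]) with replacement, for mu = 0; A_list = [A_1, ..., A_t]."""
    h0 = np.asarray(h0, dtype=float)
    h = h0.copy()
    for A in A_list:                 # apply A_1 first, A_t last
        h = A @ h
    ratio = (W_T @ h) / (W_T @ h0)   # h_Tt / h_T0
    if ratio >= 1.0:
        return float("nan")          # not defined when h_Tt >= h_T0
    t = len(A_list)
    return 1.0 / (2.0 * (1.0 - ratio ** (1.0 / t)))

# Sanity check: one Wright-Fisher population of size 100 with Ne = N.
N = 100.0
A = np.array([[1.0 - 1.0 / (2.0 * N)]])
W_T = np.array([1.0])
h0 = np.array([0.5])                 # two alleles at frequencies 0.5:0.5
ne = ne_gene_diversity([A] * 10, W_T, h0)
print(ne)
```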
Inbreeding effective size⁶ $N_{eI}([0,t])$
Draw two different genes in future (generation $t$) with scheme T.

$\tau$ = coalescence time = $t$ − time to MRCA,
$h_{tij}$ = non-coalescence probability = $P(\tau > t \mid \text{subpop. } i, j \text{ drawn})$.

Then $\tau$ has distribution

$$P(\tau > t) = \sum_{i,j} w_i w_j h_{tij} = W_T h_t = W_T D_t \cdots D_1 \mathbf{1},$$

where $D_t = (D_{tij,kl})$ is an $s^2 \times s^2$ matrix with elements

$$D_{tij,kl} = B_{tik} B_{tjl} \left(1 - \frac{1}{2N_{e,t-1,k}}\right)^{1_{\{k=l\}}}.$$

A Wright-Fisher model of size $N_{eI}([0,t])$ has

$$P(\tau > t) = \left(1 - \frac{1}{2N_{eI}([0,t])}\right)^t.$$

This gives

$$N_{eI}([0,t]) = \frac{1}{2\left(1 - (W_T D_t \cdots D_1 \mathbf{1})^{1/t}\right)}.$$

⁶ Wright (1931, 1938) for time interval [−1, 0].
Eigenvalue effective size⁷ $N_{eE}$
If subpopulation sizes and migration matrices are constant, then $A_t = A$ and $D_t = D$ have the same spectra. In particular, the largest eigenvalues

$$\lambda = \lambda_{\max}(A) = \lambda_{\max}(D)$$

are the same. Predicted gene diversity:

$$h_{Tt} = C_1 \lambda^t (1 + o(1)) \quad (t \to \infty),$$

Coalescence time:

$$P(\tau > t) = C_2 \lambda^t (1 + o(1)) \quad (t \to \infty).$$

A Wright-Fisher population of size $N_{eE}$ has

$$\lambda = 1 - \frac{1}{2N_{eE}}.$$

Hence

$$N_{eE} = \frac{1}{2(1-\lambda)}.$$

⁷ Crow (1954), Ewens (1982).
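The equality of the largest eigenvalues can be checked numerically for the five-subpopulation example. The sketch assumes constant sizes with $N_e = N$; in that case $A = L (B \otimes B)$ and $D = (B \otimes B) L$ for the same diagonal matrix $L$, so the two matrices are similar. Variable names are illustrative.

```python
import numpy as np

N = np.array([200.0, 400.0, 50.0, 400.0, 100.0])
B = np.array([
    [0.94,   0.05,   0.0,    0.01,   0.0],
    [0.0125, 0.9825, 0.005,  0.0,    0.0],
    [0.0,    0.1,    0.82,   0.08,   0.0],
    [0.005,  0.0,    0.0075, 0.9875, 0.0],
    [0.0,    0.0,    0.0,    0.05,   0.95],
])
s = len(N)

# Diagonal factor: 1 - 1/(2 N_i) for pairs (i, i), 1 for pairs with i != j.
diag = np.ones((s, s))
np.fill_diagonal(diag, 1.0 - 1.0 / (2.0 * N))
L = diag.ravel()

BB = np.kron(B, B)
A = L[:, None] * BB      # gene diversity recursion matrix (Ne = N assumed)
D = BB * L[None, :]      # non-coalescence recursion matrix (Ne = N assumed)

# The largest eigenvalue of a non-negative matrix is real; take real parts.
lam_A = np.max(np.real(np.linalg.eigvals(A)))
lam_D = np.max(np.real(np.linalg.eigvals(D)))

NeE = 1.0 / (2.0 * (1.0 - lam_A))
print(lam_A, lam_D, NeE)
```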
Effect of weighting scheme
[Figure: effective size against time (0 to 100 generations) for uniform, size proportional and reproductive weights, together with the asymptotic limit.]
Constant subpopulation sizes
Dashed: $N_{eI}([0,t])$ without replacement
Solid: $N_{eI}([0,t])$ with replacement (two 0.5:0.5 alleles at $t = 0$)
Horizontal: $N_{eE}$
Local functional effective sizes
[Figure, three panels (constant subpop. sizes; local bottleneck in 1; blocked migration 1-2): effective size against time (0 to 100 generations) for subpopulations 1 to 5 individually and for the whole population (subpop. 1,2,3,4,5), together with the asymptotic limit.]
Weights, whole population: $w_j = 1/s$
Weight, subpopulation $i$: $w_j = 1_{\{j=i\}}$
Dashed: $N_{eI}([0,t])$ without replacement
Solid: $N_{eI}([0,t])$ with replacement (two 0.5:0.5 alleles at $t = 0$)
Horizontal: $N_{eE}$
Functional effective size, groups of subpopulations
[Figure, two panels (uniform weights; size proportional weights): effective size against time (0 to 100 generations) for the nested groups subpop. 1; 1,2; 1,2,3; 1,2,3,4; 1,2,3,4,5, together with the asymptotic limit.]
Nested sets of populations: $S = \{1, \ldots, i\}$, $i = 1, \ldots, 5$
Dashed: $N_{eI}([0,t])$ without replacement
Solid: $N_{eI}([0,t])$ with replacement (two 0.5:0.5 alleles at $t = 0$)
Horizontal: $N_{eE}$
Effect of migration rate
[Figure, left panel: long term effective size $N_{eE}$ against migration rate (0 to 0.03). Right panel: long term subpopulation differentiation $G^*_{ST,\infty}$ against migration rate (0 to 0.03).]
Time independent sizes and migration rates: $N_{tk} = N_k$, $M_{tki} = M_{ki}$
Horizontal axis: Overall migration rate

$$m = \sum_{k=1}^{s} (N_k/N) \sum_{i \neq k} M_{ki}$$
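The overall migration rate is a size-weighted average of the off-diagonal row sums of the forward matrix. A minimal sketch, evaluated on the five-subpopulation example (the function name `overall_migration_rate` is illustrative):

```python
import numpy as np

def overall_migration_rate(M, N):
    """m = sum_k (N_k / N) * sum_{i != k} M[k, i]."""
    M = np.asarray(M, dtype=float)
    N = np.asarray(N, dtype=float)
    off_diag = M.sum(axis=1) - np.diag(M)    # sum over i != k of M[k, i]
    return (N / N.sum()) @ off_diag

N = np.array([200.0, 400.0, 50.0, 400.0, 100.0])
M = np.array([
    [0.94,  0.025,  0.0,    0.01,   0.0],
    [0.025, 0.9825, 0.0125, 0.0,    0.0],
    [0.0,   0.04,   0.82,   0.06,   0.0],
    [0.005, 0.0,    0.01,   0.9875, 0.0125],
    [0.0,   0.0,    0.0,    0.0,    0.95],
])
m = overall_migration_rate(M, N)
print(m)
```

For this matrix the expected numbers of emigrants per generation are 7, 15, 5, 11 and 0 for subpopulations 1 to 5, so $m = 38/1150 \approx 0.033$.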
Other effective sizes
Nucleotide diversity effective size⁸

$$N_{e\pi}((-\infty, t]) = \frac{E(\tau)}{2} = \frac{1}{2} \sum_{r=0}^{\infty} P(\tau > r) = \frac{1}{2}\left(1 + \sum_{r=1}^{\infty} W_T D_t \cdots D_{t-r+1} \mathbf{1}\right).$$

Coalescence effective size⁹ $N_{eC}$ satisfies

$$N = \sum_{i=1}^{s} N_i \to \infty \Longrightarrow \lambda \to 1 \Longrightarrow N_{eE} = \frac{1}{2(1-\lambda)} = N_{eC}(1 + o(1))$$

for constant subpopulation sizes ($N_{ti} = N_i$).

⁸ Ewens (1989), Slatkin (1991), Durrett (2008).
⁹ Kingman (1982), Wakeley (1999), Nordborg and Krone (2002), Sjodin et al. (2005).
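When the matrices are constant ($D_t = D$) and the largest eigenvalue of $D$ is below 1, the series for $E(\tau)$ is geometric and sums to $W_T (I - D)^{-1} \mathbf{1}$. The sketch below uses this closed form, which is an assumption of the example and not a formula from the talk. A single Wright-Fisher population of size $N$ should give $N_{e\pi} = N$. Function names are illustrative.

```python
import numpy as np

def build_D(B, Ne):
    """D[i*s + j, k*s + l] = B[i, k] B[j, l] (1 - 1/(2 Ne_k))^{1{k=l}}."""
    s = B.shape[0]
    right = np.ones((s, s))
    np.fill_diagonal(right, 1.0 - 1.0 / (2.0 * Ne))
    return np.kron(B, B) * right.ravel()[None, :]

def ne_nucleotide(D, w):
    """N_e_pi = E(tau)/2 with E(tau) = sum_{r>=0} W_T D^r 1 = W_T (I - D)^{-1} 1.

    Assumes constant D with spectral radius below 1.
    """
    W_T = np.outer(w, w).ravel()
    n = D.shape[0]
    x = np.linalg.solve(np.eye(n) - D, np.ones(n))   # (I - D)^{-1} 1
    return 0.5 * (W_T @ x)

# Sanity check: single Wright-Fisher population of size 100.
ne_single = ne_nucleotide(build_D(np.array([[1.0]]), np.array([100.0])),
                          np.array([1.0]))

# Five-subpopulation example with Ne = N and uniform weights.
N = np.array([200.0, 400.0, 50.0, 400.0, 100.0])
B = np.array([
    [0.94,   0.05,   0.0,    0.01,   0.0],
    [0.0125, 0.9825, 0.005,  0.0,    0.0],
    [0.0,    0.1,    0.82,   0.08,   0.0],
    [0.005,  0.0,    0.0075, 0.9875, 0.0],
    [0.0,    0.0,    0.0,    0.05,   0.95],
])
ne_example = ne_nucleotide(build_D(B, N), np.full(5, 0.2))
print(ne_single, ne_example)
```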
Conclusions
- General matrix analytic framework of structured populations:
  - Genetic drift, migration and mutations
  - Subpopulation differentiation
  - Various weighting schemes
  - Local and global effects
- Focus on time profiles of $N_e$ and predicted $G_{ST}$
- Unified framework for various types of $N_e$
- A user friendly software GESP under implementation
THANKS!