
A General Framework for Modeling Effective Sizes and Gene Differentiation of Structured Populations

Ola Hössjer, Dept. of Mathematics, Stockholm University¹

Coalescence Workshop, Montreal, October 2013

¹ Joint work with Fredrik Olsson, Linda Laikre and Nils Ryman

Conservation biology

- Keep biodiversity and prevent species extinction
- Protect habitat
  - temperature, supply of prey animals, food, water, soil, ...
- Protect genetic diversity in order to
  - Prevent inbreeding
  - Keep species viable
  - Improve environmental adjustment of species

Outline

- General class of structured populations with subpopulations.
- Census and effective local subpopulation sizes may
  - differ
  - vary with time (local bottlenecks, extinction, cyclic changes, ...)
- Use matrix analytic methods to
  - compute the predictor $G^*_{ST}$ of subpopulation differentiation
  - quantify effective sizes $N_e$
- as functions of
  - time (to capture transience)
  - weighting schemes (uniform, size, reproduction, sampled populations, non-ghost populations)
- Unify various types of effective sizes

Structured population with s = 5 subpopulations

[Figure: graph of the five subpopulations with census sizes $N_1 = 200$, $N_2 = 400$, $N_3 = 50$, $N_4 = 400$, $N_5 = 100$, with numeric labels on the migration edges.]

At generations $t = 0$ (present) and $t = 1, 2, \ldots$ (future):

vector of local census sizes = $(N_{ti},\ i = 1, \ldots, s) = (200, 400, 50, 400, 100)$

vector of local effective sizes = $(N_{eti},\ i = 1, \ldots, s) = (200, 400, 50, 400, 100)$

Forward and backward migration matrices


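Between generations $t-1$ and $t$:

\[
M_t = (M_{tki}) =
\begin{pmatrix}
0.94 & 0.025 & 0 & 0.01 & 0 \\
0.025 & 0.9825 & 0.0125 & 0 & 0 \\
0 & 0.04 & 0.82 & 0.06 & 0 \\
0.005 & 0 & 0.01 & 0.9875 & 0.0125 \\
0 & 0 & 0 & 0 & 0.95
\end{pmatrix},
\]

\[
B_t = (B_{tik}) =
\begin{pmatrix}
0.94 & 0.05 & 0 & 0.01 & 0 \\
0.0125 & 0.9825 & 0.005 & 0 & 0 \\
0 & 0.1 & 0.82 & 0.08 & 0 \\
0.005 & 0 & 0.0075 & 0.9875 & 0 \\
0 & 0 & 0 & 0.05 & 0.95
\end{pmatrix},
\]

\[
B_{tik} = M_{tki} N_{t-1,k} / N_{ti}.
\]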
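The relation between the forward and backward matrices can be checked numerically against the values on this slide. The sketch below assumes constant census sizes (so $N_{t-1,k} = N_k$) and that the first subscript of each printed matrix indexes its rows; the function name `backward_matrix` is illustrative and not taken from the talk.

```python
# Census sizes and forward migration matrix M[k][i] as printed on the slide.
N = [200, 400, 50, 400, 100]
M = [
    [0.94,  0.025,  0.0,    0.01,   0.0],
    [0.025, 0.9825, 0.0125, 0.0,    0.0],
    [0.0,   0.04,   0.82,   0.06,   0.0],
    [0.005, 0.0,    0.01,   0.9875, 0.0125],
    [0.0,   0.0,    0.0,    0.0,    0.95],
]

def backward_matrix(M, N_prev, N_now):
    """Return B with B[i][k] = M[k][i] * N_prev[k] / N_now[i]."""
    s = len(M)
    return [[M[k][i] * N_prev[k] / N_now[i] for k in range(s)] for i in range(s)]

# Constant sizes assumed: the same vector is used for generations t-1 and t.
B = backward_matrix(M, N, N)
for row in B:
    print([round(x, 4) for x in row])
```

With these inputs every row of `B` sums to one and the result agrees with the backward matrix $B_t$ printed on the slide.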

Reproduction cycle, generations $t-1 \to t$

Breeding: $2N_{e,t-1,i}$ genes of $N_{e,t-1,i} \le N_{t-1,i}$ breeders contribute to premigration gamete pool $i$

Migration: Postmigration pool $i$ receives fraction $B_{tik}$ from premigration pool $k$

Fertilization: Draw $2N_{ti}$ genes from postmigration pool $i$

Predicted gene diversities

Define in generation $t$:

$a \in \{1, \ldots, n_t\}$ = all alleles,

$P_{tia}$ = frequency of allele $a$ in subpopulation $i$,

$H_{tij} = \sum_{a \neq b} P_{tia} P_{tjb}$ = gene diversity² between subpopulations $i$ and $j$,

$h_{tij} = E_0(H_{tij})$ = predicted gene diversity given the present ($t = 0$),

$h_t = \mathrm{vec}(h_{tij},\ 1 \le i, j \le s)$ = $s^2 \times 1$ column vector of predicted gene diversities.

² See Nei (1973).
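As a small worked illustration of the definition of $H_{tij}$, the sketch below evaluates the double sum over allele pairs $a \neq b$ directly. The allele frequencies are hypothetical and the function name `gene_diversity` is not from the talk.

```python
def gene_diversity(p_i, p_j):
    """H_ij = sum over allele pairs a != b of p_i[a] * p_j[b]."""
    return sum(
        pa * pb
        for a, pa in enumerate(p_i)
        for b, pb in enumerate(p_j)
        if a != b
    )

# Hypothetical frequencies of three alleles in two subpopulations.
p1 = [0.5, 0.5, 0.0]
p2 = [0.0, 0.5, 0.5]
print(gene_diversity(p1, p1))  # diversity within subpopulation 1
print(gene_diversity(p1, p2))  # diversity between subpopulations 1 and 2
```

When the frequencies in each subpopulation sum to one, the same quantity equals $1 - \sum_a P_{tia} P_{tja}$.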

Recursion for predicted gene diversities

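Let $\mu$ = mutation probability. Then

\[
h_t = (1-\mu)^2 A_t h_{t-1} + \bigl(1 - (1-\mu)^2\bigr)(\mathbf{1} - \delta_t),
\]

where $A_t = (A_{tij,kl})$ is an $s^2 \times s^2$ matrix with elements

\[
A_{tij,kl} = \left(1 - \frac{1}{2N_{ti}}\right)^{1\{i=j\}} B_{tik} B_{tjl}
\left(\frac{1 - \frac{1}{2N_{e,t-1,k}}}{1 - \frac{1}{2N_{t-1,k}}}\right)^{1\{k=l\}},
\]

and $\mathbf{1}$ and $\delta_t = (\delta_{tij})$ are $s^2 \times 1$ column vectors with elements 1 and

\[
\delta_{tij} = 1\{i=j\}/(2N_{ti})
\]

respectively.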
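One step of this recursion can be sketched in plain Python. The block assumes that $\mathrm{vec}$ stacks the pairs $(i, j)$ in row-major order (index `i * s + j`), which the slides do not specify; the function names `build_A` and `step` and the two-subpopulation inputs are hypothetical.

```python
def build_A(B, N_now, N_prev, Ne_prev):
    """Build the s^2 x s^2 matrix A_t, pairs (i, j) stored at index i * s + j."""
    s = len(B)
    A = [[0.0] * (s * s) for _ in range(s * s)]
    for i in range(s):
        for j in range(s):
            # Factor (1 - 1/(2 N_ti)) applies only when i == j.
            same_ij = 1 - 1 / (2 * N_now[i]) if i == j else 1.0
            for k in range(s):
                for l in range(s):
                    ratio = 1.0
                    if k == l:
                        ratio = (1 - 1 / (2 * Ne_prev[k])) / (1 - 1 / (2 * N_prev[k]))
                    A[i * s + j][k * s + l] = same_ij * B[i][k] * B[j][l] * ratio
    return A

def step(h_prev, A, N_now, mu=0.0):
    """Return h_t = (1-mu)^2 A h_{t-1} + (1 - (1-mu)^2) (1 - delta_t)."""
    s = len(N_now)
    keep = (1 - mu) ** 2
    h = []
    for i in range(s):
        for j in range(s):
            drift = sum(a * x for a, x in zip(A[i * s + j], h_prev))
            delta = 1 / (2 * N_now[i]) if i == j else 0.0
            h.append(keep * drift + (1 - keep) * (1 - delta))
    return h

# Hypothetical example: two subpopulations, constant sizes, symmetric migration.
B = [[0.95, 0.05], [0.05, 0.95]]
N = [50, 50]
Ne = [40, 40]
A = build_A(B, N, N, Ne)
h = [0.5, 0.5, 0.5, 0.5]  # no initial differentiation
for _ in range(10):
    h = step(h, A, N, mu=0.0)
print(h)
```

For a single isolated population with $N_e = N$ and $\mu = 0$, the step reduces to the familiar $h_t = (1 - 1/(2N)) h_{t-1}$.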

Subpopulation weighting and sampling

Let $w_i$ = non-negative weight of subpopulation $i$, with $\sum_i w_i = 1$.

Sample two genes randomly from the population according to:

Scheme T:
  1. Draw a pair $i, j$ of subpopulations with probabilities $w_i w_j$
  2. Draw one gene from each of $i$ and $j$

Scheme S:
  1. Draw a subpopulation $i$ with probability $w_i$
  2. Draw two genes from $i$

Genes from the same subpopulation can be drawn
- with replacement
- without replacement

Possible weighting schemes

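Scheme               $w_i$
Uniform              $1/s$
Size proportional    $N_{ti} / \sum_j N_{tj}$
Reproductive³        $\gamma_i$

Truncated scheme:

\[
w_i \leftarrow w_i 1\{i \in S\} \Big/ \sum_{j \in S} w_j,
\]

where $S$ = set of sampled/non-ghost subpopulations.

³ $\gamma = (\gamma_i)$ is the asymptotic backward distribution (if $B_t = B$, then $\gamma = \gamma B$; Nagylaki, 1980).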
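The weighting schemes listed above can be sketched as follows. Computing $\gamma$ by repeatedly applying $\gamma \leftarrow \gamma B$ is one simple choice that assumes a constant backward matrix whose stationary distribution is unique; it is not necessarily how the authors compute it. All function names and the $2 \times 2$ matrix are hypothetical.

```python
def uniform_weights(s):
    return [1.0 / s] * s

def size_proportional_weights(N):
    total = sum(N)
    return [n / total for n in N]

def reproductive_weights(B, iterations=10000):
    """Approximate gamma with gamma = gamma B by repeated left-multiplication."""
    s = len(B)
    gamma = [1.0 / s] * s
    for _ in range(iterations):
        gamma = [sum(gamma[i] * B[i][k] for i in range(s)) for k in range(s)]
    return gamma

def truncate(w, sampled):
    """Restrict weights to the index set `sampled` and renormalise."""
    total = sum(w[j] for j in sampled)
    return [w[i] / total if i in sampled else 0.0 for i in range(len(w))]

# Hypothetical backward migration matrix for two subpopulations.
B = [[0.9, 0.1], [0.2, 0.8]]
gamma = reproductive_weights(B)
print(gamma)
print(truncate([0.2, 0.3, 0.5], {0, 1}))
```

In this example $\gamma$ converges to $(2/3, 1/3)$, and truncating $(0.2, 0.3, 0.5)$ to the first two subpopulations gives $(0.4, 0.6, 0)$.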

Coefficient of gene differentiation

Gene diversity of total population and within subpopulations:

\[
H_{Tt} = \sum_{i,j=1}^{s} w_i w_j H_{tij} = W_T H_t,
\qquad
H_{St} = \sum_{i=1}^{s} w_i H_{tii} = W_S H_t,
\]

where

\[
H_t = \mathrm{vec}(H_{tij};\ 1 \le i, j \le s), \quad
W_T = \mathrm{vec}(w_i w_j;\ 1 \le i, j \le s)', \quad
W_S = \mathrm{vec}(w_i 1\{i=j\};\ 1 \le i, j \le s)'.
\]

The coefficient of gene differentiation⁴ of generation $t$:

\[
G_{ST,t} = \frac{H_{Tt} - H_{St}}{H_{Tt}} = \frac{W_T H_t - W_S H_t}{W_T H_t}.
\]

Predicted coefficient of gene differentiation:

\[
G^*_{ST,t} = \frac{E_0(H_{Tt}) - E_0(H_{St})}{E_0(H_{Tt})}
= \frac{h_{Tt} - h_{St}}{h_{Tt}}
= \frac{W_T h_t - W_S h_t}{W_T h_t}.
\]

⁴ Nei (1973, 1977).
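A minimal sketch of the $G_{ST}$ computation, assuming the gene diversities are supplied as an $s \times s$ nested list `H[i][j]` rather than as the stacked vector $H_t$. The function name `gst` and the input values are hypothetical.

```python
def gst(H, w):
    """Return (H_T - H_S) / H_T for diversities H[i][j] and weights w[i]."""
    s = len(w)
    h_total = sum(w[i] * w[j] * H[i][j] for i in range(s) for j in range(s))
    h_within = sum(w[i] * H[i][i] for i in range(s))
    return (h_total - h_within) / h_total

# Two subpopulations fixed for different alleles: complete differentiation.
print(gst([[0.0, 1.0], [1.0, 0.0]], [0.5, 0.5]))
# Identical allele frequencies everywhere: no differentiation.
print(gst([[0.5, 0.5], [0.5, 0.5]], [0.5, 0.5]))
```

The same function gives the predicted coefficient $G^*_{ST,t}$ when it is fed the predicted diversities $h_{tij}$ instead of $H_{tij}$.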

Predicted gene diversity & subpop. differentiation (µ = 0)

$t \mapsto h_{Tt} \overset{\mu=0}{=} W_T A_t \cdots A_1 h_0$

$t \mapsto h_{St} \overset{\mu=0}{=} W_S A_t \cdots A_1 h_0$

$t \mapsto G^*_{ST,t}$

[Figure: two panels with time (0 to 100 generations) on the horizontal axis. One panel shows $h_{Tt}$ and $h_{St}$ (vertical range 0.90 to 1.00) for uniform, size proportional and reproductive weights; the other shows $G^*_{ST,t}$ (vertical range 0.00 to 0.10) for the same three weighting schemes.]

All alleles different at $t = 0$.

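Inbreeding effective size $N_{eI}([0,t])$ with replacement

Size of a Wright-Fisher population with the same predicted decline of gene diversity between 0 and $t$:

\[
\frac{h_{Tt}}{h_{T0}} = \left(1 - \frac{1}{2N_{eI}([0,t])}\right)^t.
\]

On the other hand,

\[
\frac{h_{Tt}}{h_{T0}} \overset{\mu=0}{=} \frac{W_T A_t \cdots A_1 h_0}{W_T h_0}.
\]

This gives

\[
N_{eI}([0,t]) =
\begin{cases}
\dfrac{1}{2\left(1 - \left(\dfrac{W_T A_t \cdots A_1 h_0}{W_T h_0}\right)^{1/t}\right)}, & h_{Tt} < h_{T0}, \\[2ex]
\text{not defined}, & h_{Tt} \ge h_{T0}.
\end{cases}
\]

Variance effective size⁵ $N_{eV}([0,t])$ is equivalent if:

- No subpopulation differentiation at $t = 0$ $\implies$ $h_0 = h_0 \mathbf{1}$

⁵ Crow (1954) for time interval [0, 1].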
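The last step, solving the Wright-Fisher decline for $N_{eI}([0,t])$, can be sketched as a small helper that takes the diversity ratio $h_{Tt}/h_{T0}$ as input. The function name is hypothetical, and returning `None` for the undefined case is a choice made here for illustration.

```python
def ne_from_diversity_ratio(ratio, t):
    """Solve ratio = (1 - 1/(2 Ne))**t for Ne; None if diversity did not decline."""
    if ratio >= 1.0:
        return None
    return 1.0 / (2.0 * (1.0 - ratio ** (1.0 / t)))

# Sanity check against an ideal Wright-Fisher population of size 100.
ratio = (1 - 1 / 200) ** 10
print(ne_from_diversity_ratio(ratio, 10))
print(ne_from_diversity_ratio(1.0, 10))
```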

Inbreeding effective size⁶ $N_{eI}([0,t])$

Draw two different genes in the future (generation $t$) with scheme T:

$\tau$ = coalescence time = $t$ − time to MRCA,

$h_{tij}$ = non-coalescence probability = $P(\tau > t \mid \text{subpop. } i, j \text{ drawn})$.

Then $\tau$ has distribution

\[
P(\tau > t) = \sum_{i,j} w_i w_j h_{tij} = W_T h_t = W_T D_t \cdots D_1 \mathbf{1},
\]

where $D_t = (D_{tij,kl})$ is an $s^2 \times s^2$ matrix with elements

\[
D_{tij,kl} = B_{tik} B_{tjl} \left(1 - \frac{1}{2N_{e,t-1,k}}\right)^{1\{k=l\}}.
\]

A Wright-Fisher model of size $N_{eI}([0,t])$ has

\[
P(\tau > t) = \left(1 - \frac{1}{2N_{eI}([0,t])}\right)^t.
\]

This gives

\[
N_{eI}([0,t]) = \frac{1}{2\left(1 - (W_T D_t \cdots D_1 \mathbf{1})^{1/t}\right)}.
\]

⁶ Wright (1931, 1938) for time interval [-1, 0].
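The coalescence-based formula can be sketched for the special case of constant sizes and migration, so that $D_t \cdots D_1 = D^t$. The row-major ordering of pairs $(i, j)$, the function names and the two-subpopulation inputs are assumptions made for this illustration.

```python
def build_D(B, Ne_prev):
    """Build the s^2 x s^2 matrix D, pairs (i, j) stored at index i * s + j."""
    s = len(B)
    D = [[0.0] * (s * s) for _ in range(s * s)]
    for i in range(s):
        for j in range(s):
            for k in range(s):
                for l in range(s):
                    # Two lineages in the same pool k avoid coalescing
                    # with probability 1 - 1/(2 Ne_k).
                    factor = 1 - 1 / (2 * Ne_prev[k]) if k == l else 1.0
                    D[i * s + j][k * s + l] = B[i][k] * B[j][l] * factor
    return D

def matvec(M, v):
    return [sum(a * x for a, x in zip(row, v)) for row in M]

def inbreeding_ne(B, Ne, w, t):
    """N_eI([0, t]) from P(tau > t) = W_T D^t 1, constant B and Ne assumed."""
    s = len(w)
    D = build_D(B, Ne)
    v = [1.0] * (s * s)
    for _ in range(t):
        v = matvec(D, v)
    p_no_coal = sum(w[i] * w[j] * v[i * s + j] for i in range(s) for j in range(s))
    return 1.0 / (2.0 * (1.0 - p_no_coal ** (1.0 / t)))

# Hypothetical example: two subpopulations of effective size 50 each.
B = [[0.9, 0.1], [0.1, 0.9]]
for t in (1, 10, 100):
    print(t, inbreeding_ne(B, [50, 50], [0.5, 0.5], t))
```

For a single isolated population the function returns its effective size for every $t$, as expected from the Wright-Fisher comparison.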

Eigenvalue effective size⁷ $N_{eE}$

If subpopulation sizes and migration matrices are constant,

\[
A_t = A, \quad D_t = D,
\]

have the same spectra. In particular, the largest eigenvalues

\[
\lambda = \lambda_{\max}(A) = \lambda_{\max}(D)
\]

are the same. Predicted gene diversity:

\[
h_{Tt} \overset{t \to \infty}{=} C_1 \lambda^t (1 + o(1)),
\]

Coalescence time:

\[
P(\tau > t) \overset{t \to \infty}{=} C_2 \lambda^t (1 + o(1)).
\]

A Wright-Fisher population of size $N_{eE}$ has

\[
\lambda = 1 - \frac{1}{2N_{eE}}.
\]

Hence

\[
N_{eE} = \frac{1}{2(1 - \lambda)}.
\]

⁷ Crow (1954), Ewens (1982).
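A minimal sketch of the eigenvalue effective size, using power iteration on $D$ to approximate $\lambda_{\max}(D)$. Power iteration is a choice made here for a dependency-free example and assumes the largest eigenvalue is simple with a positive eigenvector; the function names and the two-subpopulation inputs are hypothetical.

```python
def build_D(B, Ne):
    """s^2 x s^2 matrix D with pairs (i, j) stored at index i * s + j."""
    s = len(B)
    D = [[0.0] * (s * s) for _ in range(s * s)]
    for i in range(s):
        for j in range(s):
            for k in range(s):
                for l in range(s):
                    factor = 1 - 1 / (2 * Ne[k]) if k == l else 1.0
                    D[i * s + j][k * s + l] = B[i][k] * B[j][l] * factor
    return D

def largest_eigenvalue(M, iterations=2000):
    """Power iteration; v is rescaled so that its largest entry is 1."""
    v = [1.0] * len(M)
    lam = 0.0
    for _ in range(iterations):
        w = [sum(a * x for a, x in zip(row, v)) for row in M]
        lam = max(w)
        v = [x / lam for x in w]
    return lam

def eigenvalue_ne(B, Ne):
    lam = largest_eigenvalue(build_D(B, Ne))
    return 1.0 / (2.0 * (1.0 - lam))

# Hypothetical example: two subpopulations of effective size 50 each.
print(eigenvalue_ne([[0.9, 0.1], [0.1, 0.9]], [50, 50]))
```

In this symmetric example the result is slightly above 100, the sum of the two local effective sizes.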

Effect of weighting scheme

[Figure: effective size (vertical axis, about 600 to 1200) against time (0 to 100 generations) for uniform, size proportional and reproductive weights, with the asymptotic limit marked.]

Constant subpopulation sizes
Dashed: $N_{eI}([0,t])$ without replacement
Solid: $N_{eI}([0,t])$ with replacement (two 0.5:0.5 alleles at $t = 0$)
Horizontal: $N_{eE}$


Local functional effective sizes

[Figure: three panels, "Constant subpop. sizes", "Local bottleneck in 1" and "Blocked migration 1-2". Each shows effective size against time (0 to 100 generations) for subpopulations 1, 2, 3, 4 and 5 separately, for subpopulations 1,2,3,4,5 together, and the asymptotic limit. Solid: with replacement. Dashed: without replacement.]

Weights, whole population: $w_j = 1/s$
Weight, subpopulation $i$: $w_j = 1\{j = i\}$
Dashed: $N_{eI}([0,t])$ without replacement
Solid: $N_{eI}([0,t])$ with replacement (two 0.5:0.5 alleles at $t = 0$)
Horizontal: $N_{eE}$

Functional effective size, groups of subpopulations

[Figure: two panels, "Uniform weights" and "Size proportional weights". Each shows effective size (about 0 to 1200) against time (0 to 100 generations) for the subpopulation sets 1; 1,2; 1,2,3; 1,2,3,4; 1,2,3,4,5, together with the asymptotic limit.]

Nested sets of populations: $S = \{1, \ldots, i\}$, $i = 1, \ldots, 5$
Dashed: $N_{eI}([0,t])$ without replacement
Solid: $N_{eI}([0,t])$ with replacement (two 0.5:0.5 alleles at $t = 0$)
Horizontal: $N_{eE}$

Effect of migration rate

[Figure: two panels against the overall migration rate (0.000 to 0.030). "Long term eff size $N_{eE}$": $N_{eE}$ on the vertical axis (about 1000 to 2500). "Long term subpop diff $G^*_{ST,\infty}$": $G^*_{ST,\infty}$ on the vertical axis (0.0 to 1.0).]

Time independent sizes and migration rates: $N_{tk} = N_k$, $M_{tki} = M_{ki}$

Horizontal axis: overall migration rate

\[
m = \sum_{k=1}^{s} (N_k/N) \sum_{i \neq k} M_{ki}
\]
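As a worked example of the overall migration rate, the sketch below evaluates $m$ for the five-subpopulation example used earlier in the talk, assuming $N = \sum_k N_k$ and that the first subscript of $M$ indexes rows. The function name is hypothetical.

```python
# Census sizes and forward migration matrix from the five-subpopulation example.
N = [200, 400, 50, 400, 100]
M = [
    [0.94,  0.025,  0.0,    0.01,   0.0],
    [0.025, 0.9825, 0.0125, 0.0,    0.0],
    [0.0,   0.04,   0.82,   0.06,   0.0],
    [0.005, 0.0,    0.01,   0.9875, 0.0125],
    [0.0,   0.0,    0.0,    0.0,    0.95],
]

def overall_migration_rate(M, N):
    """m = sum_k (N_k / N) * sum_{i != k} M[k][i], with N the total census size."""
    total = sum(N)
    s = len(N)
    return sum(
        (N[k] / total) * sum(M[k][i] for i in range(s) if i != k)
        for k in range(s)
    )

m = overall_migration_rate(M, N)
print(m)
```

Here the weighted off-diagonal sums correspond to 38 migrants per generation out of 1150 individuals, so $m = 38/1150 \approx 0.033$.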

Other effective sizes

Nucleotide diversity effective size⁸

\[
N_{e\pi}((-\infty, t]) = \frac{E(\tau)}{2}
= \frac{1}{2} \sum_{r=0}^{\infty} P(\tau > r)
= \frac{1}{2} \left(1 + \sum_{r=1}^{\infty} W_T D_t \cdots D_{t-r+1} \mathbf{1}\right).
\]

Coalescence effective size⁹ $N_{eC}$ satisfies

\[
N = \sum_{i=1}^{s} N_i \to \infty \implies \lambda \to 1 \implies
N_{eE} = \frac{1}{2(1 - \lambda)} = N_{eC}(1 + o(1))
\]

for constant subpopulation sizes ($N_{ti} = N_i$).

⁸ Ewens (1989), Slatkin (1991), Durrett (2008).
⁹ Kingman (1982), Wakeley (1999), Nordborg and Krone (2002), Sjödin et al. (2005).
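The series for $N_{e\pi}$ can be approximated by truncation. The sketch below assumes constant sizes and migration, so that $D_t \cdots D_{t-r+1} = D^r$, and stops once $P(\tau > r)$ falls below a tolerance; the row-major pair ordering, the function names, the tolerance and the example inputs are choices made here for illustration.

```python
def build_D(B, Ne):
    """s^2 x s^2 matrix D with pairs (i, j) stored at index i * s + j."""
    s = len(B)
    D = [[0.0] * (s * s) for _ in range(s * s)]
    for i in range(s):
        for j in range(s):
            for k in range(s):
                for l in range(s):
                    factor = 1 - 1 / (2 * Ne[k]) if k == l else 1.0
                    D[i * s + j][k * s + l] = B[i][k] * B[j][l] * factor
    return D

def nucleotide_diversity_ne(B, Ne, w, tol=1e-12, max_terms=1000000):
    """Half the truncated sum of P(tau > r), r = 0, 1, ..., with constant D."""
    s = len(w)
    D = build_D(B, Ne)
    v = [1.0] * (s * s)  # D^0 applied to the vector of ones
    total = 0.0
    for _ in range(max_terms):
        p = sum(w[i] * w[j] * v[i * s + j] for i in range(s) for j in range(s))
        total += p  # p equals P(tau > r) for the current r
        if p < tol:
            break
        v = [sum(a * x for a, x in zip(row, v)) for row in D]
    return total / 2.0

# Hypothetical example: two subpopulations of effective size 50 each.
print(nucleotide_diversity_ne([[0.9, 0.1], [0.1, 0.9]], [50, 50], [0.5, 0.5]))
```

For a single isolated population the sum is geometric and the function returns its effective size, up to truncation error.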

Conclusions

- General matrix analytic framework of structured populations:
  - Genetic drift, migration and mutations
  - Subpopulation differentiation
  - Various weighting schemes
  - Local and global effects
- Focus on time profiles of $N_e$ and predicted $G_{ST}$
- Unified framework for various types of $N_e$
- User-friendly software GESP under implementation


THANKS!
