Recombination and genome evolution – Recombination and selection
Sylvain Glémin, Institut des Sciences de l’Evolution
Introduction
Recombination: (mostly) universal
Key role in genetic processes and genome evolution
Variations in patterns of recombination
Between species (sexual/asexual, outcrossing/selfing,…)
Between genomic compartments (nuclear/organelles)
Between chromosomes (sex chromosomes/autosomes, size,…)
Along chromosomes (recombination gradients, hotspots,…)
Introduction
Direct effects of recombination
Possible mutagenic effects
Gene conversion
At the initiation sites (see Hotspot paradox, Boulton et al. 1997)
At flanking regions [gBGC, see L. Duret’s lecture]
Indirect effects of recombination
Crossing-over: mixing alleles between genes
Outline
I. Effects of genetic linkage on selection: general predictions
II. Evolution of non-recombining genomic regions: the example of mtDNA
III. The surprising effects of gBGC on selection
IV. Using genomes to study breeding system evolution
I. Effects of genetic linkage on selection: general predictions
I.1 Recombination and linkage disequilibrium
I.2 Genetic hitch-hiking effects
I.3 Consequences for genomic patterns
I.1 Recombination and linkage disequilibrium
LD: non-random association between alleles at two (or more) loci
f(AB) = f(A)f(B) + D
f(Ab) = f(A)f(b) – D
f(aB) = f(a)f(B) – D
f(ab) = f(a)f(b) + D
[Figure: the four haplotypes formed by alleles A/a and B/b]
LD is created by: drift, selection, population structure, …
LD is removed by: recombination (mutation)
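The definition of D above can be turned into a small calculation. The sketch below is only illustrative: the function names are hypothetical, r² is the usual normalized measure of LD, and the decay rule D_t = D_0 (1 - c)^t assumes random mating with recombination rate c between the two loci and no other force acting.

```python
def linkage_disequilibrium(f_AB, f_Ab, f_aB, f_ab):
    """Return (D, r2) computed from the four haplotype frequencies."""
    if abs(f_AB + f_Ab + f_aB + f_ab - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    f_A = f_AB + f_Ab          # frequency of allele A
    f_B = f_AB + f_aB          # frequency of allele B
    D = f_AB - f_A * f_B       # from f(AB) = f(A)f(B) + D
    denom = f_A * (1 - f_A) * f_B * (1 - f_B)
    r2 = D * D / denom if denom > 0 else float("nan")
    return D, r2


def ld_after(D0, c, t):
    """Expected D after t generations (assumes random mating, rate c)."""
    return D0 * (1 - c) ** t


# Complete association between A and B (only AB and ab haplotypes exist).
D, r2 = linkage_disequilibrium(0.5, 0.0, 0.0, 0.5)
print(D, r2)
print(ld_after(D, c=0.5, t=3))
```

With free recombination (c = 0.5) the association is halved every generation, whereas with c close to 0 it persists, which is why recombination is listed as the main force removing LD.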
Genomic patterns of linkage disequilibrium
[Figures: LD patterns in maize (outcrosser; Remington et al. 2001) and Arabidopsis thaliana (selfer; Nordborg et al. 2005)]
I.2. Genetic hitch-hiking effects
“The hitch-hiking effect of a favorable gene” (Maynard Smith and Haigh, 1974)
Two-locus dynamics
Genotypes:  AA   Aa    aa     BB   Bb   bb
Fitness:    1    1+s   1+2s   1    1    1
Selected locus: $\Delta p_a = s\,p_A\,p_a / \bar{W}$
Linked neutral locus: $\Delta p_b = s\,D / \bar{W}$
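The two-locus dynamics can be illustrated with a deterministic recursion. This is a minimal sketch under a simplifying assumption not made on the slide: a haploid model in which allele a has fitness 1 + s, standing in for the additive diploid fitnesses 1, 1+s, 1+2s. The function names, starting frequencies and number of generations are hypothetical. Under this model the one-generation change at the neutral locus is exactly s D / W̄, with D = f(ab) - f(a)f(b).

```python
def step(x, s, c):
    """One generation: selection at locus A/a, then recombination.

    x = (x_AB, x_Ab, x_aB, x_ab); allele a has fitness 1 + s (haploid model).
    """
    x_AB, x_Ab, x_aB, x_ab = x
    w_bar = 1 + s * (x_aB + x_ab)              # mean fitness
    x_AB, x_Ab = x_AB / w_bar, x_Ab / w_bar    # selection
    x_aB, x_ab = x_aB * (1 + s) / w_bar, x_ab * (1 + s) / w_bar
    D = x_AB * x_ab - x_Ab * x_aB              # LD after selection
    # Recombination at rate c erodes D without changing allele frequencies.
    return (x_AB - c * D, x_Ab + c * D, x_aB + c * D, x_ab - c * D)


def sweep(s, c, p_a0=0.001, p_b0=0.1, generations=5000):
    """Allele a arises on a b-bearing haplotype; return the final frequency of b."""
    x = (1 - p_b0, p_b0 - p_a0, 0.0, p_a0)
    for _ in range(generations):
        x = step(x, s, c)
    return x[1] + x[3]


for c in (0.0, 0.001, 0.01, 0.5):
    print(c, sweep(s=0.05, c=c))
```

Without recombination the neutral allele b is carried to fixation with a; with free recombination its frequency barely moves, because D is destroyed before selection can act on it.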
The different forms of genetic hitch-hiking effects
Classes of mutations: strongly advantageous, weakly advantageous, neutral, weakly deleterious, mildly deleterious
Hill-Robertson interference (Hill & Robertson 1966): genetic drift of haplotypes; loss of advantageous mutations
Selective sweep (Maynard Smith & Haigh 1974): selection of strongly advantageous mutations; loss of polymorphism and fixation of weakly deleterious mutations
Muller’s ratchet (Muller 1932): random loss (by drift) of chromosomes free of mutation; accumulation of deleterious mutations
Background selection (Charlesworth et al. 1993): selection against deleterious mutations; loss of neutral and weakly advantageous variants
Hitch-hiking effects: formalisation
[Figure: allele frequencies over time. Background selection: neutral alleles linked to a deleterious allele. Selective sweep: neutral alleles linked to an advantageous allele.]
Roughly equivalent to reducing effective size
Background selection (Charlesworth et al. 1993): $N_e = N f_0$ with $f_0 = \exp(-U/s)$ (haploid)
Recurrent selective sweeps (genetic draft) (Gillespie 2000): $N_e = N/(1 + 2N\rho)$
Note: for some processes the effect of draft cannot be simply summarized by an effect on Ne
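The two approximations for Ne can be compared numerically. The sketch below simply evaluates the formulas given above; the function names and parameter values are hypothetical, U is read as the deleterious mutation rate of the non-recombining region, and ρ as the rate of hitch-hiking events (both readings are assumptions, since the slide does not define them).

```python
import math


def ne_background_selection(N, U, s):
    """N_e = N * f0 with f0 = exp(-U / s) (haploid, no recombination)."""
    return N * math.exp(-U / s)


def ne_genetic_draft(N, rho):
    """N_e = N / (1 + 2 * N * rho) under recurrent selective sweeps."""
    return N / (1 + 2 * N * rho)


# Illustrative values only (not taken from the lecture).
for N in (1e3, 1e4, 1e5, 1e6):
    print(N, ne_background_selection(N, U=0.01, s=0.01), ne_genetic_draft(N, rho=1e-4))
```

Under background selection Ne stays proportional to N, whereas under draft Ne levels off near 1/(2ρ) as N increases, so in large populations diversity stops tracking census size.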
I.3 Consequences for genomic patterns
Neutral mutations (synonymous, introns,…)
Polymorphism: $\pi = 4N_e\mu$
Divergence: $D = \mu T$
Selected mutations (non-synonymous, regulatory sequences,…)
Polymorphism: $\pi = 4N_e f_n \mu + \pi_{\text{weak selection}}$
Divergence: $D = \mu T f_n + D_{\text{weak selection}} + D_{\text{advantageous}}$
Figure annotations: indirect effects, direct effects
Indirect effects
Interaction between selection and drift
Substitution rate: $F(S) = u\,\dfrac{S}{1 - e^{-S}}$, with $S = 4N_e s$
[Figure: Dn/Ds and πn/πs as functions of S = 4Nes, contrasting low-recombination and high-recombination regions]
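To make the interaction between selection and drift concrete, the substitution rate relative to the neutral rate, S / (1 - e^{-S}), can be evaluated for the same selection coefficient at two effective sizes. This is a sketch with hypothetical names and values; the lower Ne stands in for a low-recombination region, which is an illustrative assumption.

```python
import math


def relative_substitution_rate(S):
    """F(S) / u = S / (1 - exp(-S)), with S = 4 * Ne * s (equals 1 when S = 0)."""
    if abs(S) < 1e-12:
        return 1.0
    return S / -math.expm1(-S)   # -expm1(-S) = 1 - exp(-S)


s = -1e-4                          # weakly deleterious mutation (hypothetical)
for label, Ne in (("low rec", 1_000), ("high rec", 10_000)):
    S = 4 * Ne * s
    print(label, S, relative_substitution_rate(S))
```

The same deleterious mutation fixes at close to the neutral rate when Ne is small and almost never when Ne is large, which is the expected basis of higher Dn/Ds in regions of low recombination.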
Summary of predictions
[Figure: predicted variation of πn/πS, πS/DS and Dn/DS with recombination rate (rec), under stabilizing selection and positive selection]
Recombination and genomic patterns: polymorphism
[Figures: polymorphism and recombination in C. elegans (Cutter et al. 2003) and A. thaliana (Nordborg et al. 2005)]
Significant effect after controlling for divergence
Recombination and genomic patterns: polymorphism
(Hellmann et al. 2003)
Human: an example of the confounding effects of recombination
Recombination and genomic patterns: Dn/Ds
Human/chimp divergence (Bullaughey et al. 2008)
[Figure: Dn/Ds per recombination class]
D. melanogaster/D. yakuba divergence (Haddrill et al. 2007), ~7600 genes
Recombination classes: High, Intermediate, Low, NA (no recombination), N4 (4th chromosome)
[Figure: Dn/Ds per recombination class]
Summary
Globally: weak effects of recombination gradients
Strong contrast between recombining and non-recombining regions
Weak recombination is sufficient to counteract Hill-Robertson effects (if c/u > 1)
II. Evolution of non-recombining genomic regions: the example of mtDNA
II.1 mtDNA characteristics and classical assumptions
II.2 Patterns of mtDNA polymorphism in animals
II.3 The genetic draft hypothesis
II.1 mtDNA characteristics and classical assumptions
Mitochondrial genome in animals
Small genome (~13 to 20 kb)
Non-recombining
High gene density
High mutation rates
Classical assumptions
Strong purifying selection
Observed polymorphism ~neutral
Good marker for molecular biodiversity
II.1 mtDNA characteristics and classical assumptions
Tests of these assumptions
Does mtDNA polymorphism correlate with Ne through the effects of life history and ecological traits?
$\pi \sim N_e \cdot \mu$
[Diagram linking life history traits and ecological traits to Ne through demography, structure and selection]
II.2 Patterns of mtDNA polymorphism in animals
The Polymorphix database (Bazin et al. 2005)
Homologous sequences within species + outgroups
Several homology criteria
Database cleaning
Remove genome projects, transposons, specific genes (MHC, rRNA,…)
Manually check highly polymorphic genes
II.2 Patterns of mtDNA polymorphism in animals
Data set available in Polymorphix
mtDNA: ~1350 species
nucDNA: ~100 species
Allozyme diversity (Nevo et al. 1984): ~750 species
Computation of πs
Average over loci within species
Average over species within taxa / ecological groups
Comparison with allozyme diversity
Global comparison
[Figure: allozyme heterozygosity and synonymous diversity (πs) of nucDNA and mtDNA, compared between vertebrates and invertebrates (Bazin et al. 2006)]
Life history traits and diversity
[Figure: allozyme heterozygosity (H) and mtDNA synonymous diversity (πs) compared between Branch. and Dec. (crustaceans), continental and marine molluscs, and freshwater and marine fishes; significance marks *** and * (Bazin et al. 2006)]
II.3 The genetic draft hypothesis (Gillespie 2000, 2001)
Recurrent selective sweeps
[Figure: advantageous mutations and linked neutral alleles over time]
[Figure: Ne as a function of N (1 to 10^6) under pure drift and under continuous adaptation, showing the drift zone and the draft zone]
$N_e = N / (1 + 2N\rho)$
Signature of positive selection on mtDNA (Bazin et al. 2006)
Neutrality index: $NI = (\pi_N / \pi_S) / (d_N / d_S)$
[Figure: NI (log scale) for mtDNA and nuclear DNA in vertebrates and invertebrates, with regions labelled adaptive, neutral and purifying]
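The neutrality index is a simple ratio of ratios. The sketch below uses hypothetical input values (not data from Bazin et al. 2006) and the usual reading of the index: NI below 1 indicates an excess of non-synonymous divergence (adaptive), NI above 1 an excess of non-synonymous polymorphism (purifying selection), and NI near 1 neutrality.

```python
def neutrality_index(pi_n, pi_s, d_n, d_s):
    """NI = (pi_N / pi_S) / (d_N / d_S)."""
    return (pi_n / pi_s) / (d_n / d_s)


def interpret(ni, tol=0.05):
    """Coarse label for NI; the tolerance around 1 is an arbitrary choice."""
    if ni < 1 - tol:
        return "adaptive"
    if ni > 1 + tol:
        return "purifying"
    return "neutral"


# Hypothetical numbers, for illustration only.
ni = neutrality_index(pi_n=0.002, pi_s=0.01, d_n=0.04, d_s=0.1)
print(ni, interpret(ni))
```

In practice NI is estimated from counts of polymorphic and fixed sites and comes with sampling error, so a fixed tolerance like the one above is only a placeholder for a proper test.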
The effect of draft on the whole mtDNA genome
(Bazin et al. 2006)
Complete mitochondrial genomes:
Groups of closely related species (max dS < 50%)
Patterns of selection on the whole mtDNA genome
Recurrent selective sweeps in large populations: fixation of weakly deleterious alleles (dN/dS)
Selective sweep (Maynard Smith & Haigh 1974): selection of strongly advantageous mutations; loss of polymorphism and fixation of weakly deleterious mutations
Summary
Importance of linkage in the evolution of mtDNA
Classical assumptions, revisited
Strong purifying selection + recurrent positive selection
Observed polymorphism ~neutral but not related to global Ne
Bad (rather than good) marker for molecular biodiversity
What is mtDNA (in large populations) adapting to?
Metabolic adaptations?
Selfish genes? (two-level selection)
Cyto-nuclear interactions
Association with maternally inherited endosymbionts (Wolbachia,…)
…?
III. The surprising effect of gBGC on selection
III.1 The population genetics of gBGC/selection interference
III.2 The fitness consequences of gBGC
III.1 The population genetics of gBGC/selection interference
gBGC ~ meiotic distortion: formally equivalent to genic selection for GC alleles (Nagylaki 1983)
gBGC: $\Delta p_{GC} = p_{GC}(1 - p_{GC})\,b$
Selection: $\Delta p = p(1 - p)\,s / \bar{W}$ ($\bar{W} \approx 1$)
(Normalized) substitution rates
Selection alone: $F = \dfrac{S}{1 - e^{-S}}$, with $S = 4N_e s$
gBGC alone: $F = \dfrac{B}{1 - e^{-B}}$, with $B = 4N_e b$
gBGC + selection: $F = \dfrac{B + S}{1 - e^{-(B + S)}}$
Figure annotations: indirect effects, direct effects
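Because gBGC behaves like genic selection, its effect on substitution rates can be explored by adding B to S in the same fixation formula. The sketch below assumes that B and S combine additively, (B + S) / (1 - e^{-(B+S)}), with S negative for deleterious mutations and B positive for AT→GC and negative for GC→AT mutations; the function names and the values of S and B are hypothetical.

```python
import math


def fixation_factor(x):
    """x / (1 - exp(-x)), the substitution rate relative to neutrality."""
    return 1.0 if abs(x) < 1e-12 else x / -math.expm1(-x)


def substitution_rate(S, B=0.0):
    """Normalized substitution rate with selection S = 4*Ne*s and gBGC B = 4*Ne*b."""
    return fixation_factor(B + S)


S = -1.0                                   # weakly deleterious mutation
print("selection alone   ", substitution_rate(S))
print("GC->AT (B = -1)   ", substitution_rate(S, B=-1.0))
print("AT->GC (B = +1)   ", substitution_rate(S, B=+1.0))
```

With these illustrative values, a deleterious AT→GC mutation whose cost is exactly offset by gBGC fixes at the neutral rate, while the same cost on a GC→AT mutation is purged more efficiently than under selection alone.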
gBGC/selection: substitution rates
[Figure: normalized substitution rate as a function of S = 4Nes, under selection alone, $S/(1 - e^{-S})$, and under gBGC + selection, $(B + S)/(1 - e^{-(B + S)})$]
GC→AT mutations (B < 0): gBGC helps purging deleterious mutations
AT→GC mutations (B > 0): gBGC contributes to fixing deleterious mutations
Fixation > purging: Achilles’ heel (Galtier and Duret 2007)
gBGC/selection: from dominance to overdominance
(Partially) recessive deleterious AT→GC mutations
Genotypes:       WW      WS       SS
Fitness:         1       1 – hs   1 – s
gBGC:            1 – b   1        1 + b
Total effects:   1 – b   1 – hs   1 + b – s
If hs < b < (1 – h)s: overdominance-like dynamics
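The condition hs < b < (1 – h)s follows from requiring the WS total effect to exceed both homozygote totals. The sketch below checks this, and adds the classical overdominance equilibrium under the assumption that the total effects can be treated as ordinary fitnesses; that assumption and the numerical values are illustrative, not taken from the lecture.

```python
def total_effects(h, s, b):
    """Total effects (selection + gBGC) for genotypes WW, WS, SS."""
    return (1 - b, 1 - h * s, 1 + b - s)


def is_overdominance_like(h, s, b):
    """True when the heterozygote total exceeds both homozygote totals."""
    return h * s < b < (1 - h) * s


def equilibrium_gc_frequency(h, s, b):
    """Equilibrium frequency of the S (GC) allele, treating totals as fitnesses.

    Classical result s1 / (s1 + s2), with s1 = b - h*s (disadvantage of WW
    relative to WS) and s2 = (1 - h)*s - b (disadvantage of SS).
    """
    if not is_overdominance_like(h, s, b):
        raise ValueError("no internal equilibrium outside hs < b < (1 - h)s")
    return (b - h * s) / (s * (1 - 2 * h))


h, s, b = 0.1, 0.001, 0.0002            # hypothetical values
print(total_effects(h, s, b))
print(is_overdominance_like(h, s, b), equilibrium_gc_frequency(h, s, b))
```

Under these assumptions a recessive deleterious GC allele is maintained at intermediate frequency instead of being lost or fixed, which is how gBGC can inflate deleterious standing variation.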
III.2 The fitness consequences of gBGC
Fixation of weakly deleterious GC alleles
Increases Dn/Ds: false signature of relaxed or positive selection [see L. Duret’s lecture]
GC-associated fixation load
Increase in the frequency and ages of GC deleterious alleles
Increases the frequency of genetic diseases
Increases deleterious standing variation: inbreeding depression
The gBGC-associated load
Genetic structure of the load (without gBGC)
$L = 1 - \bar{W}$
[Figure: load as a function of s (10^-6 to 0.1)]
Load due to segregating mutations: L = 2u
Load due to fixed mutations: L = s
Fixed mutations but nearly neutral
Most damaging ones: s ~ 1/Ne
The gBGC-associated load (Glémin submitted)
[Figure: load (x10^-6) as a function of s (10^-6 to 0.1) for N = 100 000, 50 000, 10 000 and 5 000, without gBGC and with gBGC (b = 0.0002)]
Without gBGC: the load is higher in small populations; most damaging mutations: s ~ 1/Ne
With gBGC (b = 0.0002): the load is independent of Ne; most damaging mutations: s ~ 2b
The gBGC-associated load (Glémin submitted)
Structure of the load with gBGC
[Figure: drift load, gBGC fixation load and segregating load along the s axis (0, ~1/2N, ~2b), for large and small population sizes]
Quantification of the load (Glémin submitted)
Gamma distribution of fitness effects: mean = 0.03, shape = 0.2 (Human data, e.g. Eyre-Walker et al. 2006)

Load          Without gBGC   With gBGC-1   With gBGC-2
N = 10 000    0.634          0.689         0.865
N = 50 000    0.143          0.301         0.708
N = 100 000   0.074          0.248         0.687

gBGC-1: 3% of gBGC hotspots with b = 0.0002 (Human data, e.g. Spencer et al. 2006)
gBGC-2: 5% of gBGC hotspots with b = 0.0005
IV. Using genomes to study breeding system evolution
IV.1 The genetic consequences of selfing and the “dead-end hypothesis”
IV.2 An integrated test of the “dead-end hypothesis” in Triticeae
IV.3 gBGC: another cost of outcrossing?
IV.1 The genetic consequences of selfing and the “dead-end hypothesis”
Selfer vs outcrosser
Effective size: reduced in selfers (sampling of N individuals instead of 2N gametes, + ecology)
Heterozygosity: ~ 0 in selfers
Effective recombination: ~ 0 in selfers
Genomic consequences of selfing
Reduced Ne and re
Low polymorphism levels: low πS
High LD
Weak efficacy of selection: increase in DN/DS and πN/πS
High homozygosity: gBGC ineffective. GC as a marker of selfing evolution?
Low adaptive potential; accumulation of deleterious alleles
Evolutionary dead-end? (Stebbins 1957)
The effect of selfing on patterns of polymorphism in plants
[Figure: diversity in outcrossers vs selfers: He from allozymes (Hamrick & Godt 1990), He from microsatellites (Nybom 2004), πS (x 10) from sequences (Glémin et al. 2006); p = 0.002, n = 108]
LD patterns (Glémin et al. 2006)
             p-value   Outcrossers   Selfers
r²           0.014     0.401         0.561
ρ = 4Nec     0.049     6.24          1.55
n 9686
The effect of selfing on patterns of polymorphism in plants
Weakly deleterious mutations: $f_0 = \pi_n / \pi_s$
Lower selection efficacy in selfers? (Glémin et al. 2006)
IV.2 An integrated test of the “dead-end hypothesis” in Triticeae
19 species: 10 outcrossers, 9 selfers
27 genes: 21 along one chromosome (strong recombination gradient) (Akhunov et al. 2003)
IV.2 An integrated test of the “dead-end hypothesis” in Triticeae
Variations between selfing and outcrossing branches: Dn/Ds, GC*
Variations along the recombination gradient: Dn/Ds, GC*
[Figure: phylogenies of S. cereale, Ae. speltoides, T. monococcum, T. urartu and H. vulgare under Model 0 and Model 1]
Mating systems have no effect… (Escobar et al. submitted)
… but recombination does: r_Spearman = -0.47, p = 0.03; r_Spearman = 0.41, p = 0.04 (Escobar et al. submitted)
The tempo of selfing evolution in Triticeae
Strong and stable recombination gradient
Very recent transition to selfing?
Similar results in Arabidopsis (Wright et al. 2002) and Caenorhabditis (Cutter et al. 2008) [Dn/Ds, codon usage]
The tempo of selfing evolution in Triticeae
Analyses of a single transition, 52 genes (Haudry et al. 2008)
[Figure: phylogeny of S. cereale, Ae. speltoides, T. monococcum, T. urartu and H. vulgare]
GC* marker of the shift to selfing… (p = 0.03)
              GC      GC*
Outcrossers   0.473   0.453
Selfers       0.353   0.471
… but still no effect of selfing on Dn/Ds (p > 0.03)
              Dn/Ds
Outcrossers   0.150
Selfers       0.105
IV.3 gBGC: another cost of outcrossing?
gBGC: probably strong in grasses
Achilles’ heel in outcrossing grasses?
[Figure: GC content in Poaceae vs others (Glémin et al. 2006)]
Test of the Achilles’ heel hypothesis (Haudry et al. 2008)
[Figure: phylogeny of S. cereale, Ae. speltoides, T. monococcum, T. urartu and H. vulgare, with ωself and ωout branches]
Two classes of genes: GC* < 0.45 (median) and GC* > 0.45
           GC* < 0.45   GC* > 0.45
ωself      0.111        0.099
ωout       0.126        0.180
p-value    0.664        0.032
Conclusion
Several lines of evidence for a recent origin of selfing (Triticeae, Medicago, Polemoniaceae,…)
But little or no evidence of genomic degradation
Conclusion
Differences between selfing/asexual lineages

Taxonomic group   Groups compared              Dataset                                   dN/dS, πN/πS
Outcrossing/Selfing
Angiosperms       29 selfers / 42 outcrossers  Meta-analysis (polymorphism)              +/-
Arabidopsis       1 selfer / 1 outcrosser      23 nuclear genes + 1 chloroplastic gene   -
Caenorhabditis    2 selfers / 4 outcrossers    > 1000 nuclear genes                      -
Triticeae         2 selfers / 2 outcrossers    52 nuclear genes + 1 chloroplastic gene   -
Sexuals/Asexuals
Rotifers          3 asexuals / 2 sexuals       Hsp82                                     -
                  3 asexuals / 4 sexuals       Cox I                                     +/- +
Campeloma         6 asexuals / 12 sexuals      Cytb                                      +
Daphnia           14 asexuals / 14 sexuals     Complete mitochondrial genome             +
Conclusion
Differences between selfing/asexual lineages
[Figure: asexuality, selfing and outcrossing compared for recombination, segregation and effective population size]
General conclusion
Effects of recombination on selection
Through crossing-over: hitch-hiking effects
Through gBGC
Genomic approaches to understand recombination and breeding system evolution
Open questions
Drift vs draft
Extinction of self-fertilizing species?
Role of gBGC in the evolution of genetic systems (recombination, breeding systems)
Evolution of gBGC