Plant breeding relies upon variation
What are the molecular variants that underlie phenotypic diversity?
• SNPs
• InDels
– CNV/PAV
• Transposons
• Epigenetics
• Expression levels
How prevalent are these types of variation? How do they behave in breeding?
Variation: heterosis and transgressive segregation
• Transgressive variation is the basis for much of classical breeding
• Apparent phenotypic similarity does not indicate similar genetic mechanisms
[Figure: proportion of population across trait values for F2 and RIL populations, with B73, Mo17 and F1 marked; RILs grouped as short, intermediate and tall]
Outline
• Molecular variation in crop genomes
– Expression variation
– Structural variation
– Epigenetic variation
• Heterosis
[Figure: bar charts of expression level for Gene Q, Gene S and Gene T (axis 0 to 1400, B73 vs. Mo17) and for Gene A, Gene B and Gene C (axis 0 to 9000) across inbreds B14, B37, B73, B84, Mo17, Oh43, W22 and Wf9]
Transcriptome differences among parents
• Comparisons of different individuals of the same species reveal a surprising number of gene expression differences
% differentially expressed genes (from 12,327 expressed genes)
        B37     B73     B84     Mo17    Oh43    W22     Wf9
B14A    11.8%   8.5%    6.5%    16.2%   16.6%   12.0%   11.7%
B37     -       11.0%   9.5%    14.2%   12.7%   12.1%   11.8%
B73     -       -       4.3%    15.6%   14.7%   13.1%   12.0%
B84     -       -       -       15.7%   15.6%   13.5%   11.3%
Mo17    -       -       -       -       15.0%   15.0%   14.2%
Oh43    -       -       -       -       -       14.2%   12.1%
W22     -       -       -       -       -       -       11.0%
• These differentially expressed genes include many examples of genes that are only expressed in some genotypes
History of structural variation studies in maize
Kato et al., 2004 PNAS
Brunner et al. 2005 Plant Cell
Sequencing of multiple haplotypes: Dooner, Rafalski, Morgante, Schnable, Messing
Gain/loss in Hp301, Tx303 and teosinte
[Figure: gain/loss tracks for Mo17, Hp301, Tx303 and teosinte (Teo)]
• Many gain/loss sequences in Hp301, Tx303 and teosinte
• A significant amount of the reference genome is missing in each line
– Hp301: 24 Mbp
– Tx303: 29 Mbp
– Mo17: 25 Mbp
[Figure legend: Gain / Loss]
Kai Ying; ISU
Gene-centric array based analysis of structural variation in diverse maize
• 24 diverse maize lines (4 SS / 6 NSS / 5 tropical / 6 PVP / 3 mixed)
• 14 teosinte genotypes (4 TIL / 10 wild individuals)
Swanson-Wagner et al., Genome Res. 2010
Gene classes: Not Sig n=28,675; PAV / DownCNV n=3,334; UpCNV n=402; Both n=76
[Figure: number of genotypes with variant vs. physical position (Mb), shown for Chr9 and Chr10]
Functional implications of structural variants
• Many, but certainly not all, genes with structural variation are “Unclassified”
• “Classical” maize genes (Schnable, Freeling) – 24 / 420 tested (4 CNV and 20 PAV)
• Transcription factors (GRASSIUS) – 98 / 1,723 tested (7 CNV and 91 PAV)
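The counts quoted on these slides can be turned into rates with a few lines of arithmetic. The sketch below is only illustrative: it assumes the four array classes (Not Sig, PAV / DownCNV, UpCNV, Both) are mutually exclusive and together cover every gene tested, which the slides do not state explicitly. All variable and function names are hypothetical.

```python
# Counts copied from the slides; the grouping into dictionaries is illustrative.
sv_classes = {
    "not significant": 28_675,
    "PAV / down-CNV": 3_334,
    "up-CNV": 402,
    "both": 76,
}
tested_sets = {
    "classical genes": {"tested": 420, "CNV": 4, "PAV": 20},
    "transcription factors": {"tested": 1_723, "CNV": 7, "PAV": 91},
}


def percent(part, whole):
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


# Assumption: the four classes partition the set of genes on the array.
total_genes = sum(sv_classes.values())
variable_genes = total_genes - sv_classes["not significant"]
overall_rate = percent(variable_genes, total_genes)

# Share of each curated gene set that shows CNV or PAV.
set_rates = {
    name: percent(counts["CNV"] + counts["PAV"], counts["tested"])
    for name, counts in tested_sets.items()
}

print(f"genes with structural variation: {overall_rate:.1f}% of {total_genes}")
for name, rate in set_rates.items():
    print(f"{name}: {rate:.1f}%")
```

Under that assumption, the curated sets show structural variation at roughly half the overall rate, which fits the point that many, but not all, affected genes are unclassified.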
Structural variation contributes significantly to quantitative trait variation in maize (Chia et al., 2012 Nat Genetics)
Examples of CNV affecting important traits in plants
Cold tolerance in barley (Knox et al TAG 2010)
SCN resistance in soybean (Cook et al. Science 2012)
Flowering time in wheat (Diaz et al 2012 PLoS One)
Herbicide resistance in weeds (Gaines et al., 2010 PNAS)
Potential causes of CNV
Potential sources of dispersed duplicates:
1. Transposition
2. On-going fractionation of syntenic regions (Schnable et al., 2011 PNAS)
CoGePedia
Epigenetics - definitions
• Epigenetics: Heritable information not solely due to DNA sequence
• Mitotic memory: Development; response to environment
• Meiotic / trans-generational memory: Silencing of TEs; heritable variation
• Chromatin modifications (DNA methylation / histone modifications) are often a mechanism of epigenetic memory but are not necessarily epigenetic
Epigenetics can contribute to natural variation
• Tip of the iceberg or rare form of variation?
• Most examples of trans-generational epigenetic regulation are variable within the species
– Arabidopsis
• SUP, PAI, BAL
– Maize
• B, P, C, Pl, R
• Epigenetically silenced alleles may represent genes on the path to genomic removal via genetic mechanisms
Morgan et al., 1999
Cubas et al., 1999 Chandler and Stam 2004
DNA methylation diversity in maize
[Figure: hierarchical clustering of 1,754 rare DMRs and 1,966 common DMRs across maize, landrace and teosinte genotypes; scale from hypomethylation to hypermethylation]
• Rare “loss” of DNA methylation more common than rare “gain”
• Diversity of epigenome mirrors genomic diversity
Functional Consequences of DMRs
[Figure: NAM inbred 5mC compared with NAM inbred RNA-seq]
~40M RNA-seq reads (tissue matched) for each NAM parental inbred
Compare transcript abundance and DNA methylation variation
• Identified nearest genes to each DMR (2,375 genes within 10 kb of a DMR) and assessed correlation with transcript abundance
• 277 (of 2,375 tested) had a significant (q<0.01) negative correlation with expression [53 genes exhibited a positive correlation]
• No significant GO enrichments; many genes lack syntenic orthologs in other species (TEs or novel genes)
• Expression of ~0.7% of all genes is associated with nearby DNA methylation variation
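The q<0.01 threshold above implies a multiple-testing correction across the 2,375 gene tests. The slides do not say which procedure produced the q-values, so the following is a minimal sketch that assumes the Benjamini-Hochberg adjustment; the function name and the toy p-values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values (not from the study), one per gene-DMR correlation test.
toy_p = [0.0001, 0.003, 0.02, 0.03, 0.5]
toy_q = benjamini_hochberg(toy_p)
significant = [q < 0.01 for q in toy_q]
print(significant)
```

With thousands of tests, a raw p-value cut-off would admit many false positives, which is why a q-value threshold is the more conservative choice here.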
Functional Consequences of DMRs
[Figure panels: qualitative association; quantitative association]
Outline
• Molecular variation in crop genomes
– Expression variation
– Structural variation
– Epigenetic variation
• Heterosis
What is heterosis?
Heterosis refers to the phenomenon in which hybrid
offspring exhibit characteristics that lie outside the range
of the parents
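Heterosis is usually put on a numerical scale relative to the parents. The sketch below shows two common measures, mid-parent heterosis and best-parent heterosis (BPH, the measure used on later slides). It assumes heterosis is expressed as a percentage and that the "best" parent is the one with the higher trait value, which would need to be flipped for traits where lower is better. The heights are made-up numbers.

```python
def mid_parent_heterosis(parent1, parent2, f1):
    """Percent deviation of the F1 from the mean of the two parents."""
    mid_parent = (parent1 + parent2) / 2.0
    return 100.0 * (f1 - mid_parent) / mid_parent


def best_parent_heterosis(parent1, parent2, f1):
    """Percent deviation of the F1 from the better (here: higher) parent."""
    best_parent = max(parent1, parent2)
    return 100.0 * (f1 - best_parent) / best_parent


# Hypothetical plant heights in cm, for illustration only.
b73_height, mo17_height, f1_height = 200.0, 180.0, 250.0
mph = mid_parent_heterosis(b73_height, mo17_height, f1_height)
bph = best_parent_heterosis(b73_height, mo17_height, f1_height)
print(f"MPH = {mph:.1f}%, BPH = {bph:.1f}%")
```

A positive BPH means the hybrid lies outside the parental range, matching the definition above; a positive mid-parent value alone does not.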
[Figure: Mo17 and B73 inbred parents alongside their F1 hybrids]
Two major goals for research into mechanisms of heterosis
• Goal 1: Improve prediction of ideal hybrid
genotypes. Testing hybrid combinations involves
major cost/effort and improved prediction could
make this process more efficient.
• Goal 2: Develop inbred lines, or approaches, that “capture” phenotypic gains of heterosis.
Observation and quantification of heterosis
• Heterosis is most readily observed and quantified when two
pure-breeding homozygous lines are crossed
• Heterosis is distinct from segregation and transgressive
variation
[Figure: example F2 and example RIL populations, plotted as proportion of population vs. height (cm), with Parent 1, Parent 2 (B73, Mo17) and F1 marked; RILs grouped as short, intermediate and tall]
Heterosis use pros/cons
• Heterosis can generate high levels of uniform production that
can be re-generated each generation and allow for strong
selection in parents
• Heterosis results in complications in seed production and seed
value
• Choosing to use heterosis likely limits breeding progress
Many traits exhibit heterosis
• Measurements of different plant traits in over 400 maize
hybrids provide evidence for prevalent heterosis for many
traits
Flint-Garcia SA, Buckler ES, Tiffin P, Ersoz E, Springer NM. 2009. Heterosis is prevalent for multiple traits in diverse maize germplasm. PLoS ONE 4:e7433
Phenotypic observations about heterosis
Stuber CW, Lincoln SE, Wolff DW, Helentjaris T, Lander ES. 1992. Identification of genetic factors contributing to heterosis in a hybrid from two elite maize inbred lines using molecular markers. Genetics 132:823-39
There are some examples of heterosis due to the effects of a single locus
•Heterosis is generally due to contributions from many loci (QTL mapping studies)
Krieger U, Lippman ZB, Zamir D. 2010. The flowering gene SINGLE FLOWER TRUSS drives heterosis for yield in tomato. Nat. Genet. 42:459-63
B73 Best Parent Heterosis
[Figure: ranking (axis 0 to 14) of hybrids H100 x B73, B84 x B73, F2 x B73, B14a x B73, Mo17 x B73, B77 x B73, H99 x B73, W64a x B73, W22 x B73, Wf9 x B73, B37 x B73, Oh43 x B73 and A188 x B73 for each trait: Final Height, Stalk Diameter, Days to Flower, Number of Chutes, 50 Seed Weight, Kernel Rows, Week 3 Height, Biomass Avg *, Greenhouse height *]
Phenotypic observations about heterosis
•Heterosis is not quantifiable at the organismal level (trait to trait variation)
Phenotypic observations about heterosis
• Different genes likely control heterosis for different traits (lack of correlation for heterosis among traits)
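[Figure: scatter plots of best-parent heterosis (BPH) between traits, e.g. plant height BPH vs. yield BPH and yield BPH vs. cob weight BPH; 115 diverse inbreds each crossed to B73 and Mo17. Trait labels: DistB73, DTT, PlantYield, TSLLEN, TSLBCHCNT, TSLANG, PLTHT, UPLFANG, LEAFWDT, LEAFLEN, RPR, STLKWDT, 10KWeight, CobDiameter, KernelHeight, EarLength, CobWeight, TotKWt]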
Flint-Garcia SA, Buckler ES, Tiffin P, Ersoz E, Springer NM. 2009. Heterosis is prevalent for multiple traits in diverse maize germplasm. PLoS ONE 4:e7433
Phenotypic observations about heterosis
•Heterosis is only partially correlated with genetic diversity
[Figure: plant height BPH and plant yield BPH plotted against genetic diversity (from Hamblin 2008)]
Attempting to understand heterosis
• Genetic basis: dominance, over-dominance, etc.
• Molecular basis: dosage, allele-preference, etc.
• Many possible answers each with some evidence
– little evidence for a common answer
Dominance and over-dominance
• The dominance theory of heterosis posits that inbred lines have mildly deleterious alleles and heterosis is the result of complementation of these defects
• The over-dominance theory of heterosis suggests that heterozygosity per se results in heterosis
• Associated concepts
• Pseudo-over-dominance
• Epistasis
• Birchler and others have encouraged moving past dominance / over-dominance debate to think in more quantitative or systems approaches
Dominance
• Evidence for substantial genetic load (deleterious alleles)
from inbreeding depression and from genomic analyses
• Dominance contribution to heterosis must be through MILD
deleterious alleles and likely to be highly multi-genic.
• Also consider capture of “beneficial” alleles - adaptedness
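The complementation argument can be made concrete with a toy multi-locus model. The sketch below assumes complete dominance (one favourable copy at a locus is as good as two) and equal, additive effects across loci. Both are simplifications chosen for illustration, and the haplotypes are invented.

```python
def inbred_score(haplotype):
    """Number of loci at which a homozygous inbred carries the favourable allele."""
    return sum(haplotype)


def hybrid_score(haplotype1, haplotype2):
    """Under complete dominance, one favourable copy per locus is enough."""
    return sum(1 for a, b in zip(haplotype1, haplotype2) if a or b)


# 1 = favourable allele (+), 0 = mildly deleterious allele (-).
# Invented haplotypes; each inbred is homozygous, so one list per parent.
parent1 = [1, 1, 1, 0, 0, 1, 0, 0]
parent2 = [0, 0, 1, 1, 1, 0, 1, 0]

p1_score = inbred_score(parent1)
p2_score = inbred_score(parent2)
f1_score = hybrid_score(parent1, parent2)
print(p1_score, p2_score, f1_score)
```

In this toy case each parent is favourable at 4 of 8 loci while the hybrid is favourable at 7, so the F1 exceeds both parents without any single locus being over-dominant. The last locus, where both parents carry the deleterious allele, is not rescued.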
[Figure: Parent 1, Parent 2 and hybrid F1]
Arguments against pure-dominance
[Figure: Parent 1 and Parent 2 chromosomes carrying different combinations of favourable (+) and deleterious (-) alleles, alongside the hybrid F1; the ideal recombinant chromosomes may require too many cross-overs]
• Over-dominant action for some loci
• Response: Potential pseudo-overdominance
• Lack of ability to “capture” positive alleles and purge
deleterious alleles
• Response: many genes involved, each with small effects
which may limit ability to purge deleterious alleles
[Figure: hybrid F1 carrying both parental sets of (+) and (-) alleles]
The case for over-dominance
• Loci with over-dominant contribution to phenotype
have been identified (SFT, Erecta, QTL studies)
• Observations of heterosis and inbreeding
depression in polyploids suggest mechanisms
beyond dominance
• Lack of progress in “removing” heterosis and limited
expectations for genetic load
Molecular basis of heterosis
[Figure: Mo17 and B73 parents and their F1 hybrid]
A. What molecular variation exists between parents?
B. What is unique about the hybrid?
No heterosis without variation among parents: understanding variation and how it combines is important for heterosis.
What tissue to survey?
What is unique about the hybrid? Transcriptional levels?
[Figure: expression level (axis 0 to 6) for Parent 1, Parent 2 and potential hybrid expression levels A to E; classes shown: above high parent, high parent-like, mid-parent, low parent-like, below low parent]
Differentially expressed genes
• Since many traits have values outside the parental range it
was expected that many genes would also be expressed
outside the parental range
• Most genes are expressed at levels within parental range
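One way to make the expression classes concrete is a small classifier that places a hybrid value relative to its two parents. This is a sketch under stated assumptions: the tolerance (10% of the parental difference) is arbitrary, the parents are assumed to differ in expression, and real analyses would use replicate-based statistical tests rather than fixed margins. The function and labels are hypothetical.

```python
def classify_hybrid(parent1, parent2, hybrid, tol=0.1):
    """Assign a hybrid expression value to a class relative to its parents.

    tol is an arbitrary margin, expressed as a fraction of the parental difference.
    The parents are assumed to have different expression levels.
    """
    low, high = sorted((parent1, parent2))
    mid = (low + high) / 2.0
    margin = tol * (high - low)
    if hybrid > high + margin:
        return "above high parent"
    if hybrid < low - margin:
        return "below low parent"
    if abs(hybrid - high) <= margin:
        return "high parent-like"
    if abs(hybrid - low) <= margin:
        return "low parent-like"
    if abs(hybrid - mid) <= margin:
        return "mid-parent"
    return "within parental range, non-additive"


# Toy values: parents expressed at 2.0 and 4.0 (arbitrary units).
for value in (5.0, 4.1, 3.0, 2.1, 1.0, 3.5):
    print(value, classify_hybrid(2.0, 4.0, value))
```

Only the first and last classes fall outside the parental range; the observation above is that few genes land there.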
What is unique about the hybrid? How might mid-parent expression levels be beneficial?
•Many mid-parent (additive) expression patterns
•Potential “Goldilocks” effect of gene expression on phenotype
• Genetic action of gene expression phenotype does not equal genetic action of phenotype
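The “Goldilocks” idea can be illustrated with a toy model in which performance peaks at an optimal expression level and falls off on either side. The bell-shaped curve, its optimum and its width are assumptions chosen only to show the logic; nothing here is fitted to data.

```python
import math


def performance(expression, optimum=5.0, width=2.0):
    """Toy bell-shaped curve: 1.0 at the optimum, lower for under- or over-expression."""
    return math.exp(-(((expression - optimum) / width) ** 2))


# Hypothetical inbreds: one under-expresses the gene, the other over-expresses it.
inbred1_expr, inbred2_expr = 2.0, 8.0
# Purely additive (mid-parent) expression in the hybrid.
hybrid_expr = (inbred1_expr + inbred2_expr) / 2.0

perf = {
    "inbred 1": performance(inbred1_expr),
    "inbred 2": performance(inbred2_expr),
    "hybrid": performance(hybrid_expr),
}
for name, value in perf.items():
    print(f"{name}: {value:.3f}")
```

Here the hybrid's expression is exactly additive, yet its performance exceeds that of both parents, which is the sense in which the genetic action of expression need not match the genetic action of the phenotype.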
[Figure: expression level of Gene A, Gene B and Gene C in Inbred 1, Inbred 2 and the hybrid, relative to an optimal expression range flanked by increasingly detrimental over-expression and increasingly detrimental under-expression]
What is unique about the hybrid? Unique genome / transcriptome content
• Hybrids encode more genes and express more genes than
either parent
• Basically a dominance explanation
• How might these genes contribute to heterosis?
[Figure: gene content of Parent 1 and Parent 2]
• Most genes present/expressed in both parents
• Small number of genes unique to each parent
• All genes present / expressed in hybrid
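The complementation of gene content amounts to a set union. A minimal sketch with invented gene identifiers:

```python
# Invented gene identifiers; real analyses would use annotated gene models.
parent1_genes = {"g1", "g2", "g3", "g4", "p1_only"}
parent2_genes = {"g1", "g2", "g3", "g4", "p2_only_a", "p2_only_b"}

shared = parent1_genes & parent2_genes           # present in both parents
unique_to_p1 = parent1_genes - parent2_genes     # presence/absence variants
unique_to_p2 = parent2_genes - parent1_genes
hybrid_genes = parent1_genes | parent2_genes     # the F1 inherits both genomes

print(len(parent1_genes), len(parent2_genes), len(hybrid_genes))
```

The hybrid set is at least as large as either parental set, and strictly larger whenever each parent carries genes the other lacks.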
Improved interactions may lead to improved transition precision
• Genome content variation often affects members of gene families and therefore may lead to subtle perturbations of proper interactions
• Birchler and Veitia have proposed the dosage balance hypothesis
• They propose that having correct interactions in complexes may be critical to achieving proper developmental transitions and stress responses
• As co-evolved gene family members are re-united in hybrids, they may be more efficient at precise transitions in development or in response to stress
• Important to remember that selection has been strong to move from teosinte to maize and to filter out major deleterious alleles
[Figure: gene family members A1, A2, B1 and B2 in Inbred 1, Inbred 2 and the hybrid]
The loss of genes (from genome or transcriptome) may be tolerated due to partial redundancy of paralogs or orthologs. This allows survival of inbred lines lacking genes, but they may “break down” and provide sub-optimal performance, especially during transitions and stress.
[Photo caption: after bear damage, “repaired” for trip home using duct tape]
Heterosis Summary
• Heterosis varies among traits and tissues
• Search for unifying principles among traits and species may not
be successful
• Distinct causes of molecular variation (genome,
transcriptome, epigenome) and distinct modes of action produce
phenotypic heterosis
• Selective pressures and genetic load (history) matter
• Modern day lines represent significant selection upon natural
genetic materials
• Limited utility of heterotic groups
Compare / contrast maize-switchgrass heterosis
• Both allopolyploid outcrossers with large
effective population size – likely abundant
genetic load and on-going fractionation
• Breeding style limitations
• Differences in “domesticated vs wild” are
distinct in the two species
• Peter Hermanson
• Steve Eichten
• Amanda Waters
• Qing Li
• Ruth Swanson-Wagner
• Matthew Vaughn (TACC)
• Jawon Song (TACC)
• Irina Makarevitch (Hamline)
• Damon Lisch (Berkeley)
Iowa State U: Patrick Schnable, Eddy Yeh
NimbleGen: Jeffrey Jeddeloh
U Georgia: Kelly Dawe, Xiaoyu Zhang, Jonathan Gent, Nathaniel Ellis
U of Minnesota: Bob Stupar, Chad Myers, Roman Briskine, Rob Schaefer, Peter Tiffin, Lin Li, Gary Muehlbauer
U of Wisconsin: Shawn Kaeppler, Scott Stelpflug
NSF DBI# 0922095
NSF IOS# 1237931
Modeling of heterosis phenotypes
• Use parental phenotype, genetic distance between parents and environment to model hybrid performance
[Figure: scatter plots of predicted vs. actual values for four traits: A. Cob diameter (CobDia_Est), B. Cob weight (CobWt_Est), C. Plant height (PLTHT_Est), D. Total kernel weight (TotKWt_Est)]
R2 values reported on the panels, in the order they appear (panel assignment is not recoverable from the extracted text):
Population 1 / Population 2 – B73 OC / Population 2 – Mo17 OC
0.70 / 0.73 / 0.70
0.91 / 0.69 / 0.56
0.76 / 0.53 / 0.54
0.74 / 0.65 / 0.55
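The R2 values summarise how well predicted hybrid performance tracks the measured values. The slides do not state how R2 was computed, so the sketch below assumes the squared Pearson correlation between predicted and actual values. The data points are invented.

```python
def r_squared(predicted, actual):
    """Squared Pearson correlation between predicted and actual values."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    sxy = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    syy = sum((a - mean_a) ** 2 for a in actual)
    return (sxy * sxy) / (sxx * syy)


# Invented predicted and measured trait values for four hybrids.
toy_predicted = [1.0, 2.0, 3.0, 4.0]
toy_actual = [1.1, 1.9, 3.2, 3.8]
toy_r2 = r_squared(toy_predicted, toy_actual)
print(f"R2 = {toy_r2:.2f}")
```

An R2 near 0.7, as reported for several panels, would mean the model accounts for about 70% of the variance in hybrid performance under this definition.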
“Adaptedness” concept from Troyer 2006
Flint-Garcia et al., PLoS ONE 2009
Many plant species exhibit heterosis
• Heterosis is also prevalent in many other plant species
although the magnitude and prevalence of heterosis varies
• Note: Actual genetic architecture of heterosis may vary
depending on past selection pressures and natural history
Groszmann M, Greaves IK, Albertyn ZI, Scofield GN, Peacock WJ, Dennis ES. 2011. Changes in 24-nt siRNA levels in Arabidopsis hybrids suggest an epigenetic contribution to hybrid vigor. Proc. Natl. Acad. Sci. USA 108:2617-22
[Figure panels: Rice; Arabidopsis]
Qifa Zhang
Unique expression in hybrids?
• Limited evidence for unique expression levels in hybrids
[Figure: expression level (axis 0 to 6) for Parent 1, Parent 2 and potential hybrid expression levels A to E, with the high-parent level marked]
                       B84xB73      B37xB73      Oh43xB73     Oh43xMo17    Mo17xB73     B73xMo17
# DE genes             290          655          1071         885          1064         1055
# Non-additive         88 (30.3%)   159 (24.3%)  296 (27.6%)  233 (26.3%)  247 (23.2%)  266 (25.2%)
# NA between parents   83           126          232          184          201          209
# HP or LP             5            32           58           47           44           55
# AHP or BLP           0            3            6            2            2            2
Similar results in Guo et al., 2006; Stupar and Springer 2006; Swanson-Wagner 2006
Contrasting results in Auger et al., 2005; Meyer et al., 2007; Uzarowska et al., 2007
[Figure legend: mid-parent level and low-parent level marked; non-additive classes are non-additive between parents, high-parent, low-parent, above high-parent and below low-parent]
Does epigenotype have information beyond genotype for predicting phenotype?
• Epigenotype is more costly to determine than genotype
– Is there novel information in epigenotype for predicting phenotype?
• Remember: Epigenotype will predominantly act through alteration of expression levels
[Diagram relating genotype (SNPs / TEs), epigenotype and environment to phenotype through gene product variation: quantitative variation (altered levels of gene product) and qualitative variation (altered quality of gene product); question marks flag the uncertain links]
Distribution of structural polymorphism in B73/Mo17
Springer et al., PLoS Genetics 2009 Belo et al., TAG 2010
Both shared and unique structural variants
[Figure: 21,000 probes in a 20 Mb region of chromosome 4, compared across Mo17, Hp301 and Tx303. Labelled segments: missing in all 3 genotypes; missing in Mo17 and Tx303; missing in Tx303 only; copy gain in Hp301 and Tx303 (two segments)]
Segregation of Non-Allelic Gene Copies Generates PAVs/CNVs and Novel Phenotypes
Changes in gene
complement among RILs.
Strong statistical support
for association between
gene loss and yield
component traits in IBM
RILs
Liu et al., Plant J. 2012
Frequent unlinked Mo17 copy gains
• 4,994 probes detect Mo17-specific sequence duplications
• Could be local or unlinked copy gains in Mo17
60% unlinked (trans), 10% linked (cis), 30% unassigned
Most Mo17 copy number gains occur at unlinked genomic positions
[Figure: B73 (B) and Mo17 (M) signal by NIL type and genotype at locus, for unlinked duplications and linked duplications; examples shown: AC186656, AC194260, AC191373 and AC198648]
Eichten et al., Plant Phys. 2011
Non-Mendelian gene expression variation in maize
• RNAseq analysis of expression in ~100 RILs
• Most genes have expected patterns (normal or bi-modal distribution)
• ~150 examples of paramutation-like patterns
• ~200 genes with unexpected patterns of presence-absence for transcripts
Lin Li, Gary Muehlbauer : Li et al., PLoS Genetics 2013
[Figure: proportion of genes in each d/a bin; d/a ratio axis from <-2.0 to >2.0, with low-parent, mid-parent and high-parent levels marked]
• The majority of genes exhibit hybrid expression levels within the parental range (94%)
• Similar distributions of additive and non-additive expression for different hybrids
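The d/a ratio places each gene's hybrid expression on a common scale. The sketch below assumes the usual quantitative-genetics convention: a is half the difference between the parents, d is the hybrid's deviation from the mid-parent value, and positive values point toward the high parent. The slides do not spell out the convention, and the expression values are invented.

```python
def d_over_a(parent1, parent2, hybrid):
    """Return d/a: 0 = mid-parent, +1 = high parent, -1 = low parent.

    Assumes the parents differ in expression, so that a is non-zero.
    """
    low, high = sorted((parent1, parent2))
    mid_parent = (low + high) / 2.0
    a = (high - low) / 2.0       # additive effect
    d = hybrid - mid_parent      # deviation of the hybrid from mid-parent
    return d / a


# Invented (parent 1, parent 2, hybrid) expression values for five genes.
toy_genes = [(2.0, 4.0, 3.0), (2.0, 4.0, 4.0), (2.0, 4.0, 2.5),
             (2.0, 4.0, 5.0), (10.0, 6.0, 8.5)]
ratios = [d_over_a(p1, p2, f1) for p1, p2, f1 in toy_genes]
within_range = sum(1 for r in ratios if -1.0 <= r <= 1.0) / len(ratios)
print(ratios, within_range)
```

Genes with |d/a| at or below 1 lie within the parental range; the 94% figure above corresponds to that fraction in the real data.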
[Figure: d/a distributions for hybrids B84xB73, B37xB73, Oh43xB73, Oh43xMo17, Mo17xB73 and B73xMo17; inset shows expression level (axis 0 to 6) for Parent 1, Parent 2 and hybrid classes A to E]
Heterosis and genome content variation
• Content variation may be a potential contributor to heterosis
– Hybrids contain more genes and express more genes than either parent
[Figure labels: NSS, PVP, SS]
How do B73 and Mo17 genomes vary?
• SNPs (coding and non-coding)
• InDels (including transposons)
• Copy number variation (and PAV)
• Epigenetic information
[Figure labels: B73, Mo17, B x M, M x B]
What happens in hybrids?
• Majority of B73 vs Mo17 DMRs show mid-parent methylation levels in hybrid F1s
• 5-10 DMRs show high-parent methylation state
[Figure scale: more Mo17-like to more B73-like]
Genome-wide Assessment of DNA methylation
• 1.1 million experimental probes placed every 200 bp
– single-copy
– corrected for CGH effects
• meDIP-chip (5mC) and ChIP-chip for H3K9me2 and H3K27me3
– Antibody pulldown of methylated DNA (not context-specific) contrasted against control gDNA
• Assess relative methylation enrichment across low-copy space of maize genome
[Figure tracks: analysis, B73 methylation, Mo17 methylation, genes, repeats]
Matt Vaughn, TACC
DNA Methylation variation is prevalent between genotypes, but not between tissues
DNA methylation
Eichten et al., Plant Genome 2012
H3K27me3
Makarevitch et al., Plant Cell 2013
Maize epigenomic profiling: genome-wide distribution
• 5mC and H3K9me2 largely overlapping and enriched in pericentromeric regions.
• H3K9me2 rarely found within genes
• H3K27me3 enriched in chromosome arms and often in genes.
[Figure scale bar: ~100 kb]
DNA methylation differences following domestication
[Figure: methylation (hypomethylation to hypermethylation) across maize, landrace and teosinte for 172 maize–teosinte DMRs; diagram labels in order: 3720, Rare & Common DMRs, 149, Teosinte-specific DMRs, 23]
• Some DNA methylation differences between maize and teosinte
• Few are fixed differences in maize / teosinte
• Small number of maize-teosinte DMRs overlap with domestication regions or maize-teosinte DE genes
How does heritable information vary among individuals of a species?
• Expected to occur primarily through SNPs and small InDels that result in:
– Qualitative variation (different proteins)
– Quantitative variation (different amount of mRNA or protein)
• But other types of variation exist as well
Genome content summary
• High levels of variation for genome content
– Some association with heterosis
– Potential on-going fractionation
• Implications for genome structure and plant breeding
– Hybrids have more genes than inbreds
– Extra gene fragments segregate in populations
– Non-colinearity within a species
– May require pan-genome sequencing strategies to capture species gene content
Potential transcriptome complementation in hybrids
• Numerous genes expressed in some inbreds but not others
• Some exhibit tissue-specific absence and others are absent in all tissues tested
• Results in higher numbers of genes being expressed in hybrid
• Some are due to differences in expression, others due to genome content differences
[Figure: B73 expression level vs. Mo17 expression level; normalized average signal (axis 0 to 1.8) for AF520911 on the 70mer and Affy platforms across B14, B37, B73, B84, Mo17, Oh43, W22 and Wf9]
Analogous to Fu and Dooner (2002) suggestion about genomic differences
Many additional PA transcriptome patterns documented in Hansey et al., PLoS One 2012