Variation in crop genomes and heterosis Nathan Springer


Plant breeding relies upon variation

What are the molecular variants that underlie phenotypic diversity?

• SNPs

• InDels

– CNV/PAV

• Transposons

• Epigenetics

• Expression levels

How prevalent are these types of variation? How do they behave in breeding?

Variation: heterosis and transgressive segregation

• Transgressive variation is basis for much of classical breeding efforts

• Apparent phenotypic similarity does not indicate similar genetic mechanisms

[Figure: distributions of F2 and RIL populations relative to B73, Mo17 and the F1; RILs grouped as short, intermediate and tall; y-axis: proportion of population]

Outline

• Molecular variation in crop genomes

– Expression variation

– Structural variation

– Epigenetic variation

• Heterosis

[Figure: expression of example genes (Gene Q, Gene S, Gene T) in B73 and Mo17, and of example genes (Gene A, Gene B, Gene C) in inbred lines B14, B37, B73, B84, Mo17, Oh43, W22 and Wf9; y-axis: expression]

Transcriptome differences among parents

• Comparisons of different individuals of the same species reveal a surprising number of gene expression differences

% Differentially expressed genes (from 12,327 expressed genes)

        B37     B73     B84     Mo17    Oh43    W22     Wf9
B14A    11.8%   8.5%    6.5%    16.2%   16.6%   12.0%   11.7%
B37     -       11.0%   9.5%    14.2%   12.7%   12.1%   11.8%
B73     -       -       4.3%    15.6%   14.7%   13.1%   12.0%
B84     -       -       -       15.7%   15.6%   13.5%   11.3%
Mo17    -       -       -       -       15.0%   15.0%   14.2%
Oh43    -       -       -       -       -       14.2%   12.1%
W22     -       -       -       -       -       -       11.0%

• These differentially expressed genes include many examples of genes that are only expressed in some genotypes

History of structural variation studies in maize

Kato et al., 2004 PNAS

Brunner et al. 2005 Plant Cell

Sequencing of multiple haplotypes: Dooner, Rafalski, Morgante, Schnable, Messing

Gain/loss in Hp301, Tx303 and teosinte

[Figure: sequence gain and loss in Mo17, Hp301, Tx303 and teosinte (Teo); legend: gain, loss. Credit: Kai Ying, ISU]

• Many gain/loss sequences in Hp301, Tx303 and teosinte

• A significant amount of the reference genome is missing in each line

– Hp301: 24 Mbp

– Tx303: 29 Mbp

– Mo17: 25 Mbp

Gene-centric array based analysis of structural variation in diverse maize

• 24 diverse maize lines (4 SS / 6 NSS / 5 tropical / 6 PVP / 3 mixed)

• 14 teosinte genotypes (4 TIL / 10 wild individuals)

Swanson-Wagner et al., Genome Res. 2010

[Figure: number of genotypes with each variant (y-axis) against physical position in Mb (x-axis) for Chr9 and Chr10. Classes: Not Sig, n=28,675; PAV / DownCNV, n=3,334; UpCNV, n=402; Both, n=76]

Functional implications of structural variants

• Many, but certainly not all, genes with structural variation are “Unclassified”

• “Classical” maize genes (Schnable, Freeling) – 24 / 420 tested (4 CNV and 20 PAV)

• Transcription factors (GRASSIUS) – 98 / 1,723 tested (7 CNV and 91 PAV)

Structural variation contributes significantly to quantitative trait variation in maize (Chia et al., 2012 Nat Genetics)

Examples of CNV affecting important traits in plants

Cold tolerance in barley (Knox et al TAG 2010)

SCN resistance in soybean (Cook et al. Science 2012)

Flowering time in wheat (Diaz et al 2012 PLoS One)

Herbicide resistance in weeds (Gaines et al., 2010 PNAS)

Potential causes of CNV

Potential sources of dispersed duplicates:

1. Transposition

2. On-going fractionation of syntenic regions (Schnable et al., 2011 PNAS)

CoGePedia

Reference genome sequence

Pan-genome?

Epigenetics - definitions

• Epigenetics: Heritable information not solely due to DNA sequence

• Mitotic memory: Development; response to environment

• Meiotic / trans-generational memory: Silencing of TEs; heritable variation

• Chromatin modifications (DNA methylation / histone modifications) are often a mechanism of epigenetic memory but are not necessarily epigenetic

Epigenetics can contribute to natural variation

• Tip of the iceberg or rare form of variation?

• Most examples of trans-generational epigenetic regulation are variable within the species

– Arabidopsis: SUP, PAI, BAL

– Maize: B, P, C, Pl, R

• Epigenetically silenced alleles may represent genes on the path to genomic removal via genetic mechanisms

Morgan et al., 1999

Cubas et al., 1999 Chandler and Stam 2004

DMRs in diverse maize genotypes

DNA methylation diversity in maize

[Figure: hierarchical clustering of 1,754 rare DMRs and 1,966 common DMRs across maize, landrace and teosinte genotypes; color scale: hypomethylation to hypermethylation]

• Rare “loss” of DNA methylation is more common than rare “gain”

• Diversity of the epigenome mirrors genomic diversity

Functional Consequences of DMRs

[Figure: NAM inbred 5mC compared with NAM inbred RNA-seq]

~40M RNA-seq reads (tissue matched) for each NAM parental inbred

Compare transcript abundance and DNA methylation variation

• Identified the nearest gene to each DMR (2,375 genes within 10 kb of a DMR) and assessed correlation with transcript abundance

• 277 (of 2,375 tested) had a significant (q<0.01) negative correlation with expression [53 genes exhibited a positive correlation]

• No significant GO enrichments; many genes lack syntenic orthologs in other species (TEs or novel genes)

• Expression of ~0.7% of all genes is associated with nearby DNA methylation variation
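The comparison of DNA methylation and transcript abundance can be sketched in a few lines. This is a minimal illustration and not the pipeline behind the slides: it assumes a Pearson correlation across genotypes and a Benjamini-Hochberg adjustment to obtain q-values, neither of which is stated in the source. The function names and all numbers are hypothetical, and computing a p-value for each correlation is left to a statistics library.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Toy data (hypothetical): methylation at one DMR and transcript abundance
# of the nearest gene in six inbred lines.
methylation = [0.9, 0.8, 0.85, 0.1, 0.2, 0.15]
expression = [1.0, 2.0, 1.5, 9.0, 8.0, 8.5]
r = pearson(methylation, expression)  # negative: methylated lines express less

# Hypothetical per-gene p-values from the correlation tests.
qvalues = benjamini_hochberg([0.001, 0.04, 0.03, 0.0005])
significant = [q < 0.01 for q in qvalues]  # same threshold as the slide
```

A negative `r` with a q-value below the threshold corresponds to the "negative correlation with expression" class reported on the slide.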

Functional Consequences of DMRs

Qualitative association Quantitative association

Outline

• Molecular variation in crop genomes

– Expression variation

– Structural variation

– Epigenetic variation

• Heterosis

What is heterosis?

Heterosis refers to the phenomenon in which hybrid offspring exhibit characteristics that lie outside the range of the parents
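One common way to put a number on this definition is to compare the F1 with the parental mean (mid-parent heterosis) or with the better parent (best-parent heterosis, the BPH used in later slides). The sketch below assumes the usual percentage formulas and assumes that a larger trait value is "better"; the function names and the example plant heights are hypothetical.

```python
def mid_parent_heterosis(parent1, parent2, f1):
    """Percent deviation of the F1 from the mean of the two parents."""
    mid_parent = (parent1 + parent2) / 2
    return (f1 - mid_parent) / mid_parent * 100


def best_parent_heterosis(parent1, parent2, f1):
    """Percent deviation of the F1 from the higher-valued parent.

    Assumes larger is better; for traits such as days to flowering the
    "best" parent may be the one with the smaller value.
    """
    best_parent = max(parent1, parent2)
    return (f1 - best_parent) / best_parent * 100


# Hypothetical plant heights in cm for two inbreds and their hybrid.
mph = mid_parent_heterosis(200.0, 180.0, 250.0)
bph = best_parent_heterosis(200.0, 180.0, 250.0)
```

A positive best-parent value means the hybrid lies outside the parental range, which is the situation the definition describes.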

[Figure: Mo17, B73 and their F1 hybrid]

Two major goals for research into mechanisms of heterosis

• Goal 1: Improve prediction of ideal hybrid genotypes. Testing hybrid combinations involves major cost/effort and improved prediction could make this process more efficient.

• Goal 2: Develop inbred lines, or approaches, that “capture” phenotypic gains of heterosis.

Observation and quantification of heterosis

• Heterosis is most readily observed and quantified when two pure-breeding homozygous lines are crossed

• Heterosis is distinct from segregation and transgressive variation

[Figure: height (cm) distributions for an example F2 and an example RIL population, with Parent 1 (B73), Parent 2 (Mo17) and the F1 marked; RILs grouped as short, intermediate and tall; y-axis: proportion of population]

Heterosis use pros/cons

• Heterosis can generate high levels of uniform production that can be re-generated each generation and allow for strong selection in parents

• Heterosis results in complications in seed production and seed value

• Choosing to use heterosis likely limits breeding progress

Many traits exhibit heterosis

• Measurements of different plant traits in over 400 maize hybrids provide evidence for prevalent heterosis for many traits

Flint-Garcia SA, Buckler ES, Tiffin P, Ersoz E, Springer NM. 2009. Heterosis is prevalent for multiple traits in diverse maize germplasm. PLoS ONE 4:e7433

Phenotypic observations about heterosis

Stuber CW, Lincoln SE, Wolff DW, Helentjaris T, Lander ES. 1992. Identification of genetic factors contributing to heterosis in a hybrid from two elite maize inbred lines using molecular markers. Genetics 132:823-39

There are some examples of heterosis due to the effects of a single locus

•Heterosis is generally due to contributions from many loci (QTL mapping studies)

Krieger U, Lippman ZB, Zamir D. 2010. The flowering gene SINGLE FLOWER TRUSS drives heterosis for yield in tomato. Nat. Genet. 42:459-63

[Figure: B73 best parent heterosis. Ranking (y-axis, 0 to 14) of the hybrids H100 x B73, B84 x B73, F2 x B73, B14a x B73, Mo17 x B73, B77 x B73, H99 x B73, W64a x B73, W22 x B73, Wf9 x B73, B37 x B73, Oh43 x B73 and A188 x B73 for each trait: final height, stalk diameter, days to flower, number of chutes, 50 seed weight, kernel rows, week 3 height, biomass avg *, greenhouse height *]

Phenotypic observations about heterosis

•Heterosis is not quantifiable at the organismal level (trait to trait variation)

Phenotypic observations about heterosis

• Different genes likely control heterosis for different traits (lack of correlation for heterosis for different traits)

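[Figure: comparison of best-parent heterosis (BPH) among traits, including plant height BPH vs yield BPH and yield BPH vs cob weight BPH, for 115 diverse inbreds each crossed to B73 and Mo17. Trait labels: DistB73, DTT, PlantYield, TSLLEN, TSLBCHCNT, TSLANG, PLTHT, UPLFANG, LEAFWDT, LEAFLEN, RPR, STLKWDT, 10KWeight, CobDiameter, KernelHeight, EarLength, CobWeight, TotKWt]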

Flint-Garcia SA, Buckler ES, Tiffin P, Ersoz E, Springer NM. 2009. Heterosis is prevalent for multiple traits in diverse maize germplasm. PLoS ONE 4:e7433

Phenotypic observations about heterosis

•Heterosis is only partially correlated with genetic diversity

[Figure: plant height BPH and plant yield BPH (y-axes) against genetic diversity (x-axis; from Hamblin 2008)]

Attempting to understand heterosis

• Genetic basis; dominance, over-dominance, etc

• Molecular basis; dosage, allele-preference, etc

• Many possible answers each with some evidence

– little evidence for a common answer

Dominance and over-dominance

• The dominance theory of heterosis posits that inbred lines have mildly deleterious alleles and heterosis is the result of complementation of these defects

• The over-dominance theory of heterosis suggests that heterozygosity per se results in heterosis

• Associated concepts

• Pseudo-over-dominance

• Epistasis

• Birchler and others have encouraged moving past the dominance / over-dominance debate to think in terms of more quantitative or systems approaches
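The dominance theory can be made concrete with a toy model. The sketch below assumes complete dominance (a locus reduces the trait only when both alleles are deleterious), equal and additive penalties across loci, and two made-up inbred haplotypes; none of these values come from the slides.

```python
def inbred(haplotype):
    """Homozygous diploid genotype built from one haplotype string."""
    return [(allele, allele) for allele in haplotype]


def hybrid(haplotype1, haplotype2):
    """F1 genotype carrying one haplotype from each inbred parent."""
    return list(zip(haplotype1, haplotype2))


def trait_value(genotype, base=100.0, penalty=2.0):
    """Toy trait: each locus homozygous for '-' subtracts a small penalty.

    Complete dominance is assumed, so a single '+' allele fully masks '-'.
    """
    deleterious = sum(1 for a, b in genotype if a == "-" and b == "-")
    return base - penalty * deleterious


# Hypothetical haplotypes: '+' functional allele, '-' mildly deleterious.
parent1 = "+++--++-"
parent2 = "--+++--+"

value_p1 = trait_value(inbred(parent1))           # 3 homozygous '-' loci
value_p2 = trait_value(inbred(parent2))           # 4 homozygous '-' loci
value_f1 = trait_value(hybrid(parent1, parent2))  # every '-' is complemented
```

In this toy case the F1 outperforms both parents purely through complementation, with no locus showing over-dominance.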

Dominance

• Evidence for substantial genetic load (deleterious alleles) from inbreeding depression and from genomic analyses

• Dominance contribution to heterosis must be through MILD deleterious alleles and is likely to be highly multi-genic

• Also consider capture of “beneficial” alleles - adaptedness

[Figure: Parent 1, Parent 2 and hybrid F1]

Arguments against pure-dominance

[Figure: chromosomes of Parent 1, Parent 2 and the hybrid F1 carrying favorable (+) and deleterious (-) alleles. Note on the figure: producing the ideal recombinant chromosomes may require too many cross-overs]

• Over-dominant action for some loci

• Response: Potential pseudo-overdominance

• Lack of ability to “capture” positive alleles and purge deleterious alleles

• Response: many genes involved, each with small effects, which may limit ability to purge deleterious alleles

The case for over-dominance

• Loci with over-dominant contribution to phenotype have been identified (SFT, Erecta, QTL studies)

• Observations of heterosis and inbreeding depression in polyploids suggest mechanisms beyond dominance

• Lack of progress in “removing” heterosis and limited expectations for genetic load

Molecular basis of heterosis

[Figure: Mo17, B73 and their F1 hybrid]

A. What molecular variation exists between parents?

B. What is unique about the hybrid?

No heterosis without variation among parents: understanding variation and how it combines is important for heterosis.

What tissue to survey?

What is unique about the hybrid? Transcriptional levels?

[Figure: potential hybrid expression levels (A to E) relative to Parent 1 and Parent 2; y-axis: expression level, 0 to 6. Categories: above high parent, high parent-like, mid-parent, low parent-like, below low parent]

Differentially expressed genes

• Since many traits have values outside the parental range, it was expected that many genes would also be expressed outside the parental range

• Most genes are expressed at levels within the parental range

What is unique about the hybrid? How might mid-parent expression levels be beneficial?

•Many mid-parent (additive) expression patterns

•Potential “Goldilocks” effect of gene expression on phenotype

• Genetic action of gene expression phenotype does not equal genetic action of phenotype
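The “Goldilocks” idea can be illustrated with a toy calculation. The sketch below assumes a made-up performance curve that peaks at an optimal expression level and falls off quadratically on either side; the curve, the optimum and the expression values are all hypothetical and serve only to show how additive (mid-parent) expression can still give a non-additive phenotype.

```python
def performance(expression, optimum=5.0, tolerance=2.0):
    """Toy performance score: 1.0 at the optimum, dropping quadratically.

    Scores are floored at 0.0 for strongly over- or under-expressed genes.
    """
    return max(0.0, 1.0 - ((expression - optimum) / tolerance) ** 2)


def mid_parent(inbred1, inbred2):
    """Additive hybrid expression: the mean of the two parents."""
    return (inbred1 + inbred2) / 2


# Case 1: the parents bracket the optimum, so the hybrid lands on it.
bracket = (3.5, 6.5)
hybrid_bracket = performance(mid_parent(*bracket))    # 1.0
parents_bracket = [performance(x) for x in bracket]   # both 0.4375

# Case 2: both parents sit at or above the optimum, so the hybrid is
# intermediate in performance rather than superior.
one_sided = (5.0, 8.0)
hybrid_one_sided = performance(mid_parent(*one_sided))
parents_one_sided = [performance(x) for x in one_sided]
```

Under these assumptions a gene shows phenotypic heterosis only when the two inbreds lie on opposite sides of the optimal range, even though hybrid expression is strictly additive in both cases.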

[Figure: expression level (y-axis) of Gene A, Gene B and Gene C in Inbred 1, Inbred 2 and the hybrid, relative to an optimal expression range, with increasingly detrimental over-expression above it and increasingly detrimental under-expression below it]

What is unique about the hybrid? Unique genome / transcriptome content

• Hybrids encode more genes and express more genes than either parent

• Basically a dominance explanation

• How might these genes contribute to heterosis?

[Figure: overlap of gene content between Parent 1 and Parent 2]

• Most genes present/expressed in both parents

• Small number of genes unique to each parent

• All genes present / expressed in hybrid
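The gene-content argument amounts to simple set arithmetic: the hybrid carries the union of the two parental gene sets. The sketch below uses made-up gene names and a hypothetical helper function to show the bookkeeping.

```python
def hybrid_gene_content(parent1_genes, parent2_genes):
    """Summarize gene content of an F1 from the two parental gene sets."""
    return {
        "shared": parent1_genes & parent2_genes,
        "parent1_only": parent1_genes - parent2_genes,
        "parent2_only": parent2_genes - parent1_genes,
        "hybrid": parent1_genes | parent2_genes,  # union of both parents
    }


# Hypothetical gene sets: most genes shared, a few unique to each parent.
parent1_genes = {"geneA", "geneB", "geneC", "geneD"}
parent2_genes = {"geneA", "geneB", "geneC", "geneE"}

content = hybrid_gene_content(parent1_genes, parent2_genes)
hybrid_count = len(content["hybrid"])
```

Here the hybrid has five genes while each parent has four, which is the sense in which hybrids encode more genes than either parent.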

Improved interactions may lead to improved transition precision

• Genome content variation often affects members of gene families and therefore may lead to subtle perturbations of proper interactions

• Birchler and Veitia have proposed the dosage balance hypothesis

• They propose that having correct interactions in complexes may be critical to achieving proper developmental transitions and stress response

• As co-evolved gene family members are re-united in hybrids they are more efficient at precise transitions in development or in response to stress

• Important to remember that selection has been strong to move from teosinte to maize and to filter out major deleterious alleles

[Figure: gene family members A1, A2, B1 and B2 present in Inbred 1, Inbred 2 and the hybrid]

The loss of genes (from genome or transcriptome) may be tolerated due to partial redundancy of paralogs or orthologs. This allows survival of inbred lines lacking genes, but these lines may “break down” and provide sub-optimal performance, especially during transitions and stress.

After bear damage “repaired” for trip home using duct tape

Heterosis Summary

• Heterosis varies among traits and tissues

• Search for unifying principles among traits and species may not be successful

• Distinct mechanisms/causes of molecular variation (genome, transcriptome, epigenome) and of their action to produce phenotypic heterosis

• Selective pressures and genetic load (history) matter

• Modern day lines represent significant selection upon natural genetic materials

• Limited utility of heterotic groups

Compare / contrast maize-switchgrass heterosis

• Both are allopolyploid outcrossers with large effective population size: likely abundant genetic load and on-going fractionation

• Breeding style limitations

• Differences in “domesticated vs wild” are distinct in the two species

Peter Hermanson, Steve Eichten, Amanda Waters, Qing Li, Ruth Swanson-Wagner, Matthew Vaughn (TACC), Jawon Song (TACC), Irina Makarevitch (Hamline), Damon Lisch (Berkeley)

Iowa State U: Patrick Schnable, Eddy Yeh

NimbleGen: Jeffrey Jeddeloh

U Georgia: Kelly Dawe, Xiaoyu Zhang, Jonathan Gent, Nathaniel Ellis

U of Minnesota: Bob Stupar, Chad Myers, Roman Briskine, Rob Schaefer, Peter Tiffin, Lin Li, Gary Muehlbauer

U of Wisconsin: Shawn Kaeppler, Scott Stelpflug

NSF DBI# 0922095

NSF IOS# 1237931

Modeling of heterosis phenotypes

• Use parental phenotype, genetic distance between parents and environment to model hybrid performance
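Agreement between modeled and observed hybrid performance is summarized on this slide as R2. The slides do not say how R2 was computed; the sketch below assumes the standard coefficient of determination, and the function name and the numbers are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_residual / ss_total


# Hypothetical observed and modeled trait values for four hybrids.
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
fit = r_squared(actual, predicted)
```

A value near 1 means the model explains most of the variation among hybrids; a value near 0 means it does no better than predicting the mean.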

[Figure: scatter plots of predicted (x-axis) vs actual (y-axis) values for A. cob diameter, B. cob weight, C. plant height and D. total kernel weight]

Population 1 (R2 = 0.70)

Population 2 – B73 OC (R2 = 0.73)

Population 2 – Mo17 OC (R2 = 0.70)

Population 1 (R2 = 0.91)

Population 2 – B73 OC (R2 = 0.69)

Population 2 – Mo17 OC (R2 = 0.56)

Population 1 (R2 = 0.76)

Population 2 – B73 OC (R2 = 0.53)

Population 2 – Mo17 OC (R2 = 0.54)

Population 1 (R2 = 0.74)

Population 2 – B73 OC (R2 = 0.65)

Population 2 – Mo17 OC (R2 = 0.55)

“Adaptedness” concept from Troyer 2006

Flint-Garcia et al. PLoSOne 2009

Many plant species exhibit heterosis

• Heterosis is also prevalent in many other plant species although the magnitude and prevalence of heterosis varies

• Note: Actual genetic architecture of heterosis may vary depending on past selection pressures and natural history

Groszmann M, Greaves IK, Albertyn ZI, Scofield GN, Peacock WJ, Dennis ES. 2011. Changes in 24-nt siRNA levels in Arabidopsis hybrids suggest an epigenetic contribution to hybrid vigor. Proc. Natl. Acad. Sci. USA 108:2617-22

[Figure: examples of heterosis in rice and Arabidopsis; credit: Qifa Zhang]

Unique expression in hybrids?

• Limited evidence for unique expression levels in hybrids

[Figure: potential hybrid expression levels (A to E) relative to Parent 1 and Parent 2; y-axis: expression level, 0 to 6]

                       B84xB73      B37xB73      Oh43xB73     Oh43xMo17    Mo17xB73     B73xMo17
# DE genes             290          655          1071         885          1064         1055
# Non-additive         88 (30.3%)   159 (24.3%)  296 (27.6%)  233 (26.3%)  247 (23.2%)  266 (25.2%)
# NA between parents   83           126          232          184          201          209
# HP or LP             5            32           58           47           44           55
# AHP or BLP           0            3            6            2            2            2

Similar results in Guo et al., 2006; Stupar and Springer 2006; Swanson-Wagner 2006

Contrasting results in Auger et al., 2005; Meyer et al., 2007; Uzarowska et al., 2007

[Figure legend: mid-parent level; high-parent (HP) level; low-parent (LP) level; above high-parent (AHP); below low-parent (BLP); non-additive (NA) between parents]

Does epigenotype have information beyond genotype for predicting phenotype?

• Epigenotype is more costly to determine than genotype

– Is there novel information in epigenotype for predicting phenotype?

• Remember: Epigenotype will predominantly act through alteration of expression levels

[Diagram: genotype (SNPs / TEs), environment and epigenotype feed into gene product variation, which is either quantitative (altered levels of gene product) or qualitative (altered quality of gene product) and leads to phenotype; two links are marked “?”]

Distribution of structural polymorphism in B73/Mo17

Springer et al., PLoS Genetics 2009 Belo et al., TAG 2010

Both shared and unique structural variants

[Figure: 21,000 probes in a 20 Mb region of chromosome 4 compared across Mo17, Hp301 and Tx303. Annotated segments: missing in all 3 genotypes; missing in Mo17 and Tx303; missing in Tx303 only; copy gain in Hp301 and Tx303]

Novel Hybridization Patterns

Apparent “de novo” CNV in RILs

Segregation of Non-Allelic Gene Copies Generates PAVs/CNVs and Novel Phenotypes

Changes in gene complement among RILs. Strong statistical support for association between gene loss and yield component traits in IBM RILs

Liu et al., Plant J. 2012

Frequent unlinked Mo17 copy gains

• 4,994 probes detect Mo17-specific sequence duplications

• Could be local or unlinked copy gains in Mo17

60% unlinked (trans), 10% linked (cis), 30% unassigned

Most Mo17 copy number gains occur at unlinked genomic positions

[Figure: scatter plots for unlinked duplications and linked duplications, grouped by NIL type (B73, Mo17) and genotype at locus (B, M); examples: AC186656, AC194260, AC191373, AC198648]

Eichten et al., Plant Phys. 2011

Non-Mendelian gene expression variation in maize

• RNAseq analysis of expression in ~100 RILs

• Most genes have expected patterns (normal or bi-modal distribution)

• ~150 examples of paramutation-like patterns

• ~200 genes with unexpected patterns of presence-absence for transcripts

Lin Li, Gary Muehlbauer : Li et al., PLoS Genetics 2013

[Figure: proportion of genes in each d/a bin (y-axis) against d/a ratio (x-axis, from <-2.0 to >2.0), with low-parent, mid-parent and high-parent levels marked]

• The majority of genes exhibit hybrid expression levels within the parental range (94%)

• Similar distributions of additive and non-additive expression for different hybrids
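The d/a ratio used for these distributions can be sketched as follows. This is a minimal illustration under stated assumptions: a is taken as half the difference between the high and low parent, d as the deviation of the hybrid from the mid-parent value, and the ratio is oriented so that +1 is the high-parent level and -1 the low-parent level. The tolerance-based binning and the function names are hypothetical; the original analyses relied on statistical tests rather than fixed cut-offs.

```python
def d_over_a(parent1, parent2, hybrid):
    """d/a ratio: 0 at mid-parent, +1 at high parent, -1 at low parent."""
    high, low = max(parent1, parent2), min(parent1, parent2)
    additive = (high - low) / 2          # a: half the parental difference
    if additive == 0:
        raise ValueError("parents must differ in expression")
    dominance = hybrid - (high + low) / 2  # d: deviation from mid-parent
    return dominance / additive


def classify_hybrid_expression(ratio, tolerance=0.25):
    """Bin a d/a ratio into the categories used on the slides."""
    if ratio > 1 + tolerance:
        return "above high-parent"
    if ratio < -1 - tolerance:
        return "below low-parent"
    if abs(ratio) <= tolerance:
        return "mid-parent"
    if abs(ratio - 1) <= tolerance:
        return "high-parent"
    if abs(ratio + 1) <= tolerance:
        return "low-parent"
    return "non-additive between parents"


# Hypothetical expression values: parents at 2 and 6, hybrids at 4 and 7.5.
additive_case = classify_hybrid_expression(d_over_a(2.0, 6.0, 4.0))
outside_case = classify_hybrid_expression(d_over_a(2.0, 6.0, 7.5))
```

Ratios between -1 and +1 correspond to hybrid expression within the parental range, which is where the slide places the majority of genes.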

[Figure legend: hybrids B84xB73, B37xB73, Oh43xB73, Oh43xMo17, Mo17xB73 and B73xMo17. Inset: potential hybrid expression levels (A to E) relative to Parent 1 and Parent 2; y-axis: expression level, 0 to 6]

Heterosis and genome content variation

• Content variation may be a potential contributor to heterosis

– Hybrids contain more genes and express more genes than either parent

[Figure labels: NSS, PVP, SS]

How do B73 and Mo17 genomes vary?

• SNPs (coding and non-coding)

• InDels (including transposons)

• Copy number variation (and PAV)

• Epigenetic information

What happens in hybrids?

[Figure: methylation at B73 vs Mo17 DMRs in B73, Mo17 and the reciprocal hybrids (B x M, M x B); scale from more Mo17-like to more B73-like]

• Majority of B73 vs Mo17 DMRs show mid-parent methylation levels in hybrid F1s

• 5-10 DMRs show high-parent methylation state

Genome-wide Assessment of DNA methylation

• 1.1 million experimental probes placed every 200 bp

– single-copy

– corrected for CGH effects

• meDIP-chip (5mC) and ChIP-chip for H3K9me2 and H3K27me3

– Antibody pulldown of methylated DNA (not context-specific) contrasted against control gDNA

• Assess relative methylation enrichment across low-copy space of maize genome

[Figure tracks: analysis, B73 methylation, Mo17 methylation, genes, repeats. Credit: Matt Vaughn, TACC]

DNA Methylation variation is prevalent between genotypes, but not between tissues

DNA methylation

Eichten et al., Plant Genome 2012

H3K27me3

Makarevitch et al., Plant Cell 2013

Maize epigenomic profiling: genome-wide distribution

• 5mC and H3K9me2 are largely overlapping and enriched in pericentromeric regions

• H3K9me2 is rarely found within genes

• H3K27me3 is enriched in chromosome arms and often found in genes

[Figure scale bar: ~100 kb]

DNA methylation differences following domestication

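[Figure: methylation heat map of maize, landrace and teosinte genotypes (hypomethylation to hypermethylation), with a Venn diagram. Counts: 3,720 rare & common DMRs; 172 maize-teosinte DMRs; 149 in the overlap; 23 teosinte-specific DMRs]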

• Some DNA methylation differences between maize and teosinte

• Few are fixed differences in maize / teosinte

• Small number of maize-teosinte DMRs overlap with domestication regions or maize-teosinte DE genes

How does heritable information vary among individuals of a species?

• Expected to occur primarily through SNPs and small InDels that result in:

– Qualitative variation (different proteins)

– Quantitative variation (different amount of mRNA or protein)

• But other types of variation exist as well

Genome content summary

• High levels of variation for genome content

– Some association with heterosis

– Potential on-going fractionation

• Implications for genome structure and plant breeding

– Hybrids have more genes than inbreds

– Extra gene fragments segregate in populations

– Non-colinearity within a species

– May require pan-genome sequencing strategies to capture species gene content

Potential transcriptome complementation in hybrids

• Numerous genes expressed in some inbreds but not others

• Some exhibit tissue-specific absence and others are absent in all tissues tested

• Results in higher numbers of genes being expressed in hybrid

• Some are due to differences in expression, others due to genome content differences

[Figure: B73 expression level vs Mo17 expression level; normalized average signal (0 to 1.8) for AF520911 on the 70mer and Affy platforms across B14, B37, B73, B84, Mo17a, Oh43, W22 and Wf9]

Analogous to Fu and Dooner (2002) suggestion about genomic differences

Many additional PA transcriptome patterns documented in Hansey et al., PLoS One 2012

Other differences among parents

• Epigenetic changes

• Allelic preferred translation and transcription

Goff and Zhang; COPB 2013