Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org


Page 1: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Genomics of Natural Variation in Arabidopsis thaliana

Justin Borevitz, Salk Institute, naturalvariation.org

Page 2: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Talk Outline

• Natural Variation in Light Response
  – PHYTOCHROME A / QTL mapping
  – Fine Mapping / Gene Expression candidates

• Single Feature Polymorphisms
  – Deletion / Candidate genes
  – Bulk Segregant / eXtreme Mapping

• Haplotype analysis

Page 3: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Light Affects the Entire Plant Life Cycle

de-etiolation

hypocotyl


Page 4: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org
Page 5: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org
Page 6: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org
Page 7: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org
Page 8: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Quantitative Trait Loci

EPI1 EPI2

Page 9: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Epistasis scan

[Plot: pairwise scan with Chr1 to Chr5 on both axes]

BQTL (http://hacuna.ucsd.edu/bqtl)

43,956 pair-wise tests; 163 markers and 133 intervals

Permutation threshold p < 0.05 (5000 permutations)
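A permutation threshold for a pairwise scan can be sketched as follows. This is a minimal illustration, not BQTL's procedure: the interaction statistic, the function names, and the toy data are all assumptions. The count on the slide does equal n(n+1)/2 for n = 163 + 133 = 296 loci, which would fit every unordered pair plus each locus paired with itself, though the slide does not say how the tests were enumerated.

```python
import random
from itertools import combinations


def n_pairwise_tests(n_loci, include_self=True):
    """Count unordered locus pairs, optionally including each locus with itself."""
    if include_self:
        return n_loci * (n_loci + 1) // 2
    return n_loci * (n_loci - 1) // 2


def interaction_score(g1, g2, y):
    """Absolute 2x2 interaction contrast of cell means for two loci coded 0/1.

    Illustrative statistic only; the statistic used by BQTL is not shown here.
    """
    cells = {(a, b): [] for a in (0, 1) for b in (0, 1)}
    for a, b, value in zip(g1, g2, y):
        cells[(a, b)].append(value)
    if any(not values for values in cells.values()):
        return 0.0  # an empty genotype class gives no estimate
    m = {key: sum(values) / len(values) for key, values in cells.items()}
    return abs(m[(0, 0)] - m[(0, 1)] - m[(1, 0)] + m[(1, 1)])


def permutation_threshold(genotypes, phenotype, n_perm=200, alpha=0.05, seed=0):
    """(1 - alpha) quantile of the genome-wide maximum score under shuffled phenotypes."""
    rng = random.Random(seed)
    pairs = list(combinations(range(len(genotypes)), 2))
    shuffled = list(phenotype)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        maxima.append(max(interaction_score(genotypes[i], genotypes[j], shuffled)
                          for i, j in pairs))
    maxima.sort()
    index = min(len(maxima) - 1, int((1 - alpha) * len(maxima)))
    return maxima[index]


# Toy data (hypothetical): 8 loci scored in 60 lines, with a random phenotype.
toy_rng = random.Random(1)
toy_genotypes = [[toy_rng.randint(0, 1) for _ in range(60)] for _ in range(8)]
toy_phenotype = [toy_rng.gauss(0, 1) for _ in range(60)]
toy_threshold = permutation_threshold(toy_genotypes, toy_phenotype)
```

Taking the maximum over all pairs within each permutation is what makes the threshold genome-wide rather than per-test.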

Page 10: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org
Page 11: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Marker      Mb
SNP377      5.50
SM184       5.87
SM50        6.34
SM35        7.01
SM106       7.30
G2395       7.44
SNP65       7.60
SM40        7.79
SEQ8298     7.96
TH1         8.13
MSAT7964    8.29
MAT7787     8.65
CER45       9.32

Near-Isogenic Lines for LIGHT1 Ler / Cvi #3

mm

81N-J 17A-A/J 114 124 189 Ler

6 2 4 3 3 3 Plants

Line

RVE7

GI

194

3

5.0 5.8 5.8 5.1 5.9 5.7 5.8 Phenotype

Page 12: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org
Page 13: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

differences may be due to expression or hybridization

Downstream players

Page 14: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

PAG1 downregulated in Cvi

PALE GREEN1 knockout has a long hypocotyl in red light

Page 15: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

• Abundant Genetic Variation in Light Response
  – Quickly map in new crosses (XAM still to come!)

• QTL map to novel loci and candidate genes
  – New crosses find major loci and new loci

• Gene Expression
  – NILs, pools of extreme RILs or F2s
  – Identify candidate genes at QTL (linked)
  – or downstream effects of QTL (unlinked)

Page 16: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

What is Array Genotyping?

• Affymetrix expression GeneChips contain 202,806 unique 25 bp oligonucleotides.

• 11 features per probe set for 21,546 genes

• New arrays have even more

• Genomic DNA is randomly labeled with biotin, product ~50 bp.

• 3 independent biological replicates compared to the reference strain Col
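Comparing replicate hybridizations of an accession against Col feature by feature can be sketched as a two-sample test with a permutation null. This is an illustration under assumptions, not the pipeline behind these slides: the t-like statistic, the constant `s0`, the threshold, and the toy log intensities are all hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def t_like(ref, alt, s0=0.1):
    """Pooled-variance t statistic with a small constant s0 added to the
    standard error (assumed value) so near-zero variances do not dominate."""
    n1, n2 = len(ref), len(alt)
    pooled = ((n1 - 1) * variance(ref) + (n2 - 1) * variance(alt)) / (n1 + n2 - 2)
    se = sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(alt) - mean(ref)) / (se + s0)


def call_sfps(features, n_ref, threshold):
    """features: one list of log intensities per feature, reference arrays first.

    Returns the indices called as SFPs and an FDR estimate from relabelled arrays.
    """
    n = len(features[0])
    true_ref = tuple(range(n_ref))
    true_alt = tuple(range(n_ref, n))
    observed = [t_like(f[:n_ref], f[n_ref:]) for f in features]
    called = [i for i, t in enumerate(observed) if abs(t) >= threshold]
    null_counts = []
    for ref_idx in combinations(range(n), n_ref):
        if ref_idx in (true_ref, true_alt):
            continue  # skip the real labelling and its mirror image
        alt_idx = [j for j in range(n) if j not in ref_idx]
        null_counts.append(sum(
            abs(t_like([f[j] for j in ref_idx], [f[j] for j in alt_idx])) >= threshold
            for f in features))
    expected_false = mean(null_counts) if null_counts else 0.0
    fdr = min(1.0, expected_false / len(called)) if called else 0.0
    return called, fdr


# Toy data (hypothetical): 3 Col replicates followed by 3 accession replicates.
toy_features = [
    [10.0, 10.2, 9.9, 7.1, 7.0, 7.3],   # weaker hybridization in the accession
    [8.0, 8.1, 7.9, 8.0, 8.2, 7.9],     # no difference
    [9.0, 9.1, 8.9, 9.05, 8.95, 9.1],   # no difference
]
toy_called, toy_fdr = call_sfps(toy_features, n_ref=3, threshold=5.0)
```

With only 3 replicates per group there are few distinct relabellings, so the FDR estimate in this sketch is coarse.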

GeneChip

Page 17: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Potential Deletions

Page 18: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Spatial Correction

Spatial Artifacts

Improved reproducibility

Next: Quantile Normalization
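Quantile normalization, named here as the next step, can be sketched in its textbook form: each chip's k-th smallest intensity is replaced by the mean of the k-th smallest intensities across chips. The function name and toy values are assumptions, and ties are broken by position for simplicity.

```python
def quantile_normalize(arrays):
    """arrays: equal-length intensity lists, one per chip.

    Returns new lists in which every chip shares the same distribution.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean across chips at each rank.
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # feature indices by rank
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


# Toy example (hypothetical values): two chips, three features.
toy_chips = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
toy_normalized = quantile_normalize(toy_chips)
```

After this step the two toy chips contain the same set of values, and only the ranking of features within each chip differs.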

Page 19: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

physical location known

Genetic Markers in genes

Page 20: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

False Discovery and Sensitivity

PM only

SAM threshold: 5% FDR

                      Total   SFPs    nonSFPs
GeneChip                      3806    89118     Cereon marker accuracy 100%
Sequence              817     121     696
  Polymorphic         340     117     223       Sensitivity 34%
  Non-polymorphic     477     4       473

False Discovery rate: 3%
Test for independence of all factors: Chisq = 177.34, df = 1, p-value = 1.845e-40

SAM threshold: 18% FDR

                      Total   SFPs    nonSFPs
GeneChip                      10627   82297     Cereon marker accuracy 100%
Sequence              817     223     594
  Polymorphic         340     195     145       Sensitivity 57%
  Non-polymorphic     477     28      449

False Discovery rate: 13%
Test for independence of all factors: Chisq = 265.13, df = 1, p-value = 1.309e-59

3/4 Cvi markers were also confirmed in PHYB

90% 80% 70%

41% 53% 85%

90% 80% 70%

67% 85% 100%

Cereon may be a sequencing error

TIGR match is a match
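The sensitivity, false discovery rate, and chi-square values on this slide can be recomputed from the sequenced-feature counts. The sketch below assumes sensitivity is the fraction of polymorphic features called as SFPs, the false discovery rate is the fraction of SFP calls that fall on non-polymorphic features, and the test is the uncorrected Pearson chi-square on the 2x2 table. The function name is hypothetical.

```python
def sfp_call_summary(poly_sfp, poly_non, nonpoly_sfp, nonpoly_non):
    """Summarize a 2x2 table of sequence status (rows) by array call (columns)."""
    row_poly = poly_sfp + poly_non
    row_nonpoly = nonpoly_sfp + nonpoly_non
    col_sfp = poly_sfp + nonpoly_sfp
    col_non = poly_non + nonpoly_non
    total = row_poly + row_nonpoly
    sensitivity = poly_sfp / row_poly
    false_discovery_rate = nonpoly_sfp / col_sfp
    # Pearson chi-square for a 2x2 table, no continuity correction (df = 1).
    chi_square = (total * (poly_sfp * nonpoly_non - poly_non * nonpoly_sfp) ** 2
                  / (row_poly * row_nonpoly * col_sfp * col_non))
    return sensitivity, false_discovery_rate, chi_square


# Counts from the two tables on this slide.
strict = sfp_call_summary(117, 223, 4, 473)     # SAM threshold 5% FDR
relaxed = sfp_call_summary(195, 145, 28, 449)   # SAM threshold 18% FDR
```

Under these assumptions both tables reproduce the reported figures to the precision shown on the slide.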

Page 21: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org
Page 22: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Chip genotyping of a Recombinant Inbred Line

29kb interval

Discovery: 6 replicates × $500 / 12,000 SFPs = $0.25 per SFP

Typing: 1 replicate × $500 / 12,000 SFPs ≈ $0.042 per SFP

Page 23: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Potential Deletions

>500 potential deletions

45 confirmed by Ler sequence

23 (of 114) transposons

Disease Resistance(R) gene clusters

Single R gene deletions

Genes involved in Secondary metabolism

Unknown genes

Page 24: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Potential Deletions Suggest Candidate Genes

FLOWERING1 QTL

Chr1 (bp)

Flowering Time QTL caused by a natural deletion in MAF1

MAF1

MAF1 natural deletion

Page 25: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Fast Neutron deletions

FKF1 80kb deletion CHR1

cry2 10kb deletion CHR1

Het

Page 26: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Map bibb

100 bibb mutant plants

100 wt mutant plants

Page 27: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

bibb mapping

ChipMap

AS1

Bulk segregant mapping using chip hybridization

bibb maps to Chromosome 2 near ASYMMETRIC LEAVES1

Page 28: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

BIBB = ASYMMETRIC LEAVES1

Sequenced AS1 coding region from bib-1 … found a g -> a change that would introduce a stop codon in the MYB domain

bibb as1-101

MYB

bib-1 W49*

as1-101 Q107*

as1 bibb

AS1 (ASYMMETRIC LEAVES1) = MYB closely related to PHANTASTICA, located at 64 cM

Page 29: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

eXtreme Array Mapping

Histogram of Kas/Col RILs, red light

[Histogram: counts (0 to 12) against hypocotyl length (mm), 6 to 14]

15 tallest RILs pooled vs 15 shortest RILs pooled

Page 30: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

eXtreme Array Mapping

Allele frequencies determined by SFP genotyping. Thresholds set by simulations

[Plot: LOD (0 to 16) against position on Chromosome 2 (0 to 100 cM)]

Composite Interval Mapping

RED2 QTL

RED2 QTL 12cM

Red light QTL RED2 from 100 Kas/ Col RILs
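The idea behind pooling extreme RILs can be sketched as a per-marker difference in allele frequency between the two pools, with a null threshold obtained by simulation. This is a simplified illustration under stated assumptions: it uses known 0/1 genotypes per line instead of allele frequencies estimated from array signal, the threshold is for a single unlinked marker with each allele equally likely, and all names are hypothetical.

```python
import random


def pool_frequency_difference(tall_pool, short_pool):
    """Per-marker allele frequency difference between two pools of 0/1 genotypes."""
    n_markers = len(tall_pool[0])
    diffs = []
    for m in range(n_markers):
        f_tall = sum(line[m] for line in tall_pool) / len(tall_pool)
        f_short = sum(line[m] for line in short_pool) / len(short_pool)
        diffs.append(f_tall - f_short)
    return diffs


def simulated_threshold(pool_size=15, n_sim=5000, alpha=0.05, seed=0):
    """(1 - alpha) quantile of |frequency difference| when no QTL is linked."""
    rng = random.Random(seed)
    null = []
    for _ in range(n_sim):
        tall = sum(rng.random() < 0.5 for _ in range(pool_size)) / pool_size
        short = sum(rng.random() < 0.5 for _ in range(pool_size)) / pool_size
        null.append(abs(tall - short))
    null.sort()
    return null[min(n_sim - 1, int((1 - alpha) * n_sim))]


# Toy pools (hypothetical): 2 lines each, 2 markers.
toy_diffs = pool_frequency_difference([[1, 0], [1, 1]], [[0, 0], [0, 1]])
toy_cutoff = simulated_threshold()
```

A genome-wide threshold would also have to account for the number of markers tested and the linkage between them, which this sketch leaves out.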

Page 31: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

• Single Feature Polymorphisms
  – Improve with replicates (easy)
  – Improved statistical models

• Genotyping
  – Precisely define recombination breakpoints
  – Fine mapping
  – Gene conversion

• Potential Deletions
  – Candidate genes / induced mutations

• Bulk segregant Mapping
  – eXtreme Array Mapping, F2s etc

Page 32: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Array Haplotyping

• What about Diversity/selection across the genome?

• A genome-wide estimate of population genetics parameters: θw, π, Tajima’s D, ρ

• LD decay, Haplotype block size

• Deep population structure?

• Col, Lz, Ler, Bay, Shah, Cvi, Kas, C24, Est, Kin, Mt, Nd, Sorbo, Van, Ws2
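Three of the parameters listed above can be sketched from a matrix of 0/1 calls per accession using the standard definitions of Watterson's θ, π, and Tajima's D. This illustrates the textbook formulas only. How array-based SFP calls, which miss some polymorphisms, were handled for these slides is not shown, and the function name and toy data are assumptions.

```python
from math import sqrt


def diversity_stats(calls):
    """calls: one list of 0/1 allele calls per accession, all the same length.

    Returns (theta_w, pi, tajima_d) as totals for the region, not per site.
    tajima_d is None when no site is segregating.
    """
    n = len(calls)
    n_sites = len(calls[0])
    counts = [sum(acc[s] for acc in calls) for s in range(n_sites)]
    segregating = [k for k in counts if 0 < k < n]
    s_count = len(segregating)
    # pi: expected pairwise differences, summed over segregating sites.
    pi = sum(2 * k * (n - k) / (n * (n - 1)) for k in segregating)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    theta_w = s_count / a1
    if s_count == 0:
        return theta_w, pi, None
    # Variance constants from Tajima (1989).
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    tajima_d = (pi - theta_w) / sqrt(e1 * s_count + e2 * s_count * (s_count - 1))
    return theta_w, pi, tajima_d


# Toy data (hypothetical): 4 accessions, 3 sites, the last site monomorphic.
toy_calls = [[1, 1, 0], [0, 1, 0], [0, 0, 0], [0, 0, 0]]
toy_theta, toy_pi, toy_d = diversity_stats(toy_calls)
```

Applying the same function to the calls inside a sliding window gives a windowed version of each statistic.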

Page 33: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Array Haplotyping

Inbred lines

Low effective recombination due to partial selfing

Extensive LD blocks

Col Ler Cvi Kas Bay Shah Lz Nd

Chromosome 1, ~500kb

Page 34: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Distribution of T-stats

[Histogram: frequency (0 to 8e+04) against T statistic, binned from (-4,-3.5] to (3,3.5]; series: null (permutation), actual]

Not Col, Col, NA, NA, duplications

Page 35: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Accession   FDR     Sensitivity   SNP   Total
bay         0.0%    43%           51    563
c24         0.2%    39%           64    580
cvi         0.0%    38%           91    543
est         0.0%    59%           39    548
kas         1.9%    44%           66    577
kendl       3.1%    33%           57    545
ler         0.0%    49%           43    562
lz          0.0%    53%           51    573
mt          0.2%    61%           49    570
nd          0.0%    47%           49    568
shah        0.0%    24%           80    548
sorbo       0.0%    45%           55    526
van         0.2%    29%           92    571
ws2         0.0%    49%           57    514

Sequence confirmation of SFPs

Page 36: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

SFPs for reverse genetics

http://naturalvariation.org/sfp

14 Accessions, 30,950 SFPs

Page 37: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Chromosome Wide Diversity

[Five panels, Chromosome 1 to Chromosome 5; y-axis ticks -2, 0, 2, 4]

Page 38: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Diversity 50kb windows

[Plot: Diversity (-1 to 5) against position on Chromosome 5 (18.1 to 18.6 Mb); series: sfp.count, av.pair-wise; perm95% threshold]

Page 39: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Tajima’s D like 50kb windows

[Plot: Tajima's D like (-3 to 2) against position on Chromosome 5 (18.1 to 18.6 Mb); perm95% threshold]

Page 40: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

• Remember to think about hybridization polymorphism in RNA analysis (affy or cDNA)

• Keep in mind that DNA can be used on many arrays

• Example for mapping ESTs

• Haplotyping

• Diversity/Selection

• Association Mapping
  – Population Genomics (hybrid zones)

Page 41: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Universal Whole Genome Array

[Image: array layout, features by features]

RNA
Transcriptome Atlas: expression levels, tissue specificity
Gene Discovery: gene model correction, non-coding / micro-RNA, antisense transcription
Alternative Splicing

DNA
Comparative Genome Hybridization (CGH): insertion/deletions
Methylation
Chromatin Immunoprecipitation: ChIP chip
Polymorphism SFPs: discovery/genotyping

~35 bp tile, non-repetitive regions, “good” binding oligos, evenly spaced

Page 42: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

ChipViewer: Mapping of transcriptional units of ORFeome

From 2000v At1g09750 (MIPS) to the latest AGI At1g09750

2000 v Annotation (MIPS)

The latest AGI Annotation

Page 43: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Improved Genome Annotation

[Schematic along Chromosome (bp); tracks: conservation, Transcriptome Atlas; labels: SNP, SFP, M (repeated), ORFa, start, AAAAA, ORFb, deletion]

Page 44: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Haplotype Map: Linkage Disequilibrium, Gene Family (R genes)

Association Studies – Whole Genome Arrays

192 Accessions, > 200,000 SFPs (~600bp resolution)

Confirm Associations in specific crosses with eXtreme Array Mapping

Future Projects DNA

Page 45: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

True natural variation in gene expression: polymorphism accounted for, alternative splicing

Cis regulatory variation / Imprinting: reciprocal F1s, 3 replicates

Transcriptome QTL Map: VanC Advanced Intercross Recombinant Inbred Lines

How many loci control the variation in gene transcription? Candidate TF and binding sites?

Future Projects RNA

Page 46: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Future work with Natural Variation

• VanC advanced intercross RIL population

• Backcross collections

Page 47: Genomics of Natural Variation in Arabidopsis thaliana Justin Borevitz Salk Institute naturalvariation.org

Salk
Jon Werner, Todd Mockler, Sarah Liljegren, Olivier Loudet, Huaming Chen, Joanne Chory, Detlef Weigel, Joseph Ecker

UC San Diego
Charles Berry

Scripps
Sam Hazen, Elizabeth Winzeler

UC Davis
Julin Maloof

University of Guelph, Canada
Dave Wolyn

Sainsbury Laboratory
Jonathan Jones

USC
Magnus Nordborg, Tina Hu

Syngenta
Hur-Song Chang, Tong Zhu

NaturalVariation.org

Helen Hay Whitney Foundation