Agricultural Genomics: The Rise of the Genomes
Dr. Doreen Ware, USDA ARS
Agricultural Biotechnology: Emerging Technologies and Insights
January 27, 2021
Advancing Agriculture Through Collaborative Research on Crop & Model Species
Outline
• Agricultural Drivers
• Maize genome 16 years
  – What’s in a genome
  – Improvements in sequencing technology let us continue to evaluate approaches to developing reference assemblies and annotations
• Genome/Biology enabled agriculture
  – Sorghum EMS population
    • Forward and reverse genetics
  – Breaking down complex traits: Yield & Quality
    • Plant architecture: flower
    • Response to environment: water, heat, nitrogen, disease
• Biology & “Big Data”
  – Collaborative infrastructure
  – Future
Drivers for Agriculture: Sustainability and Defense
Population Growth
Climate Change
2018, 2050
Agricultural Resources
Nature Editorial, How to feed a hungry world. 2010
BREEDING FOR 2050 AND BEYOND. Prepare for plant and animal pests and disease while they are still offshore. Design plants for new environments.
CLIMATE CHANGE. Collect and preserve natural diversity.
Maize has the highest production of any crop worldwide
Reference genomes are foundation tools for ensuring food security & environmental sustainability
Acknowledgements
Gramene, www.gramene.org (NSF & USDA ARS), 2002-present, NSF IOS-1127112
PI: Doreen Ware (USDA ARS, CSHL); Co-PI: Pankaj Jaiswal (OSU); Paul Kersey (Ensembl Genomes, EBI); Helen Parkinson (ATLAS, EBI); Lincoln Stein (Reactome, OICR); Crispin Taylor (ASPB)
Gramene @ CSHL: Andrew Olson, Joshua Stein, Sharon Wei, Marcela Karey Monaco
Institutes: Cold Spring Harbor Laboratory, Oregon State University, EMBL – European Bioinformatics Institute, Ontario Institute for Cancer Research, American Society of Plant Biologists

USDA ARS Sorghum Functional Genomics and Germplasm Improvement, 2015-present
USDA: Doreen Ware, Zhanguo Xin, Chad Hayes, Yinghua Huang, Gloria Burow, Ratan Chopra, John Burke
SorghumBase: Nick Gladman, Yinping Jiao, Bo Wang, Kapeel Chougule, Andrew Olson, Sharon Wei, Marcela Karey Monaco
Corteva

B73 Genome & Annotation Improvements V4 (NSF & USDA ARS), ~2015-2017
CSHL staff: Yinping Jiao, Bo Wang, Mike Campbell, Josh Stein, Sharon Wei, Doreen Ware, Dick McCombie, Eric Antinou
PacBio: David Rank, Paul Peluso, Jason Chin, Tyson Clark, Ting Hong, Elizabeth Tseng
BioNano: Alex Hestie, Tiffany Liang, Jinghua Shi
USDA ARS: Mike McMullen, Kate Guill
University of Georgia: Kelly Dawe, Jonathan Gent
University of Hawaii: Gernot Presting, Kevin Schneider, Thomas Wolfgruber
Institutes: Cold Spring Harbor Laboratory, USDA ARS, Pacific BioSciences, BioNano, University of Georgia, University of Hawaii

Maize Diversity Project (NSF & USDA ARS), ~2003-present
USDA ARS: Ed Buckler, Sherry Flint-Garcia, Mike McMullen, Jim Holland, Peter Bradbury, Doreen Ware
Cornell: Qi Sun
UC Davis: Jeff Ross-Ibarra

Transcriptome Variation (USDA ARS), 2016-2019
CSHL staff: Yinping Jiao, Bo Wang, Mike Campbell, Josh Stein, Sharon Wei, Doreen Ware, Dick McCombie, Sara Goodwin
PacBio: Elizabeth Tseng, Tyson Clark, Ting Hong, Kevin Eng, Primo Baybayan
Institutes: Cold Spring Harbor Laboratory, USDA ARS, Pacific BioSciences

Maize Nested Association Mapping Panel Reference Assemblies (NSF & USDA ARS), 2018-present
University of Georgia: Kelly Dawe
University of Iowa: Matt Hufford
University of Minnesota: Candy Hirsch
CSHL staff: Josh Stein, Kapeel Chougule, Sharon Wei, Doreen Ware, Dick McCombie, Sara Goodwin
PacBio: Emily Hatas, Paul Peluso, Jason Chin
Corteva: Victor Llaca, Kevin Fengler, Greg May
DNAnexus: Brent Hannigan, Chai Fungtammasan, Brittanny O’Sullivan
NIH: Adam Phillippy, Serge Koren
Institutes: Cold Spring Harbor Laboratory, USDA ARS, University of Georgia, University of Iowa, Pacific BioSciences, DNAnexus, NIH

Maize Genome Project (NSF, DOE, USDA), ~2005-2010
University of Washington: Rick Wilson
CSHL: Rob Martienssen, Dick McCombie, Doreen Ware
University of Arizona: Rod Wing
University of Iowa: Pat Schnable
Maize Genome is over 12 years old
Schnable, Ware et al. Science (2009)
Maize Sequence Consortium (NSF, DOE, USDA), PI Rick Wilson
STIFF STALK
B73
Sequenced genomes provide the parts list and let us see what is the same or different between organisms
Genes in corn, rice, and sorghum are in similar places in the genome
Schnable, Ware et al. Science (2009)
Triceratops Genome
http://www.jurassicworld.com/creation-lab/
Genome sequences allow us to see all the variations (mutations) that exist in nature
Letters, or single nucleotide polymorphisms (SNPs)
Gene content or parts list, known as copy number variations (CNVs)
Jumping genes, or transposable elements (TEs), associated with the regulatory sequence between the genes
Barbara McClintock, 1983 Nobel Prize in Physiology or Medicine
Maize is a “Tale of Two Genomes”
[Figure: two ancestors, A (n = 10) and B (n = 10), cross (A × B) to give AB (n = 20); chromosome breakage and re-fusion together with gene loss return the genome to 10 chromosomes (C, n = 10).]
Evolutionary History of the Maize Genome
Red = “strong”, dominant genome
Blue = “weak”
Schnable, Ware et al., Science 2009
Maize is a “Tale of Two Genomes”
Maize, also known as corn, experienced a whole genome duplication and then lost many of the duplicated genes
The genes that were kept by corn can tell us about how corn is adapting
Transcription factors, kinases, chromatin modifiers
Not all genomes have the same potential
Schnable, Ware et al., Science 2009
Humans Have Limited Molecular Diversity
0.09%
Silent Diversity; Zhao, et al. (2001) PNAS
1.34%
Maize diversity is greater than the difference between humans and chimps
Silent Diversity; Tenaillon, et al. (2001) PNAS
1.42%
Individual Maize lines are very different from each other
The SNPs and gene differences affect how corn plants grow
Access to these sequences can shorten the time it takes to make new lines of corn
Differences come from locally duplicated genes
Images Bo Wang
An additional copy of a gene confers tolerance to acidic soil
Maron et al., PNAS, 2013
Maize genomes are highly variable
Chia JM, Song C, Bradbury PJ, Costich D, de Leon N, Doebley J, Elshire RJ, Gaut B, Geller L, Glaubitz JC et al: Maize HapMap2 identifies extant variation from a genome in flux. Nat Genet 2012, 44(7):803-807.
• High rate of SNP and structural variation in the population
• Structural variations are highly associated with phenotypic variation
• Structural variation in non-coding regions was enriched for phenotypic variation
• One genome is not enough to represent the diversity of the population
MAIZE DIVERSITY PROJECT
Meet the Family (2002 - present)
www.panzea.org
COURTESY SHERRY FLINT-GARCIA
[Figure: diagram of the Maize Diversity Project inbred line panel. Labels include lines such as B37, W22, and OH43, the CML, Ki, Tzi, and NC series, and parviglumis accessions (parvi-03, parvi-14, parvi-30, parvi-36, parvi-49); the full label list is omitted.]
26 Reference Assemblies for the Maize Nested Association Mapping (NAM) Population
NON STIFF STALK: B97, KY21, MO17, MS71, M162W, OH43, OH7B
TROPICAL-SUBTROPICAL: CML52, CML69, CML103, CML228, CML247, CML277, CML322, CML333, KI3, KI11, NC350, NC358, TX303, TZI8
STIFF STALK: B73
SWEET CORN: IL14H, P39
POPCORN: HP301
MIXED: MO18W, M37W
Adapted from McMullen et al. Science 2009
• 25 Parental lines selected to represent the diversity in maize
• From these, 25 families of 200 recombinant inbred lines were generated
• Rich collection of phenotypes
• Resource for joint linkage, association mapping studies
Maize Nested Association Mapping (NAM) Population
Yu, et al. (2008) Genetics
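The population size follows directly from the design: 25 founder families of 200 recombinant inbred lines each. A minimal Python sketch of that arithmetic is below. The RIL identifier scheme (`F01_RIL001`) is hypothetical and only serves to enumerate the lines; it is not the naming used by the NAM project.

```python
# Sketch of the NAM design arithmetic described above.
# The counts come from the slide; the RIL naming scheme is hypothetical.
FOUNDERS = 25          # diverse parental lines, each crossed to B73
RILS_PER_FAMILY = 200  # recombinant inbred lines derived per family


def nam_ril_ids(n_families: int = FOUNDERS, per_family: int = RILS_PER_FAMILY) -> list[str]:
    """Return illustrative RIL identifiers such as 'F01_RIL001'."""
    return [
        f"F{family:02d}_RIL{ril:03d}"
        for family in range(1, n_families + 1)
        for ril in range(1, per_family + 1)
    ]


ril_ids = nam_ril_ids()
total_rils = len(ril_ids)
print(total_rils)  # 25 families x 200 RILs = 5000
```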
[Figure: NAM crossing scheme. 25 diverse lines (DL) × B73 give F1s, followed by selfing with single-seed descent (SSD) to 200 RILs per family, 5000 RILs in total. Founders: B97, CML103, CML228, CML247, CML277, CML322, CML333, CML52, CML69, Hp301, Il14H, Ki11, Ki3, Ky21, M162W, M37W, Mo18W, MS71, NC350, NC358, Oh43, Oh7B, P39, Tx303, Tzi8.]
Arun Seetharam (ISU); Margaret Woodhouse (MaizeGDB); Kapeel Chougule (CSHL); Shujun Ou (ISU); Jianing Liu (UGA); Xuehong Wei (CSHL); Zhenyuan Lu (CSHL); Andrew Olson (CSHL); Bo Wang (CSHL); Sharon Wei (CSHL); TingTing Guo (ISU); Rafael Della Coletta (UM); Xianran Li (ISU); John Portwood (MaizeGDB); Kevin Fengler (Corteva); Victor Llaca (Corteva); Amanda Gilbert (UM); Nancy Manchanda (ISU); Samantha Snodgrass (ISU); David Hufnagel (ISU); Sarah Pedersen (ISU); Michael Syring (ISU); Ethy Cannon (MaizeGDB); Carson Andorf (MaizeGDB); Jonathan Gent (UGA); Todd Michael (JCVI); Jianming Yu (ISU); Candice Hirsch (UM); Doreen Ware (CSHL); Matthew B. Hufford (ISU); R. Kelly Dawe (UGA)
Victor Llaca
Kelly Dawe, U. Georgia
Doreen Ware, USDA-ARS, CSHL
Matt Hufford, Iowa State U.
Candy Hirsch, U. Minnesota
Zhenyuan Lu
26 Maize NAM founders reference assemblies (2019-2021)
Bo Wang
Andrew Olson
New assemblies have a vast improvement in the contiguity of the sequence
Stiff Stalk, Non Stiff Stalk, Mixed, Popcorn, Sweet Corn, Tropical
Metrics: Assembled Contigs
Total Bases in Assembly: 2,180,413,054
Contig Contiguity (NG50): 52,409,415
Number of Contigs: 811
Longest Contig: 161,290,055
Kevin Fengler
Victor Llaca
N50 is the length of the shortest contig in the smallest set of longest contigs that together cover 50% of the assembly, so half of the assembled sequence lies in contigs at least as long as the N50. NG50 is computed the same way against the estimated genome size rather than the assembly total.
Hufford et al., bioRxiv, 2021
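The contiguity statistic can be made concrete with a short Python sketch. The function name `nx50` and the toy contig lengths are illustrative assumptions, not the pipeline used for the NAM assemblies; passing a `reference_size` switches the calculation from N50 to NG50.

```python
from typing import Optional, Sequence


def nx50(contig_lengths: Sequence[int], reference_size: Optional[int] = None) -> Optional[int]:
    """Return N50, or NG50 when reference_size is given.

    Contigs are sorted from longest to shortest and summed until the
    running total reaches half of the target size. The length of the
    contig that crosses that threshold is the statistic.
    """
    target = reference_size if reference_size is not None else sum(contig_lengths)
    half = target / 2
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return None  # the contigs cover less than half of the reference size


toy_contigs = [80, 70, 50, 40, 30, 20]         # made-up lengths, total 290
n50 = nx50(toy_contigs)                        # half of 290 is 145; 80 + 70 = 150
ng50 = nx50(toy_contigs, reference_size=400)   # half of 400 is 200; 80 + 70 + 50 = 200
print(n50, ng50)
```

With these toy values the N50 is 70 and the NG50 is 50, which shows why NG50 is the stricter number when the assembly is smaller than the genome.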
More than 100,000 genes found across the 26 maize accessions
Hufford et al., bioRxiv, 2021
Maize Pan Gene Set
Hufford et al., bioRxiv, 2021
Candy Hirsch, U. Minnesota
Maize Pan Gene Set
Core genes are more likely to be syntenic
Hufford et al., bioRxiv, 2021
Candy Hirsch, U. Minnesota
The bulk of the genes in maize are found in other species
Hufford et al., bioRxiv, 2021
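One way to picture a pan-gene set is as a presence/absence table across lines. The Python sketch below is an illustration only: the pan-gene IDs, the three-line toy panel, and the three category names are assumptions, and the published analysis may use different categories and thresholds.

```python
from collections import Counter

# Hypothetical presence/absence data: pan-gene ID -> set of lines carrying it.
# Three toy lines stand in for the 26 NAM founders.
LINES = {"B73", "Ki11", "P39"}
presence = {
    "pan_gene_1": {"B73", "Ki11", "P39"},
    "pan_gene_2": {"B73", "P39"},
    "pan_gene_3": {"Ki11"},
}


def classify(carriers: set[str], all_lines: set[str]) -> str:
    """Label a pan-gene by how many lines carry it."""
    if carriers >= all_lines:
        return "core"          # present in every line
    if len(carriers) == 1:
        return "private"       # present in a single line
    return "dispensable"       # present in some but not all lines


labels = {gene: classify(lines, LINES) for gene, lines in presence.items()}
summary = Counter(labels.values())
print(labels)
```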
Genome contains life history of the species
Teosinte: Zea mays L. ssp. parviglumis
Landraces
Maize, Modern Inbreds: Zea mays L. ssp. mays
Biology Enabled Agriculture
Complex Traits: Yield & Quality
Phenomes
Regulatory Networks, Metabolic Pathways
Genome Editing, Genomic Selection, Marker Assisted Breeding
QTL, GWAS
Metabolomics, Expression, Epigenetic
Genomes
Gene Networks
G × (environment + management) = P
Profiling new sorghum genetic & phenotypic variation
• Parental line: BTX 623
• >10,000 individual M2 seed pools
• >6,400 M3 seeds obtained and
• Phenotyping is on-going and needs to be expanded
• High quality DNA prepared for all lines
EMS Mutagenesis
- Random
- Single nucleotide change
- >99% GC→AT
Zhanguo Xin, Cropping Systems Research Lab, USDA-ARS, Lubbock TX
Jiao et al. The Plant Cell, 2016
Mutation detection by whole genome sequencing of 256 mutants for forward genetics
• Sequencing summary
  - 20 M3 plants pooled together and sequenced to an average of 16X
  - Average whole genome coverage: 86%; gene space coverage: 95%
• Quality control of the population: 2 contamination lines + 2 sibling lines
• Mutation detection:
  - 1,862,560 EMS-induced mutations
  - Sanger sequencing validation rate >98%
  - 7,660 mutations/mutant = 1,798 homozygous + 5,862 heterozygous
• Mutation effect:
  - 22% of mutations are located in genes, covering 95% of sorghum genes
  - 57% (18,684) of the genes harbor >35,000 disruptive mutations, ~2 disruptive mutations per gene.
Yinping Jiao
Jiao et al. The Plant Cell, 2016
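Because EMS almost always produces GC→AT transitions, a simple first-pass filter on variant calls keeps only G>A and C>T changes. The Python sketch below illustrates that idea under stated assumptions: the `Variant` record, the "hom"/"het" genotype encoding, and the example calls are all made up and are not the pipeline from Jiao et al.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    genotype: str  # "hom" or "het"; hypothetical encoding


# EMS typically converts G:C pairs to A:T, which appears as G>A on one
# strand and C>T on the other.
EMS_CHANGES = {("G", "A"), ("C", "T")}


def is_ems_like(v: Variant) -> bool:
    """True when the call is a canonical EMS-type transition."""
    return (v.ref.upper(), v.alt.upper()) in EMS_CHANGES


calls = [  # made-up calls for illustration only
    Variant("Chr01", 1_200, "G", "A", "hom"),
    Variant("Chr01", 5_431, "C", "T", "het"),
    Variant("Chr02", 88, "A", "G", "het"),   # not a canonical EMS change
    Variant("Chr06", 907, "c", "t", "het"),  # lowercase bases are accepted
]

ems_calls = [v for v in calls if is_ems_like(v)]
n_hom = sum(v.genotype == "hom" for v in ems_calls)
n_het = sum(v.genotype == "het" for v in ems_calls)

# The per-mutant counts quoted on the slide are internally consistent.
assert 1_798 + 5_862 == 7_660

print(len(ems_calls), n_hom, n_het)
```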
Multiseeded (msd1, msd2, msd3)
BTX623, msd2
PS, SS
~50% fertility, 100% fertility
Collaborator: Zhanguo Xin, USDA ARS, Lubbock
Biology Enabled Agriculture
Regulatory Networks
EMS population
Expression, Metabolites
Genomes
Transcription factor
G × (environment + management) = P
Zhanguo Xin, Young Koung Lee, Yinping Jiao, Nick Gladman
MSD1 TCP transcription factor
Jiao et al., Nature Comm. (2018)
Yield > Flower development > grain number > fertility/branching
Biology Enabled Agriculture
Regulatory Networks
EMS population
Expression, Metabolites
Genomes
Transcription factor
G × (environment + management) = P
Teosinte Branched
Doebley et al., Genetics (1995)
TCP transcription factor
Yield > Flower development > grain number > fertility/branching
Biology Enabled Agriculture
Regulatory Networks, Metabolic Pathways
Genome Editing, Marker Assisted Breeding
EMS population
Expression, Metabolites
Genomes
Transcription factor, Plant hormones
G × (environment + management) = P
Ahmad, et al. 2016. Frontiers
MSD2
MSD3
MSD1 TCP transcription factor
Jasmonic Acid (JA) Pathway
Jiao et al., Nature Comm. (2018); Gladman et al., Int. J. Mol. Science (2019); Dampanaboina, L., Int. J. Mol. Science (2019)
Yield > Flower development > grain number > fertility/branching
Nitrogen, soil, and agricultural sustainability
October 2011. Credit: USGS, NASA.
Nitrogen deficiencies, LSU (courtesy Dr. Brenda Tubana)
Insufficient nitrogen fertilizer
Excess nitrogen fertilizer
Biology Enabled Agriculture
NUE
Metabolic Network, Expression
Genomes
Transcription factor
Complex Traits: Yield > Fitness > Limiting Nitrogen
Christophe Liseron-Monfils, Andrew Olson, Lifang Zhang
Gaudinier et al., Nature 2017
Siobhan Brady, UC Davis
Allie Gaudinier, UC Davis
G × (environment + management) = P
[Figure: Arabidopsis gene network; nodes are labeled with AGI locus identifiers such as AT4G24620 and AT1G12110, plus TMO5/LHW. Individual node labels omitted.]
Biology Enabled Agriculture
NUE
Regulatory Networks, Metabolic Network
Expression, Gene regulation
Genomes
Transcription factor
Complex Traits: Yield > Fitness > Limiting Nitrogen
Christophe Liseron-Monfils, Andrew Olson, Lifang Zhang
Gaudinier et al., Nature 2017
Siobhan Brady, UC Davis
Allie Gaudinier, UC Davis
G × (environment + management) = P
Limiting N: 1 mM NO3
[Figure: Arabidopsis gene network shown under the two nitrate conditions; nodes are labeled with AGI locus identifiers such as AT4G24620 and AT1G12110, plus TMO5/LHW. Individual node labels omitted.]
Sufficient N: 10 mM NO3
Biology Enabled Agriculture
NUE
Regulatory Networks, Metabolic Network
tDNA population
Expression, Gene regulation
Genomes
Transcription factor
Complex Traits: Yield > Fitness > Limiting Nitrogen
Christophe Liseron-Monfils, Andrew Olson, Lifang Zhang
Gaudinier et al., Nature 2017
Siobhan Brady, UC Davis
Allie Gaudinier, UC Davis
G × (environment + management) = P
Changing climate: increasing temperature and drought
Reverse Genetics: From Gene to Phenotype
Gene in Arabidopsis   Sorghum gene        Amino acid change   Mutant ID
CER6                  Sobic.001G453200    E159K               ARS20
KCS12                 Sobic.004G341300    R189C               ARS73
CER5                  Sobic.009G083300    P581L               ARS73
CER5                  Sobic.009G083300    L244F               ARS20
CER6                  Sobic.006G020600    A133T               ARS205
KCS7                  Sobic.002G268300    P449S               ARS31
KCS4                  Sobic.002G268500    A49V                ARS31
KCS20                 Sobic.005G168700    R303Q               ARS185
CER1                  Sobic.001G222700    L100F               ARS185
Jiao et al. The Plant Cell, 2016
Epicuticular wax (bloom) of sorghum plays important roles in tolerance of environmental stress.
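The reverse genetics lookup (from a gene of interest to the mutant lines that carry a change in it) can be sketched in Python. The `Mutation` record and `parse_row` helper are hypothetical names introduced for illustration; the three example rows are taken from the table on this slide, and the notation E159K is read as reference amino acid, position, altered amino acid.

```python
import re
from typing import NamedTuple


class Mutation(NamedTuple):
    arabidopsis_gene: str
    sorghum_gene: str
    ref_aa: str
    position: int
    alt_aa: str
    mutant_id: str


# One-letter amino acid, position, one-letter amino acid (e.g. E159K).
AA_CHANGE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_row(row: str) -> Mutation:
    """Parse 'CER6 Sobic.001G453200 E159K ARS20' into a Mutation."""
    gene, sobic, change, mutant = row.split()
    match = AA_CHANGE.match(change)
    if match is None:
        raise ValueError(f"unrecognized amino acid change: {change}")
    ref_aa, pos, alt_aa = match.groups()
    return Mutation(gene, sobic, ref_aa, int(pos), alt_aa, mutant)


rows = [  # three rows from the slide's table
    "CER6 Sobic.001G453200 E159K ARS20",
    "CER5 Sobic.009G083300 P581L ARS73",
    "CER5 Sobic.009G083300 L244F ARS20",
]
mutations = [parse_row(r) for r in rows]

# Reverse genetics lookup: which mutant lines hit a given sorghum gene?
by_gene: dict[str, list[str]] = {}
for m in mutations:
    by_gene.setdefault(m.sorghum_gene, []).append(m.mutant_id)
print(by_gene["Sobic.009G083300"])
```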
Biology Enabled Agriculture
Heat tolerance
Regulatory Networks
EMS population, Fitness impact
Genomes
ATP dependent protease
Complex Traits: Yield > Stress > Heat tolerance
Ftsh11 identified in a model plant
USDA-ARS, Lubbock TX: Zhanguo Xin, Gloria Burow, Ratan Chopra, John Burke, Chad Hayes
Jiao et al. Plant Cell 2016
G × (environment + management) = P
Yinping Jiao, Zhanguo Xin
Biology Enabled Agriculture
Metabolic Pathways
Genome Editing, Marker Assisted Breeding
Genomes
Long chain fatty acid, ATP dependent protease
Complex Traits: Stress Resilience > Water Use Efficiency
Jiao et al. Front. Plant Science 2018
Mass Spec
USDA-ARS, Lubbock TX: Zhanguo Xin, Gloria Burow, Ratan Chopra, John Burke, Chad Hayes
G × (environment + management) = P
Yinping Jiao, Zhanguo Xin, Nick Gladman
EMS population
Resistance to Southern Leaf Blight
Kump et al. Nat Genet 2011
Climate change impacts disease pressures
• Yield loss: pesticide spraying increases direct costs and impacts on the environment and human health
• Disease resistance “R” genes (NLRs) evolve rapidly and often occur in clusters, making them good candidates for structural variation
• Pan-genomes: high quality reference assemblies support characterization of core and dispensable (adaptive) genes
Sorghum leaf spot
http://texassorghum.org/wp-content/uploads/2015/10/Fig.-2.jpg
R gene
Sugarcane aphids
https://www.myfields.info/pests/sugarcane-aphid
• Since 2013, sugarcane aphids (SCA) have been causing enormous damage to the sorghum crop in the US
• Tx2783 has high resistance to SCA
Tx2783 SCA locus mapped to the top of chromosome 6
Collaborators: Yinghua Huang (USDA-ARS), Zhanguo Xin (USDA-ARS), Chad Hayes (USDA-ARS)
Wang et al., bioRxiv, 2021
Sorghum sugarcane aphid tolerant TX2783 reference assembly
447 PacBio contigs (25.6 Mb contig N50)
Wang et al., bioRxiv, 2021
Chr. 06
1. Tx2783 191 kb insertion (black box)
2. All of the genes in the cluster are disease resistance genes
3. Complex region with local duplications
4. Inversion
Chr. 10
Sorghum Pan Genomes site: TX2783, BTX623, TX436, TX430, RIO
Assembly Gene Neighborhood conservation view
Andrew Olson, Sharon Wei, Kapeel Chougule
Biology Enabled Agriculture
Yield > Biotic Stress > Disease Resistance
Signaling pathway
Genome Editing, Marker Assisted Breeding
Expression
Genomes
NLR R Genes
G × (environment + management) = P
Wang et al., bioRxiv, 2021
Yinping Jiao, Zhanguo Xin, Bo Wang
TX2783 191 kb structural variant (SV) containing a cluster of R genes
Resequencing data from 62 accessions show that the SV is segregating at a low frequency in these lines.
Liya Wang
[Figure: gene regulatory network. Node labels include AGI locus identifiers (e.g., AT1G31320), transcription factors such as ARF, IAA, NF-YA, and WRKY family members, and miR169 and miR393a microRNAs. Individual node labels omitted.]
Exploring Genomes with an eye toward breeding
Biology Enabled Agriculture
Phenomes
Metabolic Pathways, Protein Complexes, Regulatory Networks
Genome Editing, Genomic Selection, Marker Assisted Breeding
QTL, GWAS
Metabolomics, Expression, Epigenetic
Genomes
Gene Networks
Association Analyses: genes with major effect
Comparative analysis: transfer information across species
Regulatory and Inference Networks: many genes with minor contributions associated with a complex trait
Improved understanding of biological mechanisms & systems will improve breeding models and support for genetic engineering
Genomes × (Environment + Management) = Phenotypes
Metabolites, Proteins, Transcripts, Chromatin
Decreasing cost of sequencing leads to increasing compute and data management demands
$50 million (2009), Sequencing Centers: BAC library, Sanger sequencing library, finishing libraries, compute
$250-180 thousand (2016), Sequencing Centers: PacBio long single molecule, optical map, Illumina short read; high quality DNA, library prep, access to sequencer & ***compute
$90-50 thousand (2017), Sequencing Centers: PacBio long single molecule, optical map, 10X Illumina short read; high quality DNA, library prep, access to sequencer & ***compute
$45-25 thousand (2018), Local/Sequencing Centers
$2-6 thousand (2021): long single molecule
Decrease sequence cost
Increase compute cost
Improve assembly quality
Schnable, Ware et al., Science 2009; Jiao et al., Nature 2017; Ou et al., Genome Biology 2020; Liu et al., Nature Comm 2020; Wang et al., submitted 2021; Hufford et al., submitted 2021
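The size of the cost drop is easier to appreciate as a fold change. The Python sketch below uses the approximate dollar ranges quoted on this slide; the `fold_decrease` helper is an illustrative name, and the result is only as precise as those rounded ranges.

```python
# Approximate per-genome cost ranges (USD, low to high) as listed on the slide.
costs = {
    2009: (50_000_000, 50_000_000),
    2016: (180_000, 250_000),
    2017: (50_000, 90_000),
    2018: (25_000, 45_000),
    2021: (2_000, 6_000),
}


def fold_decrease(start_year: int, end_year: int) -> tuple[float, float]:
    """Return (conservative, optimistic) fold decrease between two years."""
    start_low, start_high = costs[start_year]
    end_low, end_high = costs[end_year]
    return start_low / end_high, start_high / end_low


low, high = fold_decrease(2009, 2021)
print(f"2009 -> 2021: roughly {low:,.0f}x to {high:,.0f}x cheaper")
```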
Sensors & Metadata: sequencers, microscopy, imaging, mass spec, metadata & ontologies
IO Systems: hard drives, networking, databases, compression, LIMS
Compute Systems: CPU, GPU, distributed, clouds, workflows
Scalable Algorithms: streaming, sampling, indexing, parallel
Machine Learning: classification, modeling, visualization & data integration
Results
Domain Knowledge
Biology has transitioned to an Information Science
Keystone Big Data in Biology 2014: Stein, Schatz, Ware
“Big Data” Biology Pyramid
Quantitative Biology Technologies
Artificial Intelligence (AI) is allowing us to reimagine how we approach science
“AlphaFold is a once in a generation advance, predicting protein structures with incredible speed and precision. This leap forward demonstrates how computational methods are poised to transform research in biology and hold much promise for accelerating the drug discovery process.”
Arthur D. Levinson, PhD, Founder & CEO, Calico; Former Chairman & CEO, Genentech
Wild Grapes have natural resistance to disease
Wild Grapes come with baggage they have shed with domestication
Wild: Male and Female
Domesticated: Hermaphrodite
Unpleasant taste
This (2006); Feechan et al. (2015)
Plant NLR R-genes
Tyler B. (2001). Trends in Genetics. 17(11):611-614
Pathogen elicitor
Protein domains: CC (coiled-coil), NB-ARC (ATP-binding), LRR
Gene-for-gene hypothesis
Imagine a world where you can begin to build protein-ligand models for disease resistance genes!
The Next Green Revolution will be “Data” and “Design” driven
Agriculture has transitioned to a Data Science. Massive data generation: genotypes, phenotypes, soil & environment.
Data now exceeds human capacity to formulate and test hypotheses about gene function, regulatory networks, and predictive agriculture.
Develop new approaches and systems that can supply new hypotheses for researchers for crop breeding, fermentation systems, solar energy capture, pest and disease management.
Adapt what has evolved in nature and design the space nature missed
Norman Borlaug, Nobel Peace Prize 1970
While CRISPR has the power to enhance breeding by rapidly customizing and optimizing crop productivity, genome biology provides the insights to drive the translation
Unraveling Complex Traits using Genomic, Genetic, Systems & Computational Approaches
Advancing Agriculture Through Collaborative Research on Crop & Model Species
Ware Lab: Kapeel Chougule, Nick Gladman, Yinping Jiao, Vivek Kumar, Sunita Kumari, Forrest Li, Augusto Lima Dinz, Zhenyuan Lu, Andrew Olson, Michael Regulski, Bo Wang, Liya Wang, Xiaofei Wang, Marcela Tello Ruiz, Sharon Wei, Peter Van Buren, Lifang Zhang, Carol Hu
CSHL: Dick McCombie, Rob Martienssen, Dave Jackson, Tom Gingeras, Dave Micklos
Uplands Farm: Tim Mulligan, Kyle Schlecht
HPCC resources: Todd Heywood