Software and Data as Scaffolds for Integrative Science
David LeBauer, Ph.D.
University of Illinois at Urbana-Champaign
Department of Agricultural and Biological Engineering
Carl R. Woese Institute for Genomic Biology
National Center for Supercomputing Applications
Outline
• Overview: Problems and Approach
• Combining Information
• Models: integration across domains
• PEcAn: integration of models and data
• TERRA REF: automated data collection and analysis
• Future Directions
Challenges we face
• Agricultural Production:
  • Feeding 9 bn by 2050
  • Climate is changing
  • Resources are becoming scarce
• Scientific Problems:
  • How do genes control traits?
  • How can we leverage data and computing?
[Figure: Yield, Fertilizer, Pesticides. Tilman et al, Nature 2002]
Technical Solutions for Science and Agriculture
• Knowledge is Spread Across Many Scales and Formats:
  • Expert Knowledge
  • Data
  • Mechanistic Models
• Integrating these will enable:
  • Stronger Inference and Prediction
  • More Science and Engineering
Marshall-Colon et al 2017 Frontiers in Plant Science
Opportunities Across Scales
Spatial Scale: 10^2 m, 10^-3 m, 10^3 m, 10^4 m, 10^5 m
Questions:
• Which crops are viable, … and where? What fraction of global energy/food demand?
• County level mean yields; Supply chain optimization
• Local topography: soil, hydrology; Sub-field management
• Crop Architecture; Row Spacing/Orientation; Harvesting Equipment; Shading response
Outline
• Conceptual Overview
• Computational Solutions
• Crop Models
• PEcAn
• TERRA REF
• Future Directions
Zhu, Lynch, LeBauer, Millar, Stitt, Long, 2015 Plant Cell & Environment
Starting Point: Conceptual Models
Evan Delucia
BioCro: Combining Biology, Physics, Chemistry
Humphries and Long 2005; Miguez et al 2009, 2012; Jaiswal, DeSouza, Larsen, LeBauer, … et al 2017; Wang, Jaiswal, LeBauer, … et al 2015
Inputs:
• Meteorology (energy, water)
• Soil (physics, carbon, nutrients)
• Parameters (e.g. plant traits)
Outputs:
• Yield, Biomass
• Energy Balance
• Water Use
• Nutrient Use
Scaling Photosynthesis from Leaf to Canopy
[Figure: panels with axis labels Light, Temperature, Photosynthesis]
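One common textbook way to scale photosynthesis from leaf to canopy is to attenuate light through the canopy with Beer's law and sum a leaf-level light response over layers of leaf area. The sketch below illustrates that idea only. The rectangular-hyperbola light response, the parameter values, and the function names are all assumptions for illustration and are not taken from BioCro.

```python
import math


def leaf_photosynthesis(light, a_max=30.0, k_half=400.0):
    """Leaf assimilation as a saturating function of absorbed light.

    A rectangular hyperbola; a_max and k_half are made-up illustrative values.
    """
    return a_max * light / (k_half + light)


def canopy_photosynthesis(light_top, lai, k_ext=0.5, n_layers=10):
    """Sum leaf photosynthesis over canopy layers (per unit ground area).

    Light at cumulative leaf area index L follows Beer's law:
    I(L) = light_top * exp(-k_ext * L).
    """
    d_lai = lai / n_layers
    total = 0.0
    for i in range(n_layers):
        # Cumulative leaf area above the midpoint of layer i.
        lai_above = (i + 0.5) * d_lai
        light = light_top * math.exp(-k_ext * lai_above)
        total += leaf_photosynthesis(light) * d_lai
    return total


if __name__ == "__main__":
    print(canopy_photosynthesis(light_top=1500.0, lai=4.0))
```

Because lower layers receive less light, the canopy total is smaller than the top-leaf rate multiplied by the leaf area index, which is why leaf-level measurements cannot simply be multiplied up.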
Scaling Up & Predicting the Future
IPCC AR5; Warszawski et al. PNAS
[Figure: Temperature, Precipitation]
Climate Forecasts (2040-2050), CMIP5: 5 Climate models × 4 CO2 emissions Scenarios
Effects of Climate on Sugarcane Yield in Brazil 2040-2050
[Figure: Climate; Impact (metric tons / ha)]
Jaiswal, DeSouza, Larsen, LeBauer, Miguez, Sparovek, Bollero, Buckeridge, Long, 2017
Scaling leaf-level CO2 x T x H2O response
Outline
• Conceptual Overview
• Computational Solutions
• Crop Models
• PEcAn: Linking Models and Data
• TERRA REF
• Future Directions
PEcAn
LeBauer et al, 2013
Ecological Model-data Synthesis
LeBauer and Treseder, 2008
Thomas et al 2013
Combining Data and Models Is Hard, Mostly a Technical Challenge
Just Running a Model is Hard
[Workflow diagram labels: Publications, Primary Data, Repositories; Wild Data, Relevant Information, Configuration; Traits, System States, Soil, Meteorology; Parameters, Boundary Conditions, Drivers; Run Model, Outputs; Analyses: Sensitivity, Calibration, Validation, Prediction]
Most of this work is model independent, so solutions can be shared
The Standard Approach: Redundant, Labor Intensive, Error Prone
Data Sources: NARR, NOAA, Fluxnet, CMIP5, Met Station, … m = 10
Ecosystem Models: BioCro, ED2, CLM, SIPNET, ... n = 12
Analyses: Prediction, Calibration, Sensitivity, Validation, Visualization
Converter: for Met, need one converter per driver (m) × model (n) combination
PEcAn common formats: Many users use, reuse, test, and improve components
Diverse Met Data: NARR, NOAA, Fluxnet, CMIP5, Met Station, … m = 10
Common Format
Ecosystem Models: BioCro, ED2, CLM, SIPNET, ... n = 12
Common Format
Analyses: Prediction, Calibration, Sensitivity, Validation, Visualization
Converter: only need n + m (not n × m) converters. Less work, more robust and valid results
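The converter arithmetic can be made concrete with a small sketch. With 10 met sources and 12 models, pairwise conversion needs 120 converters, while routing everything through one common format needs 22. The source names, model names, and field names below are hypothetical placeholders, not PEcAn's actual formats or API.

```python
from typing import Callable, Dict

# Hypothetical common record that every source is converted to exactly once.
CommonMet = Dict[str, float]

# One converter per source: raw record -> common format.
to_common: Dict[str, Callable[[dict], CommonMet]] = {
    "station_celsius": lambda raw: {
        "air_temp_K": raw["temp_C"] + 273.15,
        "precip_mm": raw["rain_mm"],
    },
    "reanalysis_kelvin": lambda raw: {
        "air_temp_K": raw["t2m"],
        "precip_mm": raw["pr_m"] * 1000.0,
    },
}

# One converter per model: common format -> model-specific input.
from_common: Dict[str, Callable[[CommonMet], dict]] = {
    "model_a": lambda met: {
        "T_degC": met["air_temp_K"] - 273.15,
        "P": met["precip_mm"],
    },
    "model_b": lambda met: {
        "tair": met["air_temp_K"],
        "rain_cm": met["precip_mm"] / 10.0,
    },
}


def convert(source: str, model: str, raw: dict) -> dict:
    """Convert any source to any model by composing two converters."""
    return from_common[model](to_common[source](raw))


def converters_needed(m: int, n: int) -> Dict[str, int]:
    """Count converters for m sources and n models under each approach."""
    return {"pairwise": m * n, "common_format": m + n}


if __name__ == "__main__":
    print(converters_needed(10, 12))
    print(convert("station_celsius", "model_b", {"temp_C": 20.0, "rain_mm": 5.0}))
```

Adding an eleventh source in this design means writing one new `to_common` entry, after which every existing model can use it.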
Parameter Estimation: Combining Literature and Field Data
LeBauer et al, 2013
LeBauer et al, 2013
Given current data, what drives uncertainty? 3 Years, 1 crop, 1 location
PEcAn Variance Decomposition
Bars: Parameter Contribution to Uncertainty in Yield Prediction
Grey = Prior, Black = Posterior
Used to inform optimal data collection
LeBauer et al, 2013
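The idea behind a variance decomposition can be sketched as follows: vary one parameter at a time over draws from its distribution while holding the others at their medians, then report each parameter's share of the summed output variance. This is a simplified one-at-a-time illustration. The toy model, the parameter names, and the distributions are assumptions, and this is not PEcAn's implementation.

```python
import random
import statistics


def toy_yield(params):
    """Hypothetical stand-in for a crop model run."""
    return (
        params["growth_rate"] * 10.0
        + params["leaf_area"] * 2.0
        - params["respiration"] * 5.0
    )


def variance_decomposition(model, samples):
    """Return each parameter's share of one-at-a-time output variance.

    samples maps a parameter name to a list of draws from its
    (prior or posterior) distribution.
    """
    medians = {name: statistics.median(draws) for name, draws in samples.items()}
    partial = {}
    for name, draws in samples.items():
        outputs = []
        for value in draws:
            params = dict(medians)  # all other parameters stay at their medians
            params[name] = value
            outputs.append(model(params))
        partial[name] = statistics.pvariance(outputs)
    total = sum(partial.values())
    return {
        name: (var / total if total > 0 else 0.0) for name, var in partial.items()
    }


if __name__ == "__main__":
    rng = random.Random(42)
    samples = {
        "growth_rate": [rng.gauss(1.0, 0.2) for _ in range(500)],
        "leaf_area": [rng.gauss(3.0, 0.1) for _ in range(500)],
        "respiration": [rng.gauss(0.5, 0.05) for _ in range(500)],
    }
    for name, share in variance_decomposition(toy_yield, samples).items():
        print(f"{name}: {share:.2f}")
```

Running the same function on prior draws and then on posterior draws gives the two sets of bars described on the slide, and the parameters with the largest remaining share are the natural targets for new data collection.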
Automation & Reuse: Uncertainty analysis
bars/color = Parameter Contribution to Predictive Uncertainty
3 Years, 1 crop, 1 location
Dietze et al, 2014
~1 Year, 8 scientists, 17 PFTs, 6 biomes
Targeted Field Study: Willow Water Use
Wertin, LeBauer, Volk, Leakey, in prep
Predictions
Before / After Data Collection
Add Data, Configure, Analyze, Run
Making Crop & Ecosystem Models Accessible
LeBauer et al 2013, Kooper et al 2013, Dietze et al, 2013
PEcAn is a community project
42 Contributors, >50 citations, Textbook, 100s of students trained
PEcAn Radiative Transfer Model Inversion
Ely, Serbin, Shiklomanov, Dietze and others
PEcAn now provides a place for shared models, data access, and tools
Tools: Web front end; PostGIS database*; Met Scaling and Gap filling; Data Ingest; Meta-Analysis*; Sensitivity & Uncertainty Analysis*; Ensemble Prediction; Parameter Data Assimilation; State Data Assimilation; Benchmarking; Visualization*
Data Modeling: Radiative Transfer; Photosynthesis; Tree Rings
Models: BioCro*, CABLE, CLM, DALEC, ED*, FATES, G’Day, JULES, Linkages, LPJguess, MAAT, MAESPA, PRELES, SIPNET
Data: Literature*, Field Measurements, Expert Priors*, Meteorology, Soils, PalEON, Fluxnet, ORNL, NEON, TERRA REF*, LTER, …
github.com/pecanproject/pecan
pecanproject.org
Outline
• Conceptual Overview
• Computational Solutions
• Crop Models
• PEcAn: Linking Models and Data
• TERRA REF
• Future Directions
High Throughput Phenotyping
• High Throughput Phenotyping:
  • Replace manual with sensor-based measurements
  • Measure more traits with higher frequency
• But… sensors are expensive and data are difficult to interpret
• TERRA program: a major investment to push this forward
http://bulletin.ipm.illinois.edu/print.php?id=513
TERRA REF
• Motivation:
  • Automated Measurements → Stronger Inference
  • Software & Data → Framework for Interdisciplinary Collaboration
• Solutions:
  • Reference Datasets
  • Modular and Interoperable
  • Open Data, Software, Computing
A Phenomics Pipeline for Crop Improvement
Sensors Traits Genotypes
Selection
Genomics
Higher Yield, Yield Stability, Nutrition, Stress Tolerance, and more …
Automated Measurements
Component & Aggregate
Genomic Prediction
Pan Genome
Diverse Scientific Disciplines
Sensors Traits Genotypes
Selection
Genomics
Engineering, Robotics, Computer Vision
(Eco)Physiology, Agronomy
Biology
Breeding
Statistics & Machine Learning
ARPA-E TERRA
Open Dataset for Six Projects + Public Release
TERRA Reference Data Sources
• Lemnatec Scanalyzer: Danforth, St. Louis
• Lemnatec Field Scanner: USDA ALRC, Maricopa, AZ
• Tractor and UAV: AZ and Kansas State
Field Scanner Sensors
terraref.org/articles/lemnatec-scanalyzer-field-sensors/
• VNIR Imaging Spectrometer: 380-1000 nm
• SWIR Imaging Spectrometer: 900-2500 nm
• IR Temperature Sensor
• NDVI (1 down, 1 up): 650, 800 nm
• PRI Sensor: 531, 570 nm
• PAR Sensor: 410-655 nm
• Color Sensor: 410-655 nm
• 3D Scanners: 2 Side View, 1 Down
• RGB: 2 Side View, 1 Down (1)
• Active Reflectance: 670, 730, 780 nm
• PS II Fluorescence
• Environmental: wind, temperature, humidity, light, rain, CO2
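The wavelength pairs listed for the NDVI and PRI sensors correspond to two standard normalized-difference indices. As a minimal sketch, assuming reflectance values at those wavelengths are already available as plain numbers, the indices can be computed as below. The function names are illustrative, and nothing here describes how the TERRA REF sensors or pipeline actually process their readings.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    total = a + b
    if total == 0:
        return None
    return (a - b) / total


def ndvi(nir_800, red_650):
    """Normalized Difference Vegetation Index from near-infrared and red reflectance."""
    return normalized_difference(nir_800, red_650)


def pri(r_531, r_570):
    """Photochemical Reflectance Index from reflectance at 531 nm and 570 nm."""
    return normalized_difference(r_531, r_570)


if __name__ == "__main__":
    print(ndvi(nir_800=0.5, red_650=0.1))
    print(pri(r_531=0.048, r_570=0.05))
```

Both indices are bounded between -1 and 1 for non-negative reflectances, which makes them easy to compare across dates and plots.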
Approach: Integrate Software and Databases
• What do people currently use?
• What domain specific software and databases exist?
• How can we connect these?
• What standards & conventions to adopt?
General Framework for Cross-Domain Links
Sensors Traits Genotypes
Selection
Genomics
Location, Time
Genotype
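One way to read this framework is that location and time identify a measurement, and genotype ties that measurement to genomic data. The sketch below links hypothetical sensor observations to marker data through a plot-to-genotype lookup. All record layouts, field names, and values are invented for illustration and do not reflect the TERRA REF database schema.

```python
# Hypothetical sensor-derived observations, keyed by location (plot) and time (date).
sensor_obs = [
    {"plot": "P1", "date": "2017-06-01", "canopy_height_cm": 82.0},
    {"plot": "P2", "date": "2017-06-01", "canopy_height_cm": 95.0},
]

# Hypothetical field layout: which genotype was planted at each location.
plot_layout = {"P1": "genotype_A", "P2": "genotype_B"}

# Hypothetical genomic data, keyed by genotype.
markers = {"genotype_A": {"snp_1": 0}, "genotype_B": {"snp_1": 2}}


def link(observations, layout, genomic):
    """Attach genotype and marker values to each observation via its plot."""
    linked = []
    for obs in observations:
        genotype = layout.get(obs["plot"])
        if genotype is None:
            continue  # no known genotype at this location, so nothing to link
        linked.append({**obs, "genotype": genotype, **genomic.get(genotype, {})})
    return linked


if __name__ == "__main__":
    for row in link(sensor_obs, plot_layout, markers):
        print(row)
```

Keeping the shared keys small (location, time, genotype) is what lets each domain keep its own storage and formats while still being joined on demand.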
Data Formats, Standards & Conventions
Sensors Traits Genotypes
Selection
CF Conventions, OGC
geoTIFF, NetCDF-CF, LAS
PEcAn, Crop Ontology, AgMIP/ICASA, BRAPI
BAM, FASTQ, VCF, BED, FASTA, GFF
Genomics
TERRA REF Databases
Sensors Traits Genotypes
Selection
Genomics
Modular Software
github.com/terraref
TERRA REF Pipeline
Field measurements
Metadata
Trait Data
Pipeline Orchestration
Sensor Data
Analysis & Development
1 TB/d
<48 h
Genomics
Data Analysis Environments
Any Linux Configuration + Large Filesystem + Databases + Compute
Workflows: Analyze → Share → Publish; Develop → Deploy
workbench.terraref.org
~/data ~/tutorials
Web Application Developed with NDS Workbench
traitvis.workbench.terraref.org
3D Laser Scanner
218 mm
Robert Pless, Zongyang Li, Solmaz Hajmohammadi
Hyperspectral Image at 543 nm
[Figure: % Reflectance; 10 cm scale; N; scan direction; axes x, y]
Thermal
Automated Detection Algorithms (Time Series of Panicle Counts)
Zongyang Li and Robert Pless
Genes That Control Growth Rate
Geoff Morris & Zhenbin Hu, KSU
LOD (Logarithm of Odds): genes linked to trait
Get involved
• Sign up for beta release of software and data
  • terraref.org/data
• Use and provide feedback on software and data formats
  • github.com/terraref
• Collaborate
  • Field measurements
  • Software
  • Algorithms
  • Colocated Sensors
Outline
• Conceptual Overview
• Computational Solutions
• Crop Models
• PEcAn: Linking Models and Data
• TERRA REF
• Future Directions
Training is the bottleneck
Barone et al 2017 bioRxiv “Unmet Needs for Analyzing Biological Big Data: A Survey of 704 NSF Principal Investigators”
Software Carpentry; XSEDE.org, Shared Clusters
Hackathons and Training
Introduction to data science, with examples and projects from TERRA REF
Arkansas State University, Iowa State University, Purdue University, University of Arizona, University of Illinois, University of Nebraska, University of Arkansas
Topp et al, unpublished
Sensor Modeling and Model Coupling
Topp et al. unpublished
Modular Model Components
Zhu, Lynch, LeBauer, Millar, Stitt, Long, 2015 Plant Cell & Environment
Marshall-Colon et al 2017 Frontiers in Plant Science
cropsinsilico.org
Each component represents >= 1 hypothesis.
Each parameter or output can be treated as a phenotype
Environmental drivers can be integrated over to address GxE
Purdue Phenomics & IoT Platforms
• Develop Cyberinfrastructure
• Make data useable
• Facilitate interdisciplinary research
• Assess existing capabilities, current roadblocks, future needs
• Work with Library, RCAC, faculty to facilitate data publishing
• QA/QC
• Community Standards and Common Interfaces
Funding:
NSF Advances in Biological Infrastructure
USDA NIFA Food and Agriculture Cyberinformatics and Tools
Agricultural Technology
Once we understand how these systems work, we can engineer for ecosystem services rather than solely for yield:
• Climate control
• Soil improvement, carbon storage
• Roots, mycorrhizae, microbiome
• Pharmaceuticals
• Petrochemical Substitutes
• … anything plants can do
NASA Ames Research Center
Team
Todd Mockler: Project Lead
Nadia Shakoor: Project Director
Noah Fahlgren: Phenotyping & Bioinformatics
Erica Fishel: Technology Transfer
Solmaz Hajmohammadi: Sensor Fusion
Stephen Kresovich: Breeding
Jeremy Schmutz: Sequencing
Geoff Morris: Gene-trait Associations
William Rooney: Breeding
Pedro Andrade-Sanchez: Agronomy & Phenomics
Michael Ottman: Physiology
Maria Newcomb: Field Measurements
Jeff White: Agronomy
David LeBauer: Informatics & Computing
Robert Pless: Image Analysis
Roman Garnett: Prediction Algorithms
Wasit Walamu: Sensing & Physiology
Max Burnette, Craig Willis, Rob Kooper, Jeff Terstreip, Zongyang Li, Zhenbin Hu, Nick Heyek, Charlie Zender, Henry Butowsky
• Mike Dietze, Boston University
• David LeBauer, University of Illinois
• Shawn Serbin, Brookhaven National Lab
• Ankur Desai, University of Wisconsin
• Kenton McHenry, National Center for Supercomputing Applications
• and many other user/contributors
David LeBauer
dlebauer@illinois.edu
TERRA REF
terraref.org
github.com/terraref
@terra_ref
PEcAn Project
pecanproject.org
github.com/pecanproject
@pecanproject