FAIRDOM – FAIR Asset management and sharing experiences in Systems and Synthetic Biology
Prof Carole Goble
The FAIRDOM [email protected]
http://fair-dom.org, http://fairdomhub.org
COMBINE 2016, Newcastle, UK 19 September 2016
FAIR
Findable
Accessible
Interoperable
Reusable
https://www.force11.org/group/fairgroup/fairprinciples
Intelligible
Reproducible
Citable
Track & Countable
Findable, Accessible, Interoperable, Reusable
Data, Operations, Models
Systems and Synthetic Biology
Projects
Sys Bio Projects ....
P1. BaCell-SysMO: The transition from growing to non-growing Bacillus subtilis cells - A systems biology approach
P2. COSMIC: Systems Biology of Clostridium acetobutylicum - a possible answer to dwindling crude oil reserves
P3. SUMO: Systems Understanding of Microbial Oxygen Responses, Escherichia coli
P4. KOSMOBAC: Ion and solute homeostasis in enteric bacteria, Escherichia coli
P5. SysMO-LAB: Comparative Systems Biology: Lactic Acid Bacteria: Lactococcus lactis, Enterococcus faecalis, Streptococcus pyogenes
P6. PSYSMO: Systems analysis of biotech induced stresses: towards a quantum increase in process performance in the cell factory Pseudomonas putida
P7. SCaRAB: Systems Biology of a genetically engineered Pseudomonas fluorescens with inducible exo-polysaccharide production: analysis of the dynamics and robustness of metabolic networks
P8. MOSES: MicroOrganism Systems Biology: Energy and Saccharomyces cerevisiae
P9. TRANSLUCENT: Gene interaction networks and models of cation homeostasis in Saccharomyces cerevisiae
P10. STREAM: Global metabolic switching in Streptomyces coelicolor
P11. SulfoSYS: Silicon cell model for the central carbohydrate metabolism of the archaeon Sulfolobus solfataricus under temperature variation
P12. SysMO-DB: Data management group
2006: 11 projects
2015: 12 projects
Stakeholders Stress…
Sponsored to Support Projects: National level and ERANets
Funders / Institutions
• Capitalising on investments
• Skills
• Justification, Audit, Compliance
• Showcase access
Publishers
• Reproducibility
• New publishable assets
• New services
• Showcase access
Projects…. Researchers’ Productivity Rhetoric
Multi-partner, Distributed, Dynamic membership, Overlaps, Sensitivities
Collaboration, Publication, Showcasing, Compliance, Retention
Home-grown resources, Public resources, Mixed Skills
Projects’ “PAP-PA” People, Assets, Processes -> Publishing, Analysis
People
Assets
Processes
Projects’ People, Assets, Processes
Organisation, Communication, Dissemination
Simulation
SOPs
Data Models
Standards
Analytics
Pipelines
Samples
Tracking
Versioning
Validation
Articles
Strains
modeller
experimentalist
Upstream, downstream assets discovery
Organisation, Communication, Dissemination
Helps navigation
Reuse later
Enable team to reuse/reproduce
Help others find out
Reuse with new partners
Tell more and take credit
Collecting and tracking data/models
Choosing what to keep
Preparing what to share and when
Most data/models won’t be shared
• Wrong experimental method
• Hidden parameter discovered
• Faulty experiment
Promote standardised metadata practices
FAIR Projects & Programmes
Who is working with which organism?
What methods are being used to determine enzyme activity?
Under which experimental conditions are my partners working for the measurement of glucose concentration?
What is the provenance of the parameters for this version of the model?
What SOP was used for this sample?
Where is the validation data for this model?
Is there any group generating kinetic data?
Is this data available?
Track versions of my model
What's the relationship between the data and model?
Which data belong to which publications?
SOP Directory
Yellow Pages
Spreadsheet tools
Auto-harvesting
FAIR Projects and Programmes
Findable
• Exchange & find assets and people
• Citation
• Credit
Accessible
• Share, disseminate and publish assets sensitively
• Gateway to third party tools, archives
• Store assets
• Package assets
Interoperable
• Track collection of data and metadata
• Consistent reporting for interpretation, interop & comparison
• Promote and support standardised metadata practices
• Support reproducible publications
Reusable
• Organise and link assets
• Maintain the experimental context
• Retain results beyond a project
• Reuse results, tools, archives
• Respect local solutions
Standards! Standards!
FAIRDOM Initiative
Project Support
Community Actions
Platforms, Tools
Public Project Commons
FAIRDOM Community, Knowledge Hub
http://www.fair-dom.org
Know-how, Guides, Templates, Workshops, Training, Webinars, Standards and Policy Forums
Policy and International Initiatives
Surveys
Stanford et al The evolution of standards and data management practices in systems biology, Molecular Systems Biology (2015) 11: 851 DOI 10.15252/msb.20156053
Community Clubs
http://www.fair-dom.org
Samples Club with ELIXIR, BBMRI-ERIC, EBI…
Rework and harmonise sample metadata framework
bioschemas.org
Developers Foundry
Support developers of Systems Biology tools and platforms
3rd Foundry meeting, Dec 1-2 2016, Frankfurt
FAIRDOM Platforms and Tools
Web-based
• FAIR Sharing Metadata Catalogue
• Project Commons
• Cross-repository gateway
• Tool gateway
• Results repository
Local-based
• Local Storage Analytics
• Tracking
• LIMS, Auto-archiving
• In flight repository
FAIRDOM Platform Installations
*Troup, E.; Clark, I; Swain, P; Millar, AJ; Zielinski, T (2015) Practical evaluation of SEEK and openBIS for biological data management in SynthSys http://hdl.handle.net/1842/12236
Local retention and In flight management, Private sharing
Centres, large or national projects
Local skills
One stop showcase
Programmes
Post-project retention
Supplementary materials
People and Project Commons FAIRDOMHub.org
self-managed workspaces
Sharing sensitivity
Investigation
Study Analysis
Data
Model
SOP(Assay)
Linking, “Packaging” & Citing Codes, Data, Models, SOPs, Samples, Strains, Articles, People, Projects….
Packaging
Jon Olav Vik, Norwegian University of Life Science
Programme: Overarching research theme (The Digital Salmon)
Project: Research grant (DigiSal, GenoSysFat)
Investigation: A particular biological process, phenomenon or thing (typically corresponds to [plans for] one or more closely related papers)
Study: Experiment whose design reflects a specific biological research question
Assay: Standardized measurement or diagnostic experiment using a specific protocol (applied to material from a study)
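The nesting of Programme, Project, Investigation, Study and Assay can be sketched as plain data classes. This is only an illustration of the hierarchy, not the SEEK data model: the class and field names are assumptions, and everything below the programme and project names is a hypothetical placeholder.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    title: str
    protocol: str  # the specific protocol (SOP) the assay follows


@dataclass
class Study:
    title: str
    question: str  # the biological research question behind the design
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    studies: List[Study] = field(default_factory=list)


@dataclass
class Project:
    title: str
    investigations: List[Investigation] = field(default_factory=list)


@dataclass
class Programme:
    title: str
    projects: List[Project] = field(default_factory=list)


def assay_paths(programme: Programme) -> List[str]:
    """Return one 'Programme / Project / Investigation / Study / Assay' path per assay."""
    paths = []
    for project in programme.projects:
        for investigation in project.investigations:
            for study in investigation.studies:
                for assay in study.assays:
                    paths.append(" / ".join([
                        programme.title, project.title,
                        investigation.title, study.title, assay.title,
                    ]))
    return paths


# Programme and project names are from the slide; the rest is hypothetical.
programme = Programme("The Digital Salmon", [
    Project("DigiSal", [
        Investigation("Example investigation", [
            Study("Example study", "Example research question", [
                Assay("Example assay", "Example SOP"),
            ]),
        ]),
    ]),
    Project("GenoSysFat"),
])

for path in assay_paths(programme):
    print(path)
```

Keeping each assay reachable through a single path is what lets an asset retain its experimental context when it is packaged or cited.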
Jon Olav Vik, Norwegian University of Life Science
Visible, but details may be inaccessible
Visibility controls
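One way to picture "visible, but details inaccessible" is a per-asset policy that grants each audience an access level. The sketch below is an assumption made for illustration: the level names, the policy dictionary and the group names are hypothetical and do not describe how FAIRDOMHub implements its controls.

```python
from enum import IntEnum
from typing import Dict, Iterable


class Access(IntEnum):
    HIDDEN = 0      # asset is not listed at all
    VISIBLE = 1     # asset is listed, details are withheld
    ACCESSIBLE = 2  # details and files can be opened


def effective_access(policy: Dict[str, Access], groups: Iterable[str]) -> Access:
    """Most permissive level granted to the public or to any of the viewer's groups."""
    levels = [policy.get("public", Access.HIDDEN)]
    levels.extend(policy.get(group, Access.HIDDEN) for group in groups)
    return max(levels)


def listing(title: str, summary: str, level: Access) -> str:
    """What a viewer with the given level would see for one asset."""
    if level == Access.HIDDEN:
        return ""
    if level == Access.VISIBLE:
        return f"{title} (details restricted)"
    return f"{title}: {summary}"


# Hypothetical policy: the public sees that the asset exists, project members see it all.
policy = {"public": Access.VISIBLE, "project-members": Access.ACCESSIBLE}

print(listing("Flux data", "stationary fluxes", effective_access(policy, [])))
print(listing("Flux data", "stationary fluxes", effective_access(policy, ["project-members"])))
```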
INVESTIGATION, STUDY, ASSAY
Experimental assay
Modeling assay
Publication
Maksim Zakhartsev
[Figure: metabolic network overview. Blocks: Glucose, Glycerol, Ethanol, Glycolysis, PPP, TCA, CO2, Lactate, Carbohydrates, Nucleotides, Kinases, Pyrophosphatase, RNA, Electron transport, Maintenance, C1 metabolism, Proteins, NH4 transport, Sulfur metabolism, Biomass, Lipids]
Stoichiometric model
Stoichiometric model of Saccharomyces cerevisiae growing anaerobically at D = 0.1 h⁻¹
Compartments: 1
Pathways: 35
Transformers: 254
• reactions: 214
• transports: 33
• polymerization: 7
Compounds: 253
• balanced comp.: 234
Supplementary information
Annotation file (SEEK)
Stoichiometric matrix (SEEK)
SBML (SEEK)
Stationary fluxes (SEEK)
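A stoichiometric matrix and a set of stationary fluxes are linked by the steady-state condition S·v = 0 for the balanced compounds. The sketch below checks that condition on a three-reaction toy network. The network and flux values are hypothetical and far smaller than the yeast model summarised above.

```python
from typing import List

Matrix = List[List[float]]


def net_production(S: Matrix, v: List[float]) -> List[float]:
    """Compute S·v: the net production rate of each balanced compound."""
    if any(len(row) != len(v) for row in S):
        raise ValueError("each row of S needs one coefficient per flux")
    return [sum(coeff * flux for coeff, flux in zip(row, v)) for row in S]


def is_stationary(S: Matrix, v: List[float], tol: float = 1e-9) -> bool:
    """True when every balanced compound has zero net production."""
    return all(abs(rate) <= tol for rate in net_production(S, v))


# Hypothetical toy network: uptake -> A, A -> B, B -> export.
# Rows are the balanced compounds A and B; columns are the three reactions.
S = [
    [1.0, -1.0, 0.0],  # A: made by uptake, consumed by the conversion
    [0.0, 1.0, -1.0],  # B: made by the conversion, consumed by export
]

print(is_stationary(S, [2.0, 2.0, 2.0]))  # all fluxes equal, so A and B balance
print(is_stationary(S, [2.0, 1.0, 1.0]))  # uptake exceeds conversion, so A accumulates
```

Sharing the matrix and the flux vector as separate files lets anyone rerun this kind of consistency check without the original modelling tool.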
ODE-based model
example = glc-permease
variables: Cexglc, Cglc, Cg6p
parameters: Rmax, Kglc, KI-g6p, KII-g6p
[Figure: time courses of Ci [mmol/L] against time [sec], 0 to 1500 s; panels: glc-ex, glc-in, g6p, f6p]
[Equations: mass balance for extracellular glucose (dC_exglc/dt) and the glucose permease rate law, r_perm = r_influx - r_efflux, in terms of C_exglc, C_glc, C_g6p and the parameters R_max, K_glc, K_I-g6p, K_II-g6p]
[Rizzi et al., 1997]
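The rate law on the slide follows Rizzi et al., 1997 and is not reproduced here. As a rough sketch of how an ODE-based transport model of this kind is stepped through time, the following uses a deliberately simplified, hypothetical carrier rate law (saturable influx minus saturable efflux, with no g6p terms), assumes equal compartment volumes, and uses made-up parameter values.

```python
from typing import List, Tuple


def permease_rate(c_ex: float, c_in: float, r_max: float, k_glc: float) -> float:
    """Net transport rate r_perm = r_influx - r_efflux (simplified, no g6p terms)."""
    r_influx = r_max * c_ex / (k_glc + c_ex)
    r_efflux = r_max * c_in / (k_glc + c_in)
    return r_influx - r_efflux


def simulate(c_ex0: float, c_in0: float, r_max: float, k_glc: float,
             dt: float, steps: int) -> List[Tuple[float, float, float]]:
    """Forward-Euler time course; returns a list of (time, c_ex, c_in)."""
    c_ex, c_in = c_ex0, c_in0
    course = [(0.0, c_ex, c_in)]
    for i in range(1, steps + 1):
        r = permease_rate(c_ex, c_in, r_max, k_glc)
        c_ex -= r * dt  # glucose leaving the extracellular pool
        c_in += r * dt  # the same amount entering the cell (equal volumes assumed)
        course.append((i * dt, c_ex, c_in))
    return course


# Hypothetical values: concentrations in mmol/L, r_max in mmol/L/s, time in seconds.
course = simulate(c_ex0=5.0, c_in0=0.0, r_max=0.05, k_glc=1.0, dt=1.0, steps=1500)
t_end, c_ex_end, c_in_end = course[-1]
print(t_end, round(c_ex_end, 3), round(c_in_end, 3))
```

With only transport in the model, the two pools equalise. The published model adds the downstream reactions and the g6p inhibition terms that this sketch leaves out.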
DATA file
Equations
Simulations
https://dx.doi.org/10.1111/febs.13237
https://doi.org/10.15490/seek.1.investigation.56
Model Publication
Publishing and data citation
reviewer
Institutional Repository
FAIRDOM Catalogue, a web of resources
drawing together across resources; reusing tools and repositories
respect local project solutions, tool plugins
Standards
Personal Data
Local Stores
External Databases
Publishing services
Modelling tools
SOPs
BiVes
Data tools
Specialist Public Repositories
General archives
Repository Repertoire
Stanford et al The evolution of standards and data management practices in systems biology, Molecular Systems Biology (2015) 11: 851 DOI 10.15252/msb.20156053
Local Data Stores
Laissez-Faire
openBIS: metadata extraction, data relationship/linking, data processing, minimal user input
*Troup, E.; Clark, I; Swain, P; Millar, AJ; Zielinski, T (2015) Practical evaluation of SEEK and openBIS for biological data management in SynthSys http://hdl.handle.net/1842/12236
Modelling: standards-based, in-browser validation and simulation
Model comparison and versioning
SBML Model simulation
[Stellenbosch, Rostock]
http://seekdb.bionet.nsc.ru/
Alexey Kolodkin, Siberian Branch, Russian Academy of Sciences
Reproducible model simulations in papers using COMBINE Archive & SED-ML
[Jacky Snoep, Dagmar Waltemath, Martin Peters ]
Three tiered service
store DOI citable supplementary files on FAIRDOMHub
model and data curation
reproducible clickable figures in papers using SED-ML
COMBINE Archives support
Martin Peters Talk Wednesday Morning
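A COMBINE Archive is a zip file whose `manifest.xml` lists every entry with a format identifier. As a hedged illustration of what "COMBINE Archives support" involves at the file level, the sketch below builds a small archive in memory with the Python standard library. The file names and the one-line placeholder contents are hypothetical and are not valid SBML or SED-ML documents.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
FORMATS = {
    "omex": "http://identifiers.org/combine.specifications/omex",
    "manifest": OMEX_NS,
    "sbml": "http://identifiers.org/combine.specifications/sbml",
    "sedml": "http://identifiers.org/combine.specifications/sed-ml",
}


def build_manifest(entries):
    """entries: list of (location, kind) pairs, kind being a key of FORMATS."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        f'<omexManifest xmlns="{OMEX_NS}">',
        f'  <content location="." format="{FORMATS["omex"]}"/>',
        f'  <content location="./manifest.xml" format="{FORMATS["manifest"]}"/>',
    ]
    for location, kind in entries:
        lines.append(
            f'  <content location={quoteattr("./" + location)} format="{FORMATS[kind]}"/>'
        )
    lines.append("</omexManifest>")
    return "\n".join(lines) + "\n"


def build_archive(files):
    """files: {location: (kind, text)}; returns the bytes of the zipped archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        manifest = build_manifest([(loc, kind) for loc, (kind, _) in files.items()])
        archive.writestr("manifest.xml", manifest)
        for location, (_, text) in files.items():
            archive.writestr(location, text)
    return buffer.getvalue()


# Hypothetical placeholder files standing in for a real model and simulation description.
archive_bytes = build_archive({
    "model.xml": ("sbml", "<!-- SBML model goes here -->"),
    "simulation.sedml": ("sedml", "<!-- SED-ML description goes here -->"),
})

with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    root = ET.fromstring(archive.read("manifest.xml"))
    for content in root:
        print(content.get("location"), content.get("format"))
```

Writing the bytes to a file with an `.omex` extension gives something archive-aware tools can open, provided the placeholders are replaced with real documents.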
Standards-based Structured Description
templates, register, harvest, index, search
Metadata Catalogue
Storage
Register metadata
Upload data
Register link
Register access method
Closed Storage
Hooking together
templates, register, harvest, index, search
Just Enough Results Model
Sample
Spreadsheets!!
JBEI-ICE
LabCollector
LabArchives
Jupyter
Galaxy / Refinery in the works….
SPARQL endpoint release
Focus – joining stuff up!
- JSON read/write API
- ISA-TAB output
Spreadsheet tooling for metadata templates and metadata harvesting
Data annotation with standards
Embed ontologies into Excel templates
Excel spreadsheets enriched with ontology annotations
Upload, extract metadata and register
in browser viewing + annotations
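The "upload, extract metadata and register" step can be pictured with a small parser. The sketch below assumes a hypothetical template layout, exported as CSV: key/value rows at the top (including an ontology term), a blank row, then the data table. The layout, field names and values are illustrative assumptions, not the FAIRDOM template format.

```python
import csv
import io
from typing import Dict, List, Tuple


def split_template(text: str) -> Tuple[Dict[str, str], List[Dict[str, str]]]:
    """Split a sheet into (metadata dict, data records)."""
    rows = list(csv.reader(io.StringIO(text)))

    def is_blank(row: List[str]) -> bool:
        return not any(cell.strip() for cell in row)

    # Metadata block: 'key,value' rows up to the first blank row.
    metadata: Dict[str, str] = {}
    index = 0
    while index < len(rows) and not is_blank(rows[index]):
        key = rows[index][0].strip()
        value = rows[index][1].strip() if len(rows[index]) > 1 else ""
        metadata[key] = value
        index += 1

    # Data block: first non-blank row after the metadata is the header.
    table = [row for row in rows[index:] if not is_blank(row)]
    if not table:
        return metadata, []
    header = [cell.strip() for cell in table[0]]
    records = [dict(zip(header, (cell.strip() for cell in row))) for row in table[1:]]
    return metadata, records


# Hypothetical sheet; the measurements are invented for the example.
SHEET = """\
title,Glucose pulse experiment
organism,Saccharomyces cerevisiae
organism_term,NCBITaxon:4932

time_s,glc_ex_mM
0,5.0
60,4.1
"""

metadata, records = split_template(SHEET)
print(metadata)
print(records)
```

Because the ontology term travels inside the sheet, the catalogue can index the file by organism without anyone retyping it at upload time.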
Model annotation – in schema
Model annotation – semanticSBML
Samples Club samples framework
User defined sample models
Interlinking between sample types
Sample type defines a sharable standard
Template tooling
Auto extraction
Tied to assay processes
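A user-defined sample type acts as a small shared standard: it names the attributes a sample should carry, and each sample can be checked against it. The sketch below is an illustration under assumed names. The `Attribute` and `SampleType` classes and the example attributes are hypothetical and not the FAIRDOM samples framework.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class Attribute:
    name: str
    kind: type
    required: bool = True


@dataclass(frozen=True)
class SampleType:
    title: str
    attributes: Tuple[Attribute, ...]

    def problems(self, sample: Dict[str, Any]) -> List[str]:
        """List what stops a sample from conforming; an empty list means it conforms."""
        found = []
        for attribute in self.attributes:
            if attribute.name not in sample:
                if attribute.required:
                    found.append(f"missing: {attribute.name}")
                continue
            if not isinstance(sample[attribute.name], attribute.kind):
                found.append(f"wrong type: {attribute.name}")
        known = {attribute.name for attribute in self.attributes}
        found.extend(f"unknown: {name}" for name in sorted(set(sample) - known))
        return found


# Hypothetical sample type; 'parent_sample' hints at linking one sample to another.
culture = SampleType("Culture sample", (
    Attribute("strain", str),
    Attribute("dilution_rate_per_h", float),
    Attribute("parent_sample", str, required=False),
))

print(culture.problems({"strain": "example strain", "dilution_rate_per_h": 0.1}))
print(culture.problems({"strain": "example strain", "dilution_rate_per_h": "fast", "colour": "red"}))
```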
Spotlight: Synthetic Biology
SynBioChem Centre, UK
FAIRDOM Workflows and Pathways
Project Support – Understanding Process
Planning, Setups, Curation, Advice, Support
Community support
Special project support
model technical curation with our JWS Online partners
PALs project ambassadors
co-design, tailoring, communication, requirements, review
standard, best practices
More Projects and Centres
User Meeting, Barcelona @ ICSB 2016
Processes and People
• 80% process, 20% tech
• Structuring the ISA
• PIs
• Sticking to conventions and policies
• Tension with standards take-up vs laissez faire
• Time and resource
• Local responsibility
• Recognition
• Institutional Repositories
• Automagic
• Licenses
• Negotiated access
• Embargos
• Permission controls
• Staged sharing
• Private walled gardens
FAIR Play
Using FAIRDOM my own lab colleagues saw what I was doing and called to collaborate!
Jurgen Haanstra, Vrije Universiteit Amsterdam, Netherlands
FAIR Play
• Drivers
– External dominate
– Personal productivity
• Trading behaviours
– Tribal based
– Modellers vs Experimentalists
• “enclave” sharing
– Rather than public donation
• Reciprocity & credit
– Citation
affecting behavioural change through libertarian paternalism*
*Garza et al Framing the Community Data System Interface, https://dx.doi.org/10.6084/m9.figshare.1300051.v5
Stanford et al The evolution of standards and data management practices in systems biology, Molecular Systems Biology (2015) 11: 851 DOI 10.15252/msb.20156053
FAIRDOM Association e.V.
Jon Olav Vik, Norwegian University of Life Science
Maksim Zakhartsev, Plant Systems Biology, University Hohenheim, Stuttgart, Germany
Alexey Kolodkin, Siberian Branch, Russian Academy of Sciences
Tomasz Zieliński, SynthSys Centre, University Edinburgh