SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology
Katy Wolstencroft, University of Manchester

Page 1: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Katy Wolstencroft, University of Manchester

Page 2: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

SysMO-DB

A data access, model handling and data integration platform for Systems Biology:

• To support and manage the diversity of data, models and experimental protocols
• Local data management systems
• That promotes shared understanding
• Using a common platform and common technologies

Page 3: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Systems Biology Challenges

• Interdisciplinary work
• Heterogeneous data and models
• Modellers and experimentalists have different skills, training, experience
• Modellers and experimentalists have different vocabularies and jargon
• Working together

Page 4: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

• Pan European collaboration
• Eleven individual projects, 91 institutes
• Different research outcomes
• A cross-section of microorganisms, incl. bacteria, archaea and yeast

Record and describe the dynamic molecular processes occurring in microorganisms in a comprehensive way

Present these processes in the form of computerized mathematical models

Pool research capacities and know-how

• Already running since April 2007
• Runs for 3-5 years
• This year, 2 new projects join and 6 leave

http://www.sysmo.net

Systems Biology of Microorganisms

Page 5: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

The Problem

No one concept of experimentation or modelling

No planned, shared infrastructure for pooling

Page 6: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Types of data

• Multiple omics: genomics, transcriptomics, proteomics, metabolomics, fluxomics, reactomics
• Images
• Molecular biology
• Reaction kinetics
• Models: metabolic, gene network, kinetic
• Relationships between data sets/experiments: procedures, experiments, data, results and models
• Analysis of data

Page 7: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Started in June 2008

Web-based solution to facilitate:

exchange of data, models and processes (intra- and inter- consortia)

search for data, models and processes across the initiative

maximisation of the "shelf life" and utility of the data, models and processes generated

dissemination of results

SysMO-DB

Page 8: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

SysMO-DB Team

University of Stellenbosch, South Africa

University of Manchester, UK

Jacky Snoep

Hits, Germany

Isabel Rojas

University of Manchester, UK

Olga Krebs

Wolfgang Müller

Carole Goble

Stuart Owen

Katy Wolstencroft

Finn Bacall

SABIO-RK

JWS Online

Taverna

myExperiment

Page 9: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

SysMO-DB PALS team

• Power Contributors
• 21 postdocs and PhD students
• Design and technical collaboration team
• Intense collaboration
• UK and Continental PALS Chapters
• Audits and Sharing
• Methods, data, models, standards, software, schemas, spreadsheets, SOPs…
• 20 questions
• Deployment into Projects

Page 10: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Principles…

• A series of small victories
• Realistic
• Don't reinvent
• Sustainable and extensible
• Migrate to standards
• Provide instant gratification
• Incremental development
• Fitting in with normal lab practices

Page 11: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

The Lowest Hanging Fruit

• SysMO SEEK – a catalogue of assets
• SysMO Yellow Pages
• The people and their expertise
• The institutions and their facilities
• Data – experimental data sets
• Data – analysed results
• Data – external reference data sets
• Models
• Processes – laboratory protocols and bioinformatics analyses
• Publications

The catalogue references assets held elsewhere
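A catalogue that references assets held elsewhere can be pictured as a list of lightweight records, each holding a description plus a pointer to the asset's home. The sketch below is illustrative only: the `CatalogueEntry` class, its fields, the asset type names and the `example.org` URLs are all assumptions, not the SEEK data model.

```python
from dataclasses import dataclass
from typing import List, Optional

# Asset categories loosely following the slide; the identifiers are hypothetical.
ASSET_TYPES = {"person", "institution", "data", "model", "process", "publication"}


@dataclass(frozen=True)
class CatalogueEntry:
    """A catalogue record that points at an asset stored elsewhere."""
    title: str
    asset_type: str
    project: str
    location: str  # URL of the asset in its home data store (hypothetical)

    def __post_init__(self) -> None:
        if self.asset_type not in ASSET_TYPES:
            raise ValueError(f"unknown asset type: {self.asset_type!r}")


def find(entries: List[CatalogueEntry],
         asset_type: Optional[str] = None,
         project: Optional[str] = None) -> List[CatalogueEntry]:
    """Return the entries that match every filter that was supplied."""
    return [
        e for e in entries
        if (asset_type is None or e.asset_type == asset_type)
        and (project is None or e.project == project)
    ]


# Example catalogue; titles and URLs are made up for illustration.
catalogue = [
    CatalogueEntry("Growth curve, batch 3", "data", "COSMIC",
                   "https://example.org/cosmic/growth-curve-3"),
    CatalogueEntry("Glycolysis model", "model", "MOSES",
                   "https://example.org/moses/glycolysis"),
    CatalogueEntry("Sampling SOP", "process", "COSMIC",
                   "https://example.org/cosmic/sampling-sop"),
]
```

Searching the catalogue only touches the records; the assets themselves stay wherever `location` points.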

Page 12: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

SEEK screenshot?

Page 13: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

[Diagram. Labels: COSMIC, BaCell-SysMO, SysMOLab, MOSES; Alfresco, Wiki, another datastore; Harvesters]

Page 14: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Why not a central Warehouse?

• Protective of models: in progress vs published models
• Access and version management
• Curator-Rival conflict
• Reluctant to share data
• Even within their own projects
• Legacy spreadsheets dominate
• Curation practices vary
• Centralised archive take-up
• Point to Point Exchange
• People don't mind sharing methods
• People want to advertise publications

Nature 461, 145 (10 Sept 2009)

Page 15: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Access Permissions

Just Enough Sharing

Reusing myExperiment

Page 16: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

SysMO-DB Architecture

[Diagram. Labels: Data, Models, Processes; SysMO-DB; SysMO-SEEK web interface; Assets and Yellow Pages Catalogues; JERM]

Page 17: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Making use of the Assets

• Understanding the content of the data
• Linking assets together
• Linking assets to experimental context
• Running comparisons between data files
• Running model simulations
• Running data analysis pipelines

Page 18: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

What is the JERM?

JERM: “Just Enough Results Model”

• Minimum information to exchange data
• What type of data is it? Microarray, growth curve, enzyme activity…
• What was measured? Gene expression, OD, metabolite concentration…
• What do the values in the datasets mean? Units, time series, repeats…
• Which experiment does it relate to?
• How does it relate to models?
• How was the data created? SOPs and protocols
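The questions above can be read as a checklist of required metadata. A minimal sketch, assuming a flat record with hypothetical field names (`data_type`, `measured`, `units`, `experiment`, `sop`); the actual JERM templates are not shown in these slides and may be structured quite differently.

```python
from typing import Dict, List

# Hypothetical field names, one per question on the slide.
REQUIRED_FIELDS = ("data_type", "measured", "units", "experiment", "sop")


def missing_fields(record: Dict[str, str]) -> List[str]:
    """List the required fields that are absent or blank in a metadata record."""
    return [
        name for name in REQUIRED_FIELDS
        if not str(record.get(name, "")).strip()
    ]


# Example record; values are invented for illustration.
record = {
    "data_type": "growth curve",
    "measured": "OD",
    "units": "",  # blank on purpose, so it is reported as missing
    "experiment": "batch culture 3",
}
print(missing_fields(record))
```

Reporting what is missing, rather than rejecting the record, fits the "just enough" idea: a data set can still be registered while the gaps are made visible.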

Page 19: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

CIMR: Core Information for Metabolomics Reporting
MIABE: Minimal Information About a Bioactive Entity
MIACA: Minimal Information About a Cellular Assay
MIAME: Minimum Information About a Microarray Experiment
MIAME/Env: MIAME / Environmental transcriptomic experiment
MIAME/Nutr: MIAME / Nutrigenomics
MIAME/Plant: MIAME / Plant transcriptomics
MIAME/Tox: MIAME / Toxicogenomics
MIAPA: Minimum Information About a Phylogenetic Analysis
MIAPAR: Minimum Information About a Protein Affinity Reagent
MIAPE: Minimum Information About a Proteomics Experiment
MIARE: Minimum Information About a RNAi Experiment
MIASE: Minimum Information About a Simulation Experiment
MIENS: Minimum Information about an ENvironmental Sequence
MIFlowCyt: Minimum Information for a Flow Cytometry Experiment
MIGen: Minimum Information about a Genotyping Experiment
MIGS: Minimum Information about a Genome Sequence
MIMIx: Minimum Information about a Molecular Interaction Experiment
MIMPP: Minimal Information for Mouse Phenotyping Procedures
MINI: Minimum Information about a Neuroscience Investigation
MINIMESS: Minimal Metagenome Sequence Analysis Standard
MINSEQE: Minimum Information about a high-throughput SeQuencing Experiment
MIPFE: Minimal Information for Protein Functional Evaluation
MIQAS: Minimal Information for QTLs and Association Studies
MIqPCR: Minimum Information about a quantitative Polymerase Chain Reaction experiment
MIRIAM: Minimal Information Required In the Annotation of biochemical Models
MISFISHIE: Minimum Information Specification For In Situ Hybridization and Immunohistochemistry Experiments
STRENDA: Standards for Reporting Enzymology Data
TBC: Tox Biology Checklist

BioPAX: Biological Pathways Exchange http://www.biopax.org/
FuGE: Functional Genomics Experiment
MGED: Microarray Experimental Conditions

http://www.mibbi.org/index.php/MIBBI_portal

Minimum Information Models

Page 20: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

The Idea

For each data type…
• Transcriptomics
• Proteomics
• Metabolomics
• Single Cell Data

Generate and apply…
• JERM template
• JERM extractor for data host
• Subset registered in SEEK
• Access / export through JERM interface / template

Define a JERM…
• Top down analysis of standards
• Bottom up analysis of practice

ISA-TAB

Page 21: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

SEEK + JERM

[Diagram, based on ISA-TAB. Labels: Investigation, Study, Assay; Experimental Data, Metadata, People, Projects, Experimental conditions, Factors studied, Models, SOPs, Workflows]

Homogenised terminology and values in the datasets themselves

Page 22: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

For publishing

JERM data needs to be related to SOPs, experimental context (ISA) and other data

JERM must be “MIBBI” compliant for exporting to public repositories, e.g. microarray data needs to be MIAME compliant

Page 23: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

ISA-TAB

• Relating data and its experimental context
• Investigation, Study, Assay
• TAB = tabular: a format suitable for spreadsheets

http://isatab.sourceforge.net/
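The Investigation, Study, Assay hierarchy and its spreadsheet-friendly flattening can be sketched as nested records written out as tab-separated rows. This is a simplified illustration under stated assumptions: the class names, the four column headers and the file names are hypothetical and do not reproduce the actual ISA-TAB file layout.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    name: str
    data_file: str


@dataclass
class Study:
    name: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    name: str
    studies: List[Study] = field(default_factory=list)


def to_tabular(investigation: Investigation) -> str:
    """Flatten the hierarchy into one tab-separated row per assay."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Hypothetical column headers, not the ISA-TAB specification.
    writer.writerow(["Investigation", "Study", "Assay", "Data file"])
    for study in investigation.studies:
        for assay in study.assays:
            writer.writerow([investigation.name, study.name,
                             assay.name, assay.data_file])
    return buffer.getvalue()


# Example hierarchy; names are invented for illustration.
inv = Investigation("Glucose pulse response", [
    Study("Wild type", [
        Assay("Microarray", "wt_array.txt"),
        Assay("Proteomics", "wt_prot.txt"),
    ]),
])
print(to_tabular(inv))
```

Each row carries its full experimental context, which is what lets different data types from the same study be related in a spreadsheet.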

Page 24: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

ISA Provides....

A common framework for relating different types of data e.g. microarrays and proteomics

Facilitates submission to international public repositories of genomics, transcriptomics and proteomics studies

Page 25: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Identifying Biological Objects

• What do you have in your data? Proteins/enzymes, genes/expression levels, metabolites
• Where/how do these objects interact? Pathways, flux, experimental conditions
• What models describe these interactions?

Possible when using common frameworks, naming schemes and controlled vocabularies

Page 26: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

BioPortal Integration for Searching

• Repository for submitting and sharing biological ontologies: http://bioportal.bioontology.org/
• Search for concepts across all or selected ontologies
• BioPortal provides a number of RESTful web services: search, concept lookup, visualisation

Integrated within SEEK as a plugin
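A concept search across all or selected ontologies comes down to building a query URL. The sketch below only constructs the URL and makes no network call. It assumes the present-day BioPortal search endpoint and its `q`, `ontologies` and `apikey` parameters, which may differ from the service the SEEK plugin used when this talk was given; `build_search_url` and the API key value are hypothetical.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode

# Assumed endpoint of the present-day BioPortal REST API.
SEARCH_ENDPOINT = "https://data.bioontology.org/search"


def build_search_url(term: str, api_key: str,
                     ontologies: Optional[Sequence[str]] = None) -> str:
    """Build a search URL for all ontologies or for a selected subset."""
    params = {"q": term, "apikey": api_key}
    if ontologies:
        # Selected ontologies are passed as a comma-separated list of acronyms.
        params["ontologies"] = ",".join(ontologies)
    return SEARCH_ENDPOINT + "?" + urlencode(params)


# "MY_KEY" is a placeholder, not a working key.
print(build_search_url("glucose", "MY_KEY", ["CHEBI", "GO"]))
```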

Page 27: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Tools to help manage data: Annotation standards by stealth

Controlled vocabulary plug in

BioPortal

Page 28: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Following Standards

We recommend formats but we do not enforce them

• Protocols and SOPs – Nature Protocols
• Data – JERM models and community minimum information models
• Models – SBML and related standards
• Publications – PubMed and DOI

If you follow the prescribed formats, you get more out, but if you don’t, you can still participate

lowering the adoption barrier

Page 29: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Off the shelf

Except for the JERM, we have only used community resources, vocabularies and services

You can get a long way by implementing community practices and providing ways to integrate them

Page 30: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

SysMO-DB and Models

Page 31: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Nicolas Le Novere, Data Integration in the Life Sciences, Manchester, 2009

Page 32: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Models: Incentives for using Standards

• Models can be shared in SysMO-SEEK in any format
• SBML is the recommended format
• We also recommend MIRIAM compliance and SBO annotation

If you use SBML, you can use JWS Online to run simulations in SEEK
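"Recommend but do not enforce" can be sketched as: accept any upload, then offer extra actions when the file is recognised as SBML. The check below only looks for an XML root element named `sbml`; it is a rough heuristic, not SBML validation, and the function names and action labels are hypothetical rather than part of SEEK or JWS Online.

```python
import xml.etree.ElementTree as ET
from typing import List


def looks_like_sbml(text: str) -> bool:
    """Heuristic: does the text parse as XML with a root element named 'sbml'?"""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False
    # Drop any "{namespace}" prefix before comparing the tag name.
    return root.tag.rsplit("}", 1)[-1] == "sbml"


def available_actions(model_text: str) -> List[str]:
    """Every model can be stored and shared; SBML models can also be simulated."""
    actions = ["store", "share"]
    if looks_like_sbml(model_text):
        actions.append("simulate")
    return actions


sbml_model = (
    '<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">'
    '<model id="demo"/></sbml>'
)
plain_text_model = "dS/dt = -k * S"

print(available_actions(sbml_model))
print(available_actions(plain_text_model))
```

Nothing is rejected in either case, which keeps the adoption barrier low while still rewarding the recommended format.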

Page 33: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Screenshot of JWS Online

JWS Online Plugin
• online simulator, runs in your browser
• upload models in SBML format
• Web Service enabled
• SBGN schemas, with annotations and external links

Page 34: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Falko Krause, Humboldt-University, Berlin http://www.semanticsbml.org/aym

Page 35: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Models Resources

• Models can be published in public repositories: JWS-Online, BioModels
• Models can be annotated: SBML, MIRIAM, SBO

No public resources currently for sharing models with associated data, or for loading new data into models

Page 36: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Linking Data to Models

Relating data and models

• Where did the data come from for developing the model?
• Where did the data come from for validating the model?
• What were the results of model simulations?
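The three questions above suggest three kinds of link between a model and data. A minimal sketch, assuming a hypothetical `ModelRecord` that simply stores file references under each heading; it is not how SEEK represents these links.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ModelRecord:
    """A model together with references to the data linked to it."""
    name: str
    construction_data: List[str] = field(default_factory=list)
    validation_data: List[str] = field(default_factory=list)
    simulation_results: List[str] = field(default_factory=list)

    def provenance(self) -> Dict[str, List[str]]:
        """Group the linked data by the question it answers."""
        return {
            "developed from": self.construction_data,
            "validated against": self.validation_data,
            "simulation results": self.simulation_results,
        }

    def unanswered(self) -> List[str]:
        """Questions for which no data has been linked yet."""
        return [q for q, links in self.provenance().items() if not links]


# Example record; file names are invented for illustration.
model = ModelRecord(
    "Glycolysis model",
    construction_data=["enzyme_kinetics.xls"],
    simulation_results=["steady_state_run1.csv"],
)
print(model.unanswered())
```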

Page 37: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Current Functionality in SEEK

Show all data used for construction together with the model, such that process can be repeated

Uploaded models loaded with this data by default

Manually alter parameters and run simulations

Page 38: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Next Steps: Model Validation

• Test/compare model with experimental data for complete system
• Find data in SEEK
• Upload data from elsewhere
• Automatically load into model
• Run simulations and compare with original results
• JERM for models
• Mapping tools – allow you to identify columns/rows in spreadsheets containing the right information
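A mapping tool of this kind can be sketched as a user-supplied table from template fields to spreadsheet column headers. The example reads CSV text for simplicity; the field names, headers and the `extract_columns` helper are assumptions made for illustration, not the SysMO-DB tooling.

```python
import csv
import io
from typing import Dict, List


def extract_columns(csv_text: str, mapping: Dict[str, str]) -> List[Dict[str, str]]:
    """Pull the mapped columns out of a spreadsheet exported as CSV.

    `mapping` goes from a template field name to the spreadsheet column header
    that holds that information.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = reader.fieldnames or []
    missing = [column for column in mapping.values() if column not in headers]
    if missing:
        raise KeyError(f"columns not found in spreadsheet: {missing}")
    return [
        {field_name: row[column] for field_name, column in mapping.items()}
        for row in reader
    ]


# A legacy spreadsheet with its own headers and an extra notes column.
sheet = "t (min),OD600,notes\n0,0.05,start\n30,0.11,\n"
mapping = {"time": "t (min)", "optical_density": "OD600"}
print(extract_columns(sheet, mapping))
```

The spreadsheet itself is left untouched; only the mapping has to be written, which fits in with existing lab practice.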

Page 39: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

ISA for Models

• Modelling and experimental work intersect
• Investigations, Study, Assay… or modelling analysis…
• Modelling analysis types: metabolic models, gene networks
• Modelling type: ODE, algebraic

Studies – combinations of experimental assays, modelling analyses, and informatics analyses

Page 40: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

SysMO-DB the e-Laboratory

An e-Laboratory is an information system for bringing together people, data and analytical methods at the point of investigation or decision-making

Page 41: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Current Status

• Finding things so that we can compare them
• Understanding who has what
• Understanding what can be compared with what – the experimental context

Page 42: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Where we are going…

A dynamic resource for analysis as well as browsing

• Automatic comparison of data from inside files
• Understanding where and how data and models are linked
• Running simulations with new experimental data
• Running analyses and workflows over the data and models

Page 43: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Workflows from myExperiment

• Data preparation, annotation and analysis
• Systems Biology workflow Pack on myExperiment
• Microarray analysis and text mining

Created by Afsaneh Maleki-Dizaji, from SUMO, University of Sheffield

Based on previous work by Paul Fisher, University of Manchester

http://www.myexperiment.org/workflows/187

Page 44: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

SEEK as a data analysis and meta analysis service

• SBML model construction and population
• Calibration workflow
• Data requirements: parameterised SBML model; experimental data (metabolite concentrations from key results database)
• Calibration by COPASI web service
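Calibration here means adjusting model parameters until simulated values match measured metabolite concentrations. The toy example below is a stand-in, not the COPASI web service or the workflow on the slide: it assumes a single metabolite decaying as c(t) = c0 · exp(−k·t) and estimates k with a log-linear least-squares fit. All names and numbers are illustrative.

```python
import math
from typing import List, Sequence


def simulate(times: Sequence[float], c0: float, k: float) -> List[float]:
    """Concentration of a metabolite consumed with first-order rate constant k."""
    return [c0 * math.exp(-k * t) for t in times]


def fit_decay_rate(times: Sequence[float],
                   concentrations: Sequence[float],
                   c0: float) -> float:
    """Estimate k by least squares on ln(c / c0) = -k * t."""
    if len(times) != len(concentrations):
        raise ValueError("times and concentrations must have the same length")
    if c0 <= 0 or any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive")
    numerator = sum(t * math.log(c / c0) for t, c in zip(times, concentrations))
    denominator = sum(t * t for t in times)
    if denominator == 0:
        raise ValueError("need at least one measurement at t > 0")
    return -numerator / denominator


# Synthetic "measurements" generated from a known rate constant.
times = [0.0, 1.0, 2.0, 4.0]
measured = simulate(times, c0=10.0, k=0.3)
k_fit = fit_decay_rate(times, measured, c0=10.0)
print(round(k_fit, 6))
```

A real calibration would fit many parameters of an SBML model at once against noisy data, which is why the slide delegates it to a dedicated service.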

Peter Li

Page 45: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Data analysis and meta analysis

SEEK Analysis Service with pre-cooked analysis tools.

• Calibration workflow
• Data requirements: parameterised SBML model; experimental data (metabolite concentrations from key results database)
• Calibration by COPASI web service

Peter Li

Load model:

Load data:

GO

Page 46: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

New Directions

Page 47: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Opening SysMO Out

• Using SysMO as a dissemination space for the SysMO consortium: supplementary material in publications, data citation
• Packaging software so that others can use it: easy to install a SEEK for yourself
• Packaging and exchanging JERM templates: helping with standardisation
• Promotion and example work with SBRML and data and models linkage

Page 48: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

SysMO-DB Approach in Other projects

• SysMO2 – new projects and legacy
• EraSysBio+
• Lungsys and SBCancer
• Virtual Liver

Page 49: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

New Considerations

• Eukaryotic organisms
• Interactions between host and pathogen
• Human disease: multicellular interactions, tissues, organs
• Multiscale modelling

Page 50: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Outstanding Issues

Keeping data at project sites has responsibilities

• Reliability - Sites available continuously and promptly
• Support - Must be proof against virus attacks, etc.
• Archiving - Beyond the lifetime of the project

Page 51: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

How it works

• Find a solution that fits in with current practices
• Start simple, show benefits, add more
• Engage with the people actually doing the work: PhD students, post-docs
• Let the scientists retain control over their data and who can see it
• Don't reinvent. Use available vocabularies, minimal model standards
• Help prevent people duplicating work by linking the people as well as the resources

Page 52: SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology

Acknowledgements

• SysMO-DB Team
• SysMO-PALS
• myGrid, Hits and JWS Online teams
• EMBL-EBI, MCISB

http://www.sysmo-db.org