SysMO-SEEK: Sharing Data and Models in Systems Biology. Katy Wolstencroft, Stuart Owen, Jacky Snoep. University of Manchester.


Page 1: SysMO-SEEK: Sharing Data and Models in Systems Biology

SysMO-SEEK: Sharing Data and Models in Systems Biology

Katy Wolstencroft, Stuart Owen, Jacky Snoep

University of Manchester

Page 2: SysMO-SEEK: Sharing Data and Models in Systems Biology

SysMO-DB Project

A data access, model handling and data integration platform for Systems Biology:

To support and manage the diversity of data, models and experimental protocols from a consortium
Web based
Standards compliant

Page 3: SysMO-SEEK: Sharing Data and Models in Systems Biology

Pan-European collaboration
13 individual projects, >100 institutes
Different research outcomes
A cross-section of microorganisms, incl. bacteria, archaea and yeast

Record and describe the dynamic molecular processes occurring in microorganisms in a comprehensive way
Present these processes in the form of computerized mathematical models
Pool research capacities and know-how

Running since April 2007
Runs for 3-5 years
This year, 2 new projects join and 6 leave

http://www.sysmo.net

Systems Biology of Microorganisms

Page 4: SysMO-SEEK: Sharing Data and Models in Systems Biology

Types of data

Multiple omics: genomics, transcriptomics, proteomics, metabolomics, fluxomics, reactomics
Images
Molecular biology
Reaction kinetics
Models: metabolic, gene network, kinetic
Relationships between data sets/experiments: procedures, experiments, data, results and models
Analysis of data

Page 5: SysMO-SEEK: Sharing Data and Models in Systems Biology

Challenges

Heterogeneous data and models
Distributed groups of researchers
Modellers and experimentalists have different skills, training, experience
Scientists want to remain in control
Scientists reluctant to share

Social and technical challenges

Page 6: SysMO-SEEK: Sharing Data and Models in Systems Biology

SysMO-DB Dev Team

University of Stellenbosch, South Africa; University of Manchester, UK

Jacky Snoep

Heidelberg Institute for Theoretical Studies, Germany

University of Manchester, UK

Olga Krebs

Wolfgang Müller

Sergejs Aleksejevs Carole Goble

Stuart Owen

Katy Wolstencroft

Finn Bacall

Franco du Preez

Page 7: SysMO-SEEK: Sharing Data and Models in Systems Biology

Social Challenge: Focus Group (SysMO PALs)

[Diagram: exchange between the DB team, the Focus Group and the projects. Labelled actions: show what is there; suggest what is possible; ask for requirements; give requirements; tell priorities; rate outcomes; suggest improvements; double check; transmit; disseminate; collect answers.]

Page 8: SysMO-SEEK: Sharing Data and Models in Systems Biology

Technical Challenge

Rapid and incremental development
Driven by the PALs
Just enough and just in time, not just in case
No reinvention
Sustainable and extensible
Migrate to standards
Fitting in with normal lab practices

Page 9: SysMO-SEEK: Sharing Data and Models in Systems Biology

Protocols for Models

Protocol:
Title
Authors
Keywords
Description
Assumptions
Equations
Numerical Methods/Algorithms
Computational Tools
Parameter Estimation Techniques
Limitations
References
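The protocol fields listed above could be captured as a simple record. The following is a minimal sketch only: the class name `ModelProtocol`, the Python field names and the example values are hypothetical and are not taken from the SysMO-SEEK code base.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class ModelProtocol:
    """Hypothetical record mirroring the protocol fields listed on the slide."""
    title: str = ""
    authors: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    description: str = ""
    assumptions: str = ""
    equations: str = ""
    numerical_methods: str = ""
    computational_tools: str = ""
    parameter_estimation: str = ""
    limitations: str = ""
    references: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of fields that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Made-up example values, for illustration only.
protocol = ModelProtocol(
    title="Example kinetic model",
    authors=["A. Modeller"],
    equations="dS/dt = -k * S",
)
print(protocol.missing_fields())
```

Listing the empty fields, rather than rejecting an incomplete record, is in keeping with the "just enough" approach described in the talk.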

What do we share

Methods + Data + Results + Models

All SysMO Assets

Page 10: SysMO-SEEK: Sharing Data and Models in Systems Biology

A Tree View of Assets

[Diagram: asset tree organised as Investigation, Studies, Assay, with SOPs attached; nodes labelled Construction and Validation.]

ISA infrastructure provides a directory structure for experiments

http://isatab.sourceforge.net/
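The Investigation, Study, Assay nesting can be pictured as a small tree. The sketch below is an illustration under assumptions: the class names, the `render_tree` helper and the example titles are hypothetical, and it does not reproduce the ISA-TAB file format or the SEEK data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    title: str
    sops: List[str] = field(default_factory=list)  # names of attached SOPs


@dataclass
class Study:
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    studies: List[Study] = field(default_factory=list)


def render_tree(investigation: Investigation) -> List[str]:
    """Return the hierarchy as indented lines, like a directory listing."""
    lines = [investigation.title]
    for study in investigation.studies:
        lines.append("  " + study.title)
        for assay in study.assays:
            lines.append("    " + assay.title)
            for sop in assay.sops:
                lines.append("      SOP: " + sop)
    return lines


# Made-up example content, for illustration only.
example = Investigation(
    title="Investigation: example pathway",
    studies=[
        Study(
            title="Study: model construction",
            assays=[Assay(title="Assay: enzyme activity", sops=["Cell extract preparation"])],
        ),
        Study(title="Study: model validation"),
    ],
)
print("\n".join(render_tree(example)))
```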

Page 11: SysMO-SEEK: Sharing Data and Models in Systems Biology
Page 12: SysMO-SEEK: Sharing Data and Models in Systems Biology

Incentives for sharing

Safe haven for data
Credit and attribution
Help with exporting to public repositories (e.g. one-click export to ArrayExpress, PRIDE etc.)
A repository for “supplementary materials” in publications
Linking publications and data
Access other resources through a SEEK gateway

Page 13: SysMO-SEEK: Sharing Data and Models in Systems Biology

Access Permissions

Just Enough Sharing

...we don’t talk about security

Page 14: SysMO-SEEK: Sharing Data and Models in Systems Biology

Just Enough Sharing

[Diagram: data stores labelled COSMIC, SysMOLab, MOSES, Alfresco, Wiki, Wiki and "another datastore"; assets such as SOPs are shared by "fetch on request" or by "direct upload".]

Page 15: SysMO-SEEK: Sharing Data and Models in Systems Biology

How do we share

“Just Enough Results Model” (JERM)
What type of data is it: microarray, growth curve, enzyme activity…
What was measured: gene expression, OD, metabolite concentration…
What do the values in the datasets mean: units, time series, repeats…

Based on:
Minimum information models, e.g. MIAME, MIAPE, MIRIAM
Biological ontologies, e.g. Gene Ontology, MGED, SBO

BioPortal web service used in SysMO-SEEK for concept lookup and visualisation
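The three JERM questions (what type of data, what was measured, what the values mean) map naturally onto a small metadata record. This is a hedged sketch: `JermDescription`, its field names and the example values are assumptions made for illustration and do not reflect the actual JERM schema.

```python
from dataclasses import dataclass


@dataclass
class JermDescription:
    """Hypothetical 'just enough' description of one dataset."""
    data_type: str               # e.g. microarray, growth curve, enzyme activity
    measured: str                # e.g. gene expression, OD, metabolite concentration
    units: str                   # what the values mean
    is_time_series: bool = False
    repeats: int = 1

    def summary(self) -> str:
        """One-line human-readable summary of the description."""
        kind = "time series" if self.is_time_series else "single time point"
        return (
            f"{self.data_type}: {self.measured} [{self.units}], "
            f"{kind}, {self.repeats} repeat(s)"
        )


# Made-up example values, for illustration only.
growth = JermDescription(
    data_type="growth curve",
    measured="OD",
    units="absorbance units",
    is_time_series=True,
    repeats=3,
)
print(growth.summary())
```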

Page 16: SysMO-SEEK: Sharing Data and Models in Systems Biology

How do we share

Share JERM templates developed by SysMO-DB, PALs and consortium
  Spreadsheet templates
  Database schemas
Encourage uptake throughout SysMO: transcriptomics, metabolomics, proteomics, etc.

Page 17: SysMO-SEEK: Sharing Data and Models in Systems Biology

RightField: Annotation by Stealth

Page 18: SysMO-SEEK: Sharing Data and Models in Systems Biology
Page 19: SysMO-SEEK: Sharing Data and Models in Systems Biology

Identifying Biological Objects

What do you have in your data? Proteins/enzymes, genes/expression levels, metabolites
Where/how do these objects interact? Pathways, flux, experimental conditions
What models describe these interactions?

Possible when using common frameworks, naming schemes and controlled vocabularies

Page 20: SysMO-SEEK: Sharing Data and Models in Systems Biology

Following Standards

We recommend formats but we do not enforce them
Protocols and SOPs: Nature Protocols
Data: JERM models and community minimum information models
Models: SBML and related standards
Publications: PubMed and DOI

If you follow the prescribed formats, you get more out, but if you don’t, you can still participate

Lowering the adoption barrier
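The "recommend but do not enforce" policy can be sketched as a registration step that always accepts an asset and only attaches a suggestion when the recommended format is not used. The function `register_asset`, the `RECOMMENDED` table keys and the message text are hypothetical; only the recommended formats themselves come from the slide.

```python
from typing import Optional, Tuple

# Recommended formats as listed on the slide; the dictionary keys are assumed labels.
RECOMMENDED = {
    "protocol": "Nature Protocols",
    "data": "JERM",
    "model": "SBML",
    "publication": "PubMed or DOI",
}


def register_asset(kind: str, declared_format: str) -> Tuple[bool, Optional[str]]:
    """Accept every asset; return a hint when the recommended format is not used."""
    recommended = RECOMMENDED.get(kind)
    hint = None
    if recommended is not None and declared_format != recommended:
        hint = f"Accepted. Using {recommended} would let you get more out."
    return True, hint  # never rejected: participation does not depend on format


print(register_asset("model", "SBML"))
print(register_asset("model", "spreadsheet"))
```

Returning a hint instead of raising an error is the design choice that keeps the adoption barrier low.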

Page 21: SysMO-SEEK: Sharing Data and Models in Systems Biology

SEEK, the eLaboratory

A dynamic resource for analysis as well as browsing

Automatic comparison of data from inside files
Understanding where and how data and models are linked
Running simulations with new experimental data
Running analyses and workflows over the data and models

Page 22: SysMO-SEEK: Sharing Data and Models in Systems Biology

Workflows from myExperiment

Data preparation, annotation and analysis
Systems Biology workflow Pack on myExperiment
Microarray analysis and text mining

Created by Afsaneh Maleki-Dizaji from SUMO, University of Sheffield

Based on previous work by Paul Fisher, University of Manchester

http://www.myexperiment.org/workflows/187

Page 23: SysMO-SEEK: Sharing Data and Models in Systems Biology

SEEK as a data analysis and meta analysis service

SBML model construction and population
Calibration workflow
Data requirements:
  Parameterised SBML model
  Experimental data: metabolite concentrations from key results database
Calibration by COPASI web service

Peter Li
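To make the idea of calibration concrete, here is a toy sketch of fitting one model parameter to measured metabolite concentrations. It is not the COPASI web service or the workflow from the slide: the first-order decay model, the function `fit_first_order` and the synthetic data are all assumptions chosen for illustration.

```python
import math
from typing import Sequence, Tuple


def fit_first_order(times: Sequence[float], concentrations: Sequence[float]) -> Tuple[float, float]:
    """Fit S(t) = S0 * exp(-k * t) by linear least squares on log(S).

    Returns (S0, k). Assumes all concentrations are positive.
    """
    logs = [math.log(c) for c in concentrations]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -slope


# Synthetic "experimental" data generated from known parameters (S0 = 2.0, k = 0.5).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
data = [2.0 * math.exp(-0.5 * t) for t in times]

s0, k = fit_first_order(times, data)
print(f"S0 = {s0:.3f}, k = {k:.3f}")
```

A real calibration would estimate the parameters of an SBML model against noisy data with a dedicated optimiser; the log-linear fit here only shows the shape of the inputs (a parameterised model plus concentrations) and of the output (fitted parameter values).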

Page 24: SysMO-SEEK: Sharing Data and Models in Systems Biology

Data analysis and meta analysis

SEEK Analysis Service with pre-cooked analysis tools.

Calibration workflow
Data requirements:
  Parameterised SBML model
  Experimental data: metabolite concentrations from key results database
Calibration by COPASI web service

Peter Li

[Screenshot: analysis form with "Load model", "Load data" and "GO" controls.]

Page 25: SysMO-SEEK: Sharing Data and Models in Systems Biology

Why it works for us

A solution that fits in with current practices
Start simple, show benefits, add more
Engage with the people actually doing the work: PhD students, post-docs
Build to the PALs' requirements
Respect publication cycles
Respect cultural differences
Scientists stay in control

Page 26: SysMO-SEEK: Sharing Data and Models in Systems Biology

SysMO Methods Spreading

Virtual Liver (Mueller, via HITS)
Lungsys
SBCancer
EraSysBio+

Eukaryotic organisms
Interactions between host and pathogen
Human disease
Multi-scale modelling

Page 27: SysMO-SEEK: Sharing Data and Models in Systems Biology

Acknowledgements

SysMO-DB Team
SysMO-PALs
myGrid, HITS and JWS Online
EMBL-EBI, MCISB

http://www.sysmo-db.org