SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology



Katy Wolstencroft, University of Manchester. SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology. SysMO-DB is a data access, model handling and data integration platform for Systems Biology that supports and manages the diversity of data, models and experimental protocols.


<ul><li><p>SysMO-DB: Sharing and Exchanging Data and Models in Systems Biology. Katy Wolstencroft, University of Manchester</p></li><li><p>SysMO-DB approach; linking data to models; SysMO-DB, the e-Laboratory</p></li><li><p>SysMO-DB: a data access, model handling and data integration platform for Systems Biology. To support and manage the diversity of: data, models and experimental protocols; local data management systems. That promotes shared understanding. Using a common platform and common technologies</p></li><li><p>Systems Biology Challenges: interdisciplinary work; heterogeneous data and models; modellers and experimentalists have different skills, training and experience; modellers and experimentalists have different vocabularies and jargon; working together</p></li><li><p>Pan-European collaboration; eleven individual projects, 91 institutes; different research outcomes; a cross-section of microorganisms, incl. bacteria, archaea and yeast</p><p>Record and describe the dynamic molecular processes occurring in microorganisms in a comprehensive way</p><p>Present these processes in the form of computerized mathematical models</p><p>Pool research capacities and know-how</p><p>Running since April 2007; runs for 3-5 years; this year, 2 new projects join and 6 leave. SysMO: Systems Biology of Microorganisms, http://www.sysmo.net</p></li><li><p>The Problem: no one concept of experimentation or modelling</p><p>No planned, shared infrastructure for pooling</p></li><li><p>Types of data. Multiple omics: genomics, transcriptomics, proteomics, metabolomics, fluxomics, reactomics. Images. Molecular biology. Reaction kinetics. Models: metabolic, gene network, kinetic. Relationships between data sets/experiments: procedures, experiments, data, results and models. Analysis of data</p></li><li><p>Linking and using data. Models: constructed from experimental data; constructed by using parameters from literature. Data: analysed, compared and integrated; statistics, pipelines and workflows; identification of the same entities in different data
sets; identification of where data sets overlap; experimental context</p></li><li><p>SysMO-DB: started in June 2008. Web-based solution to facilitate: exchange of data, models and processes (intra- and inter-consortia); search for data, models and processes across the initiative; maximisation of the "shelf life" and utility of the data, models and processes generated; dissemination of results</p></li><li><p>SysMO-DB Team: University of Stellenbosch, South Africa; University of Manchester, UK; Jacky Snoep; Hits, Germany; Isabel Rojas; University of Manchester, UK; Olga Krebs; Wolfgang Müller; Carole Goble; Stuart Owen; Katy Wolstencroft; Finn Bacall. SABIO-RK; JWS Online; Taverna; myExperiment</p></li><li><p>SysMO-DB PALS team. Power contributors: 21 postdocs and PhD students; design and technical collaboration team; intense collaboration; UK and Continental PALS chapters. Audits and sharing: methods, data, models, standards, software, schemas, spreadsheets, SOPs; 20 questions. Deployment into projects</p></li><li><p>Principles: a series of small victories; realistic; don't reinvent; sustainable and extensible; migrate to standards</p><p>Provide instant gratification; incremental development; fitting in with normal lab practices</p></li><li><p>The Lowest Hanging Fruit. SysMO SEEK: a catalogue of assets. SysMO Yellow Pages: the people and their expertise; the institutions and their facilities. Data: experimental data sets; analysed results; external reference data sets. Models. Processes: laboratory protocols and bioinformatics analyses. Publications. The catalogue references assets held elsewhere</p></li><li><p>SEEK screenshot?</p></li><li><p>[Diagram labels: COSMIC; BaCell-SysMO; SysMOLab; MOSES; Alfresco; Wiki; another datastore; Harvesters]</p></li><li><p>Why not a central warehouse? Protective of models: in-progress vs published models; access and version management; curator-rival conflict. Reluctant to share data: even within their own projects; legacy spreadsheets dominate; curation practices vary; centralised archive take-up; point-to-point exchange. People don't mind sharing
methods. People want to advertise publications</p><p>Nature 461, 145 (10 Sept 2009)</p></li><li><p>Access Permissions: just enough sharing; reusing myExperiment</p></li><li><p>Data</p><p>Processes; SysMO DB; SysMO-DB Architecture; SysMO-SEEK web interface; Assets and Yellow Pages catalogues; JERM</p></li><li><p>Making use of the Assets</p><p>Understanding the content of the data; linking assets together; linking assets to experimental context; running comparisons between data files; running model simulations; running data analysis pipelines</p></li><li><p>What is the JERM? JERM: Just Enough Results Model. Minimum information to exchange data. What type of data is it? Microarray, growth curve, enzyme activity. What was measured? Gene expression, OD, metabolite concentration. What do the values in the datasets mean? Units, time series, repeats.</p><p>Which experiment does it relate to? How does it relate to models? How was the data created? SOPs and protocols</p></li><li><p>CIMR: Core Information for Metabolomics Reporting; MIABE: Minimal Information About a Bioactive Entity; MIACA: Minimal Information About a Cellular Assay; MIAME: Minimum Information About a Microarray Experiment; MIAME/Env: MIAME / Environmental transcriptomic experiment; MIAME/Nutr: MIAME / Nutrigenomics; MIAME/Plant: MIAME / Plant transcriptomics; MIAME/Tox: MIAME / Toxicogenomics; MIAPA: Minimum Information About a Phylogenetic Analysis; MIAPAR: Minimum Information About a Protein Affinity Reagent; MIAPE: Minimum Information About a Proteomics Experiment; MIARE: Minimum Information About a RNAi Experiment; MIASE: Minimum Information About a Simulation Experiment; MIENS: Minimum Information about an ENvironmental Sequence; MIFlowCyt: Minimum Information for a Flow Cytometry Experiment; MIGen: Minimum Information about a Genotyping Experiment; MIGS: Minimum Information about a Genome Sequence; MIMIx: Minimum Information about a Molecular Interaction Experiment; MIMPP: Minimal Information for Mouse Phenotyping Procedures; MINI: Minimum Information about a Neuroscience
Investigation; MINIMESS: Minimal Metagenome Sequence Analysis Standard; MINSEQE: Minimum Information about a high-throughput SeQuencing Experiment; MIPFE: Minimal Information for Protein Functional Evaluation; MIQAS: Minimal Information for QTLs and Association Studies; MIqPCR: Minimum Information about a quantitative Polymerase Chain Reaction experiment; MIRIAM: Minimal Information Required In the Annotation of biochemical Models; MISFISHIE: Minimum Information Specification For In Situ Hybridization and Immunohistochemistry Experiments; STRENDA: Standards for Reporting Enzymology Data; TBC: Tox Biology Checklist</p><p>BioPAX: Biological Pathways Exchange; Functional Genomics Experiment; MGED: Microarray Experimental Conditions</p><p>Information Models</p></li><li><p>The Idea. For each data type: transcriptomics; proteomics; metabolomics; single cell data. Generate and apply: JERM template; JERM extractor for data host; subset registered in SEEK; access / export through JERM interface / template. Define a JERM: top-down analysis of standards; bottom-up analysis of practice; ISA-TAB</p></li><li><p>Experimental data; metadata; people; projects; assay; study; experimental conditions; factors studied; models; SOPs; homogenised terminology and values in the datasets themselves; workflows; based on ISA-TAB; investigation; SEEK + JERM</p></li><li><p>For publishing: JERM data needs to be related to SOPs, experimental context (ISA) and other data</p><p>JERM must be MIBBI compliant for exporting to public repositories, e.g. microarray data needs to be MIAME compliant</p></li><li><p>ISA-TAB: relating data and its experimental context. ISA = Investigation, Study, Assay; TAB = tabular, a format suitable for spreadsheets</p><p></p></li><li><p>ISA provides:</p><p>A common framework for relating different types of data, e.g.
microarrays and proteomics. Facilitates submission to international public repositories of genomics, transcriptomics and proteomics studies</p></li><li><p>Identifying Biological Objects. What do you have in your data? Proteins/enzymes, genes/expression levels, metabolites</p><p>Where/how do these objects interact? Pathways, flux, experimental conditions</p><p>What models describe these interactions?</p><p>Possible when using common frameworks, naming schemes and controlled vocabularies</p></li><li><p>BioPortal Integration for Searching. Repository for submitting and sharing biological ontologies; searching for concepts across all or selected ontologies. BioPortal provides a number of RESTful web services: search; concept lookup; visualisation. Integrated within SEEK as a plugin</p></li><li><p>Tools to help manage data: annotation standards by stealth; controlled vocabulary plug-in; BioPortal</p></li><li><p>Following Standards. We recommend formats but we do not enforce them. Protocols and SOPs: Nature Protocols. Data: JERM models and community minimum information models. Models: SBML and related standards. Publications: PubMed and DOI. If you follow the prescribed formats, you get more out, but if you don't, you can still participate, which lowers the adoption barrier</p></li><li><p>Off the shelf: except for the JERM, we have only used community resources, vocabularies and services</p><p>You can get a long way by implementing community practices and providing ways to integrate them</p></li><li><p>SysMO-DB and Models</p></li><li><p>Nicolas Le Novère, Data Integration in the Life Sciences, Manchester, 2009</p></li><li><p>Models: Incentives for using Standards</p><p>Models can be shared in SysMO-SEEK in any format. SBML is the recommended format. We also recommend MIRIAM compliance and SBO annotation</p><p>If you use SBML, you can use JWS Online to run simulations in SEEK</p></li><li><p>Screenshot of JWS Online. JWS Online plugin: online simulator, runs in your browser; upload models in SBML format; web service enabled; SBGN schemas,
with annotations and external links</p></li><li><p>Falko Krause, Humboldt-University, Berlin</p></li><li><p>Model Resources. Models can be published in public repositories: JWS Online, BioModels. Models can be annotated: SBML, MIRIAM, SBO. No public resources currently exist for sharing models with associated data, or for loading new data into models</p></li><li><p>Linking Data to Models</p><p>Relating data and models. Where did the data come from for developing the model? Where did the data come from for validating the model? What were the results of model simulations?</p></li><li><p>Current Functionality in SEEK</p><p>Show all data used for construction together with the model, so that the process can be repeated; uploaded models are loaded with this data by default; manually alter parameters and run simulations</p></li><li><p>Next Steps: Model Validation. Test/compare the model with experimental data for the complete system: find data in SEEK; upload data from elsewhere; automatically load into the model; run simulations and compare with original results</p><p>JERM for models: mapping tools allow you to identify columns/rows in spreadsheets containing the right information</p></li><li><p>ISA for Models. Modelling and experimental work intersect. Investigation, Study, Assay ... or modelling analysis. Modelling analysis types: metabolic models, gene networks. Modelling type: ODE, algebraic</p><p>Studies: combinations of experimental assays, modelling analyses, and informatics analyses</p></li><li><p>SysMO-DB, the e-Laboratory: an e-Laboratory is an information system for bringing together people, data and analytical methods at the point of investigation or decision-making</p></li><li><p>Current Status</p><p>Finding things so that we can compare them; understanding who has what; understanding what can be compared with what: the experimental context</p></li><li><p>Where we are going: a dynamic resource for analysis as well as browsing</p><p>Automatic comparison of data from inside files; understanding where and how data and models
are linked; running simulations with new experimental data; running analyses and workflows over the data and models</p></li><li><p>Workflows from myExperiment: data preparation, annotation and analysis; Systems Biology workflow pack on myExperiment; microarray analysis and text mining</p><p>Created by Afsaneh Maleki-Dizaji from SUMO, University of Sheffield. Based on previous work by Paul Fisher, University of Manchester</p></li><li><p>SEEK as a data analysis and meta-analysis service: SBML model construction and population</p><p>Calibration workflow. Data requirements: parameterised SBML model; experimental data; metabolite concentrations from key results database. Calibration by COPASI web service. Peter Li</p></li><li><p>Data analysis and meta-analysis: SEEK Analysis Service with pre-cooked analysis tools. Calibration workflow. Data requirements: parameterised SBML model; experimental data; metabolite concentrations from key results database. Calibration by COPASI web service. Peter Li. Load model; load data; GO</p></li><li><p>New Directions</p></li><li><p>Opening SysMO Out: using SysMO as a dissemination space for the SysMO consortium; supplementary material in publications; data citation; packaging software so that others can use it; easy to install a SEEK for yourself; packaging and exchanging JERM templates; helping with standardisation; promotion and example work with SBRML and data and models linkage</p></li><li><p>SysMO-DB Approach in Other projects</p><p>SysMO2: new projects and legacy; EraSysBio+; Lungsys and SBCancer; Virtual Liver</p></li><li><p>New Considerations</p><p>Eukaryotic organisms; interactions between host and pathogen; human disease; multicellular interactions, tissues, organs; multiscale modelling</p></li><li><p>Outstanding Issues</p><p>Keeping data at project sites brings responsibilities. Reliability: sites available continuously and promptly. Support: must be proof against virus attacks, etc. Archiving: beyond the lifetime of the project.
</p></li><li><p>How it works: find a solution that fits in with current practices; start simple, show benefits, add more; engage with the people actually doing the work (PhD students, postdocs); let the scientists retain control over their data and who can see it; don't reinvent: use available vocabularies and minimal model standards; help prevent people duplicating work by linking the people as well as the resources</p></li><li><p>Acknowledgements</p><p>SysMO-DB Team; SysMO-PALS</p><p>myGrid, Hits and JWS Online teams; EMBL-EBI, MCISB</p><p></p><p>Notes: people reluctant to share; social challenge</p><p>The SysMO Projects. Eleven out of 32 collaborative projects have been selected for funding within SysMO. Working groups from the six partner countries, Austria (2), Germany (29), Norway (7), Spain (9), The Netherlands (15) and the United Kingdom (22), as well as from the Czech Republic (1), France (2) and Switzerland (4), contribute their experience and knowledge and will ensure the success of the SysMO initiative. The projects cover different fields of interest and will be supported by the partner institutions with approx.
28 M. Project 1: BaCell-SysMO. The transition from growing to non-growing Bacillus subtilis cells - a systems biology approach. Project 2: COSMIC. Systems Biology of Clostridium acetobutylicum - a possible answer to dwindling crude oil reserves. Project 3: SUMO. Systems Understanding of Microbial Oxygen Responses. Project 4: Ion and solute homeostasis in enteric bacteria: an integrated view generated from the interface of modelling and biological experimentation. Project 5: Comparative Systems Biology: Lactic Acid Bacteria. Project 6: PSYSMO. Systems analysis of biotech induced stresses: towards a quantum increase in process performance in the cell factory Pseudomonas putida. Project 7: Systems Biology of a genetically engineered Pseudomonas fluorescens with inducible exo-polysaccharide production: analysis of the dynamics and robustness of metabolic networks. Project 8: MOSES. MicroOrganism Systems Biology: Energy and Saccharomyces cerevisiae. Project : TRANSLUCENT. Gene interaction networks an...</p></li></ul>
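
The "What is the JERM?" slide lists the minimum information needed to exchange a dataset: the type of data, what was measured, what the values mean, which experiment it relates to, and how it was created. As a rough illustration only, that checklist could be held as a small record with a completeness check. The class, field and function names below are hypothetical and are not taken from SysMO-DB, SEEK or the actual JERM templates.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class JermRecord:
    """Hypothetical minimal description of one dataset (not the real JERM schema)."""
    data_type: Optional[str] = None   # e.g. "microarray", "growth curve"
    measured: Optional[str] = None    # e.g. "gene expression", "OD"
    units: Optional[str] = None       # what the values in the dataset mean
    experiment: Optional[str] = None  # which experiment (ISA assay) it relates to
    sop: Optional[str] = None         # how the data was created (SOP or protocol)


def missing_fields(record: JermRecord) -> List[str]:
    """Return the names of checklist fields that are still empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Illustrative values only: a dataset described without its experimental context.
record = JermRecord(data_type="growth curve", measured="OD", units="OD600")
print(missing_fields(record))  # ['experiment', 'sop']
```

A check of this kind reports what is missing without rejecting the dataset, which matches the deck's stated approach of recommending formats without enforcing them.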

