47
A summary of various COMBINE standardization activities Michael Hucka, Ph.D. Department of Computing and Mathematical Sciences California Institute of Technology Pasadena, CA, USA Whole-cell modeling summer school, Rostock, Germany, March 2015

A summary of various COMBINE standardization activities

Embed Size (px)

Citation preview

Page 1: A summary of various COMBINE standardization activities

A summary of various COMBINE standardization activities

Michael Hucka, Ph.D. Department of Computing and Mathematical Sciences

California Institute of Technology Pasadena, CA, USA

Whole-cell modeling summer school, Rostock, Germany, March 2015

Page 2: A summary of various COMBINE standardization activities

The need for standards

Page 3: A summary of various COMBINE standardization activities
Page 4: A summary of various COMBINE standardization activities
Page 5: A summary of various COMBINE standardization activities

ABC-SysBio CellNetAnalyzer Karyote* PaVESy SBW: Auto Layout acslXtreme CellNOpt KEGGconverter PAYAO sbw: javasim ALC Cellware KEGGtranslator PET sbw: stochastic simulator AMIGO CLEML Kineticon PhysioLab Modeler SCIpath Antimony CL-SBML Kinsolver PINT SED-ML Web Tools APMonitor COBRA libAnnotationSBML PK-Sim / MoBi semanticSBML Arcadia CompuCell3D libRoadRunner PNK SensSB Asmparts ConsensusPathDB libSBML PottersWheel SGMP Athena COPASI libSBMLSim PRISM Sigmoid* AutoSBW CRdata libStruct ProcessDB SIGNALIGN AVIS CycSim MASS Toolbox ProMoT SignaLink BALSA CySBML MatCont PROTON SigPath BASIS Cytoscape MathSBML pybrn SigTran BetaWB Cyto-Sim Medicel PyDSTool SIMBA Bifurcation Discovery Tool DBSolve MEMOSys PySB SimBiology BiGG DEDiscover MesoRD PySCeS Simpathica BiNoM Dizzy Meta-All RANGE SimPheny* BiNoM Cytoscape Plugin DOTcvpSB Metaboflux RAVEN Simulate3D Bio Sketch Pad E-CELL MetaCrop Reactome Simulation Core Library BioBayes ecellJ MetaFluxNet ReMatch Simulation Tool BIOCHAM EPE Metannogen RMBNToolbox SimWiz BioCharon ESS Metatool roadRunner SloppyCell BioCyc Facile MetExplore RSBML SmartCell BioGRID FAME MetNetMaker SABIO-RK Snoopy Biological Networks FASIMU MIRIAM Resources Saint SOSlib BioMet Toolbox FBASBW MMT2 SBFC SPDBS BioModels Database FERN modelMaGe SBML Harvester SRS BioModels Importer FluxBalance ModeRator SBML Layout STEPS BioNessie Fluxor Modesto SBML Reaction Finder StochKit BioNetGen Genetdes Moleculizer SBML Translators StochPy BioPARKIN Genetic Network Analyzer MonaLisa SBML2APM StochSim BioPathwise Gepasi Monod SBML2BioPax STOCKS BioPAX2SBML Gillespie2 MOOSE SBML2LaTeX SurreyFBA BioRica GINsim MuVal (Multi-valued logic) SBML2NEURON SyBiL BioSens GNAT Narrator SBML2Octave SYCAMORE BioSPICE Dashboard GNU MCSim nemo SBML2SMW SynBioSS BioSpreadsheet GRENDEL NetBuilder' SBML2TikZ Systrip BioSyS HSMB NetPath SBML2XPP TERANODE Suite BioTapestry HybridSBML NetPro SBMLEditor The Cell Collective BioUML iBioSim Odefy SBML-PET-MPI Tide BoolNet IBRENA Omix SBMLR TinkerCell braincirc Insilico Discovery ONDEX SBML-SAT Trelis BRENDA insilicoIDE optflux SBML-shorthand UTKornTools BSTLab iPathways Oscill8 SBMLSim VANTED ByoDyn JACOBIAN PANTHER Pathway SBMLsqueezer Vcell CADLIVE Jacobian Viewer PathArt sbmltidy WebCell Cain Jarnac Pathway Access SBMLToolbox WinSCAMP CARMEN JarnacLite Pathway Analyser SBMM assistant Wolfram SystemModeler Cell Illustrator JDesigner Pathway Builder SBO xCellerator CellDesigner JigCell Pathway Solver SBSI Xholon Cellerator JSBML Pathway Tools SBToolbox2 XPPAUT CellMC JSim PathwayLab sbtranslate CellML2SBML JWS Online PATIKAweb SBW

Many software tools for modeling available today

Page 6: A summary of various COMBINE standardization activities

https://www.behance.net/gallery/d/7465033

Modeling often entails the use of more than one tool

Page 7: A summary of various COMBINE standardization activities

Different tools ⇒ different interfaces & languages

Page 8: A summary of various COMBINE standardization activities

Developing standards is not easyNeed bring together people with diverse set of knowledge & skills

• Scientific needs

• Technical implementation skills

• Practical experience

Need manage multiple phases

• Creation

• Evolution

• Support

Page 9: A summary of various COMBINE standardization activities

Developing standards is not easyNeed bring together people with diverse set of knowledge & skills

• Scientific needs

• Technical implementation skills

• Practical experience

Need manage multiple phases

• Creation

• Evolution

• Support} This is just for the specification of the

standards, to say nothing of the necessary software and other infrastructure!

Page 10: A summary of various COMBINE standardization activities

Very recent example of how doing a poor job can be costly

Claim: published FBA results are incorrect. Reality: poor file format led to misinterpretation.

Page 11: A summary of various COMBINE standardization activities

Where does COMBINE fit in?

Page 12: A summary of various COMBINE standardization activities

Realizations about the state of affairs in late-2000’s

• Many efforts overlapped, but lacked coordination

• Efforts were inventing their own processes from scratch

• Many individual meetings meant more travel for many people

• Limited and fragile funding didn’t support solid, coherent base

COMBINE = Computational Modeling in Biology Network

• Coordinate meetings

• Coordinate standards development

• Develop common procedures & tools

• Provide a recognized voice

Motivations for the creation of COMBINE

Page 13: A summary of various COMBINE standardization activities

Standardization efforts represented in COMBINE today

BioPAX

Qualifiers

GPML

COMBINE Standards

Associated Standardization Efforts

Related Standardization Efforts

COMBINE Archive

MAMO

Page 14: A summary of various COMBINE standardization activities

COMBINE formats cover many areas– from Nicolas Le Novère

Page 15: A summary of various COMBINE standardization activities

Example of practical infrastructure provided by COMBINECommon URI scheme for specification documents

!

!

!

!

!

!

E.g.: http://identifiers.org/combine.specifications/sbgn.er.level-1.version-1

• Resolves to a page that lists where the specification can be found

- Actual document can be stored anywhere

• Hierarchical structure

Page 16: A summary of various COMBINE standardization activities

SBML in (slightly) more depth

Page 17: A summary of various COMBINE standardization activities

Background context: formal models

{

Page 18: A summary of various COMBINE standardization activities

Communication of models in the olden days…Describe the models and equations in a paper, and you’re done, right?

Problems:

• Errors in printing

• Missing information

• Dependencies onimplementation

• Outright errors

• Can be a hugeeffort to recreate

Solution: use a standardelectronic format

Page 19: A summary of various COMBINE standardization activities
Page 20: A summary of various COMBINE standardization activities

Format for representing models of biological processes

• Data structures + principles + serialization to XML

• (Mostly) Declarative, not procedural—not a scripting language

(Mostly) neutral with respect to modeling framework

• E.g., ODE, stochastic systems, etc.

Does not store experimental data, or simulation descriptions

• But software may write their own metadata (annotations) in SBML

For software to read/write, not humans

SBML = Systems Biology Markup Language

Page 21: A summary of various COMBINE standardization activities

SBML is a file format based on XML

Page 22: A summary of various COMBINE standardization activities

SBML is a file format based on XML

Software shields you from working with this directly

Page 23: A summary of various COMBINE standardization activities

The process is central

• Literally called “reaction” (not necessarily biochemical)

• Participants are pools of entities of the same kind (“species”) !

!

!

!

• Species are located in containers (“compartments”)

- Core SBML assumes well-mixed compartments

Models can further include:

• Other constants & variables

• Discontinuous events

Core SBML concepts are fairly simple

• Unit definitions

• Annotations

• Other, explicit math

na1 A nb1 B+ nc1 Cf1(...)

na2 A nd2 D+ ne1 Ef2(...)

. . .nc3 C nf3 F

f3(...) + ng3 G

Page 24: A summary of various COMBINE standardization activities

Core SBML constructs support many types of models

!

ODE (e.g., cell differentiation)

Conductance-based (e.g., Hodgin-Huxley)

Typically do not use SBML “reaction” construct,but instead use “rate rules” construct

Neural (e.g., spiking neurons)

Typically use “events” for discontinuous changes

Pharmacokinetic/dynamics

“Species” are not required to be biochemical entities

Infectious diseases BioModels Database model #MODEL1008060001

BioModels Database model #BIOMD0000000451

BioModels Database model #BIOMD0000000020

BioModels Database model #BIOMD0000000127

BioModels Database model #BIOMD0000000234

Example of model type Example model

List originally by Nicolas Le Novére

Page 25: A summary of various COMBINE standardization activities

Community-based development processDefines process for

• Proposing changes and additions to SBML and SBML packages

• Developing specifications

• Voting

• The roles of editors

!

Small changes forthcoming in package requirements and procedures

Page 26: A summary of various COMBINE standardization activities

Frank Bergmann

Andreas Dräger

Michael Hucka

Brett Olivier

Sven Sahle

Dagmar Waltemath

SBML Editors

Stefan Hoops

Sarah Keating

Nicolas Le Novère

Chris Myers

James Schaff

Lucian Smith

Darren Wilkinson

PastCurrent(chair)

Page 27: A summary of various COMBINE standardization activities

SBML Level 1 SBML Level 2 SBML Level 3

predefined math functions user-defined functions user-defined functions

text-string math notation MathML subset MathML subset

reserved namespaces for annotations

no reserved namespaces for annotations

no reserved namespaces for annotations

no controlled annotation scheme

RDF-based controlled annotation scheme

RDF-based controlled annotation scheme

no discrete events discrete events discrete events

default values defined default values defined no default values

monolithic monolithic modular & extensible

Page 28: A summary of various COMBINE standardization activities

Level 3 packages add constructs on top of SBML Level 3 Core

Page 29: A summary of various COMBINE standardization activities

SBML Level 3 What it Hierarchical model composition Models containing submodels ✔

Flux balance constraints Constraint-based models ✔

Qualitative models Petri net models, Boolean models ✔

Graph layout Diagrams of models ✔

Multicomponent/state species Entities w/ structure; also rule-based models draft

Spatial Nonhomogeneous spatial models draft

Graph rendering Diagrams of models draft

Groups Arbitrary grouping of components draft

Arrays Arrays of entities draft

Dynamic structures Creation & destruction of components draft

Distributions Numerical values as statistical distributions draft

Annotations Richer annotation syntax

Status

Page 30: A summary of various COMBINE standardization activities

SBML Level 3 ‘comp’: Hierarchical Model Composition

Species ...Compartments ...

Parameters ...Reactions ...

Model “A”

Core SBML

Page 31: A summary of various COMBINE standardization activities

SBML Level 3 ‘comp’: Hierarchical Model Composition

Species ...Compartments ...

Parameters ...Reactions ...

Model “A”

Core SBML

Species ...Compartments ...

Parameters ...Reactions ...

Model “A”

With hierarchical model composition

Species ...Compartments ...

Parameters ...Reactions ...

Model “B”

Species ...Compartments ...

Parameters ...Reactions ...

Model “C”

Page 32: A summary of various COMBINE standardization activities

The ‘comp’ package supports multiple arrangements

Species ...Compartments ...

Parameters ...Reactions ...

Model “A”

Species ...Compartments ...

Parameters ...Reactions ...

Model “B”

Separate files (possibly in databases)

Species ...Compartments ...

Parameters ...Reactions ...

Model “C”

Model “C”

Model “D”

Species ...Compartments ...

Parameters ...Reactions ...

Model “D”

Model “B”

(Think of libraries of

tested models.)

Page 33: A summary of various COMBINE standardization activities

Results can be “flattened” to plain SBML Level 3 CoreFacility implemented in libSBML

Allows tools to read L3 + ‘comp’ models as if they were just plain L3

Species ...Compartments ...

Parameters ...Reactions ...

Model “A”

Species ...Compartments ...

Parameters ...Reactions ...

Model “B”

Species ...Compartments ...

Parameters ...Reactions ...

Model “C”

Model “C”

Model “D”

Species ...Compartments ...

Parameters ...Reactions ...

Model “D”

Model “B” Species ...Compartments ...

Parameters ...Reactions ...

Model “A”

SBML Level 3 CoreOriginal SBML Level 3 Core + SBML ‘comp’

Page 34: A summary of various COMBINE standardization activities

Layer on top of SBML Level 3 Core to define syntax for constraint-based (e.g., flux-balance analysis) models

Nicknamed ‘fbc’

Developed by Brett Olivier and Frank Bergmann

SBML Level 3 ‘fbc’: Flux Balance Constraints

Page 35: A summary of various COMBINE standardization activities

Layer on top of SBML Level 3 Core to define syntax for constraint-based (e.g., flux-balance analysis) models

Nicknamed ‘fbc’

Developed by Brett Olivier and Frank Bergmann

SBML Level 3 ‘fbc’: Flux Balance Constraints

New version was just agreed upon last week!

New draft specification is under development.

Page 36: A summary of various COMBINE standardization activities

SBML Level 3 ‘distrib’: DistributionsLayer on top of SBML Level 3 Core to provide mechanisms for:

1. Sampling a random value from a probability distribution

2. Describe uncertainty about values of elements

Uses UncertML 3.0 for the definition of various distribution functions

Page 37: A summary of various COMBINE standardization activities

Outli

ne

Introduction and motivation

COMBINE

SBML

SED-ML

SBGN

Conclusion

Page 38: A summary of various COMBINE standardization activities

Need to capture the processes applied to models

?

BIOMD0000000319 in BioModels Database

Decroly & Goldbeter, PNAS, 1982

Page 39: A summary of various COMBINE standardization activities

Format to capture procedures, algorithms, parameter values

• Neutral format for encoding the steps to go from model to output

Can be used for

• Simulation experiments encoding parametrizations & perturbations

• Simulations using more than one model and/or method

• Data manipulations to produce plot(s)

!

SED-ML = Simulation Experiment Description ML

Page 40: A summary of various COMBINE standardization activities

Software apps & libraries available for SED-ML Level 1 v.1Some SED-ML-compatible software today:

• libSedML

• jlibsedml

• SBW Simulation Tool

• CellDesigner

• Web tools

• others

http://sedml.org

Page 41: A summary of various COMBINE standardization activities

Outli

ne

Introduction and motivation

COMBINE

SBML

SED-ML

SBGN

Conclusion

Page 42: A summary of various COMBINE standardization activities

Graphical representation of modelsToday: broad variation in graphical notation used in biological diagrams

• Between authors, between journals, even people in same group

However, standard notations would offer benefits:

• Consistency = easier to read diagrams with less ambiguity

• Software support: verification of correctness, translation to math

Page 43: A summary of various COMBINE standardization activities

SBGN = Systems Biology Graphical NotationGoal: standardize the graphical notation in diagrams of biological processes

3 sublanguages to describe different facets of a model

• Process Diagram: causal sequences of processes & their results

- A node represents a given state of an entity

• Entity Relationship: interactions bet. entities regardless of sequence

- A node represents an entity regardless of state

• Activity Flow: information flowing from one entity to another

- Hybrid — shows flow of activity without state transitions

Languages reuse same symbols, but their interpretations are different

Page 44: A summary of various COMBINE standardization activities

SBGN support todayBeing used in publications

Numerous software tools and databases

• API libraries are under development

See http://sbgn.org for more

Martin et al., Autophagy, Jan. 2013

Reactome Database — http://reactome.org

Page 45: A summary of various COMBINE standardization activities

Acknowledgments

Page 46: A summary of various COMBINE standardization activities

Huge thanks to everyone in the COMBINE community

Attendees of COMBINE 2013, Paris, France

Attendees of COMBINE 2014, Los Angeles, California, USA

Page 47: A summary of various COMBINE standardization activities

National Institute of General Medical Sciences (USA) Air Force Office of Scientific Research (USA) BBSRC (UK) Beckman Institute, Caltech (USA) DARPA IPTO Bio-SPICE Bio-Computation Program (USA) Drug Disease Model Resources (EU-EFPIA Innovative Medicine Initiate) ELIXIR (UK) European Molecular Biology Laboratory (EMBL) Google Summer of Code International Joint Research Program of NEDO (Japan) Japanese Ministry of Agriculture Japanese Ministry of Education, Culture, Sports, Science and Technology JST ERATO Kitano Symbiotic Systems Project (Japan) (to 2003) JST ERATO-SORST Program (Japan) Keio University (Japan) Molecular Sciences Institute (USA) National Science Foundation (USA) STRI, University of Hertfordshire (UK)

SBML funding sources over the past 14 years