Morning tutorial given at the COMBINE/ERASysApp day of tutorials on "Modelling and Simulation of Biological Models" on Sunday, September 14, ahead of ICSB 2014 in Melbourne, Australia.
SBML (the Systems Biology Markup Language)
Michael Hucka, Ph.D.
Department of Computing and Mathematical Sciences California Institute of Technology
Pasadena, CA, USA
COMBINE/ERASysApp tutorial, Melbourne, Australia, September 2014
Email: [email protected] Twitter: @mhucka
What is SBML for, and why would anyone care?
ABC-SysBio CellNetAnalyzer Karyote* PaVESy SBW: Auto Layout acslXtreme CellNOpt KEGGconverter PAYAO sbw: javasim ALC Cellware KEGGtranslator PET sbw: stochastic simulator AMIGO CLEML Kineticon PhysioLab Modeler SCIpath Antimony CL-SBML Kinsolver PINT SED-ML Web Tools APMonitor COBRA libAnnotationSBML PK-Sim / MoBi semanticSBML Arcadia CompuCell3D libRoadRunner PNK SensSB Asmparts ConsensusPathDB libSBML PottersWheel SGMP Athena COPASI libSBMLSim PRISM Sigmoid* AutoSBW CRdata libStruct ProcessDB SIGNALIGN AVIS CycSim MASS Toolbox ProMoT SignaLink BALSA CySBML MatCont PROTON SigPath BASIS Cytoscape MathSBML pybrn SigTran BetaWB Cyto-Sim Medicel PyDSTool SIMBA Bifurcation Discovery Tool DBSolve MEMOSys PySB SimBiology BiGG DEDiscover MesoRD PySCeS Simpathica BiNoM Dizzy Meta-All RANGE SimPheny* BiNoM Cytoscape Plugin DOTcvpSB Metaboflux RAVEN Simulate3D Bio Sketch Pad E-CELL MetaCrop Reactome Simulation Core Library BioBayes ecellJ MetaFluxNet ReMatch Simulation Tool BIOCHAM EPE Metannogen RMBNToolbox SimWiz BioCharon ESS Metatool roadRunner SloppyCell BioCyc Facile MetExplore RSBML SmartCell BioGRID FAME MetNetMaker SABIO-RK Snoopy Biological Networks FASIMU MIRIAM Resources Saint SOSlib BioMet Toolbox FBASBW MMT2 SBFC SPDBS BioModels Database FERN modelMaGe SBML Harvester SRS BioModels Importer FluxBalance ModeRator SBML Layout STEPS BioNessie Fluxor Modesto SBML Reaction Finder StochKit BioNetGen Genetdes Moleculizer SBML Translators StochPy BioPARKIN Genetic Network Analyzer MonaLisa SBML2APM StochSim BioPathwise Gepasi Monod SBML2BioPax STOCKS BioPAX2SBML Gillespie2 MOOSE SBML2LaTeX SurreyFBA BioRica GINsim MuVal (Multi-valued logic) SBML2NEURON SyBiL BioSens GNAT Narrator SBML2Octave SYCAMORE BioSPICE Dashboard GNU MCSim nemo SBML2SMW SynBioSS BioSpreadsheet GRENDEL NetBuilder' SBML2TikZ Systrip BioSyS HSMB NetPath SBML2XPP TERANODE Suite BioTapestry HybridSBML NetPro SBMLEditor The Cell Collective BioUML iBioSim Odefy SBML-PET-MPI Tide BoolNet IBRENA Omix SBMLR 
TinkerCell braincirc Insilico Discovery ONDEX SBML-SAT Trelis BRENDA insilicoIDE optflux SBML-shorthand UTKornTools BSTLab iPathways Oscill8 SBMLSim VANTED ByoDyn JACOBIAN PANTHER Pathway SBMLsqueezer Vcell CADLIVE Jacobian Viewer PathArt sbmltidy WebCell Cain Jarnac Pathway Access SBMLToolbox WinSCAMP CARMEN JarnacLite Pathway Analyser SBMM assistant Wolfram SystemModeler Cell Illustrator JDesigner Pathway Builder SBO xCellerator CellDesigner JigCell Pathway Solver SBSI Xholon Cellerator JSBML Pathway Tools SBToolbox2 XPPAUT CellMC JSim PathwayLab sbtranslate CellML2SBML JWS Online PATIKAweb SBW
Many software tools for modeling and simulation are available
https://www.behance.net/gallery/d/7465033
Research often involves the use of more than one tool
Need flexible way to exchange results between tools (and researchers)
Why not simply distribute a model in the original format?
Of course, you should do that anyway: it’s vital for good science
• Allows others to run the original model, understand it, verify it, etc.
But native formats by themselves are not ideal for communication
• Not everyone can run the same software
- One tool’s native format is not necessarily another tool’s format
• Biological semantics often not encoded in native formats
• A neutral format can allow more flexible model reuse
• A common format can allow people to relate your work to theirs more directly (and algorithmically)
Example fragment of model in MATLAB:
    global p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 p14 p15 p16 p17 p18 p19;
    …
    % Declare parameters
    u1 = inf1; u4 = inf4;
    kdelay = 5/8;
    …
    p1 = 0.00174155; p2 = 8; p3 = 0.868; p4 = 0.108; p5 = 584;
    …
    q4F = (p24+p25*q(1)+p26*q(1)^2+p27*q(1)^3)*q(4);
    q1F = (p7 +p8 *q(1)+p9 *q(1)^2+p10*q(1)^3)*q(1);
    SR3 = (p19*q(19))*d3;
    SR4 = (p1 *q(19))*d1;
    …
    % ODEs
    q1dot = SR4+p3*q(2)+p4*q(3)-(p5+p6)*q1F+p11*q(11)+u1;
    q2dot = p6*q1F-(p3+p12+p13/(p14+q(2)))*q(2);
    q3dot = p5*q1F-(p4+p15/(p16+q(3))+p17/(p18+q(3)))*q(3);
    …
What is SBML?
SBML is a file format based on XML
Don’t work with it directly! Let software do it.
SBML = Systems Biology Markup Language
Format for representing models of biological processes
• Data structures + principles + serialization to XML
• (Mostly) declarative, not procedural; it is not a scripting language
(Mostly) neutral with respect to modeling framework
• E.g., ODE, stochastic systems, etc.
Does not store experimental data or simulation descriptions
• But software tools may write their own metadata (annotations) in SBML
For software to read/write, not humans
Core SBML concepts are fairly simple
The process is central
• Literally called “reaction” (not necessarily biochemical)
• Participants are pools of entities of the same kind (“species”)
    $n_{a1}\,A \xrightarrow{f_1(\ldots)} n_{b1}\,B + n_{c1}\,C$
    $n_{a2}\,A \xrightarrow{f_2(\ldots)} n_{d2}\,D + n_{e1}\,E$
    $n_{c3}\,C \xrightarrow{f_3(\ldots)} n_{f3}\,F + n_{g3}\,G$
    . . .
• Species are located in containers (“compartments”)
  - Core SBML assumes well-mixed compartments
Models can further include:
• Other constants & variables
• Discontinuous events
• Unit definitions
• Annotations
• Other, explicit math
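To make the three core concepts concrete, here is a minimal Python sketch that assembles an SBML-like XML tree with one compartment, two species, and one reaction. It uses only the standard library and is not a validated SBML document: several attributes that SBML Level 3 requires are omitted, and the model id `toy_model` and the helper name `build_minimal_model` are made up for illustration. Real applications would normally rely on libSBML or JSBML instead.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 Core.
SBML_NS = "http://www.sbml.org/sbml/level3/version1/core"


def build_minimal_model() -> ET.Element:
    """Build an SBML-like tree: one compartment, two species, one reaction.

    Illustrative only; required Level 3 attributes are left out for brevity.
    """
    sbml = ET.Element("sbml", {"xmlns": SBML_NS, "level": "3", "version": "1"})
    model = ET.SubElement(sbml, "model", {"id": "toy_model"})  # hypothetical id

    # Compartments are the containers in which species are located.
    compartments = ET.SubElement(model, "listOfCompartments")
    ET.SubElement(compartments, "compartment", {"id": "cell", "size": "1e-15"})

    # Species are pools of entities of the same kind.
    species = ET.SubElement(model, "listOfSpecies")
    for sid, amount in (("S1", "1000"), ("S2", "0")):
        ET.SubElement(
            species,
            "species",
            {"id": sid, "compartment": "cell", "initialAmount": amount},
        )

    # A reaction is the process that converts reactants into products.
    reactions = ET.SubElement(model, "listOfReactions")
    r1 = ET.SubElement(reactions, "reaction", {"id": "r1", "reversible": "false"})
    reactants = ET.SubElement(r1, "listOfReactants")
    ET.SubElement(reactants, "speciesReference", {"species": "S1", "stoichiometry": "1"})
    products = ET.SubElement(r1, "listOfProducts")
    ET.SubElement(products, "speciesReference", {"species": "S2", "stoichiometry": "1"})
    return sbml


if __name__ == "__main__":
    print(ET.tostring(build_minimal_model(), encoding="unicode"))
```

Every species points at a compartment by id, and every species reference points at a species by id; that cross-referencing by identifier is the main structural idea the sketch is meant to show.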
Core SBML constructs support many types of models
Example of model type | Example model
Typical ODE models (e.g., cell differentiation) | BioModels Database model #BIOMD0000000451
Conductance-based models (e.g., Hodgkin-Huxley); typically do not use the SBML “reaction” construct, but instead use the “rate rules” construct | BioModels Database model #BIOMD0000000020
Neural models (e.g., spiking neurons); typically use “events” for discontinuous changes | BioModels Database model #BIOMD0000000127
Pharmacokinetic/dynamics models; “species” are not required to be biochemical entities | BioModels Database model #BIOMD0000000234
Infectious diseases | BioModels Database model #MODEL1008060001
List originally by Nicolas Le Novère
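As a rough illustration of how a rate rule and an event can work together in a neural-style model, the Python sketch below integrates a single variable `v` with forward Euler and applies a reset whenever a trigger condition switches from false to true. The equation dv/dt = drive - v, the parameter values, and the function name `simulate_spikes` are assumptions chosen for the example; they are not taken from any of the BioModels entries listed above.

```python
def simulate_spikes(t_end=10.0, dt=0.001, drive=1.5, threshold=1.0, reset=0.0):
    """Integrate dv/dt = drive - v and reset v when it crosses the threshold.

    The continuous part plays the role of a rate rule; the reset plays the
    role of an event whose trigger is `v >= threshold`.
    Returns the list of times at which the event fired.
    """
    v = reset
    spike_times = []
    trigger_was_true = False
    steps = int(round(t_end / dt))
    for i in range(steps):
        # Continuous dynamics (hypothetical rate rule).
        v += dt * (drive - v)

        # An event fires only on a false -> true transition of its trigger.
        trigger_is_true = v >= threshold
        if trigger_is_true and not trigger_was_true:
            spike_times.append((i + 1) * dt)
            v = reset  # event assignment: discontinuous change
            trigger_is_true = v >= threshold  # re-evaluate after the assignment
        trigger_was_true = trigger_is_true
    return spike_times


if __name__ == "__main__":
    times = simulate_spikes()
    print(f"{len(times)} events fired; first at t = {times[0]:.3f}")
```

With a drive below the threshold, the trigger never becomes true and no event fires, which is a quick way to check that the discontinuity comes from the event and not from the continuous equation.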
Many examples of SBML and software resources are available
Accepted by dozens of journals *
100’s of software tools available today
• Libraries: libSBML, JSBML
• 260+ listed in SBML Software Guide †
1000’s of models available
• ... in public databases, e.g., BioModels Database, Reactome
• ... as supplementary data to papers
• ... in private repositories
* http://sbml.org/Documents/Publications_known_to_accept_submissions_in_SBML_format
† http://sbml.org/SBML_Software_Guide
http://sbml.org
Where to find software applications compatible with SBML
Find SBML software
Results of 2011 survey of SBML-compatible software
Question: Which of the following categories best describe your software? (Check all that apply.)
Category | Responses (out of 81)
Simulation software | 42
Analysis s/w (in addition to, or instead of, simulation) | 40
Creation/model development software | 31
Visualization/display/formatting software | 31
Utility software (e.g., format conversion) | 23
Data integration and management software | 16
Repository or database | 14
Framework or library (for use in developing s/w) | 13
S/w for interactive env. (e.g., MATLAB, R, ...) | 13
Annotation software | 11
What are some practical details useful to know about SBML?
SBML files can come from anywhere
• From modeling environments
• From databases
• From papers
• From other people
SBML files
Format: UTF-8 (a text file format)
Extension: usually .xml (sometimes .sbml)
Applications usually have their own native format
• Import/export SBML
Models exist in different “Levels and Versions” of SBML
• Indicated at top of file
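Because the Level and Version are attributes of the root `<sbml>` element, they can be read without parsing the rest of the model. The following is a small standard-library sketch; the function name `sbml_level_and_version` and the example document are illustrative assumptions, and production code would more typically ask libSBML or JSBML for this information.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

# Hypothetical example document; only the root element matters here.
EXAMPLE = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core"
      level="3" version="1">
  <model id="toy_model"/>
</sbml>"""


def sbml_level_and_version(xml_text: str) -> Tuple[int, int]:
    """Return (level, version) as declared on the root <sbml> element."""
    root = ET.fromstring(xml_text)
    # ElementTree reports namespaced tags as "{namespace}localname".
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name != "sbml":
        raise ValueError(f"root element is <{local_name}>, expected <sbml>")
    return int(root.attrib["level"]), int(root.attrib["version"])


if __name__ == "__main__":
    level, version = sbml_level_and_version(EXAMPLE)
    print(f"SBML Level {level} Version {version}")
```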
SBML identifiers and names
Most elements have both an “id” and a “name” field
• Identifier field has restricted syntax: abc123 or _abc123, etc.
- This “id” field value is what you use in math formulas in SBML
• Value of “name” field is almost completely unrestricted
- Names with spaces, $tr@nge and Fμηηγ characters!
• Must assign a value to “id”, but “name” is optional
Some tools let you use names and ignore ids—they autogenerate ids
• But some (especially those w/ script language features) expose ids
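The difference between the restricted “id” syntax and free-form names can be sketched in a few lines of Python. The pattern below encodes the rule that an identifier starts with a letter or underscore and continues with letters, digits, or underscores. The helper `id_from_name` is a hypothetical illustration of how a tool might autogenerate ids from names; actual tools use their own schemes.

```python
import re

# Letter or underscore first, then letters, digits, or underscores.
SID_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_sid(candidate: str) -> bool:
    """Check whether a string fits the restricted identifier syntax."""
    return SID_PATTERN.fullmatch(candidate) is not None


def id_from_name(name: str, taken: set) -> str:
    """Hypothetical helper: derive a unique, valid id from a free-form name."""
    base = re.sub(r"[^A-Za-z0-9_]", "_", name)
    if not base or base[0].isdigit():
        base = "_" + base
    candidate, counter = base, 1
    while candidate in taken:
        counter += 1
        candidate = f"{base}_{counter}"
    taken.add(candidate)
    return candidate


if __name__ == "__main__":
    used = set()
    for name in ["abc123", "$tr@nge name", "$tr@nge name", "2nd messenger"]:
        print(repr(name), "->", id_from_name(name, used))
```

The `taken` set is passed in explicitly so that uniqueness is enforced across the whole model, since ids (unlike names) have to be unambiguous when used in math formulas.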
SBML “rules”
“Rules” in SBML define extra mathematical expressions
• E.g., when you need to express additional mathematical relationships beyond what is implied by the system of reactions
3 subtypes:
Rule type | General form | Example
algebraic | $0 = f(W)$ | $0 = S_1 + S_2$
assignment | $x = f(V)$ | $x = y + z$
rate | $dx/dt = f(W)$ | $dS/dt = 10.5$
Rules define relationships that hold at all times
Rules in the context of the overall model
Rules and equations from reactions are taken together
Equations derived from reaction definitions:
    $dS_1/dt = r_1 + r_2 + r_3 + \ldots$
    $dS_2/dt = -r_1 + r_5 + \ldots$
    . . .
Algebraic rules:
    $0 = f_1(W)$
    $0 = f_2(W)$
    . . .
Assignment rules:
    $x = g_1(W)$
    $y = g_2(W)$
    . . .
Rate rules:
    $dm/dt = h_1(W)$
    $dq/dt = h_2(W)$
    . . .
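The idea that reaction-derived equations and rules form one system can be sketched with a tiny forward-Euler loop in Python. The model here is invented for illustration: one reaction S1 → S2 with rate k·S1, a rate rule dm/dt = 0.5, and an assignment rule total = S1 + S2. Algebraic rules are left out because they generally require an algebraic solver, which is beyond a short sketch.

```python
def simulate(t_end=1.0, dt=0.001, k=2.0):
    """Integrate a toy model where reactions and rules are taken together.

    All names and values are hypothetical and chosen only for illustration.
    """
    state = {"S1": 10.0, "S2": 0.0, "m": 0.0}
    steps = int(round(t_end / dt))
    for _ in range(steps):
        r1 = k * state["S1"]  # rate of the single reaction S1 -> S2
        derivatives = {
            "S1": -r1,   # equation derived from the reaction definition
            "S2": +r1,   # equation derived from the reaction definition
            "m": 0.5,    # rate rule: dm/dt = 0.5
        }
        for name, value in derivatives.items():
            state[name] += dt * value
    # Assignment rule: holds at all times, so it is recomputed from the state.
    state["total"] = state["S1"] + state["S2"]
    return state


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name} = {value:.4f}")
```

The variable set by the rate rule evolves alongside the species even though no reaction touches it, and the assignment rule is never integrated at all; it is simply evaluated from the current values.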
Interpreting reactions in SBML
Why are SBML rate expressions not identical to rate laws?
• Consider a reaction $S_1 \rightarrow S_2$ between 2 compartments, #1 (volume $V_1$, containing $S_1$) and #2 (volume $V_2$, containing $S_2$)
• Assume we are told that rate law $= k \cdot [S_1]$. What does this mean?
    $\frac{d[S_2]}{dt} = -\frac{d[S_1]}{dt} = k \cdot [S_1]$
• But what if volume $V_2 = 3 V_1$?
  - Look at the number of molecules in each compartment:
    $n_{S_1} = [S_1] \cdot V_1 \qquad n_{S_2} = [S_2] \cdot V_2$
  - How many molecules leave and enter each compartment?
    $k \cdot [S_1] \cdot V_1$ molecules leave the first compartment
    $3 \cdot k \cdot [S_1] \cdot V_1$ molecules enter the second compartment
Oops! Something’s wrong … three times as many molecules enter compartment #2 as leave compartment #1
“Kinetic law” in SBML
Rate expressions are (usually) substance/time, not substance/size/time
• “Substance” can be moles, or items, or mass, or “dimensionless”
Conversion for basic cases is simple:
• Multiply by volume of compartment where reactants are located:
    rate $= k \cdot [S_1] \cdot V_1$
• Express rates of change of reactants & products in terms of substances:
    $\frac{dn_{S_1}}{dt} = -k \cdot [S_1] \cdot V_1$
    $\frac{dn_{S_2}}{dt} = k \cdot [S_1] \cdot V_1$
• Can easily recover concentrations:
    $[S_1] = \frac{n_{S_1}}{V_1} \qquad [S_2] = \frac{n_{S_2}}{V_2}$
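The conversion above can be checked numerically. The Python sketch below tracks amounts (not concentrations) for S1 → S2 across two compartments with V2 = 3·V1, using forward Euler. The numeric values and the function name `simulate_transfer` are assumptions made for the example.

```python
def simulate_transfer(k, v1, v2, n_s1, n_s2, dt, steps):
    """Forward-Euler integration of S1 -> S2 across two compartments.

    The kinetic law is expressed in substance/time: rate = k * [S1] * V1.
    Returns final amounts and the concentrations recovered from them.
    """
    for _ in range(steps):
        conc_s1 = n_s1 / v1          # [S1] = n_S1 / V1
        rate = k * conc_s1 * v1      # substance per unit time
        n_s1 -= rate * dt            # dn_S1/dt = -k * [S1] * V1
        n_s2 += rate * dt            # dn_S2/dt = +k * [S1] * V1
    return n_s1, n_s2, n_s1 / v1, n_s2 / v2


if __name__ == "__main__":
    v1 = 1.0
    v2 = 3.0 * v1
    n1, n2, c1, c2 = simulate_transfer(
        k=0.5, v1=v1, v2=v2, n_s1=100.0, n_s2=0.0, dt=0.01, steps=200
    )
    print(f"amounts:        n_S1 = {n1:.3f}, n_S2 = {n2:.3f}")
    print(f"concentrations: [S1] = {c1:.3f}, [S2] = {c2:.3f}")
```

Working in amounts keeps the total number of molecules constant, while the concentration of S2 rises only one third as fast as the concentration of S1 falls, as expected for a compartment three times larger.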
SBML itself provides syntax and only limited semantics
Raw models alone are insufficient
• Low info content
• No standard identifiers
Need standard schemes for machine-readable annotations
• Identify entities
• Clarify mathematical semantics
• Link to other data resources
• Provide authorship & publication info
SBML supports two annotation schemes
SBO (Systems Biology Ontology)
• For mathematical semantics
• One SBML object ← one SBO term
• Short, compact, tightly coupled but limited scope
MIRIAM (Minimum Information Requested In the Annotation of Models)
• For any kind of annotation
• One SBML object ← multiple MIRIAM annotations
• Larger, more free-form, wider scope
Both are externalized and independent of SBML
<sbml ...>
  ...
  <listOfCompartments>
    <compartment id="cell" size="1e-15" />
  </listOfCompartments>
  <listOfSpecies>
    <species compartment="cell" id="S1" initialAmount="1000" />
    <species compartment="cell" id="S2" initialAmount="0" />
  </listOfSpecies>
  <listOfParameters>
    <parameter id="k" value="0.005" sboTerm="SBO:0000036" />
  </listOfParameters>
  <listOfReactions>
    <reaction id="r1" reversible="false">
      <listOfReactants>
        <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000010" />
      </listOfReactants>
      <listOfProducts>
        <speciesReference species="S2" stoichiometry="2" sboTerm="SBO:0000011" />
      </listOfProducts>
      <kineticLaw sboTerm="SBO:0000052">
        <math> ... </math>
        ...
</sbml>
SBO:0000036
“forward bimolecular rate constant, continuous case”
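Since an SBO term is just an attribute on an SBML element, software can collect the terms with ordinary XML tooling. The Python sketch below walks a small, well-formed fragment modeled on the example above and lists every `sboTerm` it finds. The fragment is an assumption: closing tags have been filled in and the MathML is left out. Only the label for SBO:0000036 comes from the slide; the other terms are reported without a label because no ontology lookup is performed here.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed fragment based on the slide's example.
FRAGMENT = """
<model>
  <listOfParameters>
    <parameter id="k" value="0.005" sboTerm="SBO:0000036"/>
  </listOfParameters>
  <listOfReactions>
    <reaction id="r1" reversible="false">
      <listOfReactants>
        <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000010"/>
      </listOfReactants>
      <listOfProducts>
        <speciesReference species="S2" stoichiometry="2" sboTerm="SBO:0000011"/>
      </listOfProducts>
      <kineticLaw sboTerm="SBO:0000052"/>
    </reaction>
  </listOfReactions>
</model>
"""

# Only this label is given in the slide; a real tool would query SBO itself.
KNOWN_LABELS = {
    "SBO:0000036": "forward bimolecular rate constant, continuous case",
}


def collect_sbo_terms(xml_text):
    """Return (element tag, element label, SBO term) for each annotated element."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter():
        term = element.get("sboTerm")
        if term is not None:
            label = element.get("id") or element.get("species") or element.tag
            found.append((element.tag, label, term))
    return found


if __name__ == "__main__":
    for tag, label, term in collect_sbo_terms(FRAGMENT):
        meaning = KNOWN_LABELS.get(term, "(label not looked up)")
        print(f"{tag} {label}: {term} = {meaning}")
```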
Why care about standard ways of writing annotations?
Structured, machine-readable annotations increase your model’s utility
• Allow more precise identification of model components
- Understand model structure
- Search/discover models
- Compare models
• Add a semantic layer that integrates knowledge into the model
- Helps recipients understand the underlying biology
- Allows for better reuse of models
- Supports conversion of models from one form to another
What has been happening with SBML lately?
SBML Level 1 | SBML Level 2 | SBML Level 3
predefined math functions | user-defined functions | user-defined functions
text-string math notation | MathML subset | MathML subset
reserved namespaces for annotations | no reserved namespaces for annotations | no reserved namespaces for annotations
no controlled annotation scheme | RDF-based controlled annotation scheme | RDF-based controlled annotation scheme
no discrete events | discrete events | discrete events
default values defined | default values defined | no default values
monolithic | monolithic | modular & extensible
Level 3 packages add constructs on top of SBML Level 3 Core
Level 3 package | What it enables | Status
Hierarchical model composition | Models containing submodels | ✔
Flux balance constraints | Constraint-based models | ✔
Qualitative models | Petri net models, Boolean models | ✔
Graph layout | Diagrams of models | ✔
Multicomponent/state species | Entities w/ structure; also rule-based models | draft
Spatial | Nonhomogeneous spatial models | draft
Graph rendering | Diagrams of models | draft
Groups | Arbitrary grouping of components | draft
Arrays | Arrays or sets of entities | draft
Dynamic structures | Creation & destruction of components | draft
Distributions | Numerical values as statistical distributions | draft
Annotations | Richer annotation syntax |
On the slide, three of these packages are additionally marked “Updated” and one is marked “New”.
Community-based development process
Defines process for
• Proposing changes and additions to SBML and SBML packages
• Developing specifications
• Voting
• The roles of editors
Small changes forthcoming in package requirements and procedures
Acknowledgments
SBML & JSBML Team: Mike Hucka, Sarah Keating, Frank Bergmann, Lucian Smith, Andrew Finney, Herbert Sauro, Hamid Bolouri, Ben Bornstein, Maria Schilstra, Jo Matthews, Bruce Shapiro, Linda Taddeo, Akira Funahashi, Akiya Juraku, Ben Kovitz, Nicolas Rodriguez, Andreas Dräger, Alex Thomas
SBML Editors: Mike Hucka, Frank Bergmann, Sarah Keating, Nicolas Le Novère, Chris Myers, Lucian Smith, Stefan Hoops, Sven Sahle, James Schaff, Dagmar Waltemath, Darren Wilkinson, Brett Olivier
And another huge thanks to everyone in the SBML and COMBINE communities for massive contributions to SBML development and continuing support
GSoC students: Victor Kofia, Ibrahim Vazirabad, Leandro Watanabe
Attendees of COMBINE 2013, Paris, France
A huge thanks to the many contributors to SBML Level 3 package specifications!
SBML funding sources over the past 14 years
• National Institute of General Medical Sciences (USA)
• Air Force Office of Scientific Research (USA)
• BBSRC (UK)
• Beckman Institute, Caltech (USA)
• DARPA IPTO Bio-SPICE Bio-Computation Program (USA)
• Drug Disease Model Resources (EU-EFPIA Innovative Medicines Initiative)
• ELIXIR (UK)
• European Molecular Biology Laboratory (EMBL)
• Google Summer of Code
• International Joint Research Program of NEDO (Japan)
• Japanese Ministry of Agriculture
• Japanese Ministry of Education, Culture, Sports, Science and Technology
• JST ERATO Kitano Symbiotic Systems Project (Japan) (to 2003)
• JST ERATO-SORST Program (Japan)
• Keio University (Japan)
• Molecular Sciences Institute (USA)
• National Science Foundation (USA)
• STRI, University of Hertfordshire (UK)
I’d like your feedback! You can use this anonymous form:
http://tinyurl.com/mhuckafeedback