
Tutorial Notes:

New Mathematical Methods for Systems Biology

Eric Mjolsness
Institute for Genomics and Bioinformatics, University of California, Irvine

[email protected]: International Conference on Systems Biology 2006

Yokohama, Japan

(c) Eric Mjolsness 2006. Not for further reproduction or dissemination.


Table of Contents:

Chapter 1: Biological Problem Formulation
    Introduction
    Biological networks
    Multiscale modeling
    Dynamics overview
    Problem formulation languages
    Intuitively useful dualities and trinities
    Notation

Chapter 2: Probability
    Generating functions
    Branching processes
    Machine learning
    Statistical mechanics

Chapter 3: Dynamics
    Differential equations
    Operator algebra

Chapter 4: Graphs
    Network statistics
    Graph Laplacian
    Graph grammars
    Graph automata
    Graph homology

Chapter 5: Geometry
    Mechanics of deformable media
    Growing media
    Finite element methods
    Homology

Postscripts

ICSB_TutorialV12.nb


Chapter 1

Biological problem formulation

1.1 Introduction: Probability, Dynamics, Geometry, Graphs

The goal of this tutorial is to enrich the set of applied mathematical tools in common use in systems biology. The method followed is to describe unifying mathematical ideas and objects, and how they arise in current computational biology research aimed at scientifically important questions.

Since that’s a pretty tall order, the descriptions are far from mathematically rigorous, and the examples are chosen mostly from research projects in which I’ve been a collaborator, focusing particularly on areas where we have departed from the standard practice. Relevant core areas in scientific computing, engineering and data analysis that will not be addressed much here include matrix theory, numerical solutions of differential equations, nonlinear optimization, control theory, and many others. Another limitation of these notes is that they are in places sketchy or repetitive. Yet another limitation is that this is the first time they have been tried out in a tutorial setting; they will probably improve with feedback. For all these limitations I apologize, and hope the session will spark some new ideas anyway.

The partitioning of mathematics into core areas evolves very slowly over time. One version is: algebra, geometry, analysis, and logic; with number theory, combinatorics, topology, dynamical systems, probability, and statistics all tucked in somewhat uncomfortably under the four main headings. A subgoal here is to provide a categorization of mathematical areas that would serve the needs of applied work in systems biology, computational biology, and mathematical biology as those disciplines come to be perceived as useful by experimental biologists.

To that end I propose a pedagogical organization consisting of Probability (including statistics, statistical mechanics, and the foundations of stochastic processes); Dynamics (including differential equations, attractors, control theory, pattern formation in a fixed space, and applications of stochastic processes); Algebra (linear algebra, matrix theory, root-finding, nonlinear optimization, some abstract algebra, all omitted in these notes); Graphs (including graph theory, network theory, and discretized models of “space” that can serve either as a static arena for dynamics, or that may also evolve with time and/or have geometric continuous-space limits); and Geometry (including differential geometry and algebraic and differential topology e.g. of growing structures; but also including the mechanics of spatially continuous deformable media such as elastic, viscoelastic and fluid mechanics, finite element and finite volume methods for solving the same, pattern formation in a growing or changing space - most of which are only “tangentially” related to Riemannian manifolds). Of these topics, probably the most significant for biological applications today is dynamics on graphs that represent biological networks with a fixed or varying structure.


1.2 Biological networks

A primary goal of systems biology is to extend and deepen our understanding of biological systems by relating multiple spatio-temporal scales - for example to consistently understand the same system at a fine, mechanistic scale and a coarse, more functional and behavioral scale. The method for achieving this goal is computational modeling.

Examples of scale hierarchies include molecular, pathway, cellular and tissue levels of organization and description in biology.

Two central activities arise from this definition. We must investigate and describe biological systems at multiple scales, such as molecular and cellular scales. And we must translate between biological descriptions and mathematical models. The two activities are related as follows: Our mathematical models must also be expressed at multiple scales, providing a sufficiently consistent representation of the corresponding multiple-scale relationship in biology that successful predictions can be made. Thus, two central themes for systems biology are (1) translation from biological descriptions to mathematical models and back, and (2) investigation of consistent coarse/fine scale relationships in a scale hierarchy, within both biological descriptions and mathematical models.

The scale hierarchy theme can be elaborated as follows: what, exactly, should be described hierarchically? The major scaling categories include (a) space and material, setting the spatial scale at which a system is described, and (b) time, so that fast processes are at a finer scale than slower ones. These alternatives correspond roughly to objects and processes, or to nouns and verbs, respectively. In addition there are two other arenas for scaling that are more technical and arise from the computational modeling point of view: (c) the size of networks in network abstractions of biological systems, and (d) intrinsic or irreducible complexity, related to minimal program length definitions of complexity in computer science.

In practice some or all of these axes scale up in very rough proportion as we consider more challenging biological modeling problems; this may be called “diagonal scaling”. Ideally and eventually we would build some kind of multidimensional hierarchy of descriptions in biological terms, and a corresponding hierarchy of computable mathematical models, with consistent translations between the two at every level resulting in a sort of ladder. As a tractable first step, we will start at the smallest and least ambitious end of all these scale hierarchies, then build up as it becomes possible to do so.

1.2.1 Strategies for doing systems biology

The major steps of systems biology research may be somewhat arbitrarily categorized as (a) obtaining relevant data on biological objects and their interactions, (b) constructing a dynamical system or other mathematical model, (c) analysing such a model, and (d) making predictions or otherwise helping to prioritize real biological experiments as a result. We have argued that the key problem for current systems biology is the translation of biological systems into models such as realistic dynamical system models. But how can this be achieved?

Several major strategies may be identified. A direct or “network” approach is to first represent biological knowledge and data in one or more formalized connectivity maps or “labeled graphs” which can be stored and manipulated on a computer, then to semi-automatically translate such graphs into dynamical models. A second approach that we may call “inferential” adds machine learning to the network approach, by partially automating the process of constructing the graph and/or the model from available data. Another type of approach takes existing dynamical system models as input data, and attempts to provide other models as output. In the “multiscale” version of this approach, the output models are intended to be simpler, more understandable and/or more computationally tractable approximating models. Ideally the resulting coarse-scale model will be usable by itself or, for high accuracy, in combination with the original less tractable model(s) [Willsky; Mjolsness Garrett Miranker 91]. In the “bottom-up” version of the model-to-model approach, the output model is a compound model largely composed from input models of constituent biological subsystems, possibly followed by some further tuning of the model to fit system-level observations in addition to subsystem-level observations. The latter alternative implements the essential decomposition/composition strategy of reductionist analysis, and should be directly compatible with the network approach. In fact these strategies may be used in various combinations.


1.2.2 Network approach

It is often useful to introduce an intermediate stage of formalization, in between biological descriptions and dynamical models. This level consists of labeled graphs of various kinds. A particular kind of labeled graph is a bipartite (labeled) graph, in which the label information includes an arbitrary “color” field with two possible values, say red and blue, and red nodes are only allowed to connect to blue nodes and vice versa.

Examples of such graphs include: (1) a directed or undirected graph of interactions between molecules; (2) a bipartite graph of biochemical reactions and the molecules that enter and leave those reactions; (3) a bipartite graph of molecules that bind to specific binding sites possessed by other molecules; and (4) an undirected graph of compartment adjacency relationships.
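As a minimal sketch of example (2), a bipartite reaction/molecule graph can be stored as a plain dictionary and checked against the two-color rule. The toy reactions `bind` and `unbind`, the molecule names, and the helper functions are hypothetical, chosen only for illustration.

```python
# Hypothetical toy network: A + B -> C and C -> A + B.
reactions = {
    "bind":   {"inputs": ["A", "B"], "outputs": ["C"]},
    "unbind": {"inputs": ["C"], "outputs": ["A", "B"]},
}

def build_bipartite_graph(reactions):
    """Return (colors, edges): a node -> color map and a list of directed edges."""
    colors, edges = {}, []
    for rxn, io in reactions.items():
        colors[rxn] = "red"                # reaction nodes are red
        for mol in io["inputs"]:
            colors[mol] = "blue"           # molecule nodes are blue
            edges.append((mol, rxn))       # molecule enters reaction
        for mol in io["outputs"]:
            colors[mol] = "blue"
            edges.append((rxn, mol))       # molecule leaves reaction
    return colors, edges

def is_bipartite(colors, edges):
    """True if every edge joins a red node to a blue node."""
    return all(colors[u] != colors[v] for u, v in edges)

colors, edges = build_bipartite_graph(reactions)
print(is_bipartite(colors, edges), len(edges))
```

Storing the color explicitly as a label, rather than inferring it, follows the description above of color as one field of the label information.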

Clearly, with such graphs it should be possible to represent the connectivity and many other properties of biological networks. Our model-construction problem then becomes two easier problems: to create the right labeled graphs from biological knowledge and data, and then to generate realistic dynamical systems from the labeled graphs (perhaps using more data at this stage). From the point of view of artificial intelligence, the labeled graphs will be a form of “knowledge representation” whose meaning is biological. They are then translated into new data structures whose meaning is mathematical: they represent mathematical objects, namely dynamical systems models.
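The second step, from labeled graph to dynamical system, might look like the following sketch. It assumes mass-action kinetics as the rate law, which is one common choice and not something the notes prescribe; the reaction names and rate constants are hypothetical.

```python
def mass_action_rhs(reactions, rates):
    """Build a function state -> d(state)/dt, assuming mass-action kinetics."""
    def rhs(state):
        deriv = {mol: 0.0 for mol in state}
        for name, io in reactions.items():
            flux = rates[name]
            for mol in io["inputs"]:
                flux *= state[mol]         # rate constant times input concentrations
            for mol in io["inputs"]:
                deriv[mol] -= flux         # inputs are consumed
            for mol in io["outputs"]:
                deriv[mol] += flux         # outputs are produced
        return deriv
    return rhs

# Hypothetical reaction graph and rate constants.
reactions = {
    "bind":   {"inputs": ["A", "B"], "outputs": ["C"]},
    "unbind": {"inputs": ["C"], "outputs": ["A", "B"]},
}
rates = {"bind": 2.0, "unbind": 1.0}

rhs = mass_action_rhs(reactions, rates)
d = rhs({"A": 1.0, "B": 3.0, "C": 0.5})
print(d)
```

Because the right-hand side is assembled reaction by reaction, merging two reaction dictionaries yields the model of the compound system, which is the compositional property asked of the translation.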

This graph-to-model translation process should respect bottom-up compositionality so that a compound biological system gets represented as a compound graph and translated into a compound model. This, then, is one major strategy for accomplishing the goals of systems biology.

Figure: Hierarchical “ladder” of biological systems, networks, and models. Vertical axis: Molecular, pathway, cellular, and tissue levels (for example). Each is represented by biological and model descriptions, and translations between them (horizontal axis, in black). Network descriptions (in gray) are optional but often a useful intermediate. Frequently, only the network-to-model part of the translation process (horizontal axis) is formalizable.


One fundamental advantage of using a network approach, where possible, is that it sets up the conditions required for information theory to be applied. Information theory as originally formulated by Shannon requires an information sender, receiver, and channel. These can be mapped to two nodes connected by a link. Or at a coarser scale, they can be mapped to two subgraphs connected by a set of links. Then we can quantify the amount of information sent over the channel. One biological system could be described informationally in several different ways depending on how one identifies the channels.
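One standard way to quantify the information sent over such a channel is the mutual information between the sender and receiver states. The sketch below assumes the joint distribution of the two node states is already known as a table; the two example tables are hypothetical.

```python
from math import log2

def mutual_information(joint):
    """Mutual information I(X;Y) in bits from a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]            # marginal of the sender
    py = [sum(col) for col in zip(*joint)]      # marginal of the receiver
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:                         # 0 * log(0) is taken as 0
                mi += pxy * log2(pxy / (px[i] * py[j]))
    return mi

noiseless = [[0.5, 0.0], [0.0, 0.5]]            # receiver copies sender exactly
independent = [[0.25, 0.25], [0.25, 0.25]]      # link carries no information

mi_noiseless = mutual_information(noiseless)
mi_independent = mutual_information(independent)
print(mi_noiseless, mi_independent)
```

Choosing different pairs of nodes or subgraphs as sender and receiver gives different joint tables, and hence different informational descriptions of the same system.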

Thus, despite the epochal shift of attention from physical to biological sciences, mathematics is still the language of nature.

1.2.3 Network dynamic models

Another basic variation on the multiscale theme is the introduction of dynamics at each scale. This results in a parallel hierarchy of “nouns” (objects such as molecules, cells, and tissues) and “verbs” (processes such as reactions, cell division, and tissue growth). Objects and processes form a pair of parallel, or more precisely interleaved, multiscale hierarchies: the noun and verb hierarchies. The objects or nouns at level L participate in processes at level L + 1/2. (We needn’t commit to a universal starting value for L.) The stable attractors of the dynamics at level L + 1/2 frequently comprise the objects at level L + 1.

In quantum mechanics for example, electrons and nuclei are charged level-L objects participating in a level L′ = L + 1/2 process described by the Schrödinger equation (or more accurately its relativistic cousin, the Dirac equation). The stable lowest-energy state of a collection of n electrons bound to a nucleus of charge +n may be identified as an “atom”, ready to play with other atoms in level L′ + 1 chemical processes. These processes are also governed by the Schrödinger equation, and in some cases can be approximated by simpler approximate or effective dynamics (see Section 3.18 below). Of course one can also drill downwards in the scale hierarchy, in this case identifying level-L objects (the charged particles) as stable, mobile excitations in underlying quantum fields that also create stable “kinks” in the electromagnetic field.


Figure: Interleaved noun (object) and verb (process) scale hierarchies. Vertical arrows between object sets are restriction maps. Rightgoing diagonal arrows are just associations, but processes obey approximate commutation relations outlined in Section 3.19. Leftgoing diagonal arrows may be defined in some cases by attractors of dynamical systems for the corresponding processes.

In computational systems biology, it is usually the processes that are mapped to mathematical problem formulations through their dynamics, such as biochemical reaction rate laws in algebraic form. An example of this mapping is shown in the ontology (schema) of the Sigmoid pathway model database (Figure). Biologically, processes such as respiration and DNA replication can be categorized as to whether their function contributes to the existence of objects, such as homeostatic energy-using cells, or to processes involving those objects, such as cell division, at higher levels in a scale hierarchy that is integrated by trickle-down evolutionary pressure from the level(s) at which selection operates.

Figure: Mathematical and biological process hierarchies implemented in the Sigmoid pathway modeling database, www.sigmoid.org .

Even this interleaved ladder description of the scale hierarchy of nouns and verbs is a simplification, since the actual hierarchy branches both upwards and downwards in a bipartite directed acyclic graph (a DAG) that isn’t necessarily stratified into consistent level numbers except locally. If verb nodes are eliminated from this bipartite scale graph, for example, one obtains a classic compositional hierarchy of objects and parts, which is also in general a DAG.


Dynamical models concept map, showing several mathematically different types of dynamics and their relationships:

Square-based arrows (here and in subsequent “concept maps”) are subset relationships or inclusions. Other arrows are functions.

Fixed networks

Dynamics is local in a fixed graph, e.g. a graph of reactions, or of spatial compartments, or of interacting genes.

Time-varying networks

Dynamics is local in a variable graph. E.g. cells divide, creating new spatial compartments; genes duplicate during evolution, creating new interactions; and reactions may be turned on and off by the mediation of cyclically expressed genes in a cell cycle.

We may model time-varying networks with gated interactions (one variable controls the interaction between two others) and/or gated variables (one variable’s value determines whether another variable enters the dynamical system or not).

See: Gating links in Dependency Diagrams; Node existence links in DD’s; graph grammars in SPG’s.

1.3 Multiscale modeling

1.3.1 Noun and verb scale hierarchies

Dynamical models concept map:


1.4 Dynamics overview

Multiscale; Variable Structure Systems
4 types: dynamic/steady state, stochastic/deterministic (see concept map figure).
quasistatic dynamics
Iterative algorithms

Arabidopsis shoot apical meristem (SAM): A multiscale variable-structure system

The growing tip (shoot apical meristem) of the vascular plant Arabidopsis thaliana provides an illustrative example of a variable-structure dynamical system at three different scales: the molecular, cellular, and organ levels. At the molecular level, the primary molecules are auxin (a plant growth hormone) and PIN1 (a membrane-bound auxin efflux carrier or “pump”). In a simple model [1], protein PIN1 in the membrane of cell i bounding cell j acts as a catalyst in removing auxin molecules from cell i (modeled with annihilation) and simultaneously inserting them into cell j (creation). Reciprocally, auxin acts on PIN1, directing its incorporation into the nearest membrane compartments of neighboring cells and (optionally) locally enhancing its synthesis. The reactions in this positive feedback loop can be simplified as:

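$$
\left\{\;
\mathrm{auxin}[i] \overset{\mathrm{PIN1}[i,j]}{\Longrightarrow} \mathrm{auxin}[j],\quad
\emptyset \overset{\mathrm{auxin}[i]}{\Longrightarrow} \mathrm{PIN1}[i],\quad
\mathrm{PIN1}[i] \overset{\mathrm{auxin}[j]}{\Longrightarrow} \mathrm{PIN1}[i,j],\quad
\mathrm{PIN1}[i,j] \to \emptyset,\quad
\mathrm{PIN1}[i,j] \to \mathrm{PIN1}[i]
\;\right\},
$$

where for example

$$
\mathrm{auxin}[i] \overset{\mathrm{PIN1}[i,j]}{\Longrightarrow} \mathrm{auxin}[j]
\;=\;
\left\{\;
\mathrm{auxin}[i], \mathrm{PIN1}[i,j] \to \mathrm{Complex1},\quad
\mathrm{Complex1} \to \mathrm{auxin}[i], \mathrm{PIN1}[i,j],\quad
\mathrm{Complex1} \to \mathrm{auxin}[j], \mathrm{PIN1}[i,j]
\;\right\}.
$$

At the cell level, cells have internal state including the above reactions and also cell mass and position. When mass exceeds a threshold, cells divide. Mass influences the resting length of elastic springs connecting neighboring cells, which determine their positions. Positions determine which cells are neighbors, and therefore which regulatory subnetworks are connected. There is variable structure both in the objects (cells) and their relationships (communicating neighbors; lineage trees of cell ancestry).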
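The catalyzed-reaction shorthand used in the molecular-level reaction list above expands mechanically into three elementary reactions, which can be sketched as a small helper. The function name `expand_catalyzed` and the (inputs, outputs) tuple representation are assumptions made for illustration, not part of the model in [1].

```python
def expand_catalyzed(substrate, product, catalyst, complex_name="Complex1"):
    """Expand 'substrate ==catalyst==> product' into elementary reactions.

    Each reaction is a pair (inputs, outputs) of tuples of species names.
    """
    return [
        ((substrate, catalyst), (complex_name,)),   # substrate binds the catalyst
        ((complex_name,), (substrate, catalyst)),   # complex falls apart unchanged
        ((complex_name,), (product, catalyst)),     # complex releases the product
    ]

rxns = expand_catalyzed("auxin[i]", "auxin[j]", "PIN1[i,j]")
for inputs, outputs in rxns:
    print(", ".join(inputs), "->", ", ".join(outputs))
```

In every elementary step the catalyst is either free or held in the complex, so it is never consumed, which is what makes PIN1[i,j] a catalyst of auxin transport in this reading.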


Figure 6. SAM, top view, three time slices. Color = auxin concentration. Emergent peaks correspond to floral meristem primordia. [Images courtesy of Henrik Jönsson.]

Figure 6 shows the dynamical pattern of auxin (yellow/blue scale) evolving over time, from a model very similar to the one detailed in [1]. Emergent phenomena are the auxin peaks that form off center and move out radially to make room for new peaks. The peaks are hypothesized to determine the position of the primordia for new floral meristems in the phyllotactic pattern of flowers, leaves and branches for the above-ground part of the plant. The variable-structure objects at this scale are the primordia.

The variable structure dynamics illustrated by this model is not analytically tractable but is expressible at each of three levels (the molecular, cellular, and organ levels) using the dynamical grammars formalism. The ability of a cell to divide and interact with its neighbors gives rise, at a coarser spatial and temporal scale, to the ability of a shoot apical meristem to branch and create the floral meristems. This example indicates the potential relevance of the Dynamical Grammar framework for multiscale modeling in biology.

1.5 Problem formulation languages

1.5.1 Diagrams

To come in Section 2.2: Bayes Nets; Markov Random Fields; plates; Dependency Diagrams
category theory commutative diagrams

1.5.2 Stochastic Parameterized Grammars (SPG’s)

Examples

The essential idea is that there is a “pool” of fully specified parameter-bearing terms such as $\{\mathrm{bacterium}(x), \mathrm{macrophage}(y), \mathrm{redbloodcell}(z)\}$ where x, y and z might be position vectors. A grammar can include rules such as

$$
\{\mathrm{bacterium}(x), \mathrm{macrophage}(y)\} \to \mathrm{macrophage}(y) \quad \text{with } r(\|x - y\|)
$$

which specify the probability per unit time, r, that the macrophage ingests and destroys the bacterium as a function of the distance $\|x - y\|$ between their centers. Sets of such rules are a natural way to specify many processes. They may have more than one term on the left hand side, making them “context sensitive” rather than “context-free”. We will define the semantics of grammars composed of such rules by mapping them to stochastic processes in both continuous time (Chapter 3, Section 3.8) and discrete time (Chapter 3, Section 3.9), and relating the two definitions (Chapter 3, Section 3.10). A key feature of the semantic maps is that they are naturally defined in terms of an algebraic ring of time evolution operators: they map operator addition and multiplication into independent or strongly dependent compositions of stochastic processes, respectively.
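A minimal sketch of how the macrophage rule above could be simulated in continuous time, using a Gillespie-style loop. The exponentially decaying rate function, its parameters `r0` and `length`, and the cell positions are hypothetical; the formal semantics is the one developed in Chapter 3, not this sketch.

```python
import math
import random

def ingest_rate(x, y, r0=1.0, length=1.0):
    """Hypothetical rate r(||x - y||): decays exponentially with distance."""
    return r0 * math.exp(-math.dist(x, y) / length)

def simulate(bacteria, macrophages, t_end, rng):
    """Simulate {bacterium(x), macrophage(y)} -> macrophage(y) until time t_end."""
    bacteria = list(bacteria)
    t = 0.0
    while bacteria:
        # One possible rule instantiation per (bacterium, macrophage) pair.
        pairs = [(b, m) for b in range(len(bacteria)) for m in range(len(macrophages))]
        rates = [ingest_rate(bacteria[b], macrophages[m]) for b, m in pairs]
        total = sum(rates)
        if total <= 0:
            break
        t += rng.expovariate(total)        # waiting time to the next firing
        if t > t_end:
            break
        b, _ = rng.choices(pairs, weights=rates)[0]
        del bacteria[b]                    # bacterium destroyed; macrophage persists
    return bacteria

rng = random.Random(0)
bacteria = [(rng.uniform(0, 5), rng.uniform(0, 5)) for _ in range(10)]
macrophages = [(2.5, 2.5)]
survivors = simulate(bacteria, macrophages, t_end=5.0, rng=rng)
print(len(survivors), "bacteria remain")
```

The macrophage list is never modified, which mirrors the rule keeping macrophage(y) on both sides, and the two-term left hand side is what makes the pairwise loop necessary.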


Consider the rewrite rule

$$
A_1(x_1), A_2(x_2), \ldots, A_n(x_n) \to B_1(y_1), B_2(y_2), \ldots, B_m(y_m) \quad \text{with } r([x_i], [y_j])
\tag{1.1}
$$

Here $A_k$ and $B_l$ are terms that denote elements $t_a$ of a set $\mathcal{T} = \{t_a \mid a \in \mathcal{A}\}$, indexed by elements $a \in \mathcal{A}$ of a totally ordered set $\mathcal{A}$. Members of $\mathcal{T}$ are distinct symbols called types, all different from one another. The reason Equation 1.1 is written using terms $A_k$ and $B_l$ rather than directly using the types $t_a$ is that different terms A and/or B may denote repeated appearances of the same type $t_a$. The terms are each optionally followed by parenthesized expressions for parameters $x_i$ or $y_j$, chosen from a base language $\mathcal{L}_P$ defined below. A term followed by such a parenthesized parameter (or parameter vector) $x_i$ is called a parameterized term. We will frequently abbreviate this notation for terms, using “$t_a(x_i)$” to denote a parameterized term of type $t_a$ with a parameter whose value is given by $x_i$. The terms $A_i$ on the left hand side (the LHS) can appear in any order, as can the terms $B_j$ on the right hand side (the RHS) of the rule. The intended meaning of the “$A_1(x_1), \ldots, A_n(x_n) \to B_1(y_1), \ldots, B_m(y_m)$” part of this rule is that all the parameterized terms A are instantaneously converted or transformed into the parameterized terms B at some time t. At that moment in time, the rule is said to fire.

Also in Equation 1.1, r is a nonnegative function, assumed to be denoted by an expression in a base language $\mathcal{L}_R$ defined below, and also assumed to be an element of a vector space $\mathcal{F}$ of real-valued functions. Its arguments consist of all the parameters from all the parameterized terms on both sides of the rule, LHS and RHS, listed in a standardized order induced by the order on $\mathcal{A}$ (not the arbitrary order induced by the order of the A’s and B’s within the rule RHS and LHS). Informally, r is interpreted as a nonnegative probability rate: the independent probability per unit time that any possible instantiation of the rule will fire if its left hand side precondition remains continuously satisfied for a small time interval. This interpretation will be formalized in the semantics.
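The informal reading of the rate can be made concrete in the simplest case. The following is a sketch under the added assumption that r stays constant while the left hand side remains satisfied; the symbols T (waiting time until firing), N and r_k are introduced only for this illustration.

```latex
% One rule instantiation with constant rate r; T is its firing time.
\Pr\bigl(\text{fires in } [t, t+dt) \,\big|\, \text{not yet fired}\bigr) = r\,dt + o(dt)
\quad\Longrightarrow\quad
\Pr(T > t) = e^{-r t}, \qquad \langle T \rangle = \frac{1}{r} .

% N independent instantiations with constant rates r_1, ..., r_N:
\Pr\bigl(\min_k T_k > t\bigr) = \exp\Bigl(-t \sum_{k=1}^{N} r_k\Bigr),
\qquad
\Pr(\text{instantiation } k \text{ fires first}) = \frac{r_k}{\sum_{l=1}^{N} r_l} .
```

These are the standard properties of competing exponential waiting times; they underlie simulation of such rules by drawing a waiting time from the total rate and then choosing which instantiation fires.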

As an example of a rule,

$$
\mathrm{HydrogenAtom}(x), \mathrm{HydrogenAtom}(y) \to \mathrm{HydrogenMolecule}(z)
\quad \text{with } f(\|x - y\|)\, \exp\!\left(-\frac{\|x - z\|^2 + \|y - z\|^2}{2\sigma^2}\right)
\tag{1.2}
$$

might describe a chemical reaction complete with atomic position vectors x, y, and z.

An example of a stochastic parameterized grammar built out of rules like Equation 1.1 or Equation 1.2 is the following:

    grammar (discrete-time) binaryclustertreegen (nodeset(x; Null) → {node(x_i)}) {
        nodeset(x; G) → P := node(x; G), {child(x; P) | 1 ≤ i ≤ n}
            with q(n) subject to 0 ≤ n ≤ 2
        child(y; P) → nodeset(x; P)
            with f(x | y)
    }
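To give a feel for what such a grammar generates, here is a rough procedural sketch of sampling a binary cluster tree. It is not the grammar semantics of Chapter 3: the branching probabilities `q`, the Gaussian choice for f(x | y), the shrinking `spread`, and the `max_depth` cap are all assumptions added so that the example runs and terminates.

```python
import random

def generate_tree(x, depth, rng, q=(0.2, 0.3, 0.5), spread=1.0, max_depth=4):
    """Sample a binary cluster tree rooted at position x.

    q[n] is the assumed probability of n children (0 <= n <= 2); each child
    position is drawn from an assumed Gaussian f(x_child | x_parent).
    """
    node = {"x": x, "children": []}
    if depth >= max_depth:                 # depth cap keeps the sketch finite
        return node
    n = rng.choices([0, 1, 2], weights=q)[0]
    for _ in range(n):
        child_x = rng.gauss(x, spread / (depth + 1))
        node["children"].append(
            generate_tree(child_x, depth + 1, rng, q, spread, max_depth))
    return node

def leaves(node):
    """Positions of the leaf nodes, i.e. the generated data points."""
    if not node["children"]:
        return [node["x"]]
    return [x for child in node["children"] for x in leaves(child)]

def max_children(node):
    """Largest number of children found at any node of the tree."""
    return max([len(node["children"])] + [max_children(c) for c in node["children"]])

tree = generate_tree(0.0, 0, random.Random(1))
data = leaves(tree)
print(len(data), "leaf positions")
```

The recursion alternates between choosing a child count and placing each child given its parent, which is the same alternation as between the two rules of the grammar.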



1.6 Intuitively useful dualities and trinities

Duality (informal): two classes that can be mapped into one another, each providing a different insight.
    verbs vs. nouns (= dynamic vs. static; process vs. object)
    deterministic vs. stochastic
    discrete vs. continuous (representations of time, space, state variables, state transitions)
    fine vs. coarse scale (or restriction/prolongation maps)
    language vs. vision (e.g. text vs. diagrams in model specification; algebra vs. geometry)

... and a few trinities
    time, space, and state variable ( ~ dynamics, geometry, algebra)
    Semigroup, generator, resolvent [Engel and Nagel] for time-evolution equations.

1.7 Notation

In the rest of these notes we will use the following version of the Heaviside function $\Theta$, mapping Boolean values to integers:

$$
\Theta(P) \equiv
\begin{cases}
1 & \text{if predicate } P \text{ is true} \\
0 & \text{otherwise.}
\end{cases}
\tag{1.3}
$$

Also the Kronecker delta function $\delta_K(a, b)$ or $\delta_{a b}$ is

$$
\delta_K(a, b) = \Theta(a = b) =
\begin{cases}
1 & \text{if } a = b \\
0 & \text{otherwise}
\end{cases}
\tag{1.4}
$$

and $\delta(x, y) = \delta(x - y)$ is the Dirac delta (generalized) function appropriate to a particular measure $\mu$ on the measure space $V$.

In addition to the standard set-builder notation $\{x \mid P(x)\}$ for defining the members of a set from a predicate $P$, we will sometimes build ordered sets or lists in a similar way using square brackets: $[x(i) \mid P(x(i), i) \,\|\, i \in \mathcal{I}]$ imposes the image of a preexisting ordering on the index set $\mathcal{I}$ (such as the ordering of natural numbers if $\mathcal{I} \subseteq \mathbb{N}$) onto any elements $x(i)$ selected for inclusion by the predicate $P$, and thus denotes a set together with a total ordering. This notation can be read aloud as “$x(i)$ such that $P(x(i), i)$ ordered by $i \in \mathcal{I}$”. The “$\mid P(x(i), i)$” clause may be omitted if $P$ is True or otherwise obvious from the context, and likewise the “$\| \, i \in \mathcal{I}$” clause may be omitted if the indexing and the ordering of the indexing is obvious from the context, for example $[x_i]$. Multiply-ordered arrays are possible as well, e.g. $[M_{i j} \,\|\, i, j \in \mathcal{I}]$ means $[[M_{i j} \,\|\, i \in \mathcal{I}] \,\|\, j \in \mathcal{I}]$ and can be abbreviated $[M_{i j}]$.
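For readers who think in code, the ordered set-builder notation above behaves much like a list comprehension over a sorted index set. The sketch below is only an analogy; the helper name `ordered_list` is hypothetical, and it assumes the index set carries Python's natural ordering.

```python
def ordered_list(x, index_set, predicate=lambda value, i: True):
    """Analog of [x(i) | P(x(i), i) || i in I]: keep x(i) satisfying P, ordered by i."""
    return [x(i) for i in sorted(index_set) if predicate(x(i), i)]

# Odd squares x(i) = i*i, ordered by i over an unordered index set.
odd_squares = ordered_list(lambda i: i * i, {3, 1, 2, 5, 4}, lambda v, i: v % 2 == 1)
print(odd_squares)

# Omitting the predicate corresponds to dropping the "| P" clause.
all_squares = ordered_list(lambda i: i * i, {3, 1, 2})
print(all_squares)
```

The ordering comes entirely from the index set, not from the values, which matches the statement that the notation imposes the image of a preexisting ordering.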


Also

$$
(n)_i \equiv \prod_{k=n-i+1}^{n} k = \frac{n!}{(n-i)!} = \sum_{k=0}^{i} s(i, k)\, n^k
\tag{1.5}
$$

where $(n)_i$ is the falling factorial, not the Pochhammer symbol or rising factorial, and $s(i, k)$ are Stirling numbers of the first kind. Also

$$
\binom{n}{i} \equiv \frac{(n)_i}{i!}
\tag{1.6}
$$

and $\langle x \rangle$ = average of $x$.
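The standard identities relating the falling factorial, the signed Stirling numbers of the first kind, and the binomial coefficient can be checked numerically. This is a small sketch; the function names are illustrative, and the recurrence used is the usual one, s(n, k) = s(n-1, k-1) - (n-1) s(n-1, k).

```python
from math import comb, factorial

def falling_factorial(n, i):
    """(n)_i = n (n-1) ... (n-i+1); the empty product (i = 0) is 1."""
    result = 1
    for k in range(n - i + 1, n + 1):
        result *= k
    return result

def stirling_first(n, k):
    """Signed Stirling number of the first kind s(n, k)."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return stirling_first(n - 1, k - 1) - (n - 1) * stirling_first(n - 1, k)

n, i = 7, 3
ff = falling_factorial(n, i)
stirling_sum = sum(stirling_first(i, k) * n**k for k in range(i + 1))
print(ff, factorial(n) // factorial(n - i), stirling_sum, comb(n, i))
```

The Stirling expansion is what converts between ordinary powers and falling factorials, and hence between ordinary moments and factorial moments.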


Chapter 2

Probability

2.1 Generating functions

2.1.1 Generating functions for probabilities

Given a probability distribution $p(n)$ on the integers or $p(x)$ on the real numbers, we define the following.

Generating function:

$$
g(z) = \sum_n p(n)\, z^n
\quad \text{or} \quad
g(z) = \int_{-\infty}^{\infty} p(x)\, z^x \, dx
$$

Moment generating function:

$$
f(m) = \sum_n p(n)\, e^{m n} = g(e^m)
\quad \text{or} \quad
f(m) = \int_{-\infty}^{\infty} p(x)\, e^{m x} \, dx = g(e^m)
$$

Differentiate to get mean, variance, and other moments.

$$
g(1) = f(0) = 1
$$

represents the fact that total probability sums up to unity. Moments are calculated as follows.

$$
\langle n \rangle
= \left. \frac{d\, g(z)}{d z} \right|_{z=1}
= \left. \frac{d \log g(z)}{d z} \right|_{z=1}
= \left. \frac{d \log g(z)}{d \log z} \right|_{z=1}
$$

$$
\operatorname{Var} n = \langle n^2 \rangle - \langle n \rangle^2 = \langle n(n-1) \rangle + \langle n \rangle - \langle n \rangle^2
$$

$$
\langle n(n-1) \rangle = \left. \frac{d^2 g(z)}{d z^2} \right|_{z=1}
$$

Alternatively,

$$
\operatorname{Var} n
= \left. \frac{d^2 \log f(m)}{d m^2} \right|_{m=0}
= \left. \frac{d^2 \log g(z)}{d (\log z)^2} \right|_{z=1} .
$$
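The mean and variance formulas above can be checked numerically for a distribution with finite support, where g(z) is a polynomial whose coefficient list is the probability table itself. This is a minimal sketch; the Binomial test case and the helper names are chosen only for illustration.

```python
from math import comb

def derivative(coeffs):
    """Coefficients of dg/dz, where g(z) = sum_n coeffs[n] * z**n."""
    return [n * c for n, c in enumerate(coeffs)][1:]

def evaluate(coeffs, z):
    """Evaluate the polynomial with the given coefficients at z."""
    return sum(c * z**n for n, c in enumerate(coeffs))

def mean_and_variance(pmf):
    """Mean and variance from derivatives of the generating function at z = 1."""
    g1 = derivative(pmf)
    g2 = derivative(g1)
    mean = evaluate(g1, 1.0)               # <n> = g'(1)
    second_factorial = evaluate(g2, 1.0)   # <n(n-1)> = g''(1)
    return mean, second_factorial + mean - mean**2

n, p = 10, 0.3
pmf = [comb(n, m) * p**m * (1 - p)**(n - m) for m in range(n + 1)]
mean, var = mean_and_variance(pmf)
print(mean, var)   # compare with n*p and n*p*(1-p)
```

Differentiating the coefficient list directly avoids any symbolic algebra and makes the role of the factorial moment explicit.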

In general,

$$
\langle (n)_k \rangle = \left. \frac{d^k g(z)}{d z^k} \right|_{z=1}
$$

$$
\langle n^k \rangle = \left. \frac{d^k f(m)}{d m^k} \right|_{m=0}
$$

which are related by Equation 1.5 of Chapter 1.

Independence:

$$
g(z_1, z_2) = g(z_1)\, g(z_2)
$$

Standard distributions and their generating functions

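Bernoulli($p$): Coin flip with probability $p$ of “success”.

$$
g(z) = (1 - p) + p z
$$

Binomial($p, n$): $n$ independent Bernoulli trials; find $m$ = number of successes.

$$
g(z) = \bigl((1 - p) + p z\bigr)^n \;\; \text{(independence)}
= \sum_{m=0}^{n} \binom{n}{m} (1 - p)^{n-m} p^m z^m
$$

Multinomial($p_1, \ldots, p_N, n$):

$$
g(z_1, \ldots, z_N) = \Bigl( \sum_{k=1}^{N} p_k z_k \Bigr)^n \;\; \text{(independence)}
= \sum_{\substack{m_1, \ldots, m_N \ge 0 \\ \sum_k m_k = n}}
\binom{n}{m_1 \ldots m_N} \prod_{k=1}^{N} (p_k)^{m_k} \prod_{k=1}^{N} (z_k)^{m_k}
$$

Poisson($\lambda$):

$$
g(z) = e^{\lambda (z - 1)} = e^{-\lambda} \sum_{n=0}^{\infty} \frac{\lambda^n}{n!}\, z^n .
$$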

2.1.2 Algebra of generating functions

There is a useful algebra of generating functions for probability distributions. To begin, an unnormalized generating function $G(z)$ can always be normalized: $g(z) = G(z) / G(1)$, where $z$ may be scalar or vector.

Multiplication $G_1(z_1)\, G_2(z_2)$ of generating functions in different variables $z_1$ and $z_2$ corresponds to independent events. If the successive powers of $z$ have the interpretation of counting something, then the product $G_1(z)\, G_2(z)$ is the distribution of the sum of those counts $c$ over two independent events, as in the derivation of the Binomial distribution out of Bernoulli distributed trials. If the same distribution governs both independent events, we raise $G(z)$ to the second power. Similar remarks apply for the product of any number of generating functions, to get $\prod_{i=1}^{n} G_i(z_i)$, $\prod_{i=1}^{n} G_i(z)$, or $[G(z)]^n$.

Addition G1 Hz1 L + G2 Hz2 L of generating functions corresponds to uniformly-weighted mixture of theirrespective distributions. Weighted addition w1 G1 Hz1 L + w2 G2 Hz2 L corresponds to weighted-mixture of distribu-tions. In the special case that the distributions being combined have mutually exclusive support, then additioncorresponds to assignment of unnormalized probabilities (with no mixing of nonzero probabilities) on the union ofthe support sets. Similar remarks apply for the sum of any number of generating functions, to get to get⁄i=1

n Gi Hzi L or ⁄i=1n wi Gi Hzi L .


Combining these two types of operations, we find that the weighted sum of products

$$\sum_{i=0}^{\infty} w_i\, [G_2(z)]^i = G_1(G_2(z)) = (G_1 \circ G_2)(z) \qquad (2.1)$$

represents the distribution of the sum of an elementary count $c$ governed by $G_2(z)$, in a situation where $i$ different “events” with distribution governed by $G_2(z)$ take place with relative probability $w_i$ governed by $G_1(y)$. This situation occurs in a discrete-time branching process (see next section) in which the first generation branches with distribution $P_1(i)$ governed by $G_1(y)$, and the second generation branches with distribution $P_2(n)$ governed by $G_2(z)$. Thus, branching processes are governed by the functional composition of generating functions. The generating functions may all be unnormalized or normalized:

$$\sum_{i=0}^{\infty} w_i\, [g_2(z)]^i = g_1(g_2(z)) = (g_1 \circ g_2)(z)$$

Finally, generating functions $G_1(z_1)$ and $G_2(z_2)$ in different variables can be contracted in various ways to create generating functions in fewer variables. This is done by expanding out the product $G_1(z_1)\, G_2(z_2)$ into a basis set of monomials $z_1^{n_1} z_2^{n_2}$ and then linearly mapping each basis monomial into a polynomial function $\phi_{n_1 n_2}(z)$ of a new variable $z$. We may denote this operation as $G_1 \star_\phi G_2$:

$$(G_1 \star_\phi G_2)(z) = \Lambda_\phi[G_1(z_1)\, G_2(z_2)] = \sum_{n_1, n_2} G_1(1)\, \Pr{}_1(n_1)\; G_2(1)\, \Pr{}_2(n_2)\; \phi_{n_1 n_2}(z) \qquad (2.2)$$

For example, $\phi_{n_1 n_2}(z) = z^{n_1 + n_2}$ gives the sum of counts (e.g. Binomial distribution from Bernoulli distributions) as before. However, the $z$'s and $n$'s may be generalized to vectors. Clearly, the nature and meaning of the contraction operation is governed by the (generating) functions $\phi_{n_1 n_2}(z)$. They correspond to node probability tables in a Bayes Net:

$$(g_1 \star_\phi g_2)(z) = \sum_{n_1, n_2} \Pr{}_1(n_1)\, \Pr{}_2(n_2)\, \phi_{n_1 n_2}(z) = \sum_{n, n_1, n_2} z^n\, \Pr{}_\phi(n \mid n_1, n_2)\, \Pr{}_1(n_1)\, \Pr{}_2(n_2)$$

However, the branching process relationship is not part of normal Bayes Networks since it results in a variable rather than fixed structure of connections. A new link type is required to express this; see Section 2.2 below, and [2].

Thus, the operations of multiplication, addition, weighted addition, function composition, and contraction ($\times$, $+$, $+_w$, $\circ$, and $\star_\phi$) provide a meaningful algebra on generating functions for discrete probability distributions.
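The first three operations can be illustrated on polynomial generating functions. This is a minimal sketch under the assumption that a generating function in one variable is represented by its list of coefficients; the function names `multiply`, `mixture` and `compose` are illustrative only.

```python
from fractions import Fraction

def multiply(a, b):
    """Product of two generating functions in the same variable (sum of counts)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def mixture(weights, gfs):
    """Weighted sum w_1 G_1(z) + w_2 G_2(z) + ... (weighted mixture)."""
    out = [Fraction(0)] * max(len(g) for g in gfs)
    for w, g in zip(weights, gfs):
        for i, c in enumerate(g):
            out[i] += w * c
    return out

def compose(outer, inner):
    """G1(G2(z)) = sum_i w_i [G2(z)]^i, where w_i are the coefficients of G1."""
    out = [Fraction(0)]
    power = [Fraction(1)]                    # [G2(z)]^0
    for w in outer:
        out = mixture([1, w], [out, power])
        power = multiply(power, inner)
    return out

p = Fraction(1, 3)
bernoulli = [1 - p, p]
binomial3 = multiply(multiply(bernoulli, bernoulli), bernoulli)

# Exactly three events: the outer generating function is y^3.
three_events = [Fraction(0), Fraction(0), Fraction(0), Fraction(1)]
composed = compose(three_events, bernoulli)

# Zero or two events with equal probability, each event a Bernoulli(p) count.
random_sum = compose([Fraction(1, 2), Fraction(0), Fraction(1, 2)], bernoulli)
```

Composing with y^3 reproduces the threefold product, and composing two normalized generating functions gives a normalized result, as Equation 2.1 requires.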

2.2 Branching Processes

Most material in this section is from UCI ICS TR 05-09, “Variable-Structure Systems from Graphs and Grammars”, by the author.

Here is a simple cluster-generating grammar that generalizes binaryclustergen by allowing any number of elements per cluster:

    grammar (discrete-time) clustergen (nodeset(x) → {node(x_i)}) {
        nodeset(x) → node(x), {child(x) | 1 ≤ i ≤ n}   with q(n)   subject to n ≥ 0
        child(y) → nodeset(x)   with φ(x | y)
    }

2.2.1 Discrete-time branching processes

Composition of heterogeneous generating functions: $f(x) = g_1(g_2(g_3(x)))$, etc. Then the infinite discrete-time branching process is $f_0(x) = x$ and:

$$f_{L+1}(x) = f_L(g_{L+1}(x)) \qquad (2.3)$$

If all $g$'s are the same, then the infinite discrete-time branching process is:

$$f_{L+1}(x) = g(f_L(x)) = f_L(g(x)) .$$

If each $g$ is the same, and if we keep parent nodes in the count by substituting $g(x)$ with $x\, g(x)$, then the steady-state relationship between $f$ and $g$ is given by the beautiful functional equation [3]

$$f(x) = x\, g(f(x)) . \qquad (2.4)$$

This functional equation can be solved iteratively using Taylor series expansions and

$$f_{L+1}(x) = x\, g(f_L(x)) \quad \text{and} \quad f_0(x) = 1 .$$

Convergence is observed to be reliable: one new coefficient is fixed per iteration.

Example: Monod-Wyman-Changeux model. Take $g_1(z) = z^{(0)} + L\, z^{(1)}$; $g_2(z) = z^n$; $g_3^{0}(z^{(0)}) = z^{(0,0)} z^{(0,2)}$, $g_3^{1}(z^{(1)}) = z^{(1,0)} z^{(1,1)}$; and $g_4^{(0/1)}(z^{(0/1,\,0/1)}) = 1 + K^{(0/1,\,0/1)} z^{(0/1,\,0/1)}$. Then

$$f_4 = \big(1 + K^{(0,0)} z^{(0,0)}\big)^n \big(1 + K^{(0,2)} z^{(0,2)}\big)^n + L\, \big(1 + K^{(1,0)} z^{(1,0)}\big)^n \big(1 + K^{(1,1)} z^{(1,1)}\big)^n$$

If we now contract using $z^{(0,0)} \mapsto S = s / K^{(0,0)}$, $z^{(1,0)} \mapsto S = s / K^{(0,0)}$, $z^{(1,1)} \mapsto A = a / K^{(1,1)}$ and $z^{(0,2)} \mapsto I = b / K^{(0,2)}$ we obtain the partition function for the MWC model:

$$Z_{\mathrm{MWC}} = (1 + s)^n (1 + b)^n + L\, (1 + c\, s)^n (1 + a)^n$$

where $c = K^{(1,0)} / K^{(0,0)}$.

Lagrange inversion formula

An even more effective solution is by reversion of power series [Pitman 98]. Solve the functional equation to find:

$$f(x) = \tilde g^{-1}(1 / x), \quad \text{where} \quad \tilde g(y) = g(y) / y$$

In Mathematica or other computer algebra systems this can be implemented (either numerically or symbolically) in a single line.

Example: for the geometric distribution $g = \frac{1 - p}{1 - p x}$ the average number of offspring per node is

    D[(1 - p)/(1 - p x), x] /. x -> 1

$$\frac{p}{1 - p}$$

which has a pole at $p = 1$. The reversion of power series may be done in Mathematica, with input

    InverseSeries[Series[(1/y) ((1 - p)/(1 - p y)), {y, 0, 10}]][[3]]

that yields the result

    {1 - p, (1 - p)^2 p, 2 (1 - p)^3 p^2, 5 (1 - p)^4 p^3, 14 (1 - p)^5 p^4,
     42 (1 - p)^6 p^5, 132 (1 - p)^7 p^6, 429 (1 - p)^8 p^7, 1430 (1 - p)^9 p^8,
     4862 (1 - p)^10 p^9, 16796 (1 - p)^11 p^10, 58786 (1 - p)^12 p^11}

The appearance of the Catalan numbers $\frac{1}{N+1} \binom{2N}{N}$ in the case of the geometric (discrete exponential) distribution can be understood from the standard bijection between arbitrary and binary trees, whose number is counted by these integers. The exact solution for this example is:

$$f(x) = \frac{1 - \sqrt{1 - 4 p (1 - p) x}}{2 p}$$
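The fixed-point iteration for f(x) = x g(f(x)) can be reproduced with truncated power series. The sketch below assumes the geometric offspring distribution of this example and starts from f_1(x) = x g(1) = x, which is the first iterate obtained from f_0 = 1 when g is normalized; the function names are illustrative. It compares the result with the Catalan-number coefficients.

```python
from fractions import Fraction
from math import comb

def series_mul(a, b, order):
    """Product of two power series, truncated after x**order."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a[: order + 1]):
        if ai == 0:
            continue
        for j, bj in enumerate(b[: order + 1 - i]):
            out[i + j] += ai * bj
    return out

def geometric_g_of(f, p, order):
    """g(f(x)) for g(y) = (1 - p)/(1 - p y) = (1 - p) sum_k (p y)^k, assuming f(0) = 0."""
    total = [Fraction(0)] * (order + 1)
    term = [Fraction(1)] + [Fraction(0)] * order     # (p f)^0
    pf = [p * c for c in f]
    for _ in range(order + 1):                       # (p f)^k starts at x^k
        total = [t + c for t, c in zip(total, term)]
        term = series_mul(term, pf, order)
    return [(1 - p) * t for t in total]

def solve_functional_equation(p, order):
    """Iterate f <- x g(f); each pass fixes at least one more coefficient."""
    f = [Fraction(0), Fraction(1)] + [Fraction(0)] * (order - 1)   # f_1(x) = x
    for _ in range(order + 1):
        gf = geometric_g_of(f, p, order)
        f = [Fraction(0)] + gf[:order]               # multiply by x and truncate
    return f

def catalan(m):
    return comb(2 * m, m) // (m + 1)

p = Fraction(1, 3)
f = solve_functional_equation(p, order=8)
expected = [Fraction(0)] + [catalan(N - 1) * (1 - p) ** N * p ** (N - 1)
                            for N in range(1, 9)]
```

With exact rational arithmetic the coefficient of x^N should equal the (N-1)-th Catalan number times (1 - p)^N p^(N-1), matching the leading entries of the series reversion result.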

Let us compute the expected cumulative population:

    FullSimplify[D[Log[f[x]], x] /. x -> 1]

$$\frac{1}{2} \left( 1 + \frac{1}{\sqrt{(1 - 2 p)^2}} \right)$$

    Plot[(1/2) (1 + 1/Sqrt[(1 - 2 p)^2]), {p, 0, 1}]

[Plot: the expression above versus p on the interval 0 to 1; vertical axis from 0 to 70.]

Figure. Average number of descendants as a function of p, averaged over all cases for which the result is finite. This is equal to all the cases for p ≤ 1/2 only, because of the singularity at that point.

In addition, for $p \ge 1/2$ there is a nonzero value for the total probability

$$1 - f(1) = 1 - \frac{1 - |1 - 2 p|}{2 p} = \begin{cases} 0 & p \le 1/2 \\ 2 - 1/p & p \ge 1/2 \end{cases}$$

of an infinite number of descendants.

    Plot[(-1 + 2 p + Abs[1 - 2 p])/(2 p), {p, 0, 1}]

[Plot: the expression above versus p; both axes run from 0 to 1.]

Figure. Probability of an infinite result, as a function of p.

Thus the generating function $f(x)$ contains full information about the expected number of descendants, for any $p$.
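The probability of an infinite tree can also be estimated without the closed form. A minimal sketch, assuming the geometric offspring distribution used above: setting x = 1 in the iteration f_{L+1}(x) = x g(f_L(x)) gives s ← g(s), and starting from s = 0 this converges to the smallest fixed point of g, which is f(1). The function names and the iteration count are illustrative choices.

```python
def geometric_g(s, p):
    """Offspring generating function g(s) = (1 - p)/(1 - p s)."""
    return (1.0 - p) / (1.0 - p * s)

def finite_tree_probability(p, iterations=10_000):
    """Iterate s <- g(s) from s = 0; the limit is f(1), the probability of a finite tree."""
    s = 0.0
    for _ in range(iterations):
        s = geometric_g(s, p)
    return s

def closed_form_infinite_probability(p):
    """1 - f(1) from the piecewise formula in the text."""
    return 0.0 if p <= 0.5 else 2.0 - 1.0 / p

for p in (0.25, 0.75, 0.9):
    numeric = 1.0 - finite_tree_probability(p)
    print(p, numeric, closed_form_infinite_probability(p))
```

The iteration converges slowly when p is very close to 1/2, where the two fixed points of g merge, so the fixed iteration count here is only adequate away from that point.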

Size distribution calculations: Pr(N) and Pr(n, N)

$\Pr(N)$ is determined by the function $q$. Its derivation requires an understanding of birth-and-death processes [4], which can be provided by the generating function point of view. Let $g(z)$ be the generating function for $q(n)$. Also define the generating function $f(x)$ for $\Pr(N)$:

$$g(z) = \sum_{n=0}^{\infty} z^n q(n) \quad \text{and} \quad f(x) = \sum_{N=1}^{\infty} x^N \Pr(N)$$

For example, if $q$ is a geometric distribution, $g(z) = (1 - p) / (1 - p z)$. The power law is of particular interest for many applications: if $q$ is a power law with power $-\alpha$, $g(z) = p_0 + (1 - p_0)\, \mathrm{Li}_\alpha(z) / \zeta(\alpha)$. $f(x)$ is the generating function for $\Pr(N)$ for finite integer values of $N$ only; $1 - f(1)$ is the probability of obtaining $N = \infty$. Some important pairs $(g, f)$ are listed in Table 1.

Table 1. Generating functions for $q_n$. Each $f(x)$ solves $f(x) = x\, g(f(x))$.

| Distribution name | Distribution $q(n)$ | Generating function $g(z)$ | Generating function $f(x)$ |
|---|---|---|---|
| Geometric | $q_n = p^n (1 - p)$ | $\frac{1 - p}{1 - p z}$ | $\frac{1 - \sqrt{1 - 4 p (1 - p) x}}{2 p}$ |
| Linear fractional | $q_n = \delta_{n 0} \frac{1 - b - p}{1 - p} + (1 - \delta_{n 0})\, b\, p^{n-1}$ | $\frac{(1 - b - p) + (b - p + p^2) z}{(1 - p)(1 - p z)}$ | $\frac{(1 - p) - (b - p + p^2) x - \sqrt{\big((1 - p) - (b - p + p^2) x\big)^2 - 4 p (1 - p)(1 - b - p) x}}{2 p (1 - p)}$ |
| Binomial | $q_n = \binom{N}{n} p^n (1 - p)^{N - n}$ | $\big((1 - p) + p z\big)^N$ | $(1 - p)^N x + N p (1 - p)^{2N - 1} x^2 + \frac{N (3N - 1)}{2} p^2 (1 - p)^{3N - 2} x^3 + \ldots$ |
| Binary binomial tree | $q_n = \binom{2}{n} p^n (1 - p)^{2 - n}$, $n \in \{0, 1, 2\}$ | $\big((1 - p) + p z\big)^2$ | $\frac{1 - \sqrt{1 - 4 p (1 - p) x}}{2 p^2 x} - \frac{1 - p}{p}$ |
| Power law | $q_n = p_0$ for $n = 0$; $q_n = (1 - p_0)\, n^{-\alpha} / \zeta(\alpha)$ for $n > 0$ | $p_0 + (1 - p_0)\, \mathrm{Li}_\alpha(z) / \zeta(\alpha)$ | $p_0 x + \frac{(1 - p_0) p_0}{\zeta(\alpha)} x^2 + \frac{2^{-\alpha} (p_0 - 1) p_0 \big(-2^{\alpha} + 2^{\alpha} p_0 - p_0 \zeta(\alpha)\big)}{\zeta(\alpha)^2} x^3 + \ldots$ |

Resource boundedness

The cluster generation procedure can be serialized for implementability:

$$\Pr(\{n_k \mid 1 \le k \le j+1\} \mid N, n) = \Pr(n_{j+1} \mid \{n_k \mid 1 \le k \le j\}, N, n)\; \Pr(\{n_k \mid 1 \le k \le j\} \mid N, n)$$

$$\Pr(n_{j+1} \mid \{n_k \mid 1 \le k \le j\}, N, n) = \frac{\Pr(\{n_k \mid 1 \le k \le j+1\} \mid N, n)}{\Pr(\{n_k \mid 1 \le k \le j\} \mid N, n)} = \frac{\mathrm{Coef}\big[x^N \prod_{k=1}^{j+1} y_k^{n_k},\ \prod_{k=1}^{n} g_2(x\, y_k)\big]\Big|_{y_{k > j+1} = 1}}{\mathrm{Coef}\big[x^N \prod_{k=1}^{j} y_k^{n_k},\ \prod_{k=1}^{n} g_2(x\, y_k)\big]\Big|_{y_{k > j} = 1}}$$

Resulting functions

E.g. if $q_2$ is a geometric distribution, we can compute this solution and get a Binomial-Beta distribution that plays the role of the Beta distribution in the CRP below. This can be calculated as follows:

$$\Pr(n_{j+1} \mid \{n_k \mid 1 \le k \le j\}, N, n) = \Pr\Big(n_{j+1} \,\Big|\, \tilde N = N - \sum_{k=1}^{j} n_k,\ n\Big) = \mathrm{Bb}\Big(n_{j+1} \,\Big|\, \alpha = 1,\ \hat\beta = n - j - 1,\ \nu = \tilde N \equiv N - \sum_{k=1}^{j} n_k\Big)$$

$$\langle n_{j+1} \rangle \equiv E(n_{j+1}) = \frac{\nu\, \alpha}{\alpha + \hat\beta} = \frac{\tilde N}{n - j} \quad \text{(equal sized portions)}$$

$$\big\langle (n_{j+1} - \langle n_{j+1} \rangle)^2 \big\rangle \equiv \operatorname{Var}(n_{j+1}) = \frac{\nu\, \alpha\, \hat\beta}{(\alpha + \hat\beta)^2}\, \frac{\alpha + \hat\beta + \nu}{\alpha + \hat\beta + 1} = \frac{\tilde N (n - j - 1)}{(n - j)^2}\, \frac{\tilde N + n - j}{n - j + 1} \quad \text{(zero variance for } j + 1 = n \text{)}$$

Instead of a “stick-breaking construction”, let’s call this the “bread-breaking construction”: each $n_{j+1}$ in turn tries to take an equal share of the $\tilde N$ remaining samples. Only the last one is able to do this with no variance, since he just takes everything left over.
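The mean and variance of the “bread-breaking” step can be checked by direct counting. The sketch below relies on one assumption worth stating: for a geometric $g_2$, every way of splitting N samples among n clusters has the same weight $(1-p)^n p^N$, so the next share is distributed according to a simple count of the remaining splits. The names `next_share_distribution`, `remaining` (for the remaining samples) and `parts` (for n - j) are illustrative.

```python
from fractions import Fraction
from math import comb

def next_share_distribution(remaining, parts):
    """Pr(n_{j+1} = m) when all splits of `remaining` samples among
    `parts` >= 2 remaining clusters (sizes >= 0) are equally likely."""
    total = comb(remaining + parts - 1, parts - 1)
    return [Fraction(comb(remaining - m + parts - 2, parts - 2), total)
            for m in range(remaining + 1)]

def mean_and_variance(pmf):
    mean = sum(m * q for m, q in enumerate(pmf))
    second = sum(m * m * q for m, q in enumerate(pmf))
    return mean, second - mean ** 2

remaining, parts = 7, 4          # N-tilde = 7 samples left, n - j = 4 clusters left
pmf = next_share_distribution(remaining, parts)
mean, variance = mean_and_variance(pmf)

predicted_mean = Fraction(remaining, parts)
predicted_variance = (Fraction(remaining * (parts - 1), parts ** 2)
                      * Fraction(remaining + parts, parts + 1))
```

The counted moments should coincide exactly with the two closed forms given in the text for the mean and the variance.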


But in the more interesting case of a power law, $g_2(z) = \mathrm{Li}_\alpha(z) / \zeta(\alpha)$, we need to calculate the numerator $\mathrm{Coef}[\,\cdot\,]$ in the following expression:

$$\Pr(n_{j+1} \mid \{n_k \mid 1 \le k \le j\}, N, n) = \frac{(n_{j+1})^{-\alpha}}{\zeta(\alpha)}\; \frac{\mathrm{Coef}\big[x^{N - n_{j+1}},\ (\mathrm{Li}_\alpha(x))^{n - j - 1}\big]}{\mathrm{Coef}\big[x^{N - n_j},\ (\mathrm{Li}_\alpha(x))^{n - j}\big]}$$

2.2.2 Continuous-time branching processes

Here is the alternative discrete-time semantics of clustergen (omitting for simplicity the node labels $x$, and just keeping the tree structure):

$$\hat H = \sum_{k=0}^{\infty} q(k)\, \hat a^k a = g(\hat a)\, a \qquad (2.5)$$

$$H = g(\hat a)\, a - N$$

$$\hat H^2 = g(\hat a)^2 a^2 + g(\hat a)\, g'(\hat a)\, a \qquad (2.6)$$

$$\hat H^3 = g(\hat a)^3 a^3 + 3\, (g(\hat a))^2 g'(\hat a)\, a^2 + g(\hat a)\, (g'(\hat a))^2 a + (g(\hat a))^2 g''(\hat a)\, a; \ \ldots$$

where

$$g(z) = \sum_{n=0}^{\infty} z^n q(n) \quad \text{and} \quad g(1) = \sum_{n=0}^{\infty} q(n) = 1$$

In this model, every power of $\hat H$, and the continuous-time evolution $\exp(t H)$, can be formally expressed and computed using efficient power series operations (composition and reversion) on generating functions. With generating functions $f(x)$, operators $(\hat a, a)$ are represented by $(x, \partial_x)$ respectively. Then $\hat H \mapsto [g(x)\, \partial_x]$, and $H \mapsto [(g(x) - x)\, \partial_x]$. Defining

$$J(x; x_0) = \int_{x_0}^{x} \frac{d u}{g(u) - u} \quad \text{and} \quad K(x; x_0) = \int_{x_0}^{x} \frac{d u}{g(u)} \qquad (2.7)$$

then, considering $J(x; x_0)$ to be a function of just its first argument $x$,

$$\frac{d}{d J} = \frac{d x}{d J} \frac{d}{d x} = \frac{1}{d J / d x} \frac{d}{d x} \cong H$$

$$e^{t H} f(x) \mapsto e^{t (d / d J)} f\big(J^{-1}(J(x))\big) = f\big(J^{-1}(t + J(x))\big) \qquad (2.8)$$

by Taylor’s theorem in the form $e^{a \partial_x} f(x) = f(x + a)$. Thus we need only calculate $J^{-1}(t + J(x))$ using power series reversion and composition. [1] (section III.3 eq. (7)) provides a different derivation. A similar calculation holds for discrete-time semantics (Equation 2.6) using $K$, so that

$$e^{s \hat H} f(x) \mapsto f\big(K^{-1}(s + K(x))\big) = f\Big(x + s\, g(x) + \frac{s^2}{2}\, g(x)\, g'(x) + \frac{s^3}{3!} \big(g(x) (g'(x))^2 + (g(x))^2 g''(x)\big) + \ldots \Big) \qquad (2.9)$$

$$= f(x) + s\, g(x)\, \partial_x f(x) + \ldots \cong \big(I + s\, g(\hat a)\, a + \ldots\big) f(x) ,$$

from which we can recalculate Equation 2.6. In either case the grammar is tractable because clustergen is a context-free grammar: there is only one term on the left hand side of each rule.

Thus, both the continuous-time and discrete-time semantics can be at least formally solved in this special case.

Example. $\hat H = r (1 + c^2 \hat a^2)\, a$ where $r = 1 / (1 + c^2)$, represents a birth-death process with one birth or death event per discrete time step. Each possible birth or death event is oblivious of all the others in its continuous-time firing rate. Then the alternative discrete-time semantics has $K(x) = \arctan(c x) / (r c)$, so

$$e^{s \hat H} |1\rangle \mapsto \frac{\tan(s r c) + c x}{c\, (1 - c x \tan(s r c))} .$$

Therefore

$$1 \cdot e^{s \hat H} |1\rangle = \frac{\tan(s r c) + c}{c\, (1 - c \tan(s r c))} ,$$

which has singularities at finite $s$ and therefore is not equal to $e^{\alpha s}$ for any $\alpha$. So in this case, $1 \cdot \hat H^n |1\rangle \ne \alpha^n$.

In continuous time for $c = 1$, this model also has a simple solution:

$$J^{-1}(t + J(x)) = \frac{t + 2 x - t x}{t + 2 - t x} = \frac{t}{2 + t} + \sum_{n=1}^{\infty} \frac{4\, t^{n-1}}{(2 + t)^{n+1}}\, x^n .$$
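The closed form for c = 1 can be sanity-checked exactly. This is a small sketch, assuming g(u) = (1 + u^2)/2 (that is, c = 1 and r = 1/2): a flow of the form J^{-1}(t + J(x)) must satisfy the composition law flow(t + s, x) = flow(t, flow(s, x)), reduce to x at t = 0, and agree with its power series in x. The function names are illustrative.

```python
from fractions import Fraction

def flow(t, x):
    """J^{-1}(t + J(x)) for g(u) = (1 + u^2)/2, i.e. c = 1 and r = 1/2."""
    return (t + 2 * x - t * x) / (t + 2 - t * x)

def flow_series(t, x, terms):
    """Partial sum of t/(2+t) + sum_{n>=1} 4 t^(n-1) x^n / (2+t)^(n+1)."""
    total = t / (2 + t)
    for n in range(1, terms + 1):
        total += 4 * t ** (n - 1) * x ** n / (2 + t) ** (n + 1)
    return total

t, s, x = Fraction(3, 2), Fraction(1, 5), Fraction(1, 3)
composed = flow(t, flow(s, x))
direct = flow(t + s, x)
series_error = abs(flow(t, x) - flow_series(t, x, 60))
```

The value flow(t, 1) = 1 for every t reflects conservation of total probability, since the argument x = 1 sums the generating function over all population sizes.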

Example. If $q$ has a normalized power-law distribution:

$$q(n) = \begin{cases} p_0 & n = 0 \\ (1 - p_0)\, n^{-\alpha} / \zeta(\alpha) & n > 0 \end{cases}$$

then $g(z) = p_0 + (1 - p_0)\, \mathrm{Li}_\alpha(z) / \zeta(\alpha)$ (where $\mathrm{Li}_\alpha(z)$ is the polylogarithm function) and the continuous-time and alternative discrete-time dynamics are given formally by the integrals

$$J(x; x_0) = \int_{x_0}^{x} \frac{d z}{p_0 - z + (1 - p_0)\, \mathrm{Li}_\alpha(z) / \zeta(\alpha)} \quad \text{and} \quad K(x; x_0) = \int_{x_0}^{x} \frac{d z}{p_0 + (1 - p_0)\, \mathrm{Li}_\alpha(z) / \zeta(\alpha)}$$

But these integrals, unlike those of the previous example, do not appear in integral tables and may not have analytic solutions in terms of commonly studied special functions.

Open problem

Polylogarithm integrals: find fast algorithm

2.3 Machine Learning

Most material in this section is from UCI ICS TR 05-09, “Variable-Structure Systems from Graphs and Grammars”, by the author.

2.3.1 Markov Random Fields

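$$\Pr(x \mid x_{\mathrm{initial}}) = \frac{1}{Z} \prod_{k \in \mathcal{K}_1} \phi_k\big([x_i \mid F_{i k} = 1]\big)$$

2.3.2 Bayes Nets

$$\Pr(x \mid x_{\mathrm{initial}}) = \prod_{k \in \mathcal{K}_2} \phi_k\big([x_i \mid D_{i k} = 1] \,\big|\, [x_j \mid D_{k j} = 1]\big)$$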

2.3.3 Factor Graphs

Following [5] we recall that undirected (MRF) and directed (BN) graphical models may be incorporated into a common graph framework by introducing probability factor nodes $\phi_\alpha$ into graphs that denote probability distributions. We assume each factor node $\phi_\alpha$ is labelled by a member of some function space $\mathcal{F}$ whose values are nonnegative; for example, they may be exponentials of real-valued “potential functions” in another function space $\mathcal{F}'$.

For a dependency diagram, the semantics function $\Psi$ maps such labelled graphs to probability density functions which we can write as follows:

$$\Pr(x \mid x_{\mathrm{initial}}) = \frac{1}{Z} \prod_{k \in \mathcal{K}_1} \phi_k\big([x_i \mid F_{i k} = 1]\big) \prod_{k \in \mathcal{K}_2} \phi_k\big([x_i \mid D_{i k} = 1] \,\big|\, [x_j \mid D_{k j} = 1]\big) \qquad (2.10)$$

Here $F$ and $D$ are 0/1-valued adjacency matrices for two separate link types: undirected (MRF-like) and directed (BN) dependency links. An $F$ link represents participation in a potential function in a Boltzmann distribution or Markov Random Field. Although the graph $F$ is itself directed, its Boolean-algebra square $U = F F^{T}$ symmetrically connects random variables that are related by one or more potential functions. By contrast the directed links $D$ form a Directed Acyclic Graph (DAG) and represent a generalized form of conditional distribution in which each probability factor participates in the normalization relationship for directed factor graphs [5] [6], and specializes to a conditional distribution if a variable $x_i$ and a factor $k$ are uniquely connected by a $D$-link. In addition, some of the variable nodes $x$ may actually be labeled as fixed parameters. These may be interpreted as conditionalized random variables. $F$, $U$, and $D$ type dependency links are distinguished by being labeled with “f”, “u”, and “d” in the labeled graph representation of a dependency diagram.

The conversion of MRF and BN links into FG f and d links is illustrated in Figure 1. With these interconversions, we can freely intermix f, u, and three types of d links (depending on the types of nodes connected) provided that the set of all the d links forms a DAG.

Figure 1: Conversions between MRF u undirected dependency links (top) and BN d directed dependency links (bottom), on the left, and FG f and d links, on the right. There exists a more precise definition of the factor f in the top panel that would require additional pairwise factors between each pair of variables in the FG on the right (not drawn).

Let $\mathcal{P}(\mathcal{F}, N_{\mathrm{rv}}, N_{\mathrm{factor}})$ be the space of probability density functions that can be constructed according to Equation 2.10 from $N_{\mathrm{rv}}$ random variables and from $N_{\mathrm{factor}}$ probability factors each in the space $\mathcal{F}$. Let $\mathcal{P}(\mathcal{F}, N_{\mathrm{rv}}) = \mathcal{P}(\mathcal{F}, N_{\mathrm{rv}}, \infty)$.

We denote dependency diagram classes $DD(\{f, d\}, \{\mathbb{Z}, \mathbb{R}\}, \mathcal{F}, (N_{\mathrm{node}}, N_{\mathrm{link}}))$ for diagrams containing both f and d links, both integer-valued and real-valued random variables, factor functions in a function space $\mathcal{F}$, and bounds $(N_{\mathrm{node}}, N_{\mathrm{link}})$ on the numbers of nodes and links, assuming the above form for the semantics function $\Psi$ from class members to probability distributions. More restricted classes such as $DD(\{f\}, \mathbb{Z}, \mathcal{F}, (N_{\mathrm{node}}, N_{\mathrm{link}}))$ (integer-valued Boltzmann distributions) can be described in a similar way.

Factor graphs as defined have a fixed structure of dependencies given by the matrices $F$ and $D$, regardless of the values of other variables or parameters such as time. We seek to remove these limitations by defining new node and link types and their semantics. We also consider wherever possible the reduction to previously defined node and link types, and the effects of such reduction on size and complexity parameters such as numbers of nodes and links. The essential new link types are: factor gating links labelled by “g”, node and factor indexing nodes and links labelled by “ι” (which allow for replication), and node existence links labelled by “e”. Other convenient node and link types will be defined in terms of these.

2.3.4 Dependency Diagrams

Gating links

A special case of dependency link is of particular interest for variable-structure systems: the gating link. We assign meaning to such links by extending the semantic function $\Psi$ to dependency diagrams (labelled graphs that denote distributions) that include them. Examples of gating links include all 0/1-valued multiplicative indicator variables or functions in MRF’s, such as line processes in region segmentation, cluster membership variables in mixture models, and graph matching assignment matrices [7].

The semantic function $\Psi$ now assigns to each such diagram the probability distribution:

$$\Pr(\{x\}) = \frac{1}{Z} \prod_{k \in \mathcal{K}_1} \phi_k\big(\{x_i \mid F_{i k} = 1\}\big)^{\left[\prod_{\kappa \mid g(\kappa, k) = 1} \Theta(x_\kappa > 0)\right]} \times \prod_{k \in \mathcal{K}_2} \phi_k\big(\{x_i \mid D_{i k} = 1\} \,\big|\, \{x_j \mid D_{k j} = 1\}\big)^{\left[\prod_{\kappa \mid g(\kappa, k) = 1} \Theta(x_\kappa > 0)\right]} \qquad (2.11)$$

Here, the 0/1-valued Heaviside functions $\Theta(\text{predicate})$ are in the exponent and are applied to each integer- or real-valued random variable $x_\kappa$ that gates factor $k$ according to the gating graph links whose adjacency matrix is $g(\kappa, k) = g_{\kappa, k}$. Each product of Heaviside functions takes value 0 or 1. Only if all its gating constraints (if any) are met is a probability factor $\phi_k$ multiplied into the joint probability distribution, since $\phi_k^0 = 1$. Both $D$ and $F$ type interactions can be gated. In the absence of all gating links, each product $\prod_\kappa \Theta$ is empty, hence $= 1$, and the definition for $\Psi$ reduces to the previous one for FG’s, Boltzmann distributions, MRF’s, and BN’s. For this definition to work, we define $0^0 = 1$.
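The effect of a gating exponent can be seen in a few lines of code. This is only a sketch under simplifying assumptions: random variables are stored in a dictionary, each factor is a plain function of that dictionary, and `gates` is a hypothetical mapping from a factor index to the names of the variables that gate it. None of these names come from the original notes.

```python
def heaviside(predicate):
    """0/1-valued indicator of a predicate."""
    return 1 if predicate else 0

def gated_weight(x, factors, gates):
    """Unnormalized probability of the assignment x (dict: name -> value).

    factors: list of functions mapping x to a nonnegative factor value
    gates:   dict mapping factor index k to the variable names gating factor k
    """
    weight = 1.0
    for k, factor in enumerate(factors):
        exponent = 1                                  # empty product of gates
        for name in gates.get(k, ()):
            exponent *= heaviside(x[name] > 0)
        weight *= factor(x) ** exponent               # Python evaluates 0.0 ** 0 as 1.0
    return weight

factors = [lambda x: 0.2, lambda x: 0.0]
gates = {0: ["a"], 1: ["b"]}

only_first = gated_weight({"a": 1, "b": 0}, factors, gates)
neither = gated_weight({"a": 0, "b": 0}, factors, gates)
both = gated_weight({"a": 1, "b": 1}, factors, gates)
ungated = gated_weight({"a": 0, "b": 0}, factors, {})
```

A factor whose gates are not all satisfied is raised to the power 0 and drops out of the product, including a factor whose own value is 0, which is why the convention 0^0 = 1 is needed. With no gating links the product reduces to the ordinary factor product.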

The “standard expansion” map $\mathcal{E}$ for gating links eliminates all such links in favor of $F$ and $D$ links. This can be done by replacing all gating links with ungated ones of $D$ type if possible using local renormalization of $\phi_k$, and with $F$ type links if not; it also changes the probability factor $\phi_k$ accordingly by raising it to the power of the product of all incident 0/1 gating variable values, assuming that is possible within the function class $\mathcal{F}$. The resulting ungated graphical model has the same “meaning” (maps under $\Psi$ to the same joint distribution on the same random variables) as the gated one but is devoid of gating links. In this way, we can reduce dependency diagrams with gating links to those without. The cost of doing so is an increase in the number of arguments and the generality of the allowed probability factor functions. The number of nodes and links remains constant.

Two essentially different classes of diagram may be considered: those in which the gating links are constrained to form a DAG when added to $D$, and those that aren’t. If the $D$ and associated g links form a DAG, then we may attempt a reduction of gated directed links to ungated directed links. If for some reason the normalization property of $D$-links is lost, any affected probability factors can be moved from the “$D$” product to the “$F$” product and all corresponding dependency link labels changed accordingly. Either way we establish the following lemma.

Lemma 1. There exists a semantics function $\Psi_{fg}$ from $DD(\{f, d, g\}; \mathcal{F}; (N_{\mathrm{node}}, N_{\mathrm{link}}))$ to probability density functions $\mathcal{P}(\mathcal{F}, N_{\mathrm{node}})$ on $N_{\mathrm{node}}$ variables such that:

(a) $\Psi_{fg}$ specializes to FG’s, MRF’s and BN’s, i.e. it agrees with the standard $\Psi : DD(\{f, d\}, \mathcal{F}, (N_{\mathrm{node}}, N_{\mathrm{link}})) \to \mathcal{P}(\mathcal{F}, N_{\mathrm{node}})$ on diagrams without g links. Evaluated on such diagrams, $\Psi = \Psi_{fg}$.

(b) There exists a “standard expansion” map $\mathcal{E}$ which reduces $DD(\{f, d, g\}; \mathcal{F}; (N_{\mathrm{node}}, N_{\mathrm{link}}))$ to $DD(\{f, d\}; \mathcal{F}; (N_{\mathrm{node}}, N_{\mathrm{link}}))$, such that on the domain $DD(\{f, d, g\}; \mathcal{F}; (N_{\mathrm{node}}, N_{\mathrm{link}}))$, $\Psi_{fg} = \Psi \circ \mathcal{E}$. In other words, the diagram formed by $\Psi_{fg}$, $\mathcal{E}$ and $\Psi$ commutes.

As we will see, however, this “standard” expansion map $\mathcal{E}$ can be far from the most efficient reduction for particular variable-structure systems. This fact is important to understanding the difference in principle between variable-structure and fixed-structure systems.

Node existence links

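In the context-free tree example, a large number of potential random variables are not involved in any given tree, depending on the values (and involvement) of their parent variables. We may indicate non-involvement, or effective non-existence, of a random variable in a tree by using special gating links (labelled “e”) to cut off all of its interactions with other variables. Thus the semantics $\Psi(D)$ of $DD(\{f, d, g, e\}; \mathcal{F}; (N_{\mathrm{node}}, N_{\mathrm{link}}))$ is:

$$\Pr(\{x\}) = \frac{1}{Z} \prod_{k \in \mathcal{K}_1} \phi_k\big(\{x_i \mid F_{i k} = 1\}\big)^{\left[\prod_{\kappa \mid g(\kappa, k) = 1} \Theta(x_\kappa > 0) \prod_{\kappa \mid e(i, \kappa) = 1} \Theta(x_\kappa > 0)\right]} \times \prod_{k \in \mathcal{K}_2} \phi_k\big(\{x_i \mid D_{i k} = 1\} \,\big|\, \{x_j \mid D_{k j} = 1\}\big)^{\left[\prod_{\kappa \mid g(\kappa, k) = 1} \Theta(x_\kappa > 0) \prod_{\kappa \mid e(i, \kappa) = 1} \Theta(x_\kappa > 0)\right]} \qquad (2.12)$$

The products in the exponent are more simply expressed in terms of Boltzmann/Gibbs distribution energy functions, where they just multiply the potential functions $V_k = -\log \phi_k$:

$$E = \sum_{k \in \mathcal{K}_1} V_k\big(\{x_i \mid F_{i k} = 1\}\big) \prod_{\kappa \mid g(\kappa, k) = 1} \Theta(x_\kappa > 0) \prod_{\kappa \mid e(i, \kappa) = 1} \Theta(x_\kappa > 0) + \sum_{k \in \mathcal{K}_2} V_k\big(\{x_i \mid D_{i k} = 1\} \,\big|\, \{x_j \mid D_{k j} = 1\}\big) \prod_{\kappa \mid g(\kappa, k) = 1} \Theta(x_\kappa > 0) \prod_{\kappa \mid e(i, \kappa) = 1} \Theta(x_\kappa > 0) \qquad (2.13)$$

We may equivalently reduce existence links to gating links as follows: constrain

$$e(i, \kappa) = 1 \iff \big(\forall k : F_{i k} = 1 \lor D_{i k} = 1 \implies g'(\kappa, k) = 1\big)$$

which may be achieved by defining

$$g'(\kappa, k) = g(\kappa, k) \lor \big(\exists\, i \mid e(i, \kappa) \land (F_{i k} = 1 \lor D_{i k} = 1)\big) .$$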

Indexing links

One more significant notational extension is required to express variable-structure system architectures in which a fixed amount of network description information controls a variable number of random variables and dependencies. In structured applications such as those involving time, space, or other architectural regularities, conventional algebraic notation for generative models expands to include subscripts, indices, or their equivalent. Here we incorporate such indices as part of the formal specification and semantics of Dependency Diagrams. Index nodes are a reformulation and extension [8] of the Plates notation of [9], so they may also be called “platelets”. The key idea is that, whenever variables or functions could algebraically appear with subscripts indicating multiplicity, a dependency diagram has an “index node” with an “index link” to the corresponding variable or function node. The Figure shows the replacement of repeated random variables, with or without interactions, by index nodes and index links labelled “ι” (iota).

Figure: Indexing nodes. (a) An indexed set of random variables replaced by a variable node and an index node. (b) An indexed set of random variables and their directed or undirected dependencies, replaced by two indexed random variables and their directed or undirected dependence.

In the absence of gating, we may express indexing with ι (iota) links most simply using sparse matrices (also denoted by iota) as follows:

$$\iota(i, a) \in \{0, 1\}, \quad \iota(k, a) \in \{0, 1\}$$

where implicitly we constrained $i$ to index the variable nodes $x_i$, $a$ to index the index nodes $\alpha_a$, and $k$ to index the probability factor nodes $\phi_k$. Of course, the symbols $a$, $i$, and $k$ are not (necessarily) themselves index nodes but rather meta-indices in the mathematical language we are using to describe dependency diagrams. Indeed, by augmenting the metaindex $i$ of random variable $x_i$ with an ordered set of indices $(\alpha_a)$ that index $x_i$ as specified by $\iota(i, a) = 1$, we obtain the indexed random variable $x_{i, (\alpha_a \mid \iota(i, a) = 1)}$. A particular example would be $x_{5, (\alpha_1, \alpha_3)}$ or, even more specifically, “grade(student, course)”.

The semantics of index nodes and links are given, under the restrictive assumption of a single level of indexing in DD(d, ι), by the following probability density formula. Define $J(k)$ to be the ordered set $(\alpha_m \mid \iota(k, m) = 1)$, and likewise define $J(i) = (\alpha_a \mid \iota(i, a) = 1)$ and $J(j) = (\alpha_b \mid \iota(j, b) = 1)$. Then the value of $\Psi$ is

$$\Pr(\{x\}) = \frac{1}{Z} \prod_{\{\alpha_a \in \mathcal{I}_a\}} \prod_{k \in \mathcal{K}_1} \phi_{k, J(k)}\big(\{x_{i, J(i)} \mid F_{i k} = 1\}\big) \times \prod_{k \in \mathcal{K}_2} \phi_{k, J(k)}\big(\{x_{i, J(i)} \mid D_{i k} = 1\} \,\big|\, \{x_{j, J(j)} \mid D_{k j} = 1\}\big) \qquad (2.14)$$

Note that all products over indices act within the same global scope: there is no nesting of parenthesized subproducts over indices. This is only the simplest situation. In general, index nodes introduce the need for a compatible tree of index scope nodes (with an implicit root node for the whole diagram) that determines which probability factors are within scope for which index products.

The following generalizations of the foregoing indexing mechanism can be added, go further beyond Plates, and are straightforward to express in terms of more general probability formulas: (a) multiple levels of indexing (subscripts on the subscripts, in a DAG of iota-relationships); (b) combination of indexing with gating links “g” and therefore with existence links “e”; (c) numerical index constraint links $\delta_i$, enforced by Kronecker delta function factors; (d) reordering the indices of some variable and factor nodes (from the default numerical order) to express multidimensional transpose operations; (e) constraints $\delta_u$ (expanded into probability factors $\phi_k$ using Kronecker and Dirac delta functions) relating random variable values to index values and/or to each other; (f) $\tilde\iota(k, a) \in \{0, 1\}$ relationships that allow an indexed set of variables as arguments to a single probability factor; (g) upper index limits $\lambda$ that can be variable rather than constant as assumed so far. In the proof of Proposition 2, below, we will only need (a), (b), and (c) above.

The new link types, and the link types in terms of which they are defined, are summarized in Table 2.

Table 2. Dependency Diagram link types: definitional dependencies (itself a DAG)

| Link type | Symbol | Definable in terms of ... |
|---|---|---|
| factor dependency | f | stat mech, axiomatic, or d |
| unconditional dependency | u | f, stat mech, or axiomatic |
| directed (conditional) dependency | d | u, or axiomatic |
| gated interaction | g | (f, d) |
| node existence | e | g |
| constraint on random variables | $\delta_u$ | (u, $\delta_{\mathrm{Kronecker}}$, $\delta_{\mathrm{Dirac}}$) |
| identity of variable node names | repetition | $\delta_u$ |
| index | ι | repetitive expansion |
| scope | s | ι |
| constraint on indices | $\delta_i$ | (ι, g {, s}) |
| identity of index node names | repetition | $\delta_i$ |
| argument indexing | $\tilde\iota$ | ι {, s}, repetitive expansion |
| time delay | (f or d)$_{\delta t}$ | (ι, (f or d), $\delta_i$) |
| variable index limit | λ | e |
| value type | v | ($\delta_u$, ι) |
| permutable indexing | $\iota_s$ | repetitive expansion |

Example. Constrained and multilevel indexing is illustrated in the following diagram fragment. It contains no probability factors but hierarchically indexes a single random variable $x_{l, (i_1, \ldots, i_k, \ldots, i_l)}$.

Figure 3: Random variable node $x$ (circle) indexed by (squares) level number $l$, lineage indices $i_k$, where $k$ is constrained to be $\le l$ (hexagon).

Diagrams can be drawn more simply by omitting selected link labels according to the default link type conventions proposed and listed in Table 3.

With these node and link types, translation to and from Boltzmann distribution and related architectures is possible.

2.4 Statistical Mechanics


2.4.1 Example

Transcriptional regulation: Phage λ switch

[Ackers 82; Ackers and Shea 85] G K Ackers, A D Johnson, and M A Shea. Quantitative model for gene regulation by lambda phage repressor. Proc Natl Acad Sci U S A. 1982 February; 79(4): 1129–1133.

Sum the relevant state production rates weighted by occupancy fractions to get the rate law. This anticipates our use of partition functions to calculate equilibria.

To reverse engineer a network out of these state probabilities, introduce a topology of possible state transitions and enforce detailed balance to constrain all pairs of forward and backward rates. Topology of states connected by one binding change:

    4----1----2----6----4 ...
    |    |    |    |
    7----3----5----8----7 ...   (wraps around)

Equilibrium model for transcriptional regulation within a network. Refer forward to partition functions.

2.4.2 General statistical mechanics framework for dilute solutions

Assume we have a molecular complex defined at each level by a set of binary occupancy variablessi œ 80, 1< , related through a high-order Ising model. For each slot there is a fugacity variable zi . We can define amultidimensional array J of interaction energies, whose elements are indexed by the ordered set of indices rHsL :

JrHsL = JHiH1L< iH2L< ... < iHlLL œ $


with the convention that any other values of $J$ are 0. Defining $0^0 = 1$, the partition function for equilibrium statistical mechanics is

\[ Z(z \mid J) = \sum_{\{s \,\mid\, s_i \in \{0,1\}\}} \Big( \prod_i z_i^{s_i} \Big) \prod_{\{\sigma \,\mid\, \sigma_i \in \{0,1\}\}} \exp\Big[ -\beta J_{\rho(\sigma)} \prod_j (s_j)^{\sigma_j} \Big] \tag{2.15} \]

which is just another way of writing

\[ Z(z \mid J) = \sum_{\{s \,\mid\, s_i \in \{0,1\}\}} \exp(-\beta E), \quad \text{where} \]

\[ E = \sum_i \mu_i s_i + \sum_{i j} J_{i j}\, s_i s_j + \sum_{i j k} J_{i j k}\, s_i s_j s_k + \sum_{i j k l} J_{i j k l}\, s_i s_j s_k s_l + \cdots \]

and $-\beta \mu_i \equiv \log z_i$.
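The equivalence of the product form and the energy form can be checked by brute-force enumeration. The sketch below assumes that interaction energies are stored in a dictionary keyed by ordered index tuples (absent entries are 0, per the convention above), and that only subsets of two or more sites carry a $J$; the example fugacities and energies are arbitrary.

```python
import itertools
import math

def Z_product_form(z, J, beta=1.0):
    """Equation 2.15: sum over s of prod_i z_i^s_i times a product over subsets sigma."""
    n = len(z)
    total = 0.0
    for s in itertools.product((0, 1), repeat=n):
        term = math.prod(zi ** si for zi, si in zip(z, s))
        for sigma in itertools.product((0, 1), repeat=n):
            rho = tuple(i for i in range(n) if sigma[i])  # ordered index set rho(sigma)
            # prod_j s_j^sigma_j is 1 only if every selected site is occupied (0**0 == 1)
            active = math.prod(sj ** gj for sj, gj in zip(s, sigma))
            term *= math.exp(-beta * J.get(rho, 0.0) * active)
        total += term
    return total

def Z_energy_form(z, J, beta=1.0):
    """Sum over s of exp(-beta E), with -beta mu_i = log z_i."""
    n = len(z)
    mu = [-math.log(zi) / beta for zi in z]
    total = 0.0
    for s in itertools.product((0, 1), repeat=n):
        E = sum(m * si for m, si in zip(mu, s))
        E += sum(Jr for rho, Jr in J.items() if all(s[i] for i in rho))
        total += math.exp(-beta * E)
    return total

# Arbitrary three-site example (assumed values).
z_example = [0.5, 2.0, 1.5]
J_example = {(0, 1): -1.0, (1, 2): 0.3, (0, 1, 2): 0.7}
```

Both functions scale as $4^n$ or $2^n$ in the number of sites, so they are only meant for small sanity checks.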

Considered as a function of the fugacities $z$, $Z(z)$ is a high-order polynomial and it is a generating function for the (unnormalized) probabilities of all configurations $s$. However, many $J$'s can tend towards $\infty$ in such a way as to prohibit particular combinations of values of $\{s_i\}$ by giving them zero probability. Also many $J$'s can be exactly zero, so that particular interactions are absent. These possibilities can be encoded by the predicates $P(s)$ and $Q(\sigma)$, respectively, in the following expression for the partition function:

\[
\begin{aligned}
Z(z, J) &= \sum_{\{s \,\mid\, P(s)\}} \Big( \prod_i z_i^{s_i} \Big) \prod_{\{\sigma \,\mid\, Q(\sigma)\}} \exp\Big[ -\beta J_{\rho(\sigma)} \prod_j (s_j)^{\sigma_j} \Big] \\
&\equiv \sum_{\{s \,\mid\, P(s)\}} \Big( \prod_i z_i^{s_i} \Big) \prod_{\{\sigma \,\mid\, Q(\sigma) \,\wedge\, \bigwedge_i (\sigma_i \Rightarrow s_i)\}} \omega_{\rho(\sigma)} \\
&\equiv \sum_{\{s \,\mid\, P(s)\}} \Big( \prod_i z_i^{s_i} \Big) \prod_{\{\sigma \,\mid\, Q(\sigma)\}} \omega_{\rho(\sigma)}(s).
\end{aligned}
\tag{2.16}
\]

In this notation,

\[ \omega_{\rho(\sigma)} = \exp(-\beta J_{\rho(\sigma)}), \qquad J_{\rho(\sigma)} = \Delta G_{\rho(\sigma)}, \qquad \beta = 1 / (k T), \]

thus

\[ \omega_{\rho(\sigma)} = \exp(-\Delta G_{\rho(\sigma)} / k T) \tag{2.17} \]

where $G$ is the Gibbs free energy, $T$ is the temperature, $k$ is Boltzmann's constant, and $\Delta G_{\rho(\sigma)}$ is the change in Gibbs free energy (which may be zero) due specifically to the particular combination of sites $i$ represented by $\sigma_i = 1$. In traditional Ising models, for example, only pairs of sites $i$ and $j$ which are spatial neighbors have nonzero $\Delta G_{(i, j)}$ values; therefore nonneighboring pairs and all higher combinations have zero $\Delta G$'s: $0 = \Delta G_{(i, j, k)} = \Delta G_{(i, j, k, l)} = \cdots$.

As an example, a protein with two binding sites $b = 1$ and $b = 2$ that can each be empty or occupied by molecules of species 1 or 2 respectively, and no other internal states, would have $P(s) = \mathrm{True}$ and hence

\[ Z(z_1, z_2) = \sum_{\{s \,\mid\, s_i \in \{0,1\} \,\wedge\, P(s)\}} z_1^{s_1} z_2^{s_2}\, \omega_1^{s_1} \omega_2^{s_2} \omega_{12}^{s_1 s_2} = 1 + \omega_1 z_1 + \omega_2 z_2 + \omega_{12} z_1 z_2 \]

A protein with a single binding site that can be empty or occupied by species 1 or 2 would be modeled the same way except that $P(s) = \overline{s_1 \wedge s_2}$, hence $Z(z_1, z_2) = 1 + \omega_1 z_1 + \omega_2 z_2$. If the protein is itself regarded as another species that can be present or absent, with fugacity $z_0$, then it must be present, so $P(s) = s_0 \wedge \overline{s_1 \wedge s_2}$, and the partition function is $Z(z_1, z_2) = z_0 (1 + \omega_1 z_1 + \omega_2 z_2)$. Likewise, a heterodimer consisting only of species 1 and 2 with no internal states would have $P(s) = s_1 \wedge s_2$ and therefore $Z(z_1, z_2) = \omega_{12} z_1 z_2$. (This is a trivial case since there is only one state, but it will be useful when the species are given internal states as well.) In each case, as for any probability generating function, the coefficients can be normalized to give the probabilities of each possible configuration of bindings.
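The elementary examples above can be reproduced by enumerating the configurations allowed by the predicate $P(s)$. This is only a sketch: the function names, the dictionary representation of the polynomial, and the numerical weights are illustrative assumptions. With this enumeration the coefficient of $z_1 z_2$ in the two-site case comes out as $\omega_1 \omega_2 \omega_{12}$, which agrees with the $\omega_{12}$ written in the text if $\omega_{12}$ is read as the total weight of the doubly occupied state.

```python
import itertools
import math

def partition_polynomial(n, allowed, omega):
    """Map each allowed configuration s to its coefficient in Z.

    allowed(s) plays the role of the predicate P(s). omega maps ordered index
    tuples rho to Boltzmann weights; missing entries count as 1 (Delta G = 0).
    """
    poly = {}
    for s in itertools.product((0, 1), repeat=n):
        if not allowed(s):
            continue
        occupied = [i for i in range(n) if s[i]]
        coeff = 1.0
        for size in range(1, len(occupied) + 1):
            for rho in itertools.combinations(occupied, size):
                coeff *= omega.get(rho, 1.0)
        poly[s] = coeff
    return poly

def probabilities(poly, z):
    """Normalize the terms coeff * prod_i z_i^s_i into configuration probabilities."""
    weights = {s: c * math.prod(zi ** si for zi, si in zip(z, s))
               for s, c in poly.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}

W1, W2, W12 = 2.0, 3.0, 0.5  # assumed weights
two_site = partition_polynomial(2, lambda s: True, {(0,): W1, (1,): W2, (0, 1): W12})
single_site = partition_polynomial(2, lambda s: not (s[0] and s[1]), {(0,): W1, (1,): W2})
heterodimer = partition_polynomial(2, lambda s: s[0] and s[1], {(0, 1): W12})
```

Calling `probabilities(single_site, [0.1, 0.2])` then gives the normalized probabilities of the empty, species-1 and species-2 states at the chosen fugacities.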


Figure 1. Illustration of elementary partition functions in dilute solution. (a) A protein with two binding sites and a possible energetic interaction between them. $Z(z_1, z_2) = 1 + \omega_1 z_1 + \omega_2 z_2 + \omega_{12} z_1 z_2$. (b) A protein with a single binding site that can be empty or occupied by species 1 or 2, but not occupied by both. $Z(z_1, z_2) = 1 + \omega_1 z_1 + \omega_2 z_2$. (c) A heterodimer of two molecular species "bound" to a fictitious enclosing complex. $Z(z_1, z_2) = \omega_{12} z_1 z_2$.

Thus partition functions may be expressed as polynomials in fugacity variables. This is a particularly convenient notation for molecules in a dilute solution which acts as a reservoir, since in that case fugacities $z_i$ are proportional to concentrations $c_i = [S_i]$. Further examples are given in [T. Hill]. Such polynomial partition functions can be put into a form with homogeneous degree by introducing the complementary fugacity variables $z_i^{+}$ and $z_i^{-}$ and substituting $z_i = z_i^{+} / z_i^{-}$:

\[ Z_{\mathrm{homog}}(z^{+}, z^{-} \mid \omega) = Z(z^{+} / z^{-} \mid \omega) \Big( \prod_i z_i^{-} \Big). \]

No information is lost since $Z(z \mid \omega) = Z_{\mathrm{homog}}(z^{+} = z,\; z^{-} = 1 \mid \omega)$.


2.4.3 Gene Regulation Networks - MSR91

The Central Dogma of Molecular Biology: Information flow is DNA --> mRNA --> protein. Modifications of the Central Dogma: transcription factor feedback of protein to DNA; protein modification in the form of conformation changes, covalent modifications like phosphorylation/methylation, assembly and disassembly of protein-containing molecular complexes.

Figure . The central dogma of molecular biology, with transcriptional feedback added. The extra Central Dogma Like Network (CDLN, process ontology) link types allow for feedback loops. [Mjolsness, Sharp and Reinitz, Journal of Theoretical Biology 1991] (Figure labels: T, v, extracellular communication.)

Consider the single level of modularity with a global activation variable, similar to MWC but with nonidentical binding sites:


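Figure 11 (a). Diagram with replication nodes (boxes) and indexing links:

Figure 11 (b)

Indexing:

\[ i, j \in \{1 : N\}, \qquad b \in \{1 : B\}, \qquad l \in \{1 : 2\} \]

Level 1:

\[ Z_i(z_i) = z_i\, \omega_i \prod_{b=1}^{B} \tilde z_{(i b)}^{+} + \prod_{b=1}^{B} \tilde z_{(i b)}^{-} \]

Level 2: Each site $b$ has a partition function for the TF monomers and dimers that could bind there, with mutual exclusion between the alternatives.

\[ \Xi_{(i b)}^{\sigma} = \sum_{\{ s_{(i b j)},\, s_{(i b j k)} \,\mid\, \sum_j s_{(i b j)} + \sum_{j k} s_{(i b j k)} \le 1 \}} \Big( \prod_{j} \tilde z_{(i b j)}^{\, s_{(i b j)}} \Big) \Big( \prod_{j k} \tilde z_{(i b j k)}^{\, s_{(i b j k)}} \Big) \]

\[ \Xi_{(i b)}^{\sigma} = 1 + \sum_{j=1}^{J} \omega_{(i b j)}^{\sigma} z_j + \sum_{j, k=1}^{J} \omega_{(i b j k)}^{\sigma} z_j z_k \tag{2.18} \]

Composition is governed by the grammar

\[ \Gamma_{i b j}^{l=1,2} = \Theta(1 \le b \le B)\; \Theta\Big( \sum_{\sigma = \pm 1} \omega_{(i b j)}^{\sigma} + \sum_{\sigma = \pm 1} \sum_{k=1}^{N} \omega_{(i b j k)}^{\sigma} \Big) \]

\[ \Gamma_{i b j r}^{l=1,2} = \Gamma_{i b j}^{l=1,2}\, \delta_{r 1} \]

The composite partition function is then: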


\[
\begin{aligned}
Z_{\mathrm{GRN}}(z_i) &= z_i\, \omega_{(i 0)} \prod_{b=1}^{B} \Xi_{(i b)}^{+} + \prod_{b=1}^{B} \Xi_{(i b)}^{-} \\
&= z_i\, \omega_{(i 0)} \prod_{b=1}^{B} \Big( 1 + \sum_{j=1}^{J} \omega_{(i b j)}^{+} z_j + \sum_{j, k=1}^{J} \omega_{(i b j k)}^{+} z_j z_k \Big) + \prod_{b=1}^{B} \Big( 1 + \sum_{j=1}^{J} \omega_{(i b j)}^{-} z_j + \sum_{j, k=1}^{J} \omega_{(i b j k)}^{-} z_j z_k \Big)
\end{aligned}
\tag{2.19}
\]

where

\[ \omega_{(i b j)} = \exp(-\Delta G_{(i b j)} / k T), \qquad \omega_{(i b j k)} = \exp(-\Delta G_{(i b j k)} / k T), \qquad \text{and} \qquad \omega_{(i 0)} = \exp(-\Delta G_{(i 0)} / k T). \]

We now compute

\[
\begin{aligned}
\text{Activation} = \left. \frac{\partial \log Z_i}{\partial \log z_i} \right|_{z_i = 1}
&= \frac{\omega_i \prod_{b=1}^{B} \Xi_{(i b)}^{+}}{\omega_i \prod_{b=1}^{B} \Xi_{(i b)}^{+} + \prod_{b=1}^{B} \Xi_{(i b)}^{-}} \\
&= \mathrm{Hill}\Big[ \omega_i \prod_{b=1}^{B} \frac{\Xi_{(i b)}^{+}}{\Xi_{(i b)}^{-}},\; 1 \Big]
 = \mathrm{Hill}\Big[ \omega_i \frac{\prod_{b=1}^{B} \Xi_{(i b)}^{+}}{\prod_{b=1}^{B} \Xi_{(i b)}^{-}},\; 1 \Big] \\
&= \sigma\Big[ \log\Big( \omega_i \prod_{b=1}^{B} \Xi_{(i b)}^{+} \Big/ \prod_{b=1}^{B} \Xi_{(i b)}^{-} \Big) \Big]
\end{aligned}
\]

$\Xi$ contains TF-TF energetic interactions through the dimerization term.
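The chain of equalities for the activation can be checked numerically. The sketch below takes the per-site factors $\Xi^{\pm}$ as given numbers (assumed values), differentiates $\log Z$ with respect to $\log z_i$ by a central difference, and compares the result with the Hill and sigmoid forms.

```python
import math

def hill(x, n):
    """Hill function x^n / (1 + x^n)."""
    return x ** n / (1.0 + x ** n)

def sigmoid(y):
    """Logistic function 1 / (1 + exp(-y))."""
    return 1.0 / (1.0 + math.exp(-y))

def activation_numeric(omega_i, xi_plus, xi_minus, eps=1e-6):
    """d log Z / d log z_i at z_i = 1, by central difference in log z_i."""
    p_plus, p_minus = math.prod(xi_plus), math.prod(xi_minus)

    def log_Z(z_i):
        return math.log(z_i * omega_i * p_plus + p_minus)

    return (log_Z(math.exp(eps)) - log_Z(math.exp(-eps))) / (2.0 * eps)

def activation_closed(omega_i, xi_plus, xi_minus):
    """Return the Hill form and the sigmoid-of-log form of the activation."""
    ratio = omega_i * math.prod(xi_plus) / math.prod(xi_minus)
    return hill(ratio, 1), sigmoid(math.log(ratio))

# Assumed per-site factors Xi+ and Xi- for B = 3 binding sites.
xi_plus_example = [1.8, 1.2, 2.5]
xi_minus_example = [1.1, 1.4, 1.05]
numeric = activation_numeric(0.3, xi_plus_example, xi_minus_example)
hill_form, sigmoid_form = activation_closed(0.3, xi_plus_example, xi_minus_example)
```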

Observation. If every binding site is identical, having many possible inputs encoded in sigma, and if $\omega_i = 1$, then this simplifies to

\[ \text{Activation} = \mathrm{Hill}\Big[ \omega_i^{1/B}\, \frac{\Xi^{+}}{\Xi^{-}},\; B \Big] = \sigma\big[ B (\log \Xi^{+} - \log \Xi^{-}) \big], \quad \text{where} \quad \sigma(y) = 1 / (1 + \exp(-y)). \]

If $B$ is so large that we are inside the radius of convergence for the logarithm around 1, which is $O(1)$, but outside the radius of convergence of $\sigma[B y]$, which is $O(1/B)$, then we can expand the log but not the sigma and recover a neural network equation:

\[ \text{Activation} \approx \sigma\Big[ B \Big\{ \sum_{j=1}^{J} \big( T_{(i j)}^{+} - T_{(i j)}^{-} \big) z_j + \sum_{j, k=1}^{J} \big( T_{(i j k)}^{+} - T_{(i j k)}^{-} \big) z_j z_k \Big\} + h_i \Big] \]

where

\[ T_{(i j)}^{\sigma} = \omega_{(i b j)}^{\sigma}, \qquad T_{(i j k)}^{\sigma} = \omega_{(i b j k)}^{\sigma}, \qquad \text{and} \qquad h_i = \log \omega_i. \]

This argument was made in [JTB91].

Observation. Much more generally, if occupancies of binding sites are low but there are many binding sites, then


\[ \text{Activation} = \sigma\Big[ \log \omega_i + \sum_{b=1}^{B} \log \Xi_{(i b)}^{+} - \sum_{b=1}^{B} \log \Xi_{(i b)}^{-} \Big] \]

\[ \log \Xi_{(i b)}^{\sigma} \approx \sum_{j=1}^{J} \omega_{(i b j)}^{\sigma} z_j + \sum_{j, k=1}^{J} \omega_{(i b j k)}^{\sigma} z_j z_k \ll 1 \]

\[ \text{Activation} \approx \sigma\Big[ \sum_{j=1}^{J} \big( T_{(i j)}^{+} - T_{(i j)}^{-} \big) z_j + \sum_{j, k=1}^{J} \big( T_{(i j k)}^{+} - T_{(i j k)}^{-} \big) z_j z_k + h_i \Big] \]

where

\[
\begin{aligned}
T_{(i j)} &= \sum_{b=1}^{B} \omega_{(i b j)} = \sum_{b=1}^{B} \exp(-\Delta G_{(i b j)} / k T), \\
T_{(i j k)} &= \sum_{b=1}^{B} \omega_{(i b j k)} = \sum_{b=1}^{B} \exp(-\Delta G_{(i b j k)} / k T), \quad \text{and} \\
h_i &= \log \omega_i = -\Delta G_i / k T.
\end{aligned}
\tag{2.20}
\]

In more familiar notation, this is the Artificial Neural Network-like model of gene regulation networks [Mjolsness, Sharp and Reinitz JTB 1991]:

Proposition 1: We may approximate the foregoing activation function by

\[ \text{Activation} \approx g\Big( \sum_{j=1}^{J} T_{i j} v_j + \sum_{j, k=1}^{J} T_{i j k} v_j v_k + h_i \Big), \tag{2.21} \]

where

\[
\begin{aligned}
T_{(i j)} &= \sum_{b=1}^{B} \omega_{(i b j)}^{+} - \sum_{b=1}^{B} \omega_{(i b j)}^{-} = \sum_{b=1}^{B} \exp(-\Delta G_{(i b j)}^{+} / k T) - \sum_{b=1}^{B} \exp(-\Delta G_{(i b j)}^{-} / k T), \\
T_{(i j k)} &= \sum_{b=1}^{B} \omega_{(i b j k)}^{+} - \sum_{b=1}^{B} \omega_{(i b j k)}^{-} = \sum_{b=1}^{B} \exp(-\Delta G_{(i b j k)}^{+} / k T) - \sum_{b=1}^{B} \exp(-\Delta G_{(i b j k)}^{-} / k T), \quad \text{and} \\
h_i &= \log \omega_i = -\Delta G_i / k T, \qquad g(x) = 1 / (1 + \exp(-x)),
\end{aligned}
\]

by expanding the logarithm under certain conditions. The conditions of validity for this expansion and approximation are:

(a) large number of binding sites, $B \gg 1$, and
(b) probability of occupancy for each binding site is $\epsilon$ or $1 - \epsilon$, with $\epsilon \ll 1$.

Moreover, all connection matrix entries have micro-level interpretations in terms of binding energies in Equation 20.
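A small numerical experiment illustrates the low-occupancy regime of Proposition 1. The sketch keeps only the monomer terms of $\Xi$ (dimer terms are omitted for brevity), and all weights, inputs and the number of sites are assumed values chosen so that each site is rarely occupied.

```python
import math

def g(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def exact_activation(h_i, w_plus, w_minus, v):
    """g[h_i + sum_b log Xi+_b - sum_b log Xi-_b], monomer terms only."""
    total = h_i
    for wp_b, wm_b in zip(w_plus, w_minus):  # loop over binding sites b
        xi_p = 1.0 + sum(w * vj for w, vj in zip(wp_b, v))
        xi_m = 1.0 + sum(w * vj for w, vj in zip(wm_b, v))
        total += math.log(xi_p) - math.log(xi_m)
    return g(total)

def ann_activation(h_i, w_plus, w_minus, v):
    """g[sum_j T_ij v_j + h_i] with T_ij = sum_b (w+_(ibj) - w-_(ibj))."""
    T = [sum(wp_b[j] for wp_b in w_plus) - sum(wm_b[j] for wm_b in w_minus)
         for j in range(len(v))]
    return g(h_i + sum(Tj * vj for Tj, vj in zip(T, v)))

B_sites = 50                                       # many binding sites
w_plus = [[0.004, 0.001] for _ in range(B_sites)]  # small weights: low occupancy
w_minus = [[0.001, 0.003] for _ in range(B_sites)]
v_inputs = [1.0, 2.0]
exact = exact_activation(-0.1, w_plus, w_minus, v_inputs)
approx = ann_activation(-0.1, w_plus, w_minus, v_inputs)
```

Increasing the weights (so that sites are no longer rarely occupied) makes the two values drift apart, which is one way to explore condition (b).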

Drosophila Anterior-Posterior Axis

Gap Genes [Reinitz Sharp and Mjolsness, Journal Experimental Zoology 1995] :


Even-skipped gene expression [Reinitz and Sharp, Mechanisms of Development, 1995]:

2.4.4 Composition Theorem

Suppose we have a set of molecules indexed by $i$, with state units (both visible and hidden) indexed by $a$ and site units indexed by $b$ for binding sites, and $r$ for docking regions if different from binding sites (otherwise $b = r$). (See Figure 1).


Figure 1. A molecule of species $i$ equipped with three binding sites indexed by $b$ and one allosteric binary variable $s_1$.

The partition function for each species of molecule (indexed by $i$) is

\[ Z_i(z_i) = \sum_{\{s_i \,\mid\, P_i(s_i)\}} \Big( \prod_{a} z_{i a}^{\, s_{i a}} \Big) \prod_{\{\sigma_i \,\mid\, Q_i(\sigma_i)\}} \omega_{\rho(\sigma_i)}(s_i), \tag{2.22} \]

where $s = (v, h, \theta)$ is decoupled into externally visible state units $v$, hidden state units $h$, and binding site units $\theta$. We now split up the visible, hidden and binding units:

\[ Z_i(z_i) = \sum_{\{v_i \,\mid\, \hat P_i(v_i)\}} \Big( \prod_{a \in V(i)} z_{i a}^{\, v_{i a}} \Big) Z_i(v_i, z_i) \]

\[ Z_i(v_i, z_i) = \sum_{\{h_i, \theta_i \,\mid\, P_i(v_i, h_i, \theta_i)\}} \Big( \prod_{a \in H(i)} z_{i a}^{\, h_{i a}} \Big) \Big( \prod_{b \in B(i)} z_{i b}^{\, \theta_{i b}} \Big) \prod_{\{\sigma_i \,\mid\, Q_i(\sigma_i)\}} \omega_{\rho(\sigma_i)}(v_i, h_i, \theta_i) \]

\[ \hat P_i(v_i) = \bigvee_{\{h_i, \theta_i\}} P_i(v_i, h_i, \theta_i) = \mathrm{Satisfiable}\big( P_i(v_i, h_i, \theta_i) \mid v_i \big) \]

Now introduce a context-free grammar $\Gamma$ of possible site occupants and site occupation variables:

\[ \Gamma_{i b v_i, j r} \in \{0, 1\} \]

\[ \tilde s_{i b j r} \in \{0, 1\} \ \text{such that} \ \sum_{j, r} \tilde s_{i b j r} \le \theta_{i b}, \quad \text{i.e.} \quad \mathrm{Mutex}(\{ \tilde s_{i b j r} \mid \forall\, j\, r \}) \]

\[ C_{(i b j r)(a a')} \in \{0, 1\} \ \text{such that} \ \sum_{a} C_{(i b j r)(a a')} \le 1 \ \text{and} \ \sum_{a'} C_{(i b j r)(a a')} \le 1 \]

where $C$ is a constant array that specifies the method of connecting together two molecule representations by identifying some of their visible state variables.

See Figure 2.


Figure 2. (a) Context-free Grammar $\Gamma$ (dotted intermolecular arrows) of possible binding relationships between molecular species $i$ and $j$. (b) Actual binding relationships (solid intermolecular arrows) in a particular configuration.

As in [JTB91], we use the symbol $\Gamma$ to denote multidimensional sparse arrays of 0/1 values that specify the possible transitions of a grammar. Define a grammar by the 0/1-valued arrays $\Gamma$ as shown in Figure 2a and below. The basic $\Gamma$ array is the most detailed:

\[ \Gamma_{i b v_i, j r} = \begin{cases} 1 & \text{if region } r \text{ of molecule } j \text{ can bind to site } b \text{ of molecule } i \text{ in state } v_i \\ 0 & \text{otherwise} \end{cases} \]

and the derived ones are


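\[ \Gamma_{i b, j r} = \Theta\Big( \sum_{\{v_i \,\mid\, \hat P_i(v_i)\}} \Gamma_{i b v_i, j r} \Big) \in \{0, 1\} \]

\[ \Gamma_{i b, j} = \Theta\Big( \sum_{r} \Gamma_{i b, j r} \Big) \in \{0, 1\} \]

\[ \Gamma_{i j} = \Theta\Big( \sum_{b} \Gamma_{i b, j} \Big) = \Theta\Big( \sum_{b\, r} \Gamma_{i b, j r} \Big) \in \{0, 1\} \]

In the important special case of a context-free grammar, there is only one allowed value of $r$ (say $r = 1$) and $\Gamma_{i b, j r} = \Gamma_{i b, j}\, \delta_{r 1}$.

Both the grammar and the predicate $\tilde P$ constrain the actual bindings that may be established (Figure 2b), as represented by $\tilde s_{i b j r}$:

\[ \tilde s_{i b j r} = 1 \ \Rightarrow\ \tilde P_{i b v_i}(s_{i b}) \wedge (\Gamma_{i b v_i, j r} = 1) \]

Composition Theorem for Partition Functions

If $\Gamma$ is a context-free grammar and

\[
\begin{aligned}
Z_i(z_i) &= \sum_{\{s_i \,\mid\, P_i(s_i)\}} \Big( \prod_{a} z_{i a}^{\, s_{i a}} \Big) \prod_{\{\sigma_i \,\mid\, Q_i(\sigma_i)\}} \omega_{\rho(\sigma_i)}(s_i) \\
&= \sum_{\{v_i \,\mid\, \hat P_i(v_i)\}} \Big( \prod_{a \in V(i)} z_{i a}^{\, v_{i a}} \Big) Z_i(v_i, z_i)
\end{aligned}
\tag{2.23}
\]

is a set of partition functions, then the partition function for composite objects with root node $i$ is:

\[
\begin{aligned}
Z_{\mathrm{composite}\ i}(z_i, \{z_j\}) &= \sum_{\{v_i \,\mid\, \hat P_i(v_i)\}} \Big( \prod_{a \in V(i)} z_{i a}^{\, v_{i a}} \Big) Z_i\big( v_i, \{ \tilde z_{i b} \mapsto z_{i b}\, \tilde Z_{i v_i b} \} \big) \\
&= \Big[ \sum_{\{v_i \,\mid\, \hat P_i(v_i)\}} \Big( \prod_{a \in V(i)} z_{i a}^{\, v_{i a}} \Big) Z_i(v_i, \tilde z_{i v_{i b} b}) \Big] \big( \tilde z_{i v_{i b} b} \mapsto z_{i b}\, \tilde Z_{i v_i b} \big)
\end{aligned}
\tag{2.24}
\]

\[ \tilde Z_{i v_i b} = \tilde Z_{i v_i b}\big( \{ \tilde z_{i v_i b j r} \mapsto Z_j(v_j(v_i), z_j) \} \big) \tag{2.25} \]

where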


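\[
\tilde Z_{i v_i b}(\{ z_{i v_i b j r} \}) = \sum_{\{ \tilde s_{i b j r} \in \{0,1\} \,\mid\, \tilde s_{i b j r} \le \Gamma_{i b v_i, j r} \,\wedge\, \sum_{j, r} \tilde s_{i b j r} = 1 \}} \Big( \prod_{\{j r\}} (\tilde z_{i v_i b j r})^{\tilde s_{i b j r}} \Big) \Big( \prod_{\{ \sigma_{i a},\, \tilde\sigma_{i b j r} \,\mid\, \tilde Q_{i b}(\{ \sigma_{i a}, \tilde\sigma_{i b j r} \mid \forall\, a\, j\, r \}) \}} \omega_{\rho(\sigma_i, \tilde\sigma_{i b})}(v_i, \tilde s_{i b}) \Big)
\tag{2.26}
\]

and

\[ v_j(v_i) = \sum_{\{v_j\}} v_j \prod_{a \in V(i)} \delta(v_{i a}, v_{j a'})^{C_{(i b j r)(a a')}} \tag{2.27} \]

and $v_{i b}$ is the minimal subset of $v_i$ components that interact with $\tilde s_{i b}$ through $\omega_{\rho(\sigma_i, \sigma_{i b})}$.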

Polymers

In this case $i \in \{\emptyset, 1\}$, $b \in \{1\}$, $r \in \{1\}$, $a \in \emptyset$. Grammar: $\Gamma_{1111} = 1$; all others $= 0$. $\hat P_{01} = (s_{0111} \le 1)$ and all other $s_{0***} = 0$.

\[ \Gamma_{i b j r} = \delta_{i 1}\, \delta_{b 1}\, \delta_{j 1}\, \delta_{r 1} \]

$P_0 = T$. $Z_1 = 1 + \omega_1 V_1 z_1$. Recursion means $z_1 = Z_1$, so

\[ Z_1 = 1 / (1 - \omega_1 V_1) = \sum_{s=0}^{\infty} (\omega_1 V_1)^s \]

which is a generating function for the number of filled slots starting from any given molecule, which is in turn one less than the length of the polymer. Naturally, a geometric distribution results.
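The recursion $z_1 = Z_1$ can be made concrete by iterating $Z \leftarrow 1 + x Z$ with $x = \omega_1 V_1$, which converges to the geometric series when $x < 1$. The value of $x$ below is an assumption chosen for illustration.

```python
def polymer_Z_fixed_point(x, iterations=200):
    """Iterate Z <- 1 + x*Z (the recursion z_1 = Z_1), starting from Z = 1."""
    Z = 1.0
    for _ in range(iterations):
        Z = 1.0 + x * Z
    return Z

def filled_slot_distribution(x, max_slots):
    """Normalized coefficients of sum_s x^s: P(s filled slots) = (1 - x) x^s."""
    return [(1.0 - x) * x ** s for s in range(max_slots + 1)]

x_example = 0.4  # assumed value of omega_1 * V_1, must be below 1
Z_example = polymer_Z_fixed_point(x_example)
distribution = filled_slot_distribution(x_example, 30)
```

Each iteration adds one more term of the series, so the number of iterations plays the role of a cutoff on polymer length.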

2.4.5 GMWC

Metabolic modeling in [Yang et al., JBC 280(12) 11224-11232, 2006]. Threonine deaminase is modeled by the Monod-Wyman-Changeux model.


A simple hierarchical example is given by the Generalized Monod-Wyman-Changeux model of allosteric enzymes [Najdi et al. 2006].

One way to index a stratified hierarchical model description is using multiple indices: $i = (l, i_l)$ where $l$ is a level number and $i_l$ is a further indexing of the molecules that may participate at level $l$.

Figure 4. GMWC.


Figure 5. GMWC machine learning style diagrammatic representation. (a) Replication specified by index nodes (platelets) and plates. (b) Replication indicated by several copies and ellipses.

Level 1 (top): Global activation/inactivation state, with permanent binding to $n$ independent subunits.

Indexing: $l = 1$; $i_l = 1$ (and can be omitted). Thus $i = (1)$. The state vector is $v_1 = a_1 \in \{+1, -1\}$, which we abbreviate by superscripts $+$, $-$. In this convention $\tilde z_{(1 b)} \equiv \tilde z_{(1\, 1\, b)}$, $b \in \{1, \ldots, n\}$. Then

\[
\begin{aligned}
Z_{(1)} &= z_{(1)}\, Z_{(1)}(+, \{\tilde z_{(1 b)}\}) + Z_{(1)}(-, \{\tilde z_{(1 b)}\}) \\
&= z_{(1)}\, \omega_{(1)} \prod_{b=1}^{n} z_{0 b}\, \tilde z_{(1 b)}^{+} + \prod_{b=1}^{n} z_{0 b}\, \tilde z_{(1 b)}^{-}
\end{aligned}
\]


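which upon composition will become

\[ Z_{(1)} = z_{(1)}\, \omega_{(1)} \prod_{b=1}^{n} \tilde Z_{(1 + b)}(\{ z_{i + b j r} \}) + \prod_{b=1}^{n} \tilde Z_{(1 - b)}(\{ z_{i - b j r} \}) \]

Because the binding never changes at level 1 in this grammar, but only the state of what is bound, we drop the constant $z_{0 b}$ factors as uninformative. Thus

\[ Z_{(1)}(\{\tilde z_{(1 b)}\}) = z_{(1)}\, \omega_{(1)} \prod_{b=1}^{n} \tilde z_{(1 b)}^{+} + \prod_{b=1}^{n} \tilde z_{(1 b)}^{-} \tag{2.28} \]

Level 2: Each subunit is identical, has $v_2(v_1) \equiv v_1$, and has $\sum_{\beta=1}^{3} Q_\beta$ binding sites for substrates ($\beta = 1$), activators ($\beta = 2$), and inhibitors ($\beta = 3$). There is an energetic interaction for each substrate, and a logical constraint on activators (which are only allowed in the $+1$ state) and inhibitors (only allowed in the $-1$ state). This situation is illustrated in Figure 9. Binding sites on a subunit are indexed by $b = (\beta\, q)$, $\beta \in \{1, 2, 3\}$.

Indexing: $l = 2$; $i_l = 1$ (and can be omitted). Thus $i = (2)$.

\[ P_2(a, s_1, s_2, s_3) = (s_{2 q} \Rightarrow a) \wedge (s_{3 q} \Rightarrow \bar a) = (s_{2 q} \Rightarrow a) \wedge (\overline{s_{3 q}} \vee \bar a) \]

From Equation 23 for $i = j = (2)$,

\[
Z_{(2)}(v_2 = 1, z_2) = z_{(2)}\, \omega_{(2)} \Big( \prod_{q=1}^{Q_1} \big( 1 + \omega_{(2 1 q)}^{+} \tilde z_{(2 1 q)}^{+} \big) \Big) \Big( \prod_{q=1}^{Q_2} \big( 1 + \omega_{(2 2 q)}^{+} \tilde z_{(2 2 q)}^{+} \big) \Big) + \Big( \prod_{q=1}^{Q_1} \big( 1 + \omega_{(2 1 q)}^{-} \tilde z_{(2 1 q)}^{-} \big) \Big) \Big( \prod_{q=1}^{Q_3} \big( 1 + \omega_{(2 3 q)}^{-} \tilde z_{(2 3 q)}^{-} \big) \Big)
\tag{2.29}
\]

Composition is governed by the grammar

\[ \Gamma_{i b_1 j}^{l=1,2} = \delta_{i 0}\, \delta_{j 1}\, \Theta(1 \le b_1 \le n) \]

\[ \Gamma_{i b j r}^{l=1,2} = \Gamma_{i b j}^{l=1,2}\, \delta_{r 1} \]

From the general formulae of Equation 26 and Equation 25,

\[ \tilde Z_{(1 b)}(\{ \tilde z_{0 b 1 1} \}) = \tilde z_{0 b 1 1} \mapsto Z_{(2)} \]

\[ \tilde z_{(0 b)} = Z_{(2)} \]

\[ \prod_{b=1}^{n} \tilde z_{(0 b)}^{+} = \big( Z_{(2)} \big)^{n} \]

Note: Levels 1 and 2 can be combined using the Composition Theorem for partition functions:

\[
Z_{(1 \circ 2)}(z_{(1)}, z_{(2)}) = z_{(1)}\, \omega_{(1)}\, z_{(2)}\, \omega_{(2)} \Big( \Big( \prod_{q=1}^{Q_1} \big( 1 + \omega_{(2 1 q)}^{+} \tilde z_{(2 1 q)}^{+} \big) \Big) \Big( \prod_{q=1}^{Q_2} \big( 1 + \omega_{(2 2 q)}^{+} \tilde z_{(2 2 q)}^{+} \big) \Big) \Big)^{n} + \Big( \Big( \prod_{q=1}^{Q_1} \big( 1 + \omega_{(2 1 q)}^{-} \tilde z_{(2 1 q)}^{-} \big) \Big) \Big( \prod_{q=1}^{Q_3} \big( 1 + \omega_{(2 3 q)}^{-} \tilde z_{(2 3 q)}^{-} \big) \Big) \Big)^{n}
\]

Here $z_{(2)}\, \omega_{(2)}$ are redundant with $z_{(1)}\, \omega_{(1)}$ so we drop them, and simplify slightly: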


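\[
Z_{(1 \circ 2)}(z_{(1)}, z_{(2)}) = z_{(1)}\, \omega_{(1)} \prod_{q=1}^{Q_1} \big( 1 + \omega_{(2 1 q)}^{+} \tilde z_{(2 1 q)}^{+} \big)^{n} \prod_{q=1}^{Q_2} \big( 1 + \omega_{(2 2 q)}^{+} \tilde z_{(2 2 q)}^{+} \big)^{n} + \prod_{q=1}^{Q_1} \big( 1 + \omega_{(2 1 q)}^{-} \tilde z_{(2 1 q)}^{-} \big)^{n} \prod_{q=1}^{Q_3} \big( 1 + \omega_{(2 3 q)}^{-} \tilde z_{(2 3 q)}^{-} \big)^{n}
\tag{2.30}
\]

Level 3: Convergence through sharing of fugacity variables, each of which is (for a dilute well-stirred solution in a fixed macroscopic volume) proportional to the number of molecules present and therefore to concentration. No new dependence on $\pm$ state is allowed. There is no competitive binding.

Indexing: $l = 3$, $i_l \in \{ (\beta, q) \mid 1 \le \beta \le 3 \wedge 1 \le q \le Q_\beta \}$, so $i = (3, \beta, q)$.

\[
\omega_{(2 \beta q)}^{\sigma}\, \tilde z_{(3 \beta q)}^{\sigma} \equiv
\begin{cases}
S_{(3 1 q)} / K_{(3\, {-1}\, 1\, q)} & \text{if } \beta = 1 \wedge \sigma = -1 \\
S_{(3 1 q)} / K_{(3\, {+1}\, 1\, q)} & \text{if } \beta = 1 \wedge \sigma = +1 \\
S_{(3 2 q)} / K_{(3\, {+1}\, 2\, q)} & \text{if } \beta = 2 \\
S_{(3 3 q)} / K_{(3\, {-1}\, 2\, q)} & \text{if } \beta = 3
\end{cases}
\equiv
\begin{cases}
s_q & \text{if } \beta = 1 \wedge \sigma = -1 \\
c_q s_q & \text{if } \beta = 1 \wedge \sigma = +1 \\
a_q & \text{if } \beta = 2 \\
i_q & \text{if } \beta = 3
\end{cases}
\]

Composition of all levels:

\[ \Gamma_{i b j}^{l=2,3} = \delta_{i 1} \Big[ \sum_{\beta=1}^{3} \sum_{q=1}^{Q_\beta} \delta_{b\, (\beta, q)}\, \delta_{j\, (l=3, \beta, q)} \Big] \]

\[ \Gamma_{i b j r} = \big( \Gamma_{i b j}^{l=1,2} + \Gamma_{i b j}^{l=2,3} \big)\, \delta_{r 1} \]

Using traditional parameters $L = \omega_{(1)}$ and $c = \omega_{(2 1)}$,

\[
Z_{\mathrm{GMWC}}(z_{(1)}, z_{(2)}) = Z_{(1 \circ 2 \circ 3)}(z_{(1)}, z_{(2)}) = z_{(1)} L \prod_{q=1}^{Q_1} (1 + c_q s_q)^{n} \prod_{q=1}^{Q_2} (1 + a_q)^{n} + \prod_{q=1}^{Q_1} (1 + s_q)^{n} \prod_{q=1}^{Q_3} (1 + i_q)^{n}
\tag{2.31}
\]

from which the rate law for catalysis can be mechanically derived [3], in the simplifying special case that $Q_1 = 1$ and therefore $c_q = c$.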
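Equation 2.31 is straightforward to evaluate directly. The sketch below is one possible encoding, with substrate, activator and inhibitor terms passed as lists; the function names and all parameter values are assumptions. Because $Z$ is linear in $z_{(1)}$, the derivative $\partial \log Z / \partial \log z_{(1)}$ at $z_{(1)} = 1$ is simply the share of the first term, i.e. the probability of the $+$ state.

```python
import math

def Z_gmwc(z, L, n, s, c, a, inh):
    """Equation 2.31. s, c: substrate terms (length Q1); a: activators (Q2); inh: inhibitors (Q3)."""
    plus_state = (z * L
                  * math.prod((1.0 + cq * sq) ** n for cq, sq in zip(c, s))
                  * math.prod((1.0 + aq) ** n for aq in a))
    minus_state = (math.prod((1.0 + sq) ** n for sq in s)
                   * math.prod((1.0 + iq) ** n for iq in inh))
    return plus_state + minus_state

def plus_state_fraction(L, n, s, c, a, inh):
    """d log Z / d log z at z = 1: the weight of the + term divided by Z."""
    Z_one = Z_gmwc(1.0, L, n, s, c, a, inh)
    Z_zero = Z_gmwc(0.0, L, n, s, c, a, inh)  # only the - term survives at z = 0
    return (Z_one - Z_zero) / Z_one

# With Q1 = Q2 = Q3 = 1 this is the classical MWC form (assumed values).
mwc_value = Z_gmwc(1.0, L=0.1, n=4, s=[2.0], c=[10.0], a=[0.5], inh=[1.5])
low_activator = plus_state_fraction(0.1, 4, [2.0], [10.0], [0.5], [1.5])
high_activator = plus_state_fraction(0.1, 4, [2.0], [10.0], [5.0], [1.5])
```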

Clearly factors of $\omega_{(2 2)}$, $\omega_{(2 3)}$ can be absorbed into the normalization of $a_q$, $i_q$, unlike the factor of $\omega_{(2 1)}$, since $s_q$ appears in two different places. This model has been used successfully for metabolic pathways [Najdi et al.]. For $Q_1 = Q_2 = Q_3 = 1$ it reduces to the Monod-Wyman-Changeux [MWC] model as shown for example in [Yang et al.].

\[ Z_{\mathrm{MWC}}(z, s, a, i) = z L (1 + c s)^{n} (1 + a)^{n} + (1 + s)^{n} (1 + i)^{n} \]

Competitive binding serves only to undermine the distinction between activators and inhibitors in the MWC and GMWC models, and is best omitted as in the present model.

Proposed Syntax:

2.4.6 Hierarchical Cooperative Activation (HCA) model

Hierarchical Cooperative Activation (HCA) [Mjolsness 2001] is another statistical mechanical model of transcriptional regulation, including overlapping binding sites together with MWC-like activation of energetically defined nonoverlapping modules (ovals in Figure below) that contain state variables whose values alter binding energies (but not necessarily hard logic of binding) within the module. Overlapping binding sites are limited to pairs. (Figure).


Figure .

or equivalently using machine learning model notations:

Figure .

Level 0 (top): global activation/inactivation

Indexing: $l = 0$; $i_l = 1$ (and can be omitted).

\[ Z_{(0)}(z) = z_{(0)}\, \omega_{(0)}\, \tilde z_{(0)}^{+} + \tilde z_{(0)}^{-} \]

Level 1: Independent binding of modules to heterogeneous sites somewhere in the transcription complex, conditioned on complex activation.

Indexing: $l = 1$; $i_l = m \in \{1, \ldots, I_1 = M\}$.

\[ Z_{(1)}(\tilde z) = \prod_{m=1}^{M} \tilde z_{(1 m)} \]

Level 2: module activation/inactivation

Indexing: $l = 2$; $i_l = 1$ (and can be omitted).

\[ Z_{(2 m)}(\tilde z) = z_{(2 m)}\, \omega_{(2 m)}^{+}\, \tilde z_{(2 m)}^{+} + \tilde z_{(2 m)}^{-} \]

Level 3: Independent competitive binding of modules to heterogeneous sites within the module, conditioned on module activation and global transcription complex activation.

Indexing: $l = 3$; $i_l = b \in \{1, \ldots, I_3 = B(i_1)\}$.

\[ Z_{(3 m)}(\tilde z) = \prod_{b=1}^{B(m)} \tilde z_{(3 m b)} \]

Level 4: Binding site occupancy in terms of (up to pairwise) overlapping sites and dimers. Let $s(b)$ = the (arbitrarily chosen) occluded sites out of any pairs of overlapping sites.


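\[
\tilde Z_{(4 m b)} = 1 + \sum_{j=1}^{J} \omega_{(3 m b j)}\, \tilde z_{(4 m b) j} + \sum_{j=1}^{J} \omega_{(3 m s(b) j)}\, \tilde z_{(4 m b) j} + \sum_{j, k=1}^{J} \omega_{(3 b j k)}\, \tilde z_{(4 m b) j}\, \tilde z_{(4 m b) k} + \sum_{j, k=1}^{J} \omega_{(3 s(b) j k)}\, \tilde z_{(4 m s(b)) j}\, \tilde z_{(4 m s(b)) k}
\]

Level 5: Actual choice of site occupants

\[ \tilde z_{(4 m b) j} = z_j \]

Thus

\[
\begin{aligned}
\Xi_{(m b)}^{\sigma} &= \tilde Z_{(4 m b)}^{\sigma}\big( \tilde z_{(4 m b) j} \mapsto z_j \big) \\
&= 1 + \sum_{j=1}^{J} \omega_{(3 m b j)}^{\sigma} z_j + \sum_{j=1}^{J} \omega_{(3 m s(b) j)}^{\sigma} z_j + \sum_{j, k=1}^{J} \omega_{(3 m b j k)}^{\sigma} z_j z_k + \sum_{j, k=1}^{J} \omega_{(3 m s(b) j k)}^{\sigma} z_j z_k
\end{aligned}
\]

Composition of levels 0-3:

\[
Z_{(0)}(\tilde z, z_{(2 m)} = 1) = z_{(0)}\, \omega_{(0)} \prod_{m=1}^{M} \Big[ \omega_{(2 m)}^{++} \prod_{b=1}^{B(m)} \Xi_{(m b)}^{+} + \prod_{b=1}^{B(m)} \Xi_{(m b)}^{-} \Big] + \prod_{m=1}^{M} \Big[ \omega_{(2 m)}^{-+} \prod_{b=1}^{B(m)} \Xi_{(m b)}^{+} + \prod_{b=1}^{B(m)} \Xi_{(m b)}^{-} \Big]
\tag{2.32}
\]

where

\[ \Xi_{(m b)}^{\sigma} = 1 + \sum_{j=1}^{J} \omega_{(m b j)}^{\sigma} z_j + \sum_{j=1}^{J} \omega_{(m s(b) j)}^{\sigma} z_j + \sum_{j, k=1}^{J} \omega_{(m b j k)}^{\sigma} z_j z_k + \sum_{j, k=1}^{J} \omega_{(m s(b) j k)}^{\sigma} z_j z_k \]

Then

\[
Z_{(0)}(\tilde z, z_{(2 m)} = 1) = \Big\{ z_{(0)}\, \omega_{(0)} \prod_{m=1}^{M} \Big[ \omega_{(2 m)}^{++} \prod_{b=1}^{B(m)} \frac{\Xi_{(m b)}^{+}}{\Xi_{(m b)}^{-}} + 1 \Big] + \prod_{m=1}^{M} \Big[ \omega_{(2 m)}^{-+} \prod_{b=1}^{B(m)} \frac{\Xi_{(m b)}^{+}}{\Xi_{(m b)}^{-}} + 1 \Big] \Big\} \Big( \prod_{m=1}^{M} \prod_{b=1}^{B(m)} \Xi_{(m b)}^{-} \Big)
\tag{2.33}
\]

Recall the definition of a Hill function:

\[ \mathrm{Hill}(x, n) = x^{n} / (1 + x^{n}), \qquad \sigma(y) = 1 / (1 + \exp(-y)) \]

and note, with $x = \exp y$,

\[ \mathrm{Hill}(\exp(y), n) = \exp(n y) / (1 + \exp(n y)) = \sigma(n y) \]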
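The identity between the Hill function and the logistic function is easy to confirm numerically; the sample points below are arbitrary.

```python
import math

def hill(x, n):
    """Hill function x^n / (1 + x^n)."""
    return x ** n / (1.0 + x ** n)

def sigmoid(y):
    """Logistic function 1 / (1 + exp(-y))."""
    return 1.0 / (1.0 + math.exp(-y))

# Tabulate both sides of Hill(exp(y), n) = sigmoid(n * y) at a few points.
identity_checks = [(y, n, hill(math.exp(y), n), sigmoid(n * y))
                   for y in (-2.0, -0.5, 0.0, 1.3)
                   for n in (1, 2, 4)]
```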

Then


\[
\begin{aligned}
\text{Activation} = \left. \frac{\partial \log Z_0}{\partial \log z_{(0)}} \right|_{z_0 = 1}
&= \mathrm{Hill}\left[ \omega_{(0)} \frac{\prod_{m=1}^{M} \Big[ \omega_{(2 m)}^{++} \prod_{b=1}^{B(m)} \Xi_{(m b)}^{+} + \prod_{b=1}^{B(m)} \Xi_{(m b)}^{-} \Big]}{\prod_{m=1}^{M} \Big[ \omega_{(2 m)}^{-+} \prod_{b=1}^{B(m)} \Xi_{(m b)}^{+} + \prod_{b=1}^{B(m)} \Xi_{(m b)}^{-} \Big]},\; 1 \right] \\
&= \mathrm{Hill}\left[ \omega_{(0)} \frac{\prod_{m=1}^{M} \Big[ \omega_{(2 m)}^{++} \prod_{b=1}^{B(m)} \frac{\Xi_{(m b)}^{+}}{\Xi_{(m b)}^{-}} + 1 \Big]}{\prod_{m=1}^{M} \Big[ \omega_{(2 m)}^{-+} \prod_{b=1}^{B(m)} \frac{\Xi_{(m b)}^{+}}{\Xi_{(m b)}^{-}} + 1 \Big]},\; 1 \right]
\end{aligned}
\]

By counting number of indices it is possible to drop the level numbers from $\omega$, and reduce its double sign index to a single sign index.

Proposition 2. For a fully statistical HCA,

\[
\begin{aligned}
\text{Activation} &= \mathrm{Hill}\left[ \omega_{(0)} \prod_{m=1}^{M} \left\{ \frac{\omega_{(m)}^{+} \prod_{b=1}^{B(m)} \Xi_{(m b)}^{+} + \prod_{b=1}^{B(m)} \Xi_{(m b)}^{-}}{\omega_{(m)}^{-} \prod_{b=1}^{B(m)} \Xi_{(m b)}^{+} + \prod_{b=1}^{B(m)} \Xi_{(m b)}^{-}} \right\},\; 1 \right] \\
&= \mathrm{Hill}\left[ \omega_{(0)} \prod_{m=1}^{M} \left\{ \frac{\omega_{(m)}^{+} \prod_{b=1}^{B(m)} \frac{\Xi_{(m b)}^{+}}{\Xi_{(m b)}^{-}} + 1}{\omega_{(m)}^{-} \prod_{b=1}^{B(m)} \frac{\Xi_{(m b)}^{+}}{\Xi_{(m b)}^{-}} + 1} \right\},\; 1 \right] \\
&= \mathrm{Hill}\left[ \omega_{(0)} \prod_{m=1}^{M} \left\{ 1 + \Big( \frac{\omega_{(m)}^{+}}{\omega_{(m)}^{-}} - 1 \Big)\, \mathrm{Hill}\Big[ \omega_{(m)}^{-} \prod_{b=1}^{B(m)} \frac{\Xi_{(m b)}^{+}}{\Xi_{(m b)}^{-}},\; 1 \Big] \right\},\; 1 \right],
\end{aligned}
\]

where

\[ \Xi_{(m b)}^{\sigma} = 1 + \sum_{j=1}^{J} \omega_{(m b j)}^{\sigma} z_j + \sum_{j=1}^{J} \omega_{(m s(b) j)}^{\sigma} z_j + \sum_{j, k=1}^{J} \omega_{(m b j k)}^{\sigma} z_j z_k + \sum_{j, k=1}^{J} \omega_{(m s(b) j k)}^{\sigma} z_j z_k. \]

Note that we now have two Hill functions nontrivially composed with one another (one is nested as an argument inside the other).
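The step from the ratio form to the nested-Hill form rests on the algebraic identity $(\omega^{+} R + 1) / (\omega^{-} R + 1) = 1 + (\omega^{+}/\omega^{-} - 1)\, \mathrm{Hill}[\omega^{-} R, 1]$, with $R_m$ standing for $\prod_b \Xi_{(m b)}^{+} / \Xi_{(m b)}^{-}$. The sketch below checks it numerically for assumed module weights and ratios.

```python
def hill(x, n=1):
    """Hill function x^n / (1 + x^n)."""
    return x ** n / (1.0 + x ** n)

def hca_activation_ratio(w0, w_plus, w_minus, R):
    """Hill[w0 * prod_m (w+_m R_m + 1) / (w-_m R_m + 1), 1]."""
    x = w0
    for wp, wm, r in zip(w_plus, w_minus, R):
        x *= (wp * r + 1.0) / (wm * r + 1.0)
    return hill(x)

def hca_activation_nested(w0, w_plus, w_minus, R):
    """Hill[w0 * prod_m {1 + (w+_m / w-_m - 1) Hill[w-_m R_m, 1]}, 1]."""
    x = w0
    for wp, wm, r in zip(w_plus, w_minus, R):
        x *= 1.0 + (wp / wm - 1.0) * hill(wm * r)
    return hill(x)

# Assumed values for M = 3 modules.
w_plus_example = [4.0, 2.5, 9.0]
w_minus_example = [0.5, 1.5, 0.2]
R_example = [1.7, 0.3, 2.2]
ratio_form = hca_activation_ratio(0.8, w_plus_example, w_minus_example, R_example)
nested_form = hca_activation_nested(0.8, w_plus_example, w_minus_example, R_example)
```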

ANN-like approximation

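Calculate:

\[
\begin{aligned}
\text{Activation} &= \mathrm{Hill}\left[ \omega_{(0)} \prod_{m=1}^{M} \left\{ 1 + \Big( \frac{\omega_{(m)}^{+}}{\omega_{(m)}^{-}} - 1 \Big)\, \mathrm{Hill}\Big[ \omega_{(m)}^{-} \prod_{b=1}^{B(m)} \frac{\Xi_{(m b)}^{+}}{\Xi_{(m b)}^{-}},\; 1 \Big] \right\},\; 1 \right] \\
&= \sigma\left[ \log\left( \omega_{(0)} \prod_{m=1}^{M} \left\{ 1 + \Big( \frac{\omega_{(m)}^{+}}{\omega_{(m)}^{-}} - 1 \Big)\, \sigma\Big[ \log\Big( \omega_{(m)}^{-} \prod_{b=1}^{B(m)} \frac{\Xi_{(m b)}^{+}}{\Xi_{(m b)}^{-}} \Big) \Big] \right\} \right) \right] \\
&= \sigma\left[ \log(\omega_{(0)}) + \sum_{m=1}^{M} \log\left\{ 1 + \Big( \frac{\omega_{(m)}^{+}}{\omega_{(m)}^{-}} - 1 \Big)\, \sigma\Big[ \log\Big( \omega_{(m)}^{-} \prod_{b=1}^{B(m)} \frac{\Xi_{(m b)}^{+}}{\Xi_{(m b)}^{-}} \Big) \Big] \right\} \right] \\
&= \sigma\left[ \log(\omega_{(0)}) + \sum_{m=1}^{M} \log\left\{ 1 + \Big( \frac{\omega_{(m)}^{+}}{\omega_{(m)}^{-}} - 1 \Big)\, \sigma\Big[ \log(\omega_{(m)}^{-}) + \sum_{b=1}^{B(m)} \log\Big( \frac{\Xi_{(m b)}^{+}}{\Xi_{(m b)}^{-}} \Big) \Big] \right\} \right] \\
&= \sigma\left[ h_{(0)} + \sum_{m=1}^{M} \log\left\{ 1 + \big( \exp( h_{(m)}^{+} - h_{(m)}^{-} ) - 1 \big)\, \sigma\Big[ h_{(m)}^{-} + \sum_{b=1}^{B(m)} \big( \log \Xi_{(m b)}^{+} - \log \Xi_{(m b)}^{-} \big) \Big] \right\} \right] \\
&= g\left[ h_{(0)} + \sum_{m=1}^{M} \tilde g\Big[ h_{(m)}^{-} + \sum_{b=1}^{B(m)} \big( \log \Xi_{(m b)}^{+} - \log \Xi_{(m b)}^{-} \big);\; h_{(m)}^{+} - h_{(m)}^{-} \Big] \right]
\end{aligned}
\]

Thus

\[ \text{Activation} = g\left[ h_{(0)} + \sum_{m=1}^{M} \tilde g\Big[ h_{(m)}^{-} + \sum_{b=1}^{B(m)} \big( \log \Xi_{(m b)}^{+} - \log \Xi_{(m b)}^{-} \big);\; h_{(m)}^{+} - h_{(m)}^{-} \Big] \right] \]

where

\[ \tilde g(x; h) \equiv \log[ 1 + (\exp(h) - 1)\, g(x) ] \tag{2.34} \]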

Using convexity,

\[ \tilde g(x; h) = \log[ \exp(h)\, g(x) + (1 - g(x)) ] \ge h\, g(x) + (1 - g(x)) = 1 + (h - 1)\, g(x) \]

If we minimize the approximation error

\[ \frac{1}{2} \int_0^1 \Big( h \lambda + (1 - \lambda) + c\, \lambda (1 - \lambda) - \log\big( e^{h} \lambda + (1 - \lambda) \big) \Big)^2 \, d\lambda \tag{2.35} \]

with respect to $c$ for any given $h$, we find

\[ c(h) = \frac{5}{6}\; \frac{ \big( 8 - 36\, e^{h} + 36\, e^{2h} - 8\, e^{3h} \big) + 3 h \big( 1 - 3 e^{h} - 3 e^{2h} + e^{3h} \big) }{ (-1 + e^{h})^{3} } \tag{2.36} \]
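The optimal $c$ can be cross-checked numerically. Since the error in Equation 35 is quadratic in $c$, the minimizer is $c = -\langle f, q \rangle / \langle q, q \rangle$ with $q(\lambda) = \lambda (1 - \lambda)$, $\langle q, q \rangle = 1/30$, and $f$ the part of the integrand that does not involve $c$. The sketch below evaluates this by Simpson quadrature and compares it with a closed form; the closed form in `c_closed` was worked out directly from that least-squares condition and should be treated as an assumption that the quadrature is there to test.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * step)
    return total * step / 3.0

def residual(lam, h):
    """Integrand of Equation 2.35 without the c term."""
    return h * lam + (1.0 - lam) - math.log(math.exp(h) * lam + (1.0 - lam))

def c_numeric(h):
    """Least-squares optimum c = -<residual, q> / <q, q>, with <q, q> = 1/30."""
    inner = simpson(lambda lam: residual(lam, h) * lam * (1.0 - lam), 0.0, 1.0)
    return -30.0 * inner

def c_closed(h):
    """Closed form derived from the same condition (ill-conditioned near h = 0)."""
    E = math.exp(h)
    numerator = (8 - 36 * E + 36 * E ** 2 - 8 * E ** 3
                 + 3 * h * (1 - 3 * E - 3 * E ** 2 + E ** 3))
    return (5.0 / 6.0) * numerator / (E - 1.0) ** 3

def approximation_error(c, h):
    """Value of the objective in Equation 2.35 for a given c."""
    return 0.5 * simpson(
        lambda lam: (residual(lam, h) + c * lam * (1.0 - lam)) ** 2, 0.0, 1.0)
```

Near $h = 0$ the closed form suffers from cancellation, so the quadrature version is the safer one to use there.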

Thus we may approximate

\[ \tilde g(x; h) \approx 1 + [ h - 1 + c(h) ]\, g(x) - c(h)\, g(x)^2 \]

which is exact at the extreme values $g(x) = 0$ or $= 1$. This formula may be seen as a small neural network implementing $\tilde g(x; h)$ in terms of a previous layer that calculated $g(x)$.

If $B$ is so large that we are inside the radius of convergence for the logarithm around 1, which is $O(1)$, but outside the radius of convergence of $\sigma[B y]$, which is $O(1/B)$, then we can expand the log but not the sigma and recover a neural network equation:

\[ \log \Xi_{(m b)}^{\sigma} \approx \sum_{j=1}^{J} \omega_{(m b j)}^{\sigma} z_j + \sum_{j=1}^{J} \omega_{(m s(b) j)}^{\sigma} z_j + \sum_{j, k=1}^{J} \omega_{(m b j k)}^{\sigma} z_j z_k + \sum_{j, k=1}^{J} \omega_{(m s(b) j k)}^{\sigma} z_j z_k \]

\[ T_{(m j)} = \sum_{b=1}^{B(m)} \omega_{(m b j)} = \sum_{b=1}^{B(m)} \omega_{(3 m b j)} \quad \text{and} \quad T_{(m j k)} = \sum_{b=1}^{B(m)} \omega_{(m b j k)} = \sum_{b=1}^{B(m)} \omega_{(3 m b j k)} \]


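\[ \sum_{b=1}^{B(m)} \log \Xi_{(m b)} \approx \sum_{j=1}^{J} T_{(m j)} z_j + \sum_{j, k=1}^{J} T_{(m j k)} z_j z_k \]

\[
\sum_{b=1}^{B(m)} \log \Xi_{(m b)}^{+} - \sum_{b=1}^{B(m)} \log \Xi_{(m b)}^{-} \approx \sum_{j=1}^{J} \big( T_{(m j)}^{+} - T_{(m j)}^{-} \big) z_j + \sum_{j, k=1}^{J} \big( T_{(m j k)}^{+} - T_{(m j k)}^{-} \big) z_j z_k = \sum_{j=1}^{J} T_{(m j)} z_j + \sum_{j, k=1}^{J} T_{(m j k)} z_j z_k
\]

This gives a series of approximations summarized in Proposition 3.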

Proposition 3. For the hierarchical (HCA) model of Equation 32 the activation function is

\[ \text{Activation} = g\left( h_{(0)} + \sum_{m=1}^{M} \tilde g\Big( h_{(m)}^{-} + \sum_{b=1}^{B(m)} \big( \log \Xi_{(m b)}^{+} - \log \Xi_{(m b)}^{-} \big);\; h_{(m)}^{+} - h_{(m)}^{-} \Big) \right) \]

where $\tilde g(x; h) \equiv \log[ 1 + (\exp(h) - 1)\, g(x) ]$ and $\Xi_{(m b)}^{\sigma}$ is given by Equation 32. It has the following approximation:

\[ \text{Activation} \approx g\left( h_{(0)} + \sum_{m=1}^{M} \tilde g\Big( h_{(m)}^{-} + \sum_{j=1}^{J} T_{(m j)} z_j + \sum_{j, k=1}^{J} T_{(m j k)} z_j z_k;\; h_{(m)}^{+} - h_{(m)}^{-} \Big) \right) \tag{2.37} \]

as well as a second less accurate approximation (a two-layer sum-product neural network with complicated weight formula)

\[
\text{Activation} \approx g\left( h_{(0)} + M + \sum_{m=1}^{M} \Big( h_{(m)}^{+} - h_{(m)}^{-} - 1 + c\big( h_{(m)}^{+} - h_{(m)}^{-} \big) \Big) v_{(m)} - \sum_{m=1}^{M} c\big( h_{(m)}^{+} - h_{(m)}^{-} \big)\, v_{(m)}^{2} \right), \quad \text{where} \quad v_{(m)} = g\Big( h_{(m)}^{-} + \sum_{j=1}^{J} T_{(m j)} z_j + \sum_{j, k=1}^{J} T_{(m j k)} z_j z_k \Big)
\tag{2.38}
\]

where $c(h)$ is given by Equation 36, and a third still less accurate approximation which takes the form of a classic two-layer neural network:

\[
\text{Activation} \approx g\left( h_{(0)} + M + \sum_{m=1}^{M} \big( h_{(m)}^{+} - h_{(m)}^{-} - 1 \big)\, v_{(m)} \right), \qquad v_{(m)} = g\Big( h_{(m)}^{-} + \sum_{j=1}^{J} T_{(m j)} z_j + \sum_{j, k=1}^{J} T_{(m j k)} z_j z_k \Big).
\tag{2.39}
\]

In these expressions the constants are defined by


\[ T_{(m j)} = T_{(m j)}^{+} - T_{(m j)}^{-} \quad \text{and} \quad T_{(m j k)} = T_{(m j k)}^{+} - T_{(m j k)}^{-} \]

\[
\begin{aligned}
T_{(m j)} &= \sum_{b=1}^{B(m)} \omega_{(m b j)} = \sum_{b=1}^{B(m)} \exp(-\Delta G_{(m b j)} / k T), \\
T_{(m j k)} &= \sum_{b=1}^{B(m)} \omega_{(m b j k)} = \sum_{b=1}^{B(m)} \exp(-\Delta G_{(m b j k)} / k T), \quad \text{and} \\
h_{(m)}^{+} &= \log \omega_{(m)}^{+} = -\Delta G_{(m)} / k T, \qquad h_{(0)} = \log \omega_{(0)} = -\Delta G_{(0)} / k T
\end{aligned}
\]

and $g(x) = 1 / (1 + \exp(-x))$.

The conditions of validity for the first approximation are:

(a) large number of binding sites within each module, $B(m) \gg 1$, and
(b) probability of occupancy for each binding site is $\epsilon$ or $1 - \epsilon$, with $\epsilon \ll 1$.

In the $1 - \epsilon$ case the expressions for $T$ in terms of $\omega$ are altered due to refactoring of $\Xi$. The conditions of validity of the second and third approximations depend only on Equation 35 and Equation 36.

Example: HCA model for Dorsal, Twist, and Snail in Drosophila dorsoventral axis. This model softens the hard logic of the models investigated by Zinzen et al. [2006], which may be diagrammed as follows:


2.4.7 Steady state and equilibrium thermodynamics

In the standard dynamical theory of such systems we can index all the states of the system with a large compound index $I$ and then write a Master Equation for the probability $p_I(t)$ of each configuration:

\[ \frac{d p_I(t)}{d t} = \sum_{J} K_{I J}\, p_J(t) - \sum_{J} K_{J I}\, p_I(t) = \sum_{J} \big[ K_{I J}\, p_J(t) - K_{J I}\, p_I(t) \big] \tag{2.40} \]

In a later section on stochastic processes, $K$ will receive the name $\hat H$ instead, and $\sum_J K_{J I}$ will be $D_{I I}$ in a diagonal matrix. Then

\[ \frac{d p(t)}{d t} = H \cdot p(t), \quad \text{where} \quad H = \hat H - D. \]

The probability rates $K_{I J}$ are nonnegative and determined by the rate functions $r$. The time derivative $d p / d t$ is zero for any steady state. A brute force solution to the steady state problem is to find a basis for the null space of $K - \mathrm{diag}(1 \cdot K)$, that is, the vectors $p$ with $H \cdot p = 0$, and formulate the nonnegativity and conservation of probability as a linear programming (LP) problem in this space. Fortunately there is extra structure to use in this problem.
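To make the Master Equation concrete, the sketch below builds $H = K - \mathrm{diag}(1 \cdot K)$ for a small, made-up three-state rate matrix and relaxes an initial distribution to steady state by forward-Euler integration. The rate values, step size and number of steps are assumptions; for larger systems a linear-algebra null-space routine would be the more natural tool.

```python
def generator(K):
    """H = K - diag(1.K). K[I][J] is the rate J -> I; columns of H sum to zero."""
    n = len(K)
    H = [[K[I][J] if I != J else 0.0 for J in range(n)] for I in range(n)]
    for I in range(n):
        H[I][I] = -sum(K[J][I] for J in range(n) if J != I)
    return H

def apply(H, p):
    """Matrix-vector product H . p."""
    return [sum(Hij * pj for Hij, pj in zip(row, p)) for row in H]

def relax(H, p, dt=1e-3, steps=20000):
    """Forward-Euler integration of dp/dt = H . p."""
    for _ in range(steps):
        dp = apply(H, p)
        p = [pi + dt * di for pi, di in zip(p, dp)]
    return p

K_example = [[0.0, 2.0, 1.0],
             [1.0, 0.0, 3.0],
             [0.5, 1.0, 0.0]]          # assumed rates
H_example = generator(K_example)
p_steady = relax(H_example, [1.0, 0.0, 0.0])
```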

The quantity $K_{I J}\, p_J(t)$ is the flow of probability from node $J$ to $I$, and the steady state condition $d p / d t = 0$ reflects conservation of probability at each node. Any steady state solution flow obeying such a node conservation law can be represented as a weighted sum of loop flows, each of which trivially conserves probability. One can define a loop vector space for the state transition graph, such that each loop is a linear combination with integer coefficients of basis loops. A steady state solution flow can be decomposed in the corresponding manner.

In equilibrium as opposed to steady state, detailed balance makes the further requirement that each term in the foregoing sum is zero:

\[ K_{I J} / K_{J I} = p_I / p_J. \]

Around any loops these ratios must be consistent:


\[ \frac{K_{I J}}{K_{J I}}\, \frac{K_{J L}}{K_{L J}}\, \frac{K_{L I}}{K_{I L}} = 1, \quad \text{etc.} \]

\[ \prod_{\text{transitions } (I, J)\, \in\, \text{directed loop}} \frac{K_{I J}}{K_{J I}} = \prod_{\text{transitions } (I, J)\, \in\, \text{directed loop}} \frac{p_I}{p_J} = 1 \]

This property only needs to be verified for a set of basis loops; it then follows for all loops.

We may further consider the situation where some reactions are in equilibrium through detailed balance, and others not. For a steady state network, we seek a maximal equilibrium-like subnetwork that satisfies detailed balance.
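The loop condition can be tested directly on a rate matrix. In this sketch the equilibrium rates are generated from assumed free energies so that detailed balance holds by construction, and a second matrix with one rate doubled shows how the loop product departs from 1.

```python
import math

def loop_ratio(K, loop):
    """Product of K[I][J] / K[J][I] over consecutive transitions of a directed loop."""
    ratio = 1.0
    for I, J in zip(loop, loop[1:] + loop[:1]):
        ratio *= K[I][J] / K[J][I]
    return ratio

G_example = [0.0, 1.0, -0.5]   # assumed free energies in units of kT
k_symmetric = 2.0              # assumed symmetric rate prefactor
K_equilibrium = [[0.0 if I == J else
                  k_symmetric * math.exp(-(G_example[I] - G_example[J]) / 2.0)
                  for J in range(3)] for I in range(3)]

K_driven = [row[:] for row in K_equilibrium]
K_driven[1][0] *= 2.0          # break detailed balance on one transition
```

Here `loop_ratio(K_equilibrium, [0, 1, 2])` evaluates the product around the single basis loop of the three-state graph.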

One way to do this is as follows. Restrict $\hat K$ to include only those state transitions of $K$ that occur in pairs satisfying detailed balance, and for which neither $K_{I J}$ is zero. For any transitions not in detailed balance, take the smaller of the two opposite flows to define both sides of $\hat K$ through detailed balance, and the remaining one-sided flow to be outside of $\hat K$ (say in $\tilde K$). Now

\[ K = \hat K(p) + \tilde K(p), \qquad \hat K_{I J}(p) = \min\big( K_{I J},\; K_{J I}\, (p_I / p_J) \big) \tag{2.41} \]

where $\hat K$ satisfies detailed balance, $\tilde K$ is zero- or one-sided for every transition, and both $\hat K$ and $\tilde K$ satisfy the fixed-point equation for $p_J$. This decomposition of $K$ actually depends on $p$ (as $p$ varies, the decomposition of $K$ can change continuously and piecewise linearly), but the dependence of the structures of $\hat K$ and $\tilde K$ is only through very broad equivalence classes of $p$ state vectors. This definition of $\hat K$ may or may not disconnect the graph of states determined by $K$. If it does, consider next only a connected component of the graph of states.

Define

$$w_{IJ} = \hat K_{IJ} / \hat K_{JI}$$

$$k_{IJ} = \sqrt{\hat K_{IJ} \times \hat K_{JI}}$$

Then

$$\hat K_{IJ} = k_{(IJ)} \sqrt{w_{IJ}}, \quad \text{where } k_{(IJ)} = k_{(JI)} \text{ and } w_{IJ} = 1 / w_{JI} .$$

Of the three quantities $w_{IJ}$, $w_{JI}$, and $k_{IJ}$, only $w_{JI}$ goes to infinity for a one-sided transition $\hat K_{IJ} = 0$, $\hat K_{JI} \neq 0$. If both directions are zero, define $w_{IJ} = 1$. Let $s_{IJ} \in \{0, 1\}$ denote the sparsity structure of $\hat K_{IJ}$.

Then around any loop

$$w_{IJ}\, w_{JK}\, w_{KI} = 1,$$

or in general

$$\prod_{\text{transitions } (I,J)\, \in\, \text{directed loop}} w_{IJ} = 1 \tag{2.42}$$

whence (for a connected component of $\hat K$) we can, by summing along every path starting from an arbitrary state, construct a potential function $\beta G_I$ for which $\Delta G = 0$ around every loop:

$$w_{IJ} = \exp[-\beta\, \Delta_{IJ} G] = \exp[-\beta (G_I - G_J)] = w_I / w_J \tag{2.43}$$

In the absence of $\tilde K$, $\beta G$ is the Gibbs free energy, scaled by the inverse temperature $\beta$. It may have a different additive constant in $G$ for different connected components of the state graph. We take $G_\varnothing$ to be zero for the completely unbound state.

If we can calculate the Gibbs free energy of the system in the special case of only one connected component, we can compute

$$p_I / p_J = \exp[-\beta (G_I - G_J)] \tag{2.44}$$

and thus

$$p_I = \frac{p_I}{\sum_J p_J} = \frac{\exp[-\beta G_I]}{\sum_J \exp[-\beta G_J]}$$
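The path-summation construction of the potential and the normalization in (2.44) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 3-state rate matrix `K` is hypothetical and was chosen so that the loop condition (2.42) holds, the function names are invented for this example, and `K[i][j]` follows the convention above that $K_{IJ}$ is the rate from state $J$ to state $I$.

```python
import math
from collections import deque

def free_energies(K, root=0):
    """Return beta*G for every state reachable from `root`.

    Assumes detailed balance, so w_IJ = K_IJ / K_JI = exp(-(bG_I - bG_J))
    and the accumulated potential does not depend on the path taken.
    """
    n = len(K)
    bG = {root: 0.0}                 # additive constant fixed at the root state
    queue = deque([root])
    while queue:
        j = queue.popleft()
        for i in range(n):
            if i not in bG and K[i][j] > 0 and K[j][i] > 0:
                bG[i] = bG[j] - math.log(K[i][j] / K[j][i])
                queue.append(i)
    return bG

def boltzmann(bG):
    """Normalize exp(-beta*G) into probabilities, as in equation (2.44)."""
    weights = {i: math.exp(-g) for i, g in bG.items()}
    z = sum(weights.values())
    return {i: w / z for i, w in weights.items()}

# Hypothetical rates: w_10 * w_21 * w_02 = (1/2) * 4 * (1/2) = 1 around the loop.
K = [[0.0, 2.0, 1.0],
     [1.0, 0.0, 1.0],
     [2.0, 4.0, 0.0]]
p = boltzmann(free_energies(K))
```

For this example every pair of states then satisfies $K_{IJ}\, p_J = K_{JI}\, p_I$, which is the detailed balance condition the construction started from.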

If there are several components connected only through $\tilde K$, one can partition probability among them by solving the probability flow problem on a coarse-scale graph connected by links of $\tilde K$ alone. If $\tilde K$ connects nodes within one component (with a sum of nonopposing loop currents, of course), conservation of flow in $\tilde K$ can be taken to further constrain $G$ - it biases the assignment of effective free energies vs. equilibrium (Gibbs) free energies to states, so that $\tilde K$ can be expressed as a sum of the given loop currents. A further wrinkle is that the definition of $\tilde K$ actually depends on $p$ - as $p$ varies, the decomposition of $K$ can change continuously.

Chapter 3

Dynamics

3.1 Dynamics in biology

3.1.1 Development

Serial pattern formation in 1, 2, 3D => serial symmetry breaking

Rich internal space/cell
Parameterized by large networks

=> ODEs, SDEs, SPs
Variable-Structure System at network & cellular scales
Cell coupling: mechanical & signaling

=> {ODEs}, PDEs, SPDEs
Local control of tissue growth emerges

Dynamic geometry
Teleological, due to selection

Calls for control theory, computer science

3.2 Differential Equations

Locality in time and space + continuum limit ~ => differential equations (DEs)

3.2.1 DE Types and relationships

Ordinary differential equations (ODE), differential-algebraic equations (DAE), partial differential equations (PDE), stochastic differential equations (SDE), differential-delay equations (DDE), stochastic partial differential equations (SPDE).

3.2.2 Behaviors and model reduction

verbs (level $L + 1/2$) --> nouns (level $L$), e.g. atoms. Compute attractors to find emergent objects and their dynamics.

3.2.3 Translating small networks to dynamical systems

Composition rules: E.g. translate the union of a set of reactions. [Cellerator]

For sufficiently small time steps, multiple processes occurring in parallel have simple ways of “adding up” their effects. This is reflected in adding the RHS of ODEs for mean values or for probability distributions (Master equation stochastic approach). Thus, locality in time enables a simple mapping from graph composition to composition of dynamical systems.
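A minimal sketch of this composition rule for mass-action ODEs is given below. It is not Cellerator's implementation; the reaction encoding `(reactants, products, k)`, the function name `rhs`, and the numerical values are assumptions made for illustration. The point it shows is that the right-hand side for a union of reactions is the sum of the right-hand sides of the individual reactions.

```python
from collections import defaultdict

def rhs(reactions, conc):
    """Sum the mass-action contribution of every reaction to d[conc]/dt."""
    deriv = defaultdict(float)
    for reactants, products, k in reactions:
        flux = k
        for species in reactants:
            flux *= conc[species]          # mass-action rate k * product of reactants
        for species in reactants:
            deriv[species] -= flux
        for species in products:
            deriv[species] += flux
    return {s: deriv[s] for s in conc}

# Hypothetical rate constants for a substrate-enzyme-product network.
binding = [(("S", "E"), ("C",), 1.0)]
unbinding = [(("C",), ("S", "E"), 0.5)]
catalysis = [(("C",), ("P", "E"), 0.1)]
conc = {"S": 2.0, "E": 1.0, "C": 0.5, "P": 0.0}

total = rhs(binding + unbinding + catalysis, conc)        # union of the rule sets
parts = [rhs(r, conc) for r in (binding, unbinding, catalysis)]
```

Here `total[s]` agrees with the sum of `part[s]` over the three parts for every species `s`, up to floating-point rounding.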

Figure: Commutative diagram for composition of networks and dynamical models. Relate to “semantics” of graph notations. Union of rule multisets (top horizontal multi-arrow, decorated with $\cup$) maps to sum of generators (bottom horizontal multi-arrow, should be decorated with $\sum$).

Reaction decomposition.

$$A_1 + A_2 + \ldots + A_n \to B_1 + B_2 + \ldots + B_m \quad \text{with } k$$

or, allowing for duplications,

$$\{m_a A_a \mid 1 \le a \le A_{\max}\} \xrightarrow{\;k\;} \{n_b A_b \mid 1 \le b \le A_{\max}\} \tag{3.1}$$

The reactions in this equation can be decomposed into elementary processes of the form $\{A + B \to C,\; C \to A + B\}$ if $A$, $B$, or $C$ are allowed to be null ($\varnothing$) reactants as necessary. This kind of decomposition may be useful in software, network analysis, and in inventing simulation algorithms.

3.2.4 Example small networks

Substrate-enzyme-product

$S + E \to C$ with $k_f$
$C \to S + E$ with $k_r$
$C \to P + E$ with $k_{cat}$

Mass action kinetics.

If enzyme $E$ is small, and $k_f S,\ k_r \gg k_{cat}$: $n = 1$ Hill function. Then we get the Michaelis-Menten dynamical model.

This is equivalent to replacing a single binding site (fully solved below) with $E$ binding sites ($E$ a small number).

A more detailed derivation of the Michaelis-Menten form for the S-E-P model is given in [Ermentrout] or [Murray 1989]. The general idea is to systematically separate fast and slow time scales in a perturbation expansion. This “singular perturbation theory” method applies very widely in dynamical systems given by ODEs [Kevorkian and Cole].
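The reduction can be checked numerically with a short Python sketch. Everything specific here is an assumption chosen for illustration: the rate constants (picked so that $k_f S, k_r \gg k_{cat}$ and $E$ is small), the fixed-step RK4 integrator, and the reduced rate law $dS/dt = -k_{cat} E_0 S / (k_r/k_f + S)$, which is the rapid-equilibrium form of the $n = 1$ Hill function. It is a sanity check, not a derivation.

```python
def rk4(f, y, h, steps):
    """Integrate dy/dt = f(y) with a fixed-step fourth-order Runge-Kutta scheme."""
    for _ in range(steps):
        k1 = f(y)
        k2 = f([a + 0.5 * h * b for a, b in zip(y, k1)])
        k3 = f([a + 0.5 * h * b for a, b in zip(y, k2)])
        k4 = f([a + h * b for a, b in zip(y, k3)])
        y = [a + h * (p + 2 * q + 2 * r + s) / 6.0
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
    return y

KF, KR, KCAT = 100.0, 100.0, 1.0   # hypothetical: fast binding, slow catalysis
S0, E0 = 1.0, 0.01                 # hypothetical: enzyme scarce relative to substrate

def full(y):
    """Mass-action ODEs for S + E <-> C -> P + E; y = [S, E, C, P]."""
    s, e, c, _ = y
    bind, unbind, cat = KF * s * e, KR * c, KCAT * c
    return [-bind + unbind, -bind + unbind + cat, bind - unbind - cat, cat]

def reduced(y):
    """Michaelis-Menten style rate law for the substrate alone; y = [S]."""
    (s,) = y
    return [-KCAT * E0 * s / (KR / KF + s)]

s_full, e_full, c_full, p_full = rk4(full, [S0, E0, 0.0, 0.0], 0.001, 10000)
(s_red,) = rk4(reduced, [S0], 0.001, 10000)
p_red = S0 - s_red                 # product formed in the reduced model
```

With these parameter choices the product formed by the two models agrees to within a few percent, and the full model conserves total substrate ($S + C + P$) and total enzyme ($E + C$).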

Activated TF example

Input phosphorylates TF which activates it for transcriptional regulation:

;;TF V TFpP

,?, 8P V «<, 8Pp # P, hill@vmax -> 1.5, nhill -> 3, khalf -> 2.7D<?

3.2.5 Emergent properties of small networks: Feedback loops & attractors

Small networks are capable of behaviors that can be useful in broad categories of cell function such as molecular and structural synthesis, information processing, and their combination in replication.

In metabolic networks for molecular synthesis, for example, carbon backbones must typically be built one carbon atom at a time - therefore a sequence of different reaction steps is required, each catalysed by a different enzyme. The cell must at least approximately optimize its energy and material investment in each possible product, such as the 20 amino acids required in proteins, as well as any intermediate forms. Therefore every synthetic pathway must have additional regulatory feedback links. Even small networks can provide the required feedback inhibition to regulate the level of a synthesized product.

Other information processing functions are available in small networks as well. A positive feedback loop can provide a bistable switch capable of storing one bit of information. Other networks can detect the value of a small signal amidst molecular noise, and amplify it either linearly (if the relevant information is quantitative and on a linear scale) or nonlinearly (if for example a single bit of information is being detected such as the presence or absence of an important stimulus). Temporal information processing strategies can be implemented with oscillation and other intrinsically temporal behaviors. Some of the behavioral possibilities available with small enzyme networks are reviewed in [Sniffers and Buzzers paper: Tyson, Chen, Novak, Current Opinion in Cell Biology 2003, 15:221–231].

The important role of noise will be modeled using the operator algebra framework in Section 3.3.3 below. Mostly noise is something to overcome, but it can also be a generator of useful diversity.

These capabilities and perhaps others that we don’t know about are emergent and available in the space of possible small networks. We have the best chance of cataloging the range of useful information processing behaviors for small networks, where there are relatively few possible graph structures to survey, before the onset of a combinatorial explosion of larger network structures. But which behaviors, if any, actually occur in a living system is determined by the slow-timescale population dynamics of Darwinian evolution. Ultimately no biological model will be complete without an evolutionary component, though modeling that is often beyond our current capabilities.


generic TF feedback example

Basic behaviors: fixed points, oscillations, and amplification

Bistable fixed points

E.g. a network with 2 Hill functions, as derived above for transcriptional regulation or other binding site dependent regulation:

alternative regulatory mechanisms: transcriptional, protein modification (e.g. phosphorylation), complex assembly/disassembly, regulated protein degradation. Different Cellerator notations and reaction translations for each of these.

Fixed points:

Note that the number of fixed points depends on $n$ and, for $n = 1$, on the slopes at the origin.

Feedback motifs and their behaviors

Bistability. Eg. competing n = 2 Hill functions as above, with external input.

{x == e/(e + y^2), y == s (e/(e + x^2))}

soln4 = Solve[{x == e/(e + y^2), y == s (e/(e + x^2))} /. {e -> .1}, {x, y}];
tb = Table[Map[(#[[1]] -> #[[2]]) &, N[soln4 /. {s -> ss}], {2}], {ss, 0.03, 3, .03}];
tb2 = {x, y} /. tb;
vals = Flatten[Table[{i*.01, tb2[[i, j, 2]]}, {i, 1, 100}, {j, 1, Length[tb[[1]]]}], 1];
vals1 = Select[vals, (Abs[Im[#[[2]]]] < .0001) &];
vals2 = Select[vals, (Abs[Im[#[[2]]]] >= .0001) &];
l1 = ListPlot[Re[vals1], PlotStyle -> [email protected];
l2 = ListPlot[Re[vals2], PlotStyle -> [email protected]; Show[l1, l2]

Figure. Note the two pitchfork bifurcations. Blue curves are pure real-valued; red curves have a nonzero imaginary component which is not plotted. The main blue curve creates the conditions for hysteresis: some values of the input (horizontal axis) have three possible values of the output (vertical axis), two of which are stable. Which one is chosen depends on the history of the input variable.
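The real-valued fixed points of the competing $n = 2$ Hill system can also be located numerically without a symbolic solver. The Python sketch below is an illustration only: the function names, the grid resolution, and the use of bisection are choices made for this example. It substitutes $y = s\,e/(e + x^2)$ into $x = e/(e + y^2)$ and looks for sign changes of the residual on $0 \le x \le 1$, which contains every real fixed point when $e, s > 0$.

```python
def y_of_x(x, e, s):
    """Second equation of the system: y as a function of x."""
    return s * e / (e + x * x)

def residual(x, e, s):
    """Zero exactly when (x, y_of_x(x)) solves both equations."""
    y = y_of_x(x, e, s)
    return x - e / (e + y * y)

def fixed_points(e, s, grid=2000):
    """Bracket sign changes of the residual on [0, 1] and refine by bisection."""
    roots = []
    xs = [i / grid for i in range(grid + 1)]
    for lo, hi in zip(xs, xs[1:]):
        if residual(lo, e, s) * residual(hi, e, s) < 0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if residual(lo, e, s) * residual(mid, e, s) <= 0:
                    hi = mid
                else:
                    lo = mid
            x = 0.5 * (lo + hi)
            roots.append((x, y_of_x(x, e, s)))
    return roots

bistable = fixed_points(0.1, 1.0)      # three real fixed points
monostable = fixed_points(0.1, 0.03)   # a single real fixed point
```

For `e = 0.1`, the input value `s = 1.0` lies inside the three-solution region, while `s = 0.03` leaves only one solution.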

This is a good example of the functioning of a switch. Depending on the input, one of two stable positions is reached. Furthermore, small variations in the input value do not generally flip the switch. Consequently, the switch has memory of its recent input history.

The value e = 0.24 is located very near to a catastrophe point. For larger values of this epsilon parameter the two branches just cross, and finally they separate. Only one can be pure real-valued. The curves are topologically related to x = -(y-y0)^3 + u (y-y0) for the simpler fold catastrophe. [http://mathworld.wolfram.com/Catastrophe.html]

Figure. A discontinuity in the input-output curve for an ultrasensitive positive feedback system.

It doesn’t always work out so nicely. For example the obvious mutually reinforcing version of this system suffers lockup at zero:

{x == y^2/(e + y^2), y == s (x^2/(e + x^2))}


This switch is irreversible. Large input is compatible with two different stable states, but once the input has gotten low enough to force the switch into the off position, it remains there always.

{x == e + y^2/(e + y^2), y == s (e + x^2/(e + x^2))}

is apparently ultrasensitive:

Figure. A discontinuity in the input-output curve for an ultrasensitive positive feedback system.

Other emergent properties. We will see examples of homeostasis (in regulation of amino acid synthesis, and using adaptive control laws), linear and nonlinear amplification in the presence of noise, and the functioning of complex equilibria for microsystems such as molecular complexes for transcriptional regulation or signal transduction, that can have many inputs. There are also temporal functions such as low-pass or band-pass filtering that can emerge from small networks.

The veto power of evolution. We can mathematically understand some small networks using attractor geometry as above, and then compose circuits in such a way that larger networks are also understandable. However, both natural and computational evolution are free to explore the space of poorly understood, not fully modular networks and may produce systems whose operating principles go beyond what we expect. This effect has been encountered many times in the training of nonlinear neural networks for computational purposes.

3.2.6 Translations in Cellerator

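elementary:

| Cellerator arrow | Name of reaction, differential equations | Typical biochemical notation |
|---|---|---|
| $S \to P$ | Conversion: $\frac{dP}{dt} = kS$, $\frac{dS}{dt} = -kS$ | $S \xrightarrow{k} P$ |
| $A + B \to C$ | Unidirectional reaction: $\frac{dA}{dt} = -kAB$, $\frac{dB}{dt} = -kAB$, $\frac{dC}{dt} = kAB$ | $A + B \xrightarrow{k} C$ |
| $A + B^n \to C$ | Unidirectional reaction with cooperativity $n$ ($n$ must be an integer): $\frac{dA}{dt} = \frac{dB}{dt} = -kAB^n = -\frac{dC}{dt}$ | $A + nB \xrightarrow{k} C$ |
| $A + B \leftrightarrow C$ | Bidirectional reaction: $\frac{dA}{dt} = \frac{dB}{dt} = -\frac{dC}{dt} = -k_f AB + k_r C$ | $A + B \underset{k_r}{\overset{k_f}{\rightleftharpoons}} C$ |
| $\varnothing \to A$ | Creation: $\frac{dA}{dt} = k$ | $\xrightarrow{k} A$ |
| $A \to \varnothing$ | Annihilation: $\frac{dA}{dt} = -kA$ | $A \xrightarrow{k}$ |
| $S \overset{E}{\rightleftharpoons} P$ | Enzymatic (catalytic) reaction: $\frac{dS}{dt} = -k_f SE + k_r X$, $\frac{dP}{dt} = kX$, $\frac{dX}{dt} = -\frac{dE}{dt} = k_f SE - (k + k_r)X$ | $S + E \underset{k_r}{\overset{k_f}{\rightleftharpoons}} X \xrightarrow{k} E + P$ |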

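Advanced:

| Cellerator form | Differential equation terms |
|---|---|
| $\{\{A_1, A_2, \ldots, A_n\} \mapsto B,\ \text{hill[options]}\}$ | $\frac{d[B]}{dt} = \dfrac{r \left(\sum_{i=1}^{n} v_i [A_i]\right)^n}{K^n + \left(\sum_{i=1}^{n} v_i [A_i]\right)^n}$ |
| $\{\{A_1, A_2, \ldots, A_N\} \mapsto B,\ \text{GRN[options]}\}$ | $\frac{d[B]}{dt} = \dfrac{r}{1 + \exp\left\{-\left(h + \sum_{i=1}^{N} T_i [A_i]^{n_i}\right)\right\}}$ |
| $\{\{A_1, A_2, \ldots, A_N\} \mapsto B,\ \text{GRN[..., Sigmoid} \to f]\}$ | $\frac{d[B]}{dt} = r\, f\left(h + \sum_{i=1}^{N} T_i [A_i]^{n_i}\right)$ |
| $\{\{A_1, A_2, \ldots, A_N\} \mapsto B,\ \text{SSystem[options]}\}$ | $\frac{d[B]}{dt} = \dfrac{1}{\tau}\left(k_+ \prod_{i=1}^{n} [A_i]^{C_i^+} - k_- \prod_{i=1}^{n} [A_i]^{C_i^-}\right)$ |
| $\{S_1, S_2, \ldots\} \Rightarrow \{P_1, P_2, \ldots\}$, with $\{\{A_1, A_2, \ldots\}, \{I_1, I_2, \ldots\}\}$ and $E$ on the arrow | $\frac{d[P_i]}{dt} = k_{cat}[E]\, \dfrac{\prod_q (1 + a_q)^n \prod_q s_q \prod_q (1 + s_q)^{n-1} + L \prod_q (c\, s_q) \prod_q (1 + c\, s_q)^{n-1} \prod_q (1 + i_q)^n}{\prod_q (1 + a_q)^n \prod_q (1 + s_q)^n + L \prod_q (1 + c\, s_q)^n \prod_q (1 + i_q)^n}$ |
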

Example: Oscillatory approach to fixed point in food web

Example: Oscillation in signal transduction

Hoffmann, Levchenko, Scott, Baltimore. Science 298:1241, 2002

A. Levchenko, Cellerator notebook.

3.3 Operator Algebra

This is a fundamental approach to stochastic modeling, mapping Stochastic Parameterized Grammars (which generalize from chemical reaction networks) to time-evolution equations.

Most material in this section is from UCI ICS TR 06-11, “Stochastic Process Semantics for Dynamical Grammars”, by the author.

3.3.1 SPG Rule syntax

The expression $\tau_i(x_i)$ is called a parameterized term, which can match to a parameter-bearing object or term instance in the pool of such objects. A particular type $\tau_i$ may appear in a rule any finite number of times, and indeed a particular parameterized term $\tau_i(x_i)$ may appear any finite number of times. So we use multisets $\{\ldots\, \tau_{a(i)}(x_i)\, \ldots\}_*$ (in which the same object $\tau_{a(i)}(x_i)$ may appear as the value of several different indices $i$) for both the LHS and RHS (Left Hand Side and Right Hand Side) of a rule. If $\mathcal{I}_L$ and $\mathcal{I}_R$ are index sets, we may use set-builder notation $\{\text{element}(i) \mid \text{Predicate}(i)\}$ as a meta-language to write a general form for rules with a fixed number of RHS and LHS elements:

$$\{\tau_{a(i)}(x_i) \mid i \in \mathcal{I}_L\}_* \to \{\tau_{a'(j)}(y_j) \mid j \in \mathcal{I}_R\}_* \quad \text{with } \rho_r([x_i], [y_j]) \tag{3.2}$$

Here the same object $\tau_{a(i)}(x_i)$ may appear as the value of several different indices $i$ under the mappings $i \mapsto (a(i), x_i)$ and/or $i \mapsto (a'(i), y_i)$. Note that the multisets of terms in both RHS and LHS are intrinsically unordered, but the parameters within a term are ordered and hence do not need names. Finally we introduce the shorthand notation $\tau_i = \tau_{a(i)}$ and $\tau'_j = \tau_{a'(j)}$, and revert to the standard informal notation $\{\}$ for multisets; then we may informally write

$$\{\tau_i(x_i)\} \to \{\tau'_j(y_j)\} \quad \text{with } \rho_r([x_i], [y_j]) \tag{3.3}$$

In addition to the with clause of a rule following the LHS → RHS rule header, several other alternative clauses can be used as follows. “under $E(x, y)$” is translated into “with $\exp(-E(x, y)) / Z(x)$” where $Z(x)$ is the normalizing Boltzmann distribution partition function corresponding to $E(y)$, holding $x$ constant. An equality constraint “subject to $f(x, y)$” is translated into “with $\delta(f(x, y))$” where $\delta$ is an appropriate Dirac or Kronecker delta function that enforces a constraint $f(x, y) = 0$. (Inequality constraints such as “subject to $f(x, y) \le 0$” and Boolean combinations thereof, may be translated using the $\Theta(P)$ function of Equation 3 above.) Clauses of the form “via $G$” and “substituting $G$” are used in order to call a grammar within a grammar as a subroutine or macro, respectively. A rule may have multiple clauses of the same or different keyword; each clause contributes a multiplicative factor to the overall firing rate $\rho$. In the absence of any clause, $\rho$ defaults to one.

The set-builder metalanguage for describing the form of SPG rules is also convenient for specifying multiple similar rules in a rule schema, all of which belong to a grammar. For example we would like to admit a rule schema that could replace the first three rules in the binaryclustergen grammar:

$$\text{nodeset}(x) \to \text{node}(x),\ \{\text{child}(x) \mid 1 \le i \le n\} \quad \text{with } q(n) \quad \text{subject to } 0 \le n \le 2$$

3.3.2 Semantics for rules and grammars

We provide a semantics function $\Psi_c(G)$ as an algebraic construction that results in a dynamical system in the form of a stochastic process, if it exists, or a special “undefined” element if the stochastic process doesn’t exist. The stochastic process is defined by a very high-dimensional differential equation (the master equation) for the evolution of a probability distribution in continuous time. On the other hand we will also provide a semantics function $\Psi_d(G)$ that results in a discrete-time stochastic process for the same grammar, in the form of an operator that evolves the probability distribution forward by one discrete rule-firing event. In each case the stochastic process specifies the time evolution of a probability distribution over the contents of a “pool” of grounded parameterized terms $\tau_a(x_a)$, each of which can be present in the pool with any allowed multiplicity from zero to $n_a^{\max}$. We will relate these two alternative “meanings” of an SPG, $\Psi_c(G)$ in continuous time and $\Psi_d(G)$ in discrete time, in Section 3.10.

Both semantic maps are given in terms of operator algebra. Starting with the grammar we construct a linear mapping from a probability distribution over states at one time to a function proportional to the probability distribution over states at a later time. The mapping is constructed by algebraic operations (operator addition and multiplication, and scalar-operator multiplication) from more elementary linear mappings. To do so we need to define the states.

3.3.3 Example: TF binding and unbinding; Master equation

Binding site mathematics

There are two basic biochemical processes in this system: TF binding to a site, and TF unbinding from a site. One can think of them as “reactions” for continuous probabilities rather than for continuous concentrations, or as stochastic grammar rules:

site_unoccupied(t) → site_occupied(t+Δt) with Pr(u→o | Δt)
site_occupied(t) → site_unoccupied(t+Δt) with Pr(o→u | Δt)

Transition probabilities for small time steps are proportional to $\Delta t$ and, for occupation, to the concentration of the ligand:

$$\Pr(\text{unoccupied} \to \text{occupied}) = \alpha\, \Delta t\, [A]$$
$$\Pr(\text{occupied} \to \text{unoccupied}) = \beta\, \Delta t$$

This follows for very small time intervals $\Delta t$ and dilute solutions of $A$ in which the probability of one molecule of $A$ being within a minimal distance of the site required for binding in time $\Delta t$ is small. These small probabilities scale up linearly with the number of chances to execute the binding event, which is in turn proportional to the concentration and to the time interval $\Delta t$.

Equilibrium or steady-state

$$\beta P = \alpha [A] (1 - P)$$

$$P = \frac{\alpha [A]}{\beta + \alpha [A]} .$$

2×2 Matrix formulation

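In matrix form, to first order in $\Delta t$:

$$\begin{pmatrix} \Pr(\text{unoccupied})(t + \Delta t) \\ \Pr(\text{occupied})(t + \Delta t) \end{pmatrix} \simeq \begin{pmatrix} 1 - \alpha [A]\, \Delta t & \beta\, \Delta t \\ \alpha [A]\, \Delta t & 1 - \beta\, \Delta t \end{pmatrix} \begin{pmatrix} \Pr(\text{unoccupied})(t) \\ \Pr(\text{occupied})(t) \end{pmatrix}$$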

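Fundamental stochastic “Master equation” for this two-state system [10], [11]:

$$\frac{dp}{dt} = H p \tag{3.4}$$

with a Hamiltonian $H$ that is the sum of operators for the two active processes:

$$H = O_1 + O_2 = \alpha [A] \begin{pmatrix} -1 & 0 \\ 1 & 0 \end{pmatrix} + \beta \begin{pmatrix} 0 & 1 \\ 0 & -1 \end{pmatrix}$$

and thus

$$H = \begin{pmatrix} -\alpha [A] & \beta \\ \alpha [A] & -\beta \end{pmatrix} . \tag{3.5}$$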

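Fixed point analysis:

$$0 = \begin{pmatrix} -\alpha [A] & \beta \\ \alpha [A] & -\beta \end{pmatrix} \begin{pmatrix} 1 - P \\ P \end{pmatrix}$$

which is solved by

$$\beta P = \alpha [A] (1 - P)$$

or

$$P = \frac{\alpha [A]}{\beta + \alpha [A]} .$$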
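As a quick numerical illustration of the fixed point, the small-time-step update $p(t + \Delta t) \approx (I + \Delta t\, H)\, p(t)$ can be iterated until it settles. The Python sketch below is illustrative only; the parameter values, the step size, and the function names are assumptions, and the state ordering is (unoccupied, occupied) as in the matrix above.

```python
def hamiltonian(alpha, beta, conc_a):
    """2x2 generator H of equation (3.5); state order is (unoccupied, occupied)."""
    return [[-alpha * conc_a, beta],
            [alpha * conc_a, -beta]]

def step(p, h_matrix, dt):
    """One small time step: p(t + dt) ~ (I + dt * H) p(t)."""
    return [p[i] + dt * sum(h_matrix[i][j] * p[j] for j in range(2))
            for i in range(2)]

alpha, beta, conc_a = 2.0, 1.0, 1.5      # hypothetical rate constants and [A]
H = hamiltonian(alpha, beta, conc_a)

p = [1.0, 0.0]                           # start with the site unoccupied
for _ in range(2000):
    p = step(p, H, 0.01)

p_occupied_expected = alpha * conc_a / (beta + alpha * conc_a)
```

After enough steps `p[1]` approaches `p_occupied_expected`, and `p[0] + p[1]` stays equal to one because the columns of `H` sum to zero.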

This is a Hill function with $n = 1$. Note an alternative model is a homodimer: $[A] \to [BB] = k [B]^2$ (mass action). (Treat ambiguity with factor of 1/2.)

$$P = \frac{\alpha k [B]^2}{\beta + \alpha k [B]^2}$$

This is a Hill function with $n = 2$. It has a fundamental new property: an inflection, with $d^2 P / d[B]^2 > 0$ for small inputs, allowing variable gain and thus amplification, as well as multiple fixed points (hysteresis).

Figure. The two Hill functions ($n = 1$ and $n = 2$) are compared. Note the different behaviors at zero and large input: at zero, the $n = 1$ function has finite nonzero slope and the $n = 2$ function has zero slope. The $n = 2$ Hill function also asymptotes to 1 more quickly.

In Mathematica: Plot[{a x^n/(b + a x^n) /. {a -> 1, b -> 1, n -> 1}, a x^n/(b + a x^n) /. {a -> 1, b -> 1, n -> 2}}, {x, 0, 5}] .

Exercise. (Steady-state analysis) Show n = 1 Hill’s function has positive slope but negative secondderivative. Show n = 2 Hill’s function has positive slope and a second derivative that changes from positive(small inputs) to negative (large inputs).
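A numerical sanity check for this exercise (not a substitute for the derivation) can be done with finite differences. In the Python sketch below the function names, the step size, and the sample points are assumptions; the parameters a = b = 1 match the Mathematica plot command above.

```python
def hill(x, n, a=1.0, b=1.0):
    """Hill function a x^n / (b + a x^n)."""
    return a * x**n / (b + a * x**n)

def first_derivative(f, x, h=1e-4):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def second_derivative(f, x, h=1e-3):
    """Central-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def hill1(x):
    return hill(x, 1)

def hill2(x):
    return hill(x, 2)

curvature_n1 = second_derivative(hill1, 0.5)        # negative
curvature_n2_small = second_derivative(hill2, 0.1)  # positive below the inflection
curvature_n2_large = second_derivative(hill2, 2.0)  # negative above the inflection
```

For a = b = 1 the $n = 2$ curve changes curvature at $x = 1/\sqrt{3} \approx 0.58$, so the sample points 0.1 and 2.0 lie on opposite sides of the inflection.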

Time course: Exponential convergence to steady state

Solution of the linear master equation $\frac{dp}{dt} = H p$:

Compare to the unrestricted population growth equation $dx/dt = g\, x(t)$; the solution is $x = x_0 \exp(g t)$, where $\exp$ can be defined as the limit of many tiny population-increasing steps multiplied together,

$$\exp(g t) = \lim_{n \to \infty} (1 + g t / n)^n \approx (1 + g t) \text{ for small } t,$$

or equivalently using the Taylor series

$$\exp(g t) = \sum_{n=0}^{\infty} \frac{(g t)^n}{n!} = 1 + g t + \frac{(g t)^2}{2!} + \frac{(g t)^3}{3!} + \ldots,$$

$$n! = \prod_{i=1}^{n} i = n (n-1) (n-2) \cdots 3 \cdot 2 \cdot 1 .$$

Likewise for any matrix $H$,

$$\exp(t H) = \lim_{n \to \infty} \left(I + \frac{t}{n} H\right)^n = \sum_{n=0}^{\infty} \frac{t^n H^n}{n!} \approx I + t H \text{ for small } t.$$

This is easy to calculate for our little 2×2 matrix: Defining $G = -H = \begin{pmatrix} \alpha [A] & -\beta \\ -\alpha [A] & \beta \end{pmatrix}$, we calculate $G^2 = (\alpha [A] + \beta)\, G$, which is quite a simplification that occurs in this particular case. Hence $G^n = (\alpha [A] + \beta)^{n-1}\, G$, hence

$$\exp(t H) = \exp(-t G) = \sum_{n=0}^{\infty} \frac{(-t)^n G^n}{n!} = I + \sum_{n=1}^{\infty} \frac{(-t)^n (\alpha [A] + \beta)^{n-1}\, G}{n!} = I + \frac{1}{\alpha [A] + \beta} \left( -1 + \sum_{n=0}^{\infty} \frac{(-t)^n (\alpha [A] + \beta)^n}{n!} \right) G$$

Therefore

$$\exp(t H) = I - \frac{1}{\alpha [A] + \beta}\, G + \frac{e^{-t (\alpha [A] + \beta)}}{\alpha [A] + \beta}\, G$$

Thus, using the fact that the initial probabilities $p_0$ and $p_1$ sum to one,

$$\begin{pmatrix} p_0(t) \\ p_1(t) \end{pmatrix} = \exp(t H) \begin{pmatrix} p_0 \\ p_1 \end{pmatrix} = \frac{1}{\alpha [A] + \beta} \begin{pmatrix} \beta \\ \alpha [A] \end{pmatrix} + \frac{\alpha [A]\, p_0 - \beta\, p_1}{\alpha [A] + \beta}\, e^{-t (\alpha [A] + \beta)} \begin{pmatrix} 1 \\ -1 \end{pmatrix} \tag{3.6}$$

which has the same fixed point or long-time limit $p_1(t) \to \alpha [A] / (\alpha [A] + \beta)$ as before. But this solution also includes the exact formula for the exponential decay of any initial deviation $\alpha [A]\, p_0 - \beta\, p_1 \neq 0$ from the fixed point, with decay rate $\alpha [A] + \beta$. In this particular situation the average occupancy is the same as the probability $p_1(t)$ which we have calculated.

Figure. Convergence of $p_1(t)$ to the same steady state from multiple initial conditions, $p_1$. Here we take {a = 1, b = 1, [A] = 1}.
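The closed-form solution (3.6) can be compared against the product-limit definition of the matrix exponential with a short Python sketch. The function names and the test values are assumptions for illustration; the default parameters follow the figure (a = 1, b = 1, [A] = 1).

```python
import math

def closed_form(p0, p1, t, alpha=1.0, beta=1.0, conc_a=1.0):
    """Equation (3.6): (p0(t), p1(t)) for initial probabilities with p0 + p1 = 1."""
    rate = alpha * conc_a + beta
    decay = (alpha * conc_a * p0 - beta * p1) / rate * math.exp(-rate * t)
    return (beta / rate + decay, alpha * conc_a / rate - decay)

def product_limit(p0, p1, t, n, alpha=1.0, beta=1.0, conc_a=1.0):
    """Approximate exp(t H) p by n applications of (I + (t/n) H)."""
    dt = t / n
    for _ in range(n):
        flow = alpha * conc_a * p0 - beta * p1   # net probability flow, state 0 -> 1
        p0, p1 = p0 - dt * flow, p1 + dt * flow
    return (p0, p1)

exact = closed_form(1.0, 0.0, 1.0)
approx = product_limit(1.0, 0.0, 1.0, 100000)

# Semigroup property: evolving 0 -> 0.4 and then 0.4 -> 1.0 matches evolving 0 -> 1.0.
two_stage = closed_form(*closed_form(0.3, 0.7, 0.4), 0.6)
one_stage = closed_form(0.3, 0.7, 1.0)
```

The product limit converges slowly (the error shrinks like $1/n$), which is one reason the closed form is worth having when it exists.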

Exercises.
1. Verify $p_1(t = 0) = p_1$.
2. Verify the semigroup property for this solution, that evolving forward from time 0 to time $t'$ and then from time $t'$ to time $t$ gives the same answer as evolving forward from time 0 to time $t$.

Reformulation in terms of basis operators for creation and destruction

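For a greater degree of abstraction, let us reformulate this particular matrix $H$ in terms of stochastic creation and annihilation operators $\hat a$ and $a$ thusly:

$$H = \alpha [A]\, \{\hat a - (I - N)\} + \beta\, \{a - N\}$$

where

$$a = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad \hat a = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad N = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \quad a \hat a + \hat a a = I .$$

This is a specific example of a much more general pattern for such operators,

$$O_r = \tilde O_r - \mathrm{diag}(\mathbf{1}^T \cdot \tilde O_r), \qquad H = \sum_r O_r$$

where in our case, up to the rate factors, $\tilde O_1 = \hat a$ (creation of an occupied slot, or binding) and $\tilde O_2 = a$ (destruction of an occupied slot, or unbinding). A much more general formulation of the Master equation is given below. It can be expressed in terms of transition rates from general states $J$ to $I$:

$$\Pr(S_J \to S_I)$$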
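The general pattern, an off-diagonal "move probability" operator minus the diagonal of its column sums, can be sketched directly in Python. The helper names and parameter values below are assumptions for illustration. The example rebuilds the two-state $H$ from the creation and annihilation matrices and confirms that every column of the result sums to zero, which is what conserves total probability.

```python
def matmul(x, y):
    """Plain nested-list matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def add(x, y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def scale(c, x):
    return [[c * a for a in row] for row in x]

def conserving(o_tilde):
    """O = O~ - diag(1^T . O~): subtract each column sum on the diagonal."""
    n = len(o_tilde)
    col_sums = [sum(o_tilde[i][j] for i in range(n)) for j in range(n)]
    return [[o_tilde[i][j] - (col_sums[j] if i == j else 0.0) for j in range(n)]
            for i in range(n)]

A_HAT = [[0.0, 0.0], [1.0, 0.0]]   # creation: n = 0 -> n = 1 (binding)
A_LOW = [[0.0, 1.0], [0.0, 0.0]]   # annihilation: n = 1 -> n = 0 (unbinding)

def generator(alpha, beta, conc_a):
    """H = sum_r O_r with O~_1 = alpha [A] a_hat and O~_2 = beta a."""
    o1 = conserving(scale(alpha * conc_a, A_HAT))
    o2 = conserving(scale(beta, A_LOW))
    return add(o1, o2)

H = generator(2.0, 1.0, 1.5)       # hypothetical rates; alpha [A] = 3, beta = 1
anticommutator = add(matmul(A_LOW, A_HAT), matmul(A_HAT, A_LOW))
```

With these values `H` reproduces the matrix form of equation (3.5), and `anticommutator` is the 2×2 identity.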

3.3.4 System states for master equation

A pool state or state of the “pool of term instances” is defined as a nonnegative-integer-valued function $n : \mathcal{V} \to \mathbb{N} = \{0, 1, 2, \ldots\}$. It is the “number of copies” $n_a(x_a) \in \{0, 1, 2, \ldots\}$ of each parameterized term $\tau_a(x_a)$ that is grounded (has no variable symbols $X_c$), for any combination $(a, x_a) \in \mathcal{V} = \coprod_{a \in \mathcal{T}} V_a$ of a type index $a \in \mathcal{T}$ and a parameter value $x_a \in V_a$. We may denote this pool state by $\{n_a(x)\}$, an “indexed set” notation for such functions. Each type $\tau_a$ may be assigned a maximum possible value $n_a^{(\max)}$ for all $n_a(x_a)$, commonly $\infty$ (no constraint on copy numbers) or 1 (so $n_a(x_a) \in \{0, 1\}$, which means each term-value combination is simply present or absent). The system state or state of the full system at time $t$ is defined as a probability distribution on all possible values of this (already large) pool state: $\Pr(\{n_a(x_a) \mid (a, x_a) \in \mathcal{V}\}; t) \equiv \Pr(\{n_a(x_a)\}; t)$. The probability distribution concentrated on one particular pool state $\{n_a(x_a)\}$ is called a pure state of the dynamical system and denoted $|\{n_a(x_a)\}\rangle$. A probability distribution that is not a pure state is called a mixed state of the system.

3.3.5 Fock space construction

To better formalize the system states, the function space that probability distributions $\Pr(\{n_a(x_a)\})$ occupy may be formalized as follows. (This subsection is not essential to understanding the applications presented later.)

Each value space $V_a$ is a measure space, with a $\sigma$-algebra $\sigma_a$ of “events” on which probability is to be defined. A probability distribution on a measure space $X$ (such as $V_a$) is just a (nonnegative) measure $P$ on the $\sigma$-algebra for which $P(X) = 1$. We now construct a probabilistic version of a many-particle “symmetric Fock space” following [12]. For any nonnegative integer $n_a$ we can define the set of states that have a total of $n_a$ “copies” of grounded parameterized term $\tau_a(x_a)$, as the permutation-symmetrization of the Cartesian product of $n_a$ copies of $V_a$:

$$f_a(n_a) = \left( \prod_{m=1}^{n_a} V_a \right) \Big/ \mathcal{S}(n_a) .$$

Here $\mathcal{S}(n)$ is the symmetric group on $n$ items. The division sign produces equivalence classes of Cartesian-product members that differ only by a permutation of $n_a$ items. The idea here is that instantiated terms don’t have individual identities aside from the values of their parameters; two terms of the same type and value are equivalent. A new $\sigma$-algebra is induced on the space $f_a(n_a)$ by the Cartesian product operation and another new $\sigma$-algebra is induced by the symmetrization operation. Next, any finite nonnegative number $n_a$ of terms are allowed in a disjoint union of measure spaces $f_a(n_a)$, and the construction is repeated in a cross product over all term types $a$:

$$f_a = \bigsqcup_{n_a = 0}^{\infty} f_a(n_a) \quad \text{and} \quad f = \prod_a f_a$$

Now $f$ is a measure space (since it has an induced $\sigma$-algebra) and thus defines a probabilistic Fock space $\mathcal{F}$ as the set of probability distributions defined on $f$. In this way we arrive at a symmetric Fock space for probability distributions. It is comparable to the usual construction in quantum mechanics which produces a Hilbert space of probability amplitude functions for many-particle systems, except that probability distributions do not require the Hilbert space framework as quantum amplitudes do. This probabilistic version of a Fock space is suitable for defining probability distributions over the sets of copy numbers that label pure states $|\{n_a(x_a)\}\rangle$.

3.3.6 Master equation

For continuous-time $t$ we define the semantics $\Psi_c(G)$ of our grammar as the unique solution, if one exists, of the following differential equation:

$$\frac{d}{dt} \Pr(\{n_a(x)\}; t) = \sum_{\{m_a(x)\}} H_{\{n\}\{m\}}\, \Pr(\{m_a(x)\}; t), \tag{3.7}$$

i.e. in matrix notation

$$\frac{d}{dt} \Pr(t) = H \cdot \Pr(t)$$

starting from any initial condition $\Pr(0)$. This is called the master equation [10]; see also [13]. It has the formal solution

$$\Pr(t) = \exp(t H) \cdot \Pr(0) . \tag{3.8}$$

If there is no unique solution of the master equation for all times $t > 0$, then we define the least upper bound $T$ of times $T' \ge 0$ for which there is a unique solution for all times $t \in [0, T']$ starting from initial condition $\Pr(0)$. $T$ is at least zero since the initial condition itself is the unique solution on $[0, 0] = \{0\}$. Furthermore for every positive integer $k$, there is a unique solution on $[0, (1 - 1/k) \times T]$. For each initial condition $\Pr(0)$, we define $\Psi_c(G)$ to be (a) the common unique solution on $t \in [0, T) = \bigcup_{k=1}^{\infty} [0, (1 - 1/k) \times T]$; concatenated with (b) a special “not really defined” symbol (such as “$\infty$”) thereafter for $t \in [T, +\infty]$. $T$ is called the definition limit for $G$ and $\Pr(0)$. We do not attempt to maintain “partial definedness” for mixed states that include nonzero weights for pure states for which the master equation has a unique solution at time $t$ as well as pure states for which the master equation has no unique solution at time $t$, so this is a fairly conservative definition of $\Psi_c(G)$. The operator $H$ will be defined in Section 3.8, completing the definition of $\Psi_c(G)$.

For discrete-time semantics $\Psi_d(G)$ there is some probability update map $U$ which acts on probability vectors by “$\circ$” to evolve them forward by one rule-firing time step. Then after $k$ discrete time steps or rule-firings the probability is:

$$\Pr(k) = U \circ \ldots U \circ \Pr(0) \equiv U^k \circ \Pr(0) \tag{3.9}$$

which, taken over all $k \ge 0$ and $\Pr(\{n_a(x)\}; 0)$, defines $\Psi_d(G)$. In both cases the long-time evolution of the system may converge to a limiting distribution $\Psi_c^*(G) \cdot \Pr(0) = \lim_{t \to \infty} \Pr(\{n_a(x)\}; t)$ which is a key feature of the semantics. But we do not define the semantics $\Psi_{c/d}(G)$ as being only this limit even if it exists. Thus semantics-preserving transformations of grammars are fixedpoint-preserving transformations of grammars but the converse may not be true. The operator $U$ will be defined in Section 3.9, completing the definition of $\Psi_d(G)$. Both $H$ and $U$ will be determined by an operator $\hat H$ computed from the SPG syntax.

Fortunately, even though the mathematical objects just defined are large, they are completely determined by the generators $H$ and $\hat H$ which in turn are simply composed from elementary operators acting on the space of such probability distributions. Indeed they are elements, or limits of elements, of the operator polynomial ring $\mathbb{R}[\{B_a\}]$ defined over a set of basis operators $\{B_a\}$ in terms of operator addition, scalar multiplication, and noncommutative operator multiplication. These basis operators $\{B_a\}$ provide elementary manipulations of the copy numbers $n_a(x)$. The operator algebra is meaningful: operator addition corresponds to composition of parallel processes, nonnegative scalar multiplication corresponds to speeding up or slowing down a process (as is done in the product of scalar rate functions from different clauses in a single rule), and operator multiplication corresponds to the obligatory co-occurrence of the constituent events that define a process, in immediate succession. Commutation relations between operators describe the exact extent to which the order of event occurrence matters.

Continuous-time and discrete-time SPG semantics have been implemented as Mathematica notebooks.


3.3.7 Operator algebra

The simplest basis operators $\{B_\alpha\}$ are elementary creation operators $\{\hat a_a(x) \mid a \in \mathbb{N} \wedge x \in V_a\}$ and annihilation operators $\{a_a(x) \mid a \in \mathbb{N} \wedge x \in V_a\}$ that increase or decrease each copy number $n_a(x)$ in a particular way (reviewed in [14]):

$$\begin{aligned}
\hat a_a(x)\, |\{n_b(y)\}\rangle &= |\{n_b(y) + \delta_K(a,b)\,\delta(x,y)\}\rangle \\
a_a(x)\, |\{n_b(y)\}\rangle &= n_a(x)\, |\{n_b(y) - \delta_K(a,b)\,\delta(x,y)\}\rangle
\end{aligned} \tag{3.10}$$

where $\delta_K(a,b)$ is the Kronecker delta function (defined in Section 7.0) and $\delta(x,y) = \delta(x-y)$ is the Dirac delta (generalized) function appropriate to the (product) measure $\mu$ on the relevant value space $V$. These two operator types then generate $N_a(x) = \hat a_a(x)\, a_a(x)$:

$$N_a(x)\, |\{n_b(y)\}\rangle = \hat a_a(x)\, a_a(x)\, |\{n_b(y)\}\rangle = n_a(x)\, |\{n_b(y)\}\rangle ,$$

and they satisfy

$$[a_a(x), \hat a_b(y)] \equiv \text{"commutator" of } a \text{ and } \hat a \equiv a_a(x)\,\hat a_b(y) - \hat a_b(y)\, a_a(x) = 0 \quad \text{if } a \ne b \text{ or } x \ne y .$$

We can write these operators $\hat a$, $a$ as finite or infinite dimensional matrices depending on the maximum copy number $n_a^{(\max)}$ for type $\tau_a$. If $n_a^{(\max)} = 1$ (for a fermionic term), and if we omit the type and value subscripts which are all assumed equal and discrete below, then

$$\hat a = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad a = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$$

$$\{a, \hat a\} \equiv \text{"anticommutator" of } a \text{ and } \hat a \equiv a\,\hat a + \hat a\, a = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I ; \qquad \hat a\, a = N \equiv \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} .$$

These matrices can be interpreted as follows. They operate on the two-dimensional vector space of probabilities $(p(0), p(1))$ that the number of objects present is $n = 0$ or $n = 1$. They do not in general conserve total probability, so this is the positive orthant of a two-dimensional space. The operator $a$ moves all the probability $p(n = 1)$ to the $n = 0$ state, i.e. destroys an object, and it simply eliminates the original probability $p(n = 0)$ from the system:

$$a \begin{pmatrix} q \\ p \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} q \\ p \end{pmatrix} = \begin{pmatrix} p \\ 0 \end{pmatrix}$$

Similarly $\hat a$ creates an object but doesn't conserve probability. Probability conservation will be restored using more complex operators built out of these fundamental ones. Here and below, the matrix rows and columns are indexed by number $n$ of indistinguishable objects (number of copies of an object of a given type $a$ and parameter $x$) immediately before and after an operator is applied.
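The two-state matrices above are small enough to check by hand, but a short script makes the bookkeeping explicit. The following is a minimal sketch in plain Python, not part of the original notebooks; the helper names (`matmul`, `matadd`, `apply`) and the probabilities `q`, `p` are illustrative assumptions.

```python
# Sketch: the n_max = 1 ("fermionic") operators as 2x2 integer matrices.
# Rows index the copy number after the operator acts, columns the number before.

def matmul(A, B):
    """Product of two small matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def apply(A, v):
    """Matrix-vector product A . v."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

create = [[0, 0], [1, 0]]        # a-hat: moves weight from n = 0 to n = 1
annihilate = [[0, 1], [0, 0]]    # a: moves weight from n = 1 to n = 0

anticommutator = matadd(matmul(annihilate, create), matmul(create, annihilate))
number = matmul(create, annihilate)          # N = a-hat a

q, p = 0.25, 0.75                # illustrative probabilities p(0), p(1)
after_annihilation = apply(annihilate, [q, p])   # weight p lands on n = 0
after_creation = apply(create, [q, p])           # weight q lands on n = 1

print(anticommutator, number, after_annihilation, after_creation)
```

Neither result vector sums to 1, which is the loss of probability conservation that the text says will be repaired by more complex operators.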

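Likewise if $n_a^{(\max)} = \infty$ (for a bosonic term),

$$\hat a = \begin{pmatrix} 0 & 0 & 0 & 0 & \cdots \\ 1 & 0 & 0 & 0 & \\ 0 & 1 & 0 & 0 & \\ 0 & 0 & 1 & 0 & \\ \vdots & & & & \ddots \end{pmatrix} = \delta_{n,m+1} \quad \text{and} \quad a = \begin{pmatrix} 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & 2 & 0 & \\ 0 & 0 & 0 & 3 & \\ 0 & 0 & 0 & 0 & \\ \vdots & & & & \ddots \end{pmatrix} = m\,\delta_{n+1,m} ,$$

and

$$[a, \hat a] \equiv (a\,\hat a - \hat a\, a) = I = \begin{pmatrix} 1 & 0 & 0 & 0 & \cdots \\ 0 & 1 & 0 & 0 & \\ 0 & 0 & 1 & 0 & \\ 0 & 0 & 0 & 1 & \\ \vdots & & & & \ddots \end{pmatrix} ; \qquad \hat a\, a = N_a \equiv \begin{pmatrix} 0 & 0 & 0 & 0 & \cdots \\ 0 & 1 & 0 & 0 & \\ 0 & 0 & 2 & 0 & \\ 0 & 0 & 0 & 3 & \\ \vdots & & & & \ddots \end{pmatrix} .$$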

By truncating these matrices to finite size $n^{(\max)} < \infty$ we may compute that for some polynomial $Q(N \mid n^{(\max)})$ of degree $n^{(\max)} - 1$ in $N$ with rational coefficients,

$$[a, \hat a] = I + N\, Q(N \mid n^{(\max)}) .$$

E.g. if $n^{(\max)} = 1$ then $Q = -2$; if $n^{(\max)} = \infty$ then $Q = 0$. If the parameters $x$ are continuous, e.g. real-valued, then the general commutator relation becomes

$$[a(x), \hat a(y)] = \delta(x - y)\left[ I + N\, Q(N \mid n^{(\max)}) \right] \tag{3.11}$$

where $\delta$ is again the Dirac delta (generalized) function appropriate to the (product) measure $\mu$ on the relevant value space $V$.

For any measure space of parameter values $x$, and for any $n^{(\max)}$, the set of all operators of the form $\hat a(x)$ and $a(x)$ generate an algebra over the real numbers by scalar multiplication and operator-operator addition and multiplication. This algebra is associative. The commutators listed above (in Equation 11) suffice to derive commutator expressions for all pairs of operators in this algebra; thus they provide a specification of the Lie algebra associated with the operator algebra, in which all operator triples satisfy the Jacobi identity $[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0$. The importance of the commutator for dynamical system applications is that it characterizes the nature of the noncommutation and hence interdependence (or interference) of different kinds of time-evolution operators that occur in the same dynamical system. (Anticommutators play an analogous role in "Lie superalgebras" that arise in supersymmetric particle theories and elsewhere in physics, and are potentially important since we frequently use the special case $n^{(\max)} = 1$.) Thus, commutators and commutation relations are fundamental to the operator formulation of dynamical systems.
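The truncated commutator can be computed directly. The sketch below, in plain Python with exact rational arithmetic, builds the truncated matrices for a few values of the maximum copy number and compares the commutator with $I + N\,Q(N)$. The closed form used in `q_value` is an assumption introduced here for illustration (one polynomial of degree $n^{(\max)} - 1$ with rational coefficients that fits the computed diagonal); it is not quoted from the notes.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def truncated_operators(n_max):
    """Creation and annihilation matrices on copy numbers 0..n_max."""
    size = n_max + 1
    create = [[1 if n == m + 1 else 0 for m in range(size)] for n in range(size)]
    annihilate = [[m if n + 1 == m else 0 for m in range(size)] for n in range(size)]
    return create, annihilate

def commutator(n_max):
    """[a, a-hat] = a a-hat - a-hat a for the truncated matrices."""
    create, annihilate = truncated_operators(n_max)
    ac, ca = matmul(annihilate, create), matmul(create, annihilate)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ac, ca)]

def q_value(n, n_max):
    """Assumed form of Q(n | n_max): vanishes at n = 1..n_max-1, degree n_max-1."""
    value = Fraction(-(n_max + 1), n_max)
    for j in range(1, n_max):
        value *= Fraction(n - j, n_max - j)
    return value

results = {}
for n_max in (1, 2, 3, 5):
    C = commutator(n_max)
    diagonal = [C[n][n] for n in range(n_max + 1)]
    predicted = [1 + n * q_value(n, n_max) for n in range(n_max + 1)]
    results[n_max] = (diagonal, predicted == diagonal)
    print(n_max, diagonal, predicted == diagonal)
```

In this sketch the commutator differs from the identity only in the highest copy-number state, which is why the correction disappears as the truncation is removed.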

Calculations with creation and annihilation operators

For future reference,

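$$\hat a = \begin{pmatrix} 0 & 0 & 0 & 0 & \cdots \\ 1 & 0 & 0 & 0 & \\ 0 & 1 & 0 & 0 & \\ 0 & 0 & 1 & 0 & \\ \vdots & & & & \ddots \end{pmatrix}; \quad \hat a^2 = \begin{pmatrix} 0 & 0 & 0 & 0 & \cdots \\ 0 & 0 & 0 & 0 & \\ 1 & 0 & 0 & 0 & \\ 0 & 1 & 0 & 0 & \\ \vdots & & & & \ddots \end{pmatrix}; \quad \hat a^3 = \begin{pmatrix} 0 & 0 & 0 & 0 & \cdots \\ 0 & 0 & 0 & 0 & \\ 0 & 0 & 0 & 0 & \\ 1 & 0 & 0 & 0 & \\ \vdots & & & & \ddots \end{pmatrix}; \ \ldots$$

$$[\hat a^k]_{n m} = \delta_{n,m+k}$$

$$a = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & \cdots \\ 0 & 0 & 2 & 0 & 0 & \\ 0 & 0 & 0 & 3 & 0 & \\ 0 & 0 & 0 & 0 & 4 & \\ \vdots & & & & & \ddots \end{pmatrix}; \quad a^2 = \begin{pmatrix} 0 & 0 & 2 & 0 & 0 & \cdots \\ 0 & 0 & 0 & 6 & 0 & \\ 0 & 0 & 0 & 0 & 12 & \\ 0 & 0 & 0 & 0 & 0 & \\ \vdots & & & & & \ddots \end{pmatrix}; \quad a^3 = \begin{pmatrix} 0 & 0 & 0 & 6 & 0 & \cdots \\ 0 & 0 & 0 & 0 & 24 & \\ 0 & 0 & 0 & 0 & 0 & \\ 0 & 0 & 0 & 0 & 0 & \\ \vdots & & & & & \ddots \end{pmatrix}; \ \ldots$$

$$[a^k]_{n m} = (m)_k\, \delta_{n,m-k} = \frac{m!}{(m-k)!}\, \delta_{n,m-k}$$

Compare to:

$$z^k z^m = z^{m+k} \quad \text{and} \quad \partial_z^k z^m = (m)_k\, z^{m-k}$$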
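The formulas for powers of the operators, and the comparison with multiplication and differentiation of monomials, can be checked mechanically. This is a small sketch in plain Python; the truncation size and the helper names are arbitrary choices made for illustration.

```python
from math import perm

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matrix_power(A, k):
    """A multiplied by itself k times, for k >= 1."""
    result = A
    for _ in range(k - 1):
        result = matmul(result, A)
    return result

def derivative(coeffs):
    """d/dz of a polynomial stored as [c_0, c_1, ...]; the length is kept fixed."""
    return [(n + 1) * coeffs[n + 1] for n in range(len(coeffs) - 1)] + [0]

size = 6   # copy numbers 0..5; an arbitrary truncation for illustration
create = [[1 if n == m + 1 else 0 for m in range(size)] for n in range(size)]
annihilate = [[m if n + 1 == m else 0 for m in range(size)] for n in range(size)]

checks = []
for k in (1, 2, 3):
    a_hat_k = matrix_power(create, k)
    a_k = matrix_power(annihilate, k)
    for m in range(size):
        for n in range(size):
            # [a-hat^k]_{nm} = delta_{n, m+k};  [a^k]_{nm} = (m)_k delta_{n, m-k}
            checks.append(a_hat_k[n][m] == (1 if n == m + k else 0))
            checks.append(a_k[n][m] == (perm(m, k) if n == m - k else 0))
        # The k-fold derivative of z^m matches column m of a^k.
        monomial = [1 if j == m else 0 for j in range(size)]
        for _ in range(k):
            monomial = derivative(monomial)
        checks.append(monomial == [a_k[n][m] for n in range(size)])

print(all(checks))
```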

3.3.8 Continuous-time semantics

For a grammar rule number "$r$" of the form of (Equation 3) we define the operator that first (instantaneously) destroys all parameterized terms on the LHS and then (immediately and instantaneously) creates all parameterized terms on the RHS. This happens independently of time or other terms in the pool. Assuming that the parameter expressions $x$, $y$ contain no variables $X_c$, the effect of this event is:

$$\hat O_r = \rho_r((x_i), (y_j)) \left[ \prod_{i \in \mathrm{rhs}(r)} \hat a_{a(i)}(x_i) \right] \left[ \prod_{j \in \mathrm{lhs}(r)} a_{b(j)}(y_j) \right] \tag{3.12}$$

The operators within each of the two products above commute, so their order within each product is arbitrary.

If there are variables $\{X_c\}$, we must sum or integrate over all their possible values in $\bigotimes_c D_{b(c)}$:

$$\hat O_r = \int_{D_{b(1)}} \cdots \int_{D_{b(c)}} \cdots \left( \prod_c d\mu_{b(c)}(X_c) \right) \rho_r\big((x_i(\{X_c\})), (y_j(\{X_c\}))\big) \left[ \prod_{i \in \mathrm{rhs}(r)} \hat a_{a(i)}(x_i(\{X_c\})) \right] \left[ \prod_{j \in \mathrm{lhs}(r)} a_{b(j)}(y_j(\{X_c\})) \right] \tag{3.13}$$

Thus, syntactic variable-binding has the semantics of multiple integration. This is the same result one would get if each rule with variables were replaced with a (finite, countable, or uncountably infinite) set of rules with all possible values substituted in for all the variables, and running in parallel.

Constructed this way, the semantics for each rule is oblivious in that every possible rule firing has a probability per unit time which does not depend on the number of other possible rule firings of the same or different rules. The increasing integer entries of the annihilation operators ensure this property. For example a rule that requires as input exactly two copies of a given parameterless term $\tau_a$ finds its probability rate function multiplied by $n_a (n_a - 1) \equiv (n_a)_2$ (from two powers of the annihilation operator), which is proportional to $\binom{n_a}{2}$, the number of ways those two inputs can be chosen from the pool. Thus the probability per possible instantiated rule firing is independent of $n_a$, and likewise for other term numbers $n_b$.

Likewise for a rule with $k$ identical inputs, the annihilation operator monomial $(a_a)^k$ gives a factor of $n_a! / (n_a - k)! \equiv (n_a)_k$ to the total firing rate, which is proportional to the number of ways of choosing $k$ unordered inputs from the pool, $\binom{n_a}{k}$. The proportionality factor of $k!$, like $\rho_r((x_i), (y_j))$, is an intrinsic property of the rule $r$ and independent of the pool size $n_a$; thus we define $\rho_r$ in Equation 12 so that it already contains this factor. This definition is a matter of convention, but our choice of convention also has the advantage of reproducing the chemical Law of Mass Action for large $n_a$, with $\rho_r$ and not $\rho_r / k!$ as the reaction rate for reaction $r$, thus agreeing with chemical usage (see Section 3.11) in this important limit.
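The counting argument can be illustrated numerically. The following sketch in plain Python uses a hypothetical rate constant `rho` and input count `k`; it shows that the rate per unordered choice of inputs does not depend on the pool size, and that $(n)_k / n^k$ tends to 1 for large pools, which is the mass-action limit mentioned above.

```python
from math import comb, factorial, perm

def total_firing_rate(rho, n, k):
    """rho * (n)_k: the factor produced by k powers of the annihilation operator."""
    return rho * perm(n, k)

rho, k = 0.3, 2     # hypothetical rate constant and number of identical inputs

# Rate per unordered choice of k inputs, for several pool sizes n.
per_choice = [total_firing_rate(rho, n, k) / comb(n, k) for n in range(k, 8)]

# Ratio of the falling factorial (n)_k to the mass-action factor n^k.
mass_action_ratio = [perm(n, k) / n**k for n in (10, 100, 1000)]

print(per_choice)          # every entry is close to rho * k!
print(mass_action_ratio)   # approaches 1 as n grows
```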

A monotonic rule has all of its LHS terms appear also on the RHS, so that nothing is destroyed, in which case

$$\hat O_r = \rho_r((x_i), (y_j)) \left[ \prod_{i \in \mathrm{rhs}(r) \setminus \mathrm{lhs}(r)} \hat a_{a(i)}(x_i) \right] \left[ \prod_{j \in \mathrm{lhs}(r)} N_{b(j)}(y_j) \right] \tag{3.14}$$

Unfortunately the foregoing expressions for $\hat O_r$ don't conserve probability because probability inflow to new states (described by $\hat O_r$) must be balanced by outflow from the current state (diagonal matrix elements). The following operator does conserve probability:

$$O_r = \hat O_r - \mathrm{diag}(1^T \cdot \hat O_r)$$

For the entire grammar the time evolution operator is simply a sum of the generators for each rule:

$$H = \sum_r O_r = \sum_r \hat O_r - \sum_r \mathrm{diag}(1^T \cdot \hat O_r) = \hat H - D \tag{3.15}$$

This superposition implements the basic principle that every possible rule firing is an exponential process, all happening in parallel until a firing occurs. Note that (Equation 12), (Equation 13) and $\hat H = \sum_r \hat O_r$ are encompassed by the polynomial ring $\mathbb{R}[\{B_\alpha\}]$ where the basis operators include all creation and annihilation operators. Ring addition (as in Equation 15 or Equation 13) corresponds to independently firing processes, such as arise from different rules in the same grammar. Ring operator multiplication (as in Equation 12) corresponds to obligatory event co-occurrence.

Equation 15 completes the definition of $\Psi_c$ from Section 3.4 / Section 3.6.

Thus, SPG's have an "operational semantics" (the solution of the master equation for time evolution operators) that is also "compositional", since the syntactic union or concatenation of rules (or multisets of rules) in a SPG corresponds to composition of semantics by operator addition in Equation 15. Operator multiplication is used to construct the time evolution operator for each rule. The compositionality of the semantics may be related to its asynchronicity: the operator semantics doesn't in general impose a unique preferred execution order on rule-firing events.
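A small numerical sketch may help to see how the inflow operators combine into a probability-conserving generator. The example below, in plain Python, uses a hypothetical two-rule grammar on one parameterless term (a decay rule "A to nothing" and a merge rule "A, A to A") with made-up rate constants; both rules only lower the copy number, so truncating at a maximum of 3 copies is exact for this illustration.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def scale(c, A):
    """Scalar multiple c * A."""
    return [[c * x for x in row] for row in A]

def add(A, B):
    """Entrywise sum A + B."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

size = 4   # copy numbers 0..3
create = [[1 if n == m + 1 else 0 for m in range(size)] for n in range(size)]
annihilate = [[m if n + 1 == m else 0 for m in range(size)] for n in range(size)]

# Inflow operators O-hat_r = rho_r * (creations) * (annihilations), as in Equation 12.
rule_decay = scale(0.5, annihilate)                                      # A -> nothing
rule_merge = scale(0.1, matmul(create, matmul(annihilate, annihilate)))  # A, A -> A

H_hat = add(rule_decay, rule_merge)            # sum of inflow terms over rules
outflow = [sum(H_hat[n][m] for n in range(size)) for m in range(size)]  # 1^T . H-hat
H = [[H_hat[n][m] - (outflow[m] if n == m else 0.0) for m in range(size)]
     for n in range(size)]                     # H = H-hat - D, as in Equation 15

column_sums = [sum(H[n][m] for n in range(size)) for m in range(size)]
print(outflow)
print(column_sums)   # each entry is zero up to rounding
```

Each column of `H` sums to zero, so the master equation built from it leaves total probability unchanged, whereas the columns of `H_hat` alone do not.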

Relation to quantum mechanics

If $t \to i\,t = \sqrt{-1}\; t$ and

$$\hat a = \begin{pmatrix} 0 & 0 & 0 & 0 & \cdots \\ 1 & 0 & 0 & 0 & \\ 0 & \sqrt{2} & 0 & 0 & \\ 0 & 0 & \sqrt{3} & 0 & \\ \vdots & & & & \ddots \end{pmatrix} = \sqrt{n}\; \delta_{n,m+1} \quad \text{and} \quad a = \begin{pmatrix} 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & \sqrt{2} & 0 & \\ 0 & 0 & 0 & \sqrt{3} & \\ 0 & 0 & 0 & 0 & \\ \vdots & & & & \ddots \end{pmatrix} = \sqrt{m}\; \delta_{n+1,m} ,$$

then we recover the particle number basis for time-evolution operators in quantum mechanical dynamics. The same commutation algebra $[a, \hat a] = I$ holds, for $n^{(\max)} = \infty$. In this way SPG syntax can be given another continuous-time semantics as a quantum mechanical system. However, the construction of Fock spaces is slightly different, requiring Hilbert spaces [12], and the standard interpretation of the state vector in terms of probability is different and loses quantum phase information, so the mapping to discrete-time algorithms (undertaken below for the standard stochastic process semantics) becomes more problematic. We may call an SPG with continuous-time quantum semantics a "quantum grammar". Currently no implementation of this SPG semantics exists.


3.3.9 Discrete-time SPG semantics

The operator $\hat H$ describes the flow of probability per unit of continuous time, over an infinitesimal continuous-time interval, into new states that result from a single rule-firing of any type. We seek a related semantics for the same SPG in discrete time.

If we start in a pure state $p_0 = |\{m_a(x_a)\}\rangle$ (a pure state of the pool of parameterized terms) and condition the probability distribution at later times $t > 0$ on a single rule having fired, thereby setting aside the probability weight for all other possibilities, the resulting distribution $p_1$ on pool states should be proportional to $\hat H \cdot p_0$ with a proportionality constant that ensures normalization. So if the $n$'th component of $p_1$ is:

$$[p_1]_n \equiv \Pr\nolimits_{\text{discrete}}\big( |\{n_a(x_a)\}\rangle \,\big|\, k = 1, |\{m_a(x_a)\}\rangle \big)$$

then we could define the discrete-time dynamics using

$$p_1 = (\hat H \cdot p_0) / (1 \cdot \hat H \cdot p_0) \quad \text{if } 1 \cdot \hat H \cdot p_0 \ne 0 . \tag{3.16}$$

Since $p_0$ is a pure state, the $l$'th component of $p_0$ is $[p_0]_l = \delta(l, m)$ and the $n$'th component of this expression is equal to

$$\frac{\sum_l \hat H_{n,l}\, [p_0]_l}{\sum_{n'} \sum_l \hat H_{n',l}\, [p_0]_l} = \frac{\hat H_{n,m}}{\sum_{n'} \hat H_{n',m}} = \sum_l \left( \hat H_{n,l} \Big/ \Big( \sum_{n'} \hat H_{n',l} \Big) \right) [p_0]_l = \left[ \hat H \cdot \mathrm{diag}(1 \cdot \hat H)^{-1} \cdot p_0 \right]_n .$$

The problem with this definition is the possibility of dividing by zero when the pure (pool) state is also a terminal state (for which $\hat H \cdot p_0 = 0$), so that no probability flows out of it.

Principal discrete-time semantics

To avoid division by zero in the definition of $\Psi_d(G)$, we first define

$$\tilde H(\omega) = \hat H + \omega\, \mathrm{diag}\big(\Theta(1 \cdot \hat H = 0)\big), \qquad \tilde D(\omega) = \mathrm{diag}\big(1 \cdot \tilde H(\omega)\big) \tag{3.17}$$

for some fixed, real-valued $\omega \ge 0$, and

$$[D'(D)]_{n m} = \begin{cases} 1 / D_{n n} & \text{if } n = m \text{ and } D_{n n} \ne 0 \\ 0 & \text{otherwise} \end{cases} \tag{3.18}$$

By this definition of the "prime" operation, for diagonal matrices $D'' = D$. Note that the continuous-time dynamics is unaffected by the value of $\omega$:

$$H = \hat H - D = \tilde H - \tilde D \tag{3.19}$$

Now we can define the first rule-firing update

$$p_1 = \tilde H \cdot \tilde D' \cdot p_0 , \quad \text{where } \tilde D(\tilde H) = \mathrm{diag}(1 \cdot \tilde H) \tag{3.20}$$

which is the same as Equation 16 for nonterminal pure states but for terminal states either results in $p_1 = 0$ (if $\omega = 0$) or $p_1 = p_0$ (for $\omega > 0$); a terminal state either has its probability vanish or stay fixed upon further update, depending on the value of the fixed parameter $\omega$. This update is now in a form that can be iterated as a linear map, even if it is applied to a potentially mixed state such as $p_1$, and hence can be iterated and interpreted as a stochastic algorithm:


$$p_k = \tilde H \cdot \tilde D' \cdot p_{k-1} = (\tilde H \cdot \tilde D')^k \cdot p_0 \quad \text{for } k \in \mathbb{N} . \tag{3.21}$$

This expression directly specifies a discrete-time execution algorithm. It represents a Markov chain if $\omega > 0$. To see that, sort pool states into nonterminal states followed by terminal states and write $\tilde H$ and $\tilde D$ in (nonterminal, terminal) block form:

$$\tilde H = \begin{pmatrix} \tilde H_{11} & 0 \\ \tilde H_{21} & \omega I \end{pmatrix}; \quad \tilde D = \begin{pmatrix} \mathrm{diag}(1 \cdot \tilde H_1) & 0 \\ 0 & \omega I \end{pmatrix}; \quad \tilde D' = \begin{pmatrix} \mathrm{diag}(1 \cdot \tilde H_1)^{-1} & 0 \\ 0 & I / \omega \end{pmatrix}; \quad \tilde H \cdot \tilde D' = \begin{pmatrix} \tilde H_{11} \cdot \mathrm{diag}(1 \cdot \tilde H_1)^{-1} & 0 \\ \tilde H_{21} \cdot \mathrm{diag}(1 \cdot \tilde H_1)^{-1} & I \end{pmatrix} \tag{3.22}$$

Then $1 \cdot \tilde H \cdot \tilde D' = 1$ and Equation 21 represents a Markov chain for $\omega > 0$. In the $\omega > 0$ case, further updates beyond the terminal state simply leave the probability unchanged ($[\tilde H \cdot \tilde D']_{22} = I$); it is as if there were an extra rule that fires to no effect. These updates may be called pseudo-events. Equation 19 and Equation 22 show that the actual value of $\omega > 0$ is irrelevant to continuous-time and discrete-time dynamics respectively, so we are free to pick some arbitrary nonnegative value scaled by some property of $\hat H$ (since $\omega$ has units of frequency or inverse time). But we are also free to retain the adjustability of $\omega$, which is the strategy adopted here.

If $\omega = 0$ the update formula of Equation 21 is still interpretable as a stochastic algorithm, which halts upon reaching a terminal state, even though it is not a Markov chain:

$$\tilde H = \hat H = \begin{pmatrix} \hat H_{11} & 0 \\ \hat H_{21} & 0 \end{pmatrix}; \quad \tilde D = \begin{pmatrix} \mathrm{diag}(1 \cdot \hat H_1) & 0 \\ 0 & 0 \end{pmatrix}; \quad \tilde D' = \begin{pmatrix} \mathrm{diag}(1 \cdot \hat H_1)^{-1} & 0 \\ 0 & 0 \end{pmatrix}; \quad \tilde H \cdot \tilde D' = \begin{pmatrix} \hat H_{11} \cdot \mathrm{diag}(1 \cdot \hat H_1)^{-1} & 0 \\ \hat H_{21} \cdot \mathrm{diag}(1 \cdot \hat H_1)^{-1} & 0 \end{pmatrix} = \hat H \cdot D' \tag{3.23}$$

The distribution $p_k$ as a function of step number ($k$) on possible execution traces is defined as the discrete-time semantics $\Psi_d(G)$, even if total "probability" decreases with iterations due to terminal states. A disadvantage of the semantics in the $\omega = 0$ case is that in subsequent iterations the probability state vector $p_k$ carries no information about which terminal state the system ended up in.

Termination probabilities $p_k(\text{Û})$ (a new scalar, not a vector like $p_k$) may be included in the $\omega = 0$ case so that total probability is conserved, by the following equivalent formulation which is an affine map but not a Markov chain:

$$\begin{pmatrix} p_{k+1} \\ p_{k+1}(\text{Û}) \\ 1 \end{pmatrix} = \begin{pmatrix} \hat H \cdot D' & 0 & 0 \\ -1 \cdot \hat H \cdot D' & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} p_k \\ p_k(\text{Û}) \\ 1 \end{pmatrix} ; \quad \text{so} \quad \begin{cases} p_k = \hat H \cdot D' \cdot p_{k-1} = (\hat H \cdot D')^k \cdot p_0 \\ p_k(\text{Û}) = 1 - 1 \cdot \hat H \cdot D' \cdot p_{k-1} = 1 - 1 \cdot (\hat H \cdot D')^k \cdot p_0 \end{cases} \tag{3.24}$$

We may consider the formal symbol "Û" to be a new halted pool state (the discrete-time successor to all terminal states including itself) which is treated specially by the discrete-time semantics and not needed by the continuous-time semantics. But this interpretation is not essential to the discrete-time semantics of Equation 21.

Of course if there are no terminal pool states, then these variations are equivalent: $\Psi_d(G, \omega > 0) = \Psi_d(G, \omega = 0)$. These variations may be called the principal discrete-time Markovian semantics (Equation 22) and the principal discrete-time halting semantics (Equation 23 or Equation 24), respectively, and the principal discrete-time semantics (Equation 21) collectively. They both provide linear or affine maps $U$ that may be substituted into Equation 9, to complete the definition of $\Psi_d$.
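The difference between the Markovian and halting variants is easy to see on a toy pool. The sketch below, in plain Python, implements the update of Equations 17 to 21 for a hypothetical three-state system whose transition rates are invented for illustration; `H_hat[n][m]` holds the rate of the jump from state `m` to state `n`, and state 0 is terminal.

```python
def principal_update(H_hat, omega):
    """Return U = H~(omega) . D~'(omega) as a list of rows (Equations 17-20)."""
    size = len(H_hat)
    column_sums = [sum(H_hat[n][m] for n in range(size)) for m in range(size)]
    # H~ adds omega on the diagonal of terminal states (columns with no outflow).
    H_tilde = [[H_hat[n][m] + (omega if n == m and column_sums[m] == 0 else 0.0)
                for m in range(size)] for n in range(size)]
    D_tilde = [sum(H_tilde[n][m] for n in range(size)) for m in range(size)]
    # The "prime" operation inverts nonzero diagonal entries and keeps zeros.
    D_prime = [1.0 / d if d != 0 else 0.0 for d in D_tilde]
    return [[H_tilde[n][m] * D_prime[m] for m in range(size)] for n in range(size)]

def iterate(U, p0, steps):
    """Return [p_0, p_1, ..., p_steps] with p_k = U . p_{k-1} (Equation 21)."""
    history = [p0]
    for _ in range(steps):
        p = history[-1]
        history.append([sum(U[n][m] * p[m] for m in range(len(p)))
                        for n in range(len(U))])
    return history

# Hypothetical rates: 2 -> 1 at rate 3, 2 -> 0 at rate 1, 1 -> 0 at rate 2.
H_hat = [[0.0, 2.0, 1.0],
         [0.0, 0.0, 3.0],
         [0.0, 0.0, 0.0]]
start = [0.0, 0.0, 1.0]          # pure state with two objects present

markov = iterate(principal_update(H_hat, 1.0), start, 3)    # omega > 0
halting = iterate(principal_update(H_hat, 0.0), start, 3)   # omega = 0

for k in range(4):
    print(k, markov[k], halting[k])
```

With a positive `omega` the terminal state absorbs and total probability stays at 1; with `omega = 0` the probability that has reached the terminal state disappears at the next step, which is the loss of information about the final state noted above.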


Time-Ordered Product Expansion (TOPE)

A valuable tool for studying such stochastic processes in physics is the Time-Ordered Product Expansion [15-16]. We use the following form (derived in Appendix 1):

$$\begin{aligned}
\exp(t H) \cdot p_0 &= \exp(t (H_0 + H_1)) \cdot p_0 \\
&= \sum_{k=0}^{\infty} \left[ \int_0^t dt_1 \int_{t_1}^t dt_2 \cdots \int_{t_{k-1}}^t dt_k \, \exp((t - t_k) H_0)\, H_1 \exp((t_k - t_{k-1}) H_0) \cdots H_1 \exp(t_1 H_0) \right] \cdot p_0
\end{aligned} \tag{3.25}$$

where $H_0$ is a solvable or easily computable part of $H$, so the exponentials $\exp(t H_0)$ can be computed or sampled more easily than $\exp(t H)$. See the Appendix for an elementary probabilistic derivation of this form. This expression can be used to generate Feynman diagram expansions, in which $k$ denotes the number of interaction vertices in a graph representing a multi-object history [14]. If we apply (Equation 25) with

$$H_1 = \hat H \quad \text{and} \quad H_0 = -D$$

we derive the well-known Gillespie algorithm for simulating chemical reaction networks [17], which can now be applied to SPG's. This derivation of a widely-used stochastic algorithm is explained in more detail at the end of Appendix I. With SPG's we need to consider also the possibility of terminal states, in which case one may alternatively use

$$H_1 = \tilde H(\omega) \quad \text{and} \quad H_0 = -\tilde D(\omega) .$$

However many other decompositions of $H$ are possible, one of which is used in Section 3.15 below. Because the operators $H$ can be decomposed in many ways, there are many valid simulation algorithms for each stochastic process. The particular formulation of the Time-Ordered Product Expansion used in (Equation 25) has the advantage of being recursively self-applicable.

Thus, (Equation 25) entails a systematic approach to the creation of novel simulation algorithms.
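To make the connection with simulation concrete, here is a minimal sketch of a Gillespie-style sampler in plain Python, written directly in terms of the decomposition $H_1 = \hat H$, $H_0 = -D$. It is an illustration under stated assumptions, not the implementation from the notebooks: the function name `gillespie`, the toy three-state rate matrix, and the seed are all hypothetical. Each factor $\exp(\tau H_0)$ becomes an exponentially distributed waiting time with rate $D_{mm}$, and each factor $H_1$ becomes a jump chosen with weights from column $m$ of $\hat H$.

```python
import random

def gillespie(H_hat, state, t_end, rng):
    """Sample one trajectory; H_hat[n][m] is the rate of the jump m -> n."""
    t, path = 0.0, [(0.0, state)]
    while True:
        rates = [H_hat[n][state] for n in range(len(H_hat))]
        total = sum(rates)                  # D_mm, the total outflow rate
        if total == 0.0:
            break                           # terminal state: no rule can fire
        t += rng.expovariate(total)         # waiting time from exp(-tau * D_mm)
        if t > t_end:
            break
        # Next state drawn in proportion to column `state` of H-hat.
        state = rng.choices(range(len(rates)), weights=rates)[0]
        path.append((t, state))
    return path

# Hypothetical rates: 2 -> 1 at rate 3, 2 -> 0 at rate 1, 1 -> 0 at rate 2.
H_hat = [[0.0, 2.0, 1.0],
         [0.0, 0.0, 3.0],
         [0.0, 0.0, 0.0]]

path = gillespie(H_hat, 2, 100.0, random.Random(0))
print(path)
```

Averaging many such trajectories would estimate $\exp(tH) \cdot p_0$; other decompositions of $H$ lead to different samplers for the same process.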

3.3.10 Relation between continuous- and discrete-time semantic maps



In the SPG semantics approach we start with continuous-time stochastic dynamics and specialize to discrete-time (this section) and/or deterministic dynamics. The following three propositions relate the resulting continuous-time and discrete-time semantics.

Proposition 1. Given the stochastic parameterized grammar (SPG) rule syntax of Equation 26,

(a) There is a semantic function $\Psi_c$ mapping from any continuous-time stochastic parameterized grammar $G$ via a time evolution operator $H(\hat H(G))$ to a joint probability density function on the parameter values and birth/death times of grammar terms, conditioned on the total elapsed time, $t$. For any initial probability distribution $p_0$, there is a maximal $T_{\text{def}}(p_0) \in [0, +\infty]$ such that for all times $t \in [0, T_{\text{def}})$, $\Psi_c(G)(t) \cdot p_0$ is a probability density and is the unique solution of the master equation on that interval.

(b) There is a semantic function $\Psi_d$ mapping any discrete-time, sequential-firing stochastic parameterized grammar $G$ via a time evolution operator $\tilde H(G, \omega)$ to a joint probability density function on the parameter values and birth/death times of grammar terms, conditioned on the total discrete time defined as number of rule firings, $k$. It depends on whether a global real-valued parameter $\omega \ge 0$ is $= 0$ or $> 0$; for $\omega > 0$ it is a Markov chain.

Proof of Proposition 1:

(a): Section 3.4, Section 3.6, and Section 3.8. The definition of $T_{\text{def}}(p_0)$ in Section 3.4 as the least upper bound of times $T' \ge 0$ for which there is a unique solution of the master equation for all times $t \in [0, T']$ starting from initial condition $p_0$, implies that there also exists such a solution on $[0, T_{\text{def}}) = \bigcup_{0 < T' < T_{\text{def}}} [0, T']$. Suppose this $T_{\text{def}}$ were not maximal as claimed in (a). Then there would exist some $T^* > T_{\text{def}}$ for which the master equation has a unique solution on $[0, T^*)$, an interval which properly contains $[0, T_{\text{def}})$. Then there would exist a unique solution on the subinterval $[0, T' = (T^* + T_{\text{def}})/2]$. $T_{\text{def}}$ being an upper bound of such $T'$ values, we have $(T^* + T_{\text{def}})/2 \le T_{\text{def}}$ which implies $T^* \le T_{\text{def}}$, in contradiction to $T^* > T_{\text{def}}$.

(b): Section 3.4, Section 3.6, and Section 3.9.

We have a defined probability $\Pr_{\text{continuous}}(\{n_a(x)\} \mid t) \equiv \Pr_{\text{continuous}}(\,\cdot \mid t)$ for $t \in [0, T_{\text{def}})$; formally it is $\exp(t H) \cdot p_0$ which can be calculated by the TOPE of Section 3.9.2. We would like to compute the probability $\Pr_{\text{continuous}}(\,\cdot \mid k)$ given $k$ rule-firings, and compare it with $\Pr_{\text{discrete}}(\,\cdot \mid k)$. Both of these $k$-step probabilities need further definition which we now provide. To remove the dependence on $t$ for large times in $\Pr_{\text{continuous}}(\,\cdot \mid t)$, we need a prior distribution $\Pr_{\text{continuous}}(t)$ which is uninformative except at large times for which $\exp(-D\, \Delta t) \approx 0$ and any bias introduced by the prior on $t$ is unimportant.

Definition 1. Define the continuous-time $k$-event probability as the density $\Psi_c(G)(t)$ multiplied by the uniform distribution of times $t$ from 0 to some $T < T_{\text{def}}(p_0)$, integrated over $t$ from 0 to $T$ and then conditioned on $k$ firings the last of which occurred at exactly time $t$; it is denoted $\Pr_{\text{continuous}}(\,\cdot \mid k, \tau_k = t - t_k = 0, T, p_0)$ where $t_k$ is the time of the $k$'th rule firing. In equations:

$$\Pr(t) = \begin{cases} 1/T & \text{if } t \in [0, T] \\ 0 & \text{otherwise} \end{cases} \; = \; \Theta(0 \le t \le T) / T$$

$$\Pr\nolimits_{\text{continuous}}(\,\cdot\,, \tau_0, \ldots, \tau_k, k \mid T, p_0) \equiv \int_0^{\infty} dt \, \Pr\nolimits_{\text{continuous}}(\,\cdot\,, \tau_0, \ldots, \tau_k, k \mid t, p_0) \Pr(t) = \int_0^{T} dt \, \Pr\nolimits_{\text{continuous}}(\,\cdot\,, \tau_0, \ldots, \tau_k, k \mid t, p_0) \Pr(t)$$

(integration over time $t$, where $\tau_j = t_{j+1} - t_j$ = $j$'th time difference between rule firings). Also

$$\Pr\nolimits_{\text{continuous}}(\,\cdot\,, \tau_k = 0, k \mid T, p_0) = \int_0^{\infty} d\tau_0 \cdots \int_0^{\infty} d\tau_{k-1} \, \Pr\nolimits_{\text{continuous}}(\,\cdot\,, \tau_0, \ldots, \tau_{k-1}, \tau_k = 0, k \mid T, p_0)$$

(last firing happens exactly at time $t_k = t$, and others are unconstrained) and

$$\Pr\nolimits_{\text{continuous}}(\,\cdot \mid k, \tau_k = 0, T, p_0) = \frac{\Pr\nolimits_{\text{continuous}}(\,\cdot\,, \tau_k = 0, k \mid T, p_0)}{\Pr\nolimits_{\text{continuous}}(\tau_k = 0, k \mid T, p_0)}$$

(conditioned on $k$ = number of rule-firings).

Definition 2. Define the discrete-time $k$-event probability as discrete-time density $\Psi_d(G)$, conditioned on $k$ and on the absence of a halt by step $k$ (i.e. not reaching a terminal state in the case $\omega = 0$ by step $k - 1$), denoted $\Pr_{\text{discrete}}(\,\cdot \mid \text{not halted}, k \text{ events}, p_0)$.

$$\Pr\nolimits_{\text{discrete}}(\,\cdot \mid \text{not halted}, k \text{ events}, p_0) = \frac{\Pr\nolimits_{\text{discrete}}(\,\cdot\,, \text{not halted} \mid k \text{ events}, p_0)}{\Pr\nolimits_{\text{discrete}}(\text{not halted} \mid k \text{ events})}$$

Definition 3. Define the continuous-time $k$-liveness probability as the same probability time integral as in Definition 1, but joint rather than conditional in $k$, summed over all pool states. It is denoted $\Pr_{\text{continuous}}(\tau_k = 0, k \mid T, p_0)$:

$$\Pr\nolimits_{\text{continuous}}(\tau_k = 0, k \mid T, p_0) = 1 \cdot \Pr\nolimits_{\text{continuous}}(\,\cdot\,, \tau_k = 0, k \mid T, p_0) .$$

Definition 4. Define the discrete-time $k$-liveness probability as the discrete-time probability $\Psi_d(G)$ of not halting after $k$ events, denoted $\Pr_{\text{discrete}}(\text{not halted} \mid k \text{ events})$:

$$\Pr\nolimits_{\text{discrete}}(\text{not halted} \mid k \text{ events}) = 1 \cdot \Pr\nolimits_{\text{discrete}}(\,\cdot\,, \text{not halted} \mid k \text{ events}, p_0) .$$

Proposition 2. Suppose that SPG $G$ involves only discrete-valued parameters, and that the nonzero elements of $\mathrm{diag}(\tilde H)$ are bounded below by $\delta > 0$ (so that either $\omega = \delta$ or $\omega \ge 0$). If $0 < T < T_{\text{def}}(p_0)$ and $0 < \epsilon \ll 1$ and

$$T \delta \gg k\, |\log(\epsilon / k)| \quad \left( \text{i.e. } \epsilon \gg k\, e^{-T \delta / k} \right),$$

then

(a) Either both the continuous-time $k$-liveness probability $\Pr_{\text{continuous}}(\tau_k = 0, k \mid T, p_0)$ is zero and the discrete-time $k$-liveness probability $\Pr_{\text{discrete}}(\text{not halted} \mid k \text{ events})$ is zero, or they are both nonzero and $\Pr_{\text{continuous}}(\,\cdot \mid k, \tau_k = 0, T, p_0)$ approximates $\Pr_{\text{discrete}}(\,\cdot \mid \text{not halted}, k \text{ events}, p_0)$ with relative error $O(\epsilon)$:

$$\Pr\nolimits_{\text{continuous}}(\,\cdot \mid k, \tau_k = 0, T, p_0) \approx \Pr\nolimits_{\text{discrete}}(\,\cdot \mid \text{not halted}, k \text{ events}, p_0)$$

In particular if $T_{\text{def}}(p_0) = +\infty$ and the liveness probabilities are nonzero, then the $T \to \infty$ limit of the continuous-time $k$-event probability vector is equal to the discrete-time $k$-event probability vector:

$$\lim_{T \to \infty} \Pr\nolimits_{\text{continuous}}(\,\cdot \mid k, \tau_k = 0, T, p_0) = \Pr\nolimits_{\text{discrete}}(\,\cdot \mid \text{not halted}, k \text{ events}, p_0);$$

also

(b) If $G$ has no terminal pool states and involves only discrete-valued parameters, and if $\omega = 0$, then there exists another grammar $G'(G)$ derived from $G$ such that without using any prior distribution on $t$, and under the same assumptions on $T$, $\delta$ and $k$ as in (a) above, with a relative approximation error of $O(\epsilon)$

$$\Pr\nolimits_{\text{continuous},\, G'}(\,\cdot\,, k \mid t = T, p_0) = p_k^*(t = T) \approx \Pr\nolimits_{\text{discrete},\, G}(\,\cdot \mid \text{not halted}, k \text{ events}, p_0).$$

Proof of Proposition 2:

(a) Equation 25 is the starting point; it is established in Appendix 1. Then from Appendix 2 of [UCI ICS TR 06-11], under the stated conditions,

Case I: $\Pr_{\text{continuous}}(\tau_k = 0, k \mid T, p_0) = 0$: Then $\tau_k = t - t_k$ and

$$\Pr\nolimits_{\text{continuous}}(\tau_k = 0, k \mid T, p_0) = 0 = \Pr\nolimits_{\text{discrete}}(\text{not halted} \mid k \text{ events})$$

Case II: $\Pr_{\text{continuous}}(\tau_k = 0, k \mid T, p_0) > 0$: If $\omega \ne 0$, with a relative approximation error of $O(\epsilon)$,

$$\Pr\nolimits_{\text{continuous}}(\,\cdot \mid k, \tau_k = 0, T, p_0) \approx \Pr\nolimits_{\text{discrete}}(\,\cdot \mid k, p_0)$$

If $\omega = 0$, with a relative approximation error of $O(\epsilon)$,

$$\Pr\nolimits_{\text{continuous}}(\,\cdot \mid k, \tau_k = 0, T, p_0) \approx \Pr\nolimits_{\text{discrete}}(\,\cdot \mid \text{not halted}, k \text{ events}, p_0).$$

This is also true of the case $\omega \ne 0$ since in that case there is zero probability of halting and $\Pr_{\text{discrete}}(\,\cdot \mid k, p_0) = \Pr_{\text{discrete}}(\,\cdot \mid \text{not halted}, k \text{ events}, p_0)$.

(b) Appendix 3 of [UCI ICS TR 06-11].

Note: In the limit $T \to 0$, the inequalities assumed in Proposition 2 cannot all be satisfied and no approximation follows in (a) or (b).

Corollary. The following diagram commutes:

Figure 1. Commutative diagram for continuous-time and discrete-time semantics. Here $k$ = number of rule firings, $t$ = continuous time, $S$ = a map from continuous-time to discrete-time probabilities associated with large integration times $T$ (Proposition 2a, 2b) or small evaluation times $t$ (Proposition 3) in the continuous-time semantics.


3.3.11 Biochemical applications of operator algebra approach

Biochemical reaction networks

Given the chemical reaction network syntax

$$\left\{ m_a^{(r)} A_a \,\middle|\, 1 \le a \le A_{\max} \right\} \xrightarrow{\;k^{(r)}\;} \left\{ n_b^{(r)} A_b \,\middle|\, 1 \le b \le A_{\max} \right\} , \tag{3.26}$$

we can eliminate the non-SPG syntax of integer-valued "stoichiometric" multiplicities $m_a^{(r)}$ and $n_b^{(r)}$ on the chemical inputs and outputs, by defining an index mapping

$$a(i) = \sum_{c=1}^{A_{\max}} c \; \Theta\!\left( \sum_{d=1}^{c-1} m_d^{(r)} < i \le \sum_{d=1}^{c} m_d^{(r)} \right) = \begin{cases} 1 & \text{if } 0 < i \le m_1^{(r)} \\ 2 & \text{if } m_1^{(r)} < i \le m_1^{(r)} + m_2^{(r)} \\ \;\vdots & \quad\vdots \\ a & \text{if } \sum_{c=1}^{a-1} m_c^{(r)} < i \le \sum_{c=1}^{a} m_c^{(r)} \\ \;\vdots & \quad\vdots \end{cases}$$

and likewise for $b(j)$ as a function of $\{n_b^{(r)}\}$. Then (Equation 26) can be translated to the following equivalent grammar syntax for the multisets of parameterless terms

$$\left\{ \tau_{a(i)} \,\middle|\, 0 < i \le \sum_{c=1}^{A_{\max}} m_c^{(r)} \right\}_* \to \left\{ \tau_{a'(j)} \,\middle|\, 0 < j \le \sum_{c=1}^{A_{\max}} n_c^{(r)} \right\}_* \quad \text{with } k^{(r)}$$

whose semantics is the time-evolution generator

$$\hat O_r = k^{(r)} \left[ \prod_{i \in \mathrm{rhs}(r)} \hat a_{a(i)} \right] \left[ \prod_{j \in \mathrm{lhs}(r)} a_{b(j)} \right] . \tag{3.27}$$

This generator is equivalent to the stochastic process model of mass-action kinetics for the chemical reactionnetwork (Equation 26).
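The index mapping that removes stoichiometric multiplicities is simple to state in code. The sketch below, in plain Python, gives two equivalent forms: a direct expansion of the multiplicities into a list of term types, and a literal transcription of the indicator-sum formula. The function names and the example multiplicities are illustrative assumptions.

```python
def index_map(multiplicities):
    """Return [a(1), a(2), ...] for multiplicities [m_1, ..., m_Amax]."""
    types = []
    for c, m_c in enumerate(multiplicities, start=1):
        types.extend([c] * m_c)        # type c is repeated m_c times
    return types

def index_map_theta(i, multiplicities):
    """a(i) from the indicator-sum formula, with i counted from 1."""
    result, upper = 0, 0
    for c, m_c in enumerate(multiplicities, start=1):
        lower, upper = upper, upper + m_c
        result += c * (1 if lower < i <= upper else 0)
    return result

# Example: the reaction side 2 A_1 + A_3 has multiplicities (2, 0, 1).
multiplicities = [2, 0, 1]
expanded = index_map(multiplicities)
from_formula = [index_map_theta(i, multiplicities)
                for i in range(1, sum(multiplicities) + 1)]
print(expanded, from_formula)
```

The expanded list is the multiset of parameterless terms on one side of the translated grammar rule, one entry per molecule consumed or produced.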

In this application, it is important to note that the Time Ordered Product Expansion can represent asampling algorithm which is essentially the same as the Gillespie Stochastic Simulation Algorithm.

Theory and a synthesis/decay example, $\{\emptyset \to A,\ A \to \emptyset\}$

As an example, consider the two-reaction network $\{\emptyset \to A,\ A \to \emptyset\}$ for synthesis and decay of a single reactant $A$. We define the creation and annihilation operators as very simple infinite-dimensional matrices that increase or decrease the number of molecules of type $A$ present at any given time:

$$\hat a = \begin{pmatrix} 0 & 0 & 0 & 0 & \cdots \\ 1 & 0 & 0 & 0 & \\ 0 & 1 & 0 & 0 & \\ 0 & 0 & 1 & 0 & \\ \vdots & & & & \ddots \end{pmatrix} = \delta_{n,m+1} \quad \text{and} \quad a = \begin{pmatrix} 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & 2 & 0 & \\ 0 & 0 & 0 & 3 & \\ 0 & 0 & 0 & 0 & \\ \vdots & & & & \ddots \end{pmatrix} = m\, \delta_{n+1,m} \tag{3.28}$$

which have an algebra defined by the "commutator" $[a, \hat a]_{n m} \equiv (a\, \hat a - \hat a\, a)_{n m} = \delta_{n,m}$, i.e.

$$[a, \hat a] \equiv (a\, \hat a - \hat a\, a) = I = \begin{pmatrix} 1 & 0 & 0 & 0 & \cdots \\ 0 & 1 & 0 & 0 & \\ 0 & 0 & 1 & 0 & \\ 0 & 0 & 0 & 1 & \\ \vdots & & & & \ddots \end{pmatrix} . \tag{3.29}$$

Each reaction $r$ is represented by an operator $\tilde O_r$ resulting from combination of creation and annihilation operators:

$$\tilde O\!\left( \emptyset \xrightarrow{\;k_s\;} A \right) = k_s\, \hat a , \qquad \tilde O\!\left( A \xrightarrow{\;k_d\;} \emptyset \right) = k_d\, a \tag{3.30}$$

These operators account for the flow of probability into a new state, but not out of the current state. We must balance the flow of probability so that total probability is conserved. In general we construct the "Hamiltonian" $H$ by subtracting a probability balancing term for each reaction, then adding the resulting operators for all reactions in a reaction network:

$$O_r = \tilde O_r - \mathrm{diag}(1^T \cdot \tilde O_r) , \qquad H = \sum_r O_r \tag{3.31}$$

This then determines the Master Equation:

$$\frac{d}{dt} \Pr(\{n\}; t \mid \{m\}; 0) = \sum_{\{p\}} H_{\{n\}\{p\}} \Pr(\{p\}; t \mid \{m\}; 0), \quad \text{i.e.} \quad \frac{d}{dt} \Pr(t \mid 0) = H \Pr(t \mid 0) \tag{3.32}$$

with initial condition

$$\Pr(\{n\}; 0 \mid \{m\}; 0) = \delta_{\{n\},\{m\}} . \tag{3.33}$$

This system has the abstract solution:

$$\Pr(t \mid 0) = \exp(t H). \tag{3.34}$$

Note that the Chapman-Kolmogorov equation automatically follows from the abstract solution, since even for matrices, multiples of $H$ commute and therefore $\exp[(t - t') H] \exp[(t' - \tau) H] = \exp[(t - \tau) H]$.

The formal solution $\exp(t H)$ can be exhibited as a power series in $t$ by computing successive powers of $H$, with a truncated dimension proportional to the maximum power of $H$ sought, in a computer algebra program.
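The power-series idea can be sketched without a computer algebra system by using exact rational arithmetic. The plain-Python example below builds a truncated generator for the synthesis/decay network from the construction in Equations 30 and 31, then computes the Taylor coefficients $H^j p_0 / j!$ of $\Pr(t \mid 0)$ starting from zero molecules. The rate constants ($k_s = 2$, $k_d = 1$), the expansion order, and the function names are assumptions made for illustration; the truncation size is chosen as order + 1 so that the coefficients up to that order are not affected by the cutoff.

```python
from fractions import Fraction

def build_generator(k_s, k_d, size):
    """Truncated H for {synthesis at rate k_s, decay at rate k_d per molecule}."""
    inflow = [[Fraction(0)] * size for _ in range(size)]
    for m in range(size):
        if m + 1 < size:
            inflow[m + 1][m] += k_s         # k_s * a-hat: jump m -> m + 1
        if m >= 1:
            inflow[m - 1][m] += k_d * m     # k_d * a: jump m -> m - 1, weight m
    outflow = [sum(inflow[n][m] for n in range(size)) for m in range(size)]
    return [[inflow[n][m] - (outflow[m] if n == m else 0) for m in range(size)]
            for n in range(size)]

def taylor_coefficients(H, p0, order):
    """Return [H^j p0 / j! for j = 0..order], so Pr(t) = sum_j t^j * coefficient_j."""
    terms, current = [list(p0)], list(p0)
    for j in range(1, order + 1):
        current = [sum(H[n][m] * current[m] for m in range(len(H))) / j
                   for n in range(len(H))]
        terms.append(current)
    return terms

order = 4
size = order + 1                             # states 0..order
H = build_generator(Fraction(2), Fraction(1), size)
p0 = [Fraction(1)] + [Fraction(0)] * (size - 1)   # start with zero molecules
taylor = taylor_coefficients(H, p0, order)

for j, coefficient in enumerate(taylor):
    print(j, [str(c) for c in coefficient])
```

Every coefficient vector beyond order zero sums to zero, reflecting conservation of total probability at each order in $t$.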

In our particular case we find

$$H = k_s (\hat a - I) + k_d (a - N) \tag{3.35}$$

where $N_{n m} = (\hat a\, a)_{n m} = n\, \delta_{n m}$ is the "number operator" in the space of molecules of type $A$. Given this Hamiltonian, we must solve Equation 3.7 in a concrete and usable way.

Solution method

The general path to solving such stochastic process problems is to consider the generating function

$$G_{\{m\}}(\{z\}, t) = \sum_{\{n(i) = 0\}}^{\infty} \Pr\nolimits_{\{n(i)\},\{m(i)\}}(t) \prod_i (z_i)^{n(i)} \tag{3.36}$$

or more generally

$$G(\{x\}, \{y\}, t) = \sum_{\{n(i) = 0\}}^{\infty} \sum_{\{m(i) = 0\}}^{\infty} \Pr\nolimits_{\{n(i)\},\{m(i)\}}(t) \prod_i (x_i)^{n(i)} (y_i)^{m(i)} \tag{3.37}$$

The variable $z_i$ corresponds to the "fugacity" for creation of molecules of a reactant of type $i$ in the grand canonical ensemble in statistical mechanics. We can compute the map

$$a_i \mapsto \partial_{z(i)} \equiv \frac{\partial}{\partial z_i}, \quad \hat a_i \mapsto z_i, \quad I \mapsto 1, \quad N_i = \hat a_i a_i \mapsto z_i\, \partial_{z(i)}, \quad \text{and} \quad [a_i, \hat a_j] = \delta_{i j} . \tag{3.38}$$

Then we can map the our 2-reaction Master equation to

(3.39)d

ÅÅÅÅÅÅÅÅÅd t

Gm Hz, tL = H Gm Hz, tL = Cks Hz - 1L + kd K ÅÅÅÅÅÅÅÅÅ z

- z

ÅÅÅÅÅÅÅÅÅz

OG Gm Hz, tL = kd H1 - zLC ÅÅÅÅÅÅÅÅÅz

-ksÅÅÅÅÅÅÅÅkd

G Gm Hz, tL.

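Recall the generic initial condition (IC) is $\Pr_{\{n(i)\},\{m(i)\}} = \delta_{\{n(i)\},\{m(i)\}}$, so $G(\{x\}, \{y\}, 0) = 1/\prod_i (1 - x_i y_i)$ and $G_{\{m\}}(\{z\}, 0) = \prod_i (z_i)^{m(i)}$.

This is a single linear partial differential equation in two variables ($z$ and $t$) that can be solved analytically by separation of variables in one space and one time dimension:

$$\begin{aligned}
\frac{\partial}{\partial t}\, g_{m\lambda}(z)\, h_{m\lambda}(t) &= -\lambda\, g_{m\lambda}(z)\, h_{m\lambda}(t) = H\, g_{m\lambda}(z)\, h_{m\lambda}(t) \\
G_m(z, t) &= \int g_{m\lambda}(z)\, e^{-\lambda t}\, d\lambda \\
H\, g_{m\lambda}(z) &= -\lambda\, g_{m\lambda}(z) \\
\text{IC}: \int g_{m\lambda}(z)\, d\lambda &= z^m
\end{aligned} \tag{3.40}$$

In our case, with $\kappa = k_s/k_d$, this specializes to:

$$\left[k_d (z - 1)\left[\frac{\partial}{\partial z} - \kappa\right] - \lambda\right] g_{m\lambda}(z) = 0$$

$$g_{m\lambda}(z) = c_m(\lambda)\, (z - 1)^{\lambda/k_d}\, e^{z\kappa}$$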

In Mathematica this amounts to:

    DSolve[(z - 1) (g'[z] - κ g[z]) - λ g[z] == 0, g[z], z]

    {{g[z] -> E^(z κ + λ Log[-1 + z]) C[1]}}

Using the initial conditions we can calculate:

$$g_m(z, t) = \left(1 + (z - 1)\, e^{-k_d t}\right)^m e^{\kappa (z-1)\left(1 - e^{-k_d t}\right)}$$

This may be expanded out in powers of $z$, giving for $m = 0$ a Poisson distribution with mean

$$\lambda = \kappa\left(1 - e^{-k_d t}\right).$$

$\lambda$ converges exponentially from 0 to $\kappa$, with time constant $1/k_d$.

Special case: $\kappa = 0$, for $A \to \emptyset$. Then

$$g_m(z, t) = \left(1 + (z - 1)\, e^{-k_d t}\right)^m = \left(z\, e^{-k_d t} + \left(1 - e^{-k_d t}\right)\right)^m = \sum_{l=0}^{m} z^l \binom{m}{l} p^l (1 - p)^{m-l}$$

where

$$p = e^{-k_d t}.$$

The mean of the binomial distribution falls exponentially from $m$ to 0, with time constant $1/k_d$.
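As a numerical cross-check of the synthesis-decay solution, the following is a minimal sketch (not part of the original notes) that integrates a truncated master equation with a hand-written Runge-Kutta step and compares the result for $m = 0$ with the Poisson distribution of mean $\kappa(1 - e^{-k_d t})$. The truncation size `nmax`, the step count, and all function names are illustrative assumptions.

```python
import math

def master_rhs(p, ks, kd):
    """H p for 0 -> A (rate ks) and A -> 0 (rate kd per molecule), truncated at nmax."""
    nmax = len(p) - 1
    dp = [0.0] * len(p)
    for n, pn in enumerate(p):
        if n < nmax:                 # synthesis n -> n+1 (blocked at the truncation boundary)
            dp[n] -= ks * pn
            dp[n + 1] += ks * pn
        if n > 0:                    # decay n -> n-1 at total rate kd * n
            dp[n] -= kd * n * pn
            dp[n - 1] += kd * n * pn
    return dp

def integrate(p0, ks, kd, t, steps=2000):
    """Classical fourth-order Runge-Kutta integration of dp/dt = H p."""
    h = t / steps
    p = list(p0)
    for _ in range(steps):
        k1 = master_rhs(p, ks, kd)
        k2 = master_rhs([x + 0.5 * h * k for x, k in zip(p, k1)], ks, kd)
        k3 = master_rhs([x + 0.5 * h * k for x, k in zip(p, k2)], ks, kd)
        k4 = master_rhs([x + h * k for x, k in zip(p, k3)], ks, kd)
        p = [x + h * (a + 2 * b + 2 * c + d) / 6.0
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p

def poisson_pmf(mean, n):
    return math.exp(-mean) * mean ** n / math.factorial(n)

ks, kd, t, nmax = 2.0, 1.0, 2.0, 40      # illustrative parameter values
p0 = [1.0] + [0.0] * nmax                # m = 0 molecules initially
p_t = integrate(p0, ks, kd, t)
mean = (ks / kd) * (1.0 - math.exp(-kd * t))
max_err = max(abs(p_t[n] - poisson_pmf(mean, n)) for n in range(nmax + 1))
```

The truncation is harmless here only because the Poisson tail at `nmax` is negligible for these parameters; larger $\kappa$ would need a larger state space.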

Solution methods for the master equation

An equivalent representation of the annihilation and creation operators is given by their respective effects on a generating function $G(z)$, with one symbolic variable $z_{(\tau,x)}$ for each allowed term type and parameter combination:

$$G(z) = \sum_{\{n(\tau,x)=0\}}^{\infty} \Pr\nolimits_{\{n(\tau,x)\}} \prod_{\{(\tau,x)\}} \left(z_{(\tau,x)}\right)^{n(\tau,x)} = \sum_{\{n(\tau,x)=0\}}^{\infty} \Pr\nolimits_{\{n(\tau,x)\}} \exp\left[\sum_\tau \int dx\; n(\tau, x)\, \mu(\tau, x)\right]. \tag{3.41}$$

The variable $z$ corresponds to the “fugacity” for creation of instances of a term $\tau(x)$ in the grand canonical ensemble in statistical mechanics, and $\mu(\tau, x) \equiv \log z(\tau, x)$ is the corresponding “chemical potential”. Then we map

$$a_{(\tau,x)} \mapsto \partial_{z(\tau,x)},\quad \hat a_{(\tau,x)} \mapsto z(\tau, x),\quad \text{and}\quad \left[a_{(\tau,x)}, \hat a_{(\tau',x')}\right] = \delta_{\tau\tau'}\, \delta(x - x'). \tag{3.42}$$

The most involved part of this translation is just subtracting the diagonal probability-balance term, to go from $\hat O_r$ to $O_r$. In this representation, the “Master equation” is a first-order linear partial differential equation in many variables. In addition, boundary conditions are linear in $G$, second- and higher-order correlations are expectations computed as linear operators acting on $G$, and separation of time and space factors in the PDE solution gives a further linear analysis (analogous to an eigenvalue problem) of $G$.

Solvable Examples

Exponential decay of a single particle: Hamiltonian

$$H = \begin{pmatrix} 0 & \lambda \\ 0 & -\lambda \end{pmatrix} = \lambda\,(a - N)_{2\times 2},$$

in which the creation and annihilation operators have been projected into the two-dimensional subspace corresponding to presence of either zero or one copies of a particle. The generating function is $G_0 = 1$, $G_1 = 1 - e^{-\lambda t} + e^{-\lambda t} z$.

Poisson process: Hamiltonian $H = \rho\,(\hat a - I)$. Equivalent to a single chemical reaction denoting synthesis, $\{\emptyset \xrightarrow{\rho_i} A\}$. Generating function starting from $m$-particle initial condition:

$$G_m(z, t) = z^m \exp(\rho (z - 1) t) = e^{-\rho t} \sum_{n=0}^{\infty} \frac{(\rho t)^n}{n!}\, z^{n+m}$$

Population decay process: $H = \rho\,(a - N)$. Equivalent to a single chemical reaction denoting decay or uncatalysed degradation $\{A \xrightarrow{\rho_d} \emptyset\}$. Generating function:

$$G_m(z, t) = \left((z - 1)\, e^{-\rho t} + 1\right)^m = \sum_{n=0}^{m} \binom{m}{n} e^{-n\rho t} \left(1 - e^{-\rho t}\right)^{m-n} z^n, \qquad n \sim \mathrm{Binomial}(m, e^{-\rho t}).$$

(This can be calculated from the PDE in $(z, t)$ or seen from the single-particle decay generating function.)

Galton-Watson birth-death process, equivalent to a chemical reaction network with decay and a “chain reaction” $\{A \xrightarrow{\rho_b} 2A,\ A \xrightarrow{\rho_d} \emptyset\}$:

$$H = \rho_b\left(\hat a^2 a - N\right) + \rho_d\,(a - N)$$

$$G_m(z, t) = \left(\frac{\left(\rho_b - \rho_d\, e^{(\rho_b - \rho_d) t}\right) z + \rho_d\left(e^{(\rho_b - \rho_d) t} - 1\right)}{\rho_b\left(1 - e^{(\rho_b - \rho_d) t}\right) z + \left(\rho_b\, e^{(\rho_b - \rho_d) t} - \rho_d\right)}\right)^{m}.$$

Galton-Watson birth-death process with immigration, equivalent to a chemical reaction network with synthesis, decay, and a “chain reaction” $\{A \xrightarrow{\rho_b} 2A,\ \emptyset \xrightarrow{\rho_i} A,\ A \xrightarrow{\rho_d} \emptyset\}$:

$$H = \rho_b\left(\hat a^2 a - N\right) + \rho_d\,(a - N) + \rho_i\,(\hat a - I)$$

$$G_m(z, t) = \left(\frac{\rho_b - \rho_d}{\rho_b\left(1 - e^{(\rho_b - \rho_d) t}\right) z + \left(\rho_b\, e^{(\rho_b - \rho_d) t} - \rho_d\right)}\right)^{\rho_i/\rho_b} \left(\frac{\left(\rho_b - \rho_d\, e^{(\rho_b - \rho_d) t}\right) z + \rho_d\left(e^{(\rho_b - \rho_d) t} - 1\right)}{\rho_b\left(1 - e^{(\rho_b - \rho_d) t}\right) z + \left(\rho_b\, e^{(\rho_b - \rho_d) t} - \rho_d\right)}\right)^{m}.$$

The detailed mathematical theory of such birth/death processes can be extended to multiple object types, and includes conditions for the exclusion of “explosions” in which the number of terms generated grows to infinity in a finite amount of time.
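The birth-death generating function quoted above (without immigration) can be spot-checked numerically. The sketch below is an illustration only: it assumes the operator-to-PDE map $\hat a \mapsto z$, $a \mapsto \partial/\partial z$, under which the Hamiltonian becomes $(z - 1)(\rho_b z - \rho_d)\,\partial/\partial z$, and tests the PDE by central finite differences. Function names, the step `h`, and the parameter values are assumptions.

```python
import math

def gw_pgf(z, t, rb, rd, m=1):
    """Generating function G_m(z, t) of the birth-death process as quoted above (rb != rd)."""
    e = math.exp((rb - rd) * t)
    num = (rb - rd * e) * z + rd * (e - 1.0)
    den = rb * (1.0 - e) * z + (rb * e - rd)
    return (num / den) ** m

def pde_residual(z, t, rb, rd, m=1, h=1e-5):
    """dG/dt - (z - 1)(rb z - rd) dG/dz, estimated with central differences."""
    dG_dt = (gw_pgf(z, t + h, rb, rd, m) - gw_pgf(z, t - h, rb, rd, m)) / (2 * h)
    dG_dz = (gw_pgf(z + h, t, rb, rd, m) - gw_pgf(z - h, t, rb, rd, m)) / (2 * h)
    return dG_dt - (z - 1.0) * (rb * z - rd) * dG_dz

rb, rd, m = 0.7, 1.2, 3                      # illustrative rates and initial count
residual = pde_residual(0.4, 0.8, rb, rd, m)
normalization = gw_pgf(1.0, 0.8, rb, rd, m)  # total probability, should be 1
initial = gw_pgf(0.4, 0.0, rb, rd, m)        # should equal z**m at t = 0
```

A small residual at a few sample points does not prove the formula, but it catches sign and exponent slips of the kind that are easy to make when transcribing it.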

Synthesis-decay chemical reaction network $\{\emptyset \xrightarrow{k_s} A,\ A \xrightarrow{k_d} \emptyset\}$ (as above):

$$H = k_s(\hat a - I) + k_d(a - N)$$

Defining $\kappa = k_s/k_d$, the differential equation and its solution by separation of variables are

$$G_m(z, t) = \int g_{m\lambda}(z)\, h_{m\lambda}(t)\, d\lambda$$

$$\left[k_d (z - 1)\left[\frac{\partial}{\partial z} - \kappa\right] - \lambda\right] g_{m\lambda}(z) = 0 = \frac{\partial}{\partial t} h_{m\lambda}(t) + \lambda\, h_{m\lambda}(t)$$

$$g_{m\lambda}(z) = c_m(\lambda)\, (z - 1)^{\lambda/k_d}\, e^{z\kappa}$$

and finally

$$g_m(z, t) = \left(1 + (z - 1)\, e^{-k_d t}\right)^m e^{\kappa (z-1)\left(1 - e^{-k_d t}\right)}.$$

Bidirectional conversion chemical reaction $\{A \xrightarrow{k_f} B,\ B \xrightarrow{k_r} A\}$:

Note that conservation of $A + B$ makes this actually a finite-dimensional system, provided that the initial conditions give zero probability to all values of $A$ and $B$ above some finite limit. Whether this condition is satisfied or not, the solution is as follows.

$$H = k_f(\hat a_2 a_1 - N_1) + k_r(\hat a_1 a_2 - N_2)$$

Defining $g_{\{m\}\lambda}(z) = z_1^{m(1)} z_2^{m(2)}\, g_\lambda(z)$, $z = z_1/z_2$, and $\kappa = k_f/k_r$, the method of characteristics gives

$$g_{\{m\}}(z, t) = e^{z m_2}\, (z + \kappa)^{-\kappa (m_1 + m_2)}\; C\left[t - \frac{\log(z - 1) + \kappa \log(z + \kappa)}{k_f + k_r}\right]$$

and using the initial condition $g_{\{m\}}(z, 0) = 1$ we find

$$g_{\{m\}}(z, t) = e^{m_2 (z - \varphi(z, t))} \left(\frac{\varphi(z, t) + \kappa}{z + \kappa}\right)^{\kappa (m_1 + m_2)}$$

where

$$\psi_{\kappa, k_r}(z) \equiv \log(z - 1) + \kappa \log(z + \kappa), \qquad \psi_{\kappa, k_r}(\varphi(z, t)) = -(1 + \kappa)\, k_r\, t + \psi_{\kappa, k_r}(z).$$

This solves the model. For example we can compute the average amount of reaction product $\langle n_2 \rangle$ as a function of time:

$$\langle n_2(t) \rangle = \left.\frac{\partial \log G_m(z, t)}{\partial z_2}\right|_{z=1} = m_2 + \langle n(t) \rangle = m_2\, e^{-(k_f + k_r) t} + \frac{k_f}{k_f + k_r}\, (m_1 + m_2)\left(1 - e^{-(k_f + k_r) t}\right)$$

Correlations can be calculated to any order. A few more elaborate reactions have been solved in equilibrium using hypergeometric functions [18].
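The closed form for $\langle n_2(t) \rangle$ can be compared against direct integration of the finite master equation over $n_2 = 0, \ldots, m_1 + m_2$. This is a minimal sketch under assumed parameter values; the Runge-Kutta integrator and all names are illustrative and not taken from the original notes.

```python
import math

def mean_n2_formula(t, m1, m2, kf, kr):
    """Closed form for <n_2(t)> quoted above."""
    decay = math.exp(-(kf + kr) * t)
    return m2 * decay + kf / (kf + kr) * (m1 + m2) * (1.0 - decay)

def rhs(p, kf, kr):
    """H p for A -> B (rate kf per A) and B -> A (rate kr per B); index = number of B molecules."""
    total = len(p) - 1                  # conserved total m1 + m2
    dp = [0.0] * len(p)
    for n2, pn in enumerate(p):
        fwd = kf * (total - n2) * pn    # A -> B moves n2 to n2 + 1
        bwd = kr * n2 * pn              # B -> A moves n2 to n2 - 1
        dp[n2] -= fwd + bwd
        if n2 < total:
            dp[n2 + 1] += fwd
        if n2 > 0:
            dp[n2 - 1] += bwd
    return dp

def mean_n2_master(t, m1, m2, kf, kr, steps=2000):
    """Integrate dp/dt = H p with classical RK4 from the pure state n_2 = m2; return <n_2>."""
    p = [0.0] * (m1 + m2 + 1)
    p[m2] = 1.0
    h = t / steps
    for _ in range(steps):
        k1 = rhs(p, kf, kr)
        k2 = rhs([x + 0.5 * h * k for x, k in zip(p, k1)], kf, kr)
        k3 = rhs([x + 0.5 * h * k for x, k in zip(p, k2)], kf, kr)
        k4 = rhs([x + h * k for x, k in zip(p, k3)], kf, kr)
        p = [x + h * (a + 2 * b + 2 * c + d) / 6.0
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return sum(n2 * pn for n2, pn in enumerate(p))

m1, m2, kf, kr, t = 4, 2, 2.0, 1.0, 0.7   # illustrative values
numeric = mean_n2_master(t, m1, m2, kf, kr)
closed_form = mean_n2_formula(t, m1, m2, kf, kr)
```

Because both reactions are first order, the mean obeys a closed linear ODE, which is why the two numbers agree to integration accuracy.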

Open problems

$\{A + B \leftrightarrow C\}$ reaction “solution”: find a fast algorithm (as if there were an analytic solution in terms of known special functions) for general values of the rate parameters.

$\{A + B \leftrightarrow C,\ C \leftrightarrow A + D\}$ reaction “solution”: find a fast algorithm for general values of the rate parameters.

3.3.12 Population biology example

To cement the case that SPG’s can be used to model dynamics at a wide variety of scales, we remark that common population biology models, both deterministic and stochastic, can be expressed very concisely using biochemical reaction notation as in Section 3.11. For example, in the epidemiological susceptible-infected-susceptible (SIS) model [19] there are two types of individuals, $S$ = susceptible (but uninfected) and $I$ = infected. The processes or reactions are infection and recovery:

$S, I \to 2I$ with $\rho_i$ // infection
$I \to S$ with $\rho_r$ // recovery

The pool state is $(n_I, n_S = N - n_I)$ and the time evolution operator is $\hat H = \rho_i\, \hat a_I^2\, a_S\, a_I + \rho_r\, \hat a_S\, a_I$. Using the conservation law $n_S = N_{total} - n_I$, we may reduce dimensionality by projecting out $n_S$ from the pool state space, which is accomplished by mapping

$$a_S \mapsto N_S = \mathrm{diag}(n_S) = \mathrm{diag}(N_{total}) - \mathrm{diag}(n_I) = N_{total}\, I_I - N_I,$$

where $I_I$ is the identity matrix in the $n_I$ space. Likewise $\hat a_S \mapsto I$ and $n_I^{(\max)} \mapsto N_{total}$. Then $\hat H \mapsto \hat H' = \rho_i\, \hat a_I N_I (I - N_I/N_{total}) + \rho_r\, a_I$, which still has all nonnegative entries for $n_I \le N$. From the commutation relations we calculate $\hat H' = \rho_i (1 - 1/N_{total})\, \hat a_I^2 a_I - (\rho_i/N_{total})\, \hat a_I^3 a_I^2 + \rho_r\, a_I$. This could be formally interpreted in terms of birth and death rules for a single type, together with an unusual negative-probability rate rule “$2I \to 3I$ with $-\rho_i/N_{total}$”, in some as-yet undefined generalization of Equation 12. But such an interpretation of $\hat H'$ is not essential.

3.3.13 Derivation of Mass action from Moments of the Master Equation

Example

Starting again from the Master Equation

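$$\frac{dp}{dt} = H\, p$$

with for example

$$H = k_f(\hat a - I) + k_r(a - N)$$

we may multiply on the left by $1$ to verify conservation of probability:

$$\frac{d(1 \cdot p)}{dt} = 1 \cdot \frac{dp}{dt} = 1 \cdot H\, p = 1 \cdot \left(\hat H - \mathrm{diag}(1 \cdot \hat H)\right) p = 0.$$

We may take higher moments by interposing powers or falling factorials of $N$. E.g.:

$$\frac{d\langle n \rangle}{dt} = \frac{d(1 \cdot N \cdot p)}{dt} = 1 \cdot N \cdot \frac{dp}{dt} = 1 \cdot N H \cdot p = 1 \cdot \left(N \cdot \hat H - N \cdot \mathrm{diag}(1 \cdot \hat H)\right) p$$

Using $a\, \hat a = \hat a\, a + I$,

$$N H = \hat a\, a\left[k_f(\hat a - I) + k_r(a - N)\right] = k_f\left(\hat a^2 a + \hat a - \hat a\, a\right) + k_r\left(\hat a\, a^2 - \hat a^2 a^2 - \hat a\, a\right)$$

whence

$$\begin{aligned}
1 \cdot N H \cdot p &= 1 \cdot \left[k_f\left(\hat a^2 a + \hat a - \hat a\, a\right) + k_r\left(\hat a\, a^2 - \hat a^2 a^2 - \hat a\, a\right)\right] \cdot \sum_{n=0}^{\infty} p_n\, |n\rangle \\
&= \sum_{n=0}^{\infty} p_n\; 1 \cdot \left[k_f\left(\hat a^2 a + \hat a - \hat a\, a\right) + k_r\left(\hat a\, a^2 - \hat a^2 a^2 - \hat a\, a\right)\right] \cdot |n\rangle \\
&= \sum_{n=0}^{\infty} p_n\; 1 \cdot \left[k_f\left(n\, |n+1\rangle + |n+1\rangle - n\, |n\rangle\right) + k_r\left((n)_2\, |n-1\rangle - (n)_2\, |n\rangle - n\, |n\rangle\right)\right] \\
&= \sum_{n=0}^{\infty} p_n \left[k_f (n + 1 - n) + k_r\left((n)_2 - (n)_2 - n\right)\right] \\
&= \sum_{n=0}^{\infty} p_n \left[k_f - k_r\, n\right]
\end{aligned}$$

so finally

$$\frac{d\langle n \rangle}{dt} = k_f - k_r \langle n \rangle$$

as it should be.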
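The two identities used in this example, $1 \cdot H \cdot p = 0$ and $1 \cdot N H \cdot p = k_f - k_r \langle n \rangle$, can be checked on an explicit truncated matrix. The sketch below is illustrative: the truncation `nmax`, the test distribution `p`, and the helper names are assumptions, and `p` is chosen to vanish at the truncation boundary so that blocking synthesis there has no effect.

```python
def hamiltonian(kf, kr, nmax):
    """Truncated matrix of H = kf (a^ - I) + kr (a - N); H[i][j] is the rate from state j to i."""
    size = nmax + 1
    H = [[0.0] * size for _ in range(size)]
    for n in range(size):
        if n < nmax:
            H[n + 1][n] += kf       # creation operator term kf * a^
            H[n][n] -= kf           # balancing term -kf * I
        if n > 0:
            H[n - 1][n] += kr * n   # annihilation operator term kr * a
            H[n][n] -= kr * n       # balancing term -kr * N
    return H

def matvec(H, p):
    return [sum(h * x for h, x in zip(row, p)) for row in H]

kf, kr, nmax = 3.0, 0.5, 10
H = hamiltonian(kf, kr, nmax)
p = [0.1, 0.2, 0.3, 0.25, 0.1, 0.05] + [0.0] * (nmax - 5)   # arbitrary normalized state
dp = matvec(H, p)
total_flow = sum(dp)                                  # 1 . H . p
mean_flow = sum(n * x for n, x in enumerate(dp))      # 1 . N . H . p
mean_n = sum(n * x for n, x in enumerate(p))          # <n>
```

Here `total_flow` vanishes because every column of `H` sums to zero, which is exactly the role of the diagonal balancing term.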


General case.

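In greater generality, for the mass action operator

$$H = \sum_{r=1}^{R} k^{(r)} \left[\left(\prod_{i=1}^{I} (\hat a_i)^{n_i^{(r)}}\right)\left(\prod_{i=1}^{I} (a_i)^{m_i^{(r)}}\right) - \prod_{i=1}^{I} (N_i)_{m_i^{(r)}}\right]$$

for the reactions

$$\left\{m_i^{(r)} A_i\right\} \xrightarrow{k^{(r)}} \left\{n_i^{(r)} A_i\right\}$$

we want to rewrite the equations

$$\frac{d}{dt} \langle n_i \rangle = 1 \cdot N_i H \cdot p$$

entirely in terms of moments. First we calculate

$$(N)_m = \hat a^m a^m$$

$$N\, \hat a^n = \hat a^n N + n\, \hat a^n$$

$$N\, \hat a^n a^m = \hat a^{n+1} a^{m+1} + n\, \hat a^n a^m$$

$$N_i\, (\hat a_j)^{n_j} (a_j)^{m_j} = (\hat a_j)^{n_j + 1} (a_j)^{m_j + 1} + \delta_{ij}\, n_j\, (\hat a_j)^{n_j} (a_j)^{m_j}$$

$$\begin{aligned}
N_i H = \sum_{r=1}^{R} k^{(r)} \Bigg[ &\left(\prod_{j \ne i}^{I} (\hat a_j)^{n_j^{(r)}}\right) \left((\hat a_i)^{n_i^{(r)}+1} (a_i)^{m_i^{(r)}+1} + n_i^{(r)} (\hat a_i)^{n_i^{(r)}} (a_i)^{m_i^{(r)}}\right) \left(\prod_{j \ne i}^{I} (a_j)^{m_j^{(r)}}\right) \\
- &\left(\prod_{j \ne i}^{I} (\hat a_j)^{m_j^{(r)}}\right) \left((\hat a_i)^{m_i^{(r)}+1} (a_i)^{m_i^{(r)}+1} + m_i^{(r)} (\hat a_i)^{m_i^{(r)}} (a_i)^{m_i^{(r)}}\right) \left(\prod_{j \ne i}^{I} (a_j)^{m_j^{(r)}}\right) \Bigg]
\end{aligned}$$

Acting on the basis states, and writing $\pi_i^{(r)}(l) = \prod_{j \ne i} (l_j)_{m_j^{(r)}}$ for the factor contributed by the other species,

$$\begin{aligned}
1 \cdot N_i H \cdot p &= 1 \cdot N_i H \cdot \sum_{\{l\}}^{\infty} p_{\{l\}}\, |\ldots l_i \ldots\rangle = \sum_{\{l\}}^{\infty} p_{\{l\}}\; 1 \cdot N_i H \cdot |\ldots l_i \ldots\rangle \\
&= \sum_{r=1}^{R} k^{(r)} \sum_{\{l\}}^{\infty} p_{\{l\}}\, \pi_i^{(r)}(l)\; 1 \cdot \Big\{ \left[(l_i)_{m_i^{(r)}+1} + n_i^{(r)} (l_i)_{m_i^{(r)}}\right] |\ldots l_i + n_i^{(r)} - m_i^{(r)} \ldots\rangle - \left[(l_i)_{m_i^{(r)}+1} + m_i^{(r)} (l_i)_{m_i^{(r)}}\right] |\ldots l_i \ldots\rangle \Big\} \\
&= \sum_{r=1}^{R} k^{(r)} \sum_{\{l\}}^{\infty} p_{\{l\}}\, \pi_i^{(r)}(l) \Big\{ \left[(l_i)_{m_i^{(r)}+1} + n_i^{(r)} (l_i)_{m_i^{(r)}}\right] - \left[(l_i)_{m_i^{(r)}+1} + m_i^{(r)} (l_i)_{m_i^{(r)}}\right] \Big\} \\
&= \sum_{r=1}^{R} k^{(r)} \sum_{\{l\}}^{\infty} p_{\{l\}}\, \pi_i^{(r)}(l) \Big\{ n_i^{(r)} (l_i)_{m_i^{(r)}} - m_i^{(r)} (l_i)_{m_i^{(r)}} \Big\} \\
&= \sum_{r=1}^{R} k^{(r)} \left(n_i^{(r)} - m_i^{(r)}\right) \sum_{\{l\}}^{\infty} p_{\{l\}} \prod_j (l_j)_{m_j^{(r)}} \\
&= \sum_{r=1}^{R} k^{(r)} \left(n_i^{(r)} - m_i^{(r)}\right) \left\langle \prod_j (l_j)_{m_j^{(r)}} \right\rangle
\end{aligned}$$

Thus

$$\frac{d \langle n_i \rangle}{dt} = \sum_{r=1}^{R} k^{(r)} \left(n_i^{(r)} - m_i^{(r)}\right) \left\langle \prod_j (n_j)_{m_j^{(r)}} \right\rangle \tag{3.43}$$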

Now we have yet higher moments to calculate. They have dynamics too:

$$\frac{d}{dt} \left\langle \prod_{s=1}^{k} n_{i(s)} \right\rangle = 1 \cdot \prod_{s=1}^{k} N_{i(s)}\, H \cdot p$$

Now using

$$\begin{aligned}
(N)_k\, (\hat a)^n &= (\hat a)^k (a)^k (\hat a)^n = (\hat a)^k (a)^{k-1} \left[(\hat a)^n a + n\, (\hat a)^{n-1}\right] \\
&= (\hat a)^k (a)^{k-2} \left[(\hat a)^n (a)^2 + 2 n\, (\hat a)^{n-1} a + n (n - 1)\, (\hat a)^{n-2}\right] \\
&= (\hat a)^k \sum_{p=0}^{k} \binom{k}{p} (n)_p\, (\hat a)^{n-p} (a)^{k-p} \\
&= (\hat a)^n \sum_{p=0}^{k} \frac{(n)_p\, (k)_p}{p!}\, (\hat a)^{k-p} (a)^{k-p}
\end{aligned}$$

we calculate

$$\begin{aligned}
1 \cdot \prod_i (N_i)_{k(i)}\, H \cdot p
= \sum_{r=1}^{R} k^{(r)} \Bigg[ &1 \cdot \prod_{j=1}^{I} \left[\sum_{p=0}^{k(j)} \frac{\left(n_j^{(r)}\right)_p (k(j))_p}{p!}\, (\hat a_j)^{n_j^{(r)} + k(j) - p} (a_j)^{m_j^{(r)} + k(j) - p}\right] \cdot p \\
- &1 \cdot \prod_{j=1}^{I} \left[\sum_{p=0}^{k(j)} \frac{\left(m_j^{(r)}\right)_p (k(j))_p}{p!}\, (\hat a_j)^{m_j^{(r)} + k(j) - p} (a_j)^{m_j^{(r)} + k(j) - p}\right] \cdot p \Bigg]
\end{aligned}$$

$$= \sum_{r=1}^{R} k^{(r)} \left\langle \prod_{i=1}^{I} \left[\sum_{p=0}^{k(i)} \frac{\left(n_i^{(r)}\right)_p (k(i))_p}{p!}\, (n_i)_{m_i^{(r)} + k(i) - p}\right] - \prod_{i=1}^{I} \left[\sum_{p=0}^{k(i)} \frac{\left(m_i^{(r)}\right)_p (k(i))_p}{p!}\, (n_i)_{m_i^{(r)} + k(i) - p}\right] \right\rangle$$

$$= \sum_{r=1}^{R} k^{(r)} \sum_{\{p(i)=0\}}^{\{k(i)\}} \left[ \left(\prod_{i=1}^{I} \frac{\left(n_i^{(r)}\right)_{p(i)} (k(i))_{p(i)}}{p(i)!}\right) \left\langle \prod_{i=1}^{I} (n_i)_{m_i^{(r)} + k(i) - p(i)} \right\rangle - \left(\prod_{i=1}^{I} \frac{\left(m_i^{(r)}\right)_{p(i)} (k(i))_{p(i)}}{p(i)!}\right) \left\langle \prod_{i=1}^{I} (n_i)_{m_i^{(r)} + k(i) - p(i)} \right\rangle \right]$$

So a usable version of the moment equations is:

$$\frac{d}{dt} \left\langle \prod_i (n_i)_{k(i)} \right\rangle = \sum_{r=1}^{R} k^{(r)} \sum_{\{p(i)=0\}}^{\{k(i)\}} \left[ \left(\prod_{i=1}^{I} \frac{\left(n_i^{(r)}\right)_{p(i)} (k(i))_{p(i)}}{p(i)!}\right) - \left(\prod_{i=1}^{I} \frac{\left(m_i^{(r)}\right)_{p(i)} (k(i))_{p(i)}}{p(i)!}\right) \right] \left\langle \prod_{i=1}^{I} (n_i)_{m_i^{(r)} + k(i) - p(i)} \right\rangle \tag{3.44}$$

Notice that, again, the leading order terms ($p(i) = 0$) cancel.

Fortunately, it is possible to map back and forth between integer powers of a variable $n$ and falling factorials of $n$, by using Stirling numbers of the first and second kind.

The primary feature of all these sets of equations is that they are infinite: there is no “closure”. Things simplify if either (a) $m = 1$, so that the corresponding grammar is context-free, or (b) we may approximate different molecular species as uncorrelated, so $\left\langle \prod_{i=1}^{I} (n_i)_{q(i)} \right\rangle \approx \prod_{i=1}^{I} \left\langle (n_i)_{q(i)} \right\rangle$, or (c) there is a small, maximal number of molecules of each type that cuts off the equation coupling.
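The conversion between powers and falling factorials mentioned above can be made concrete. The following sketch (an illustration, not from the original notes) builds Stirling numbers of both kinds from their standard recurrences and uses them in the identities $n^k = \sum_j S(k, j)\, (n)_j$ and $(n)_k = \sum_j s(k, j)\, n^j$, where $s$ denotes the signed numbers of the first kind. Function names are assumptions.

```python
def falling(n, k):
    """Falling factorial (n)_k = n (n-1) ... (n-k+1), with (n)_0 = 1."""
    out = 1
    for i in range(k):
        out *= n - i
    return out

def stirling2(kmax):
    """Stirling numbers of the second kind: S[k][j] = j S[k-1][j] + S[k-1][j-1]."""
    S = [[0] * (kmax + 1) for _ in range(kmax + 1)]
    S[0][0] = 1
    for k in range(1, kmax + 1):
        for j in range(1, k + 1):
            S[k][j] = j * S[k - 1][j] + S[k - 1][j - 1]
    return S

def stirling1(kmax):
    """Signed Stirling numbers of the first kind: s[k][j] = s[k-1][j-1] - (k-1) s[k-1][j]."""
    s = [[0] * (kmax + 1) for _ in range(kmax + 1)]
    s[0][0] = 1
    for k in range(1, kmax + 1):
        for j in range(1, k + 1):
            s[k][j] = s[k - 1][j - 1] - (k - 1) * s[k - 1][j]
    return s

def power_from_falling(n, k, S):
    """n**k expressed through falling factorials of n."""
    return sum(S[k][j] * falling(n, j) for j in range(k + 1))

def falling_from_powers(n, k, s):
    """(n)_k expressed through ordinary powers of n."""
    return sum(s[k][j] * n ** j for j in range(k + 1))

S2 = stirling2(6)
S1 = stirling1(6)
```

Since expectations are linear, the same coefficients convert moments $\langle n^k \rangle$ into factorial moments $\langle (n)_j \rangle$ and back.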

Under such simplifications, and assuming large numbers of all particles are present compared to stoichiometry numbers, we recover mass action dynamics:

$$\frac{d \langle n_i \rangle}{dt} = \sum_{r=1}^{R} k^{(r)} \left(n_i^{(r)} - m_i^{(r)}\right) \prod_j \langle n_j \rangle^{m_j^{(r)}} \tag{3.45}$$
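The mass action right-hand side is straightforward to evaluate once the input and output stoichiometries are stored as arrays. Here is a minimal sketch, assuming lists `m[r][i]` and `n[r][i]` for the reactant and product stoichiometries and `k[r]` for the rates; the example network $A + B \leftrightarrow C$ and its numbers are made up for illustration.

```python
def mass_action_rhs(x, k, m, n):
    """d<n_i>/dt = sum_r k[r] (n[r][i] - m[r][i]) prod_j x[j] ** m[r][j]."""
    dx = [0.0] * len(x)
    for rate, m_r, n_r in zip(k, m, n):
        flux = rate
        for xj, mj in zip(x, m_r):
            flux *= xj ** mj                 # product of reactant amounts
        for i in range(len(x)):
            dx[i] += (n_r[i] - m_r[i]) * flux  # net stoichiometric change of species i
    return dx

# Species order [A, B, C]; reactions A + B -> C (rate 2.0) and C -> A + B (rate 0.5).
k = [2.0, 0.5]
m = [[1, 1, 0], [0, 0, 1]]   # input (reactant) stoichiometries
n = [[0, 0, 1], [1, 1, 0]]   # output (product) stoichiometries
rates = mass_action_rhs([3.0, 2.0, 4.0], k, m, n)
```

For this state the forward flux is $2.0 \cdot 3.0 \cdot 2.0 = 12$ and the reverse flux is $0.5 \cdot 4.0 = 2$, so A and B each change at rate $-10$ and C at rate $+10$.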

3.3.14 Information flow and entropy change

A biochemical network may evolve from a single definite state of high information content to a diffuse distribution of low information content, or vice versa, depending on the network and the initial conditions. Total information $\mathcal{I} = -\sum_I p_I \log_2 p_I$ will increase or decrease accordingly. But it may be useful to have an expression that attributes changes in such a global information measure to the contributions of individual state changes. In a biochemical network, pure global states $I$ correspond to kets $|\{n_i\}\rangle$ and are discrete and countable, so we can sum rather than integrate over them. Then

$$S = -\sum_I p_I \log p_I$$

$$\begin{aligned}
\frac{dS}{dt} &= -\sum_I (\log p_I + 1)\, \frac{d p_I}{dt} \\
&= -\sum_{IJ} \log p_I\; H_{IJ}\; p_J \qquad \text{(using } \textstyle\sum_I dp_I/dt = 0\text{)} \\
&= -\sum_{IJ} (\log p_I)\, \hat H_{IJ}\, p_J + \sum_J (\log p_J)\, D_{JJ}\, p_J
\end{aligned}$$

which partitions the change in entropy into changes occurring at particular pure states $J$ and in the possible transitions from states $J$ towards new states $I$. Recall $D_{JJ} = \sum_I \hat H_{IJ}$ and $\hat H_{IJ} \ge 0$.
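The decomposition of $dS/dt$ into a transition term and a diagonal term can be checked against a finite-difference derivative of the entropy along the flow $dp/dt = (\hat H - D)\, p$. The sketch below uses a small truncated synthesis-decay network purely as an example; the rates, the test distribution, and the function names are assumptions.

```python
import math

def generator(ks, kd, nmax):
    """Off-diagonal rates H_hat[i][j] (flow from state j to i) and exit rates D[j] = sum_i H_hat[i][j]."""
    size = nmax + 1
    H_hat = [[0.0] * size for _ in range(size)]
    for n in range(size):
        if n < nmax:
            H_hat[n + 1][n] = ks       # synthesis
        if n > 0:
            H_hat[n - 1][n] = kd * n   # decay
    D = [sum(H_hat[i][j] for i in range(size)) for j in range(size)]
    return H_hat, D

def entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0)

def entropy_rate(p, H_hat, D):
    """dS/dt = -sum_IJ log(p_I) H_hat_IJ p_J + sum_J log(p_J) D_JJ p_J."""
    size = len(p)
    transition_term = -sum(math.log(p[i]) * H_hat[i][j] * p[j]
                           for i in range(size) for j in range(size))
    diagonal_term = sum(math.log(p[j]) * D[j] * p[j] for j in range(size))
    return transition_term + diagonal_term

H_hat, D = generator(ks=2.0, kd=1.0, nmax=6)
weights = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0]
p = [w / sum(weights) for w in weights]          # strictly positive test distribution
dp = [sum(H_hat[i][j] * p[j] for j in range(len(p))) - D[i] * p[i] for i in range(len(p))]
h = 1e-6
finite_difference = (entropy([x + h * d for x, d in zip(p, dp)])
                     - entropy([x - h * d for x, d in zip(p, dp)])) / (2 * h)
formula = entropy_rate(p, H_hat, D)
```

The check only confirms the algebraic identity for $dS/dt$; it says nothing about the sign of the result, which depends on the state.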

For a biochemical reaction, both terms can be broken down by the reaction number “r” involved, since $J$ and $r$ determine $I$:

$$|\{n_i(I)\}\rangle = \left|\left\{n_i(J) + n_i^{(r)} - m_i^{(r)}\right\}\right\rangle$$

$$\frac{dS}{dt} = -\sum_{rJ} \left(\log p_{J + \Delta(r)}\right) \hat H_{rJ}\, p_J + \sum_{rJ} (\log p_J)\, D^{r}_{JJ}\, p_J$$

This type of expression may be useful for computing the flow of various kinds of information within a biochemical network, by breaking it down as a sum over reactions, $r$.

For pure terminal states $J$ (for which $\forall I\ \hat H_{IJ} = 0$), the contributions to the sum over $J$ are zero. For all other states we may define

$$q_{IJ} = \hat H_{IJ} \Big/ \sum_I \hat H_{IJ} = \hat H_{IJ} / D_{JJ}.$$

For each nonterminal $J$, this is a probability distribution (since $q_{IJ} \ge 0$ and $\sum_I q_{IJ} = 1$). Then

$$\begin{aligned}
\frac{dS}{dt} &= -\sum_J \left(\sum_I (\log p_I)\, \hat H_{IJ}\right) p_J + \sum_J (\log p_J)\, D_{JJ}\, p_J \\
&= -\sum_J \left(\sum_I (\log p_I)\, q_{IJ}\right) D_{JJ}\, p_J + \sum_J (\log p_J)\, D_{JJ}\, p_J \\
&= -\sum_J \left[\left(\sum_I (\log p_I)\, q_{IJ}\right) - (\log p_J)\right] D_{JJ}\, p_J \\
&= -\sum_{IJ} \left[(\log p_I)\, q_{IJ} - (\log p_J)\, \delta_{IJ}\right] D_{JJ}\, p_J
\end{aligned}$$

and thus

$$\frac{dS}{dt} = -\sum_{IJ} (\log p_I)\, q_{IJ}\, D_{JJ}\, p_J + \sum_J (\log p_J)\, D_{JJ}\, p_J$$

which again attributes the global change of entropy to local flows.

We may use convexity of the $-\log$ function to bound $dS/dt$:

$$\begin{aligned}
\frac{dS}{dt} &\ge -\sum_{IJ} (\log p_I\, q_{IJ})\, D_{JJ}\, p_J + \sum_J (\log p_J)\, D_{JJ}\, p_J \\
&= -\sum_{IJ} \left[(\log p_I) + (\log q_{IJ})\right] D_{JJ}\, p_J + \sum_J (\log p_J)\, D_{JJ}\, p_J \\
&= -\sum_{IJ} (\log q_{IJ})\, D_{JJ}\, p_J - \sum_{IJ} (\log p_I)\, D_{JJ}\, p_J \\
&\ge 0
\end{aligned}$$

In other words we have rederived the second law of thermodynamics,

$$\frac{dS}{dt} \ge 0.$$

This result does not depend on the assumption of a biochemical network, just a continuous-time stochastic process determined as above by $\hat H$. A good guess for the limiting distribution of the process, then, would be obtained by apportioning probability between all terminal states (if any), since they have $dS/dt = 0$, based on their reachability; and if there are states that can’t reach a terminal state due to lack of connectivity of state space or lack of terminal states, by looking for all constrained or conserved quantities (such as conserved linear combinations of $n_i(I)$’s in a biochemical network) respected by the stochastic process determined by $\hat H$, and then maximizing $S$ as a function of $p_I$ with Lagrange multipliers to enforce all such constraints including $\sum_I p_I = \text{total nonterminal probability} \le 1$. The result will typically be some kind of Boltzmann distribution on the nonterminal states that can’t reach terminal states.

The nonequilibrium approach to steady state may be studied as a sequence of progressively removing constraints on the maximization of $S$, as reactions or other processes (summands of $H$ with different firing rates) with progressively longer time constants come into play and enable the violation of more conservation laws until ultimately only a few such laws are left.

Of course, any steady state $dp/dt = 0$ will enforce $dS/dt = 0$, so entropy is not necessarily maximized in a steady (mixed) state; but if it is unique, a maximal entropy mixed state is necessarily a steady (mixed) state. We have assumed that pure states are discrete and countable.

3.3.15 Continuous-time stochastic/differential hybrid systems

Continuous-time rule:

$$O_{\mathrm{DE}} = -\int d\{x\} \int d\{y\}\; \hat a(\{y\})\, a(\{x\}) \left(\sum_i \nabla_{y_i}\, v_i(\{y\}) \prod_k \delta(y_k - x_k)\right)$$

Using

$$\begin{aligned}
\langle w |\, \exp(t\, O_{\mathrm{DE}})\, | z \rangle &= \exp\left(t \sum_i v_i(\{z\})\, \nabla_{z_i}\right) \delta(\{w\} - \{z\}) \\
&= \delta\left(\{w\} - \left(\{z(0) = z\} + \int_0^t v_i(z(t'))\, dt'\right)\right)
\end{aligned} \tag{3.46}$$

We now derive a new simulation algorithm for combinations of discrete events and ODE’s, using operator algebra.

Heisenberg picture

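Let the operators, rather than the states, evolve in time according to $H_0$. This is traditionally called the “Heisenberg picture” in distinction to the “Schroedinger picture” in quantum mechanics. Here is the derivation of the Heisenberg picture for statistical field theory:

$$\begin{aligned}
\exp(t (H_0 + H_1)) &= \sum_{k=0}^{\infty} \left[\int_0^t dt_k \int_0^{t_k} dt_{k-1} \cdots \int_0^{t_2} dt_1\; e^{(t - t_k) H_0}\, H_1\, e^{(t_k - t_{k-1}) H_0}\, H_1 \cdots H_1\, e^{t_1 H_0}\right] \\
&= \sum_{k=0}^{\infty} \left[\int_0^t dt_k \int_0^{t_k} dt_{k-1} \cdots \int_0^{t_2} dt_1\; e^{t H_0} \left[e^{-t_k H_0} H_1 e^{t_k H_0}\right] \left[e^{-t_{k-1} H_0} H_1 e^{t_{k-1} H_0}\right] \cdots \left[e^{-t_1 H_0} H_1 e^{t_1 H_0}\right]\right] \\
&= e^{t H_0} \sum_{k=0}^{\infty} \left[\int_0^t dt_k \int_0^{t_k} dt_{k-1} \cdots \int_0^{t_2} dt_1 \left[e^{-t_k H_0} H_1 e^{t_k H_0}\right] \left[e^{-t_{k-1} H_0} H_1 e^{t_{k-1} H_0}\right] \cdots \left[e^{-t_1 H_0} H_1 e^{t_1 H_0}\right]\right]
\end{aligned}$$

Defining for all $t'$

$$H_1(t') = \exp(-t' H_0)\, H_1\, \exp(t' H_0),$$

we find as in [14]

$$\begin{aligned}
\exp(t (H_0 + H_1)) &= \exp(t H_0) \sum_{k=0}^{\infty} \left[\int_0^t dt_k \int_0^{t_k} dt_{k-1} \cdots \int_0^{t_2} dt_1\; H_1(t_k)\, H_1(t_{k-1}) \cdots H_1(t_1)\right] \\
&= \exp(t H_0)\; \mathcal{T} \exp\left(\int_0^t d\tau\, H_1(\tau)\right)
\end{aligned} \tag{3.47}$$

where $\mathcal{T}$ is the time-ordering super-operator

$$\mathcal{T}\left[H_1(t_i)\, H_1(t_j)\right] = \begin{cases} H_1(t_i)\, H_1(t_j) & \text{if } t_i \ge t_j \\ H_1(t_j)\, H_1(t_i) & \text{if } t_i \le t_j \end{cases}$$

(and likewise for higher order products) because

$$\begin{aligned}
\mathcal{T} \exp\left(\int_0^t d\tau\, H_1(\tau)\right) &= \sum_{k=0}^{\infty} \frac{1}{k!}\, \mathcal{T} \left[\int_0^t dt_k \int_0^t dt_{k-1} \cdots \int_0^t dt_1\; H_1(t_k)\, H_1(t_{k-1}) \cdots H_1(t_1)\right] \\
&= \sum_{k=0}^{\infty} \left[\int_0^t dt_k \int_0^{t_k} dt_{k-1} \cdots \int_0^{t_2} dt_1\; H_1(t_k)\, H_1(t_{k-1}) \cdots H_1(t_1)\right].
\end{aligned}$$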

Application to ODE + clock decay

In the present case, the operators $H_1(t_k) = -D(t_k)$ defined at different times are all simultaneously diagonal and therefore commute:

$$[D(t_i), D(t_j)] = 0.$$

Consequently, we can simply drop the time-ordering super-operator $\mathcal{T}$ and write

$$\exp(t (H_0 - D)) = \exp(t H_0)\, \exp\left(-\int_0^t dt'\, D(t')\right) \tag{3.48}$$

where

$$D(t') = \exp(-t' H_0)\, D\, \exp(t' H_0).$$

This calculation defines the “#” operation of the following section.

In our case, (Equation 48) specializes to

$$\begin{aligned}
\exp\left(t\left(-v(\{z\}) \cdot \nabla_z - D\right)\right) &= \exp\left(-t\, v(\{z\}) \cdot \nabla_z\right) \exp\left(-\int_0^t dt'\, D(z(t'))\right) \\
&= \exp(t\, O_{\mathrm{DE}})\, \exp\left(-\int_0^t dt'\, D\left(z(0) + \int_0^{t'} v(\{z\})\, d\tau\right)\right)
\end{aligned} \tag{3.49}$$

Equivalent ODE

Then

$$g(z)(t) \equiv e^{t\, v(\{z\}) \cdot \nabla_z}\, g(y)\, e^{-t\, v(\{z\}) \cdot \nabla_z} = \left(e^{t\, \mathrm{Ad}\, v(\{z\}) \cdot \nabla_z}\, g(w - z)\right) = g\left(w - \left(z + \int_0^t dt'\, v(z(t'))\right)\right)$$

Using (Equation 46)

$$\exp\left(t\, v(\{z\}) \cdot \nabla_z\right) \delta(w - z) = \delta\left(w - \left(z + \int_0^t v(z(t'))\, dt'\right)\right)$$

$$\exp\left(t\, v(\{z\}) \cdot \nabla_z\right) f(z) = f\left(z + \int_0^t v(z(t'))\, dt'\right)$$

we calculate

$$\begin{aligned}
\exp\left(t\left(v(\{z\}) \cdot \nabla_z - D\right)\right) f(z) &= \exp(t\, O_{\mathrm{DE}})\, \exp\left(-\int_0^t dt'\, D\left(z(0) + \int_0^{t'} v(\{z\})\, d\tau\right)\right) f(z) \\
&= \exp\left(-\int_0^t dt'\, D\left(z(0) + \int_0^{t'} v(\{z\})\, d\tau\right)\right) f\left(z + \int_0^t v(z(t'))\, dt'\right)
\end{aligned}$$

This is an alternative way to get the semantics expressed in (Equation 49). The problem is with the first factor: how to obtain that from ODE’s? First introduce a new state variable $\lambda_0$ involved in every ODE-related rule:

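$$\exp\left(t\left(v(\{z\}) \cdot \nabla_z - D\right)\right) f(z) = \int_{-\infty}^{\infty} d\lambda_0\; \exp(-\lambda_0)\; f\left(z + \int_0^t v(z(t'))\, dt'\right) \delta\left(\lambda_0 - \int_0^t dt'\, D\left(z(0) + \int_0^{t'} v(\{z\})\, d\tau\right)\right)$$

Set $\lambda(0) = 0$, and augment the ODE’s as follows

$$\begin{aligned}
Z &= (z, \lambda) \\
V(Z) &= (v(\{z\}), -D(z)) \\
\nabla_Z &= (\nabla_z, \partial_\lambda) \\
\hat O_{\mathrm{DE}} &= V(Z) \cdot \nabla_Z = v(\{z\}) \cdot \nabla_z - D(z)\, \partial_\lambda
\end{aligned}$$

Then

$$\begin{aligned}
\exp(t\, O_{\mathrm{DE}})\, F(Z) = F(Z(t)) &= F\left(Z(0) + \int_0^t dt'\, V(Z(t'))\right) = f(z)\, \delta(\lambda_0) = f(z(t))\, \delta(\lambda_0(t)) \\
&= f\left(z + \int_0^t dt'\, v(z(t'))\right) \delta\left(\lambda_0 - \int_0^t dt'\, D(z(t'))\right) \\
&= \int_{-\infty}^{\infty} d\lambda\; f\left(z + \int_0^t dt'\, v(z(t'))\right) \delta\left(\lambda - \int_0^t dt'\, D(z(t'))\right) \delta(\lambda_0 - \lambda)
\end{aligned}$$

so then finally

$$\exp\left(t\left(v(\{z\}) \cdot \nabla_z - D\right)\right) f(z) = \exp\left(t\, \hat O_{\mathrm{DE}}\right) \delta(\lambda_0 - \lambda)\, \exp(-\lambda_0)\, F(Z)$$

$$\begin{aligned}
H_0 &= \exp\left(t\, \hat O_{\mathrm{DE}}\right) \delta(\lambda_0 - \lambda) \\
H_1 &= \exp(-\lambda_0) \\
\hat H &= H_0 \cdot H_1
\end{aligned} \tag{3.50}$$

where $\hat H$ represents the Markov process corresponding to the simulation algorithm.

This expression provides the discrete-time semantics equivalent to the desired continuous-time semantics. Note that $H_0$ is a “guarded” version of the ODE evolution, which must terminate when its dynamical state variable $\lambda$ equals $\lambda_0$. $\lambda_0$ itself is first chosen from an exponential distribution by a suitable discrete-time rule with semantics $H_1$.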
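The algorithm described here can be sketched in a few lines: draw a threshold $\lambda_0$ from a unit exponential, integrate the ODE together with the accumulated clock $\lambda(t) = \int_0^t D(z(t'))\, dt'$, and stop when $\lambda$ reaches $\lambda_0$. The version below is a hedged illustration for a scalar state with forward Euler steps; the integrator, the step size `dt`, the cap `t_max`, the sign convention for the accumulated clock, and all function names are assumptions rather than the implementation referred to in these notes.

```python
import random

def evolve_until_fire(z0, v, D, lam0, dt=1e-4, t_max=100.0):
    """Integrate dz/dt = v(z) and d(lam)/dt = D(z) by Euler steps; stop once lam reaches lam0."""
    z, lam, t = z0, 0.0, 0.0
    while lam < lam0 and t < t_max:
        lam += D(z) * dt          # accumulated firing propensity (the guard variable)
        z += v(z) * dt            # continuous ODE evolution of the state
        t += dt
    return t, z

def sample_firing_time(z0, v, D, rng):
    """One draw of the event time: exponential threshold first, then guarded ODE evolution."""
    lam0 = rng.expovariate(1.0)   # threshold chosen from a unit exponential distribution
    return evolve_until_fire(z0, v, D, lam0)

# Deterministic check: with v = 1 and D(z) = z, lam(t) = t**2 / 2, so lam0 = 2 fires near t = 2.
t_fire, z_fire = evolve_until_fire(0.0, lambda z: 1.0, lambda z: z, 2.0)

# One random draw with a fixed seed, for illustration only.
t_random, _ = sample_firing_time(0.0, lambda z: 1.0, lambda z: z, random.Random(0))
```

With a state-independent propensity $D$ this reduces to drawing an exponentially distributed waiting time, as in the usual stochastic simulation algorithm.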

In implementations so far we have used instead

$$\hat O_{\mathrm{DE}} = V(Z) \cdot \nabla_Z = v(\{z\}) \cdot \nabla_z - D(z)\, p\, \partial_p$$

with $p = 1 - \exp(-\lambda)$.

3.3.16 Graph grammars

Graph grammars are composed of local rewrite rules for graphs (see for example [20]). We now express a class of graph grammars in terms of SPG’s. A precursor example for graph grammars is the Anabaena model in which the graph nodes represent cells and the edges are connections between adjacent cells. Pointers between objects exist in the Anabaena grammar from section 3.17.1, where cell objects are connected to neighboring cell objects in a row. For example, the second rule in the Anabaena model which divides a vegetative cell in two is:

$$\{w := C(l, c; (w_1, w_2)),\ V(w)\} \to \{w_{11} := C(\kappa l, c; (w_1, w_{12})),\ w_{12} := C((1 - \kappa)\, l, c; (w_{11}, w_2)),\ V(w_{11}),\ V(w_{12})\}$$

This rule can be converted automatically into the following form, which does not use special syntax for Object Identifiers:

$$\{C(w, l, c, w_1, w_2),\ V(w),\ \mathrm{OIDGen}(\mathrm{NextOID})\} \to \{C(\mathrm{NextOID}, \kappa l, c, w_1, \mathrm{NextOID} + 1),\ C(\mathrm{NextOID} + 1, (1 - \kappa)\, l, c, \mathrm{NextOID}, w_2),\ V(w_{11}),\ V(w_{12}),\ \mathrm{OIDGen}(\mathrm{NextOID} + 2)\}$$

Thus the $w := C(\ldots)$ notation associates a unique object reference to the variable $w$, the Object Identifier.

The following syntax introduces Object Identifier (OID) labels $L_i$ for each parameterized term, and allows labelled terms to point to one another through a graph of such labels. The graph is related to two subgraphs of neighborhood indices $N(i, \sigma)$ and $N'(j, \sigma)$ specific to the input and output sides of a rule. Like types or variables, the label symbols appearing in a rule are chosen from an alphabet $\{L_\lambda \mid \lambda \in \Lambda\}$. Unlike types but like variables $X_c$, the label symbols $L_{\lambda(i)}$ actually denote unique values in some discrete domain such as the nonnegative integers, thus serving as unique addresses or object identifiers.

A graph grammar rule is of the form, for some nonnegative-integer-valued functions $\lambda(i)$, $\lambda'(j)$, $N(i, \sigma)$, $N'(j, \sigma)$ for which $(\lambda(i) = \lambda(j)) \Rightarrow (i = j)$, $(\lambda'(i) = \lambda'(j)) \Rightarrow (i = j)$:

$$\begin{aligned}
&\left\{L_{\lambda(i)} := \tau_i\left(x_{a(i)}; \left[L_{N(i,\sigma)} \,\middle\|\, \sigma \in 1..\sigma^{\max}_{a(i)}\right]\right) \,\middle|\, i \in \mathcal{I}\right\} \\
&\quad \to \left\{L_{\lambda(i)} \,\middle|\, i \in \mathcal{I}_1 \subseteq \mathcal{I}\right\} \cup \left\{L_{\lambda'(j)} := \tau_j\left(x'_{a'(j)}; \left[L_{N'(j,\sigma)} \,\middle\|\, \sigma \in 1..\sigma^{\max}_{a'(j)}\right]\right) \,\middle|\, j \in \mathcal{J}\right\} \\
&\quad \text{with } \rho_r\left(\left[x'_{a'(j)}\right] \,\middle|\, \left[x_{a(i)}\right]\right)
\end{aligned} \tag{3.51}$$

(compare to Equation 2). Note that the fanout of the graph is limited by $\sigma_i^{\mathrm{cur}} \le \sigma^{\max}_{a(i)}$. Let

$$\begin{aligned}
\mathcal{I} &= \mathcal{I}_1 \cup \mathcal{I}_2 \quad \text{and} \quad \mathcal{I}_1 \cap \mathcal{I}_2 = \emptyset \\
\mathcal{J}_1 &= \{j \in \mathcal{J} \wedge (\exists\, i \in \mathcal{I}_2 \mid \lambda(i) = \lambda'(j))\} \\
\mathcal{J}_2 &= \{j \in \mathcal{J} \wedge (\nexists\, i \in \mathcal{I}_2 \mid \lambda(i) = \lambda'(j))\} \\
\mathcal{I}_3 &= \{i \in \mathcal{I}_2 \wedge (\nexists\, j \in \mathcal{J}_1 \mid \lambda(i) = \lambda'(j) \subseteq \mathcal{I}_2)\}
\end{aligned}$$

This syntax may be translated to an ordinary non-graph grammar rule.

3.3.17 Developmental biology example

Anabaena

As an example of application to multicellular biological modeling, a model [21] of filamentous cyanobacteria Anabaena catenula has been fully reimplemented within our “Plenum” prototype implementation of a Dynamical Grammar modeling language, with added stochasticity in the criterion for cell division. The model requires both discrete-time and continuous-duration rules. We used the Time Ordered Product expansion (Section 3.9.2) to derive a simulation algorithm for this case. A typical state of the system is shown in Figure 7(b).

In the Anabaena model, cells are attached to two neighboring cells forming a row structure which can be modeled by a simple graph (actually a one-dimensional linked list). We will use the notation of a graph grammar that is formally defined by reduction. The $w := C(l, c; (w_1, w_2))$ notation assigns an object reference to the variable $w$, the Object Identifier. $C$ is the cell object which has four parameters: $l$, the cell length; $c$, the compound concentration; and $(w_1, w_2)$, the references to the left and right neighbors. A cell can be of two types: $V(w)$, vegetative, or $H(w)$, heterocyst. When an object is used by a rule but left unchanged, we abbreviate and mention only the object identifier ($w$) in the RHS, as can be seen in the first rule. The “:=”, “;”, and “(...(...))” notations used below can all be eliminated in favor of previously defined syntax.

Defining the soft threshold function

$$\sigma(x) = \frac{\kappa}{1 + e^{-x/\tau}} \approx \begin{cases} \kappa & \text{if } x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

we write the grammar

    grammar Anabaena {

      (* continuous change in vegetative cell’s concentration and length, depending on neighbors’ parameters *)
      {w1 := C(l1, c1; (w0, w)), w := C(l, c; (w1, w2)), w2 := C(l2, c2; (w, w3)), V(w)}
        → {w1, C(l, c; (w1, w2)), w2, V(w)}
        solving {dc/dt = η (c1 + c2 - 2 c) - μ c, dl/dt = λ l}

      (* split a vegetative cell into two vegetative cells, if its length is longer than L. Connect the new cells to neighbors *)
      {w := C(l, c; (w1, w2)), V(w)}
        → {w11 := C(κ l, c; (w1, w12)), w12 := C((1 - κ) l, c; (w11, w2)), V(w11), V(w12)}
        with σ(l - L)

      (* change a vegetative cell into heterocyst if the concentration level drops below certain level *)
      {w := C(l, c; (w1, w2)), V(w)} → {w := C(l, c; (w1, w2)), H(w)}
        with σ(Θ - c)

      (* continuous change in heterocyst cell concentration and length *)
      {w := C(l, c; (w1, w2)), H(w)} → {w := C(l, c; (w1, w2)), H(w)}
        solving {dl/dt = ψ (L - l), dc/dt = φ (K - c)}
    }

As defined by the grammar rules, vegetative cells’ signal concentration $c$ is continuously changing according to the ODE $dc/dt = \eta (c_1 + c_2 - 2c) - \mu c$ which combines diffusion and decay. The vegetative cell elongates at an exponential rate $\lambda$ until it reaches a threshold length somewhere near the threshold value $L$, and then it divides into two vegetative cells. A transformation into a heterocyst cell occurs when the vegetative cell reaches a signal concentration somewhere near the threshold value $\Theta$. The heterocyst cells’ length and concentration level converge exponentially to the limit values $L$ and $K$ as described by the ODEs in the last rule. Note also the use of terms $C$, $V$ and $H$ to implement a simple type hierarchy, with $C$ as base type for $V$ and $H$.

[Figure 7: two bar-chart panels, (a) and (b); plot images not reproduced here.]

Figure 7: Simulation snapshots of the Anabaena model: The light and dark gray bars represent vegetative cells, and the black bars represent heterocyst cells. A cell’s signal concentration is signified by both the bar height and the gray level, while the bar width represents the cell’s length. System state after (a) 40 and (b) 80 iterations. The simulation was initiated with only 3 vegetative cells. Note the new heterocyst at position ~17. The distance between the heterocyst cells, both in length and in number of separating cells, remains relatively constant despite cell divisions and continuous growth.

3.3.18 Partial differential equations (PDE’s) and stochastic PDE’s

Finally, an important problem is to translate partial differential equations and stochastic partial differen-tial equations of general form into the operator algebra. This can be done by relating PDE’s and SPDE’s to largesystems of ODE’s and SDE’s, and taking the limit symbolically. Nontrivial mathematics will be needed toconfirm whether the indicated limits really exist or not.

Consider the (possibly stochastic) PDE

(3.52)FHxLÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ t

= F@FD HxL = FKFHxL, FHxLÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅx

, ... ,n FHxLÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅxn O + hHtL.

where x may be a scalar or a vector, and likewise for F. We make the following mapping to (Equation XXX):

Table 1. Ordinary vs. Partial differential objects

Ordinary differential object Partial differential object

d ê d t ê t i x xi FHxL yi F£ HxL ê xi Hpartial derivativeL d ê d FHxL Hfunctional derivativeLD (homog. scalar diffusion coef.) D Hhomog. scalar diffusion coef.L dHy - xL = ¤i dHyi - xi L DHF£ - FL = Ÿ d x dHF£ HxL - FHxLLŸ d x gHxL(ordinary integral)

Ÿ -F G@FD(functional integral)

at HxL = at H8xi <L at HFL = at H8FHxL<LWith this table of translations, the drift and diffusion operators for PDE’s and SPDE’s become

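$$O_{\text{drift}} = -\int\!\!\int \mathcal{D}\Phi\, \mathcal{D}\Phi'\; a(\Phi')\, a_t(\Phi) \left( \int dx\, \frac{\delta}{\delta \Phi'(x)}\, F[\Phi'](x)\, \Delta(\Phi' - \Phi) \right) \tag{3.53}$$

$$O_{\text{diffusion}} = D \int\!\!\int \mathcal{D}\Phi\, \mathcal{D}\Phi'\; a(\Phi')\, a_t(\Phi) \left( \int dx\, \frac{\delta^2}{\delta \Phi'(x)^2}\, \Delta(\Phi' - \Phi) \right) \tag{3.54}$$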

This gives another potential application of the time-ordered product expansion which can be used to create simulation algorithms.

With suitable PDE’s it becomes possible to represent dynamically changing manifolds, either by differential equations for the metric as in General Relativity, or for an explicit embedding into a higher dimensional space, or for an implicit embedding given by a function $f(x) = 0$ (a level set method).

3.3.19 Goal-directed approximation of dynamical systems

Here is a general, goal-oriented approximation framework in which both x and y spaces are mapped to an essential space of the “observables” of interest for a particular purpose, wherein a distortion measure is defined. Let $\zeta_x(\{z\} \mid \{x_i\})$ be a defined relationship (a given conditional distribution) between fine-scale variables x and a set of observables $\{z\}$, and let $\zeta_y(\{z\} \mid \{y_a\})$ be the relationship, to be optimized, between coarse-scale variables $\{y_a\}$ and the same set of observables $\{z\}$. Then a suitable distortion measure for optimizing $\zeta_y(\{z\} \mid \{y_a\})$ and $Q(\{y_a\})$ is

$$D[P(\{x_i\}), Q(\{y_a\})] = D_{KL}\left[ \int \zeta_x(\{z\} \mid \{x_i\})\, P(\{x_i\})\, d\{x_i\},\; \int \zeta_y(\{z\} \mid \{y_a\})\, Q(\{y_a\})\, d\{y_a\} \right] \tag{3.55}$$

If $\zeta_x$ is the identity then $\zeta_y$ is the prolongation map. If the z’s can be mapped to the y variables, on the other hand, then $\zeta_x$ plays the role of a fixed (unoptimized) restriction map. The formulation of (Equation 3.55) has similarities to an autoencoder or more generally the “Information Bottleneck” formalism of Tishby, with y and its distribution $Q(y)$ playing the role of a bottleneck in the mapping between x and z.
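For finite state spaces the distortion measure of Equation 3.55 reduces to a K-L divergence between two pushed-forward distributions over the observables. The sketch below assumes discrete x, y and z, with the two conditional distributions stored as column-stochastic matrices; the toy numbers and the names `distortion`, `Q_good`, `Q_bad` are hypothetical.

```python
import numpy as np

def kl_divergence(p, q):
    """D_KL(p || q) for discrete distributions; terms with p = 0 contribute 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def distortion(zeta_x, P, zeta_y, Q):
    """Discrete analogue of Equation 3.55: compare the distributions over
    observables z induced by the fine model (P) and the coarse model (Q).
    zeta_x[z, x] and zeta_y[z, y] are conditional probabilities of z."""
    return kl_divergence(zeta_x @ P, zeta_y @ Q)

# Hypothetical toy problem: 4 fine states, 2 coarse states, 2 observable values.
P = np.array([0.1, 0.2, 0.3, 0.4])
zeta_x = np.array([[1.0, 1.0, 0.0, 0.0],    # observable 0: x in the first half
                   [0.0, 0.0, 1.0, 1.0]])   # observable 1: x in the second half
zeta_y = np.eye(2)                          # coarse state is read off directly
Q_good = np.array([0.3, 0.7])               # matches the induced distribution
Q_bad = np.array([0.5, 0.5])

d_good = distortion(zeta_x, P, zeta_y, Q_good)
d_bad = distortion(zeta_x, P, zeta_y, Q_bad)
```

Optimizing the coarse model then amounts to minimizing this quantity over the entries of `Q` (and, if it is not fixed, over `zeta_y`).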

Generalization to time series

Everywhere in the foregoing, we may split a collection of variables such as x, y, or z into a pair such as $(x(t), x(t + \Delta t))$ and regard the modeled probability distribution as a conditional distribution $P(x(t + \Delta t) \mid x(t))$ that is independent of t (depends only on $\Delta t$) and defines a semigroup for a dynamical system - a stochastic process obeying:

$$
\begin{aligned}
P(x(t + \Delta t) \mid x(t)) &= P(x(\Delta t) \mid x(0)), \quad 0 \le \Delta t \\
P(x'(t) \mid x(t)) &= \delta(x'(t) - x(t)) \\
P(x(t + \Delta t) \mid x(t)) &= \int_{-\infty}^{\infty} \{d x(t')\}\, P(x(t + \Delta t) \mid x(t'))\, P(x(t') \mid x(t)), \quad t \le t' \le t + \Delta t
\end{aligned}
\tag{3.56}
$$
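The three semigroup conditions can be checked numerically for a finite-state, discrete-time Markov chain, where the conditional distribution over n steps is the n-th power of a one-step transition matrix. The matrix `T` below is a hypothetical example, not taken from the tutorial.

```python
import numpy as np

# Hypothetical one-step transition matrix: T[i, j] = P(x(t+1) = i | x(t) = j).
# Each column sums to 1.
T = np.array([[0.9, 0.2, 0.0],
              [0.1, 0.7, 0.3],
              [0.0, 0.1, 0.7]])

def propagator(n):
    """Conditional distribution P(x(t+n) | x(t)); depends only on the lag n."""
    return np.linalg.matrix_power(T, n)

# Zero lag gives the identity (the delta function in the discrete setting).
identity_ok = np.allclose(propagator(0), np.eye(3))
# Composition through an intermediate time (Chapman-Kolmogorov).
composition_ok = np.allclose(propagator(5), propagator(3) @ propagator(2))
# Probability is conserved at every lag.
columns_ok = np.allclose(propagator(5).sum(axis=0), 1.0)
```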

The multiscale approximation setting then becomes that of Figure A1.

Figure A1. [Diagram with nodes x(t), x(t+Δ); y(t), y(t+Δ); z(t), z(t+Δ); image not reproduced.]

We require that the joint distributions such as $P(x(t + \Delta t), x(t))$ take a specific form, and adapt the K-L distance measures proposed above for MRF learning accordingly. In this way we arrive at a fundamental multiscale model-reduction approach to the prediction of time series, including the image sequences in our featured application domain.

Chapter 4

Graphs

4.1 Network graph properties

Degree distributions

Figure: Degree distribution of C. elegans protein interactions derived by logistic regression. Ashish Bhan’s plot of Sternberg lab data.

4.2 Graph Laplacian

For a graph with adjacency matrix G, define

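$$[L\, y]_i = \sum_j G_{ij}\, y_j - \left( \sum_j G_{ji} \right) y_i$$

$$[L]_{ij} = G_{ij} - \left( \sum_k G_{ki} \right) \delta_{ij}$$

This generalizes the Laplacian operator on a fixed d-dimensional grid to other graphs. Compare to the formulas for $H = \hat{H} - D$ and $K_{IJ}$ representations of stochastic processes. The Laplacian is the time generator operator for an “isotropic” homogeneous diffusion process on the graph G.

Number of spanning trees of an undirected graph G: $|\det \tilde{L}|$, where $\tilde{L}$ is L with any one row and the corresponding column deleted (the matrix-tree theorem). The determinant of L itself vanishes because the columns of L sum to zero.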
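A short sketch of the Laplacian defined above, together with the spanning-tree count obtained from the matrix-tree theorem (the determinant of the Laplacian with one row and the matching column removed). The function names and the example graph are illustrative assumptions.

```python
import numpy as np

def graph_laplacian(G):
    """L_ij = G_ij - (sum_k G_ki) delta_ij, following the definition above."""
    G = np.asarray(G, dtype=float)
    return G - np.diag(G.sum(axis=0))

def count_spanning_trees(G):
    """Matrix-tree theorem for an undirected graph: absolute value of the
    determinant of L with row 0 and column 0 removed."""
    L = graph_laplacian(G)
    return int(round(abs(np.linalg.det(L[1:, 1:]))))

# Example: the 4-cycle 0-1-2-3-0.
G = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]])
L = graph_laplacian(G)
n_trees = count_spanning_trees(G)
```

The columns of `L` sum to zero, which is what makes it a valid generator of a probability-conserving diffusion on the graph.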

Interleaving of the spectrum: Eigenvalues are interlaced when one edge is added to a graph joining two distinct nodes, e.g. [Godsil and Royle, Algebraic Graph Theory].

The second eigenvector of the Laplacian is often suggested as a way to partition a graph into two relatively unconnected subgraphs.
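The partitioning idea can be sketched as follows, assuming an undirected graph and using the positive-semidefinite sign convention (degree matrix minus adjacency matrix), so that the "second eigenvector" is the one with the second-smallest eigenvalue. Splitting on the sign of its entries is one common heuristic; the example graph and the name `fiedler_partition` are hypothetical.

```python
import numpy as np

def fiedler_partition(G):
    """Split the nodes of an undirected graph by the sign of the eigenvector
    belonging to the second-smallest eigenvalue of (degree - adjacency)."""
    G = np.asarray(G, dtype=float)
    L = np.diag(G.sum(axis=0)) - G
    eigenvalues, eigenvectors = np.linalg.eigh(L)   # ascending eigenvalues
    fiedler_vector = eigenvectors[:, 1]
    return fiedler_vector >= 0

# Two triangles {0,1,2} and {3,4,5} joined by the single edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
G = np.zeros((6, 6))
for i, j in edges:
    G[i, j] = G[j, i] = 1.0

side = fiedler_partition(G)
```

For this graph the split separates the two triangles, cutting only the bridging edge.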

Open problem

Spectrum of graphs arising from a graph grammar.

4.3 Graph Grammars

Applications: tissue mechanics, gene duplication in nets
Pattern formation in space

As an example of a graph grammar, discrete link “color” labels can be used to specify a flexible meta-graph-grammar in terms of a matrix $G^{ab}_{i_n j_n} \in \{0, 1\}$ of allowed color transitions from parent to child links as a function of predetermined child node labels $(i_k, j_k)$ [22]. In the following grammar, boldface indices $\mathbf{i}, \mathbf{j}$ refer to sequences $(i_1, i_2, \ldots, i_{k-1})$ and $(j_1, j_2, \ldots, j_{k-1})$ which can index nodes in a binary tree, and the operation $(\mathbf{i}, i_k)$ yields the concatenated sequence $(i_1, i_2, \ldots, i_k)$ that can index child nodes. Also the variable $A_{(\mathbf{i}, i_k)}$ is a 0/1-valued “aliveness” variable, that indicates which rooted subtree of the infinite binary tree has resulted from a node-generating grammar such as binaryclustergen (Section 5.2). If the subindices $i_l$ are allowed to range over a larger set than {0,1} then the notation generalizes to trees with some larger fixed fanout than binary. With these notations, we can link up nodes randomly and recursively as follows:

grammar (discrete-time) graph-recursion (start → {node($\mathbf{i}$), link($a, \mathbf{i}, \mathbf{j}$)}) {
    start → node((0)), link(1, (0), (0))
    N := node($\mathbf{i}$) → N, {node($(\mathbf{i}, i_k)$) | $A_{(\mathbf{i}, i_k)} = 1 \wedge i_k < i_{\max}$}
        under $E = m \sum_{i_k} A_{(\mathbf{i}, i_k)}$
    link($a, \mathbf{i}, \mathbf{j}$), N := node($(\mathbf{i}, i_k)$), M := node($(\mathbf{j}, j_k)$) → {link($b, (\mathbf{i}, i_k), (\mathbf{j}, j_k)$) | $G^{ab}_{i_k j_k} = 1$}, N, M
}

Expander graph constructions have been defined in a similar manner.

A recursion relation for the resulting connection matrix, in the absence of other rules that add noise, is [Bhan and Mjolsness 2006]:

$$G^{a_0}_{(\mathbf{i}, \mathbf{j})} = \sum_{\{a_l\}} \prod_{l=1}^{L} \left( G^{a_{l-1} a_l}_{i_l j_l} \right) A_{(i_1 \ldots i_l)}\, A_{(j_1 \ldots j_l)} \tag{4.1}$$

Graph theorists tend instead to use the inverse of such operations, (edge) deletion $G \setminus e$ and contraction $G / e$.

4.4 Graph Automata

Consider the Boltzmann distribution for the energy function

$$E(x_i, G) = C_0 \sum_i F_1(x_i) + C_1 \sum_{ij} G_{ij}\, F_2(x_i, x_j) + C_2 \sum_{ij} G_{ij}\, G_{ji} + C_3 \sum_{ijk} G_{ij}\, G_{jk}\, G_{ki} \tag{4.2}$$

and the dynamics of Markov processes that obey detailed balance for this distribution.

Growth according to a simple meta-grammar (as in the previous section) could be served by an additional term (cf. Equation 4.1):

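$$C_4 \sum_{i\, j\, i_n\, j_n} \sum_{a b} G^{a}_{ij}\; G^{ab}_{i_n j_n}\; G^{b}_{\lambda(i\, i_n)\, \lambda(j\, j_n)} \tag{4.3}$$

where $\lambda$ is a lineage tree map. Alternatively the indices could be refined one at a time:

$$C_4 \sum_{i\, j\, i_n} \sum_{a b} G^{a}_{ij}\; G^{ab}_{\text{in}\; i_n}\; G^{b}_{\lambda(i\, i_n)\, j} \;+\; C_4 \sum_{i\, j\, j_n} \sum_{a b} G^{a}_{ij}\; G^{ab}_{\text{out}\; j_n}\; G^{b}_{i\, \lambda(j\, j_n)}$$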

An example of Equation 4.2 is provided by the weak spring model.
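One way to obtain a Markov process that obeys detailed balance for the Boltzmann distribution of Equation 4.2 is the Metropolis rule with a symmetric proposal. The sketch below proposes single edge flips; the Metropolis choice, the inverse temperature `beta`, and the placeholder functions `F1` and `F2` are assumptions made for illustration, not the tutorial's own dynamics.

```python
import math
import numpy as np

def energy(x, G, C, F1, F2):
    """E(x, G) of Equation 4.2 for node states x and a 0/1 adjacency matrix G.
    C is the tuple (C0, C1, C2, C3); F1 and F2 are user-supplied functions."""
    n = len(x)
    e0 = sum(F1(x[i]) for i in range(n))
    e1 = sum(G[i, j] * F2(x[i], x[j]) for i in range(n) for j in range(n))
    e2 = np.sum(G * G.T)              # sum_ij G_ij G_ji
    e3 = np.trace(G @ G @ G)          # sum_ijk G_ij G_jk G_ki
    return C[0] * e0 + C[1] * e1 + C[2] * e2 + C[3] * e3

def acceptance(e_old, e_new, beta=1.0):
    """Metropolis acceptance probability for a symmetric proposal."""
    return min(1.0, math.exp(-beta * (e_new - e_old)))

def metropolis_edge_flip(x, G, C, F1, F2, rng, beta=1.0):
    """Propose flipping one randomly chosen entry of G and accept or reject it."""
    n = len(x)
    i, j = rng.integers(n), rng.integers(n)
    G_new = G.copy()
    G_new[i, j] = 1 - G_new[i, j]
    e_old = energy(x, G, C, F1, F2)
    e_new = energy(x, G_new, C, F1, F2)
    if rng.random() < acceptance(e_old, e_new, beta):
        return G_new
    return G

# Hypothetical placeholder terms and a directed 3-cycle as starting graph.
F1 = lambda v: v ** 2
F2 = lambda a, b: (a - b) ** 2
x = np.array([0.0, 1.0, 2.0])
G = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]])
rng = np.random.default_rng(0)
G_next = metropolis_edge_flip(x, G, (1.0, 1.0, 1.0, 1.0), F1, F2, rng)
```

Detailed balance follows because the proposal is symmetric and the acceptance ratio for a move and its reverse equals the ratio of their Boltzmann weights.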

4.4.1 Weak spring model

Consider a network of weak springs with potentials that look roughly like the figure:

Figure: Weak spring potential function (but can be smoothed at point x=d). From [Shapiro and Mjolsness, Second International Conference on Systems Biology (ICSB) 2001.]

[From: The growth and development of some recent plant models: A viewpoint. E. Mjolsness, in Journal of Plant Growth Regulation, in press.]

“What is needed is a network-like model of developmental space and in particular of the mechanics of cellular compartments. An example of a “mechanical network” would be a tinkertoy arrangement of linear mechanical elements, called “struts” or “springs with nonzero resting length”, which exert force only along their axes. Truss bridges and structures can be modeled to first approximation with such elements. In our work and in computer graphics these are known as “mass-spring models”. However, connections between cells may dwindle in relative overlap or break entirely upon cell division, so that the springs should be “weak springs” that can smoothly break. All of these relationships can be modeled very simply by potential energy functions that depend only on the actual length and the resting length of a spring or strut [Shapiro and Mjolsness 2001]. This mechanical model has been used in modeling phyllotaxis [Jönsson and others 2006] where its flexible topology plays an essential role in allowing cell growth and division to make room for new primordia, allowing them to escape inhibition by the old ones.

Figure: [Mjolsness and others 2003] (a) Weak spring model with internal compression and external tension, along with (b) cell division in a (c) hexagonal array of cells with one recent cell division leads to (d) maintenance of a clonal outer layer. Figures (c) and (d) courtesy Henrik Jonsson, Lund University.

Fortunately the weak spring model allows bidirectional coupling of mechanical and regulatory network models. The regulatory network governs gene expression, metabolism, the growth of cell volume, the synthesis of structural molecules, and the cell cycle including mitosis and cell division, which again affects cell volume. Cell volume and the amounts of any structural molecules govern the individual properties (strength and resting length) of the idealized spring between neighbors. Cell positions automatically minimize the total mechanical energy, through fast Aristotelian dynamics with velocity proportional to force over viscosity. The cell positions determine their geometry including the interface area between any two cells. This interface area modulates the strength of any intercellular communication impinging on the regulatory network of each cell from the others; if it is zero, there is no direct signaling. Thus, the global regulatory network influences the mechanical network and the mechanical network influences the regulatory network.”
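The exact functional form of the weak spring potential is given in the cited paper and is not reproduced in these notes. As a rough sketch only, one potential with the properties described (it depends only on the actual and resting lengths, and exerts no force once the spring is stretched past a breaking distance d) is a quadratic that is clipped to a constant beyond d. The clipped-quadratic form and the parameter names are assumptions.

```python
def weak_spring_energy(r, k, l0, d):
    """Assumed clipped-quadratic weak spring: Hookean for lengths r < d,
    constant (broken, no force) for r >= d. Requires d > l0."""
    stretch = min(r, d) - l0
    return 0.5 * k * stretch ** 2

def weak_spring_force(r, k, l0, d):
    """Signed radial force -dV/dr: restoring below d, zero once broken."""
    if r >= d:
        return 0.0
    return -k * (r - l0)

# Hypothetical parameters: stiffness 2, resting length 1, breaking distance 1.5.
k, l0, d = 2.0, 1.0, 1.5
energy_at_rest = weak_spring_energy(1.0, k, l0, d)
energy_broken = weak_spring_energy(3.0, k, l0, d)
force_stretched = weak_spring_force(1.2, k, l0, d)
force_broken = weak_spring_force(3.0, k, l0, d)
```

This form has a kink at r = d; as the figure caption notes, the potential can be smoothed at that point.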

4.5 Graph Homology

4.5.1 Cycle spaces of networks

Networks are directed graphs and give rise to other directed graphs through projections. It may be important to find the feedback loops in such a graph, due to their potential importance in dynamics. For a large network there may be very many such cycles and potential feedback loops. We can take advantage of the cycle space to compute just a minimal basis for the cycle space, rather than all possible cycles.

Implementations of such algorithms are described in [Mehlhorn and Michail, www.mpi-sb.mpg.de/~mehlhorn/ftp/CycleBasisImpl.pdf ]
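A simple (not minimum-weight) cycle basis can be sketched with a spanning forest: every edge left out of the forest closes exactly one fundamental cycle, and these cycles form a basis of the cycle space of the underlying undirected graph. This is a simpler construction than the algorithms in the cited implementation paper, and the function name is hypothetical.

```python
from collections import deque

def fundamental_cycles(n, edges):
    """Return one cycle (a list of vertices) per non-tree edge of an
    undirected graph with vertices 0..n-1, using a BFS spanning forest."""
    adjacency = {v: [] for v in range(n)}
    for index, (a, b) in enumerate(edges):
        adjacency[a].append((b, index))
        adjacency[b].append((a, index))

    parent = {}
    tree_edges = set()
    for root in range(n):
        if root in parent:
            continue
        parent[root] = None
        queue = deque([root])
        while queue:
            v = queue.popleft()
            for w, index in adjacency[v]:
                if w not in parent:
                    parent[w] = v
                    tree_edges.add(index)
                    queue.append(w)

    def path_to_root(v):
        path = [v]
        while parent[path[-1]] is not None:
            path.append(parent[path[-1]])
        return path

    cycles = []
    for index, (a, b) in enumerate(edges):
        if index in tree_edges:
            continue
        pa, pb = path_to_root(a), path_to_root(b)
        # Drop shared ancestors above the lowest common ancestor.
        while len(pa) > 1 and len(pb) > 1 and pa[-2] == pb[-2]:
            pa.pop()
            pb.pop()
        # a -> ... -> common ancestor -> ... -> b, closed by the edge (b, a).
        cycles.append(pa + pb[-2::-1])
    return cycles

# Example: a square 0-1-2-3 with the diagonal 0-2.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
cycles = fundamental_cycles(4, edges)
```

The number of cycles returned equals (edges) - (vertices) + (connected components), the dimension of the cycle space.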

4.5.2 Homology on graphs

We attempt a definition similar to that of Chapter 5, Section 4.2 below for homology of topological spaces.

Define the ordered, directed simplex graph of dimension d:

$$\Delta_d = \left( \{ v_i \mid 0 \le i \le d \},\; \{ e_{ij} = (v_i, v_j) \mid 0 \le i, j \le d \} \right) = \left( [\, v_i \,\|\, 0 \le i \le d \,],\; [\, e_{ij} \,\|\, 0 \le i, j \le d \,] \right)$$

This is simply the fully connected clique graph with $d + 1$ nodes.

A map $\sigma$ from $\Delta_d$ to a directed or undirected graph G can’t be “continuous”; homology theories on graphs will differ by the constraints imposed on $\sigma$. For example $\sigma$ could be assumed to map directed edges (only) into directed “paths” (variously defined below according to minimal length). That would preserve the sense if not the magnitude of any flows within the two graphs, such as metabolic fluxes or information transmission. This condition on $\sigma : \Delta_d \to G$ defines a d-simplex in G. d-chains and boundary maps are defined as in the Geometry section on singular homology, below.

If “path” can be of length zero then this is a “singular” homology. If the path must be of length 1 or more and G is undirected, then the boundaryless 1-chains (members of $\mathrm{Ker}(\partial_1)$) are closed paths of length 3 or more (since the length 2 closed paths cancel out), which is the definition of a cycle in an undirected graph. $\mathrm{Ker}(\partial_1)$ is the “cycle space”.

Following Reinhard Diestel, Graph Theory (2000): For undirected graphs, another idea of homology is to define it over the field $\mathbb{Z}_2 = \{0, 1\}$ rather than $\mathbb{Z}$. Then the incidence matrix from edges to nodes is identical with the map $\partial_1$. It has a kernel (the cycle space $\mathcal{C}$) and its transpose has an image (the cutset space $\mathcal{C}^*$). These two spaces are orthogonal (over the field $\mathbb{Z}_2$) and their dimensions add up to the number of edges in the graph. Diestel Amer. Math. Monthly 111 (2004) 559-57 finds that such homology is “nontrivial” for infinite graphs.
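The statements about the cycle and cutset spaces over the two-element field can be checked on a small example by row-reducing the node-edge incidence matrix modulo 2. The helper names and the example graph are illustrative assumptions.

```python
import numpy as np

def incidence_mod2(n, edges):
    """Node-by-edge incidence matrix of an undirected graph over {0, 1}."""
    B = np.zeros((n, len(edges)), dtype=np.uint8)
    for e, (a, b) in enumerate(edges):
        B[a, e] ^= 1
        B[b, e] ^= 1
    return B

def rank_mod2(M):
    """Rank of a 0/1 matrix over the two-element field (Gaussian elimination)."""
    M = M.copy() % 2
    rows, cols = M.shape
    rank = 0
    for c in range(cols):
        pivot = next((r for r in range(rank, rows) if M[r, c]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]      # swap rows
        for r in range(rows):
            if r != rank and M[r, c]:
                M[r] ^= M[rank]                  # add rows modulo 2
        rank += 1
        if rank == rows:
            break
    return rank

# Example: a square 0-1-2-3 with the diagonal 0-2 (4 nodes, 5 edges).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
B = incidence_mod2(4, edges)

cut_dim = rank_mod2(B)                  # dimension of the image of B transpose
cycle_dim = len(edges) - cut_dim        # dimension of the kernel of B

cycle = np.array([1, 1, 0, 0, 1], dtype=np.uint8)   # triangle 0-1-2 as an edge set
cut = B.T @ np.array([1, 0, 0, 0], dtype=np.uint8)  # edges separating node 0
```

Here `cycle` lies in the kernel of `B`, `cut` lies in the image of its transpose, and their inner product vanishes modulo 2.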

Returning to coefficients in $\mathbb{Z}$, if the paths must be of length 1 in either directed or undirected graph G, then members of $\mathrm{Ker}(\partial_1)$ are integer-weighted sets of basis elements, namely closed paths of length at least 2 or 3 respectively (which we can define as “cycles” in either case) and the members of $\mathrm{Im}(\partial_2)$ are similarly spanned by triangulated cycles. $H_1(G)$ is spanned by equivalence classes of cycles whose difference cycle can be triangulated.

If the paths must be of length 1 or more, then $\mathrm{Ker}(\partial_1)$ = cycle space $\mathcal{C}$ and $\mathrm{Im}(\partial_2)$ = space spanned by cycles of length 3 or more = $\mathcal{C}$ for undirected graphs.

Stosic http://arxiv.org/abs/math/0605579 finds a homology definition on knots that extends to graphs and explains classic graph and knot polynomials. Alekseyevskaya et al “Matroid Homology” ma.umist.ac.uk/avb/pdf/abgw6.pdf define two boundary operators for undirected graphs in terms of edge deletion and contraction. A key source seems to be M. Kontsevich, Feynman diagrams and low-dimensional topology, in First European Congress of Mathematics, Paris, July 6–10, 1992, Birkhauser, 1994, vol. 2, pp. 97–122.

4.6 Graphs that specify models

Data-level
    Artificial neural nets
    Gene regulation networks
    Biochemical reaction networks
    Markov Random Fields
    Bayes Networks

Model-level
    Plates; platelets
    Dependency diagrams
    Petri Nets
    SPG’s as graphs; self-application
    Category theory diagrams

Chapter 5

Geometry

5.1 Mechanics of deformable media

5.1.1 Elastic and solid mechanics: Spring model

Tissue, cells, and macromolecules have nontrivial mechanical properties and modeling them is of great importance wherever geometry matters, for example in development. One approach to such modeling is through energy and force. For example the potential energy $V(x_1, x_2)$ of a single idealized spring connecting two endpoints at $x_1$ and $x_2$ in d dimensions follows Hooke’s law:

$$V(x_1, x_2) = \frac{k}{2} \left( \| x_1 - x_2 \| - l_0 \right)^2$$

Here $l_0$ is the resting length of the spring and $\| x_1 - x_2 \| = \sqrt{\sum_{a=1}^{d} (x_{1a} - x_{2a})^2}$ is the distance between the two points. If the resting length is zero, this simplifies to the quadratic form $V(x_1, x_2) = (k/2) \sum_{a=1}^{d} (x_{1a} - x_{2a})^2$.

Observe that $V(x_1, x_2)$ is only a function of the difference $x_1 - x_2$, because adding a constant vector to both $x_1$ and $x_2$ is just a change of coordinates which can’t affect the potential energy. Observe further that $V(x_1, x_2)$ isn’t even a general function of $x_1 - x_2$, but is instead only a function of the distance $\| x_1 - x_2 \|$. The distance is invariant under rotations of the coordinate system, which is sensible since an arbitrary choice of coordinate system can’t affect a physical scalar quantity like the potential energy. See Chapter 4, Section 4.1 for another V, corresponding to a breakable “weak spring” that allows cells to separate enough to escape one another’s mechanical influence.

The force any such spring will exert on its endpoints is

$$F_i = -\frac{\partial}{\partial x_i} V(x_1, x_2)$$

which for Hooke’s law implies $F_1 = -k (x_1 - x_2)$ for the zero resting length case. Thus, force is the negative derivative of energy with respect to position. Conversely, energy is the integral of force through distance. All the forces acting on a point add up linearly, as vectors.

If there is an external force $f_{\text{external}}$ applied to the masses, then as a function of displacement $u$ from equilibrium, the corresponding energy will be $-f_{\text{external}} \cdot u$. The potential V will be approximately quadratic in u except at inflection points. The total energy will be

$$V_{\text{total}} \approx \frac{1}{2}\, u \cdot K \cdot u - f_{\text{external}} \cdot u$$

and the externally modified equilibrium will be

$$K \cdot u = f_{\text{external}} .$$

K is called the stiffness matrix.

Just as we have regulatory and metabolic networks, we may create mechanical networks by assembling large numbers of such potential energy elements. One model of an elastic material is an infinitely fine network of interconnected springs, augmented by energies for connecting and disconnecting them. These imaginary springs could be realized as individual macromolecules, molecules or even covalent bonds, arranged in an organized structure. However, the mechanical state of a cell is probably more complex than would be suggested by this picture. It has elements of active “cellular machinery” including motor molecules, viscous fluid flow, mechanical cables, struts, membranes, and so on, at a small scale with thermal noise that turns our everyday mechanical intuition into an imperfect guide.
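As a small illustration of a mechanical network and its stiffness matrix, consider a line of point masses joined to each other and to two fixed walls by identical springs, with displacements measured along the line. The chain geometry, parameter values and function name are assumptions chosen to keep the example short.

```python
import numpy as np

def chain_stiffness(n, spring_constant):
    """Stiffness matrix for n masses in a line, joined to their neighbors and
    to two fixed walls by n + 1 identical springs."""
    K = 2.0 * spring_constant * np.eye(n)
    K -= spring_constant * (np.eye(n, k=1) + np.eye(n, k=-1))
    return K

n, spring_constant = 5, 10.0
K = chain_stiffness(n, spring_constant)

# External force applied to the middle mass only.
f_external = np.zeros(n)
f_external[2] = 1.0

# Externally modified equilibrium: solve K u = f_external.
u = np.linalg.solve(K, f_external)
```

The middle mass sits between two groups of three springs in series, so its displacement is the force divided by 2k/3, which the solve reproduces.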

Dynamics may be deduced from force in several different ways depending on the length scale involved. Three dynamics domains may be identified: Newtonian, Aristotelian, and Brownian. The most fundamental of these descriptions is Newtonian: $F = m a = m\, d^2 x / dt^2$, so that force is proportional to acceleration with proportionality constant equal to the mass of the object being accelerated by the force. However, the total force that appears in this equation may contain velocity-dependent terms such as $-\gamma \dot{x} = -\gamma\, dx/dt$ in an energy-dissipating system. Such terms oppose movement and can be larger than the acceleration term $m a$ within cells. In that case we have “Aristotelian dynamics” $\gamma \dot{x} = \hat{F}(x)$, which reflects continual force balance between position-dependent and velocity-dependent forces. This situation approximates the pre-Newtonian suggestion of Aristotle that force should be proportional to velocity, though it is only an approximation to a fundamental, underlying Newtonian dynamical system. The Aristotelian dynamical domain is typical for the cellular scale. At smaller scales, random forces generated by collisions with yet smaller molecules undergoing random thermal motion become important. For particles too large for their velocities to be thermalized directly, this is the “Brownian motion” first explained quantitatively by Einstein. In this case, force balance includes a random term which scales up as the square root of the temperature. For the remainder of this section we will ignore these random effects and work in the Aristotelian domain.

Gradient descent optimization dynamics will then be

$$\gamma\, \frac{\partial x_i}{\partial t} = -\frac{\partial V}{\partial x_i} = F_i \tag{5.1}$$
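Equation 5.1 can be integrated directly. The sketch below relaxes a single Hookean spring with nonzero resting length in two dimensions using forward-Euler steps; the step size, parameter values and function names are illustrative assumptions.

```python
import numpy as np

def spring_forces(x1, x2, k, l0):
    """Forces F_1, F_2 = -dV/dx_i for V = (k/2) (|x1 - x2| - l0)^2."""
    delta = x1 - x2
    r = np.linalg.norm(delta)
    f1 = -k * (r - l0) * delta / r
    return f1, -f1

def relax(x1, x2, k, l0, gamma, dt, steps):
    """Forward-Euler integration of gamma * dx_i/dt = F_i (Equation 5.1)."""
    for _ in range(steps):
        f1, f2 = spring_forces(x1, x2, k, l0)
        x1 = x1 + dt * f1 / gamma
        x2 = x2 + dt * f2 / gamma
    return x1, x2

# A spring stretched to three times its resting length.
x1, x2 = relax(np.array([0.0, 0.0]), np.array([3.0, 0.0]),
               k=1.0, l0=1.0, gamma=1.0, dt=0.01, steps=2000)
final_length = np.linalg.norm(x1 - x2)
```

The endpoints move toward each other until the spring reaches its resting length, while the midpoint stays fixed because the two forces are equal and opposite.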

Incidentally the same dynamics can be formally derived if the action functional to be optimized depends on $\dot{x}$ quadratically (representing energy dissipation) e.g. using the nonstandard “greedy functional derivative” [Mjolsness and Miranker]:

$$S(t) = \int^{t} dt \left[ \frac{\gamma}{2} \sum_i \dot{x}_i^2 + \sum_i \frac{\partial V}{\partial x_i}\, \dot{x}_i \right]$$

$$\frac{\delta_G S(t)}{\delta_G \dot{x}_i(t)} = 0 = \gamma\, \frac{\partial x_i}{\partial t} + \frac{\partial V}{\partial x_i} .$$

5.1.2 Continuum elastic mechanics

We now consider simple models of position-dependent and velocity-dependent force, for a continuous medium. This is an extreme limit in which a cell is modeled mechanically not as a handful of springs, but an infinite collection of infinitesimal springs. The reality lies in between, with polymeric macromolecules acting as numerous interconnected springs with heterogeneous properties as a function of space e.g. cellular structures (cytoskeleton, nucleus, bare and reinforced membranes, etc.).

In a continuous medium, it is convenient to model stress instead of force. Force is a vector: $F_a$ is the component of this force vector acting at a point in direction a. Stress is a two-index tensor: $\sigma_{ab}$ is the force per unit area acting in direction b, across a tiny unit of area whose surface is perpendicular to direction a. In this way we pass immediately from one spring to the limit of a network of infinitely many springs or other mechanical elements. Stress, like force, arises from position-dependent and velocity-dependent terms which are added together to obtain (in the Aristotelian domain) zero.

To write out a potential energy function analogous to $V(x_1, x_2)$ for a continuous mechanical medium, we need variables which generalize the spring-end positions $(x_1, x_2)$ used above. Instead of a position vector x indexed by an integer i, we define a displacement vector u indexed by a starting position x, resulting in a displacement function $u(x)$ which represents the amount by which position x in the medium has been moved to achieve some new position $x'(x) = x + u(x)$. Just as the spring potential is unaffected by a global translation of coordinate system, and is therefore a function only of $x_1 - x_2$, the new continuous-medium potential can only be a function of differences of $u(x)$ at neighboring positions x. Therefore we define the strain tensor, $\epsilon$:

$$\epsilon_{ab}(x) \equiv \frac{1}{2} \left( \frac{\partial u_a(x)}{\partial x_b} + \frac{\partial u_b(x)}{\partial x_a} \right)$$

The reason for the symmetrization of indices ($\epsilon_{ab} = \epsilon_{ba}$) is more subtle - it guarantees that only combinations of derivatives that change the infinitesimal distance between some neighboring pairs of points are considered [Landau and Lifshitz vol. 7]. In particular, global rotations about any point do not affect $\epsilon_{ab}$. The desired potential energy V will be a function of $\epsilon$. Since $\epsilon(x)$ is itself a function, V is a function of a function, also called a “functional” and denoted by square brackets $V[\epsilon(x)]$.
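The effect of symmetrization can be seen for displacement fields that are linear in x, where the displacement gradient is a constant matrix. In the sketch below (an illustration with hypothetical values), a small rigid rotation has an antisymmetric gradient and so produces zero strain to first order in the rotation angle, while a uniaxial stretch does not.

```python
import numpy as np

def strain_from_gradient(grad_u):
    """epsilon_ab = (1/2)(du_a/dx_b + du_b/dx_a), where grad_u[a, b] = du_a/dx_b."""
    return 0.5 * (grad_u + grad_u.T)

theta = 1e-3
# Infinitesimal rotation about the third axis: u = theta * (z-hat cross x).
rotation_gradient = np.array([[0.0, -theta, 0.0],
                              [theta, 0.0, 0.0],
                              [0.0, 0.0, 0.0]])
# Uniform 1% stretch along the first axis.
stretch_gradient = np.diag([0.01, 0.0, 0.0])

strain_rotation = strain_from_gradient(rotation_gradient)
strain_stretch = strain_from_gradient(stretch_gradient)
```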

Of course the simplest potential energy function is just V = 0, which is appropriate for a medium in which a static displacement field $u(x)$ doesn’t produce any change in potential energy or require any forces, if it is achieved very slowly. This is true in an idealized fluid (a liquid or gas), in which the force terms arise from velocities $v = d u(x) / dt$ rather than directly from displacements $u(x)$.

We would like this potential energy functional to have the following properties: it should be invariant under rotation and translation, and it should add up over subvolumes if the continuous medium is arbitrarily partitioned. One form that has these properties is:

$$V[\epsilon(x)] = \int_{\text{volume}} \left( \lambda\, \mathrm{tr}(\epsilon(x))^2 + \mu\, \mathrm{tr}\left( \epsilon(x)^2 \right) \right) dx = \int_{\text{volume}} \mathcal{V}(\epsilon)\, dx$$

where $\mathrm{tr}\, A = \mathrm{trace}(A)$ is rotationally invariant. In this model the material properties $\lambda$ and $\mu$ are homogeneous, but they can be turned into functions of position x at the cost of requiring iteration over time, to take into account the transport of inhomogeneities by displacements.

The general quadratic form for linear elasticity is:

$$V[\epsilon(x)] = \frac{1}{2} \int_{\text{volume}} \sum_{abcd} \lambda_{abcd}(x)\, \epsilon_{ab}(x)\, \epsilon_{cd}(x)\, dx$$

where $\lambda_{abcd} = \lambda_{bacd} = \lambda_{abdc} = \lambda_{cdab}$.

As for the single spring model, we may differentiate the potential energy to get forces, although these now come in the form of the stress tensor:

$$
\begin{aligned}
\sigma(x) = \frac{\delta V[\epsilon(x)]}{\delta \epsilon(x)} = \frac{\partial \mathcal{V}(\epsilon)}{\partial \epsilon}
&= 2 \mu\, \epsilon(x) + \lambda\, I\, \mathrm{tr}(\epsilon(x)) \\
&= 2 \mu \left( \epsilon(x) - \frac{I}{3}\, \mathrm{tr}(\epsilon(x)) \right) + k\, I\, \mathrm{tr}(\epsilon(x)) \quad \text{[Landau and Lifshitz]} \\
&= E' \left[ \epsilon(x) + \nu'\, I\, \mathrm{tr}(\epsilon(x)) \right] \quad \text{[Murray]}.
\end{aligned}
$$

This is the stress-strain relationship for isotropic, linearly elastic bodies with the given quadratic energy function. The net elastic force acting on a point is given by the divergence of this quantity:
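The isotropic stress-strain relationship can be checked numerically by differentiating an energy density with respect to each strain component. The sketch assumes the Landau-Lifshitz normalization of the energy density, $(\lambda/2)\, \mathrm{tr}(\epsilon)^2 + \mu\, \mathrm{tr}(\epsilon^2)$, under which the stress is exactly $2\mu\epsilon + \lambda I\, \mathrm{tr}(\epsilon)$ in three dimensions; it also assumes that the modulus multiplying the pure-compression term is the bulk modulus $\lambda + 2\mu/3$. Parameter values and function names are hypothetical.

```python
import numpy as np

def energy_density(eps, lam, mu):
    """Assumed normalization: (lam/2) tr(eps)^2 + mu tr(eps^2)."""
    return 0.5 * lam * np.trace(eps) ** 2 + mu * np.trace(eps @ eps)

def stress(eps, lam, mu):
    """Isotropic linear-elastic stress: 2 mu eps + lam I tr(eps)."""
    return 2.0 * mu * eps + lam * np.eye(3) * np.trace(eps)

def numerical_stress(eps, lam, mu, h=1e-4):
    """Central-difference derivative of the energy density with respect to
    each component eps_ab, treating all nine components as independent."""
    sigma = np.zeros((3, 3))
    for a in range(3):
        for b in range(3):
            d = np.zeros((3, 3))
            d[a, b] = h
            sigma[a, b] = (energy_density(eps + d, lam, mu)
                           - energy_density(eps - d, lam, mu)) / (2.0 * h)
    return sigma

lam, mu = 2.0, 1.0
eps = np.array([[0.010, 0.002, 0.000],
                [0.002, -0.004, 0.001],
                [0.000, 0.001, 0.003]])   # symmetric strain

sigma = stress(eps, lam, mu)
sigma_numeric = numerical_stress(eps, lam, mu)

# Equivalent split into a traceless (shear) part and a pure compression part.
bulk_modulus = lam + 2.0 * mu / 3.0
sigma_split = (2.0 * mu * (eps - np.eye(3) * np.trace(eps) / 3.0)
               + bulk_modulus * np.eye(3) * np.trace(eps))
```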

$$F_{\text{elastic}}(x) = \nabla \cdot \sigma(x) = \sum_a \frac{\partial \sigma_{ab}(x)}{\partial x_a}$$

For the general quadratic,

$$\sigma_{ab}(x) = \frac{\delta V[\epsilon(x)]}{\delta \epsilon_{ab}(x)} = \sum_{cd} \lambda_{abcd}(x)\, \epsilon_{cd}(x)$$

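which is the general stress-strain relationship for linear elasticity.

The steady state can be obtained by minimizing potential energy:

$$
\begin{aligned}
0 &= -\frac{\delta V[u(x)]}{\delta u(x)}
 = -\frac{\delta}{\delta u(x)}\, \frac{1}{2} \int_{\text{volume}} \sum_{abcd} \epsilon_{ab}(x)\, \lambda_{abcd}(x)\, \epsilon_{cd}(x)\, dx \\
&= -\frac{\delta}{\delta u(x)}\, \frac{1}{2} \int_{\text{volume}} \sum_{abcd} \frac{\partial u_a(x)}{\partial x_b}\, \lambda_{abcd}(x)\, \frac{\partial u_c(x)}{\partial x_d}\, dx \\
&= \frac{\delta}{\delta u(x)}\, \frac{1}{2} \left\{ \int_{\text{volume}} \sum_a u_a(x)\, \frac{\partial}{\partial x_b} \left[ \sum_{bcd} \lambda_{abcd}(x)\, \frac{\partial u_c(x)}{\partial x_d} \right] dx
 - \int_{\text{boundary surface}} \left[ \sum_{acd} u_a(x)\, \lambda_{abcd}(x)\, \frac{\partial u_c(x)}{\partial x_d} \right] dS \right\} \\
&= \frac{\delta}{\delta u(x)} \left\{ \int_{\text{volume}} u(x) \cdot (\nabla \cdot \sigma(x))\, dx - \int_{\text{boundary surface}} u(x) \cdot \sigma(x) \cdot dS \right\} \\
&= \nabla \cdot \sigma(x) - \sigma(x) \cdot n_S(x)\, \delta_S(x) = F_{\text{elastic}}(x)
\end{aligned}
$$

Alternatively ...

$$
\begin{aligned}
0 = \ldots &= \int_{\text{volume}} \left( \frac{\delta u(x)}{\delta u(x)} \right) \cdot \frac{\partial}{\partial x_b} \left[ \sum_{bcd} \lambda_{abcd}(x)\, \frac{\partial u_c(x)}{\partial x_d} \right] dx
 - \int_{\text{boundary surface}} \left[ \sum_{acd} \left( \frac{\delta u(x)}{\delta u(x)} \right) \cdot \lambda_{abcd}(x)\, \frac{\partial u_c(x)}{\partial x_d} \right] dS \\
&= \frac{\partial}{\partial x_b} \left[ \sum_{bcd} \lambda_{abcd}(x)\, \frac{\partial u_c(x)}{\partial x_d} \right]
 - \left[ \sum_{acd} \lambda_{abcd}(x)\, \frac{\partial u_c(x)}{\partial x_d} \right] \delta_S(x) \\
&= \nabla \cdot \sigma(x) - \sigma(x) \cdot n_S(x)\, \delta_S(x) = F_{\text{elastic}}(x)
\end{aligned}
$$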

If an external non-elastic force (such as gravity) is applied, in steady state it must balance the elastic force. If an external stress is applied (such as pressure or shear on the surface), in steady state it must balance the elastic stress. Then the steady state equations become

$$\nabla \cdot \sigma(x) + F_{\text{external}}(x) + \nabla \cdot \sigma_{\text{external}}(x) = 0, \text{ or}$$

$$\nabla \cdot \sum_{cd} \lambda_{abcd}(x)\, \epsilon_{cd}(x) + F_{\text{external}}(x) + \nabla \cdot (\sigma_{ab})_{\text{external}}(x) = 0 .$$

Possible external forces include gravitational, electric, and magnetic forces. Possible external stresses include pressure and shear stresses applied to the surface, and residual stress $\sigma_{\text{external}}(x(x_0))$ in the body, where $x_0$ is an undeformed coordinate system that can be used to record the identity of material points or particles within the continuous medium.

For constant $\lambda$, or even for varying $\lambda$, these equations can be discretized in space to give a linear algebra problem for the displacement field u:

$$K \cdot u = f_{\text{external}} .$$

K is called the stiffness matrix.

5.1.3 Residual stress

Apply an external force to one block of continuous medium and let it relax to solve the foregoing equations; then smoothly join it to another relaxed block, and remove the external force. As a result there will be a new displacement u (in addition to the old one in one of the blocks) that solves the foregoing equation with a residual stress term that balances the now-removed external force.

A typical mechanical arrangement for increased strength is to impose residual stresses that result in tension on the outside and compression on the inside of a multilayer body.

5.1.4 Anisotropy

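A particular anisotropic and heterogeneous variant of such a model can be obtained as follows: turn $\lambda$ and $\mu$ into simultaneously diagonalizable matrices, which depend on position. Then

$$V[\epsilon(x)] = \int_{\text{volume}} \left[ \left( \mathrm{tr}\, \sqrt{\lambda(x)}\, \epsilon(x) \right)^2 + \mathrm{tr}\left( \mu(x)\, \epsilon(x)^2 \right) \right] dx .$$

Within the preferred coordinate system:

$$\mathcal{V}(\epsilon) = \sum_{ab} \mu_a\, \epsilon_{ab}(x)\, \epsilon_{ba}(x) + \left( \sum_a \left( \sqrt{\lambda} \right)_a \epsilon_{aa}(x) \right)^2$$

$$\sigma_{ab}(x) = \mu_a\, \epsilon_{ab}(x) + 2\, \delta_{ab} \left( \sqrt{\lambda} \right)_a \sum_c \left( \sqrt{\lambda} \right)_c \epsilon_{cc}(x) = \mu_a\, \epsilon_{ba}(x)$$

and finally

$$\lambda_{abcd} = \mu_a\, \frac{1}{2} \left( \delta_{ac}\, \delta_{bd} + \delta_{ad}\, \delta_{bc} \right) + 2 \left( \sqrt{\lambda} \right)_a \left( \sqrt{\lambda} \right)_c \delta_{ab}\, \delta_{cd} .$$

This model has six degrees of freedom. If the two matrices are not simultaneously diagonalizable, there are nine.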

A more difficult form of anisotropy results in genuinely nonlinear elasticity. In this case deformations $u(x)$ can be large enough and sufficiently nonhomogeneous that the special directions that diagonalize $\mu$ and/or $\lambda$ must be stored as a function of initial position $x_0$, and then transformed according to the cumulative subsequent deformation $u(x)$. Once again we need the mapping $x_0(x)$, along with the initial special frame $W(x_0)$ and its covariant transformation under $u(x(x_0, t))$.

Exercise. Take a continuum limit of the nonlinear mass-spring model for springs whose strength and resting length are uniform, small but nonzero, arranged on a regular spatial lattice (cubic, tetragonal, etc.). What is $\mathcal{V}(\epsilon)$? What is $\sigma(\epsilon)$? Vary the spring constants spatially to build in frustration in a regular pattern. What are $\mathcal{V}(\epsilon)$ and $\sigma(\epsilon)$?

5.1.5 Viscosity

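Velocity-dependent forces can be treated somewhat similarly. We define the velocity or strain rate tensor analogous to the strain itself:

$$v_{ab}(x) \equiv \frac{1}{2} \left( \frac{\partial v_a(x)}{\partial x_b} + \frac{\partial v_b(x)}{\partial x_a} \right) = \frac{d\, \epsilon_{ab}(x)}{d t},$$

and consider how this tensor may contribute to a functional, called the “dissipative function”, from which the velocity-dependent stress can be derived [Landau and Lifshitz, section 34]. Then as before,

$$S(t) = \int^{t} dt \left[ \int_{\text{volume}} \left( \hat{\lambda}\, \mathrm{tr}(\dot{\epsilon}(x))^2 + \mu\, \mathrm{tr}\left( \dot{\epsilon}(x)^2 \right) \right) dx + \sum_{ab} \frac{\partial \mathcal{V}(\epsilon)}{\partial \epsilon_{ab}}\, \dot{\epsilon}_{ab} \right]$$

$$0 = \frac{\delta S(t)}{\delta \dot{\epsilon}(x)} = 2 \mu\, \dot{\epsilon}(x) + \hat{\lambda}\, I\, \mathrm{tr}(\dot{\epsilon}(x)) + \frac{\partial \mathcal{V}(\epsilon)}{\partial \epsilon_{ab}}$$

so

$$\sigma_{\text{viscous}}(x) + \sigma_{\text{elastic}}(x) = 0$$

where

$$\sigma_{\text{viscous}}(x) = 2 \mu\, \dot{\epsilon}(x) + \hat{\lambda}\, I\, \mathrm{tr}(\dot{\epsilon}(x))$$

or in greater generality

$$\sigma_{\text{viscous}}(x) = \sum_{cd} \hat{\lambda}_{abcd}(x)\, \dot{\epsilon}_{cd}(x)$$

and $\sigma_{\text{elastic}}(x)$ is modeled as above. For the isotropic case with $\hat{\lambda} = 0$,

$$\dot{\epsilon}(x) = -\frac{1}{2\mu}\, \sigma_{\text{elastic}}(x)$$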

5.2 Finite Element Method

Convert energy integrals into discrete sums by using low-order polynomial approximations to functionswithin each polyhedral finite element. Basis set for such polynomials is typically chosen to be nonzero on one ofthe vertices and zero on the rest; this gives unique control of the function value on that vertex. Other basiselements may give control of the derivative at a vertex (using double roots at other points), the function value atother symmetrically placed points on the sides or middle, or the normal derivative on the sides. With such a basisone can control the degree of continuity or smoothness (number of continuous derivatives) at the boundariesbetween elements - it will be much lower than the degree of the polynomial basis since there are many boundaries.There are also “quadrature formulas” for quickly calculating the integrals for area, moments, and basis innerproducts, as a function of the basis coefficients.

These FEM possibilities are conventionally diagrammed with a 2D or 3D picture of the polygonal orpolyhedral element, with thick dots at vertices and other points where the function value is controlled, open circlesaround those dots where the derivative is also controlled, nested open circles for higher order derivatives, andslashes on the sides where normal derivatives are controlled

In local coordinates $(\xi, \eta, \zeta)$ such basis functions can be given in the cube $[-1,1]^3$ by [Hughes]

\[
\tfrac{1}{8} (1 \pm \xi)(1 \pm \eta)(1 \pm \zeta)
\]

These are called trilinear hexahedral finite elements. Wedges and tetrahedra can also be used. The wedge basis is:

\[
\tfrac{1}{8} (1 \pm \xi)(1 - \eta)(1 \pm \zeta), \quad \tfrac{1}{4} (1 + \eta)(1 \pm \zeta)
\]

The tetrahedral basis is [Hughes p. 126]:

\[
\tfrac{1}{8} (1 \pm \xi)(1 - \eta)(1 - \zeta), \quad \tfrac{1}{8} (1 + \xi)(1 + \eta)(1 - \zeta), \quad \tfrac{1}{8} \left[ 3 + \xi + (1 - \xi)\, \eta \right] (1 + \zeta)
\]
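As an illustration of the trilinear hexahedral basis above, the following Python sketch evaluates the eight basis functions on the reference cube and interpolates nodal values. The function names `hex_shape_functions` and `interpolate` and the node ordering are assumptions made for this example; they are not taken from [Hughes].

```python
from itertools import product

# The eight corner nodes of the reference cube [-1, 1]^3 (ordering is an
# arbitrary choice for this sketch).
NODES = list(product((-1, 1), repeat=3))


def hex_shape_functions(xi, eta, zeta):
    """Return the eight values (1/8)(1 +/- xi)(1 +/- eta)(1 +/- zeta)."""
    return [
        0.125 * (1 + sx * xi) * (1 + sy * eta) * (1 + sz * zeta)
        for (sx, sy, sz) in NODES
    ]


def interpolate(nodal_values, xi, eta, zeta):
    """Interpolate nodal values to the local point (xi, eta, zeta)."""
    weights = hex_shape_functions(xi, eta, zeta)
    return sum(c * n for c, n in zip(nodal_values, weights))


if __name__ == "__main__":
    # Each basis function is 1 at its own node and 0 at the other seven,
    # and the eight functions sum to 1 everywhere in the cube.
    print(hex_shape_functions(-1, -1, -1))
    print(sum(hex_shape_functions(0.3, -0.2, 0.7)))
```

Because each function equals one at its own node and zero at the others, the coefficients of an interpolated field are simply its nodal values, which is the "unique control of the function value" property described earlier in this section.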

The Rayleigh-Ritz approach to FEM is to construct an approximating optimization problem, and optimize its coefficients.

\[
u(x) = \sum_{\text{elements } i} \; \sum_{\text{bases } a} c_{i a}\, P_a(\xi_i(x))
\]

\[
\xi_i(x) = \Xi_i \cdot \begin{pmatrix} 1 \\ x \end{pmatrix}
\]

or:

\[
\xi_i(x) = u_i^{(\text{old})}(x)
\]

Likewise for all other functions of space and time that may be involved, including spatial memory functions $x_0(x \mid t)$.


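\[
V[u(x)] = \frac{1}{2} \int_{\text{volume}} \sum_{abcd} \frac{\partial u_a(x)}{\partial x_b}\, \lambda_{abcd}\, \frac{\partial u_c(x)}{\partial x_d}\, dx - \int_{\text{volume}} \sum_{a} F_a(x)\, u_a(x)\, dx
\]

\[
= \frac{1}{2} \sum_{i a} \sum_{j b} c_{i a}\, K_{i a,\, j b}\, c_{j b} - \sum_{i a} c_{i a}\, f_{i a}
\]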

The Rayleigh-Ritz method is to minimize $V$ with respect to the $c$’s, again obtaining the linear equation $K c = f$ with stiffness matrix $K$.
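A minimal one-dimensional sketch of this procedure follows. It assumes a model energy $V[u] = \frac{1}{2}\int_0^1 u'(x)^2\,dx - \int_0^1 F\,u(x)\,dx$ with constant load $F$, boundary conditions $u(0) = u(1) = 0$, and piecewise-linear "hat" basis functions; these choices and the helper names (`assemble_1d`, `solve`, `energy`) are illustrative assumptions, not part of the notes.

```python
def assemble_1d(n_elements, load=1.0):
    """Assemble stiffness matrix K and load vector f on [0, 1].

    Linear elements, u = 0 at both ends, so only interior nodes carry
    unknown coefficients c.
    """
    h = 1.0 / n_elements
    n = n_elements - 1                      # number of interior nodes
    K = [[0.0] * n for _ in range(n)]
    f = [0.0] * n
    for e in range(n_elements):             # element e joins nodes e, e+1
        for a in (0, 1):
            i = e + a - 1                   # interior index of local node a
            if not 0 <= i < n:
                continue                    # boundary node: value fixed at 0
            f[i] += load * h / 2.0          # integral of load * hat function
            for b in (0, 1):
                j = e + b - 1
                if 0 <= j < n:
                    K[i][j] += (1.0 if a == b else -1.0) / h
    return K, f


def solve(K, f):
    """Solve K c = f by Gaussian elimination with partial pivoting."""
    n = len(f)
    A = [row[:] + [rhs] for row, rhs in zip(K, f)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for k in range(col, n + 1):
                A[r][k] -= factor * A[col][k]
    c = [0.0] * n
    for i in reversed(range(n)):
        s = A[i][n] - sum(A[i][k] * c[k] for k in range(i + 1, n))
        c[i] = s / A[i][i]
    return c


def energy(K, f, c):
    """Discrete energy V(c) = 1/2 c.K.c - c.f."""
    Kc = [sum(kij * cj for kij, cj in zip(row, c)) for row in K]
    return 0.5 * sum(ci * v for ci, v in zip(c, Kc)) - sum(
        ci * fi for ci, fi in zip(c, f))


if __name__ == "__main__":
    K, f = assemble_1d(8)
    c = solve(K, f)
    print(c)                 # nodal values of the minimizer
    print(energy(K, f, c))   # minimum of the discrete energy
```

For this particular model problem the exact minimizer is $u(x) = x(1-x)/2$, so the nodal coefficients can be compared against it directly.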

[Strang, chapter 5.4] Galerkin’s method; Rayleigh-Ritz method.

5.3 Growth

5.3.1 Growing tissue models - continuum

General growth equation:

\[
\mathrm{tr}\, \dot{\epsilon}(x, t) = \frac{\partial}{\partial t}\, \mathrm{tr}\, \epsilon(x, t) = g(x, t)
\]

But what determines $g(x, t)$?

1D niches

Growth equation:

\[
\frac{\partial}{\partial x_0} \frac{\partial x(t, x_0)}{\partial t} = g(x)
\]

or more precisely

\[
\lim_{t_2 \to t_1} \frac{\partial}{\partial t_2} \frac{\partial}{\partial x_1} F(t_2, t_1, x_0) = g(x)
\]

Here growth is a function $g(x)$ of 1-d position, which is encoded by dynamically stable patterns of gene expression such as those of shoot or root meristems.

\[
G(x; x_0) = \int_{x_0}^{x} g(s)\, ds
\]

\[
K(x; x_0) = \int_{x_0}^{x} \frac{du}{G(u)}
\]

\[
x(t) = K^{-1}\left( K(x_0) + t - t_0 \right) \tag{5.2}
\]

Semigroup property: Calculate $x(t_2)$ from $(t_1, x_1)$ and $x(t_1)$ from $(t_0, x_0)$:

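\[
\begin{aligned}
x(t_2) &= K^{-1}\left( K(x_1) + t - t_1 \right) \\
&= K^{-1}\left( K\left( K^{-1}\left( K(x_0) + t_1 - t_0 \right) \right) + t - t_1 \right) \\
&= K^{-1}\left( K(x_0) + t_1 - t_0 + t - t_1 \right) \\
&= K^{-1}\left( K(x_0) + t - t_0 \right)
\end{aligned}
\]

which, with $t = t_2$, is just $x(t_2)$ calculated from $(t_0, x_0)$.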
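The niche solution (5.2) and its semigroup property can be checked numerically. The sketch below assumes a hypothetical growth profile $g(s) = g_0 e^{-s/L}$ concentrated near a niche at the origin, with the reference point of $G$ taken to be $0$; for this choice $G$, $K$ and $K^{-1}$ have closed forms. The constants `G0` and `L` and all function names are illustrative assumptions.

```python
import math

G0, L = 0.8, 2.0   # hypothetical peak growth rate and niche length scale


def g(s):
    """Assumed relative growth rate at position s."""
    return G0 * math.exp(-s / L)


def G(x):
    """Velocity field: integral of g from 0 to x."""
    return G0 * L * (1.0 - math.exp(-x / L))


def K(x):
    """An antiderivative of 1/G (only differences of K matter)."""
    return math.log(math.exp(x / L) - 1.0) / G0


def K_inv(k):
    """Inverse function of K."""
    return L * math.log1p(math.exp(G0 * k))


def flow(x0, dt):
    """Equation (5.2): position after elapsed time dt, starting at x0 > 0."""
    return K_inv(K(x0) + dt)


def rk4(x0, dt, steps=2000):
    """Independent check: integrate dx/dt = G(x) with classical RK4."""
    h, x = dt / steps, x0
    for _ in range(steps):
        k1 = G(x)
        k2 = G(x + 0.5 * h * k1)
        k3 = G(x + 0.5 * h * k2)
        k4 = G(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x


if __name__ == "__main__":
    one_step = flow(1.0, 1.8)
    two_step = flow(flow(1.0, 0.7), 1.1)
    print(one_step, two_step, rk4(1.0, 1.8))
```

Composing two shorter flows and integrating the velocity field directly should both agree with the single application of (5.2), up to rounding and integration error.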

1D inflation

Problem

Conditions on inflationary growth:

\[
\lim_{t_2 \to t_1} \frac{\partial}{\partial t_2} \frac{\partial}{\partial x_1} F(t_2, t_1, x_0) = f(t, x_0)
\]

(Note this is a hyperbolic PDE with a coordinate system rotated 45 degrees from the usual.) Also

\[
F(t_1, t_1, x_1) = x_1
\]

and the semigroup property:

\[
F(t_2, t_0, x_0) = F(t_2, t_1, F(t_1, t_0, x_0))
\]

Solution (with Przemek Prusinkiewicz):

\[
F(t, t_0, x_0) = \int_0^{x_0} \exp\left[ \int_{t_0}^{t} f_0(\tau, x)\, d\tau \right] dx
\]

Specializes for time-invariant growth function $f$ to:

\[
F(t, t_0, x_0) = \int_0^{x_0} \exp\left[ (t - t_0)\, f_0(x) \right] dx
\]

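Semigroup property:

\[
\begin{aligned}
F(t_2, t_0, x_0) &= \int_0^{x_0} \exp\left[ \int_{t_0}^{t_2} f_0(\tau, x)\, d\tau \right] dx \\
&= \int_0^{x_0} \exp\left[ \int_{t_1}^{t_2} f_0(\tau, x)\, d\tau \right] \exp\left[ \int_{t_0}^{t_1} f_0(\tau, x)\, d\tau \right] dx \\
&= \int_0^{x_0} \exp\left[ \int_{t_1}^{t_2} f_0(\tau, x)\, d\tau \right] \frac{\partial F(t_1, t_0, x)}{\partial x}\, dx \\
&= \int_0^{x_1 = F(t_1, t_0, x_0)} \exp\left[ \int_{t_1}^{t_2} f_0(\tau, x(F))\, d\tau \right] dF \\
&= \int_0^{x_1 = F(t_1, t_0, x_0)} \exp\left[ \int_{t_1}^{t_2} f_1(\tau, x)\, d\tau \right] dx \\
&= F(t_2, t_1, x_1 = F(t_1, t_0, x_0))
\end{aligned}
\]

Here $x(F)$ inverts $F = F(t_1, t_0, x)$, and $f_1(\tau, F) = f_0(\tau, x(F))$ is the growth function re-expressed in the coordinates of time $t_1$. Thus

\[
F(t_2, t_0, x_0) = F(t_2, t_1, F(t_1, t_0, x_0))
\]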
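The inflationary semigroup property can also be checked numerically for the time-invariant case. The sketch below assumes a hypothetical growth function `f0`, evaluates $F$ by Simpson quadrature, and re-expresses the growth function in time-$t_1$ coordinates by inverting $F$ with bisection. All names and numerical settings are assumptions for illustration.

```python
import math


def f0(x):
    """Hypothetical time-invariant relative growth rate, in t0 coordinates."""
    return 0.5 * math.exp(-x)


def simpson(func, a, b, n=100):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


def F(dt, x0, rate=f0):
    """F(t, t0, x0) for elapsed time dt = t - t0 and growth function rate."""
    return simpson(lambda x: math.exp(dt * rate(x)), 0.0, x0)


def invert_F(dt, x1, upper):
    """Solve F(dt, x) = x1 for x in [0, upper]; F is increasing in x."""
    lo, hi = 0.0, upper
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if F(dt, mid) < x1:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def semigroup_gap(x0, dt01, dt12):
    """|F(t2,t1,F(t1,t0,x0)) - F(t2,t0,x0)| for the given time steps."""
    x1 = F(dt01, x0)

    def f1(xi):
        # Growth rate at the material point whose time-t1 position is xi.
        return f0(invert_F(dt01, xi, x0))

    two_step = F(dt12, x1, rate=f1)
    one_step = F(dt01 + dt12, x0)
    return abs(two_step - one_step)


if __name__ == "__main__":
    print(F(0.0, 1.5))                    # identity map at zero elapsed time
    print(semigroup_gap(1.5, 0.4, 0.9))   # should be close to zero
```

The nested quadrature and bisection make this slow for fine grids; it is meant only as a consistency check of the formulas, not as an efficient growth simulator.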


5.3.2 Lively surfaces and manifold embeddings

On top of the PDEs for growth of the previous section, one can add regulatory networks that are local in the extreme sense: one copy of the network for every point. For example, use an ANN-GRN network at each point. The result is a reaction-growth equation (analogous to but different from Turing’s reaction-diffusion equations). To get interaction between the two sectors of the model, we need the growth functions $g(x)$ or $f(t, x_0)$ to depend on some component of the reaction network, and we need to add communication between “neighboring” points. Then the choice between inflationary and niche style growth functions would be made by the dynamics rather than directly by the modeler. The traditional way to add communication to a spatial continuum model is to allow some or all reactants to diffuse, which can be modeled with a reaction-diffusion equation. However, this may not be the most realistic form of local interaction for cases such as phyllotaxis, where cells are polarized in their direction of intercellular transport of auxin. A model with local reactions, local communication, and dynamic control of its own growth function, all appearing in coupled PDEs, is capable of autonomously deforming the original embedded manifold (of dimensionality 1, 2, or 3 into $\mathbb{R}^3$) under the control of a network capable of distributed local computation. Since such an embedding is both “smart” and “active”, we might label it “lively”.

5.4 Homology

5.4.1 Discretized Geometry

Geometry in biological development is determined by the relationships between objects of various dimensionalities: d = 0 (points and countable point sets), d = 1 (line segments and curves), d = 2 (planar regions such as polygons, and surfaces), and d = 3 (polyhedra and general volumes). For dimensions 1 and up there are possibilities of linear vs. continuously varying objects of those dimensions, as just listed. The relationships among such objects include containment or composition (among objects of equal dimensionality), approximation (of nonlinear objects by sets of linear objects of the same dimensionality), and boundary (of a dimension d object by one or more dimension d - 1 objects). For all of these objects we may define energy functions that contain the essential physics of the system.

Let us now set up a coordinate system for all of these quantities. First introduce indices:

$a$ = vertex
$\alpha$ = edge
$p$ = face
$\tau$ = triangle within face
$i$ = volume
$l$ = level number

and correspondingly


$v_a \in \mathbb{R}^3$ = position of vertex
$e_a \in \mathbb{R}^N$ = unit vector in abstract vertex space corresponding to the $a$-th vertex
$C_{i p} \in \{-1, 0, 1\}$ = oriented face boundary of volume
$B_{p \alpha} \in \{-1, 0, 1\}$ = oriented edge boundary of face
$A_{\alpha a} \in \{-1, 0, 1\}$ = oriented vertex boundary of edge

which satisfy the relationships that the boundary of a boundary is zero, i.e. that

\[
B A = 0, \qquad C B = 0 \tag{5.3}
\]

vaÅÅÅÅÅÅÅÅÅÅÅÅÅÅ ri

Note: some would prefer to transpose all the foregoing matrices - it could be worth switching.

In addition

$\tilde{B}_{p \tau} \in \{-1, 0, 1\}$ = oriented triangle decomposition of face
$\hat{B}_{\tau \gamma} \in \{-1, 0, 1\}$ = oriented edge boundary of triangle
$\bar{D}_{\tau a} \in \{0, 1\}$ = unoriented vertex boundary of triangle
$B = \tilde{B} \hat{B}$
$\bar{D} = |\hat{B}| \cdot |A| / 2$

and likewise one may decompose volumes and edges.

One may form abstract sums that represent an edge, face, or volume as a linear combination of objects of one less dimension:

\[
(A e)_\alpha = \sum_{a} A_{\alpha a}\, e_a
\]

\[
(B e')_p = \sum_{\alpha} B_{p \alpha}\, (e')_\alpha
\]

\[
(C e'')_i = \sum_{p} C_{i p}\, (e'')_p
\]

bearing in mind however that $B A e = 0$, $C B e' = 0$, etc. Now we connect vertices to volumes with the net incidence matrix

\[
D_{i a} = \sum_{p \alpha} \left[ \frac{\left| C_{i p}\, B_{p \alpha}\, A_{\alpha a} \right|}{\sum_{a'} \left| B_{p \alpha}\, A_{\alpha a'} \right|} \right]
\]

Now we can form the face and volume centroids

\[
c_p = \frac{\sum_{\alpha a} \left| B_{p \alpha} A_{\alpha a} \right| v_a}{\sum_{\alpha a} \left| B_{p \alpha} A_{\alpha a} \right|}, \qquad
x_i = \frac{\sum_{a} D_{i a}\, v_a}{\sum_{a} D_{i a}} .
\]

We can also calculate the face area and volume of a volume element

\[
\mathrm{Area}_p = \frac{1}{2} \sum_{\tau} \left| \tilde{B}_{p \tau} \right| \sum_{a<b<c} \left| \bar{D}_{\tau a} \bar{D}_{\tau b} \bar{D}_{\tau c} \right| \det(v_a - v_c,\; v_b - v_c,\; 1)
\]

\[
V_i = \frac{1}{3!} \sum_{p \tau} \left| C_{i p} \tilde{B}_{p \tau} \right| \sum_{a<b<c} \left| \bar{D}_{\tau a} \bar{D}_{\tau b} \bar{D}_{\tau c} \right| \det(v_a - x_i,\; v_b - x_i,\; v_c - x_i)
\]

where as usual

\[
\det(x_a, y_b, z_c) = \sum_{1 \le d, e, f \le 3} \epsilon_{d e f}\, x_{a d}\, y_{b e}\, z_{c f} .
\]

From these quantities we may calculate derivatives of any area or volume with respect to any vertex.
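A small worked instance may make the incidence matrices concrete. The Python sketch below builds $A$ (edge by vertex), $B$ (face by edge) and $C$ (volume by face) for a single tetrahedron, checks that the boundary of a boundary vanishes, and computes the volume from determinants taken relative to the centroid. The orientation convention (simplices listed by increasing vertex index, alternating signs) and the use of signed rather than absolute incidences in the volume sum are assumptions of this example, chosen so that the faces form a consistently oriented closed surface.

```python
from itertools import combinations

VERTS = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
POINTS = list(combinations(range(4), 1))   # vertices as 0-simplices
EDGES = list(combinations(range(4), 2))
FACES = list(combinations(range(4), 3))
VOLUME = (0, 1, 2, 3)


def boundary_row(simplex, index):
    """Signed incidences of a simplex on its codimension-1 faces."""
    row = [0] * len(index)
    for k in range(len(simplex)):
        face = simplex[:k] + simplex[k + 1:]
        row[index[face]] = (-1) ** k
    return row


def index_of(simplices):
    return {s: k for k, s in enumerate(simplices)}


def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


A = [boundary_row(e, index_of(POINTS)) for e in EDGES]   # edge x vertex
B = [boundary_row(f, index_of(EDGES)) for f in FACES]    # face x edge
C = [boundary_row(VOLUME, index_of(FACES))]              # volume x face


def det3(u, v, w):
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))


def sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))


# Volume centroid (all four vertices belong to the single volume).
centroid = tuple(sum(v[d] for v in VERTS) / 4.0 for d in range(3))

# Signed sum over oriented triangular faces; each face is its own triangle.
signed = sum(
    C[0][p] * det3(*(sub(VERTS[a], centroid) for a in FACES[p]))
    for p in range(len(FACES))
) / 6.0
volume = abs(signed)

if __name__ == "__main__":
    print(matmul(B, A))   # all zeros: B A = 0
    print(matmul(C, B))   # all zeros: C B = 0
    print(volume)
```

Since the volume is a polynomial in the vertex coordinates, its derivatives with respect to any vertex follow by differentiating the determinants term by term.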

SAM slice with approximately polygonal cell slices.

5.4.2 Singular homology

This section follows: Hatcher, “Algebraic Topology”.

Define the ordered $d$-simplex

\[
\Delta^d = \left\{ (x_0, \ldots, x_d) \;\middle|\; \sum_{i=0}^{d} x_i = 1 \;\wedge\; \forall i \;\, x_i \ge 0 \right\}
\]

It has vertices $v_i = (0, \ldots, 0, 1, 0, \ldots, 0)$ with a “1” in the $i$’th position. These vertices are ordered $[v_0, \ldots, v_d]$.

A singular $d$-simplex is a continuous map $\sigma$ from $\Delta^d$ to a topological space $X$. It need not be one-to-one (it can be “singular”). A singular $d$-chain is a finite formal sum $\sum_k n_k \sigma_k$ with integer coefficients $n_k \in \mathbb{Z}$. These sums can be added and subtracted like vectors with an arbitrary number of possible basis vectors $\sigma_k$; the vector space is a “free abelian group” $C_d(X)$ of $d$-chains.

Define the linear “boundary map” $\partial$:

\[
\partial_d(\sigma) = \sum_{i=0}^{d} (-1)^i \left( \sigma \,|\, [v_0, \ldots, \hat{v}_i, \ldots, v_d] \right)
\]

where $\hat{v}_i$ means the corresponding vertex is omitted from the ordered list and $|$ means restriction of a function. The ordered list $[v_0, \ldots, \hat{v}_i, \ldots, v_d]$ is implicitly (and invertibly) mapped onto $\Delta^{d-1}$ by the standard map that preserves order of vertices. Thus $\sigma \,|\, [v_0, \ldots, \hat{v}_i, \ldots, v_d]$ becomes a singular $(d-1)$-chain, and so is $\partial_d(\sigma)$.

Then if $d > 1$

\[
\begin{aligned}
\partial_{d-1} \partial_d(\sigma) &= \sum_{0 \le j < i \le d} (-1)^i (-1)^j \left( \sigma \,|\, [v_0, \ldots, \hat{v}_j, \ldots, \hat{v}_i, \ldots, v_d] \right) \\
&\quad + \sum_{0 \le i < j \le d} (-1)^i (-1)^{j-1} \left( \sigma \,|\, [v_0, \ldots, \hat{v}_i, \ldots, \hat{v}_j, \ldots, v_d] \right) = 0
\end{aligned}
\]

by interchanging $i$ and $j$. So if $d > 0$, $\partial_d \partial_{d+1}(\sigma) = 0$ and we have linear maps $\partial_{d+1} : C_{d+1}(X) \to C_d(X)$ and $\partial_d : C_d(X) \to C_{d-1}(X)$ satisfying $\mathrm{Image}(\partial_{d+1}) \equiv \mathrm{Im}(\partial_{d+1}) \subseteq \mathrm{Ker}(\partial_d) \equiv \mathrm{Kernel}(\partial_d)$. We may consider a new vector space with integer coefficients consisting of equivalence classes of members of $\mathrm{Ker}(\partial_d)$ that differ only by members of $\mathrm{Im}(\partial_{d+1})$; by the linearity of $\partial$, such equivalence classes remain entirely in $\mathrm{Ker}(\partial_d)$. The vector space of these equivalence classes is denoted

\[
H_d(X) = \mathrm{Ker}(\partial_d) \,/\, \mathrm{Im}(\partial_{d+1}),
\]

the $d$’th singular homology group. The group operation is: addition of representative members of $\mathrm{Ker}(\partial_d)$, followed by remapping to the equivalence class of their sum.

The fact that the $\sigma$ maps may be singular allows the image of a lower-dimension singular $(d-1)$-simplex in $X$ to coincide with the images of a set of singular $d$-simplexes.

Reference: Hatcher, “Algebraic Topology”.

The maps $A$, $B$, $C$ of the previous section are $\partial_1$, $\partial_2$, and $\partial_3$ respectively (restricted to chains that define particular polygons and polyhedra), whence $C B = 0$ and $B A = 0$.
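For a finite simplicial complex, the quotient $\mathrm{Ker}(\partial_d) / \mathrm{Im}(\partial_{d+1})$ can be explored with ordinary linear algebra. The sketch below computes Betti numbers (the ranks of the homology groups) from boundary matrices using exact rational arithmetic. Working over the rationals is an assumption of this example: it ignores any torsion that the integer-coefficient groups may have. The helper names are likewise illustrative.

```python
from fractions import Fraction
from itertools import combinations


def boundary_row(simplex, index):
    """Signed incidences of an ordered simplex on its faces."""
    row = [0] * len(index)
    for k in range(len(simplex)):
        face = simplex[:k] + simplex[k + 1:]
        row[index[face]] = (-1) ** k
    return row


def rank(matrix):
    """Rank over the rationals by Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in matrix]
    ncols = len(rows[0]) if rows else 0
    rk = 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0),
                     None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            if rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def betti_numbers(complex_by_dim):
    """b_d = dim Ker(boundary_d) - rank(boundary_{d+1}) for each d."""
    ranks = [0]                       # boundary_0 maps everything to zero
    for d in range(1, len(complex_by_dim)):
        index = {s: k for k, s in enumerate(complex_by_dim[d - 1])}
        ranks.append(rank([boundary_row(s, index)
                           for s in complex_by_dim[d]]))
    ranks.append(0)                   # nothing above the top dimension
    return [len(complex_by_dim[d]) - ranks[d] - ranks[d + 1]
            for d in range(len(complex_by_dim))]


# Surface of a tetrahedron (a triangulated 2-sphere) and the solid one.
hollow = [list(combinations(range(4), d + 1)) for d in range(3)]
solid = hollow + [[(0, 1, 2, 3)]]

if __name__ == "__main__":
    print(betti_numbers(hollow))   # one component, no loops, one cavity
    print(betti_numbers(solid))    # filling in the volume removes the cavity
```

The two examples differ only by the single 3-simplex, which shows how adding the volume (the map $C$ of the previous section) removes the 2-dimensional homology class carried by the hollow surface.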

5.4.3 Differential and algebraic geometry: some concepts

This is just a study agenda.

References: Breedon, Algebraic Geometry; Weinberg, General Relativity; Hirsch, Differential Topology.

Differential manifolds

definitions: topology, homotopy, manifold
manifold metric, covariant derivative in component notation
PDEs: diffusion, fluid flow on a manifold
manifold embeddings
level set representations - can change topology
transversality and differential topology
fiber bundle, tangent bundle

Homology theories

axioms for homology
simplicial, singular, and cellular (CW) complexes and homology theories
de Rham cohomology

Expected applications

models of biological structures and use of space in development and within eukaryotic cells
simplifying continuum limits of large-scale discrete models of such biological structures, composed of many discrete units such as molecules or cells
1/N expansion corrections to approximation by such continuum limits
topology-changing dynamics in such models


Chapter 6

Postscripts

6.1 Summary concept map


6.2 Acknowledgements

Useful discussions with Ashish Bhan, Sergei Nikolaev, Nikolay Podkolodny, Przemyslaw Prusinkiewicz, Bruce Shapiro, and Guy Yosiphon are gratefully acknowledged. The work was supported in part by the National Science Foundation’s Frontiers in Biological Research (FIBR) program, award number EF-0330786, by a Biomedical Information Science and Technology Initiative (BISTI) grant (number R33 GM069013) from the National Institute of General Medical Sciences, and by the Center for Cell Mimetic Space Exploration (CMISE), a NASA University Research, Engineering and Technology Institute (URETI), under award number NCC 2-1364.

6.3 Time-ordered operator expansion

We rederive the Time-Ordered Product expansion (Equation 2.14 of [14]; Equation 4.29 of [16]) by elementary probabilistic means as follows. For arbitrary operators $H_0$ and $H_1$ we wish to calculate

\[
\exp(t H) \cdot p_0 = \exp(t (H_1 + H_0)) \cdot p_0
\]

To this end we introduce an extra variable $z$, which can ultimately be set to 1, in order to create a generating function that keeps track of the number of times operator $H_1$ is applied in polynomial expansions of the exponential:

\[
\begin{aligned}
S(z) &= \exp(t (H_1 z + H_0)) \cdot p_0 = \sum_{k=0}^{\infty} s_k z^k \\
&= \sum_{k=0}^{\infty} \frac{z^k}{k!} \left[ \partial_z^k \exp(t (H_1 z + H_0)) \right]_{z=0} \cdot p_0 \\
&= \sum_{k=0}^{\infty} \frac{z^k}{k!} \left[ \left( \frac{\partial}{\partial z} \right)^{k} \sum_{l=0}^{\infty} \frac{(t (H_1 z + H_0))^l}{l!} \right]_{z=0} \cdot p_0 \\
&= \sum_{k=0}^{\infty} \frac{z^k}{k!} \left[ \sum_{l=k}^{\infty} \frac{1}{l!} \sum_{\substack{\{0 \le i_p \le l-k\} \\ \sum_{p=0}^{k} i_p = l-k}} k!\, (t H_0)^{i_k}\, t H_1\, (t H_0)^{i_{k-1}} \cdots t H_1\, (t H_0)^{i_0} \right] \cdot p_0 \\
&= \sum_{k=0}^{\infty} z^k t^k \left[ \sum_{l=0}^{\infty} \frac{1}{(l+k)!} \sum_{\substack{\{0 \le i_p \le l\} \\ \sum_{p=0}^{k} i_p = l}} (t H_0)^{i_k}\, H_1\, (t H_0)^{i_{k-1}} \cdots H_1\, (t H_0)^{i_0} \right] \cdot p_0 \\
&= \sum_{k=0}^{\infty} z^k t^k \left[ \sum_{l=0}^{\infty} \sum_{\substack{\{0 \le i_p \le l\} \\ \sum_{p=0}^{k} i_p = l}} \frac{\prod_{p=0}^{k} (i_p)!}{\left( \sum_{p=0}^{k} i_p + k \right)!}\, \frac{(t H_0)^{i_k}}{(i_k)!}\, H_1\, \frac{(t H_0)^{i_{k-1}}}{(i_{k-1})!} \cdots H_1\, \frac{(t H_0)^{i_0}}{(i_0)!} \right] \cdot p_0 \\
&= \sum_{k=0}^{\infty} z^k t^k \left[ \sum_{\{0 \le i_p \le \infty\}} \frac{\prod_{p=0}^{k} (i_p)!}{\left( \sum_{p=0}^{k} (i_p + 1) - 1 \right)!}\, \frac{(t H_0)^{i_k}}{(i_k)!}\, H_1\, \frac{(t H_0)^{i_{k-1}}}{(i_{k-1})!} \cdots H_1\, \frac{(t H_0)^{i_0}}{(i_0)!} \right] \cdot p_0 \\
&= \sum_{k=0}^{\infty} z^k t^k \left[ \sum_{\{0 \le i_p \le \infty\}} \frac{\prod_{p=0}^{k} \Gamma(i_p + 1)}{\Gamma\left( \sum_{p=0}^{k} (i_p + 1) \right)}\, \frac{(t H_0)^{i_k}}{(i_k)!}\, H_1\, \frac{(t H_0)^{i_{k-1}}}{(i_{k-1})!} \cdots H_1\, \frac{(t H_0)^{i_0}}{(i_0)!} \right] \cdot p_0
\end{aligned}
\]

Now we use the Multinomial-Dirichlet normalization integral

\[
\frac{\prod_{p=0}^{k} \Gamma(i_p + 1)}{\Gamma\left( \sum_{p=0}^{k} (i_p + 1) \right)} = \int_0^1 d\theta_0 \cdots \int_0^1 d\theta_k\; \delta\!\left( \sum_{p=0}^{k} \theta_p - 1 \right) \prod_{p=0}^{k} (\theta_p)^{i_p} .
\]

Accordingly,

\[
\begin{aligned}
S(z) &= \sum_{k=0}^{\infty} z^k t^k \left[ \sum_{\{0 \le i_p \le \infty\}} \int_0^1 d\theta_0 \cdots \int_0^1 d\theta_k\; \delta\!\left( \sum_{p=0}^{k} \theta_p - 1 \right) \left( \prod_{p=0}^{k} (\theta_p)^{i_p} \right) \frac{(t H_0)^{i_k}}{(i_k)!}\, H_1\, \frac{(t H_0)^{i_{k-1}}}{(i_{k-1})!} \cdots H_1\, \frac{(t H_0)^{i_0}}{(i_0)!} \right] \cdot p_0 \\
&= \sum_{k=0}^{\infty} z^k t^k \left[ \sum_{\{0 \le i_p \le \infty\}} \int_0^1 d\theta_0 \cdots \int_0^1 d\theta_k\; \delta\!\left( \sum_{p=0}^{k} \theta_p - 1 \right) \frac{(\theta_k t H_0)^{i_k}}{(i_k)!}\, H_1\, \frac{(\theta_{k-1} t H_0)^{i_{k-1}}}{(i_{k-1})!} \cdots H_1\, \frac{(\theta_0 t H_0)^{i_0}}{(i_0)!} \right] \cdot p_0 \\
&= \sum_{k=0}^{\infty} z^k t^k \left[ \int_0^1 d\theta_0 \cdots \int_0^1 d\theta_k\; \delta\!\left( \sum_{p=0}^{k} \theta_p - 1 \right) \sum_{i_k = 0}^{\infty} \frac{(\theta_k t H_0)^{i_k}}{(i_k)!}\, H_1 \sum_{i_{k-1} = 0}^{\infty} \frac{(\theta_{k-1} t H_0)^{i_{k-1}}}{(i_{k-1})!} \cdots H_1 \sum_{i_0 = 0}^{\infty} \frac{(\theta_0 t H_0)^{i_0}}{(i_0)!} \right] \cdot p_0 \\
&= \sum_{k=0}^{\infty} z^k t^k \left[ \int_0^1 d\theta_0 \cdots \int_0^1 d\theta_k\; \delta\!\left( \sum_{p=0}^{k} \theta_p - 1 \right) \exp(\theta_k t H_0)\, H_1 \exp(\theta_{k-1} t H_0)\, H_1 \cdots H_1 \exp(\theta_0 t H_0) \right] \cdot p_0
\end{aligned}
\]

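Thus, changing variables to $\tau_p = t\, \theta_p$,

\[
S(z) = \sum_{k=0}^{\infty} z^k \left[ \int_0^t d\tau_0 \cdots \int_0^t d\tau_k\; \delta\!\left( \sum_{p=0}^{k} \tau_p - t \right) \exp(\tau_k H_0)\, H_1 \exp(\tau_{k-1} H_0)\, H_1 \cdots H_1 \exp(\tau_0 H_0) \right] \cdot p_0 \tag{6.1}
\]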

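In summary (setting $z = 1$, and since no property of $p_0$ was used in the above calculations),

\[
\exp(t (H_1 + H_0)) = \sum_{k=0}^{\infty} \left[ \int_0^t d\tau_0 \cdots \int_0^t d\tau_k\; \delta\!\left( \sum_{p=0}^{k} \tau_p - t \right) \exp(\tau_k H_0)\, H_1 \exp(\tau_{k-1} H_0)\, H_1 \cdots H_1 \exp(\tau_0 H_0) \right] .
\]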

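Alternatively, define $t_1 = \tau_0$, $t_2 = t_1 + \tau_1$, ..., $t_{k+1} = t_k + \tau_k = t$. Then the evolution of the state vector is given by

\[
\exp(t (H_1 + H_0)) \cdot p_0 = \sum_{k=0}^{\infty} \left[ \int_0^t dt_1 \int_{t_1}^t dt_2 \cdots \int_{t_{k-1}}^t dt_k\; \exp((t - t_k) H_0)\, H_1 \exp((t_k - t_{k-1}) H_0)\, H_1 \cdots H_1 \exp(t_1 H_0) \right] \cdot p_0
\]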

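In the special case $H_1 = \hat{H}$, $H_0 = -D$ this simplifies to:

\[
\exp\left( t \left( \hat{H} - D \right) \right) \cdot p_0 = \sum_{k=0}^{\infty} \left[ \int_0^t dt_1 \int_{t_1}^t dt_2 \cdots \int_{t_{k-1}}^t dt_k\; \exp(-(t - t_k) D)\, \hat{H} \exp(-(t_k - t_{k-1}) D)\, \hat{H} \cdots \hat{H} \exp(-t_1 D) \right] \cdot p_0 \tag{6.2}
\]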

Since $D$ is diagonal, the terms $\exp(-t D)$ are analytically calculable and easy to simulate with large jumps in time. Between these easy terms are interposed single powers of $\hat{H}$ representing the occurrence of discrete-time grammar events that must be simulated.

These last two expressions for $\exp(t (\hat{H} - D))$ have a significant interpretation in the case of reaction kinetics: they correspond to the Gillespie algorithm for stochastic simulation. The exponential distribution of waiting times until the next reaction is given by $\exp(-t D)$, which depends on the state of the system but doesn’t change it, and the reaction events are modeled by the interdigitated powers of $\hat{H}$.
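The correspondence with the Gillespie algorithm can be made concrete with a small simulation. The sketch below assumes a hypothetical birth-death system (production at a constant rate `birth`, degradation at rate `death` per molecule); the function name and parameters are illustrative only. Each loop iteration draws an exponential waiting time from the total propensity, which plays the role of the diagonal entry of $D$ for the current state, and then applies one reaction event, corresponding to one power of $\hat{H}$.

```python
import random


def gillespie_birth_death(n0, birth, death, t_end, seed=0):
    """Simulate 0 -> A (rate birth) and A -> 0 (rate death * n).

    Returns the trajectory as a list of (time, molecule count) pairs.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    trajectory = [(t, n)]
    while True:
        propensities = (birth, death * n)
        total = sum(propensities)          # diagonal entry of D for state n
        if total == 0.0:
            break                          # no reaction can fire
        t += rng.expovariate(total)        # waiting time, density ~ exp(-tau D)
        if t > t_end:
            break
        # Choose which reaction fires, with probability proportional to its
        # propensity; this is the action of one power of H-hat.
        u = rng.random() * total
        if u < propensities[0] or n == 0:
            n += 1
        else:
            n -= 1
        trajectory.append((t, n))
    return trajectory


if __name__ == "__main__":
    for time, count in gillespie_birth_death(5, 2.0, 0.5, 10.0, seed=1)[:10]:
        print(round(time, 3), count)
```

Averaging many such trajectories (with different seeds) estimates the probability vector $\exp(t (\hat{H} - D)) \cdot p_0$; a single run samples one term-by-term history from the expansion.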

The same derivation can be accomplished for any decomposition of $H$ into a solvable part $H_0$ (above, $-D$, but it need not be diagonal) plus a more difficult term $H_1$ (here, $\hat{H}$):

\[
\exp(t (H_0 + H_1)) \cdot p_0 = \sum_{k=0}^{\infty} \left[ \int_0^t dt_1 \int_{t_1}^t dt_2 \cdots \int_{t_{k-1}}^t dt_k\; \exp((t - t_k) H_0)\, H_1 \exp((t_k - t_{k-1}) H_0)\, H_1 \cdots H_1 \exp(t_1 H_0) \right] \cdot p_0 \tag{6.3}
\]

This is one formulation of the time-ordered product expansion.

This perturbative approach is equivalent to the use of perturbative methods including Feynman diagram calculations in quantum field theory, except for an occasional factor of $i = \sqrt{-1}$ which would turn our probabilities into the complex-valued probability factors of quantum mechanics, as discussed in Section 3.8.1.
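One way to sanity-check the expansion is to test its first-order ($k = 1$) term numerically. The coefficient of $z$ in $S(z)$ should equal the derivative of $\exp(t (H_0 + z H_1))$ with respect to $z$ at $z = 0$, and the expansion says this is $\int_0^t \exp((t - s) H_0)\, H_1 \exp(s H_0)\, ds$. The sketch below compares the two for small matrices. The example matrices (a two-state system with $H_0 = -D$ diagonal), the Taylor-series matrix exponential, and all helper names are assumptions for illustration.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def scale(X, s):
    return [[s * x for x in row] for row in X]


def expm(X, terms=40):
    """Matrix exponential by Taylor series (adequate for small norms)."""
    result = identity(len(X))
    term = identity(len(X))
    for k in range(1, terms):
        term = scale(matmul(term, X), 1.0 / k)
        result = add(result, term)
    return result


def first_order_term(H0, H1, t, n=200):
    """Simpson estimate of the integral of exp((t-s)H0) H1 exp(sH0) ds."""
    h = t / n
    total = scale(identity(len(H0)), 0.0)
    for k in range(n + 1):
        s = k * h
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        integrand = matmul(matmul(expm(scale(H0, t - s)), H1),
                           expm(scale(H0, s)))
        total = add(total, scale(integrand, weight))
    return scale(total, h / 3.0)


def z_derivative(H0, H1, t, eps=1e-4):
    """Central difference of exp(t(H0 + z H1)) in z at z = 0."""
    plus = expm(scale(add(H0, scale(H1, eps)), t))
    minus = expm(scale(add(H0, scale(H1, -eps)), t))
    return scale(add(plus, scale(minus, -1.0)), 1.0 / (2.0 * eps))


# Hypothetical two-state example: H0 = -D (diagonal), H1 = off-diagonal rates.
H0 = [[-1.0, 0.0], [0.0, -0.5]]
H1 = [[0.0, 0.5], [1.0, 0.0]]

if __name__ == "__main__":
    print(first_order_term(H0, H1, 1.2))
    print(z_derivative(H0, H1, 1.2))
```

The same comparison could be repeated for higher $k$ with nested integrals, at rapidly increasing cost; the first-order check already exercises the ordering of the $\exp(\tau H_0)$ and $H_1$ factors.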

6.4 Errata

Advanced Cellerator arrows table.

(Add bug reports here.)


B A C K M A T T E R

References

[1] Jönsson, H., et al. (2006, January 13). An auxin-driven polarized transport model for phyllotaxis. Proc. Natl. Acad. Sciences USA, 103(5), 1633–1638. Retrieved 13 January 2006 from http://www.pnas.org/cgi/content/abstract/103/5/1633

[2] Mjolsness, E. (2005). Variable-Structure Systems from Graphs and Grammars. UC Irvine School of Information and Computer Sciences, Irvine. UCI ICS TR# 05-09, http://computableplant.ics.uci.edu/papers/ vbl-Struct_GG_TR.pdf

[3] Hawkins, D., & Ulam, S. (1944). Theory of Multiplicative Processes, 1. Los Alamos Scientific Laboratory, Los Alamos. LA-171

[4] Athreya, K. B., & Ney, P. E. (1972). Branching Processes. Springer-Verlag; Dover.

[5] Frey, B. (2003). Extending Factor Graphs so as to Unify Directed and Undirected Graphical Models. Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence.

[6] Lauritzen, S. (1996). Graphical Models. Oxford University Press.

[7] Gold, S., Rangarajan, A., & Mjolsness, E. (1996, May 15). Learning with Preknowledge: Clustering with Point and Graph Matching Distance Measures. Neural Computation, 8(4).

[8] Mjolsness, E. (2004). Labeled Graph Notations for Graphical Models: Extended Report. University of California, Irvine. UCI ICS TR #04-03, http://www.ics.uci.edu/~emj

[9] Buntine, W. L. (1994). Operations for Learning with Graphical Models. Journal of Artificial Intelligence Research.

[10] van Kampen, N. G. (1981). Stochastic Processes in Physics and Chemistry. North-Holland.

[11] Gillespie, D. T. (1976). Exact Stochastic Simulation of Coupled Chemical Reactions. Comput. Phys. 22, 403–434.

[12] Reed, M., & Simon, B. (1972). Methods of Modern Mathematical Physics: Functional Analysis I. New York: Academic Press.

[13] Engel, K., & Nagel, R. (2000). One-Parameter Semigroups for Linear Evolution Equations. New York: Springer Graduate Texts in Mathematics 194.

[14] Mattis, D. C., & Glasser, M. L. (1998). The uses of quantum field theory in diffusion-limited reactions. Reviews of Modern Physics, 70, 979–1001.

[15] Dyson, F. (1949). Phys. Rev., 75, 486.

[16] Risken, H. (1984). . Berlin: Springer.

[17] Gillespie, D. J. (1976). Comput. Physics, 403–434.

[18] McQuarrie, D. A. . Stochastic Approach to Chemical Kinetics. J. Appl. Prob., , 413–478.

[19] Jacquez, J. A., & Simon, C. P. (1993). The Stochastic SI Model with Recruitment and Deaths. I. Comparison with the Closed SIS Model. Mathematical Biosciences, 117, 77–125.

[20] Cuny, J., Ehrig, H., Engels, G., & Rozenberg, G. (1994). Graph Grammars and their Applications to Computer Science. Springer.

[21] Federl, P., & Prusinkiewicz, P. (2004). Solving differential equations in developmental models of multicellular structures expressed using L-systems. In M. Bubak, G. van Albada, P. Sloot & J. Dongarra (Ed.), Proceedings of Computational Science. ICCS 2004, II. Lecture Notes in Computer Science 3037 (pp. 65–72). Berlin: Springer.

[22] Bhan, A., & Mjolsness, E. (2006, July/August). Static and Dynamic Models of Biological Networks. Complexity, 57–63.
