A Genetically Modi ed Hoare Logic - unice.frbernot/BiomodelsBIS/2018-TCS_Hoare... · 2019. 1. 6. · ters. The application of temporal logic to biological regulatory networks was

A Genetically Modified Hoare Logic

G. Bernota, J.-P. Cometa, Z. Khalisa, A. Richarda, O. Rouxb

aUniversity Cote d’AzurI3S laboratory, UMR CNRS 7271,

CS 40121, 06903 Sophia Antipolis Cedex, FrancebIRCCyN UMR CNRS 6597, BP 92101,

1 rue de la Noe, 44321 Nantes Cedex 3, France

Abstract

An important problem when modelling gene networks lies in the identificationof parameters, even when considering a discrete framework as the one ofRene Thomas. We present in this article a new approach based on Hoarelogic to generate constraints on parameter values. Specifications of observedbehaviours play a role comparable to programs in the classical Hoare logic,and deduced weakest preconditions characterize the sets of all compatibleparameterizations, expressed as constraints on parameters. Finally we give,in supplementary materials, a proof of soundness of our Hoare logic for genenetworks as well as a proof of completeness and decidability based on thenotion of the weakest precondition.

Keywords: Hoare logic, Gene regulatory networks, Thomas’ networks,Parameter identification, Soundness and completeness

1. Introduction

Different methods for studying the behaviour of gene networks in a sys-tematic way have been proposed. Among them, ordinary differential equa-tions played an important role, which however mostly lead to numerical sim-ulations. Besides, the abstraction procedure of Rene Thomas [1], approx-imating sigmoid functions by step functions, makes it possible to describethe qualitative dynamics of gene networks as paths in a finite state space.Nevertheless this qualitative description of the dynamics is still governed bya set of parameter values, which, although becoming small integers, remaindifficult to deduce from classical experimental knowledge. In this context, we

Preprint submitted to Theoretical Computer Science July 20, 2017

are interested in the exhaustive search of parameter values that are consis-tent with specifications formalizing the experimentally observed behavioursof gene regulatory networks.

Several works were undertaken with the objective to identify the parame-ters. The application of temporal logic to biological regulatory networks waspresented in [2, 3], then constraint programming was used in [5, 4].

In this paper, we present a somewhat unexpected application of formalmethods to biology through a new approach based on Hoare logic [6] and itsassociated weakest precondition calculus [7] that generates constraints on pa-rameters. The formalism on which we decided to apply this idea is the one ofRene Thomas because it is now universally recognized as the reference frame-work for discrete modelling of gene networks. The key point of our proposalis to define a language able to capture the actual traces observed by molecu-lar biologists by means of a set of experiments (either at the transcriptomicor proteomic level [8]). We have designed a language which is expressiveenough to specify sets of observed traces while preserving the completenessof a corresponding extended Hoare logic. Since this method avoids buildingthe complete state graph, it results in a powerful technics to find out the con-straints representing the set of consistent parameterizations with a tangiblegain for computation time. Indeed, the weakest precondition proof strategywhich extracts the constraints, goes through the trace specification syntaxbut is independent of the size of the gene network.

The paper is organized as follows. The basic concepts of classical Hoarelogic and its associated Dijkstra weakest precondition are quickly remindedin Section 2. The classical formal definitions for Thomas’ discrete gene reg-ulatory networks are reminded in Section 3. Section 4 gives our definitionof (genetically modified) Hoare triples, including the assertion language andthe trace specification language. In Section 5, an extended Hoare logic forgene networks is defined for Thomas’ discrete models. In Section 6, the ped-agogical example of the incoherent feedforward loop of type 1 (made popularby Uri Alon in [9, 10]) highlights the whole process of our approach to findout the suitable parameter values. Section 7 (Related works) sketches thepreviously existing methods for formal identification of discrete parametersin gene network models. We conclude in Section 8. Supplementary materialsprovide the mathematical semantics of these extended Hoare triples and aproof of soundness of our Hoare logic for gene networks as well as a proof ofcompleteness and decidability.

2

2. Reminders on standard Hoare logic

The Hoare logic is a formal system for reasoning about the soundness ofimperative programs. In [6], Tony Hoare introduced the notation “{P} p {Q}”to mean “If the assertion P (precondition) is satisfied before performing theprogram p and if the program terminates, then the assertion Q (postcondi-tion) will be satisfied afterwards.” This constitutes de facto a specificationof the program under the form of a triple, called the Hoare triple. In [7],Edsger Dijkstra has defined an algorithm taking the postcondition Q andthe program p as input and computing the weakest precondition P0 that en-sures Q if p terminates. In other words, weakest means that the Hoare triple{P0} p {Q} is satisfied and that for any precondition P , {P} p {Q} is satisfiedif and only if P ⇒ P0 is semantically satisfied. Notice that weakest precondi-tion means that it does not contain any useless condition, so, it means thatthe set of states that satisfy the weakest precondition is the largest one. Thebasic idea is to stamp the sequential steps of a program with assertions thatare infered according to the instruction they surround.

Within the following inference rules, p, p1 and p2 stand for programs,P , P1, P2, I and Q stand for first-order assertions on the variables of theprogram, v stands for a variable of the imperative program, and Q[v ← expr]means that expr is substituted to each free occurrence of v in Q:

Assignment: {Q[v←expr]} v:=expr {Q}

Sequential composition:{P2} p2 {Q} {P1} p1 {P2}

{P1} p1;p2 {Q}

Conditional branching:{P1} p1 {Q} {P2} p2 {Q}

{(e∧P1)∨(¬e∧P2)} if e then p1 else p2 {Q}

Iteration:{e∧I} p {I} ¬e∧I⇒Q{I} while e with I do p {Q}

Empty program:P⇒Q

{P} ε {Q} (where ε stands for the empty program)

The Iteration rule deserves some comments. The assertion I is called theloop invariant and it is well known that finding the weakest loop invariant(if any) is undecidable in general [11, 12]. So, Tony Hoare asks the pro-grammer to give a loop invariant explicitely (with I). There are approaches

3

to help finding loop invariants similar to the iterative approach adopted inASTREE [13] (abstract interpretation [14]).

Some authors prefer the following iteration rule {e∧I} p {I}{I} while e with I do p {¬e∧I}

that requires the application of the empty program rule to become equivalentto our version. By doing so, these authors put the light on the fact that withina program, each while instruction carries its own (sub)specification and itcan consequently be proved apart from the rest of the program.

From the standard set of Hoare logic rules, the following proof strategybuilds a proof tree that computes the weakest precondition [7].

Definition 2.1. (Dijkstra Backward strategy). Let {P} p {Q} be a Hoaretriple. We call backward strategy the proof strategy defined inductivelly onp as follows:

1. If p is of the form p1; p2 where p2 is made of a single instruction,then apply the Sequential composition rule.

2. If p is a single instruction, then apply the corresponding rule ( Iterationrule, Conditional branching rule or assignment rule).

3. Only after steps 1 and 2 have fully treated p, i.e. when all instructionshave been treated, apply the Empty program rule.

Notice that, these three items being mutually exclusive, the backward strategygenerates a unique proof tree. (In addition, the remaining leafs of the prooftree must be handled using first order logic and arithmetic knowledge.)

By doing so, the precondition P0 obtained just before applying the lastEmpty program rule is the weakest precondition. According to StephenCook [15], the Hoare logic is complete assuming that each loop invariantin the program is the weakest loop invariant with respect to the conditioncomputed just at the right of its while statement. More technically, a pro-gram with a while statement is of the form: “p1 ; while e with I do p ; p2.”The Dijkstra backward strategy computes inductively the weakest precondi-tion Q2 such that, after the execution of p2, the postcondition is satisfied.So Q2 becomes the postcondition of the while statement. The Cook resultis then valid when the invariant I is the weakest condition that ensures Q2

if the program exits from the while statement. All in all, the Cook resultmeans that the Hoare triple {P} p {Q} is correct if and only if P ⇒ P0 issemantically satisfied. So the full completeness of the Hoare logic depends on

4

two things: a sufficient expressive power to express all the previously men-tionned weakest loop invariants and the existence of a first-order proof treefor P ⇒ P0 whenever it is semantically satisfied. Technically, this relies onthe expressiveness of the chosen underlying assertion language [16].

The most striking feature of the backward strategy for Hoare logic isthat, owing to very simple sequences of syntactic formula manipulations, wecapture the mathematical semantics of a program within first order logic.Nevertheless, it is worth noticing that we only address partial correctnesssince Hoare logic does not give any proof of the termination of the program(while instructions may induce infinite loops).

3. Reminders on discrete gene regulatory network models

This section presents the formal framework based on the discrete mod-elling method of Rene Thomas [17, 18] and introduced in [19]. As shown in

y(1)

x(2)

(x ≥ 2) ∧ µ3µ1

µ2x ≥ 1

µ3¬(y ≥ 1)

Figure 1: The graphical representation of a gene regulatory graph R = (V,M,EV , EM )with V = {x, y}, the bounds of x and y are respectively 2 and 1, M = {µ1, µ2, µ3}, ϕµ1

is((x > 2) ∧ µ3), ϕµ2

is (x > 1), ϕµ3is ¬(y > 1).

Figure 1, a gene regulatory graph is visualized as a labelled directed graph inwhich vertices are either variables (within circles) or multiplexes (within rect-angles). Variables abstract genes or their products, and multiplexes containpropositional formulas that encode situations in which a group of variables(inputs of multiplexes) influence the evolution of some variables (outputs ofmultiplexes). In the figure the simple multiplex µ2 expresses that the variablex can help the activation of the variable y when its state is at least equalto 1. In general, multiplexes can represent combined biological phenomena,one of the simplest being the formation of complexes (in which case the for-mula would simply contain a conjunction). In the figure, the situation of µ1

is a little bit more elaborated: It reflects an auto-activation of x at level 2which is controled by µ3. Because µ3 contains a negation, µ1 does not modela positive cooperation of x and y: The auto-activation of x is inhibited by y.

5

So, in this example, there are three qualitatively interesting intervals ofexpression levels for x: an interval called 0, where x can neither act on y noron itself, an interval called 1, where x can act on y and never on itself, andan interval called 2, where x can act on y as well as on itself provided thatµ3 is satisfied. From the biological point of view, there is a threshold (i.e. agiven number of intracellular molecules produced by x) such that x is unable(resp. able) to act on its target gene if its expression level is under (resp.over) the threshold.

We say that the bound of x is bx = 2 and similarly there are only 2qualitatively interesting intervals for y, so the bound of y is by = 1.

In general, this labeled directed graph is formally defined as follows.

Definition 3.1. A gene regulatory graph with multiplexes is a tuple R =(V,M,EV , EM) satisfying the following conditions:

• V and M are disjoint sets, whose elements are called variables andmultiplexes respectively.

• G = (V ∪M,EV ∪ EM) is a labeled directed graph such that:

– Edges of EV start from a variable and end to a multiplex, andedges of EM start from a multiplex and end to either a variable ora multiplex.

– Every directed cycle of G contains at least one variable.

– Every variable v of V is labeled by a positive integer bv called thebound of v.

– Every multiplex m of M is labeled by a formula ϕm belonging tothe language Lm inductively defined by:

− If v → m belongs to EV and s ∈ IN , then v > s is an atomof Lm.

− If m′ → m belongs to EM then m′ is an atom of Lm.

− If ϕ and ψ belong to Lm then ¬ϕ, (ϕ ∧ ψ) and (ϕ ∨ ψ) alsobelong to Lm.

All in all, the discrete values of a variable x abstract intervals of quantityof molecules produced by x within the cell. These intervals are obtained bysorting the activation thresholds of x on the list of its targets. Consequently

6

only the knowledge of the thresholds order is useful and not their actualvalues. The multiplexes use these abstract levels in order to encode peculiarbiological knowledge into formulas that define the conditions under whichthe regulation positively acts on its targets. If there is no peculiar knowledgeabout cooperation over a given target, there is one multiplex per regulatinggene acting on this target, whose formula is reduced to an atom.

Successive multiplexes can be combined by flattening their formulas:

Definition 3.2. The flaten version of a formula ϕm, denoted ϕm, is obtainedby recursively substituting each occurence of a multiplex m′ in ϕm by itsformula ϕm′ (this recursive process of substitutions is well defined because Ghas no directed cycle with only multiplexes).

In Figure 1, the flatten formula ϕµ1 is (x > 2) ∧ ¬(y > 1).As a result of the flatening transformation, all the atoms of a flaten formulaare of the form v > s.

A state is obviously an assignment of integer values to the variables vof V within the intervals [0, bv]. According to a given state, by replacingvariables by their values, ϕm becomes a propositional formula whose atomsare the results of the integer inequalities.

Definition 3.3. (States η, satisfaction relation |=N and resources ρ). Let Nbe a grn and V be its set of variables. A state of N is a function η : V → INsuch that η(v) 6 bv for all v ∈ V . Let L be the set of propositional formulaswhose atoms are of the form v > s with v ∈ V and let s be a positive integer(so that ϕm is a formula of L for every multiplex m of N). The satisfactionrelation |=N between a state η of N and a formula ϕ of L is inductivelydefined by:

• If ϕ is an atom of the form v > s, then η |=N ϕ if η(v) > s.

• If ϕ ≡ ψ1 ∧ ψ2 then η |=N ϕ if η |=N ψ1 and η |=N ψ2; and we proceedsimilarly for the other connectives.

Given a variable v ∈ V , a multiplex m ∈ N−(v) (where N−(v) is the set ofmultiplexes m such that m → v belongs to the interaction graph of N) is aresource of v at state η if η |=N ϕm. The set of resources of v at state η isdefined by ρ(η, v) = {m ∈ N−(v) | η |=N ϕm}.

7

According to figure 1, at the state where η(x) = 2 and η(y) = 1, ϕµ2 issatisfied and consequently µ2 is the only resource of y. On the contrary ϕµ1is false and consequently the set of resources of x is empty.

The equilibrium toward which the expression level of a gene v is attractedonly depends on its set ω of resources. The interval number between 0 and bvcontaining this equilibrium is classically denoted Kv,ω [20, 21, 17, 22, 2, 19].

Definition 3.4. A gene regulatory network (grn for short) is a couple N =(V,M,EV , EM ,K) satisfying the following conditions:

• R = (V,M,EV , EM) is a gene regulatory graph with multiplexes,

• K = {Kv,ω} is a family of integers indexed by v ∈ V and ω ⊂ N−(v),where N−(v) is the set of multiplexes m such that m→ v is an edge ofEM . Each Kv,ω must satisfy 0 6 Kv,ω 6 bv.

A usual notation abuse is the following: we write Kv instead of Kv,∅ andwe write Kv,m1m2... instead of Kv,{m1,m2,...}.

At a given state η, each variable v tries to evolve in the direction ofparameter Kv,ρ(η,v). Hence, at state η, v can increase if η(v) < Kv,ρ(η,v), itcan decrease if η(v) > Kv,ρ(η,v), and v is stable if η(v) = Kv,ρ(η,v).

y

x

1

1

0

0 2

Figure 2: State graph obtained according to Definition 3.5, following Figure 1 and arbi-trarily assuming that Kx = 0, Kx,µ1

= 2, Ky = 0 and Ky,µ2= 1.

In Figure 2, at the state (2, 1), we have Kx = 0 < η(x) = 2 andKy,µ2 = η(y) = 1, but (0, 1) is not a successor state of (2, 1) because theprotein degradation occurs one protein after the other and consequently theconcentration level of x cannot jump from 2 to 0. Consequently (1, 1) is thenext state.At (1, 0), both Kx = 0 < η(x) = 1 and Ky,µ2 = 1 > η(y) = 0, but theprobability for x and y to cross their threshold exactly at the same time is

8

null [20, 21, 17, 22, 2, 19]1. Consequently, there are two possible next states:(0, 0) if x crosses its threshold first and (1, 1) if y crosses its threshold first.

So, Thomas’ method assumes that variables evolve asynchronously andby unit steps toward their respective target levels:

Definition 3.5. (State Graph). Let N = (V,M,EV , EM ,K) be a grn. Thestate graph of N is the directed graph S whose set of vertices is the set ofstates of N , and such that there exists an edge (called transition) η → η′ ifone of the following conditions is satisfied:

• For all variables v ∈ V we have η(v) = Kv,ρ(η,v), and then η′ = η.

• There exists v ∈ V such that η(v) 6= Kv,ρ(η,v), and

η′(v) =

{η(v) + 1 if η(v) < Kv,ρ(η,v)

η(v)− 1 if η(v) > Kv,ρ(η,v)and ∀u 6= v, η′(u) = η(u).

For each variable v such that η(v) 6= Kv,ρ(η,v), there is a transition allowingv to evolve (±1) toward its focal level Kv,ρ(η,v). Every outgoing transition ofη is supposed to be possible, so that there is an nondeterminism as soon as ηhas several outgoing transitions. Figure 2 represents a complete state graph.

4. Syntax of Hoare triples for gene networks

In order to formalize known information about a gene network, we intro-duce in this section a language to express properties of states (assertions) anda language to express properties of state transitions (trace specifications).

4.1. Assertions for discrete models of gene networks

Definition 4.1. (Terms and Assertions). Let N = (V,M,EV , EM ,K) be agrn. The well formed terms for N are inductively defined by:

• Each integer n ∈ IN constitutes a well formed term

1Indeed, biologically, each threshold corresponds to a precise number of molecules pro-duced by x or y respectively in the cell. So, the probability for the degradation to make thenumber of x-molecules cross the x-threshold exactly at the same time as a new moleculeproduced by y makes the y-threshold crossed, is null (a sufficiently precise time scale willdistinguish the two events).

9

• For each variable v ∈ V , the name of the variable v, considered as asymbol, constitutes a well formed term.

• Similarly, for each v ∈ V and for each subset ω of N−(v), the symbolKv,ω constitutes a well formed term.

• If t and t′ are well formed terms then (t+ t′) and (t− t′) are also wellformed terms.

Let N = (V,M,EV , EM ,K) be a grn. The assertions for N are inductivelydefined by:

• If t and t′ are well formed terms then (t = t′), (t < t′), (t > t′), (t 6 t′)and (t > t′) are atomic assertions for N .

• If ϕ and ψ are assertions for N then ¬ϕ, (ϕ∧ψ), (ϕ∨ψ) and (ϕ⇒ ψ)are also assertions for N .

A state η of the network N satisfies an assertion ϕ if and only if itsinterpretation is valid in ZZ, after substituting each variable v by η(v) andeach symbol Kv,ω by its value according to the family K. We note η |=N ϕ.

Moreover, conventionnaly, we denote “>” the tautology (e.g. “1 = 1”).

4.2. Trace specifications for discrete models of gene networks

When biologists observe the dynamics of gene expression levels along aset of experiments, they extract, as a direct experimental knowledge, somesets of observed traces. It is consequently of first interest to see these sets ofobservations as basic elements for the specification of gene networks.

Definition 4.2. (Trace specifications). Let N = (V,M,EV , EM ,K) be agrn. The set of trace specifications for N is inductively defined by:

• For each v ∈ V and n ∈ [0, bv] the expressions v+, v− and v := nare atomic trace specifications (respectively increase, decrease or as-signment).

• If e is an assertion for N , then the expression assert(e) is an atomictrace specification.

10

• If p1 and p2 are trace specifications then (p1; p2) is also a trace speci-fication (sequential composition). Moreover the sequential compositionis associative, so that we can write (p1; p2; · · · ; pn) without intermediateparentheses.

• If p is a trace specification and if e and I are assertions for N , then(while e with I do p) is also a trace specification. The assertion I iscalled the invariant of the while loop.

• If p1 and p2 are trace specifications then ∀(p1, p2) and ∃(p1, p2) arealso trace specifications (quantifiers). Moreover the quantifiers are as-sociative and commutative, so that we can write ∀(p1, p2, · · · , pn) and∃(p1, p2, · · · , pn) as useful abbreviations.

Conventionnaly, we denote:

• ε (called the empty trace) the trace specification assert(>).

• if e then p1 else p2 (called conditional branching) the trace specifica-tion ∃( assert(e); p1 , assert(¬e); p2 ), where p1 and p2 are any tracespecifications and e is an assertion for N .

Intuitivelly, v+ (resp. v−, v := n) means that the expression level ofvariable v is increasing by one unit (resp. decreasing by one unit, set to aparticular value n during the experiment). assert(e) allows one to express aproperty of the current state without change of state. Sequential compositionallows one to concatenate two trace specifications. The loop invariant I, asin classical Hoare logic, is a way to handle an unknown number of trace rep-etitions: It will facilitate proofs of Hoare triples. Finally it becomes possibleto group together several trace specifications thanks to the quantifiers ∀ and∃. These intuitions are formalized as follows via a binary relation betweenstates and sets of states.

Notation 4.3. For a state η, a variable v and i ∈ [0, bv], we note η[v ← i]the state η′ such that η′(v) = i and for all u 6= v, η′(u) = η(u).

Definition 4.4. (Mathematical semantics of a trace specification). Let N =(V,M,EV , EM ,K) be a grn, let S be the state graph of N whose set ofvertices is denoted S and let p be a trace specification for N . The binaryrelation

p; is the smallest subset of S × P(S) such that, for any state η:

11

1. If p is the atomic expression v+, then let us consider the state η′ =η[v ← (η(v) + 1)]: If η → η′ is a transition of S then η

p; {η′}.

2. If p is the atomic expression v−, then let us consider the state η′ =η[v ← (η(v)− 1)]: If η → η′ is a transition of S then η

p; {η′}.

3. If p is the atomic expression v := i, then ηp; {η[v ← i]}.

4. If p is of the form assert(e), if η |=N e, then ηp; {η}.

5. If p is of the form ∀(p1, p2): If ηp1; E1 and η

p2; E2 then η

p; (E1∪E2).

6. If p is of the form ∃(p1, p2): If ηp1; E1 then η

p; E1, and if η

p2; E2

then ηp; E2.

7. If p is of the form (p1; p2): If ηp1; F and if {Ee}e∈F is a F -indexed

family of state sets such that ep2; Ee, then η

p; (

⋃e∈F Ee).

8. If p is of the form (while e with I do p0):

• If η 6|=N e then ηp; {η}.

• If η |=N e and ηp0;p; E then η

p; E.

Detailed comments about this definition can be found in supplementary ma-terials Appendix A.

4.3. Hoare triples

Similarly to Section 2, two assertions and one trace specification are usedto constitute a Hoare triple for gene networks.

Definition 4.5. A Hoare triple for a grn N is an expression of the form{P} p {Q} where P and Q are assertions for N , called pre- and post-condition respectively, and p is a trace specification for N .

In practice P can describe a set of states where cells have been synchronised atthe beginning of the experiment, for example all states for which the variablev has value zero (P ≡ (v = 0)), the trace specification p describes biologicallyobserved dynamic processes, for example increase of the expression level ofv (p ≡ v+), and the postcondition also describes observations at the end ofthe experiment, for example all states for which the variable v has value one(Q ≡ (v = 1)), and so on. Whether or not the triple is satisfied by a givengene network N , will depend on its state transition graph, thus it will dependon the parameter values in K.

12

Definition 4.6. (Semantics of a Hoare triple). Let N = (V,M,EV , EM ,K)be a grn and let S be the state graph of N whose set of vertices is denotedS. A Hoare triple {P} p {Q} is satisfied if and only if:

For all η ∈ S satisfying P , there exists E such that ηp; E and for all

η′ ∈ E, η′ satisfies Q.

See supplementary materials Appendix A for more details.

5. A Hoare logic for discrete models of gene networks

In this section, we define our genetically modified Hoare logic by givingthe rule for each constructor of trace specifications (Definition 4.2). First,let us introduce a few conventional names to denote formulas that will beintensively used.

Notation 5.1. For each variable v of a grn N , we conventionally use thefollowing notations:

1. For each subset ω of N−(v) we denote by Φωv the following formula

Φωv ≡ (

∧m ∈ ω

ϕm) ∧ (∧

m ∈ N−(v)rω

¬ϕm)

where N−(v) r ω stands for the complementary subset of ω in N−(v).From Definition 3.3, for all states η, η |=N Φω

v if and only if ω = ρ(η, v),that is, ω is the set of resources of v at state η. Consequently, for eachv, there exists a unique ω such that η |=N Φω

v .

2. We denote by Φ+v the following formula

Φ+v ≡

∧ω⊂N−(v)

(Φωv =⇒ Kv,ω > v)

From Definition 3.5, we have η |=N Φ+v if and only if there is a transi-

tion (η → η[v ← v + 1]) in the state graph S, that is, if and only if thevariable v can increase.

3. We denote by Φ−v the following formula

Φ−v ≡∧

ω⊂N−(v)

(Φωv =⇒ Kv,ω < v)

Similarly, η |=N Φ−v if and only if the variable v can decrease from thestate η in the state graph S.

13

See Section 6 where examples of these formulas are given.Our Hoare logic for discrete models of gene networks is then defined by

the following inference rules, where v is a variable of the grn and k ∈ [0, bv].

1. Rules encoding Thomas’ discrete dynamics.

Incrementation: { Φ+v ∧ Q[v←v+1] } v+ {Q}

Decrementation: { Φ−v ∧ Q[v←v−1] } v− {Q}

2. Rules coming from Hoare logic. These rules are similar to the onesgiven in Section 2. Obvious rules for the expression assert(Φ), and forthe quantifiers, are added:Assert: { Φ ∧ Q } assert(Φ) { Q }

Universal quantifier:{P1} p1 {Q} {P2} p2 {Q}{P1∧P2} ∀(p1,p2) {Q}

Existential quantifier:{P1} p1 {Q} {P2} p2 {Q}{P1∨P2} ∃(p1,p2) {Q}

Assignment: {Q[v←k]} v:=k {Q}

Sequential composition:{P1} p1 {P2} {P2} p2 {Q}

{P1} p1;p2 {Q}

Iteration:{e∧I} p {I} ¬e∧I⇒Q{I} while e with I do p {Q}

Empty trace:P ⇒ Q{P} ε {Q}

3. Boundary axioms asserting that all values stay between their bounds,for each v ∈ V and ω ⊂ N−(v):

0 6 v ∧ v 6 bv ∧ 0 6 Kv,ω ∧ Kv,ω 6 bv

Remark 5.2.

• (Φ+v ⇒ v < bv) can be deduced from the boundary axioms: Φ+

v impliesthat for ω corresponding to the current set of resources, Kv,ω > v and,using the boundary axiom Kv,ω 6 bv, we get v < bv.

14

• Similarly, we have (Φ−v ⇒ v > 0).

These implications will be used in Section 6.The conditional branching rule of the standard Hoare logic has not been

reproduced here because the trace specification (if e then p1 else p2) is ashorthand for ∃( assert(e); p1 , assert(¬e); p2 ). The conditional branchingrule remains correct.

We prove in Supplementary Materials Appendix B that this modifiedHoare logic is sound and complete and we show that the weakest loop in-variants can always be computed. More precisely, the proof strategy calledbackward strategy, already described at the end of Section 2, also applies here:It computes the weakest precondition.

Nevertheless, similarly to classical Hoare logic which reflects a partialsoundness of imperative programs, the previous definition does not implytermination of while loops.

6. A pedagogical example

In [9, 10] Uri Alon and co-workers have studied the most common invivo patterns involving at most four genes. Among them, even without con-sidering feedback loops such as in [23], there are interesting patterns whosedynamics is less obvious than it seems. In particular they have emphasizedthe incoherent feedforward loop of type 1. It is composed by a transcrip-tion factor a that activates a second transcription factor c, and both a andc regulate a gene b. The gene a is an activator of b whereas the gene c isan inhibitor of b. There is a “short” positive action of a on b and a “long”negative action via c: a activates c which inhibits b. The left hand side ofFigure 3 shows such a feedforward loop. Supposing that both thresholds ofactions of a are equal leads to a Boolean network since, in that case, thevariable a can take only the value 0 (a has no action) or 1 (a activates both band c). The right hand side of the figure shows the corresponding grn withmultiplexes: σ encodes the “short” action of a on b, whilst l followed by λconstitute the “long” action.

Classical interpretation: Uri Alon and many biologists have in mind thatif a is equal to 0 for a sufficiently long time, both b and c will also beequal to 0, because b and c need a as a resource in order to reach thestate 1. They also have in mind that the function of this feedforward

15

a

c

b1 +

1+

1 −

a 1

1

c

b

1

l

a > 1

a > 1

σ

¬(c > 1)

λ

Figure 3: (Left) Boolean “incoherent feedforward loop of type 1” according toUri Alon. (Right) Corresponding grn N=(V,M,EV , EM ,K). V={a, b, c} withba=bb=bc=1. M={l, λ, σ}, φl ≡ (a > 1), φλ ≡ (¬(c > 1)), φσ ≡ (a > 1).K={Ka,Kc,Kc,l,Kb,Kb,σ,Kb,λ,Kb,σλ}.

loop is to ensure a transitory activity of b that signals when a hasswitched from 0 to 1. The idea is that a activates the productions of band c, and then c stops the production of b.

Here, we revisit this affirmation via four different trace specifications, andwe prove formally that the affirmation is only valid under some constraintson the parameters of the network, and only under the assumption that bstarts its activity before c.

Is a transitory production of b possible? The simple popular idea that b isactivated and then the activation of c inhibits b is specified by the Hoaretriple {P} P1 {Q0} where P ≡ (a = 1 ∧ b = 0 ∧ c = 0), P1 ≡ (b+; c+; b−)and Q0 ≡ (b = 0). The backward strategy using our genetically modifiedHoare logic on this example gives the following successive conditions.

• The weakest precondition obtained through the last expression “b−” isΦ−b ∧Q0[b← b−1] (Decrementation rule):

Φ∅b ⇒ Kb < b

Φσb ⇒ Kb,σ < b

Φλb ⇒ Kb,λ < b

Φσ,λb ⇒ Kb,σλ < b

b− 1 = 0

≡

(¬¬(c > 1) ∧ ¬(a > 1))⇒ Kb < b(¬¬(c > 1) ∧ (a > 1))⇒ Kb,σ < b(¬(c > 1) ∧ ¬(a > 1))⇒ Kb,λ < b(¬(c > 1) ∧ (a > 1))⇒ Kb,σλ < bb− 1 = 0

16

which simplifies as Q1 ≡

b = 1((c > 1) ∧ (a < 1)) =⇒ Kb = 0((c > 1) ∧ (a > 1)) =⇒ Kb,σ = 0((c < 1) ∧ (a < 1)) =⇒ Kb,λ = 0((c < 1) ∧ (a > 1)) =⇒ Kb,σλ = 0

• Then, the weakest precondition obtained through the expression “c+”is Φ+

c ∧Q1[c← c+ 1]:

¬(a > 1)⇒ Kc > ca > 1⇒ Kc,l > cb = 1((c+ 1 > 1) ∧ (a < 1))⇒ Kb = 0((c+ 1 > 1) ∧ (a > 1))⇒ Kb,σ = 0((c+ 1 < 1) ∧ (a < 1))⇒ Kb,λ = 0((c+ 1 < 1) ∧ (a > 1))⇒ Kb,σλ = 0

which simplifies as

Q2 ≡

c = 0a < 1⇒ Kc = 1a > 1⇒ Kc,l = 1b = 1a < 1⇒ Kb = 0a > 1⇒ Kb,σ = 0

using the boundary axioms and Re-

mark 5.2.

• Lastly, the weakest precondition obtained through the first “b+” of the

trace is Φ+b ∧Q2[b← b+1] which simplifies asQ3 ≡

a < 1⇒ Kb,λ = 1a > 1⇒ Kb,σλ = 1c = 0a < 1⇒ Kc = 1a > 1⇒ Kc,l = 1b = 0a < 1⇒ Kb = 0a > 1⇒ Kb,σ = 0

Then, using the Empty trace rule, it follows that P =⇒ Q3 i.e. (a = 1 ∧ b =0 ∧ c = 0) =⇒ Q3. After simplification we get correctness if and only ifKb,σλ = 1 and Kc,l = 1 and Kb,σ = 0. So, under these three hypothesesand whatever the values of the other parameters, the system can exhibit atransitory production of b in response to a switch of a from 0 to 1.

Is a transitory production of b possible without increasing c? The previoustrace specification P1 is not the only one reflecting a transitory production

17

of b, there may be other realisations of this property. For example one canconsider the trace specification

P2 ≡ (b+; b−).

With respect to this trace specification, the weakest precondition obtainedthrough the last expression “b−” is of course Q1 as previously. Then, theweakest precondition obtained through “b+” is

Q4 ≡

b = 0((c > 1) ∧ (a < 1)) =⇒ ((Kb = 1) ∧ (Kb = 0))((c > 1) ∧ (a > 1)) =⇒ ((Kb,σ = 1) ∧ (Kb,σ = 0))((c < 1) ∧ (a < 1)) =⇒ ((Kb,λ = 1) ∧ (Kb,λ = 0))((c < 1) ∧ (a > 1)) =⇒ ((Kb,σλ = 1) ∧ (Kb,σλ = 0))

Q4 is not satisfiable: It implies that each parameter associated with b is bothequal to 0 and 1. The trace (b+; b−) is not realisable (inconsistent weakestprecondition).

The existence of the trace (b+, c+, b−) does not imply a transitory productionof b for all traces in the same gene network. When Kb,σλ = 1, Kc,l = 1 andKb,σ = 0, that is when trace (b+, c+, b−) is realisable, this does not preventfrom some other traces that do not exhibit a transitory production of b. Forinstance the simple trace specification P3 ≡ c+ leaves b constantly equal to0, and the Hoare triple{

a = 1 ∧ b = 0 ∧ c = 0 ∧Kb,σλ = 1 ∧ Kc,l = 1 ∧ Kb,σ = 0

}c+

{b = 0

}is satisfied, as the corresponding weakest precondition Q5 is clearly impliedby the precondition.

Q5 ≡ Φ+c ∧Q0[c← c+ 1] ≡

c = 0a = 0 =⇒ Kc = 1a = 1 =⇒ Kc,l = 1b = 0

Once a constantly equals 1, if c reaches level 1 before b, even transitorily, thenno production of b is possible anymore. We prove this property by showingthat the following triple is inconsistent, whatever the loop invariant I:a = 1 ∧ b = 0 ∧c = 1 ∧Kb,σλ=1 ∧Kc,l=1 ∧ Kb,σ=0

while b<1 with I do ∃(b+, b−, c+, c−)︸︷︷︸P4

{b=1}

18

The subtrace specification ∃(b+, b−, c+, c−) reflects the fact that a staysconstant but b or c evolves. Thus, the while statement allows b and c toevolve freely until b becomes equal to 1.Applying the Iteration rule, I has to satisfy ¬(b < 1) ∧ I =⇒ (b = 1): Thisproperty is trivially satisfied whatever the assertion I, due to the boundaryaxioms. I has also to satisfy {b < 1 ∧ I} ∃(b+, b−, c+, c−) {I} which givesvia the existential quantifier rule:

Q6 ≡{

(Φ+b ∧ I[b← b+ 1]) ∨ (Φ−b ∧ I[b← b− 1]) ∨

(Φ+c ∧ I[c← c+ 1]) ∨ (Φ−c ∧ I[c← c− 1])

Consequently I must be any assertion such that

(b = 0 ∧ I) =⇒ Q6

Let us denote P the precondition of the trace specification P4. Applying theEmpty trace rule, it results that I must also satisfy P =⇒ I. So, becauseP =⇒ (b = 0), we have P =⇒ (b = 0 ∧ I), which, in turn implies Q6.Moreover, let us remark that Q6 =⇒ (Φ+

b ∨Φ−b ∨Φ+c ∨Φ−c ). Consequently,

if the Hoare triple of P4 is correct, then P =⇒ (Φ+b ∨Φ−b ∨Φ+

c ∨Φ−c ) whichis impossible because, if P is satisfied then

• Φ+b is false, as a = 1, c = 1 and Kb,σ = 0

(indeed,Φ+b implies a = 1 ∧ c = 1⇒ Kb,σ > 0)

• Φ−b is false, as b = 0 (Φ−b implies b > 0)

• Φ+c is false, as c = 1 (Φ+

c implies c < 1)

• Φ−c is false, as a = 1, c = 1 and Kc,l = 1(Φ−c implies a = 1 ∧ c = 1⇒ Kc,l < 1).

So, we have formally proved that when a is constantly equal to 1, as soonas c has reached the level 1, it becomes never possible for b to increaseto 1. As announced at the begining of this section, this proof contradicts theuniversality of the classical interpretation of this incoherent feedforward loopof type 1.

7. Related Works

What motivates the introduction of formal methods in discrete modellingof gene networks (or any complex system) is of course the automation ofparameter identification.

19

First approaches based on Thomas’ modelling used hand-made identifi-cation taking benefit from known mathematical properties on circuits andusing simulations, on a “trial and error” method [24, 25]. Later on, simu-lation softwares helped systematic simulations, mainly GNA [26] and GIN-sim [27] that also include some tools for the determination of invariants. Onbiological systems where sufficient biological knowledge drastically limits thepossible parameter values, approaches purely based on simulations remainefficient [28].

The first use of the power of formal methods really comes with temporallogics and model checking with the software SMBioNet [2]. Later on, GNAalso included some aspects of model checking and Alexander Bockmayr andHeike Siebert [29] introduced timed automata using UPPAAL. Constraintsolving efficiently complemented the temporal logic approach [5, 4] as well assymbolic execution technics [3]. More detailed descriptions of these methodsand their variants can be found in [30, 31]. These approaches fully take ben-efit from biological expertise, formalizing knowledge into temporal formulasbut they need a large interpretation capacity of the experimental observa-tions. This was our motivation to introduce Hoare Logic which uses tracespecifications directly extracted from experiments.

Following the same motivation, Heike Siebert and co-workers [32] encodedtime-series measurements into CTL formulas. Their approach is able to takeinto account partially known time-series measurements using repeatedly en-capsulated EF statements. Then, they use softwares such as SMBioNet inorder to identify the parameters. The price to pay is a huge computationtime to identify the parameters, compared to constraint solving. Also, com-pared to our Hoare Logic, neither assignment, nor quantifier nor iteration arepossible. Notice that although the Siebert’s approach is based on a modallogic, a procedure based on tableau semantics [33, 34], does not apply becausethe objective of using time-series from biological experiments is, similarly toour approach, to extract constraints on the Thomas’ parameters; it is not toprove the satisfiability of the considered time-series2.

On the semantic side, Definition 4.4 is in fact rather natural and similar

2Notice also that, although both Dijkstra weakest precondition algorithm and thetableau procedure for LTL go backwards, they are intrinsically different. In particular, inthe Hoare approach as well as ours, the size of the formulas built by the Dijkstra algorithmincreases up to the final constraint, contrarily to tableau procedure that builds a sequenceof decreasing subformulas of the considered formula.

20

ideas have been used for concurrent systems in computer science, such as forinstance in [35, 36] where the authors defined a mathematical semantics forconcurrent propositional dynamic logic. Our definition has a slightly differenttreatment of quantifiers, disjunctions and conjunctions in order to cope withthe biological meaning of non-determinism.

Last but not least, whatever the aforementioned formalism, there is nopossibility to model an intervention of the biologist during the experiment.Knock-Outs of genes are typical examples of such interventions. In our for-malism they are easy to express in trace specifications, using asignment ex-pressions (such as v := 0). They are not directly expressible in the otherformalisms, including CTL or LTL, because the logic formulas they considerare by definition satisfied (or not) according to the paths within a given modelwhich is a transition graph deduced from the gene network: knock-outs donot correspond to transitions.

Let us additionally remark that abstract interpretation [14] subsumesthe Hoare logic, so a natural question is should we use genetically modifiedabstract interpretation instead of genetically modified Hoare logic? The pointis that the dynamics of Thomas’ networks is formalized in a easy way usingHoare inference rules, whereas abstract interpretation would make thingsmore complicated without actual benefit. Hoare triples facilitate discussionswith biologists.

8. Conclusion

As a consequence of our results, when a genetically modified Hoare tripleis correct, we are always able to automatically generate all the weakest loopinvariants and to build a syntactic proof tree that establishes the soundness.In other words, the assertion language of Definition 4.1 is expressive enoughto ensure the purely logical soundness and decidability of our geneticallymodified Hoare logic with while loops and quantifiers. This is an importantstep towards a systematic exploitation of the numerous gene expression tracesavailable in biological databases.

We used our genetically modified Hoare logic on several examples in-cluding the classical epigenetic switch of λ phage and, in cooperation withbiologists, other examples of credible size such as the circadian clock or thecell cycle in mammals. In all examples the computation of the weakest pre-condition takes less than one tens of second on a standard laptop (dual core,2GHz). What can take time is the resolution of constraints, varying from

21

ten seconds to one day, depending on the chosen constraint solver and theproblem under consideration (CTL based softwares require several days tomodel check all the possible sets of parameter values). On the mammal cellcycle example, inspired by the model proposed by John Tyson in [37], wemade a discrete model with 5 variables and 11 multiplexes. We obtaineda set of 339 738 624 possible valuations, each model with 48 states and 26parameters. From biological knowledge we extracted 12 trace specifications.After applying our Hoare logic method, 13 parameters were entirely identified(50%) and only 8192 valuations remained possible according to the generatedconstraints (0.002%). Lastly additional reachability properties (endoreplica-tion and quiescent phase) have been necessary to identify all parameters byformalizing them into temporal logic.

One may easily imagine similar works for many applications besides genenetworks. When modelling any complex system, the cornerstone lies, what-ever the application domain, in the identification of the parameters. Hoarelogic was initially designed for proofs of imperative programs. In this paper,we divert this approach for exhibiting constraints on parameters of gene net-work models. One can imagine several other adaptations for several typesof discrete complex systems, the key point is to extract from the consideredunderlying modelling framework, a first order formula that characterizes theconditions under which a transition exists.

Acknowledgment

The authors thank the French National Agency for Research (ANR-14-CF09-0011 HyClock project) for its support. This work has also been partlysupported by the ANR-10-BLANC-0218 BioTempo project, by the CNRSPEPII project CirClock and by the European PHC PROCOPE projectTiGeRNet.

References

[1] R. Thomas, Regulatory networks seen as asynchronous automata : Alogical description, J. theor. Biol. 153 (1991) 1–23.

[2] G. Bernot, J.-P. Comet, A. Richard, J. Guespin, Application of formalmethods to biological regulatory networks: Extending Thomas’ asyn-chronous logical approach with temporal logic, Journal of TheoreticalBiology 229 (3) (2004) 339–347.

22

[3] D. Mateus, J.-P. Gallois, J.-P. Comet, P. Le Gall, Symbolic modelingof genetic regulatory networks, Journal of Bioinformatics and Compu-tational Biology 5 (2B) (2007) 627–640.

[4] F. Corblin, S. Tripodi, E. Fanchon, D. Ropers, L. Trilling, A declara-tive constraint-based method for analyzing discrete genetic regulatorynetworks, Biosystems 98 (2) (2009) 91–104.

[5] E. Fanchon, F. Corblin, L. Trilling, B. Hermant, D. Gulino, Modelingthe molecular network controlling adhesion between human endothelialcells: Inference and simulation using constraint logic programming, in:CMSB, 2004, pp. 104–118.

[6] C. Hoare, An axiomatic basis for computer programming, Communica-tions of the ACM 12 (10) (1969) 576–585.

[7] E. W. Dijkstra, Guarded commands, nondeterminacy and formal deriva-tion of programs, Commun. ACM 18 (1975) 453–457.

[8] A. Bernot, Genome transcriptome and proteome analysis, John Wiley& Sons, 2004.

[9] S. Shen-Orr, R. Milo, S. Mangan, U. Alon, Network motifs in the tran-scriptional regulation network of escherichia coli, Nature Genetics 31(2002) 64–68.

[10] R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, U. Alon,Network motifs: Simple building blocks of complex networks, Science298 (2002) 824–827.

[11] W. Hatcher, A semantic basis for program verification, J. of Cybernetics4 (1) (1974) 61–69.

[12] A. Blass, Y. Gurevich, Inadequacy of computable loop invariants, ACMTransactions on Computational Logic 2 (1) (2001) 1–11.

[13] P. Cousot, R. Cousot, J. Feret, L. Mauborgne, A. Min, D. Monniaux,X. Rival, The ASTRE analyser., in: M. Sagiv (Ed.), ESOP 2005 TheEuropean Symposium on Programming, no. 3444 in LNCS, Springer,2005, pp. 21–30.

23

[14] P. Cousot, R. Cousot, Basic concepts of abstract interpretation., in:R. Jacquard (Ed.), Building the Information Society, Kluwer Academic,2004, pp. 359–366.

[15] S. A. Cook, Soundness and completeness of an axiom system for programverification, SIAM Journal on Computing 7 (1) (1978) 70–90.

[16] D. Kozen, J. Tiuryn, On the completeness of propositional Hoare logic,Information Sciences 139 (3) (2001) 187–195.

[17] R. Thomas, R. d’Ari, Biological Feedback, CRC Press, 1990.

[18] R. Thomas, M. Kaufman, Multistationarity, the basis of cell differentia-tion and memory. II. logical analysis of regulatory networks in terms offeedback circuits, Chaos 11 (2001) 180–195.

[19] Z. Khalis, J.-P. Comet, A. Richard, G. Bernot, The SMBioNet methodfor discovering models of gene regulatory networks, Genes, Genomes andGenomics 3(special issue 1) (2009) 15–22.

[20] R. Thomas, A. Gathoye, L. Lambert, A complex control circuit. regula-tion of immunity in temperate bacteriophages., Eur. J. Biochem. 71 (1)(1976) 211–27.

[21] R. Thomas, Logical analysis of systems comprising feedback loops., J.Theor. Biol. 73 (4) (1978) 631–56.

[22] E. Snoussi, R. Thomas, Logical identification of all steady states : theconcept of feedback loop caracteristic states, Bull. Math. Biol. 55 (5)(1993) 973–991.

[23] B. Yordanov, G. Batt, C. Belta, Model checking discrete-time piecewiseaffine systems: application to gene networks, in: Control Conference(ECC), 2007 European, IEEE, 2007, pp. 2619–2626.

[24] M. Kaufman, J. Urbain, R. Thomas, Towards a logical analysis of theimmune response, Journal of theoretical biology 114 (4) (1985) 527–561.

[25] R. Thomas, M. Kaufman, Multistationarity, the basis of cell differenti-ation and memory. I. & II., Chaos 11 (2001) 170–195.

24

[26] H. de Jong, J. Geiselmann, C. Hernandez, M. Page, Genetic networkanalyzer: qualitative simulation of genetic regulatory networks., Bioin-formatics 19 (3) (2003) 336–44.

[27] A. Gonzalez, A. Naldi, L. Sanchez, D. Thieffry, C. Chaouiya, Ginsim: asoftware suite for the qualitative modelling, simulation and analysis ofregulatory networks, Biosystems 84 (2) (2006) 91–100.

[28] R. Khoodeeram, G. Bernot, J.-Y. Trosset, An ockham razor model ofenergy metabolism, in: P. Amar, F. Kps, V. Norris (Eds.), Proc. ofthe Thematic Research School on Advances in Systems and SyntheticBiology, EDP Science, 2017, pp. 81–101.

[29] H. Siebert, A. Bockmayr, Temporal constraints in the logical analysisof regulatory networks, Theoretical Computer Science 391 (3) (2008)258–275.

[30] G. Bernot, J.-P. Comet, C. Risso-de Faverney, Regulatory networks,in: B. Reisfeld, A. Mayeno (Eds.), Computational Toxicology, Vol. II,Humana Press, ISBN 978-1-62703-058-8, USA, 2013, pp. 215–234.

[31] G. Bernot, J.-P. Comet, H. Snoussi, Formal methods applied to genenetwork modelling, in: L. Farinas del Cerro, K. Inoue (Eds.), LogicalModeling of Biological Systems, Bioengineering and health science se-ries, ISTE & Wiley, ISBN 978-1-84821-680-8, 2014, pp. 245–289.

[32] H. Klarner, A. Streck, D. Safranek, J. Kolcak, H. Siebert, Parameteridentification and model ranking of Thomas networks, in: Computa-tional Methods in Systems Biology, Springer, 2012, pp. 207–226.

[33] M. Reynolds, A traditional tree-style tableau for LTL, CoRRabs/1604.03962.URL http://arxiv.org/abs/1604.03962

[34] M. Bertello, N. Gigante, A. Montanari, M. Reynolds, Leviathan: A newLTL satisfiability checking tool based on a one-pass tree-shaped tableau.,in: IJCAI, 2016, pp. 950–956.

[35] D. Peleg, Concurrent dynamic logic, Journal of the ACM (JACM) 34 (2)(1987) 450–479.

25

[36] M. Hennessy, Algebraic Theory of Processes, MIT Press, 1988.

[37] J. Tyson, B. Novak, Temporal organization of the cell cycle, CurrentBiology 18 (17) (2008) R759–R768.

26

Supplementary materials

Appendix A. Semantics of Hoare triples for gene networks

We define the semantics of a trace specification via a binary relationbetween states and sets of states. This relation characterises all the possiblerealisations of the trace specification. The general ideas that motivate ourdefinition are the following:

• Starting from an initial state η, a trace specification without existentialor universal quantifier is either realised by associating with η anotherstate η′ or is not realisable so that η′ does not exist. For example, theatomic expression v+ associates η′ with η (where ∀u 6= v, η′(u) = η(u)and η′(v) = η(v) + 1) if and only if the transition η → η′ exists in thestate space. If, on the contrary, this transition does not exist, the tracespecification is not realisable.

• Existential quantifiers open a sort of space of possibilities for η′: Ac-cording to the chosen trace specification under each existential quanti-fier one may get different associated states. Consequently, one cannotdefine the semantics as a partial function that associates a unique η′

with η; a binary relation is a more suited mathematical object (denoted; in the sequel).

• A universal quantifier induces a sort of unity/solidarity between all thestates η′ that can be obtained through each trace specification underits scope. We will see later (Definition Appendix A.2) that all thesestates have to satisfy the postcondition. For this reason, we define abinary relation that associates a set of states E with the initial stateη: “η ; E”. Such a set E can be understood as grouping togetherthe states it contains in preparation for checking the forthcoming postcondition.

• When the trace specification p contains both existential and universalquantifiers, we may consequently get several sets E1, · · · , En such thatη

p; Ei, each of the Ei being a possibility through the existential quan-

tifiers of p and all the states belonging to a given Ei being togetherthrough the universal quantifiers of p. On the contrary, if p is not re-alisable, then there is no set E such that η

p; E (not even the empty

set).

27

Definition Appendix A.1. (Mathematical semantics of a trace specifica-tion). Let N = (V,M,EV , EM ,K) be a grn, let S be the state graph of Nwhose set of vertices is denoted S and let p be a trace specification for N .The binary relation

p; is the smallest subset of S ×P(S) such that, for any

state η:

1. If p is the atomic expression v+, then let us consider the state η′ =η[v ← (η(v) + 1)]: If η → η′ is a transition of S then η

p; {η′}.

2. If p is the atomic expression v−, then let us consider the state η′ =η[v ← (η(v)− 1)]: If η → η′ is a transition of S then η

p; {η′}.

3. If p is the atomic expression v := i, then ηp; {η[v ← i]}.

4. If p is of the form assert(e), if η |=N e, then ηp; {η}.

5. If p is of the form ∀(p1, p2): If ηp1; E1 and η

p2; E2 then η

p; (E1∪E2).

6. If p is of the form ∃(p1, p2): If ηp1; E1 then η

p; E1, and if η

p2; E2

then ηp; E2.

7. If p is of the form (p1; p2): If ηp1; F and if {Ee}e∈F is a F -indexed

family of state sets such that ep2; Ee, then η

p; (

⋃e∈F Ee).

8. If p is of the form (while e with I do p0):

• If η 6|=N e then ηp; {η}.

• If η |=N e and ηp0;p; E then η

p; E.

This definition calls for several comments.The relation

p; exists because (i) the set of all relations that satisfy the

properties 1–8 of the definition is not empty (the relation which links allstates to all sets of states satisfies the properties) and (ii) the intersection ofall the relations that satisfy the properties 1–8, also satisfies the properties.

A simple atomic expression such as v+ may be not realisable in a stateη (if η → η′ is not a transition of S). In this case, there is no set E such

that ηv+; E. The same situation happens when the trace specification is an

assertion that is not satisfied at the current state η.Universal quantifiers propagate non-realisable trace specifications: If one

of the pi is not realisable then ∀(p1, · · · , pn) is not realisable. It is not the

case for existential quantifiers : If ηpi; Ei for one of the pi then η

∃(p1···pn); Ei

even if one of the pj is not realisable.When a while loop does not terminate, there is no set E such that

ηwhile...; E. This is due to the minimality of the binary relation

p;. On

28

the contrary, when the while loop terminates, it is equivalent to a tracespecification containing a finite number of occurrences of the subtrace p0 insequence, starting from η.

The semantics of sequential composition may seem unclear for readers notfamiliar with commutations of quantifiers. We give an example to explainthe construction of

p1;p2; (see Figure A.4):

p2p1

F1=

ηa

ηb

ηc

η

E4

E1

E2

E3

F2=

gives: ηE1 ∪ E4

E2 ∪ E3

E2 ∪ E4

E1 ∪ E3

p1; p2

Figure A.4: An example for the semantics of sequential composition

• Let us assume that starting from state η, two sets of states are possiblevia p1: η

p1; F1 = {ηa, ηb} and η

p1; F2 = {ηc}. It intuitively means that

p1 permits a choice between F1 and F2 through some existential quan-tifier and that the trace specification leading to F1 contains a universalquantifier grouping together ηa and ηb.

• Let us also assume that

– starting from the state ηa, two sets of states are possible via p2:ηa

p2; E1 and ηa

p2; E2,

– starting from the state ηb, two sets of states are possible via p2:ηb

p2; E3 and ηb

p2; E4,

– and there are no set E such that ηcp2; E.

When focusing on the traces of (p1; p2) that encounter F1 after p1, the tracessuch that p1 leads to ηa must be grouped together with the ones that leadto ηb. Nevertheless, for each of them, p2 permits a choice of possibilities:

29

between E1 or E2 for ηa and between E3 or E4 for ηb. Consequently, whengrouping together the possible futures of ηa and ηb, one needs to consider thefour possible combinations: η

p1;p2; (E1∪E3), η

p1;p2; (E1∪E4) η

p1;p2; (E2∪E3)

and ηp1;p2; (E2 ∪ E4).

Lastly, when focusing on the traces of (p1; p2) that encounter F2 after p1,since ηc has no future via p2, there is no family indexed by F2 as mentionedin the definition and consequently it adds no relation into

p1;p2; .

Let us remark that, if ηp; E then E cannot be empty; it always contains

at least one state. The proof is easy by structural induction of the tracespecification p (using the fact that a while loop which terminates is equiva-lent to a trace specification containing a finite number of occurrences of thesubtrace p0).

Definition Appendix A.2. (Semantics of a Hoare triple). Given a grnN = (V,M,EV , EM ,K), let S be the state graph of N whose set of verticesis denoted S. A Hoare triple {P} p {Q} is satisfied if and only if:

For all η ∈ S satisfying P , there exists E such that ηp; E and for all

η′ ∈ E, η′ satisfies Q.

The previous definition implies the consistency of all the traces describedby the trace specification p with the state graph: If the specification p is notrealisable starting from one of the states satisfying pre-condition P , then theHoare triple cannot be satisfied. For instance if some v+ is required by thetrace specification p but the increasing of v is not possible according to thestate graph, then the Hoare triple is not satisfied.For example, let us consider the grn in Figure A.5 and its state graph.

1. The Hoare triple {(a = 0) ∧ (b = 0)} a+; a+; b + {(a = 2) ∧ (b = 1)}is satisfied, because

• for all states that do not satisfy the pre-condition, the Hoare tripleis satisfied by definition,

• there is, in this example, a unique state satisfying the precondition(a = 0) ∧ (b = 0) and from this state, the trace specificationa+; a+; b+ is possible and leads to the state (2, 1) and

• the state (2, 1) satisfies the postcondition (a = 2) ∧ (b = 1).

2. The Hoare triple {(a = 2) ∧ (b = 0)} b+; a−; a − {(a = 0) ∧ (b = 1)}is not satisfied because from the state satisfying the precondition, the

30

µ1

µ2

µ3

b

a

1

1

0

0 2

a b(2) (1)

¬(b ≥ 1)

a ≥ 2

a ≥ 1

Figure A.5: (Left) The graphical representation of the grn N = (V,M,EV , EM ,K) withV = {a, b}, the bounds of a and b are respectivelly 2 and 1, M = {µ1, µ2, µ3}, ϕµ1

is(a > 2), ϕµ2 is (a > 1), ϕµ3 is ¬(b > 1). Finally the family of integers is {Ka = 1,Ka,µ1 = 2, Ka,µ3 = 2, Ka,µ1µ3 = 2, Kb = 1, Kb,µ2 = 1}. (Right) Representation of itsstate graph.

first expression b+ is realisable and necessarily leads to the state (2, 1)from which the next expression a− is not consistant with the stategraph.

3. The following Hoare triple contains two existantial quantifiers and auniversal one:{(a = 0)∧(b = 0)} ∀(a+, b+);∃(a+, b+);∃(ε, b+) {(b = 1)} (rememberthat ε denotes the empty trace and is an abbreviation for assert(>)where > stands for a tautology).

• We have clearly (0, 0)∀(a+,b+); {(1, 0), (0, 1)}

• Since we have (1, 0)∃(a+,b+); {(2, 0)} and (1, 0)

∃(a+,b+); {(1, 1)} and

(0, 1)∃(a+,b+); {(1, 1)}, we deduce (0, 0)

∀(a+,b+);∃(a+,b+); {(1, 1), (2, 0)}

and (0, 0)∀(a+,b+);∃(a+,b+)

; {(1, 1)}.

• We have trivially (1, 1)∃(ε,b+); {(1, 1)}

• Moreover we have both (2, 0)∃(ε,b+); {(2, 0)} and (2, 0)

∃(ε,b+); {(2, 1)}

• We deduce that the considered trace specification p can lead to 3different sets of states: (0, 0)

p; {(1, 1), (2, 0)}, (0, 0)

p; {(1, 1)}

and (0, 0)p; {(1, 1), (2, 1)}.

Because the postcondition is satisfied in both states (1, 1) and (2, 1),the two last sets of states which are in relation with (0, 0), satisfy thepostcondition. Consequently although the first set does not, one candeduce that the Hoare triple is satisfied.

31

Appendix B. Partial Soundness and Completeness

As usual in Hoare logic, “partial” has to be understood here as “assumingthat the while loops of the considered trace specification terminate.”

Appendix B.1. Partial soundness

The soundness of our modified Hoare logic means that: Given a networkN = (V,M,EV , EM ,K), if ` {P} p {Q} according to the inference rules ofSection 5 (and after substituting the symbols K... by their value in N), then

for all states η that satisfies P , if there exists E such that ηp; E, then there

exists E ′ such that ηp; E ′ and ∀η′ ∈ E ′, η′ |=N Q.

The proof is made as usual by induction on the proof tree of ` {P} p {Q}.Hence, we have to prove that each rule of Section 5 is correct. Here we developonly the Incrementation rule and the Sequential composition rule since thecorrectness of the other inference rules is either similar (Decrementation rule),trivial (Assert rule, Quantifier rules, Assignment rule, Empty trace rule andBoundary axioms) or standard in Hoare logic (Iteration rule). Let us notethat the correctness of the Sequential composition rule is not trivial becauseits semantics is enriched to cope with the quantifiers.

Let η be any state of N .

Incrementation rule: { Φ+v ∧ Q[v←v+1] } v+ {Q} (where v is a

variable of N)

From Definition Appendix A.2, the hypothesis is

H η |=N Φ+v and η |=N Q[v ← v + 1]

and we have to prove the conclusion

C there exists E ⊂ S such that ηv+; E and ∀η′ ∈ E, η′ |=N Q

Let us choose E = {η′} with η′ = η[v ← η(v) + 1]. From Notation 5.1,the hypotesis η |=N Φ+

v is equivalent to (η → η′) ∈ S, which in

turn, according to Definition 4.4, implies ηv+; {η′}. Hence, it only

remains to prove that η′ |=N Q, which results from the hypothesisη |=N Q[v ← v + 1]. 2

32

Sequential composition rule:{P1} p1 {P2} {P2} p2 {Q}

{P1} p1;p2 {Q}From Definition Appendix A.2, we consider the following three hy-potheses:

H1 for all η1 ∈ S such that η1 |=N P1 there exists E1 such that

η1p1; E1 and ∀η′ ∈ E1, η

′ |=N P2

H2 for all η2 ∈ S such that η2 |=N P2 there exists E2 such that

η2p2; E2 and ∀η′′ ∈ E2, η

′′ |=N Q

H3 η |=N P1

and we have to prove the conclusion:

C there exists E ⊂ S such that ηp1;p2; E and ∀η′′ ∈ E, η′′ |=N Q

Let us arbitrarily choose a setE1 such that ηp1; E1 and ∀η′ ∈ E1, η

′ |=N

P2 (we know that E1 exists from H1 and H3 ).

For each η′ ∈ E1, we similarly choose a set Eη′

2 such that:

η′p2; Eη′

2 and ∀η′′ ∈ Eη′

2 , η′′ |=N Q (we know that the family {Eη′

2 }η′∈E1

exists from H2 and the fact that η′ |=N P2 for all η′ ∈ E1)

Let E = (⋃η′∈E1

Eη′

2 ), we have: ηp1;p2; E from Definition 4.4 and

∀η′′ ∈ E, η′′ |=N Q (from the way the union is built). 2

Appendix B.2. Completeness and weakest precondition

Completeness of Hoare logic is defined as follows. Given a network N =(V,M,EV , EM ,K), if the Hoare triple {P} p {Q} is satisfied in N (accordingto Definition Appendix A.2) then ` {P} p {Q} (using the inference rulesof Section 5 and after substituting the symbols K... by their value in N). Weprove the completeness by establishing that one can compute the weakestinvariants of all while loops and that the backward strategy gives a proof of{P} p {Q}.

The main difference with respect to the classical completeness proof isthat we navigate into a finite state space, so that we will not have to careabout the incompleteness of arithmetic or restrictions about weakest loopinvariants. In the following proposition, we see that one can compute theweakest invariant for each while occurrence in the trace specification. Only

33

practical reasons in order to facilitate proofs justify to ask the specifier toinclude loop invariants into trace specifications: Often, a slightly non minimalinvariant considerably simplifies the proof tree.

Proposition Appendix B.1. (Existence of the weakest loop invariant).Given a grn N = (V,M,EV , EM ,K), let us consider two assertions Q ande, and a trace specification p. There exists a weakest loop invariant I suchthat the Hoare triple {I} while e with I do p {Q} is correct.

The following proof is constructive and gives a way to compute I (see re-mark Appendix B.4).Proof:

1. In the first step of the proof, we build a set D as a countable union.

• Let q0 = {η ∈ S | η |=N Q∧¬e} be the set of all states that satisfyQ without entering the while loop.

• given qi, let qi+1 = {η ∈ S | η |=N e and ∃E ⊂ S, ηp; E and E ⊂

qi}. From Definition Appendix A.2, for each i, qi is the set of statesthat induce exactly i while loops and such that the resulting statessatisfy Q.

• Let Dn =⋃ni=0 qi. The sequence of Dn is increasing and because

S is finite, it is stationary. So D =⋃∞i=0 qi exists and can be

inductively computed.

2. In the second step of the proof, we show that the caracteristic formulaof D is a loop invariant.

• Because D is finite, there is a formula I such that η |=N I iffη ∈ D: I ≡

∨η∈D 1η where 1η ≡

∧v∈V v = η(v)

• I is a loop invariant because for each state η that satisfies I, thereis an integer i such that η ∈ qi.

– If i > 0, then η satisfies I ∧ e and by definition, there is a setE such that η

p; E and E ⊂ qi−1, consequently E satisfies I

because every state of qi−1 satisfies I.

– If i = 0, then η |=N ¬e, thus η 6|=N e ∧ I, which implies that{e ∧ I} p {I} is satisfied for η, according to Definition Ap-pendix A.2 and elementary truth tables.

34

3. In the last step of the proof, we show that each state of D satisfies anyminimal loop invariant.

• Let J be a minimal loop invariant. Assume that there is a stateη ∈ D that does not satisfy J . Then J ∨ 1η (where 1η is theformula characterizing the state η), is strictly weaker than J . Butit is also an invariant since after i iterations of the while loop fromη, one of the resulting sets of states E satisfies Q. This contradictsthe minimality of J .

• Consequently I is the weakest loop invariant. 2

Theorem Appendix B.2. (Completeness theorem on the genetically mod-ified Hoare logic). Given a grn N , a trace specification p and a post-condition Q, the backward strategy defined at the end of Section 2, with theinference rules of Section 5, computes after steps 1 and 2 the weakest precon-dition P0 such that {P0} p {Q} is satisfied. In other words, for any assertionP , if {P} p {Q} is satisfied, then P ⇒ P0 is satisfied (that is, the third stepof the backward strategy).

This theorem has an obvious corrolary.

Corollary Appendix B.3. Given a grn N , our modified Hoare logic iscomplete.

Proof of the corollary: if {P} p {Q} is satisfied, then, from the theoremabove, there is a proof tree that infers the Hoare triple if there is a prooftree for the property P ⇒ P0 (which is semantically satisfied because P0 isthe weakest precondition). First order logic being complete and the numberof possible substitutions being finite (the state space being finite), the prooftree for P ⇒ P0 exists. 2

Proof of the soundness theorem:Under the following two hypotheses

H1 the Hoare triple {P} p {Q} is satisfied, i.e., for all η satisfying P , there

exists E such that ηp; E and for all η′ ∈ E, η′ satisfies Q,

H2 for all while statements of p, the corresponding loop invariant I is theweakest one (Proposition Appendix B.1),

one has to prove the conclusion:

35

C P ⇒ P0 is satisfied, where P0 is the precondition computed from p andQ by the steps 1 and 2 of the backward strategy with the inferencerules of Section 5.

The proof is done by structural induction according to the backward strategyon p.

• If p is of the form v+, then the only set E such that ηv+; E is E =

{η[v ← v + 1]}. The hypothesis H1 becomes:

H1 for all η satisfying P , η′ = η[v ← v + 1] satisfies Q and η → η′ isa transition of S

and from the Incrementation rule, the conclusion becomes:

C P ⇒ (Φ+v ∧Q[v ← v + 1]) is satisfied.

So, H1 ⇒ C straightforwardly results from the definition of Φv+

(Notation 5.1) and we do not use H2 .

• If p is of the form p1; p2, then we firstly inherit the two structuralinduction hypotheses:

H3 for all assertions P ′ and Q′, if {P ′} p1 {Q′} is satisfied then P ′ ⇒P1 is satisfied, where P1 is the precondition computed from Q′ viathe backward strategy

H4 for all assertions P ′′ and Q′′, if {P ′′} p2 {Q′′} is satisfied thenP ′′ ⇒ P2 is satisfied, where P2 is the precondition computed fromQ′′ via the backward strategy

Moreover the hypothesis H1 becomes (Definition 4.4):

H1 for all η satisfying P , there exists a family of state sets F =

{Ee}e∈F such that ηp1; F and e

p2; Ee for all e ∈ F and for all

η′ ∈ E = (⋃e∈F Ee), η

′ satisfies Q

Lastly, from the Sequential composition rule, the conclusion becomes:

C P ⇒ P1 is satisfied, where P1 is the weakest precondition of{· · ·} p1 {P2}, P2 being the weakest precondition of {· · ·} p2 {Q}.

36

From H4 (with Q′′ = Q) it results that all the states e ∈ F of hypoth-

esis H1 satisfy P2. Consequently {P} p1 {P2} is satisfied. Thus, from

H3 (with Q′ = P2 and P ′ = P ) it comes P ⇒ P1, which proves theconclusion.

• If p is of the form while e with I do p′, then, by construction of thebackward strategy, applying the Iteration rule, we get P0 = I, and theconclusion results immediately from H2 .

• Similarly to the soundness proof, we do not develop here the other casesof the structural induction. They are either similar to already devel-oped cases (Decrementation rule) or trivial (Assert rule, Quantifierrules, and Assignment rule).

This ends the proof. 2

Remark Appendix B.4. Soundness and completeness being now established,one can extend Proposition Appendix B.1 by giving a purely symbolic compu-tation of the weakest loop invariant I of a while loop. Following the notationsof the proof of Proposition Appendix B.1:

• The set of states q0 is characterised by the formula Q0 ≡ ¬e ∧Q,

• In addition, assuming that the trace specification p terminates, the setof states qi+1 is inductively characterised by the weakest preconditionQi+1 obtained via the backward strategy of the proof of {Qi+1} p {Qi}(this is due to the soundness and completeness of our calculus).

• From this construction, we deduce that the first integer n such thatqn+1 ⊂ Dn (where Dn =

⋃ni=0 qi) is the first n such that Qn+1 ⇒∨n

i=0Qi. This implication is decidable because the set of possible sub-stitutions is finite.

Proposition Appendix B.1 implies that the integer n mentioned before exists.Consequently I =

∨ni=0Qi can be expressed in a purely symbolic way. And

more importantly, this can be done from the solely knowledge of the inter-action graph. The assertion I is then a constraint on states and parametersK..., what we used in Section 6.

37

Documents

A Genetically Modi ed Hoare Logic - unice.frbernot/BiomodelsBIS/2018-TCS_Hoare... · 2019. 1. 6. · ters. The application of temporal logic to biological regulatory networks was