An Algebra for Program Designs. Tony Hoare, Moscow, July 2011


Page 1: An Algebra for Program Designs

An Algebra for Program Designs

Tony Hoare, Moscow, July 2011

Page 2: An Algebra for Program Designs

With ideas from

• Ian Wehrman
• John Wickerson
• Stephan van Staden
• Peter O’Hearn
• Bernhard Moeller
• Georg Struth
• Rasmus Petersen
• …and others

Page 3: An Algebra for Program Designs

Summary

operational rules

denotational models

algebraic laws

deduction rules

Page 4: An Algebra for Program Designs

Part 1Algebra and Hoare logic

• Some familiar algebraic laws
• their application to program designs
• derivation of Hoare logic from them

Page 5: An Algebra for Program Designs

Subject matter: designs

• variables (p, q, r) stand for programs, designs, specifications,…

• they all describe what happens inside/around a computer that is executing a program.

• The program itself is the most precise.
• The specification is the most abstract.
• Designs come in between.

Page 6: An Algebra for Program Designs

Binary relation: p ⊑ q

• Everything described by p is also described by q , e.g.,
 – spec p implies spec q
 – prog p satisfies spec q
 – prog p more determinate than prog q

• stepwise development is
 – spec ⊒ design ⊒ program

• stepwise analysis is the reverse
 – program ⊑ design ⊑ spec

Page 7: An Algebra for Program Designs

p ⊑ q

p (left)               q (right)
below                  above
lesser                 greater
stronger               weaker
lower bound            upper bound
more precise           more abstract
more deterministic     more non-deterministic
included in (sets)     containing (sets)
antecedent (=>)        consequent (pred)

Page 8: An Algebra for Program Designs

⊑ is a partial order

• ⊑ transitive
 – p ⊑ r if p ⊑ q and q ⊑ r
 – needed for stepwise development/analysis

• ⊑ antisymmetric and reflexive
 – p = r iff p ⊑ r and r ⊑ p
 – needed for abstraction

Page 9: An Algebra for Program Designs

Binary operator: p ; q

• sequential composition of p and q
• an execution of p;q consists of
 – all events x from an execution of p
 – together with all events y from an execution of q

• strong sequence: x must precede y
• weak sequence: y must not precede x
• the algebraic laws will apply to both

Page 10: An Algebra for Program Designs

Hoare triple: {p} q {r}

• defined as p;q ⊑ r
 – starting in the final state of an execution of p, q ends in the final state of some execution of r
 – p and r may be arbitrary designs.

• example: {..x+1 ≤ n} x := x + 1 {..x ≤ n}
 – where ..b (finally b) describes all executions that end in a state satisfying a single-state predicate b .

Page 11: An Algebra for Program Designs

Partial correctness

• disregards unending executions
• ..b is re-interpreted as including them all
 – ‘if the execution terminates, it will end in a state satisfying b‘.
• definition of triple stays the same
• partial correctness logic is the same as total correctness logic.

Page 12: An Algebra for Program Designs

monotonicity

• Law (; is monotonic):
 – p;q ⊑ p’;q’ if p ⊑ p’ and q ⊑ q’

• justifies modular design/evolution
 – p’ and q’ may be developed independently

• Theorem (rule of consequence):
 – p’ ⊑ p & {p} q {r} & r ⊑ r’ implies {p’} q {r’}

• Law is also provable from the theorem
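A sketch of the derivation (my reconstruction, not on the slide), using only monotonicity of ; and transitivity of ⊑ : assume p’ ⊑ p, {p} q {r} (that is, p;q ⊑ r) and r ⊑ r’. Then

 p’;q ⊑ p;q   (; monotonic, p’ ⊑ p)
      ⊑ r     ({p} q {r})
      ⊑ r’    (r ⊑ r’)

so p’;q ⊑ r’, which is exactly {p’} q {r’}.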

Page 13: An Algebra for Program Designs

associativity

• Law (; is associative):
 – (p;q);q’ = p;(q;q’)

• Theorem (sequential composition):
 – {p} q {s} & {s} q’ {r} implies {p} q;q’ {r}

• half the law provable from theorem
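A sketch of the derivation (my reconstruction): assume p;q ⊑ s and s;q’ ⊑ r. Then

 p;(q;q’) = (p;q);q’   (associativity)
          ⊑ s;q’       (; monotonic)
          ⊑ r          (s;q’ ⊑ r)

which is {p} q;q’ {r}.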

Page 14: An Algebra for Program Designs

Unit ɛ (skip)

• a program that does nothing
• Law (ɛ is the unit of ;):
 – p;ɛ = p = ɛ;p

• Theorem (nullity)
 – {p} ɛ {p}

• a quarter of the law is provable from theorem

Page 15: An Algebra for Program Designs

concurrent composition: p | q

• execution of (p|q) consists of
 – all events x of an execution of p,
 – and all events y of an execution of q

• same laws apply to both:
 – interleaving: x precedes or follows y
 – true concurrency: x neither precedes nor follows y.

• Laws: | is associative, commutative and monotonic

Page 16: An Algebra for Program Designs

Separation Logic

• Law (locality):
 – (s|p) ; q ⊑ s | (p;q)  (left locality)
 – p ; (q|s) ⊑ (p;q) | s  (right locality)
 – a weak version of associativity
 – a weak version of distribution

• Theorem (frame rule):
 – {p} q {r} implies {p|s} q {r|s}
 – in Hoare logic, & replaces | , with side-condition that q does not make s false

• Left locality provable from the theorem!
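A sketch of the frame-rule derivation from left locality (my reconstruction): assume {p} q {r}, i.e. p;q ⊑ r. Then

 (p|s);q = (s|p);q   (| commutative)
         ⊑ s|(p;q)   (left locality)
         ⊑ s|r       (| monotonic)
         = r|s       (| commutative)

so {p|s} q {r|s}.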

Page 17: An Algebra for Program Designs

Concurrency law

• Law (; exchanges with |)
 – (p|q) ; (p’|q’) ⊑ (p;p’) | (q;q’)
 – a weak kind of mutual distribution

• Theorem (| compositional)
 – {p} q {r} & {p’} q’ {r’} implies {p|p’} q|q’ {r|r’}

• the law is provable from the theorem
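A sketch of the derivation of the theorem from the exchange law (my reconstruction): assume p;q ⊑ r and p’;q’ ⊑ r’. Then

 (p|p’) ; (q|q’) ⊑ (p;q) | (p’;q’)   (exchange law)
                 ⊑ r | r’            (| monotonic)

which is {p|p’} q|q’ {r|r’}.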

Page 18: An Algebra for Program Designs

Exchange law

[Diagram: a picture of (p|q) ; (p’|q’), with p, q, p’, q’ drawn as boxes.]

Page 19: An Algebra for Program Designs

(p|q) ; (p’|q’) ⊑ (p;p’) | (q;q’)

[Diagram: the same picture regrouped, showing that every execution of (p|q) ; (p’|q’) is also an execution of (p;p’) | (q;q’).]

Page 20: An Algebra for Program Designs

Regular language model

• p, q, r,… are sets of strings (languages).

• p ⊑ q is inclusion of languages
• p;q is (lifted) concatenation of strings
• p|q is (lifted) interleaving of strings
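A small executable sketch of this model (my own illustration, not from the slides; the function names are mine): designs are finite sets of strings, ⊑ is set inclusion, ; is lifted concatenation and | is lifted interleaving. The final assertion spot-checks the exchange law of the previous slides on a tiny example.

```python
from itertools import combinations

def seq(p, q):
    """p ; q : lifted concatenation of strings."""
    return {x + y for x in p for y in q}

def interleavings(x, y):
    """All interleavings of two strings x and y."""
    n, m = len(x), len(y)
    result = set()
    for pos in combinations(range(n + m), n):   # positions taken by x
        s, xi, yi = [], 0, 0
        for i in range(n + m):
            if i in pos:
                s.append(x[xi]); xi += 1
            else:
                s.append(y[yi]); yi += 1
        result.add(''.join(s))
    return result

def par(p, q):
    """p | q : lifted interleaving of languages."""
    return {s for x in p for y in q for s in interleavings(x, y)}

# Spot-check the exchange law (p|q) ; (p'|q') ⊑ (p;p') | (q;q')
p, q, p2, q2 = {'a'}, {'b'}, {'c'}, {'d'}
lhs = seq(par(p, q), par(p2, q2))
rhs = par(seq(p, p2), seq(q, q2))
assert lhs <= rhs                     # ⊑ is set inclusion
print(sorted(lhs), '⊑', sorted(rhs))
```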

Page 21: An Algebra for Program Designs

Left locality

• Theorem: (s|p) ; q ⊑ s | (p;q)
 – in lhs: s interleaves with just p , and all of q comes at the end.
 – in rhs: s interleaves with all of p;q
 – so lhs is a special case of rhs
• right locality is similar

Page 22: An Algebra for Program Designs

Exchange

• Theorem: (p|q) ; (p’|q’) ⊑ (p;p’) | (q;q’)
 – in lhs: all of p and q comes before all of p’ and q’ .
 – in rhs: p may interleave with q’ and p’ with q
 – the lhs is a special case of the rhs.

Page 23: An Algebra for Program Designs

Conclusion

• regular expressions satisfy all our laws for ⊑ , ; , and |

• and other operators introduced later

Page 24: An Algebra for Program Designs

Part 2. more operators and laws

• Complete lattices
• Iteration, recursion, fixed points
• Subroutines, abstraction

Page 25: An Algebra for Program Designs

Subject matter

• variables (p, q, r) stand for programs, designs, specifications,…

• they are all descriptions of what happens inside and around a computer that is executing a program.

• the differences between programs and specs are often defined from their syntax.

Page 26: An Algebra for Program Designs

Specification syntax includes

• disjunction (or) to express abstraction, or to keep options open
 – ‘it may be painted green or blue’

• conjunction (and) to combine requirements
 – it must be cheaper than x and faster than y

• negation (not) for safety and security
 – it must not explode

• implication to define contracts
 – if the user observes the protocol, so will the system

Page 27: An Algebra for Program Designs

Program syntax excludes

• disjunction
 – non-deterministic programs difficult to test

• conjunction
 – inefficient to find a computation satisfying both

• negation
 – incomputable

• implication
 – there is no point in executing it

Page 28: An Algebra for Program Designs

programs include

• sequential composition (;)
• concurrent composition (|)
• iteration
• recursion
• interfaces
• transactions
• assignments, inputs, outputs, jumps, …

• So let’s include these in our specifications/designs

Page 29: An Algebra for Program Designs

Bottom ⊥

• A specification that has no implementation
 – like the false predicate

• A program that has no execution
 – e.g., because of some syntactic error

• Define ⊥ as the least solution of _ ⊑ q
 – r ⊑ q implies ⊥ ⊑ r

• Law (⊥ is the zero of ;):
 – ⊥ ; p = ⊥ = p ; ⊥

• Theorem:
 – {p} ⊥ {q}

Page 30: An Algebra for Program Designs

Top ⊤

• a program with a run-time error
 – for which the programmer is responsible
 – e.g., subscript error, division by zero, divergence, …

• defined as the least solution of q ⊑ _
• Law: it is a zero of ;
 – ⊤ ; p = ⊤ = p ; ⊤  if p ≠ ⊥

• Theorem: none

Page 31: An Algebra for Program Designs

Non-determinism (or): p ⊔ q

• describes all executions that either satisfy p or satisfy q .

• The choice is not (yet) determined.
• It may be determined later
 – in development of the design
 – or in writing the program
 – or by the compiler
 – or even at run time

Page 32: An Algebra for Program Designs

lub (join): ⊔

• Define p⊔q as least solution of p ⊑ _ & q ⊑ _

• Theorem
 – p ⊑ r & q ⊑ r iff p⊔q ⊑ r

• Theorem
 – ⊔ is associative, commutative, monotonic, idempotent and increasing
 – it has unit ⊥ and zero ⊤

Page 33: An Algebra for Program Designs

glb (meet): ⊓

• Define p⊓q as greatest solution of _ ⊑ p & _ ⊑ q

Page 34: An Algebra for Program Designs

Distribution

• Law (; distributes through ⊔)
 – p ; (q⊔q’) = p;q ⊔ p;q’
 – (q⊔q’) ; p = q;p ⊔ q’;p

• Theorem (non-determinism)
 – {p} q {r} & {p} q’ {r} implies {p} q⊔q’ {r}
 – i.e., to prove something of q⊔q’ prove the same thing of both q and q’

• quarter of law provable from theorem
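A sketch of the derivation of the theorem (my reconstruction): assume p;q ⊑ r and p;q’ ⊑ r. Then

 p;(q⊔q’) = p;q ⊔ p;q’   (distribution)
          ⊑ r            (⊔ is the least upper bound of p;q and p;q’)

which is {p} q⊔q’ {r}.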

Page 35: An Algebra for Program Designs

Conditional: p if b else p’

• Define p ⊰b⊱ p’ as
 – (b.. ⊓ p) ⊔ (not(b).. ⊓ p’)
 – where b.. describes all executions that begin in a state satisfying b .

• Theorem. p ⊰b⊱ p’ is associative, idempotent, distributive, and
 – p ⊰b⊱ q = q ⊰not(b)⊱ p  (symmetry)
 – (p ⊰b⊱ p’) ⊰c⊱ (q ⊰b⊱ q’) = (p ⊰c⊱ q) ⊰b⊱ (p’ ⊰c⊱ q’)  (exchange)

Page 36: An Algebra for Program Designs

Transaction

• Defined as (p ⊓ ..b) ⊔ (q ⊓ ..c)
 – where ..b describes all executions that end satisfying single-state predicate b .

• Implementation:
 – execute p first
 – test the condition b afterwards
 – terminate if b is true
 – backtrack on failure of b
 – and try an alternative q with condition c.

Page 37: An Algebra for Program Designs

Transaction (realistic)

• Let r describe the successful executions of a transaction t .
 – r is tested when execution of t is complete.
 – any successful execution of t is committed
 – a single failed execution of t is undone,
 – and q is done instead.

• Define: (t if r else q) = t             if t ⊑ r
                          = (t ⊓ r) ⊔ q   otherwise

Page 38: An Algebra for Program Designs

Complete Lattice

• Let S be an arbitrary set of designs

• Define ⊔S as its least upper bound
 – ∀s∊S . s ⊑ ⊔S
 – (∀s∊S . s ⊑ r) ⇒ ⊔S ⊑ r  (for all r ∊ PR)

• everything is an upper bound of { } , so ⊔{ } = ⊥
 – a case where ⊔S ∉ S

Page 39: An Algebra for Program Designs

similarly

• ⊓S is greatest lower bound of S

• ⊓ { } = ⊤

Page 40: An Algebra for Program Designs

Iteration (Kleene *)

• q* is the least solution of (ɛ ⊔ q;_) ⊑ _

• q* =def ⊓{s | ɛ ⊔ q;s ⊑ s}
 – ɛ ⊔ q;q* ⊑ q*
 – ɛ ⊔ q;q’ ⊑ q’ implies q* ⊑ q’
 – q* = ⊔{qⁿ | n ∊ Nat}

• Theorem (invariance):
 – {p} q* {p} if {p} q {p}
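A sketch of the invariance proof (my reconstruction, using q* = ⊔{qⁿ | n ∊ Nat} and assuming ; distributes through this join, as it does in the set models): assume p;q ⊑ p. By induction on n,

 p;q⁰ = p;ɛ = p ⊑ p
 p;qⁿ⁺¹ = (p;qⁿ);q ⊑ p;q ⊑ p   (associativity, monotonicity, induction hypothesis)

so p;q* = ⊔{p;qⁿ | n ∊ Nat} ⊑ p, which is {p} q* {p}.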

Page 41: An Algebra for Program Designs

Infinite replication

• !p is the greatest solution of p|_ ⊑ _
 – as in the pi calculus

• all executions of !p are infinite
 – or possibly empty

Page 42: An Algebra for Program Designs

Recursion

• Let F(_) be a monotonic function between programs.

• Theorem (Knaster-Tarski): all functions defined by monotonic operators are monotonic.

• μF is weakest solution of F(_) ⊑ _
• νF is strongest solution of _ ⊑ F(_)
• Theorem (Knaster-Tarski): These solutions exist.

Page 43: An Algebra for Program Designs

Interfaces

• Let q be the body of a subroutine
• Let s be its specification
• Let (q => s) assert that q meets s
• Programmer error (⊤) if incorrect
• Caller of subroutine may assume s
• Implementer may execute q

Page 44: An Algebra for Program Designs

Subroutine: q => s

• Define (q=>s) as least solution of q ⊑ _ & _ ⊑ s

• Theorem: (q=>s) = q   if q ⊑ s
                  = ⊤   otherwise
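One way to read this definition (my reconstruction): take the least solution to be the meet of the solution set, (q=>s) = ⊓{r | q ⊑ r & r ⊑ s}. If q ⊑ s, then q itself is a solution and every solution r satisfies q ⊑ r, so the meet is q. If q ⋢ s, the set is empty (q ⊑ r and r ⊑ s would give q ⊑ s), and ⊓{ } = ⊤ by the complete-lattice slide.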

Page 45: An Algebra for Program Designs

Basic statements/assertions

• skip ɛ
• bottom ⊥
• top ⊤
• assignment: x := e(x)
• assertion: assert b
• assumption: assume b
• finally ..b
• initially b..

Page 46: An Algebra for Program Designs

more

• assign thru pointer: [a] := e
• output: c!e
• input: c?x
• points to: a |-> e
 – a |-> _ =def exists v . a |-> v

• throw
• catch

Page 47: An Algebra for Program Designs

Laws (examples)

• assume b =def b.. ⊓ ɛ
• assert b =def (b.. ⊓ ɛ) ⊔ not(b)..
• x := e(x) ; x := f(x) = x := f(e(x))
 – in languages without interleaving

Page 48: An Algebra for Program Designs

more

• p |-> _ ; [p] := e ⊑ p |-> e
 – in separation logic

• c!e | c?x = x := e
 – in CSP but not in CCS or Pi

• throw x ; (catch x; p) = p

Page 49: An Algebra for Program Designs

Part 3Unifying Semantic Theories

• Six familiar semantic definition styles.
• Their derivation from the algebra
• and vice versa.

Page 50: An Algebra for Program Designs

operational rules

algebraic laws

deduction rules

Page 51: An Algebra for Program Designs

Hoare Triple

• a method for program verification
• {p} q {r} ≝ p;q ⊑ r
 – one way of achieving r is by first doing p and then doing q

• Theorem:
 – {p} q {s} & {s} q’ {r} implies {p} q;q’ {r}
 – proved by associativity

Page 52: An Algebra for Program Designs

Plotkin reduction

• a method for program execution
• <p , q> -> r =def p;q ⊒ r
 – if p describes the state before execution of q then r describes a possible final state, e.g.,
 – <..(x/2 = 18) , x := x+1> -> ..(x = 37)

• Theorem:
 – <p, q> -> s & <s, q’> -> r implies <p, q;q’> -> r

Page 53: An Algebra for Program Designs

Milner transition

• method of execution of concurrent processes

• p –q-> r ≝ p ⊒ q;r
 – one of the ways of executing p is by first executing q and then executing r .
 – e.g., (x := x+3) –(x := x+1)-> (x := x+2)

• Theorem:
 – p –q-> s & s –q’-> r => p –(q;q’)-> r  (big-step rule for ;)

Page 54: An Algebra for Program Designs

test generation

• method of test case generation
• p [q] r =def p ⊑ q;r
 – if r describes erroneous states resulting from execution of q , then p describes some initial states in which a test-run of q will certainly reveal the error.

• Theorem:
 – p [q] s & s [q’] r implies p [q;q’] r

Page 55: An Algebra for Program Designs

Summary

• {p} q {r} =def p;q ⊑ r
 – Hoare triple

• <p,q> -> r =def p;q ⊒ r
 – Plotkin reduction

• p –q-> r =def p ⊒ q;r
 – Milner transition

• p [q] r =def p ⊑ q;r
 – test generation

Page 56: An Algebra for Program Designs

Sequential composition

• Law: ; is associative
• Theorem: the sequence rule is valid for all four triples.

• the Law is provable from the conjunction of all of them

Page 57: An Algebra for Program Designs

Skip

• Law: p ; ɛ = p = ɛ ; p

• Theorems:  {p} ɛ {p}    p [ɛ] p
             p −ɛ→ p      <p, ɛ> –> p

• Law follows from conjunction of all four theorems

Page 58: An Algebra for Program Designs

Left distribution ; through ⊔

• Law: p;(q ⊔ q’) = p;q ⊔ p;q’
• Theorems:
 – {p} (q⊔q’) {r} if {p} q {r} and {p} q’ {r}
 – <p, q⊔q’> -> r if <p,q> -> r or <p, q’> -> r
 – p [q⊔q’] r if p [q] r and p [q’] r
 – p –(q⊔q’)-> r if p –q-> r or p –q’-> r

• law provable from consecutive pairs of theorems

Page 59: An Algebra for Program Designs

locality and frame

• left locality: (s|p) ; q ⊑ s | (p;q)
• Hoare frame: {p} q {r} ⇒ {s|p} q {s|r}

• right locality: p ; (q|s) ⊑ (p;q) | s
• Milner frame: p –q-> r ⇒ (p|s) –q-> (r|s)

• Full locality requires both frame rules
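A sketch of the Milner frame derivation from right locality (my reconstruction): p –q-> r means q;r ⊑ p. Then

 q;(r|s) ⊑ (q;r)|s   (right locality)
         ⊑ p|s       (| monotonic)

so (p|s) –q-> (r|s). The Hoare frame follows from left locality in the same way, as sketched in Part 1.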

Page 60: An Algebra for Program Designs

Separation logic

• Exchange law:
 – (p | p’) ; (q | q’) ⊑ (p;q) | (p’;q’)

• Theorems
 – {p} q {r} & {p’} q’ {r’} ⇒ {p|p’} q|q’ {r|r’}
 – p –q-> r & p’ –q’-> r’ => p|p’ –q|q’-> r|r’

• the law is provable from either theorem
• For the other two triples, the rules are equivalent to the converse exchange law.

Page 61: An Algebra for Program Designs

usual restrictions on triples

• in {p} q {r} , p and r are of form ..b, ..c

• in p [q] r , p and r are of form b.., c..
• in <p,q> -> r , p and r are of form ..b, ..c
• in p –q-> r , p and r are programs
• in p –q-> r (small step), q is atomic
• (in all cases, q is a program)

• all laws are valid without these restrictions

Page 62: An Algebra for Program Designs

Weakest precondition (-;)
Specification statement (;-)

• (q -; r) =def the weakest solution of ( _ ;q ⊑ r )
 – the same as Dijkstra’s wp(q, r)
 – for backward development of programs

• (p ;- r) =def the weakest solution of ( p; _ ⊑ r )
 – Back/Morgan’s specification statement
 – same as p⇝r in RGSep
 – for stepwise refinement of designs

Page 63: An Algebra for Program Designs

Weakest precondition (-;)

• Law (-; adjoint to ;)
 – p ⊑ q -; r iff p;q ⊑ r  (Galois)

• Theorem
 – (q -; r) ; q ⊑ r
 – p ⊑ q -; (p;q)

• Law provable from the theorems
 – cf. (r div q) × q ≤ r
 – cf. r ≤ (r × q) div q

Page 64: An Algebra for Program Designs

Theorems

• q’ ⊑ q & r ⊑ r’ => q -; r ⊑ q’ -; r’
• (q;q’) -; r ⊑ q -; (q’ -; r)
• q -; r ⊑ (q;s) -; (r;s)

Page 65: An Algebra for Program Designs

Law of consequence

Page 66: An Algebra for Program Designs

Frame laws

Page 67: An Algebra for Program Designs

Part 4Denotational Models

A model is a mathematical structure that satisfies the axioms of an algebra, and realistically describes a useful application, for example, program execution.

Page 68: An Algebra for Program Designs

Models

denotational models

algebraic laws

Page 69: An Algebra for Program Designs

Some Standard Models:

• propositional calculus (Boole)
 – ( {0,1}, ≤, ∨, ∧, not(_) )

• predicate logic (Frege, Heyting)
 – ( ℙS, ⊢, ∪, ∩, not(_), =>, ∃, ∀ )

• regular expressions (Kleene):
 – ( ℙA*, ⊆, ∪, ; , ɛ , {<a>} , | )

• binary relations (Tarski):
 – ( ℙ(S×S), ⊆, ∪, ∩, ; , Id , not(_), converse(_) )

• algebra of designs is a superset of these

Page 70: An Algebra for Program Designs

Model: (EV, EX, PR)

• EV is an underlying set of events (x, y, ..) that can occur in any execution of any program

• EX are executions (e, f,…), modelled as sets of events

• PR are designs (p, q, r,…), modelled as sets of executions.

Page 71: An Algebra for Program Designs

Set concepts

• ⊑ is ⊆ (set inclusion)
• ⊔ is ∪ (set union)
• ⊓ is ∩ (intersection of sets)
• ⊥ is { } (the empty set)
• ⊤ is EV (the universal set)

Page 72: An Algebra for Program Designs

With (|)

• p | q = { e ∪ f | e ∊ p & f ∊ q & e ∩ f = { } }

– each execution of p|q is the disjoint union of an execution of p and an execution of q

– p|q contains all such disjoint unions

• | generalises many binary operators

Page 73: An Algebra for Program Designs

Introducing time

• TIM is a set of times for events
 – partially ordered by ≤

• Let when : EV -> TIM
 – map each event to its time of occurrence.

Page 74: An Algebra for Program Designs

Definition of <

• x < y =def not( when(y) ≤ when(x) )
 – x < y & y < x means that x and y can occur concurrently.

• e < f =def ∀x,y . x∊e & y∊f => x < y
 – no event of f occurs before an event of e

• If ≤ is a total order,
 – there is no concurrency,
 – executions are time-ordered strings

Page 75: An Algebra for Program Designs

Sequential composition (then)

• p ; q = { e ∪ f | e∊p & f∊q & e < f }

• special case: if ≤ is a total order,
 – e < f means that e ∪ f is the concatenation e⋅f of strings
 – ; is the composition of regular expressions
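A small executable sketch (my own encoding, not from the slides; all names are mine) of the two definitions above: events carry an integer time stamp, executions are frozensets of events, designs are sets of executions, and the final assertion spot-checks left locality on a tiny example.

```python
from itertools import product

def before(e, f, when):
    """e < f : no event of f occurs before an event of e."""
    return all(not (when[y] <= when[x]) for x in e for y in f)

def seq(p, q, when):
    """p ; q = { e ∪ f | e ∊ p, f ∊ q, e < f }"""
    return {e | f for e, f in product(p, q) if before(e, f, when)}

def par(p, q):
    """p | q = { e ∪ f | e ∊ p, f ∊ q, e ∩ f = {} }"""
    return {e | f for e, f in product(p, q) if not (e & f)}

# a tiny example: events a, b, c, d occurring at times 1..4
when = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
s = {frozenset('a')}
p = {frozenset('b')}
q = {frozenset('c'), frozenset('d')}

lhs = seq(par(s, p), q, when)      # (s|p) ; q
rhs = par(s, seq(p, q, when))      # s | (p;q)
assert lhs <= rhs                  # left locality: lhs ⊑ rhs
print(len(lhs), '⊑', len(rhs))
```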

Page 76: An Algebra for Program Designs

Theorems

• These definitions of ; and | satisfy the locality and exchange laws.

• (s|p) ; q ⊑ s | (p;q)
• (p|q) ; (p’|q’) ⊑ (p;p’) | (q;q’)
 – Proof: the lhs describes fewer interleavings than the rhs.

• regular expressions satisfy all our laws for ⊑ , ⊔ , ; , and |

Page 77: An Algebra for Program Designs

Disjoint concurrency (||)

• p||q =def (p ; q) ⊓ (q ; p)
 – all events of p concurrent with all of q .
 – no interaction is possible between them.

• Theorems:  (p||q) ; r ⊒ p || (q ; r)
             (p||q) ; (p’||q’) ⊒ (p;p’) || (q;q’)
 – Proof: the rhs has more disjointness constraints than the lhs .
 – the wrong way round!

• So make the programmer responsible for disjointness, using interfaces!

Page 78: An Algebra for Program Designs

Interfaces

• Let q be the body of a subroutine
• Let s be its specification
• Let (q => s) assert that q is correct
• Caller may assume s
• Implementer may execute q

Page 79: An Algebra for Program Designs

Solution

• p*q =def (p|q => p||q)
        = p|q   if p|q ⊑ p||q
        = ⊤     otherwise
 – programmer is responsible for absence of interaction between p and q .

• Theorem: ; and * satisfy locality and exchange.
 – Proof: in cases where lhs ≠ rhs, rhs = ⊤

Page 80: An Algebra for Program Designs

Problem

• ; is almost useless in the presence of arbitrary interleaving (interference).

• It is hard to prove disjointness of p||q
• We need a more complex model
 – which constrains the places at which a program may make changes.

Page 81: An Algebra for Program Designs

Separation

• PL is the set of places at which an event can occur

• each place is ‘owned’ by one thread,
 – no other thread can act there.

• Let where : EV -> PL map each event to its place of occurrence.

• where(e) =def { where(x) | x ∊ e }

Page 82: An Algebra for Program Designs

Separation principle

• events at different places are concurrent

• events at the same place are totally ordered in time

• ∀x,y ∊ EV . where(x) = where(y) iff x≤y or y≤x

Page 83: An Algebra for Program Designs

Picture

[Diagram: events plotted against space and time.]

Page 84: An Algebra for Program Designs

Theorem

• p || q = { e ∪ f | e ∊ p & f ∊ q & where(e) ∩ where(f) = { } }

• proved from separation principle

Page 85: An Algebra for Program Designs

Convexity Principle

• Each execution contains every event that occurs between any of its events.

• ∀e ∊ EX, y ∊ EV . ∀x, z ∊ e . when(x) ≤ when(y) ≤ when(z) => y ∊ e
 – no event from elsewhere can interfere between any two events of an execution

Page 86: An Algebra for Program Designs

A convex execution of p;q

[Diagram: a convex execution of p;q plotted against space and time.]

Page 87: An Algebra for Program Designs

A non-convex ‘execution’ of p;q

[Diagram: a non-convex ‘execution’ of p;q plotted against space and time.]

Page 88: An Algebra for Program Designs

Conclusion: in Praise of Algebra

• Reusable
• Modular
• Incremental
• Unifying
• Discriminative
• Computational
• Comprehensible
• Abstract

• Beautiful!

Page 89: An Algebra for Program Designs

Algebra likes pairs

• Algebra chooses as primitives
 – operators with two operands: + , …
 – predicates with two places: = , …
 – laws with two operators: & v , + …
 – algebras with two components: rings

Page 90: An Algebra for Program Designs

Tuples

• Tuples are defined in terms of pairs.
 – Hoare triples
 – Plotkin triples
 – Jones quintuples
 – seventeentuples …

Page 91: An Algebra for Program Designs

Semantic Links

[Diagram: algebra linked to deductions, transitions, and denotations.]

Page 92: An Algebra for Program Designs

Increments


Page 93: An Algebra for Program Designs

Filling the gaps
