An integrated framework for formal development of open distributed systems


Issa Traore (a,*), Demissie Aredo (b), Hong Ye (a)

(a) Department of Electrical and Computer Engineering, University of Victoria, Victoria, BC, Canada V8W 3P6

(b) Norwegian Computing Center, P.O. Box 114 Blindern, N-0314 Oslo, Norway

Abstract

This paper contributes to the discussion on issues related to the formal development of open distributed systems (ODSs). Deficiencies of

traditional formal notations in this setting are highlighted. We argue that there is no single formalism exhibiting all the features required to

capture properties of ODSs. As a solution, we propose an integrated development framework that involves two notations: the Unified

Modeling Language and the Prototype Verification System. We discuss the motivation for the choice of these notations, provide an overview

of a CASE tool we have developed to support the proposed framework, and present a case study to demonstrate the usability of our approach.

© 2003 Elsevier B.V. All rights reserved.

Keywords: Formal methods; Open distributed systems; Unified Modeling Language; Prototype Verification System; Multi-formalism; Object-orientated

programming

1. Introduction

Motivated by the need to model the dynamic features of object-oriented programming languages and openness in distributed applications, the study of open and dynamically extendable systems has become a very popular research area. Since the late 1980s, much research within theoretical computer science has been directed towards this kind of system. The emphasis has mainly been on semantic issues, in particular on how such systems should be represented faithfully and fully abstracted. The emphasis in our work is not on the semantics of systems, but rather on formal system development.

On the one hand, most specification techniques supporting the development of open distributed systems (ODSs), e.g. the Unified Modeling Language (UML) [6], lack the formal semantics and rigorous reasoning facilities necessary for the formal development of software systems. On the other hand, existing formal development methods suffer from limitations that constrain their application to large-scale projects in industrial settings; their esoteric nature, in particular, is a major obstacle. Moreover, we are not aware of any conventional formal development method that is capable of fully handling the flexibility, extendability and dynamic features that characterize contemporary distributed systems. In RMODP [4], formal description techniques such as LOTOS, Z, SDL and Estelle are proposed for the specification of systems from various viewpoints. Yet, as pointed out in Ref. [1], these languages are only partly satisfactory.

Taking the above remarks into account, the challenge is to build a development framework and a supporting CASE tool that exhibit the following capabilities:

– can be grasped and used in an industrial context;

– support the description of major aspects exhibited by ODSs, such as openness and dynamic reconfigurability;

– support formal system specifications that are amenable to rigorous reasoning;

– have strong and efficient tool support.

In this context, based on the evaluation of several existing methods and CASE tools, we propose a multi-formalism approach in which we integrate existing technologies. More specifically, we propose a formal development framework and a supporting tool that are based on the UML for specification and refinement, and on the Prototype Verification System (PVS) [7] for semantic foundation and rigorous reasoning.

0950-5849/$ - see front matter © 2003 Elsevier B.V. All rights reserved.

doi:10.1016/j.infsof.2003.09.012

Information and Software Technology 46 (2004) 281–286

www.elsevier.com/locate/infsof

An earlier version appeared in the Proceedings of the ACM Symposium on Applied Computing (SAC03), March 9–12, 2003, Melbourne, FL, USA.

* Corresponding author. Tel.: +1-250-721-8697; fax: +1-250-721-6052.

E-mail address: [email protected] (I. Traore).

The rest of the paper is organized as follows. In Section 2, we give a brief overview of UML and discuss the rationale behind our choice of notations. In Section 3, we briefly discuss our formalization approach. Then, in Section 4, we present a case study of a network reconfiguration protocol. Finally, in Section 5, we make some concluding remarks.

2. Modeling open distributed systems in UML

The choice of the UML notation was dictated by the fact that it is built on the object-oriented paradigm and provides several capabilities, such as extension mechanisms (e.g. stereotyping) and dynamic and multiple classification, which are useful for the description of open distributed systems (ODSs). In addition, UML provides an underlying methodology for specification and refinement and a graphical notation that contributes to communicability and user-friendliness. Very importantly, UML is also an international standard for object-oriented modeling.

In spite of the benefits it provides, UML has limitations in the context of the formal development of ODSs. The graphical UML constructs are not precise enough to achieve complete and formal descriptions of ODSs. For instance, Ref. [2] reports several gaps in the static semantic model of the UML, especially concerning the definitions of the concepts of aggregation, inheritance, constraints on inheritance hierarchies and abstract operation descriptions. In order to fill this gap, the UML notation needs to be extended to achieve two main objectives.

(i) To improve description of additional constraints on

modeling elements, e.g. invariants on classes and

types.

(ii) To provide formal semantics for different constructs

involved.

Currently, the first issue is generally addressed by using natural language, which results in ambiguities and misinterpretations. An alternative is to use the Object Constraint Language (OCL) [8], an assertional language that is used to specify the well-formedness of the modeling abstractions provided by UML. Unfortunately, the expressiveness of OCL is relatively limited with respect to the dynamic aspects of systems and, as pointed out in Ref. [2], the semantics of OCL is not mathematically defined. Hence, in order to achieve the objectives mentioned earlier, we decided to use PVS as the underlying semantic foundation for our development framework.

3. Formalization of graphical OO models

Several works have attempted to provide a mathematical basis for the concepts underlying object-oriented graphical models using different approaches [2]. Some of the approaches consist of adapting or extending a novel or existing formal description technique with object-oriented concepts. Others derive a formal specification from a semi-formal (or informal) model built with existing object-oriented notations such as UML. The main problem with these approaches is that users have to deal with a certain amount of formal artifacts and, as we have already argued, this can be a barrier to their practical usability in industrial settings.

A more workable approach, which is adopted in our development framework, integrates semi-formal modeling techniques with formal methods by assigning formal semantics to the graphical modeling constructs of an existing notation. In this case, the formal machinery is hidden behind the graphical notation: users deal only with the graphical models they develop, while the formal artifacts are processed automatically at the back-end.

UML consists of nine standard diagrams; our formalization work has so far focused on only three of them, namely class, sequence, and statechart diagrams. In Section 3.1, we present a brief sketch of the formal semantic definition we proposed for UML statecharts.

3.1. Formal semantics of UML statecharts

The formalization scheme we adopted for UML statecharts defines their formal semantics as a transition system given by the triple (init, V, next), where init is an initialization predicate that describes the initial global states, V defines the global states in which the machine may be at a given time, and next is a global transition relation that describes the execution sequences of the underlying state machine.
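To make the triple (init, V, next) concrete, the following Python sketch enumerates the states reachable from the initial states of a toy transition system. It is only an illustration under stated assumptions: the function name `reachable`, its parameters, and the counter example are hypothetical and are not part of the PVS model described in this paper.

```python
from typing import Callable, Hashable, Iterable, Set


def reachable(universe: Iterable[Hashable],
              init: Callable[[Hashable], bool],
              next_rel: Callable[[Hashable, Hashable], bool]) -> Set[Hashable]:
    """Return every state reachable from an init state via next_rel."""
    states = list(universe)
    seen = {s for s in states if init(s)}   # states satisfying init
    frontier = list(seen)
    while frontier:
        s = frontier.pop()
        for t in states:
            # next_rel(s, t) plays the role of the global relation next
            if t not in seen and next_rel(s, t):
                seen.add(t)
                frontier.append(t)
    return seen


# Toy instance (assumption): V is a counter in 0..5, init holds at 0,
# and next relates v to v + 2.
V = range(6)
result = reachable(V, lambda v: v == 0, lambda v, v1: v1 == v + 2)
print(sorted(result))
```

Representing init and next as predicates, rather than as an explicit start state and a successor function, mirrors the relational style of the formalization and allows several initial states and non-deterministic steps.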

A statechart diagram consists of a collection of state nodes, also called state vertices. The state nodes are related by transitions that are triggered by events and may result in the execution of a series of actions. A transition is characterized by a source state, a target state, a triggering event, a guard condition, and an associated action, which is executed when the transition is fired. Hence, we define the abstract syntax of a transition as a PVS record type whose fields capture these elements.

Transition: TYPE+ = [# source: Vertex, trigger: Event,
  guard: Condition, effect: Action, target: Vertex #]

The main step in the formalization approach adopted in our work consists of defining a set of elementary predicates that describe properties of a system state or a system operation. We represent the concrete state V as a record type whose fields correspond to the concrete state variables. We define three categories of predicates associated, respectively, with the notions of state vertex, guard condition, and action. The predicate associated with a state corresponds to a condition that must hold for the state to be activated.


The predicate associated with an action corresponds to a condition that holds after the execution of the action; it can be regarded as the postcondition of the action. The state and guard conditions are functions of the current values of the state variables, whereas the postcondition of an action is a function of both the current and the future values of the state variables.

VC: TYPE = [# current: V, next: V #]
vc: VAR VC; v: VAR V
% Predicates for states, conditions, and actions
pred: [Vertex -> PRED[V]]
pred: [Condition -> PRED[V]]
pred: [Action -> PRED[VC]]
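The distinction between the three categories of predicates can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the state has a single hypothetical variable `count`, and the vertex, guard, and action names (`Idle`, `Busy`, `below_limit`, `increment`) do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class V:
    """Hypothetical concrete state with a single state variable."""
    count: int


@dataclass(frozen=True)
class VC:
    """Pair of current and next state, mirroring the PVS record VC."""
    current: V
    next: V


# Predicates on vertices and guards read only the current state.
vertex_pred: Dict[str, Callable[[V], bool]] = {
    "Idle": lambda v: v.count == 0,
    "Busy": lambda v: v.count > 0,
}
guard_pred: Dict[str, Callable[[V], bool]] = {
    "below_limit": lambda v: v.count < 3,
}
# Action predicates are postconditions relating current and next state.
action_pred: Dict[str, Callable[[VC], bool]] = {
    "increment": lambda vc: vc.next.count == vc.current.count + 1,
}

v0, v1 = V(0), V(1)
print(vertex_pred["Idle"](v0),
      guard_pred["below_limit"](v0),
      action_pred["increment"](VC(v0, v1)))
```

Note that the action is described by what must hold afterwards, not by how the new state is computed, which is what makes it usable as a postcondition in proofs.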

In a statechart diagram, more than one state can be active at once. If a simple state is active, then all the composite states that contain it, either directly or transitively, are also active. The set of all the states that are active simultaneously defines what is called a state configuration. We define the initial state configuration initConf of a statechart as the set of all the default states involved in the diagram.

Configuration: TYPE+ = finite_set[Vertex]
init: PRED[V] = pred(initConf)

A transition is enabled if the event generated matches its trigger, its guard condition is true and its source state is active. An enabled transition may be eligible for firing. Firing a transition activates its target state and leads to the execution of its action. Below we define the predicates enabled and fired to describe, respectively, the enabling and firing conditions of a transition. More than one transition may be enabled within a state machine, resulting in a conflict. The set of transitions that will actually be fired in the whole state machine is a maximal set of enabled transitions that have the highest priorities and are not mutually conflicting.

e: VAR Event; tr, tr1, tr2: VAR Transition
a: VAR set[Transition]; v1, v2: VAR V
enabled(e,tr,v): bool = pred(source(tr))(v) AND
  (trigger(tr) = e) AND pred(guard(tr))(v)
fired(tr,v,v1): bool = pred(target(tr))(v1) AND
  pred(effect(tr))(vc) WHERE vc = (# current := v, next := v1 #)
maxEnabled(a,v,e): bool =
  subset?(a, transitions(sm)) AND
  FORALL (tr: (a)): enabled(e,tr,v) AND
    (FORALL (tr1: (a)): NOT conflict(tr,tr1)) AND
    (FORALL (tr2 | enabled(e,tr2,v) AND
      NOT member(tr2,a)): hasPriority(tr,tr2) OR
      samePriority(tr,tr2))
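The selection of the transitions to fire can be mimicked in Python as a direct transcription of the predicates enabled and maxEnabled. This is a sketch under several assumptions that are not taken from the paper: the state is reduced to the set of active vertex names, priorities are plain integers, and two distinct transitions are taken to conflict when they leave the same source vertex.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Set

State = FrozenSet[str]  # assumption: a state is the set of active vertex names


@dataclass(frozen=True)
class Transition:
    source: str
    trigger: str
    guard: Callable[[State], bool]
    target: str
    priority: int = 0


def enabled(e: str, tr: Transition, v: State) -> bool:
    """Source active, trigger matches the event, and guard holds."""
    return tr.source in v and tr.trigger == e and tr.guard(v)


def conflict(t1: Transition, t2: Transition) -> bool:
    """Assumed conflict rule: distinct transitions leaving the same source."""
    return t1 != t2 and t1.source == t2.source


def max_enabled(a: Set[Transition], all_trs: Set[Transition],
                v: State, e: str) -> bool:
    """Transcription of maxEnabled: enabled, conflict-free, top priority."""
    outside = [t for t in all_trs if enabled(e, t, v) and t not in a]
    return a <= all_trs and all(
        enabled(e, tr, v)
        and all(not conflict(tr, tr1) for tr1 in a)
        and all(tr.priority >= tr2.priority for tr2 in outside)
        for tr in a
    )


always = lambda v: True
t_hi = Transition("A", "go", always, "B", priority=2)
t_lo = Transition("A", "go", always, "C", priority=1)
t_other = Transition("X", "go", always, "Y", priority=1)
trs = {t_hi, t_lo, t_other}
v = frozenset({"A", "X"})

print(max_enabled({t_hi, t_other}, trs, v, "go"))
print(max_enabled({t_lo, t_other}, trs, v, "go"))
print(max_enabled({t_hi, t_lo}, trs, v, "go"))
```

In this example only the first candidate set is accepted: the second leaves out a higher-priority enabled transition, and the third contains two conflicting transitions.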

The semantics of UML statecharts is based on the run-to-completion assumption, meaning that events are dispatched and processed one at a time. At the beginning of a run-to-completion step, a statechart is in a stable state configuration, with all actions completed. The same conditions apply at the end of the step. Before starting a run-to-completion step, a maximal set of enabled transitions is chosen non-deterministically and then fired. We define below a function called eprocess that describes event processing. Event processing consists of selecting and firing a maximal set of enabled transitions. In the informal semantics of UML statecharts, there are no assumptions on the order of event dequeuing; in this work we adopt a simple priority scheme based on the first come, first served principle. We also define the global transition relation next based on the function eprocess.

c1, c2: VAR Configuration; st: VAR set[Transition]
eprocess(e,v,v1): bool =
  EXISTS st: subset?(st, transitions(sm)) AND
    maxEnabled(st,v,e) =>
    (FORALL (tr: (st)): fired(tr,v,v1))
next(v1, v2): bool = EXISTS (e: (events(sm)), c1, c2):
  ((pred(c1)(v1) AND pred(c2)(v2))
    => eprocess(e,v1,v2))
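The run-to-completion assumption combined with first come, first served dequeuing can be sketched in Python as follows. The sketch makes simplifying assumptions that go beyond the paper: the state machine is flat and deterministic, it is given as a lookup table, events that match no transition are simply discarded, and the event names `leaderFound` and `cycleFound` are hypothetical placeholders.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

# Hypothetical flat state machine: (state, event) -> next state.
TABLE: Dict[Tuple[str, str], str] = {
    ("Init", "electLeader"): "Electing",
    ("Electing", "leaderFound"): "LeaderElected",
    ("Electing", "cycleFound"): "ErrorDetected",
}


def run(initial: str, events: List[str]) -> List[str]:
    """Process events one at a time, oldest first, one step per event."""
    queue: Deque[str] = deque(events)
    state, trace = initial, [initial]
    while queue:
        event = queue.popleft()                   # first come, first served
        state = TABLE.get((state, event), state)  # unmatched events are dropped
        trace.append(state)                       # stable state after the step
    return trace


trace = run("Init", ["electLeader", "leaderFound", "cycleFound"])
print(trace)
```

Each loop iteration corresponds to one run-to-completion step: the next event is not dequeued until the state reached by the previous one has been recorded.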

4. A case study

In this section, we illustrate usability of our approach

through a case study of a network reconfiguration protocol.

4.1. Summary of requirements

The IEEE 1394 tree identify protocol [3] is used by the

1394 high performance serial bus for leader election tasks. It

has an open and scalable architecture that allows addition

and removal of devices and peripherals at any time. After a

bus-reset, i.e. when a node is added to, or removed from the

network, all the nodes in the network have equal status and

they know only nodes to which they are directly connected.

The IEEE 1394 tree identify is based on a leader election

algorithm that allows the election of a leader (root) that will

act as a manager of the bus for subsequent phases of the

IEEE 1394 protocol.

4.2. UML specifications

We describe the system by providing UML class and statechart diagrams, shown in Figs. 1 and 2, respectively. The class diagram consists of two classes: Node and Network. The class Node represents the individual nodes involved in the network. An instance of Node is characterized by a name, possibly a parent node, and three collections of nodes corresponding, respectively, to the neighbors, the actual children and the potential children. Potential children are represented by the role name pending. The class Network corresponds to the collection of nodes involved in the network. An instance of Node may be either a regular child or the manager in an instance of Network. The two associations between the two classes capture this property.

The statechart diagram shown in Fig. 2 describes the dynamic behavior of the Network class in terms of the messages it sends and receives. Initially, a Network object is in the state Init, which is entered immediately after the bus-reset. The election then starts with the occurrence of the electLeader event, bringing the Network object to the Electing state. If a leader is elected, represented by condition c4, the object moves to the LeaderElected state, ending the statechart. If a cycle is detected, represented by condition c5, an error is reported and the object moves to the ErrorDetected state. The Electing state is a concurrent state whose direct substates, also called regions, describe the individual behavior of the elements, i.e. the nodes, in the collection underlying a Network object. The regions of a concurrent state are delimited by dashed lines.

Given i such that 1 ≤ i ≤ N, region NodeiStatus starts in a Waiting state, where the corresponding node waits for a ‘be my parent’ request, represented by event beMyParent, from its neighbors. If a request is received from a neighbor that is not a child (condition c1), an acknowledgement is generated (action accept), followed by an acknowledgement of the acceptance (event confirm) and an update of the number of children of the node (action update). The update may lead to the Voting state, in case the number of neighbors that are not children is exactly 1. In that state, the node can send a ‘be my parent’ request, represented by event vote, to that neighbor. The node may also receive at the same time a ‘be my parent’ request from the same neighbor, resulting in a contention described by state Contention. After a timeout, the node returns to the Voting state. If the request is accepted (condition c2), the node moves to the ParentElected state, which is the final state of the NodeiStatus region. When all the nodes but one have their parents elected, the election process is complete, and the node without a parent becomes the elected leader (condition c4).
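The election behavior described above can be illustrated by a sequential Python simulation. This is a deliberately simplified sketch, not the protocol as standardized: nodes act one after another, so contention and timeouts never occur, acceptance is immediate, and a network without a unique leader (for instance one containing a cycle) is reported by raising an exception. The function name `elect_leader` and the node names are assumptions.

```python
from typing import Dict, Optional, Set


def elect_leader(neighbours: Dict[str, Set[str]]) -> str:
    """Sequential sketch of tree identify; returns the elected root."""
    children: Dict[str, Set[str]] = {n: set() for n in neighbours}
    parent: Dict[str, Optional[str]] = {n: None for n in neighbours}
    progress = True
    while progress:
        progress = False
        for node in sorted(neighbours):
            if parent[node] is not None:
                continue                      # ParentElected: nothing to do
            candidates = neighbours[node] - children[node]
            # Voting: exactly one neighbour is not yet a child.
            if len(candidates) == 1:
                (target,) = candidates
                children[target].add(node)    # target accepts and updates
                parent[node] = target
                progress = True
    roots = [n for n in neighbours if parent[n] is None]
    if len(roots) != 1:
        raise ValueError("no unique leader: the network is not a tree")
    return roots[0]


# Star-shaped network (assumption): b is connected to a, c and d.
network = {"a": {"b"}, "b": {"a", "c", "d"}, "c": {"b"}, "d": {"b"}}
leader = elect_leader(network)
print(leader)
```

In the star-shaped example the three leaf nodes each vote for b, so b ends up as the only node without a parent and becomes the leader. In a cyclic network no node ever has exactly one non-child neighbor, so no votes are cast and the error branch is taken.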

4.3. Complementary semantics and system properties

The standard UML document [6] provides only a partial specification of a system. The UML specification produced needs to be extended by providing complementary semantics for the elementary features (e.g. states, actions, conditions) and the properties involved, using languages such as OCL [8] or any other mathematical or textual language. We give some examples of complementary semantics and properties of the statechart diagram shown in Fig. 2 using OCL. The context of the expressions is a Network object and two interacting Node objects k and n in its collection. Let us say that node k corresponds to one of the nodes whose behavior is described by region NodekStatus. As an example of a guard condition, we define c1 as follows:

c1(n: Node, k: Node): Boolean
  self.nodes->includes(n) AND
  self.nodes->includes(k) AND
  k.children->excludes(n) AND
  k.neighbours->includes(n)

As an example of action, we specify update by the

predicate predUpdate. Its outcome consists of moving the

requesting node from the pending list (i.e. list of the nodes

for which a beMyParent request has been received) to the

children list.

predUpdate(k: Node, n: Node): Boolean
  k.children->includes(n) AND (n.parent = k) AND
  k.pending->excludes(n)
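The guard c1 and the postcondition predUpdate can be exercised on a tiny object model in Python. The sketch below is illustrative only: the `Node` class mirrors the attributes named in the class diagram, while the function `update` is a hypothetical implementation of the action, written so that it satisfies the postcondition; the paper specifies only the predicate, not the implementation.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass(eq=False)  # identity-based equality keeps Node hashable
class Node:
    name: str
    parent: Optional["Node"] = None
    neighbours: Set["Node"] = field(default_factory=set)
    children: Set["Node"] = field(default_factory=set)
    pending: Set["Node"] = field(default_factory=set)


def c1(nodes: Set[Node], n: Node, k: Node) -> bool:
    """Guard: n and k are in the network, n is a non-child neighbour of k."""
    return (n in nodes and k in nodes
            and n not in k.children and n in k.neighbours)


def update(k: Node, n: Node) -> None:
    """Hypothetical action: move n from k's pending list to its children."""
    k.pending.discard(n)
    k.children.add(n)
    n.parent = k


def pred_update(k: Node, n: Node) -> bool:
    """Postcondition of update, transcribed from predUpdate."""
    return n in k.children and n.parent is k and n not in k.pending


k, n = Node("k"), Node("n")
k.neighbours.add(n)
n.neighbours.add(k)
network = {k, n}
k.pending.add(n)  # a beMyParent request from n has been received

guard_before = c1(network, n, k)
post_before = pred_update(k, n)
update(k, n)
post_after = pred_update(k, n)
guard_after = c1(network, n, k)
print(guard_before, post_before, post_after, guard_after)
```

The guard holds before the update and no longer holds afterwards, because n has become a child of k; the postcondition behaves the other way round.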

We also give an example of a property, named Prop1, that characterizes a Network object. Prop1 ensures that there is at most one root in the network.

Fig. 1. Class diagram of IEEE 1394 protocol.

Fig. 2. Statechart diagram of IEEE 1394 protocol.


Prop1:
  self.nodes->forAll(p1, p2 | p1 = self.root AND p2 = self.root
    implies p1 = p2)

4.4. Formal analysis

In order to reason formally about the UML models, we need a formal description of the system. As already stated, we use PVS for that purpose. More specifically, we translate the OCL specification into PVS and, based on our semantic framework, we do the same for the UML graphical specification. The two PVS specification fragments derived from UML and OCL are integrated into a single, homogeneous PVS specification that serves as the basis for formal verification. We have developed a supporting environment, called the Precise UML Development Environment (PrUDE), to assist specifiers in generating PVS models. PrUDE also allows the specifier to invoke the PVS tools either in batch mode or interactively. Fig. 3 shows a snapshot of property verification using the PrUDE tool. The lower window shows the log report generated by running the PVS tool in batch mode. The verification of the model is conducted by expressing system properties as PVS theorems, and then checking them using the PVS tools. For instance, property Prop1 (cf. Section 4.3), which states that there is at most one root in the network, is expressed in PVS as follows:

p1, p2: VAR VNode
prop1: THEOREM
  (member(p1, nodes(v)) AND member(p2, nodes(v))
    => (root(v) = p1 AND root(v) = p2 => p1 = p2))

By invoking the PVS theorem-prover interactively from

PrUDE, we obtain the following proof of property

Prop 1:

prop1 :
  |-------
{1}   FORALL (p1, p2: VNode, v: V): (member(p1, nodes(v)) AND member(p2, nodes(v)) => (root(v) = p1 AND root(v) = p2 => p1 = p2))
Rerunning step: (SKOSIMP*)
Repeatedly Skolemizing and flattening,
this simplifies to:

Fig. 3. Automatic verification of Prop 1 using PrUDE.


prop1 :
{-1}  member(p1!1, nodes(v!1))
{-2}  member(p2!1, nodes(v!1))
{-3}  root(v!1) = p1!1
{-4}  root(v!1) = p2!1
  |-------
{1}   p1!1 = p2!1
Rerunning step: (EXPAND "member")
Expanding the definition of member,
this simplifies to:
prop1 :
{-1}  nodes(v!1)(p1!1)
{-2}  nodes(v!1)(p2!1)
[-3]  root(v!1) = p1!1
[-4]  root(v!1) = p2!1
  |-------
[1]   p1!1 = p2!1
Rerunning step: (GROUND)
Applying propositional simplification and
decision procedures,
Q.E.D.
Run time = 0.17 s.
Real time = 0.22 s.
NIL
PVS(33):

Conducting interactive proof-checking, even from the PrUDE environment, is quite often tedious and time consuming. The properties expressed in our framework are based on a common template. Using that general structure, we have succeeded in defining general PVS proof strategies based on the notion of configuration pairs [5]. Each strategy is composed of primitive strategies and can be used to check system properties automatically. The following is a proof strategy for a statechart:

(defstep property-proof-strategy
  (then (auto-rewrite "user_defined_axiom1" "user_defined_axiom2"
    …)
    (skosimp)
    (expand "ConfigurationPair")
    (grind)
  )
)

The proof strategy, named property-proof-strategy, first installs the complementary semantics (e.g. user-defined axioms) as auto-rewrite rules, and then invokes the skosimp command to replace the universally quantified variables in the target formula with constants. The expand command is then used to expand the definition of configuration pair. Finally, the grind command, a catch-all strategy, is invoked to apply all the necessary simplifications and complete the proof. These proof strategies are implemented in PrUDE and can be invoked to check automatically any proof obligation based on our framework. If the proof fails, a counterexample is generated to trace errors in the original UML model. Fig. 3 presents a snapshot of the automatic verification of property Prop 1: the property is edited using a property editor (the upper window) and then checked automatically by invoking the prover.

5. Concluding remarks

In this paper, we have presented a framework for the formal development of ODSs and an automated platform that supports the framework. One of the main objectives of our platform is to minimize the formal artifacts that its users have to deal with. This in turn facilitates the industrial usability of the platform. In this respect, we have decided to use the PVS-SL as the underlying semantic foundation and not as a specification language. As a result, the user need not have in-depth knowledge of the PVS formal notation and proof system. The PVS-SL offers a very general semantic foundation and a set of powerful tools. It is highly expressive and offers several mechanisms for formal analysis. In order to enhance the automation of the formal verification process, we have defined suitable proof patterns and strategies for the kinds of properties that can be derived from our semantic model. These strategies are implemented in the current version of the PrUDE tool, and allow automatic processing of proof obligations.

References

[1] O.J. Dahl, O. Owe, Formal Methods and the RMODP, Research Report No. 261, Department of Informatics, University of Oslo, Norway, 1998.

[2] A. Evans, UML class diagrams-filling the semantic gap, Technical Report, York University, 1998.

[3] IEEE, IEEE Standard for a High Performance Serial Bus, Standard 1394-1995, August 1995.

[4] ISO-IEC JTC1/SC21/WG7, The Reference Model of Open Distributed Processing, 1995.

[5] M. Liu Yanguo, Proof Patterns for UML-based Verification, Master Thesis, ECE Department, University of Victoria, Victoria, Canada, October 2002.

[6] The OMG, OMG Unified Modeling Language Specification, version 1.3, OMG standard document, June 1999.

[7] S. Owre, N. Shankar, J. Rushby, D.W. Stringer-Calvert, PVS Language Reference, version 2.3, September 1999.

[8] J.B. Warmer, A.G. Kleppe, The Object Constraint Language: Precise Modeling with UML, Addison Wesley Longman Inc., Reading, MA, 1999.

