[IEEE 2008 32nd Annual IEEE International Computer Software and Applications Conference - Turku, Finland (2008.07.28-2008.08.1)] 2008 32nd Annual IEEE International Computer Software

A Formal Approach to Developing Reliable Event-Driven Service-Oriented Systems

Ramesh Bharadwaj

Naval Research Laboratory

[email protected]

Supratik Mukhopadhyay

Utah State [email protected]

Abstract

In this paper, we present a formal framework for developing distributed service-oriented systems in an event-driven secure synchronous programming environment. More precisely, we present a synchronous programming language called SOL (Secure Operations Language) that has (i) capabilities for handling service invocations asynchronously, (ii) strong typing to ensure enforcement of information flow and security policies, and (iii) the ability to deal with failures (both benign and byzantine) of components. SOL is supported by formal operational semantics. Applications written in our framework can be verified using formal static checking techniques like theorem proving. The framework runs on top of the SINS (Secure Infrastructure for Networked Systems) infrastructure that we have developed.

1 Introduction

In this paper, we present a distributed service-oriented asynchronous framework in an event-driven [9] formal synchronous programming [2] environment (à la SCR [7] and Esterel [3]). More precisely, we present a model-driven approach based on a synchronous programming language, SOL (Secure Operations Language), that has capabilities for handling service invocations asynchronously, provides strong typing to ensure enforcement of information flow and security policies, and has the ability to deal with failures (both benign and byzantine) of components. In the synchronous programming paradigm, the programmer is provided with an abstraction that respects the synchrony hypothesis, i.e., one may assume that an external event is processed completely by the system before the arrival of the next event. One might wonder how a synchronous programming paradigm can be effective for dealing with widely distributed systems where there is inherent asynchrony. The answer may seem surprising to some, but perfectly reasonable to others: we have shown elsewhere [6] that under certain sufficient conditions (which are preserved in our case) the synchronous semantics of a SOL application are preserved when it is deployed on an asynchronous, distributed infrastructure. The individual modules follow a “publish-subscribe” pattern of interaction, while asynchronous service invocations are provided using continuation passing. The design of SOL was heavily influenced by the design of SAL (the SCR Abstract Language), a specification language based on the SCR Formal Model [7]. Applications written in our framework can be verified using formal static checking techniques like theorem proving. We provide a static type system that (1) ensures static type soundness and (2) prevents runtime errors in the presence of third-party (possibly COTS) component services that may undergo reconfigurations at runtime due to network faults or malicious attacks. The framework runs on top of the SINS (Secure Infrastructure for Networked Systems) [4] infrastructure that we have developed. SINS is built on top of the Spread toolkit [1], which provides a high-performance virtual synchrony messaging service that is resilient to network faults. A typical SINS system comprises SINS Virtual Machines (SVMs), running on multiple disparate hosts, each of which is responsible for managing a set of modules on that host. SINS provides the required degree of trust for the modules, in addition to ensuring compliance of modules with a set of requirements, including security policies.

2 Related Work

In contrast with the web services paradigm [10], our framework is based on the synchronous programming language SOL. In SOL, message passing between modules (henceforth we will use the term agent for module instances) is based on a (push) publish-subscribe pattern. A module listens to those “controlled variables” of another module that it “subscribes to” by including them as its “monitored variables”. A module receives the values of its monitored variables as input and computes a function whose output can change the values of its controlled variables. Service invocations (both synchronous and asynchronous) needed to compute the function are dealt with uniformly using continuation passing. SOL agents run on the SINS platform, which is built on top of the Spread toolkit, which provides guaranteed message delivery and resilience to network faults. Dynamic reconfiguration of the system in response to failures can be obtained using “hierarchical plumbing” à la [12]. The event-driven, publish-subscribe-based interaction between the individual modules makes SOL ideal for programming service-based systems that are deployed in networks involving sensors and other physical devices having complex dynamical behavior.

0730-3157/08 $25.00 © 2008 IEEE

DOI 10.1109/COMPSAC.2008.87

In [11], the authors use a synchronous framework for globally asynchronous designs. However, their framework is more suited to a hardware design environment than to a large-scale distributed computing one.

Communicating concurrent processes, the dominant paradigm for distributed application development, has remained unchallenged for almost 40 years. Not only is this model difficult to use for the average developer, but it also fails as a paradigm for designing applications that must satisfy critical requirements such as real-time guarantees [8]. Therefore, applications developed using conventional programming models are vulnerable to deadlocks, livelocks, starvation, and synchronization errors. Moreover, such applications are vulnerable to catastrophic failures in the event of hardware or network malfunctions. Here we present an alternative approach. We embed an asynchronous framework in an event-driven synchronous programming environment (à la SCR [7] and Esterel [3]). As opposed to other synchronous programming languages like Esterel, SOL is a synchronous programming language for distributed applications.

3 SOL: The Secure Operations Language

A module is the unit of specification in SOL and comprises type definitions, flow control rules, unit declarations, unit conversion rules, variable declarations, service declarations, assumptions and guarantees, and definitions. A module in SOL may include one or more attributes. The attribute deterministic declares the module as being free of nondeterminism (which is checked by the SOL compiler). The attribute reactive declares that the module will not cause a state change or invoke a method unless its (visible) environment initiates an event by changing state or invoking a method (service); moreover, the module’s response to an environmental event will be immediate, i.e., in the next immediate step. The attribute continuation declares that the module will serve as a continuation for some (external) service invocation. Each (asynchronous) external service invocation is managed by a continuation module that receives the response for the invocation and informs the module invoking the service about it. An agent is a module instance. In the sequel, we will use the terms module and agent interchangeably.

The definition of a SOL module comprises a sequence of sections, all of them optional, each beginning with one or more keywords.

The type definitions section allows the user to declare “secrecy” types (e.g., secret, classified, unclassified, etc.) in order to enforce information flow policies and prevent unwanted downgrading of sensitive information from “secret” variables to “public” variables. The flow control rules section provides rules that govern the downgrading/flow of information between variables of different “secrecy” types (e.g., the rule unclassified => classified signifies that a variable of type unclassified can be assigned to a variable of type classified, i.e., information flow from an unclassified to a classified variable is allowed). The flow control rules can be used to compute the secrecy types of expressions from those of their constituent variables. If not specified in the flow control rules section, information flow between variables/expressions with different secrecy types is allowed only in the presence of explicit coercion provided by the programmer. These policies are enforced statically by a type system.

The unit declarations section declares units for the physical quantities that the module monitors and manipulates (e.g., lb, kg, centigrade, etc.). The unit conversion rules section provides conversion (coercion) rules between the different units (e.g., kg=2.2 lb). Units of expressions can be computed from the units of their constituent subexpressions.

The variable declarations section for reactive/deterministic modules is subdivided into five subsections. The continuation variable declaration subsection defines continuation variables that will be used for service invocations. There will be one continuation variable for each service invocation in a module. The type “continuation” before a variable designates it as a continuation variable (e.g., continuation cont;). Corresponding to each node in a distributed system, there will be a continuation module handling the service invocations associated with all agents on that node; it transfers the results of service invocations to the invoking agents through continuation variables. The other four subsections declare the “monitored” variables in the environment that an agent monitors, the “controlled” variables in the environment that the agent controls, “service” variables that only external service invocations can update, and “internal” variables introduced to make the description of the agent concise. A variable declaration can specify the unit (declared in the unit declarations section) of the physical quantity that it is supposed to assume values for (e.g., int weight unit lb;). Assignment of a variable/expression with a unit U to a variable with unit V is allowed only if it is specified in the unit conversion rules section. In that case, the value of the variable/expression is converted to the unit V using the corresponding conversion rule before being assigned to a variable with unit V.

The service declarations section declares the methods that are invoked within a module along with the services providing them. It also describes, for each method, the preconditions that are to be met before invoking the method as well as the postconditions that the return value(s) from the method is/are supposed to respect. The preconditions and postconditions consist of conjunctions of arithmetic constraints as well as type expressions. These conditions are enforced dynamically by the runtime environment.
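
As an illustration of the flow control and unit conversion rules described in this section, the following Python sketch checks whether an assignment is permitted. It is not SOL syntax and not the SOL type checker: the names FLOW_RULES, UNIT_RULES, can_flow, and convert are hypothetical, the check is done at run time rather than statically, and only rules that are listed explicitly are honored (no transitive closure is assumed).

```python
# Hypothetical encoding of a flow control rules section: each pair (src, dst)
# stands for a rule "src => dst", i.e. a value of type src may flow into dst.
FLOW_RULES = {("unclassified", "classified")}

# Hypothetical encoding of a unit conversion rules section: (from, to) -> factor.
# The single entry mirrors the example "kg=2.2 lb" from the text.
UNIT_RULES = {("kg", "lb"): 2.2}


def can_flow(src: str, dst: str, rules=FLOW_RULES) -> bool:
    """Return True if a value of secrecy type src may be assigned to dst.

    Assumption: flow between identical types is always allowed, and any
    other flow needs an explicit rule (or a programmer-supplied coercion,
    which this sketch does not model).
    """
    return src == dst or (src, dst) in rules


def convert(value: float, src_unit: str, dst_unit: str, rules=UNIT_RULES) -> float:
    """Convert value from src_unit to dst_unit using a declared rule.

    Raises TypeError when no conversion rule is declared, mirroring the
    requirement that such an assignment is not allowed.
    """
    if src_unit == dst_unit:
        return value
    if (src_unit, dst_unit) not in rules:
        raise TypeError(f"no conversion rule from {src_unit} to {dst_unit}")
    return value * rules[(src_unit, dst_unit)]
```

With these rules, can_flow("unclassified", "classified") holds while the reverse direction does not, and convert(10, "kg", "lb") applies the declared factor before the assignment would take place.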

The assumptions section includes assumptions upon which correct operation of the agent depends. Execution aborts when any of these assumptions is violated by the environment, which results in the failure variable corresponding to that agent being set to true. The required safety properties of the agent are specified in the guarantees section. Variable definitions, provided as functions or more generally relations in the definitions section, specify values of internal and controlled variables. A SOL module specifies the required relation between monitored variables (variables in the environment that the agent monitors) and controlled variables (variables in the environment that the agent controls). Additional internal variables are often introduced to make the description of the agent concise. In this paper, we often distinguish between monitored variables, i.e., variables whose values are specified by the environment, and dependent variables, i.e., variables whose values are computed by a SOL module using the values of the monitored variables as well as those returned by the external service invocations. Dependent variables of a SOL module include the controlled variables, service variables, and internal variables.
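
The relation between monitored and dependent variables, together with the role of assumptions and the failure variable, can be pictured with a small Python sketch. This is an assumption-laden illustration, not the SOL runtime: the class Module, its step method, and the sprinkler example (loosely modeled on the soil and water management application) are hypothetical, and the definitions are assumed to be listed in an order that respects their dependencies.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, object]


@dataclass
class Module:
    # Predicates over the current state that the environment must satisfy.
    assumptions: List[Callable[[State], bool]]
    # Dependent variable name -> definition taking (previous, current) state.
    definitions: Dict[str, Callable[[State, State], object]]
    state: State = field(default_factory=dict)
    failure: bool = False  # stands in for the agent's failure variable

    def step(self, monitored: State) -> State:
        """Process one environmental event completely (synchrony hypothesis)."""
        if self.failure:
            return self.state  # an aborted agent no longer reacts
        prev = dict(self.state)
        cur = {**prev, **monitored}
        if not all(check(cur) for check in self.assumptions):
            self.failure = True  # assumption violated: abort, keep old state
            return self.state
        for name, define in self.definitions.items():
            cur[name] = define(prev, cur)
        self.state = cur
        return cur


# Hypothetical agent: turn the sprinkler on when soil moisture drops below 30%.
sprinkler = Module(
    assumptions=[lambda s: 0 <= s["moisture"] <= 100],
    definitions={"sprinkler_on": lambda prev, cur: cur["moisture"] < 30},
)
```

Calling sprinkler.step({"moisture": 20}) computes the controlled variable sprinkler_on from the monitored variable moisture, while a reading outside the assumed range sets failure and leaves the last consistent state in place.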

3.1 Events

SOL borrows from SCR the notion of events [7]. Informally, an SCR event denotes a change of state, i.e., an event is said to occur when a state variable changes value. SCR systems are event-driven and the SCR model includes a special notation for denoting them. The following are the notations for events that can trigger reactive/deterministic modules. The notation @T(c) denotes the event “condition c became true”, @F(c) denotes “condition c became false”, @Comp(cont) denotes that “the result of the service invocation associated with the continuation variable cont is available”, and @C(x) denotes the event “the value of expression x has changed”. These constructs are explained below. In the sequel, PREV(x) denotes the value of expression x in the previous state.

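$$
\begin{aligned}
@T(c) &\;\overset{\mathrm{def}}{=}\; \neg\,\mathrm{PREV}(c) \wedge c \\
@F(c) &\;\overset{\mathrm{def}}{=}\; \mathrm{PREV}(c) \wedge \neg c \\
@C(c) &\;\overset{\mathrm{def}}{=}\; \mathrm{PREV}(c) \neq c
\end{aligned}
$$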

Events may be triggered predicated upon a condition by including a “when” clause. Informally, the expression following the keyword when is “aged” (i.e., evaluated in the previous state) and the event occurs only when this expression has evaluated to true. Formally, a conditioned event, defined as

$$
@T(c)\ \mathbf{when}\ d \;\overset{\mathrm{def}}{=}\; \neg\,\mathrm{PREV}(c) \wedge c \wedge \mathrm{PREV}(d),
$$

denotes the event “condition c became true when condition d was true in the previous state”. Conditioned events involving the @F and @C constructs are defined along similar lines. The event @Comp(cont) is triggered by the environment in which the agent is running and is received as an event by the agent whenever the result of a service invocation is received by the continuation module associated with the module.
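
The event definitions of this section translate directly into predicates over a pair of states. The Python sketch below is only an illustration: the function names at_T, at_F, at_C, and at_T_when are hypothetical, states are assumed to be plain dictionaries, and expressions are assumed to be functions from a state to a value.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]
Expr = Callable[[State], Any]


def at_T(c: Expr, prev: State, cur: State) -> bool:
    """@T(c): c was false in the previous state and is true now."""
    return bool((not c(prev)) and c(cur))


def at_F(c: Expr, prev: State, cur: State) -> bool:
    """@F(c): c was true in the previous state and is false now."""
    return bool(c(prev) and (not c(cur)))


def at_C(x: Expr, prev: State, cur: State) -> bool:
    """@C(x): the value of x differs between the previous and current state."""
    return bool(x(prev) != x(cur))


def at_T_when(c: Expr, d: Expr, prev: State, cur: State) -> bool:
    """@T(c) when d: c became true and d held in the previous state.

    Note that d is "aged": it is evaluated on prev, never on cur.
    """
    return at_T(c, prev, cur) and bool(d(prev))
```

For example, with prev = {"temp": 90, "armed": True} and cur = {"temp": 110, "armed": False}, the conditioned event at_T_when(lambda s: s["temp"] > 100, lambda s: s["armed"], prev, cur) occurs, because the guard is read from the previous state even though armed is false in the current one.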

Each controlled and internal variable of a module has one and only one definition, which determines when and how the variable gets updated. All definitions of a module m implicitly specify a dependency relation D_m such that a variable a depends on variable b (i.e., (a, b) ∈ D_m) if and only if b appears in the definition of a. Note that variable a may depend on the previous values of other variables (including itself); this has no effect on the dependency relation. A dependency graph may be inferred from the dependency relation by taking each variable in the module to be a node and including an edge from a to b if a depends on b.¹ It is required that the dependency graph of each module is acyclic.
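
The acyclicity requirement can be checked mechanically once the dependency relation is written down. The following Python sketch is one possible way to do so and is not taken from the SOL compiler: the function evaluation_order and the variable names in the example are hypothetical, and the check relies on the standard-library graphlib module (Python 3.9 or later). References through PREV are assumed to be left out of the relation, as stated above.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, Iterable, List


def evaluation_order(depends_on: Dict[str, Iterable[str]]) -> List[str]:
    """Return an order in which variables can be computed within one step.

    depends_on maps each variable a to the variables b with (a, b) in the
    dependency relation, i.e. those appearing (unaged) in a's definition.
    Raises ValueError if the dependency graph contains a cycle.
    """
    try:
        # graphlib treats the mapped values as predecessors, so every b
        # is emitted before any a that depends on it.
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError as err:
        # The detected cycle is carried in the second element of err.args.
        raise ValueError(f"cyclic dependency: {err.args[1]}") from err
```

Given {"alarm": {"too_hot"}, "too_hot": {"temp"}}, the returned order places temp before too_hot and too_hot before alarm, whereas {"a": {"b"}, "b": {"a"}} is rejected.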

3.2 Service Invocation

A service variable is defined by a definition in terms of a service invocation expression. A service invocation expression is of the form A:B(var_list)^cont, where the identifier A is the name/URL of the service, B is the name of the method invoked, var_list is the list of variables passed as arguments to the method, and cont is the passed continuation variable. In this case, the service variable depends on the variables in var_list. While from the syntax it seems that the services considered here are different from those used in the popular web services framework (e.g., those based on GET and POST), in reality such services can be easily cast into our framework (compare services developed in the .Net environment using C#). For each service invocation in a module, a distinct continuation variable is used. Internally, each service invocation is handled by a continuation module that uses the continuation variable to transfer the value to the invoking module. Corresponding to each node in a distributed system is a continuation module that handles the service invocations for all modules running on that node. A continuation module has the same structure as the reactive/deterministic ones, except that it can have an additional subsection in the variable declaration section: channel variables. Channel variables receive values from external services. In addition, it can have another section called triggers that lists actions in the environment that the module can trigger. Actions in the triggers section can be defined in the same way as variables. A continuation module for a node in a distributed system is generated automatically by the SOL compiler from the SOL definitions of the modules running on the node and is hidden from the programmer.

¹ The notion of a dependency relation is easily extended to the entire system.
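
The continuation-passing scheme of this section can be sketched in a few lines of Python. This is a simulation under stated assumptions rather than the generated SOL continuation module: the class ContinuationModule and its methods invoke and deliver are hypothetical, the "service" is an ordinary local function standing in for a remote one, and its response is queued to imitate a reply that arrives after the invoking agent has moved on.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict, Tuple


class ContinuationModule:
    """One per node: collects service responses and forwards them to agents."""

    def __init__(self) -> None:
        # Responses that have "arrived" but were not yet handed to an agent.
        self._pending: Deque[Tuple[str, Any]] = deque()
        # Continuation variable name -> callback run on the @Comp event.
        self._handlers: Dict[str, Callable[[Any], None]] = {}

    def invoke(self, service: Callable[..., Any], args: Tuple[Any, ...],
               cont: str, on_comp: Callable[[Any], None]) -> None:
        """Start an invocation of the form A:B(var_list)^cont.

        The invoking agent is not notified here; it only learns the result
        when deliver() later raises the @Comp(cont) event.
        """
        self._handlers[cont] = on_comp
        # Stand-in for a network call: compute the reply and queue it.
        self._pending.append((cont, service(*args)))

    def deliver(self) -> int:
        """Hand queued results to agents, one @Comp(cont) event per result."""
        delivered = 0
        while self._pending:
            cont, result = self._pending.popleft()
            self._handlers.pop(cont)(result)
            delivered += 1
        return delivered
```

An agent would call invoke with a distinct continuation variable name per invocation and update its service variable inside on_comp; until deliver runs, the agent's state is untouched, which mirrors the separation between issuing an invocation and receiving its result.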

3.3 Assumptions and Guarantees

The assumptions of a module, which are typically assumptions about the environment of the subsystem being defined, are included in the assumptions section. It is up to the user to make sure that the set of assumptions is consistent. Users specify the module invariants in the guarantees section; these are automatically verified by a theorem prover such as Salsa [5].

4 Experiences

Our approach has been used for developing large mission-critical service-oriented applications. These include a torpedo tube control protocol (TTCP), an automated therapeutic monitoring system, a sensor network-based distributed system for soil and water management, and a distributed control system for intelligent management of an electric power grid. In the soil and water management system, moisture and water flow sensors distributed across a field report data to base stations, where agents control the sprinklers based on it. In the power grid control system, power factor meters and voltmeters distributed across different loads report to base stations, where agents switch on static VAR compensators for power factor correction based on the data. Graduate students as well as professional programmers were involved in these projects. The applications written in SOL were first verified for functional correctness using theorem provers before being submitted to the SOL compiler for type checking and compilation. One observation was that professional programmers were reluctant to use SOL because of its unusual syntax (compared to C++ and Java). In order to gain industrial acceptance, we are currently trying to embed SOL as a domain-specific extension of Java. The resulting embedding (called SOLj) has a Java-like syntax, with extensions that can in turn be compiled to Java.

5 Concluding Remarks

SOL is based on ideas introduced in the Software Cost Reduction (SCR) project [7], which dates back to the late seventies. The design of SOL was directly influenced by the sound software engineering principles in the design of SAL (the SCR Abstract Language), a specification language based on the SCR Formal Model [7].

The goal of SINS is to provide an infrastructure for deploying and protecting time- and mission-critical applications on a distributed computing platform, especially in a hostile computing environment, such as the Internet. The criterion on which this technology should be judged is that critical information is conveyed to principals in a manner that is secure, safe, timely, and reliable.

References

[1] E. Amir and J. Stanton. The Spread wide area group communication system. Technical report, Johns Hopkins University, 1998.

[2] A. Benveniste, P. Caspi, S. A. Edwards, N. Halbwachs, P. L. Guernic, and R. de Simone. The synchronous languages 12 years later. Proceedings of the IEEE, 91(1):64–83, 2003.

[3] G. Berry and G. Gonthier. The Esterel synchronous programming language: Design, semantics, implementation. Sci. of Computer Prog., 19, 1992.

[4] R. Bharadwaj. SINS: a middleware for autonomous agents and secure code mobility. In Proc. Second International Workshop on Security of Mobile Multi-Agent Systems (SEMAS-02), Bologna, Italy, July 2002.

[5] R. Bharadwaj and S. Sims. Salsa: Combining constraint solvers with BDDs for automatic invariant checking. In Proc. 6th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS’2000), ETAPS 2000, Berlin, Mar. 2000.

[6] R. Bharadwaj and S. Mukhopadhyay. From synchrony to SINS. Technical report, West Virginia University, 2005.

[7] C. L. Heitmeyer, R. D. Jeffords, and B. G. Labaw. Automated consistency checking of requirements specifications. ACM Transactions on Software Engineering and Methodology, 5(3):231–261, April–June 1996.

[8] E. A. Lee. Absolutely positively on time: What would it take? Computer, 38(7):85–87, 2005.

[9] D. Luckham. The Power of Events. Addison Wesley, 2005.

[10] E. Newcomer. Understanding Web Services. Addison Wesley, 2002.

[11] J.-P. Talpin, P. L. Guernic, S. K. Shukla, R. K. Gupta, and F. Doucet. Polychrony for formal refinement-checking in a system-level design methodology. In ACSD, pages 9–19, 2003.

[12] S. S. Yau, S. Mukhopadhyay, and R. Bharadwaj. Specification, analysis, and implementation of architectural patterns for dependable software systems. In IEEE WORDS, 2005.
