From Business Process Models to Process-Oriented Software Systems


    From Business Process Models to Process-Oriented Software Systems

    CHUN OUYANG

    Queensland University of Technology

    MARLON DUMAS

    University of Tartu and Queensland University of Technology

    WIL M. P. VAN DER AALST

    Eindhoven University of Technology

    and

    ARTHUR H. M. TER HOFSTEDE and JAN MENDLING

    Queensland University of Technology

    Several methods for enterprise systems analysis rely on flow-oriented representations of business operations, otherwise known as business process models. The Business Process Modeling Notation (BPMN) is a standard for capturing such models. BPMN models facilitate communication between domain experts and analysts and provide input to software development projects. Meanwhile, there is an emergence of methods for enterprise software development that rely on detailed process definitions that are executed by process engines. These process definitions refine their counterpart BPMN models by introducing data manipulation, application binding, and other implementation details. The de facto standard for defining executable processes is the Business Process Execution Language (BPEL). Accordingly, a standards-based method for developing process-oriented systems is to start with BPMN models and to translate these models into BPEL definitions for subsequent refinement. However, instrumenting this method is challenging because BPMN models and BPEL definitions are structurally very different. Existing techniques for translating BPMN to BPEL only work for limited classes of BPMN models. This article proposes a translation technique that does not impose structural restrictions on the source BPMN model. At the same time, the technique emphasizes the generation of readable (block-structured) BPEL code. An empirical evaluation conducted over a large collection of process models shows that the resulting BPEL

    This work is supported by the Australian Research Council under the Discovery Grant Expressiveness Comparison and Interchange Facilitation between Business Process Execution Languages (DP0451092).

    Authors' addresses: C. Ouyang, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia; email: [email protected]; M. Dumas, University of Tartu and Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia; W. M. P. Van der Aalst, Eindhoven University of Technology, PO Box 513, 5600 MB Eindhoven; A. H. M. ter Hofstede, J. Mendling, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia.

    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or [email protected] ACM 1049-331X/2009/08-ART2 $10.00 DOI 10.1145/1555392.1555395 http://doi.acm.org/10.1145/1555392.1555395

    ACM Transactions on Software Engineering and Methodology, Vol. 19, No. 1, Article 2, Pub. date: August 2009.


    definitions are largely block-structured. Beyond its direct relevance in the context of BPMN and BPEL, the technique presented in this article addresses issues that arise when translating from graph-oriented to block-structured flow definition languages.

    Categories and Subject Descriptors: D.2.1 [Software Engineering]: Requirements/Specifications - Languages; tools

    General Terms: Design, Languages

    Additional Key Words and Phrases: Business process modeling, Web services, BPMN, BPEL

    ACM Reference Format:

    Ouyang, C., Dumas, M., Van der Aalst, W. M. P., ter Hofstede, A. H. M., and Mendling, J. 2009. From business process models to process-oriented software systems. ACM Trans. Softw. Eng. Methodol. 19, 1, Article 2 (August 2009), 37 pages. DOI = 10.1145/1555392.1555395 http://doi.acm.org/10.1145/1555392.1555395

    1. INTRODUCTION

    Business Process Management (BPM) is an established discipline for building, maintaining, and evolving large enterprise systems on the basis of business process models [Becker et al. 2003]. A business process model is a flow-oriented representation of a set of work practices aimed at achieving a goal, such as processing a customer request or complaint, satisfying a regulatory requirement, etc.

    The Business Process Modeling Notation (BPMN) [OMG 2006] is gaining adoption as a standard notation for capturing business processes [Recker et al. 2005]. The main purpose of business process models generally, and BPMN models in particular, is to facilitate communication between domain analysts and to support decision-making based on techniques such as cost analysis, scenario analysis, and simulation [Rosemann 2003; Recker et al. 2005]. However, BPMN models are also used as a basis for specifying software system requirements, and in such cases, they are handed over to software developers. In this setting, the motivating question of this article is: How can developers fully exploit BPMN process models produced by domain analysts?

    Meanwhile, the Business Process Execution Language (BPEL) [Jordan and Evdemon 2007] is emerging as a de facto standard for implementing business processes on top of Web service technology. More than a dozen platforms, such as Oracle BPEL, IBM WebSphere, and Microsoft BizTalk, support the execution of BPEL process definitions (see http://en.wikipedia.org/wiki/BPEL for a list). BPEL process definitions are more detailed than BPMN ones. For example, they include elements related to data manipulation, Web service bindings, and other implementation aspects that are not present in their counterpart BPMN models.

    In this setting, a standards-based approach to process-oriented systems development is to take BPMN models as input and to translate these models into templates of BPEL process definitions for subsequent manipulation by software developers. However, the instrumentation of this method is hindered by a fundamental mismatch between BPMN and BPEL [Recker and Mendling 2006]. A BPMN model consists of nodes that can be connected through control flow arcs in arbitrary ways. Meanwhile, BPEL offers block-structured constructs to capture control flow, plus a notion of control link to connect a collection of activities in an acyclic graph. In other words, BPMN supports arbitrary control flow structures, whereas BPEL supports only restricted control flow structures. As a result, existing mappings between BPMN and BPEL [OMG 2006; Mendling et al. 2006] impose restrictions on the structure of the source models. For example, they are restricted to BPMN models such that every loop has one single entry point and one single exit point and such that each point where the flow of control branches has a corresponding point where the resulting branches merge back.

    The ensuing problem is to some extent similar to that of translating unstructured flowcharts into structured ones (or GOTO programs into WHILE programs) [Oulsnam 1982]. A major difference though is that process modeling languages include constructs for capturing parallel execution and constructs for capturing choices driven by the environment (also called event-driven choices), as opposed to choices driven by data such as those found in flowcharts. It turns out that, due to these additional features, the class of structured process models is strictly contained in the class of unstructured process models as discussed in Kiepuszewski et al. [2000]. This raises the question: Can every BPMN model be translated into a BPEL model?

    This article shows that the answer is yes. However, the resulting translation heavily uses a construct in BPEL known as event handler, which serves to encode event-action rules. Specifically, the original BPMN process model is decomposed into a collection of event-action rules that trigger one another to encode the underlying control flow logic. Arguably, the resulting BPEL code is not readable and thus difficult to modify and to maintain. For the generated BPEL code to be readable, the control flow logic should be captured using BPEL's block-structured control flow constructs and control links, as opposed to a construct intended for event handling. But since BPEL's control flow constructs are syntactically restricted, it is not always possible to generate BPEL code satisfying these readability criteria. Therefore, the article also addresses the question: Are there classes of BPMN models that can be translated into readable BPEL process definitions, that is, process definitions in which control flow dependencies in the source model are not encoded as event handlers?

    This article identifies two such classes of BPMN models. The first one corresponds to the class of structured process models as defined in Kiepuszewski et al. [2000]. Such models can be mapped onto the structured control flow constructs of BPEL. The second class corresponds to the class of synchronizing process models as defined in Kiepuszewski et al. [2003], which can be mapped onto BPEL control links. An acyclic BPMN model, or an acyclic fragment of a BPMN model, falls under this class if it satisfies a number of semantic conditions such as absence of deadlock. We apply Petri net analysis techniques to statically check these semantic conditions on the source BPMN model.

    The article also shows how the proposed translation techniques can be combined, such that a technique yielding less readable code is only applied when the other techniques cannot, and only for model fragments of minimal size. The combined translation technique has been implemented as an open-source tool, namely BPMN2BPEL.

    It is beyond the scope of this work to discuss every detail of a translation from BPMN to BPEL. Many of these details, such as how to map tasks and events into BPEL, are discussed in an appendix of the BPMN standard specification [OMG 2006]. Instead, this article concentrates on open issues arising from the mismatch between BPMN and BPEL discussed before.

    Beyond its direct relevance in the context of BPMN and BPEL, this article addresses difficult problems that arise generally when translating between flow-based languages with parallelism. In particular, the main results are still largely applicable to automate a mapping from UML Activity Diagrams [OMG 2005] to BPEL.

    The rest of the article is structured as follows: Section 2 overviews BPMN and BPEL, and defines an abstract syntax for each of them. Section 3 presents three approaches which comprise an overall algorithm for translating BPMN into BPEL. The translation algorithm is then illustrated through two examples in Section 4. Section 5 discusses the tool support for our translation approach and uses a set of 568 business process models from practice to test whether the approach really yields readable BPEL models. Finally, Section 6 compares the proposal with related work while Section 7 concludes and outlines future work. In addition, a formal semantics of BPMN in terms of Petri nets is given in Appendix A, which can be accessed in the ACM Digital Library.

    2. BACKGROUND: BPMN AND BPEL

    2.1 Business Process Execution Language for Web Services (BPEL)

    BPEL [Jordan and Evdemon 2007] combines features found in classical imperative programming languages with constructs for capturing concurrent execution and constructs specific to Web service implementation. A BPEL process definition consists of a set of interrelated activities. An activity is either a basic or a structured activity. Basic activities correspond to atomic actions such as: invoke, invoking an operation on a Web service; receive, waiting for a message from a partner; empty, doing nothing; etc. To enable the presentation of complex structures the following structured activities are defined: sequence, for defining an execution order; flow, for parallel routing; if, for conditional routing; pick, for race conditions based on timing or external triggers; while and repeatUntil, for structured looping; and scope, for grouping activities into blocks to which event, fault, and compensation handlers may be attached.

    An event handler is an event-action rule associated with a scope. It is enabled while the scope is under execution and may execute concurrently with the scope's main activity. When an occurrence of the event (a message receipt or a timeout) associated with an enabled event handler is registered, the body of the handler is executed. The completion of the scope as a whole is delayed until all active event handlers have completed. Fault and compensation handlers are designed for exception handling and are not used further in this article.


    In addition to these block-structured constructs, BPEL provides a construct known as control links which, together with the associated notions of join condition and transition condition, allow the definition of directed acyclic graphs of activities. A control link between activities A and B indicates that B cannot start before A has either completed or has been skipped. Moreover, B can only be executed if its associated join condition evaluates to true, otherwise B is skipped. This join condition is expressed in terms of the tokens carried by control links leading to B. These tokens may take either a positive (true) or a negative (false) value. An activity X propagates a token with a positive value along an outgoing link L if and only if X was executed (as opposed to being skipped) and the transition condition associated to L evaluates to true. Transition conditions are boolean expressions over the process variables (just like the conditions in an if activity). The process by which positive and negative tokens are propagated along control links, causing activities to be executed or skipped, is called dead path elimination. A control link is always defined inside a flow activity. In the definition of our mapping, it is convenient to differentiate flow activities that have control links attached to them from those that do not. Accordingly, we use the term link-based flow (or link-flow for short) to refer to a flow activity that has at least one control link directly attached to it.
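
The token propagation described above can be illustrated with a small simulation. The following is only a sketch under simplifying assumptions, not part of the translation technique itself: the function name `dead_path_elimination` and its parameters are hypothetical, transition conditions are given as already-evaluated booleans, activities without incoming links are assumed to be executed, and an activity without an explicit join condition is assumed to use the disjunction of its incoming tokens.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def dead_path_elimination(activities, links, trans_cond, join_cond=None):
    """Return a dict mapping each activity to True (executed) or False (skipped).

    activities: iterable of activity names (hypothetical labels)
    links:      list of (source, target) control links forming an acyclic graph
    trans_cond: dict link -> bool, pre-evaluated transition conditions
                (links missing from the dict are assumed to evaluate to true)
    join_cond:  dict activity -> function over the list of incoming tokens
                (assumption: defaults to `any`, i.e. a disjunction)
    """
    join_cond = join_cond or {}
    incoming = {a: [] for a in activities}
    outgoing = {a: [] for a in activities}
    for link in links:
        src, tgt = link
        outgoing[src].append(link)
        incoming[tgt].append(link)

    # Visit each activity only after all sources of its incoming links.
    predecessors = {a: {src for src, _ in incoming[a]} for a in activities}
    token = {}      # link -> positive (True) or negative (False) token
    executed = {}   # activity -> executed (True) or skipped (False)
    for a in TopologicalSorter(predecessors).static_order():
        in_tokens = [token[link] for link in incoming[a]]
        if not in_tokens:
            executed[a] = True
        else:
            executed[a] = bool(join_cond.get(a, any)(in_tokens))
        # A positive token is sent only if the source was executed AND the
        # transition condition holds; a skipped activity sends negative tokens.
        for link in outgoing[a]:
            token[link] = executed[a] and trans_cond.get(link, True)
    return executed


activities = ["A", "B", "C", "D"]
links = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
trans_cond = {("A", "C"): False}

with_or_join = dead_path_elimination(activities, links, trans_cond)
with_and_join = dead_path_elimination(activities, links, trans_cond, {"D": all})
```

In this example, C is skipped because the transition condition on the link from A to C is false, and the negative token is propagated from C to D. Whether D is executed then depends on its join condition: it runs under a disjunction but is skipped under a conjunction.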

    The following is an abstract syntax of BPEL used in the rest of the article. Since BPEL process definitions consist primarily of nested activities, we choose to represent this abstract syntax using a functional notation. Note that we use a superscript seq for specifying an ordered list of elements, and set for a normal set of elements. A BPEL process is a (top-level) scope activity.

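    Definition 2.1 (Abstract Syntax of BPEL).

    \begin{align*}
    \mathit{event} &= \mathrm{msgReceipt}: \mathit{messageType} \mid \mathrm{alarm}: \mathit{timeSpec} \\
    \mathit{cond} &= \text{bool expression} \\
    \mathit{activity} &= \mathrm{invoke}: \mathit{messageType} \mid \mathrm{empty} \mid \mathrm{receive}: \mathit{messageType} \mid \mathrm{reply} \mid \mathrm{wait} \mid \mathrm{assign} \mid \mathrm{exit} \\
    &\mid \mathrm{sequence}: \mathit{activity}^{\mathit{seq}} \\
    &\mid \mathrm{if}: (\mathit{cond} \times \mathit{activity})^{\mathit{seq}} \\
    &\mid \mathrm{pick}: (\mathit{event} \times \mathit{activity})^{\mathit{set}} \\
    &\mid \mathrm{while}: \mathit{cond} \times \mathit{activity} \\
    &\mid \mathrm{repeatUntil}: \mathit{cond} \times \mathit{activity} \\
    &\mid \mathrm{flow}: \mathit{activity}^{\mathit{set}} \\
    &\mid \mathrm{link\text{-}flow}: \mathit{linksInfo} \times \mathit{activity}^{\mathit{set}} \\
    &\mid \mathrm{scope}: (\mathit{event} \times \mathit{activity})^{\mathit{set}} \times \mathit{activity} \\
    \mathit{linksInfo} &= \mathrm{STRUCT}(\mathrm{Links}: (\mathit{activity} \times \mathit{activity})^{\mathit{set}},\ \mathrm{TransCond}: (\mathit{link} \times \mathit{cond})^{\mathit{set}},\ \mathrm{JoinCond}: (\mathit{activity} \times \mathit{cond})^{\mathit{set}})
    \end{align*}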

    The abstract syntax introduces an abstract datatype for BPEL activities and defines a number of constructors for this type (one per type of activity). Some of these constructors are parameterized. For example, the sequence constructor takes as parameter a sequence of activities. The abstract syntax does not cover all constructs, but only those that are used in the rest of the article. Also, the syntax does not capture some syntactic constraints such as the fact that the set of control links in a process definition cannot form cycles. A more comprehensive abstract syntax for BPEL can be found in Ouyang et al. [2007].
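
To make the idea of an abstract datatype with one constructor per activity type more concrete, the following sketch encodes a small subset of the abstract syntax as Python dataclasses. It is an illustration under stated assumptions only: the class names, the choice of subset, the representation of conditions as plain strings, and the helper `nesting_depth` are all hypothetical and do not come from the article.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Union


# Basic activities (constructors with zero or one parameter).
@dataclass(frozen=True)
class Invoke:
    message_type: str


@dataclass(frozen=True)
class Receive:
    message_type: str


@dataclass(frozen=True)
class Empty:
    pass


# Structured activities (parameterized by other activities).
@dataclass(frozen=True)
class Sequence:
    activities: tuple[Activity, ...]            # "seq": ordered list


@dataclass(frozen=True)
class If:
    branches: tuple[tuple[str, Activity], ...]  # (cond x activity)^seq


@dataclass(frozen=True)
class Flow:
    activities: frozenset[Activity]             # "set": unordered


@dataclass(frozen=True)
class While:
    cond: str
    body: Activity


Activity = Union[Invoke, Receive, Empty, Sequence, If, Flow, While]


def nesting_depth(a: Activity) -> int:
    """Number of structured activities on the deepest path of the term."""
    if isinstance(a, (Invoke, Receive, Empty)):
        return 0
    if isinstance(a, (Sequence, Flow)):
        return 1 + max((nesting_depth(c) for c in a.activities), default=0)
    if isinstance(a, If):
        return 1 + max((nesting_depth(c) for _, c in a.branches), default=0)
    if isinstance(a, While):
        return 1 + nesting_depth(a.body)
    raise TypeError(f"unknown activity: {a!r}")


# A nested term: receive an order, run two checks in parallel, then branch.
process = Sequence((
    Receive("order"),
    Flow(frozenset({Invoke("checkStock"), Invoke("checkCredit")})),
    If((("approved", Invoke("ship")), ("true", Empty()))),
))
depth = nesting_depth(process)
```

Frozen dataclasses are used so that activities are hashable and can be placed in a `frozenset`, mirroring the set-valued parameter of the flow constructor.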


    Fig. 1. A set of BPMN elements covering the fundamental control-flow constructs in BPMN.

    2.2 Business Process Modeling Notation (BPMN)

    BPMN [OMG 2006] essentially provides a graphical notation for business process modeling, with an emphasis on control flow. It defines a Business Process Diagram (BPD), which is a kind of flowchart incorporating constructs tailored to business process modeling, such as AND-split, AND-join, XOR-split, XOR-join, and deferred (event-based) choice.

    A BPD is made up of BPMN elements as shown in Figure 1. There are objects and sequence flows. A sequence flow links two objects in a BPD and shows the control flow relation (i.e., execution order). An object can be an event, a task, or a gateway. An event may signal the start of a process (start event), the end of a process (end event), a message that arrives, or a specific time-date being reached during a process (intermediate message/timer event). A task is an atomic activity and stands for work to be performed within a process. There are seven task types: service, receive, send, user, script, manual, and reference. For example, a receive task is used when the process waits for a message to arrive from an external partner. Also, a task may be none of the previous types, which we refer to as a blank task. A gateway is a routing construct used to control the divergence and convergence of sequence flow. There are: parallel fork gateways for creating concurrent sequence flows, parallel join gateways for synchronizing concurrent sequence flows, data/event-based XOR decision gateways for selecting one out of a set of mutually exclusive alternative sequence flows where the choice is based on either the process data (data-based) or external events (event-based), and XOR merge gateways for joining a set of mutually exclusive alternative sequence flows into one sequence flow. In particular, an event-based XOR decision gateway must be followed by either receive tasks or intermediate events to capture race conditions based on timing or external triggers (e.g., the receipt of a message from an external partner). Also, for a data-based XOR decision gateway, a default flow (arrowed line with a diagonal slash) can be defined; it is taken only if none of the other outgoing conditional flows evaluates to true at runtime, thus ensuring that there is always one outgoing flow taken.
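
The role of the default flow at a data-based XOR decision gateway can be sketched as follows. This is a minimal illustration, not BPMN tooling: the function `take_outgoing_flow`, the flow labels, and the `amount` field are hypothetical, and the sketch assumes that conditional flows are evaluated in the order in which they are listed.

```python
from typing import Callable, Mapping, Sequence, Tuple

Data = Mapping[str, object]
Condition = Callable[[Data], bool]


def take_outgoing_flow(
    conditional_flows: Sequence[Tuple[str, Condition]],
    default_flow: str,
    data: Data,
) -> str:
    """Pick exactly one outgoing flow of a data-based XOR decision gateway."""
    # Assumption: conditions are tried in list order; the first true one wins.
    for target, condition in conditional_flows:
        if condition(data):
            return target
    # No conditional flow is true, so the default flow guarantees progress.
    return default_flow


flows = [
    ("approve", lambda d: d["amount"] < 1000),
    ("escalate", lambda d: d["amount"] >= 10000),
]
small = take_outgoing_flow(flows, "manual_review", {"amount": 500})
medium = take_outgoing_flow(flows, "manual_review", {"amount": 5000})
large = take_outgoing_flow(flows, "manual_review", {"amount": 20000})
```

Because the default flow is returned whenever every condition fails, the function always yields exactly one target, which is the property the default flow is meant to ensure.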

    The set of BPMN elements shown in Figure 1 can be used to build BPDs covering the fundamental control flows in BPMN. A BPD that is made up of this set of BPMN elements is hereafter referred to as a core BPD (see Definition 2.2). Other BPMN elements, such as task looping, exception flow, and subprocess invocation, are considered noncore elements. In this article, we focus on the mapping of core BPDs; the mapping of noncore elements onto BPEL does not entail additional challenges.

    Definition 2.2 (Core BPD). A core BPD is a tuple $\mathit{BPD} = (O, T, E, G, T^R, E^S, E^I, E^E, E^I_M, E^I_T, G^F, G^J, G^D, G^M, G^V, F, \mathit{Cond})$ where:

    - $O$ is a set of objects which is divided into disjoint sets of tasks $T$, events $E$, and gateways $G$,

    - $T^R \subseteq T$ is a set of receive tasks,

    - $E$ is divided into disjoint sets of start events $E^S$, intermediate events $E^I$, and end events $E^E$,

    - $E^I$ is divided into disjoint sets of intermediate message events $E^I_M$ and timer events $E^I_T$,

    - $G$ is divided into disjoint sets of parallel fork gateways $G^F$, join gateways $G^J$, data-based XOR decision gateways $G^D$, event-based decision gateways $G^V$, and XOR merge gateways $G^M$,

    - $F \subseteq O \times O$ is the control flow relation, that is, a set of sequence flows connecting objects,

    - $\mathit{Cond}: F \rightharpoonup B$ is a partial function mapping sequence flows emanating from data-based XOR decision gateways to conditions,$^1$ namely $\mathit{dom}(\mathit{Cond}) = F \cap (G^D \times O)$.

    The relation $F$ defines a directed graph with nodes (objects) $O$ and arcs (sequence flows) $F$. For any given node $x \in O$, input nodes of $x$ are given by $\mathit{in}(x) = \{y \in O \mid y\,F\,x\}$ and output nodes of $x$ are given by $\mathit{out}(x) = \{y \in O \mid x\,F\,y\}$.
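
As a small illustration of the input and output node sets, the sketch below represents the control flow relation as a set of pairs. The function names `inputs` and `outputs` and the node labels are hypothetical and chosen for this example only.

```python
def inputs(flows, x):
    """in(x): all nodes y such that (y, x) is a sequence flow."""
    return {y for (y, z) in flows if z == x}


def outputs(flows, x):
    """out(x): all nodes y such that (x, y) is a sequence flow."""
    return {z for (y, z) in flows if y == x}


# A start event, a parallel fork, two tasks, a parallel join, and an end event.
F = {
    ("start", "fork"),
    ("fork", "t1"),
    ("fork", "t2"),
    ("t1", "join"),
    ("t2", "join"),
    ("join", "end"),
}

fork_out = outputs(F, "fork")
join_in = inputs(F, "join")
start_in = inputs(F, "start")
```

The sizes of these sets give the indegree and outdegree of a node: here the fork has two output nodes, the join has two input nodes, and the start event has no input node.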

    Definition 2.2 allows for graphs that are unconnected, that have no start or end events, that contain objects without any input and output, etc. Therefore we need to restrict the definition to well-formed core BPDs.

    Definition 2.3 (Well-Formed Core BPD). A core BPD as defined in Definition 2.2 is well formed if relation $F$ satisfies the following requirements:

    - $\forall s \in E^S$, $\mathit{in}(s) = \emptyset \wedge |\mathit{out}(s)| = 1$, that is, start events have an indegree of zero and an outdegree of one,

    - $\forall e \in E^E$, $\mathit{out}(e) = \emptyset \wedge |\mathit{in}(e)| = 1$, that is, end events have an outdegree of zero and an indegree of one,

    - $\forall x \in T \cup E^I$, $|\mathit{in}(x)| = 1$ and $|\mathit{out}(x)| = 1$, that is, tasks and intermediate events have an indegree of one and an outdegree of one,

    - $\forall g \in G^F \cup G^D \cup G^V$: $|\mathit{in}(g)| = 1 \wedge |\mathit{out}(g)| > 1$, that is, fork and both types of decision gateways have an indegree of one and an outdegree of more than one,

    $^1$ $B$ is the set of all possible conditions. A condition is a Boolean function operating over a set of propositional variables. Note that we abstract from these variables in the control flow definition. We simply assume that a condition evaluates to true or false, which determines whether or not the associated sequence flow is taken during the process execution.


    - $\forall g \in G^J \cup G^M$, $|\mathit{out}(g)| = 1 \wedge |\mathit{in}(g)| > 1$, that is, join and merge gateways have an outdegree of one and an indegree of more than one,

    - $\forall g \in G^V$, $\mathit{out}(g) \subseteq E^I \cup T^R$, that is, event-based XOR decision gateways must be followed by intermediate events or receive tasks,

    - $\forall g \in G^D$, there exists an order $<$ which is a strict total order over the set of outgoing flows of $g$ (i.e., $\{g\} \times \mathit{out}(g)$), and for $x \in \mathit{out}(g)$ such that f{g}out(g )(f