IBM Zurich Research Laboratory | Business Integration Technologies
© 2008 IBM Corporation
The 6th International Conference on Service Oriented Computing (ICSOC 2008)
December 2008
Automatic Workflow Graph Refactoring and Completion
Jussi Vanhatalo, Hagen Völzer, Frank Leymann, Simon Moser
Workflow graphs
▪ A workflow graph represents the control flow of a business process model, e.g., one written in:
– Business Process Modeling Notation (BPMN)
– UML 2 Activity Diagrams
– Event-driven Process Chains (EPCs)
[Figure: example workflow graph with start node s, tasks Receive order, Process bank account payment, Process credit card payment, Prepare shipment, and Deliver product, gateways x1, x2, p1, p2, and end node e]
Research problem: How to complete a workflow graph?
▪ Use case: Complete a partial workflow graph at modeling time [Gschwind, Koehler, Wong, BPM 2008]
– Add exclusive and/or parallel gateways to join the open ends
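[Figure: the example workflow graph (tasks Receive order, Process bank account payment, Process credit card payment, Prepare shipment, Deliver product) with gateways x1, p1, x2, p2, start node s, and end node e]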
Soundness of a completion
▪ A completed workflow graph should be sound:
– No deadlock, and
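– No lack of synchronization
[Figure: the partial workflow graph G0 (nodes s, RO, p1, x1, PS, BA, CC; open ends e1, e2, e3) and three candidate completions G1, G2, and G3 that join the open ends with the gateways p2 and/or x2 before DP and the end node e; annotated "Deadlock" and "Lack of synchronization"]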
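The two soundness violations can be illustrated with a small state-space exploration. The following is a minimal sketch and not the authors' implementation. It assumes a simplified token semantics: AND gateways and tasks fire when every incoming edge carries a token, XOR gateways move one token from one incoming edge to one outgoing edge, and tokens rest on the edge into the end node. The function name `check_soundness` and the node kinds are hypothetical.

```python
from collections import deque

def check_soundness(nodes, edges):
    """nodes: dict name -> kind ("start", "end", "task", "and", "xor").
    edges: list of (source, target) pairs. Returns the set of problems found."""
    ins = {n: [i for i, (_, d) in enumerate(edges) if d == n] for n in nodes}
    outs = {n: [i for i, (s, _) in enumerate(edges) if s == n] for n in nodes}
    start = next(n for n, kind in nodes.items() if kind == "start")
    # Initial state: one token on each outgoing edge of the start node.
    init = tuple(1 if i in outs[start] else 0 for i in range(len(edges)))

    def successors(state):
        for n, kind in nodes.items():
            if kind in ("start", "end"):
                continue
            if kind == "xor":
                # Consume one token from one incoming edge, produce one on one outgoing edge.
                for i in ins[n]:
                    if state[i] > 0:
                        for o in outs[n]:
                            nxt = list(state)
                            nxt[i] -= 1
                            nxt[o] += 1
                            yield tuple(nxt)
            elif ins[n] and all(state[i] > 0 for i in ins[n]):
                # Task or AND gateway: consume from all inputs, produce on all outputs.
                nxt = list(state)
                for i in ins[n]:
                    nxt[i] -= 1
                for o in outs[n]:
                    nxt[o] += 1
                yield tuple(nxt)

    problems, seen, queue = set(), {init}, deque([init])
    while queue:
        state = queue.popleft()
        if max(state) > 1:
            # An edge carries more than one token; do not explore further.
            problems.add("lack of synchronization")
            continue
        nxts = list(successors(state))
        if not nxts and any(c and nodes[edges[i][1]] != "end" for i, c in enumerate(state)):
            # Nothing can fire, yet a token is stuck before the end node.
            problems.add("deadlock")
        for nxt in nxts:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return problems

# An XOR-split joined by an AND-join can never collect both tokens.
kinds = {"s": "start", "x1": "xor", "a": "task", "b": "task", "j": "and", "e": "end"}
flow = [("s", "x1"), ("x1", "a"), ("x1", "b"), ("a", "j"), ("b", "j"), ("j", "e")]
print(check_soundness(kinds, flow))  # → {'deadlock'}
```

Swapping the gateway types (an AND-split joined by an XOR-join) reports a lack of synchronization instead, and an XOR-split joined by an XOR-join reports no problem. Cutting the search at states with two tokens on one edge keeps the explored state space finite.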
Why would we benefit from an automatic technique?
▪ Other use cases:
– Local termination detection
– Runtime optimization
▪ Difficulties:
– A completion may be complicated
– Workflow graphs may be large
• Hundreds of edges and nodes
– The completion should be sound for execution
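[Figure: a larger partial workflow graph G0 (tasks a1 to a9; gateways p1 to p3 and x1 to x3; start node s; open ends e1 to e5) and its completion G1, which additionally contains the gateways x4, p4, p5, x5 and the single end node e]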
Outline
▪ Introduction to the research problem
▪ Automatic completion of workflow graphs
– The refined process structure tree (RPST)
– Fast completion technique
– General completion technique
– Hybrid completion technique
▪ Automatic refactoring of workflow graphs
▪ Conclusion
The refined process structure tree (RPST)
▪ An R-fragment is a set of edges that form a connected subgraph that has exactly one entry node and exactly one exit node.
▪ A canonical R-fragment does not overlap with any other R-fragment.
– Therefore, the canonical R-fragments form a hierarchy.
▪ The refined process structure tree is the tree of canonical R-fragments of a workflow graph G, such that the parent of a canonical R-fragment F is the smallest canonical R-fragment of G that properly contains F.
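[Figure: an example workflow graph G (nodes s, u, v, w, x, e; tasks a1 to a5) with its canonical R-fragments A to I marked, and its refined process structure tree]
[Vanhatalo, Völzer, Koehler, BPM 2008]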
Fast completion technique
[Figure: the fast completion technique applied to a partial workflow graph G0 (nodes s, u, v, w; tasks a1 to a3; open ends e1, e2, e3), via the intermediate graphs G1 and G2, yielding the completed graph G3 with the gateways x1 and x2 and the end node e. Steps: 1) simple completion and RPST, 2) R-fragment based node splitting, 3) determine gateway types]
▪ Fast: computed in linear time based on the refined process structure tree
▪ Incomplete: may fail to find a completion even when one exists
▪ Finds a user-friendly completion: each join of the completion is paired with a split
General completion technique
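[Figure: a partial workflow graph G0 (nodes s, u, v, w, x; tasks a1 to a5; open ends e1, e2, e3) and its completion G1, which additionally contains the nodes y and z and the end node e]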
▪ A complete technique: finds a completion if one exists
▪ Finds a compact completion
▪ Based on state space exploration: exponential worst-case complexity
General completion technique
[Figure: the partial workflow graph G0 and its completion G1 (repeated from the previous slide)]
1) Compute the final states; in the example, S1 = {e1, e3} and S2 = {e2, e3}
2) Compute a set of tests, where
– each test has exactly one end node from each final state, and
– each end node is in at least one test.
In the example, T1 = {e1, e2} and T2 = {e3}
3) Create an XOR-join for each test, and connect the end nodes of this test to this XOR-join
4) Create an AND-join, and connect the XOR-join of each test to this AND-join
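Steps 2 to 4 can be sketched in a few lines, taking the final states from step 1 as given. This is an illustrative sketch under stated assumptions, not the authors' algorithm. It enumerates candidate tests by brute force (exponential in the number of end nodes) and picks a cover greedily, so it makes no claim about compactness. The names `compute_tests`, `build_completion`, `xor_join_i`, `and_join`, and the new end node `e` are hypothetical.

```python
from itertools import combinations

def compute_tests(final_states):
    """final_states: list of sets of end nodes. Returns a list of tests, or None."""
    end_nodes = sorted(set().union(*final_states))
    # A candidate test contains exactly one end node of every final state.
    candidates = [
        set(combo)
        for size in range(1, len(end_nodes) + 1)
        for combo in combinations(end_nodes, size)
        if all(len(set(combo) & state) == 1 for state in final_states)
    ]
    # Greedy cover: keep a candidate only if it covers a new end node.
    tests, covered = [], set()
    for candidate in candidates:
        if not candidate <= covered:
            tests.append(candidate)
            covered |= candidate
    if covered != set(end_nodes):
        return None  # this sketch found no set of tests covering every end node
    return sorted(tests, key=sorted)

def build_completion(final_states):
    """Return (tests, new_edges) that join all end nodes into one end node 'e'."""
    tests = compute_tests(final_states)
    if tests is None:
        return None
    new_edges = []
    for i, test in enumerate(tests, start=1):
        xor_join = f"xor_join_{i}"  # step 3: one XOR-join per test
        new_edges += [(end, xor_join) for end in sorted(test)]
        new_edges.append((xor_join, "and_join"))  # step 4: XOR-joins feed the AND-join
    new_edges.append(("and_join", "e"))  # assumed: AND-join leads to the single end node
    return tests, new_edges

tests, new_edges = build_completion([{"e1", "e3"}, {"e2", "e3"}])
print(tests)  # → [{'e1', 'e2'}, {'e3'}] (element order within a set may vary)
```

For the final states of the example, the sketch reproduces the tests T1 = {e1, e2} and T2 = {e3}. In every final state exactly one end node of each test carries a token, so each XOR-join receives one token and the AND-join can fire.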
Existence of a completion
▪ Some workflow graphs may not have a (sound) completion
▪ We characterize the workflow graphs for which a completion exists
– We can determine this with an algorithm
[Figure: the partial workflow graph G0 with its completion G1 (additional nodes y, z, and e), and a second partial workflow graph G2 (nodes s, v, x; tasks a1 to a4; open ends e1, e2)]
Hybrid completion technique
▪ Combines the best of both worlds:
– Fast in practice
• Transforms the workflow graph using the fast technique
• The general technique is only applied locally
• With our library: at most 38 states instead of more than 100,000 states
– A complete technique
– Finds a user-friendly completion: join-split pairing, where applicable
[Figure: the workflow graph G1 (tasks a1 to a9; gateways p1 to p5 and x1 to x5; start node s; end node e) with fragments A, B, and C marked and a gateway labeled "?"]
The general completion vs. the hybrid completion
[Figure: two completions of the same workflow graph (tasks a1 to a9), labeled G2 and G1, with the captions "Result of the hybrid completion technique" and "Result of the general completion technique"]
Outline
▪ Introduction to the research problem
▪ Automatic completion of workflow graphs
▪ Automatic refactoring of workflow graphs
– Refactoring technique
– Other use cases of the refined process structure tree
▪ Conclusion
Automatic refactoring of workflow graphs
▪ A well-structured workflow graph has matching pairs of splits and joins
– Easier to comprehend [van der Aalst et al., 2002] and to analyze [Vanhatalo et al., ICSOC 2007]
▪ We present an RPST-based refactoring technique that splits nodes to make a workflow graph more well-structured
[Figure: the original workflow graph G (nodes s, u, v, w, x, e; tasks a1 to a6) and its well-structured version G*, in which u appears as u1 and u2 and x as x1 and x2]
Other use cases of the refined process structure tree
▪ Translating a graph-based process model (e.g., BPMN) into a block-based process model (e.g., BPEL) [Vanhatalo et al., BPM 2008]
▪ Speeding up control-flow analysis [Vanhatalo et al., ICSOC 2007]
▪ Pattern-based editing [Gschwind et al., BPM 2008]
▪ Process merging [Küster et al., BPM 2008]
▪ Understanding large process models
▪ Subprocess detection
Outline
▪ Introduction to the research problem
▪ Automatic completion of workflow graphs
▪ Automatic refactoring of workflow graphs
▪ Conclusion
Conclusions
▪ New completion techniques
1. The fast completion technique
• Based on the refined process structure tree
• Fast but incomplete
2. The general completion technique
• Complete but potentially slow
3. The hybrid completion technique
• Complete and faster than the general technique
▪ A new refactoring technique
– Transforms a workflow graph into a more well-structured form
Backup slides
Refactoring technique for making structure more explicit
▪ The RPST-based refactoring technique
– Splits nodes based on R-fragments
• If an R-fragment has more than one entry edge (or exit edge), then split the entry node (or the exit node) to normalize the R-fragment
▪ Results in a normal form that is, in general, more well-structured
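[Figure: the workflow graph G (nodes s, u, v, w, x, e; tasks a1 to a5) with the R-fragments D, E, and H, and its normal form G* with the fragments D*, E*, and H*, in which u appears as u1 and u2 and v as v1 and v2]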
Use Cases of a Completion Technique
▪ Complete a partial workflow graph at modeling time
▪ Runtime optimization:
– Execute models containing inclusive OR-joins faster
– Turn off the dead-path-elimination (DPE) in BPEL
▪ Local termination detection at the single end node
The Normal Process Structure Tree
▪ An N-fragment is a set of nodes that form a connected subgraph that has exactly one entry edge and exactly one exit edge.
▪ A canonical N-fragment does not overlap with any other N-fragment.
– Therefore, the canonical N-fragments form a hierarchy.
▪ The normal process structure tree is the tree of canonical N-fragments of a workflow graph G, such that the parent of a canonical N-fragment F is the smallest canonical N-fragment of G that properly contains F.
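[Figure: the example workflow graph G (nodes s, u, v, w, x, e; tasks a1 to a5) with its canonical N-fragments A, B, C, H, J, and K marked, and its normal process structure tree]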
The Refined Process Structure Tree
▪ An R-fragment is a set of edges that form a connected subgraph that has exactly one entry node and exactly one exit node.
▪ A canonical R-fragment does not overlap with any other R-fragment.
– Therefore, the canonical R-fragments form a hierarchy.
▪ The refined process structure tree is the tree of canonical R-fragments of a workflow graph G, such that the parent of a canonical R-fragment F is the smallest canonical R-fragment of G that properly contains F.
[Figure: the example workflow graph G with its canonical R-fragments A to I marked, its refined process structure tree, and the fragment F shown separately]
More Precisely:
▪ If anything inside a fragment F is executed, then
– the entry node was executed before, and
– the exit node will be executed afterwards
▪ A boundary node is an entry if
– all incoming edges are outside F, or
– all outgoing edges are inside F
▪ A boundary node is an exit if
– all incoming edges are inside F, or
– all outgoing edges are outside F
▪ A fragment F is a connected subgraph that has
– exactly two boundary nodes,
– one entry, and one exit
▪ [Tarjan and Valdes, 1980]
[Figure: examples of fragments F with entry u and exit v (containing the tasks a1 and a2, a1 alone, and a3 and a4), and one subgraph marked "Not a fragment!" because its boundary nodes are neither entries nor exits]
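The entry and exit conditions above translate directly into a check on a set of edges. The following is a minimal sketch, assuming that a boundary node is a node of F that is incident to an edge outside F or is a source or sink of the whole graph (the slide does not define this, so that part is an assumption). Conditions over an empty set of edges are treated as satisfied. The function name `entry_and_exit` is hypothetical.

```python
def entry_and_exit(edges, fragment):
    """edges: all (source, target) pairs of the graph; fragment: a subset of them.
    Returns (entry, exit) if the edge set is a fragment, otherwise None."""
    inside = set(fragment)
    outside = [e for e in edges if e not in inside]
    nodes_f = {n for e in inside for n in e}

    # The subgraph formed by the fragment must be connected (ignoring direction).
    adjacent = {n: set() for n in nodes_f}
    for s, d in inside:
        adjacent[s].add(d)
        adjacent[d].add(s)
    first = next(iter(nodes_f))
    reached, stack = {first}, [first]
    while stack:
        for m in adjacent[stack.pop()] - reached:
            reached.add(m)
            stack.append(m)
    if reached != nodes_f:
        return None

    def incoming(n):
        return [e for e in edges if e[1] == n]

    def outgoing(n):
        return [e for e in edges if e[0] == n]

    # Assumed definition of a boundary node (see the lead-in).
    boundary = sorted(
        n for n in nodes_f
        if any(n in e for e in outside) or not incoming(n) or not outgoing(n)
    )
    if len(boundary) != 2:
        return None

    def is_entry(n):
        return (all(e not in inside for e in incoming(n))
                or all(e in inside for e in outgoing(n)))

    def is_exit(n):
        return (all(e in inside for e in incoming(n))
                or all(e not in inside for e in outgoing(n)))

    a, b = boundary
    if is_entry(a) and is_exit(b):
        return (a, b)
    if is_entry(b) and is_exit(a):
        return (b, a)
    return None

graph = [("s", "u"), ("u", "a1"), ("u", "a2"), ("a1", "v"), ("a2", "v"), ("v", "e")]
print(entry_and_exit(graph, {("u", "a1"), ("a1", "v")}))  # → ('u', 'v')
print(entry_and_exit(graph, {("s", "u"), ("u", "a1")}))   # → None
```

In the first call, u is an entry because its only incoming edge lies outside F, and v is an exit because its only outgoing edge lies outside F. The second edge set has three boundary nodes (s, u, and a1), so it is not a fragment.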
References
▪ [HT73] J. Hopcroft and R. E. Tarjan. Dividing a graph into triconnected components. SIAM J. Comput., 2:135–158, 1973.
▪ [Val78] Jacobo Valdes Ayesta. Parsing flowcharts and series-parallel graphs. PhD thesis, Stanford, CA, USA, 1978.
▪ [TV80] Robert E. Tarjan and Jacobo Valdes. Prime subprogram parsing of a program. In POPL '80: Proceedings of the 7th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 95–105, New York, NY, USA, 1980. ACM.
▪ [JJP94] Richard Johnson, David Pearson, and Keshav Pingali. The program structure tree: Computing control regions in linear time. In Proceedings of the ACM SIGPLAN '94 Conference on Programming Language Design and Implementation (PLDI), pages 171–185, 1994.
▪ [Joh95] Richard Craig Johnson. Efficient program analysis using dependence flow graphs. PhD thesis, Ithaca, NY, USA, 1995.
▪ [GM00] Carsten Gutwenger and Petra Mutzel. A linear time implementation of SPQR-trees. In Joe Marks, editor, Graph Drawing, volume 1984 of Lecture Notes in Computer Science, pages 77–90. Springer, 2000.
▪ [VVL07] Jussi Vanhatalo, Hagen Völzer, and Frank Leymann. Faster and more focused control-flow analysis for business process models through SESE decomposition. In 5th International Conference on Service-Oriented Computing (ICSOC), volume 4749 of Lecture Notes in Computer Science, pages 43–55. Springer-Verlag Berlin Heidelberg, September 2007.