Data Conformance Checking using Optimal Alignments
Felix Mannhardt, Massimiliano de Leoni, Hajo A. Reijers
(a; {A = 3000; R = Michael; E = Pete}); (b; {V = OK; E = Sue}); (c; {I = 530; D = OK; E = Sue}); (f; {E = Pete});
Problem (Adapted from Massimiliano de Leoni)
PAGE 2 / 18
Activity d should have occurred, since amount < 5000
«Sue» is not authorized to perform b: she is not an Assistant
Activity h has not been executed, so D cannot be «OK»
(a; {A = 5001; R = Michael; E = Pete}); (b; {V = OK; E = Pete}); (c; {I = 530; D = NOK; E = Sue}); (f; {E = Pete});
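The three annotated violations can be restated as explicit checks. The sketch below is a minimal illustration, assuming the rules are exactly those named in the annotations. The role table `ASSISTANTS` (with Pete as an Assistant), the function name `violations` and the trace names are hypothetical, since the slides do not list the roles.

```python
# Trace = list of (activity, attributes) pairs, as written on the slide.
Trace = list[tuple[str, dict]]

# Assumption: Pete holds the Assistant role; the slides only say that Sue does not.
ASSISTANTS = {"Pete"}


def violations(trace: Trace) -> list[str]:
    """Return the rule violations named in the slide annotations."""
    found = []
    activities = [name for name, _ in trace]
    values = {}  # latest written value of each variable, in trace order
    for name, attrs in trace:
        values.update({k: v for k, v in attrs.items() if k != "E"})
        # Rule: activity b must be performed by an Assistant.
        if name == "b" and attrs.get("E") not in ASSISTANTS:
            found.append(f"{attrs.get('E')} is not authorized to perform b")
    # Rule: activity d must occur when the amount A is below 5000.
    if values.get("A", 0) < 5000 and "d" not in activities:
        found.append("d should have occurred")
    # Rule: decision D may only be OK if activity h was executed.
    if values.get("D") == "OK" and "h" not in activities:
        found.append("D cannot be OK without h")
    return found


log_trace = [
    ("a", {"A": 3000, "R": "Michael", "E": "Pete"}),
    ("b", {"V": "OK", "E": "Sue"}),
    ("c", {"I": 530, "D": "OK", "E": "Sue"}),
    ("f", {"E": "Pete"}),
]
repaired_trace = [
    ("a", {"A": 5001, "R": "Michael", "E": "Pete"}),
    ("b", {"V": "OK", "E": "Pete"}),
    ("c", {"I": 530, "D": "NOK", "E": "Sue"}),
    ("f", {"E": "Pete"}),
]
```

Under these assumed rules, `violations(log_trace)` reports all three problems, while `violations(repaired_trace)` returns an empty list.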
Department of Mathematics and Computer Science
How Does Data Alignment Work?
• Petri Nets with Data:
• Two new “Moves” with associated “Costs”:
  • Move with incorrect write operation
  • Move with missing write operation
• Formulation of an MILP problem for CF Alignment:
•
PAGE 3 / 18
[1] M. de Leoni, W. M. P. van der Aalst (2013). Aligning Event Logs and Process Models for Multi-Perspective Conformance Checking: An Approach Based on Integer Linear Programming.
[2] A. Adriansyah, B. F. van Dongen, W. M. P. van der Aalst (2011). Conformance checking using cost-based fitness analysis.
[Figure: Petri net with data. Transitions A, B, C and D; variable X; guards X < 5000 and X ≥ 5000.]
+𝑐𝑜𝑠𝑡 (𝑚𝑖𝑠𝑠𝑖𝑛𝑔)=𝜅𝐷𝐴
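To make the move types concrete, the following sketch sums per-move costs over an alignment. The unit cost values and the names `MoveKind`, `Move` and `alignment_cost` are illustrative assumptions, not the cost function used in [1].

```python
from dataclasses import dataclass
from enum import Enum


class MoveKind(Enum):
    SYNC = "move in both, correct write operations"
    LOG = "move in log"
    MODEL = "move in model"
    INCORRECT_WRITE = "move in both, incorrect write operation"
    MISSING_WRITE = "move in both, missing write operation"


# Assumed unit costs; a real cost function is chosen per process and activity.
COST = {
    MoveKind.SYNC: 0,
    MoveKind.LOG: 1,
    MoveKind.MODEL: 1,
    MoveKind.INCORRECT_WRITE: 1,
    MoveKind.MISSING_WRITE: 1,
}


@dataclass(frozen=True)
class Move:
    activity: str
    kind: MoveKind
    # Number of variables written incorrectly or not at all in this move.
    affected_variables: int = 1


def alignment_cost(moves: list[Move]) -> int:
    """Sum the cost of every move; write-related moves scale with the variable count."""
    total = 0
    for move in moves:
        if move.kind in (MoveKind.INCORRECT_WRITE, MoveKind.MISSING_WRITE):
            total += COST[move.kind] * move.affected_variables
        else:
            total += COST[move.kind]
    return total


example = [
    Move("a", MoveKind.INCORRECT_WRITE),  # a value the guards reject
    Move("b", MoveKind.SYNC),
    Move("d", MoveKind.MODEL),  # required by the model, absent from the log
]
```

With the assumed unit costs, `alignment_cost(example)` is 2: one incorrect write plus one move in model.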
Current Data Conformance Checker in ProM
PAGE 4 / 18
[Diagram: Data Conformance Checking. Input: Petri Net with Data, Event Log. Inside: Control Flow Alignment, Data Alignment, Cost. Output: Data Alignment.]
Shortcomings of the Current Solution
PAGE 5 / 18
(a; {A = 3000; R = Michael}); (b; {V = NOK}); (c; {I = 530; D = OK}); (f);
Perfect CF Alignment ()
L a b c f
P a b c f
(a; {A = 5001; R = Michael}); (b; {V = OK}); (c; {I = 530; D = NOK}); (f);
Resulting DF Alignment ()
(a; {A = 3000; R = Michael}); (b; {V = NOK}); (c; {I = 530; D = OK}); (f);
Better DF Alignment ()
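The data cost of the resulting alignment can be checked by hand: comparing the log trace with the aligned process trace shows which write operations were changed. The sketch below is a minimal illustration that assumes a unit cost per changed value; the function name `changed_writes` is hypothetical.

```python
def changed_writes(log_trace, process_trace):
    """List (activity, variable, logged value, aligned value) for each differing write.

    Assumes both traces have the same activities in the same order,
    as in a perfect control-flow alignment.
    """
    changes = []
    for (log_act, log_attrs), (proc_act, proc_attrs) in zip(log_trace, process_trace):
        assert log_act == proc_act, "traces must agree on control flow"
        for variable, logged in log_attrs.items():
            aligned = proc_attrs.get(variable)
            if aligned != logged:
                changes.append((log_act, variable, logged, aligned))
    return changes


log_trace = [
    ("a", {"A": 3000, "R": "Michael"}),
    ("b", {"V": "NOK"}),
    ("c", {"I": 530, "D": "OK"}),
    ("f", {}),
]
resulting = [
    ("a", {"A": 5001, "R": "Michael"}),
    ("b", {"V": "OK"}),
    ("c", {"I": 530, "D": "NOK"}),
    ("f", {}),
]

for change in changed_writes(log_trace, resulting):
    print(change)
```

Three values (A, V and D) differ, so under unit costs this alignment has a data cost of 3 even though its control-flow cost is 0.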
First Idea (Multi-Alignment Approach)
PAGE 6 / 18
[Diagram: Data Conformance Checking. Input: Petri Net with Data, Event Log. Inside: Control Flow Alignment, Data Alignment, Cost, a check “Optimal?” with branches Yes and No, a Cache, and the label OR. Output: Optimal Data Alignment.]
Image source: http://commons.wikimedia.org/wiki/File:Pictofigo_-_Idea.png
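One way to read the diagram is as a loop over control-flow alignments in order of increasing cost, stopping once no remaining candidate can beat the best total found so far. The sketch below illustrates that reading only. The names `candidates`, `data_cost` and `best_total_alignment` are hypothetical, the stopping rule is an assumption that relies on costs being non-negative, and the toy data is invented for the example.

```python
from typing import Callable, Hashable, Iterable, Optional, Tuple


def best_total_alignment(
    candidates: Iterable[Tuple[float, Hashable]],
    data_cost: Callable[[Hashable], float],
) -> Optional[Tuple[float, Hashable]]:
    """Pick the candidate minimising control-flow cost plus data cost.

    `candidates` must yield (cf_cost, alignment) sorted by ascending cf_cost.
    `data_cost` stands in for solving one MILP and must be non-negative.
    """
    best: Optional[Tuple[float, Hashable]] = None
    cache: dict = {}
    for cf_cost, alignment in candidates:
        # Later candidates cost at least cf_cost, so none can beat `best`.
        if best is not None and cf_cost >= best[0]:
            break
        if alignment not in cache:
            cache[alignment] = data_cost(alignment)
        total = cf_cost + cache[alignment]
        if best is None or total < best[0]:
            best = (total, alignment)
    return best


# Toy data: the cheapest control-flow alignment has the most expensive data repair.
toy_candidates = [(0, "abcf"), (1, "abdcf"), (2, "abdchf")]
toy_data_cost = {"abcf": 3, "abdcf": 1, "abdchf": 0}.get

result = best_total_alignment(toy_candidates, toy_data_cost)
```

In the toy run the second candidate wins with a total of 2, and the third is never evaluated because its control-flow cost alone already equals the best total.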
(a; {A = 3000; R = Michael}); (b; {V = NOK}); (c; {I = 530; D = OK}); (f);
Second Idea (Single-Alignment Approach)
PAGE 7 / 18
[Figure: search space over the trace prefixes <>, <a>, <a,b>, <a,b,c> and <a,b,c,f>. Edges are labelled Move in Log, Move in Model or Move in Both; nodes carry cost pairs such as (0,0), (1,0) and (0,2).]
Image source: http://commons.wikimedia.org/wiki/File:Pictofigo_-_Idea.png
Single-Alignment Approach I
• For each node in the search space
  • Compute a Data Alignment (MILP) for the prefix
  • Remember the resulting cost and the variable assignment
  • Use the variable assignment to check whether an MILP is needed
• A best-first search on the overall cost returns one optimal Data Alignment
  • Use of the ILP heuristic for A* [2] is still possible
  • The cost never gets better (no negative edges!)
  • But our search space is bigger!
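The search described above can be sketched with a priority queue ordered by overall cost. This is a minimal illustration, not the ProM implementation: `successors` and `prefix_data_cost` are hypothetical callbacks standing in for the move generator and the per-prefix MILP, no A* heuristic is used, and the toy graph is invented for the example.

```python
import heapq
import itertools
from typing import Callable, Hashable, Iterable, Optional, Tuple


def best_first(
    start: Hashable,
    is_goal: Callable[[Hashable], bool],
    successors: Callable[[Hashable], Iterable[Tuple[Hashable, float]]],
    prefix_data_cost: Callable[[Hashable], float],
) -> Optional[Tuple[float, Hashable]]:
    """Return (overall cost, goal state) with minimal overall cost, or None.

    Overall cost = accumulated control-flow cost + data cost of the prefix.
    Assumes move costs are non-negative and that the data cost never
    decreases when a prefix is extended.
    """
    tie = itertools.count()  # tie-breaker so states themselves are never compared
    queue = [(prefix_data_cost(start), next(tie), 0.0, start)]
    closed = set()
    while queue:
        overall, _, cf_cost, state = heapq.heappop(queue)
        if state in closed:
            continue
        closed.add(state)
        if is_goal(state):
            return overall, state
        for nxt, move_cost in successors(state):
            if nxt not in closed:
                new_cf = cf_cost + move_cost
                heapq.heappush(
                    queue, (new_cf + prefix_data_cost(nxt), next(tie), new_cf, nxt)
                )
    return None


# Toy search space: the free path via "x" needs an expensive data repair.
toy_edges = {
    "s": [("x", 0.0), ("y", 1.0)],
    "x": [("goal_x", 0.0)],
    "y": [("goal_y", 0.0)],
}
toy_data = {"x": 3.0, "goal_x": 3.0}  # states not listed need no data repair

result = best_first(
    "s",
    lambda state: state.startswith("goal"),
    lambda state: toy_edges.get(state, []),
    lambda state: toy_data.get(state, 0.0),
)
```

In the toy run the search prefers the path through `y` (overall cost 1) over the path through `x`, whose control-flow cost is 0 but whose data cost is 3.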
PAGE 8 / 18
Single-Alignment Approach: Search Space
PAGE 9 / 18
vs.
• Two states are equivalent iff
  • Same marking of “Event Net” & Process Model as in [2]
  • Same variable assignment w.r.t. all guards
(a; {A = 5001;R = Michael}); (b; {V = OK});(c; {I = 530;D = OK}); (c; {I = 530;D = NOK});(f);
(a; {A = 5001;R = Michael}); (b; {V = OK});(c; {I = 530;D = OK}); (c; {I = 530;D = NOK});(f);
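The equivalence test can be implemented by giving each state a hashable key. The sketch below is one possible reading, assuming that "same variable assignment w.r.t. all guards" means that every guard evaluates to the same truth value. The names `state_key` and `GUARDS`, the two guards themselves and the place name `p3` are illustrative.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Assignment = Dict[str, object]

# Hypothetical guards over the variables used in the example traces.
GUARDS: Dict[str, Callable[[Assignment], bool]] = {
    "A < 5000": lambda v: v.get("A", 0) < 5000,
    "D == OK": lambda v: v.get("D") == "OK",
}


def state_key(
    marking: FrozenSet[Tuple[str, int]], assignment: Assignment
) -> Tuple[FrozenSet[Tuple[str, int]], Tuple[bool, ...]]:
    """Key under which equivalent search states collide.

    `marking` is the combined marking of event net and process model,
    given as a frozenset of (place, token count) pairs.
    """
    guard_values = tuple(guard(assignment) for guard in GUARDS.values())
    return marking, guard_values


marking = frozenset({("p3", 1)})
key_1 = state_key(marking, {"A": 5001, "D": "OK"})
key_2 = state_key(marking, {"A": 7000, "D": "OK"})   # differs only in A, same guard outcome
key_3 = state_key(marking, {"A": 5001, "D": "NOK"})  # D flips a guard
```

Here `key_1` and `key_2` are equal, so the two states would be merged, while `key_3` stays distinct because the value of D changes a guard outcome.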
Comparison: Improvement of Fitness
[Bar chart: Change in Fitness (axis 0% to 60%), Average and Max, for the Insurance Institute log (12,000 traces) and the Synthetic Example (1,200 traces, length 4-15, 10% noise). Data labels: 2 Traces, 254 Traces.]
PAGE 10 / 18
Comparison: Dutch Insurance Institute
[Bar chart: Running Time in seconds (axis 0 to 120) for Old, Multi and Single.]
PAGE 11 / 18
Comparison: Dutch Insurance Institute [1]
PAGE 12 / 18
[Bar charts: # MILP for Multi and Single. Max panel: axis 0 to 120. Average panel: axis 0 to 4.5.]
Comparison: Dutch Insurance Institute
PAGE 13 / 18
[Bar charts: # Queued States for Old, Multi and Single. Average panel: axis 0 to 50. Max panel: axis 0 to 4,000. One data label: 64.]
Comparison: Synthetic Model (10% Noise)
[Bar chart: Running Time in seconds (axis 0 to 80) for Old, Multi and Single.]
PAGE 14 / 18
Comparison: Synthetic Model (10% Noise)
PAGE 15 / 18
[Bar charts: # MILP for Multi and Single. Max panel: axis 0 to 1,600. Average panel: axis 0 to 16.]
Comparison: Synthetic Model (10% Noise)
PAGE 16 / 18
[Bar charts: # Queued States for Old, Multi and Single. Average panel: axis 0 to 800, with one data label: 2666. Max panel: axis 0 to 180,000.]
Comparison Wrap-up
PAGE 17 / 18
• Multi-Alignment Approach
  • Building CF Alignments (sorted) up to a certain cost is not feasible for certain models/traces
  • Though faster in some cases (good fitness)
• Single-Alignment Approach
  • Again, solving many (smaller) MILPs
  • Integrated optimizations:
    − Check if guards are already fulfilled → no MILP
    − Check if only write operations are missing → no MILP
    − Re-use calculated Data Alignments → 1 x MILP
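The integrated optimizations amount to a guarded call to the solver. The sketch below is an illustration under simplifying assumptions: `solve_milp` is a hypothetical stand-in for the real MILP, guards are plain Python predicates, and the second optimization (only write operations missing) is omitted because it depends on details the slides do not give.

```python
from typing import Callable, Dict, Hashable, List


class DataAlignmentOracle:
    """Decide per search node whether the MILP has to be solved."""

    def __init__(self, solve_milp: Callable[[Hashable], float]):
        self._solve_milp = solve_milp
        self._cache: Dict[Hashable, float] = {}
        self.milp_calls = 0

    def data_cost(
        self,
        node: Hashable,
        assignment: Dict[str, object],
        guards: List[Callable[[Dict[str, object]], bool]],
    ) -> float:
        # Guards already fulfilled by the logged values: nothing to repair.
        if all(guard(assignment) for guard in guards):
            return 0.0
        # Re-use a previously calculated result for the same node.
        if node not in self._cache:
            self.milp_calls += 1
            self._cache[node] = self._solve_milp(node)
        return self._cache[node]


oracle = DataAlignmentOracle(solve_milp=lambda node: 1.0)  # dummy solver
below_5000 = [lambda v: v["A"] < 5000]

cost_ok = oracle.data_cost("n1", {"A": 3000}, below_5000)     # guard holds
cost_bad = oracle.data_cost("n2", {"A": 5001}, below_5000)    # needs the solver
cost_again = oracle.data_cost("n2", {"A": 5001}, below_5000)  # served from cache
```

Across the three calls the dummy solver runs only once, which `oracle.milp_calls` makes visible.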
What Next?
• Improve the implementation
  • Faster MILP solving by re-using the lpsolve instance?
  • Reduce memory footprint of both approaches?
• Will a decomposition of the process model help?
• Case study with an event log from an Italian local police force
  • Event log about the management of road-traffic fines
  • Process with multiple decision points
  • Process with non-trivial guards
  • Event log contains data attributes
• Submit Paper to FASE 2014
PAGE 18 / 18
Summary
• Current Data Alignment could be sub-optimal
• Two approaches for an optimal Data Alignment
  • Multi-Alignment Approach
  • Single-Alignment Approach
• Both implemented in ProM
  • Soon to be integrated in Data Aware Replayer
  • Which one to use depends on the case
LAST PAGE
[Diagram, Multi-Alignment Approach: repeated pairs of CF Alignment and MILP, followed by the check “Optimal?” and a Data Alignment.]
[Diagram, Single-Alignment Approach: Best First Search and a Data Alignment.]