An Introduction to Process Mining and Conformance Checking
Thomas Chatain
LSV, ENS Paris-Saclay, Cachan, [email protected]
Collaborations with:
Mathilde Boltenhagen, Josep Carmona, Boudewijn van Dongen
June 6, 2019
Introduction Process Discovery Conformance Checking Anti-alignments A Metric for Precision Implementation
Process Mining
Discovery of process models from real process executions
Input: Event Logs
Data recorded from process executions, e.g.:
- analyze usage of an e-commerce web site
- analyze medical processes in hospitals
- improve user interface
- detect deviant behavior
Output: Process Models
[Figure: process model of a payment transaction with the activities "open and register transaction", "check sender", "process cash payment", "process cheque payment", "process electronic payment", "check receiver", "transfer money", "notify and close transaction"]
Process Mining
- At the interface between
  - Data science
  - Business Process Management
  - Machine learning
  - Formal models: models used as representation for data
- Young and very active research domain
  - New conference ICPM
  - 50 submissions...
Many (Industrial) Process Mining Tools
- Celonis
- Disco
- Minit
- ProM
- ...
Event Logs and Data Extraction (1)
patient  activity           timestamp        doctor      age  cost
5781     make X-ray         23-1-2014:10.30  Dr. Jones   45   70.00
5541     blood test         23-1-2014:10.18  Dr. Scott   61   40.00
5833     blood test         23-1-2014:10.27  Dr. Scott   24   40.00
5781     blood test         23-1-2014:10.49  Dr. Scott   45   40.00
5781     CT scan            23-1-2014:11.10  Dr. Fox     45   1200.00
5833     surgery            23-1-2014:12.34  Dr. Scott   24   2300.00
5781     handle payment     23-1-2014:12.41  Carol Hope  45   0.00
5541     radiation therapy  23-1-2014:13.57  Dr. Jones   61   140.00
5541     radiation therapy  23-1-2014:13.08  Dr. Jones   61   140.00
(1) Acknowledgements to Wil van der Aalst
Projection onto the patient, activity, and timestamp columns:
patient  activity           timestamp
5781     make X-ray         23-1-2014:10.30
5541     blood test         23-1-2014:10.18
5833     blood test         23-1-2014:10.27
5781     blood test         23-1-2014:10.49
5781     CT scan            23-1-2014:11.10
5833     surgery            23-1-2014:12.34
5781     handle payment     23-1-2014:12.41
5541     radiation therapy  23-1-2014:13.57
5541     radiation therapy  23-1-2014:13.08
Events grouped by patient, then sorted by timestamp within each patient:
patient  activity           timestamp
5781     make X-ray         23-1-2014:10.30
5781     blood test         23-1-2014:10.49
5781     CT scan            23-1-2014:11.10
5781     handle payment     23-1-2014:12.41
5541     blood test         23-1-2014:10.18
5541     radiation therapy  23-1-2014:13.08
5541     radiation therapy  23-1-2014:13.57
5833     blood test         23-1-2014:10.27
5833     surgery            23-1-2014:12.34
Timestamps dropped, only the order of events per patient is kept:
patient  activity
5781     make X-ray
5781     blood test
5781     CT scan
5781     handle payment
5541     blood test
5541     radiation therapy
5541     radiation therapy
5833     blood test
5833     surgery
Each activity abbreviated to one letter (X = make X-ray, B = blood test, C = CT scan, P = handle payment, R = radiation therapy, S = surgery):
X, B, C, P, B, R, R, B, S
One trace per patient:
⟨X,B,C,P⟩
⟨B,R,R⟩
⟨B,S⟩
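The extraction steps shown on this slide (project, group by patient, sort by timestamp, abbreviate) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `extract_traces` and the `LETTER` mapping are choices made for this example, and the rows are the ones from the table above.

```python
from collections import defaultdict
from datetime import datetime

# Rows copied from the slide's table: (patient, activity, timestamp).
EVENTS = [
    (5781, "make X-ray", "23-1-2014:10.30"),
    (5541, "blood test", "23-1-2014:10.18"),
    (5833, "blood test", "23-1-2014:10.27"),
    (5781, "blood test", "23-1-2014:10.49"),
    (5781, "CT scan", "23-1-2014:11.10"),
    (5833, "surgery", "23-1-2014:12.34"),
    (5781, "handle payment", "23-1-2014:12.41"),
    (5541, "radiation therapy", "23-1-2014:13.57"),
    (5541, "radiation therapy", "23-1-2014:13.08"),
]

# One-letter abbreviations, as used on the slide.
LETTER = {"make X-ray": "X", "blood test": "B", "CT scan": "C",
          "handle payment": "P", "radiation therapy": "R", "surgery": "S"}


def extract_traces(events):
    """Group events by case (patient), order them by time, keep activity letters."""
    by_case = defaultdict(list)
    for patient, activity, stamp in events:
        when = datetime.strptime(stamp, "%d-%m-%Y:%H.%M")
        by_case[patient].append((when, LETTER[activity]))
    return {patient: tuple(letter for _, letter in sorted(rows))
            for patient, rows in by_case.items()}


traces = extract_traces(EVENTS)
```

Here the patient identifier plays the role of the case identifier; in other logs a different column (an order number, a session id) would be used for grouping.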
Process Discovery
Automatic construction of a model N from an event log L that represents a partial observation of a system S.
Log L:
⟨A,B,D,E,I⟩
⟨A,C,D,G,H,F,I⟩
⟨A,C,G,D,H,F,I⟩
⟨A,C,H,D,F,I⟩
⟨A,C,D,H,F,I⟩
[Figure: L −→ N, the log L is turned into a model N]
One Process Discovery Technique: Inductive Mining
Credits: Wil van der Aalst
Process Discovery: Several Solutions
Log:
⟨A,B,D,E,I⟩
⟨A,C,D,G,H,F,I⟩
⟨A,C,G,D,H,F,I⟩
⟨A,C,H,D,F,I⟩
⟨A,C,D,H,F,I⟩
Conformance Checking
Define quality criteria to evaluate models:
- N fits L if L ⊆ L(N)
- N is precise if L(N) \ L is small
- N generalizes L with respect to S if L(N) contains some unobserved behavior in L(S) \ L
- simplicity...
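When the model language is finite and small, the first two criteria reduce to plain set operations. The sketch below assumes a hypothetical finite language `MODEL_LANGUAGE` (it is not given on the slides) and writes traces as strings of activity letters.

```python
# Observed log L: the five traces used throughout the slides.
LOG = {"ABDEI", "ACDGHFI", "ACGDHFI", "ACHDFI", "ACDHFI"}

# Hypothetical finite model language L(N); an assumption made for this example.
MODEL_LANGUAGE = LOG | {"ACGHDFI"}


def fits(model_language, log):
    """N fits L if every observed trace is a run of the model: L is a subset of L(N)."""
    return log <= model_language


def unobserved_behavior(model_language, log):
    """Runs of the model that were never observed: L(N) minus L. Small means precise."""
    return model_language - log


is_fitting = fits(MODEL_LANGUAGE, LOG)
extra_runs = unobserved_behavior(MODEL_LANGUAGE, LOG)
```

For models with loops the language is infinite, so this direct comparison no longer works; that is one reason the later slides measure precision through distances instead.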
Conformance Checking: Example
Log:
⟨A,B,D,E,I⟩
⟨A,C,D,G,H,F,I⟩
⟨A,C,G,D,H,F,I⟩
⟨A,C,H,D,F,I⟩
⟨A,C,D,H,F,I⟩
[Figure: three candidate models, annotated as follows]
- first model: fitting, fairly precise, simple, generalizing
- second model: fitting, very imprecise, simple, generalizing
- third model: fitting, very precise, not simple, not generalizing
Measuring Precision – State of the Art
Log:
⟨a,b,c,d⟩
⟨a,c,b,e⟩
⟨a,f,g,h⟩
⟨a,b,i,b,c,d⟩
Alignment-based precision metrics [Adriansyah et al.]
- Build a representation $A_\Gamma(N, L)$ of the part of the behaviour of the model which is covered by the log
- Count escaping points in $A_\Gamma(N, L)$
Drawbacks of alignment-based precision:
- Short-sighted: only one step ahead of the log behavior is considered
- Non-monotonic: observing a new trace may unveil new imprecisions
Alignments
Alignment
Given a trace σ and a model N, an alignment is a full run u of N which minimizes its distance to σ.
Example: for trace ⟨a,f,c,h⟩, best alignment: ⟨a,f,g,h⟩
Important notion in process mining:
- for computing fitness and precision,
- for detecting deviations,
- for model enhancement techniques.
Anti-alignments and Precision
Anti-alignments – Motivation
Log L:
⟨A,C,D,G,H,F,I⟩
⟨A,C,G,D,H,F,I⟩
⟨A,C,D,H,F,I⟩
⟨A,C,H,D,F,I⟩
Motivation
In order to measure precision, find the run of N which is most misaligned with the log L.
Here: ⟨A,B,D,E,I⟩
Anti-alignments
Log L:
⟨A,C,D,G,H,F,I⟩
⟨A,C,G,D,H,F,I⟩
⟨A,C,D,H,F,I⟩
⟨A,C,H,D,F,I⟩
- L ⊂ Σ*: a log (set of traces) of an observed system
- N: a (labeled) Petri net model (constructed by process discovery)
Definition (Anti-alignment)
An (n,m)-anti-alignment of a model N w.r.t. a log L is a run γ ∈ L(N) such that
- |γ| ≤ n and
- for every σ ∈ L, dist(γ, σ) ≥ m.
Which distance dist?
Definition (Levenshtein's edit distance dist(γ, σ))
Number of letter replacements/deletions/insertions needed to edit γ to σ.
- Example: $\mathrm{dist}_{\mathrm{Levenshtein}}$(⟨ababababab⟩, ⟨bababababa⟩) = 2
Definition (Hamming distance)
For two traces $\gamma = \gamma_1 \dots \gamma_n$ and $\sigma = \sigma_1 \dots \sigma_n$ of the same length n, define
\[ \mathrm{dist}(\gamma, \sigma) \stackrel{\text{def}}{=} \bigl|\{\, i \in \{1, \dots, n\} \mid \gamma_i \neq \sigma_i \,\}\bigr| \]
Pad when the lengths differ.
- Example: $\mathrm{dist}_{\mathrm{Hamming}}$(⟨ababababab⟩, ⟨bababababa⟩) = 10
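Both distances are short to implement, and the two examples on this slide make a convenient check. The following is a minimal sketch: the padding convention (a padding symbol that differs from every activity) is an assumption, since the slide only says "pad".

```python
def levenshtein(u, v):
    """Edit distance with replacements, deletions, and insertions (row-by-row DP)."""
    prev = list(range(len(v) + 1))
    for i, a in enumerate(u, start=1):
        cur = [i]
        for j, b in enumerate(v, start=1):
            cur.append(min(prev[j] + 1,               # delete a
                           cur[j - 1] + 1,            # insert b
                           prev[j - 1] + (a != b)))   # keep or replace
        prev = cur
    return prev[-1]


PAD = None  # padding symbol, assumed to differ from every activity label


def hamming(u, v):
    """Number of positions where the traces differ, after padding the shorter one."""
    n = max(len(u), len(v))
    u = list(u) + [PAD] * (n - len(u))
    v = list(v) + [PAD] * (n - len(v))
    return sum(a != b for a, b in zip(u, v))


d_lev = levenshtein("ababababab", "bababababa")
d_ham = hamming("ababababab", "bababababa")
```

The example illustrates why the choice matters: a shift by one position is cheap for the edit distance (delete the first letter, append one at the end) but changes every position for the Hamming distance.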
Anti-alignments: Example
Log L:
⟨A,C,D,G,H,F,I⟩
⟨A,C,G,D,H,F,I⟩
⟨A,C,D,H,F,I⟩
⟨A,C,H,D,F,I⟩
(5,3)-anti-alignment: ⟨A,B,D,E,I⟩
NP-completeness
Lemma
The problem of existence of an (n,m)-anti-alignment is NP-complete (with n and m represented in unary).
Proof.
The problem is clearly in NP: checking that a run γ is an (n,m)-anti-alignment for a net N and a log L takes polynomial time.
For NP-hardness, reduction from the problem of reachability of a marking M in a safe acyclic Petri net N, known to be NP-complete [a].
[a] Cheng, A., Esparza, J., Palsberg, J.: Complexity results for safe nets. Theor. Comput. Sci. 147(1&2) (1995) 117–136
Anti-alignments to Measure Precision
- L ⊂ Σ*: a log (set of traces) of an observed system
- N: a (labeled) Petri net model (constructed by process discovery)
Anti-alignment-based precision metrics
\[ P_n(N, L) = 1 - \frac{\max_n(N, L)}{n} \]
with
- n: (in the order of) the maximal length for a trace in the log
- $\max_n(N, L)$: the largest m for which there exists an (n,m)-anti-alignment
Clearly, $\max_n(N, L) \in [0 \dots n]$, which implies $P_n(N, L) \in [0 \dots 1]$.
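For a model with only a handful of runs, the metric can be computed by brute force, which helps to get a feel for the definition. The sketch below makes two assumptions that are not stated on the slides: the finite set `RUNS` stands in for L(N), and the distance is Levenshtein's edit distance, with the distance to the log taken as the minimum over its traces. The tool presented later uses a SAT encoding rather than enumeration.

```python
def levenshtein(u, v):
    """Edit distance with replacements, deletions, and insertions."""
    prev = list(range(len(v) + 1))
    for i, a in enumerate(u, start=1):
        cur = [i]
        for j, b in enumerate(v, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (a != b)))
        prev = cur
    return prev[-1]


def dist_to_log(run, log):
    """Distance from a run to the closest trace of the log."""
    return min(levenshtein(run, trace) for trace in log)


def max_anti_alignment(runs, log, n):
    """Return (run, m): a run of length <= n with the largest distance m to the log."""
    candidates = [r for r in runs if len(r) <= n]
    best = max(candidates, key=lambda r: dist_to_log(r, log))
    return best, dist_to_log(best, log)


def precision(runs, log, n):
    """P_n(N, L) = 1 - max_n(N, L) / n."""
    _, m = max_anti_alignment(runs, log, n)
    return 1 - m / n


# Assumed finite set of runs of a model N (hypothetical, for illustration only).
RUNS = ["ABDEI", "ACDHFI", "ACHDFI", "ACDGHFI", "ACGDHFI", "ACGHDFI"]
# The five-trace log used on the slides.
LOG = ["ABDEI", "ACDGHFI", "ACGDHFI", "ACHDFI", "ACDHFI"]

witness, m = max_anti_alignment(RUNS, LOG, 7)
p7 = precision(RUNS, LOG, 7)   # 1 - 1/7
```

With these assumptions the only run not in the log, ⟨A,C,G,H,D,F,I⟩, is the best anti-alignment, at distance 1 from ⟨A,C,H,D,F,I⟩.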
Anti-alignments to Measure Precision – Exercise
Sort the models by decreasing precision.
For each model, find the best anti-alignment of length ≤ 7.
Log:
⟨A,B,D,E,I⟩
⟨A,C,D,G,H,F,I⟩
⟨A,C,G,D,H,F,I⟩
⟨A,C,H,D,F,I⟩
⟨A,C,D,H,F,I⟩
[Figure: three models N1, N2, N3]
- N1: anti-alignment ⟨A,C,G,H,D,F,I⟩, $P_7(N_1, L) = 0.857$
- N2: anti-alignment ⟨I,I,I,A,A,A,A⟩, $P_7(N_2, L) = 0$
- N3: no (7,1)-anti-alignment, $P_7(N_3, L) = 1$
Handling Models with Loops
A model with an executable loop has
- arbitrarily long runs
- runs arbitrarily far from any finite log
Drop the bound n, but penalize long runs when looking for the optimal.
\[ P_\epsilon(N, L) \stackrel{\text{def}}{=} 1 - \sup_{\gamma \in L(N)} \frac{\mathrm{dist}(\gamma, L)}{(1 + \epsilon)^{|\gamma|}} \]
with some ε ≥ 0 which is a parameter of this definition.
Monotonicity w.r.t. New Observations
Observing a new trace which happens to be already a run of the model can only increase the precision measure.
Theorem
For every N, L and for every σ ∈ L(N),
\[ P_n(N, L \cup \{\sigma\}) \geq P_n(N, L) \]
Hint: every (n,m)-anti-alignment for (N, L ∪ {σ}) is also an (n,m)-anti-alignment for (N, L).
Example
The traces are added to the log L one at a time, in this order:
⟨A,C,D,G,H,F,I⟩, ⟨A,C,G,D,H,F,I⟩, ⟨A,B,D,E,I⟩, ⟨A,C,D,H,F,I⟩, ⟨A,C,H,D,F,I⟩
Traces in L   Best anti-alignment   max_7(N, L)   P_7(N, L)
first 2       ⟨A,B,D,E,I⟩           4             3/7
first 3       ⟨A,C,H,D,F,I⟩         2             5/7
first 4       ⟨A,C,H,D,F,I⟩         2             5/7
all 5         ⟨A,C,G,H,D,F,I⟩       1             6/7
Monotonicity w.r.t. Model Language
Theorem
Given two models N1 and N2, if L(N1) ⊆ L(N2), then N1 is at least as precise as N2.
\[ L(N_1) \subseteq L(N_2) \implies P_n(N_1, L) \geq P_n(N_2, L) \]
Implementation
Formula $\Phi^n_m(N, L)$ states that γ is an (n,m)-anti-alignment:
- $\gamma = \lambda(t_1) \dots \lambda(t_n) \in L(N)$, and
- for every σ ∈ L, dist(γ, σ) ≥ m.
Encoding in SAT
$\Phi^n_m(N, L)$ is coded using the following Boolean variables:
- $\tau_{i,t}$ for i = 1 … n, t ∈ T means that transition $t_i = t$.
- $m_{i,p}$ for i = 0 … n, p ∈ P means that place p is marked in marking $M_i$ (safe Petri nets: Boolean variables)
- $\delta_{i,j,\sigma}$ to encode the distances dist(γ, σ).
Total size for the SAT encoding of the formula $\Phi^n_m(N, L)$:
\[ O\bigl(n \times |T| \times (|N| + m^2 \times |L|)\bigr) \]
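Encoding in SAT (1): $\gamma = \lambda(t_1) \dots \lambda(t_n) \in L(N)$
- Initial marking:
\[ \Bigl(\bigwedge_{p \in M_0} m_{0,p}\Bigr) \land \Bigl(\bigwedge_{p \in P \setminus M_0} \lnot m_{0,p}\Bigr) \]
- One and only one $t_i$ for each i:
\[ \bigwedge_{i=1}^{n} \bigvee_{t \in T} \Bigl(\tau_{i,t} \land \bigwedge_{t' \in T \setminus \{t\}} \lnot \tau_{i,t'}\Bigr) \]
- The transitions are enabled when they fire:
\[ \bigwedge_{i=1}^{n} \bigwedge_{t \in T} \Bigl(\tau_{i,t} \implies \bigwedge_{p \in {}^\bullet t} m_{i-1,p}\Bigr) \]
- Token game (for safe Petri nets):
\[ \bigwedge_{i=1}^{n} \bigwedge_{t \in T} \bigwedge_{p \in t^\bullet} (\tau_{i,t} \implies m_{i,p}) \]
\[ \bigwedge_{i=1}^{n} \bigwedge_{t \in T} \bigwedge_{p \in {}^\bullet t \setminus t^\bullet} (\tau_{i,t} \implies \lnot m_{i,p}) \]
\[ \bigwedge_{i=1}^{n} \bigwedge_{t \in T} \bigwedge_{p \in P,\ p \notin {}^\bullet t,\ p \notin t^\bullet} \bigl(\tau_{i,t} \implies (m_{i,p} \iff m_{i-1,p})\bigr) \]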
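The constraints above describe, step by step, the firing rule of a safe Petri net. The explicit simulation below follows the same three token-game cases and enumerates the firing sequences that a SAT solver would search for symbolically. It is a sketch only: the small net `NET`, its place and transition names, and the helper names `fire` and `runs` are made up for this example.

```python
# Hypothetical safe Petri net: transition -> (preset, postset).
NET = {
    "a": ({"p0"}, {"p1", "p2"}),
    "b": ({"p1"}, {"p3"}),
    "c": ({"p2"}, {"p4"}),
    "d": ({"p3", "p4"}, {"p5"}),
}
M0 = frozenset({"p0"})  # initial marking: the set of marked places


def fire(marking, t, net):
    """Return the marking reached by firing t, or None if t is not enabled."""
    pre, post = net[t]
    if not pre <= marking:      # enabled only if every input place is marked
        return None
    # Token game for safe nets: output places become marked, input places that
    # are not output places become unmarked, all other places are unchanged.
    return frozenset((marking - pre) | post)


def runs(net, m0, n):
    """All firing sequences of exactly n transitions starting from m0."""
    frontier = [((), m0)]
    for _ in range(n):
        next_frontier = []
        for seq, marking in frontier:
            for t in sorted(net):
                successor = fire(marking, t, net)
                if successor is not None:
                    next_frontier.append((seq + (t,), successor))
        frontier = next_frontier
    return [seq for seq, _ in frontier]


all_runs = runs(NET, M0, 4)
```

Explicit enumeration grows exponentially with concurrency, which is one motivation for handing the same constraints to a SAT solver instead.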
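Encoding in SAT (2): dist(γ, σ) ≥ m
- For Hamming distance: easy
- For Levenshtein's distance: use the same relations as the classical algorithm:
\[
\begin{aligned}
\mathrm{dist}(\langle u_1, \dots, u_i \rangle, \epsilon) &= i \\
\mathrm{dist}(\epsilon, \langle v_1, \dots, v_j \rangle) &= j \\
\mathrm{dist}(\langle u_1, \dots, u_{i+1} \rangle, \langle v_1, \dots, v_{j+1} \rangle) &=
\begin{cases}
\mathrm{dist}(\langle u_1, \dots, u_i \rangle, \langle v_1, \dots, v_j \rangle) & \text{if } u_{i+1} = v_{j+1} \\
1 + \min\bigl(\mathrm{dist}(\langle u_1, \dots, u_{i+1} \rangle, \langle v_1, \dots, v_j \rangle),\ \mathrm{dist}(\langle u_1, \dots, u_i \rangle, \langle v_1, \dots, v_{j+1} \rangle)\bigr) & \text{if } u_{i+1} \neq v_{j+1}
\end{cases}
\end{aligned}
\]
Encoding as a SAT formula using variables $\delta_{i,j,d}$:
$\delta_{i,j,d}$ = true means $\mathrm{dist}(\langle u_1 \dots u_i \rangle, \langle v_1 \dots v_j \rangle) \geq d$.
\[
\begin{aligned}
& \delta_{0,0,0} \land \bigwedge_{d>0} \lnot \delta_{0,0,d} && (1) \\
& \bigwedge_{d} \bigwedge_{i=0}^{n} (\delta_{i+1,0,d+1} \Leftrightarrow \delta_{i,0,d}) && (2) \\
& \bigwedge_{d} \bigwedge_{j=0}^{n} (\delta_{0,j+1,d+1} \Leftrightarrow \delta_{0,j,d}) && (3) \\
& \bigwedge_{d} \bigwedge_{i,j \text{ s.t. } u_{i+1} = v_{j+1}} \delta_{i+1,j+1,d} \Leftrightarrow \delta_{i,j,d} && (4) \\
& \bigwedge_{d} \bigwedge_{i,j \text{ s.t. } u_{i+1} \neq v_{j+1}} \delta_{i+1,j+1,d+1} \Leftrightarrow (\delta_{i+1,j,d} \land \delta_{i,j+1,d}) && (5)
\end{aligned}
\]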
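To see that rules (1) to (5) determine the threshold variables, one can fill the table of $\delta_{i,j,d}$ for two fixed traces and compare it with the distances given by the recurrence. The sketch below does that for concrete strings; in the actual encoding one of the traces is the unknown run γ, so the letter comparisons become formulas over the τ variables. Two points are assumptions of this sketch: $\delta_{i,j,0}$ is taken to be true everywhere (a distance is always ≥ 0), and the reference distance follows the recurrence exactly as displayed, whose minimum ranges over two cases (insertion and deletion).

```python
def threshold_table(u, v, max_d):
    """delta[i][j][d] is True iff dist(u[:i], v[:j]) >= d, filled via rules (1)-(5).

    Assumption of this sketch: delta[i][j][0] is True for all i, j.
    """
    n, k = len(u), len(v)
    # Initialisation also covers rule (1): at i = j = 0 only d = 0 holds.
    delta = [[[d == 0 for d in range(max_d + 1)] for _ in range(k + 1)]
             for _ in range(n + 1)]
    for i in range(n):                      # rule (2): second trace empty
        for d in range(max_d):
            delta[i + 1][0][d + 1] = delta[i][0][d]
    for j in range(k):                      # rule (3): first trace empty
        for d in range(max_d):
            delta[0][j + 1][d + 1] = delta[0][j][d]
    for i in range(n):
        for j in range(k):
            for d in range(max_d):
                if u[i] == v[j]:            # rule (4): equal last letters
                    delta[i + 1][j + 1][d + 1] = delta[i][j][d + 1]
                else:                       # rule (5): different last letters
                    delta[i + 1][j + 1][d + 1] = (delta[i + 1][j][d]
                                                  and delta[i][j + 1][d])
    return delta


def recurrence_distance(u, v):
    """Distance table computed directly from the recurrence shown on the slide."""
    dist = [[0] * (len(v) + 1) for _ in range(len(u) + 1)]
    for i in range(len(u) + 1):
        dist[i][0] = i
    for j in range(len(v) + 1):
        dist[0][j] = j
    for i in range(len(u)):
        for j in range(len(v)):
            if u[i] == v[j]:
                dist[i + 1][j + 1] = dist[i][j]
            else:
                dist[i + 1][j + 1] = 1 + min(dist[i + 1][j], dist[i][j + 1])
    return dist


u, v = "abcd", "acbe"
delta = threshold_table(u, v, len(u) + len(v))
dist = recurrence_distance(u, v)
```

The constraint dist(γ, σ) ≥ m then amounts to requiring the single variable for the full traces and threshold m to be true.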
Experiments: Alignments (showing averages)
Experiments: Anti-alignments
Experiments: Anti-alignments (Hamming distance)
benchmark     |P|  |T|  |L|   |A_L|  n   m   Φ^n_m(N,L)  min_m(N,L)  max_n(N,L)
prAm6         347  363  761   272    41  1   ✓           3           39
                                         5   ✓           7
                                     21  1   ✓           3           19
                                         5   ✓           7
                        1200  363    41  1   ✓           4           19
                                         5   ✓           8
                                     21  1   ✓           4           15
                                         5   ✓           8
BankTransfer  121  114  989   101    51  1   ✓           8           32
                                         10  ✓           17
                                     21  1   ✓           8           14
                                         10  ✓           17
                        2000  113    51  1   ✓           15          16
                                         10  ✓           37
                                     21  1   ✓           15          5
                                         10  ✗           37
Experiments: Multi-alignments
Conclusion
Anti-alignment
- Run of the model which maximizes its distance to the observed traces
- New metric for precision in process mining
  - monotonic w.r.t. new observations
Implementations
- DarkSider (using SAT encoding): www.lsv.ens-cachan.fr/~chatain/darksider
- Also available in ProM: www.promtools.org
SAT-based approach for conformance checking
- Very flexible
- Good for prototyping
- Efficiency depends a lot on precise problem and encoding
Thank you!