Slides of my presentation at ProM meeting at Technische Universiteit Eindhoven, 6 February 2012, Eindhoven, NL
Faculty of Economics and Business Administration Department of Management Information and Operations Management
Jan Claes for TUe 2012
Merging Event Logs in ProM
Jan Claes
Ghent University
http://processmining.ugent.be
Merging Event Logs
ProM plugin
[Diagram: Multiple event logs -> ? -> Merged event log]
Merging Event Logs
1. Find links
2. Merge chronologically
3. Add unlinked traces
4. Put in new log file
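The four steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the code of the ProM plugin: a trace is assumed to be a dict with an `id` and a list of `(timestamp, activity)` events, and `links` is assumed to be the result of step 1, a list of `(trace id in log 1, trace id in log 2)` pairs.

```python
from datetime import datetime

def merge_logs(log1, log2, links):
    """Merge two logs given the links found in step 1 (hypothetical data layout)."""
    by_id1 = {t["id"]: t for t in log1}
    by_id2 = {t["id"]: t for t in log2}
    merged, used1, used2 = [], set(), set()

    # Group the links per trace of the first log, so one trace can have several partners.
    partners = {}
    for id1, id2 in links:
        partners.setdefault(id1, []).append(id2)

    # Step 2: merge the events of linked traces chronologically.
    for id1, ids2 in partners.items():
        events = list(by_id1[id1]["events"])
        for id2 in ids2:
            events.extend(by_id2[id2]["events"])
            used2.add(id2)
        used1.add(id1)
        merged.append({"id": id1, "events": sorted(events, key=lambda e: e[0])})

    # Step 3: add the traces that were not linked to anything.
    merged.extend(t for t in log1 if t["id"] not in used1)
    merged.extend(t for t in log2 if t["id"] not in used2)

    # Step 4: the returned list stands in for the new log file.
    return merged

# Usage with made-up traces.
log1 = [{"id": "a", "events": [(datetime(2012, 2, 6, 9), "Create order")]}]
log2 = [{"id": "x", "events": [(datetime(2012, 2, 6, 8), "Receive request")]}]
merged = merge_logs(log1, log2, links=[("a", "x")])
```

The merged trace keeps the id of the trace from the first log; that naming choice is an assumption of this sketch.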
Approaches
Genetic Algorithm
• J. Claes, G. Poels, Integrating Computer Log Files for Process Mining: a Genetic Algorithm Inspired Technique, in CAiSE 2011 Workshops, LNBIP 83, 2011
Artificial Immune System
• J. Claes, G. Poels, Merging Computer Log Files for Process Mining: an Artificial Immune System Technique, in BPM 2011 Workshops, LNBIP 99, 2011
Rule Based
• J. Claes, G. Poels, Merging Event Logs for Process Mining: A Rule Based Merging Method and Rule Suggestion Algorithm, to be submitted in 2012
1. Genetic Algorithm
[Diagram: genetic algorithm cycle with the populations RANDPOP, SELPOP and MUTPOP; steps shown: selection (by fitness) and reproduction (cross-over, mutation)]
1. Genetic Algorithm
Fitness function
Sum of weighted factor scores per link
• Same trace id (STI_i)
• Trace order (TO_i), if all start events are in the first log
• Equal attribute values (EAV_i)
• Number of linked traces (NLT_i)
• Time distance (TD_i)
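A minimal sketch of such a fitness function is given below. The weights are placeholders (the slides do not give the actual values), and the factor scores per link are assumed to be computed elsewhere and passed in as plain numbers.

```python
# Hypothetical weights; the actual values are not given in the slides.
WEIGHTS = {"STI": 1.0, "TO": 1.0, "EAV": 1.0, "NLT": 1.0, "TD": 1.0}

def fitness(link_scores, weights=WEIGHTS):
    """Sum of weighted factor scores over all links of one candidate solution.

    link_scores holds one dict per link, mapping a factor name
    (STI, TO, EAV, NLT, TD) to the score of that factor for the link.
    """
    return sum(
        weights[factor] * score
        for scores in link_scores
        for factor, score in scores.items()
    )
```

For example, `fitness([{"STI": 1, "TD": 0.5}, {"EAV": 2}])` adds up the weighted scores of two links.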
1. Genetic Algorithm
Simplification
• Population size one
• Only mutations
Improvements
• More intelligent start population (not random)
• More intelligent mutations (improve at least one factor of the fitness function)
Attention
• Intensification vs. diversification
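With a population of one and only mutations, the search reduces to a loop like the following sketch. The acceptance rule (keep a mutation when it is not worse) and the parameter names `fitness`, `mutate` and `iterations` are assumptions for illustration; the slides do not specify them.

```python
import random

def improve(solution, fitness, mutate, iterations=1000, rng=None):
    """Population size one, only mutations (assumed acceptance: not worse)."""
    rng = rng or random.Random()
    best, best_score = solution, fitness(solution)
    for _ in range(iterations):
        candidate = mutate(best, rng)      # hypothetical mutation operator
        score = fitness(candidate)
        if score >= best_score:            # never accept a worse solution
            best, best_score = candidate, score
    return best, best_score
```

A loop that only accepts improvements intensifies the search and can get stuck in a local optimum, which is where the intensification vs. diversification trade-off comes in.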
2. Artificial Immune System
Immune cells (type B-cell)
Antibodies (receptor)
Antigen
2. Artificial Immune System
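[Diagram: artificial immune system cycle. Steps shown: initial population, clonal selection, hypermutation (mutations), receptor editing; the cycle is labelled affinity maturation. Populations shown: SEED, RAND POP, INITPOP, sorted POP, CLONEPOP, MUTPOP, EDITPOP, with HIGH and LOW fitness markers]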
2. Artificial Immune System
Clonal selection
• Clone the fittest x% solutions (I)
Hypermutation
• Randomly change each clone
• The higher the fitness score, the fewer changes (I)
Receptor editing
• Take the best y% solutions (I)
• Add totally random solutions to the set (D)
(I: Intensification, D: Diversification)
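One generation of these three steps could look like the sketch below. The function and parameter names are hypothetical, `clone_frac` and `keep_frac` stand for x% and y%, and the way the number of changes depends on fitness (one extra mutation per rank position) is an assumption chosen only to illustrate "fitter means fewer changes".

```python
import random

def ais_generation(population, fitness, mutate, random_solution,
                   clone_frac=0.2, keep_frac=0.5, rng=None):
    """One generation: clonal selection, hypermutation, receptor editing."""
    rng = rng or random.Random()
    size = len(population)
    ranked = sorted(population, key=fitness, reverse=True)

    # Clonal selection: clone the fittest x% of the solutions (intensification).
    n_clones = max(1, int(size * clone_frac))
    clones = ranked[:n_clones]

    # Hypermutation: fitter clones (earlier in the ranking) get fewer changes.
    mutated = []
    for rank, clone in enumerate(clones):
        for _ in range(rank + 1):
            clone = mutate(clone, rng)
        mutated.append(clone)

    # Receptor editing: keep the best y% (intensification) and
    # refill the set with totally random solutions (diversification).
    pool = sorted(ranked + mutated, key=fitness, reverse=True)
    n_keep = max(1, int(size * keep_frac))
    survivors = pool[:n_keep]
    newcomers = [random_solution(rng) for _ in range(size - len(survivors))]
    return survivors + newcomers
```

Because the unmutated solutions stay in the pool, the best solution found so far is never lost between generations in this sketch.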
2. Artificial Immune System
Hypermutation
Choose ‘random’ indicator factor to improve
• Higher chance to pick factors with positive previous effect
Choose random action
• Add link, remove link or alter link
Choose random candidate
• From all solutions that would improve with selected action
Choose random improvement
• From all possible improvements for selected candidate
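The four random choices can be sketched as follows. The bookkeeping is hypothetical: `effect_counts` is assumed to count how often a mutation on each factor improved the fitness before, and `improvements` is assumed to be a precomputed, non-empty nested mapping from factor to action to candidate to a list of possible improvements.

```python
import random

def pick_factor(effect_counts, rng):
    """Pick a factor at random, favouring factors that helped before."""
    factors = list(effect_counts)
    # The +1 keeps every factor selectable, even without earlier successes.
    weights = [effect_counts[f] + 1 for f in factors]
    return rng.choices(factors, weights=weights, k=1)[0]

def hypermutate(improvements, effect_counts, rng=None):
    """Choose factor, action, candidate and improvement, each at random."""
    rng = rng or random.Random()
    factor = pick_factor(effect_counts, rng)
    actions = improvements.get(factor, {})
    if not actions:
        return None                                   # nothing to improve for this factor
    action = rng.choice(sorted(actions))              # add, remove or alter link
    candidates = actions[action]
    candidate = rng.choice(sorted(candidates))
    improvement = rng.choice(candidates[candidate])
    return factor, action, candidate, improvement
```

The weighted draw is what makes the first choice only ‘random’ in quotes: factors with a positive track record are picked more often, but no factor is ever excluded.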
3. Rule Based
Automatic merging is not transparent (how good is the merging result?)
Previous algorithms are (too) slow
My experience
• In most cases it is about finding an attribute value (literally) in a trace of the other log
• You need data experts/analysts to get the right data; they mostly have a good idea about the link between two log files
3. Rule Based
Semi-automatic solution
Let user configure merging rule based on attribute values
• More transparent
• Faster
• Includes expert knowledge if available
Help user by suggesting merging rules based on the data in the log
3. Rule Based
Merging rules
Merge all traces where…
    attribute <select name> from <select container> in the 1st log
    <select operator>
    attribute <select name> from <select container> in the 2nd log
E.g. Merge all traces where attribute Trace ID from a trace in the 1st log equals attribute Supplier Reference from event Send goods in the 2nd log
3. Rule Based
<select name>
• Contains all possible attribute names available in the log
<select container>
• From a trace
• From any event in a trace
• From a trace or any event in a trace
• From event X, From event Y, From event Z, …
<select operator>
• equals, is not equal, greater than, greater or equal, …
• comes before, comes after
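A merging rule built from these three selections could be represented as in the sketch below. The class names `Side` and `Rule`, the trace layout, and the choice that a rule matches as soon as any pair of selected values satisfies the operator are all assumptions for illustration, not the data model of the plugin.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical trace layout:
# {"attrs": {...}, "events": [{"name": "...", "attrs": {...}}, ...]}

@dataclass
class Side:
    name: str                     # <select name>
    container: str = "trace"      # "trace", "any event", "trace or any event" or "event"
    event: Optional[str] = None   # event name, used when container == "event"

    def values(self, trace):
        """Collect the attribute values found in the selected container."""
        found = []
        if self.container in ("trace", "trace or any event"):
            if self.name in trace["attrs"]:
                found.append(trace["attrs"][self.name])
        if self.container in ("any event", "trace or any event", "event"):
            for ev in trace["events"]:
                if self.container == "event" and ev["name"] != self.event:
                    continue
                if self.name in ev["attrs"]:
                    found.append(ev["attrs"][self.name])
        return found

@dataclass
class Rule:
    first: Side                            # side for the 1st log
    op: Callable[[object, object], bool]   # <select operator>, e.g. operator.eq
    second: Side                           # side for the 2nd log

    def matches(self, trace1, trace2):
        """True if any pair of selected values satisfies the operator."""
        return any(self.op(a, b)
                   for a in self.first.values(trace1)
                   for b in self.second.values(trace2))

# The example rule from the slides.
rule = Rule(Side("Trace ID"), operator.eq,
            Side("Supplier Reference", "event", "Send goods"))
```

Operators such as "is not equal" or "greater than" map onto `operator.ne` and `operator.gt` in the same way.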
3. Rule Based
Suggesting rules
• Look at all attribute values in the log
• Make a rule for every equal match in both logs
• Count the number of linked traces for every rule
• Filter out rules with only one link
• Sort such that a rule that is closer to a 1-to-1 match is higher in the list
  • rules that make more or fewer links are lower in the list
  • if no 1-to-1 rule exists, the ‘best’ rule is still on top
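The suggestion steps can be sketched as follows. Two simplifications are assumed: each trace is a flat dict of attribute name to value (event attributes are left out), and "closer to a 1-to-1 match" is measured as the distance between a rule's link count and the size of the smaller log. The slides do not define the ranking measure, so that part is a guess.

```python
from collections import defaultdict

def suggest_rules(log1, log2):
    """Suggest equals-rules as (attr1, attr2, number_of_links), best first."""
    # Index the second log once: value -> set of (attribute name, trace index).
    index = defaultdict(set)
    for j, trace in enumerate(log2):
        for attr, value in trace.items():
            index[value].add((attr, j))

    # One rule per pair of attribute names with at least one equal value.
    links = defaultdict(set)
    for i, trace in enumerate(log1):
        for attr1, value in trace.items():
            for attr2, j in index.get(value, ()):
                links[(attr1, attr2)].add((i, j))

    # Drop rules that link only one pair of traces.
    counted = [(a1, a2, len(pairs))
               for (a1, a2), pairs in links.items() if len(pairs) > 1]

    # Assumed ranking: distance between the link count and an ideal 1-to-1 count.
    ideal = min(len(log1), len(log2))
    return sorted(counted, key=lambda r: (abs(r[2] - ideal), r[0], r[1]))
```

Indexing the values of the second log avoids comparing every trace of one log with every trace of the other.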
3. Rule Based
Some remarks
• User can configure rules or select from the suggestion list
• Suggestion list is currently limited to equals-rules but is calculated very fast (order n1 + n2!)
• Rules can be combined with And or Or
• By explicitly selecting rules, the approach is more transparent
• Possible use as shortcut for merging logs from within one system
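Combining rules with And or Or can be sketched with small combinators. Here a rule is assumed to be any function that takes a trace from each log and returns True or False, and traces are assumed to be flat attribute dicts; the helper names `equals`, `all_of` and `any_of` are hypothetical.

```python
def equals(attr1, attr2):
    """Equals-rule: attr1 in the 1st log's trace equals attr2 in the 2nd log's trace."""
    return lambda t1, t2: attr1 in t1 and attr2 in t2 and t1[attr1] == t2[attr2]

def all_of(*rules):
    """Combine rules with And: every rule must link the two traces."""
    return lambda t1, t2: all(rule(t1, t2) for rule in rules)

def any_of(*rules):
    """Combine rules with Or: one matching rule is enough."""
    return lambda t1, t2: any(rule(t1, t2) for rule in rules)

# Two made-up equals-rules, combined both ways.
strict = all_of(equals("Order", "Order Ref"), equals("Customer", "Client"))
loose = any_of(equals("Order", "Order Ref"), equals("Customer", "Client"))
```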
Contact information
http://processmining.ugent.be
Twitter: @janclaesbelgium
Pav D8.a (until February 10)