Upload
others
View
5
Download
0
Embed Size (px)
Citation preview
99
Process Mining: Overview and Opportunities
WIL VAN DER AALST, Eindhoven University of Technology
Over the last decade, process mining emerged as a new research field that focuses on the analysis of pro-cesses using event data. Classical data mining techniques such as classification, clustering, regression, as-sociation rule learning, and sequence/episode mining do not focus on business process models and are oftenonly used to analyze a specific step in the overall process. Process mining focuses on end-to-end processesand is possible because of the growing availability of event data and new process discovery and conformancechecking techniques.
Process models are used for analysis (e.g., simulation and verification) and enactment by BPM/WFM sys-tems. Previously, process models were typically made by hand without using event data. However, activitiesexecuted by people, machines, and software leave trails in so-called event logs. Process mining techniquesuse such logs to discover, analyze, and improve business processes.
Recently, the Task Force on Process Mining released the Process Mining Manifesto. This manifesto issupported by 53 organizations and 77 process mining experts contributed to it. The active involvement ofend-users, tool vendors, consultants, analysts, and researchers illustrates the growing significance of processmining as a bridge between data mining and business process modeling. The practical relevance of processmining and the interesting scientific challenges make process mining one of the “hot” topics in BusinessProcess Management (BPM). This paper introduces process mining as a new research field and summarizesthe guiding principles and challenges described in the manifesto.
Categories and Subject Descriptors: H.2.8 [Database Management]: Database Applications—Data Min-ing
General Terms: Management, Measurement, Performance
Additional Key Words and Phrases: Process mining, business intelligence, business process management,data mining
ACM Reference Format:Van der Aalst, W.M.P. 2011. Process Mining: Overview and Opportunities. ACM Trans. Manag. Inform. Syst.99, 99, Article 99 (February 2012), 16 pages.DOI = 10.1145/0000000.0000000 http://doi.acm.org/10.1145/0000000.0000000
1. INTRODUCTIONProcess mining aims to discover, monitor and improve real processes by extractingknowledge from event logs readily available in today’s information systems [Aalst2011]. Over the last decade there has been a spectacular growth of event data and pro-cess mining techniques have matured significantly. As a result, management trendsrelated to process improvement and compliance can now benefit from process mining.
Starting point for process mining is an event log. Each event in such a log refers toan activity (i.e., a well-defined step in some process) and is related to a particular case(i.e., a process instance). The events belonging to a case are ordered and can be seenas one “run” of the process. Event logs may store additional information about events.
Author’s address: Department of Mathematics and Computer Science, Eindhoven University of Technology,PO Box 513, 5600 MB, Eindhoven, The Netherlands. [email protected] to make digital or hard copies of part or all of this work for personal or classroom use is grantedwithout fee provided that copies are not made or distributed for profit or commercial advantage and thatcopies show this notice on the first page or initial screen of a display along with the full citation. Copyrightsfor components of this work owned by others than ACM must be honored. Abstracting with credit is per-mitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any componentof this work in other works requires prior specific permission and/or a fee. Permissions may be requestedfrom Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212)869-0481, or [email protected]© 2012 ACM 0000-0001/2012/02-ART99 $10.00
DOI 10.1145/0000000.0000000 http://doi.acm.org/10.1145/0000000.0000000
ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.
99:2 W. van der Aalst
In fact, whenever possible, process mining techniques use extra information such asthe resource (i.e., person or device) executing or initiating the activity, the timestampof the event, or data elements recorded with the event (e.g., the size of an order).
software
system
(process)
model
event
logs
models
analyzes
discovery
records
events, e.g.,
messages,
transactions,
etc.
specifies
configures
implements
analyzes
supports/
controls
enhancement
conformance
“world”
people machines
organizations
components
business
processes
Fig. 1. The three basic types of process mining: (a) discovery, (b) conformance, and (c) enhancement.
Event logs can be used to conduct three types of process mining as shown in Fig. 1[Aalst 2011]. The first type of process mining is discovery. A discovery technique takesan event log and produces a model without using any a-priori information. Processdiscovery is the most prominent process mining technique. For many organizations itis surprising to see that existing techniques are indeed able to discover real processesmerely based on example behaviors stored in event logs. The second type of processmining is conformance. Here, an existing process model is compared with an event logof the same process. Conformance checking can be used to check if reality, as recordedin the log, conforms to the model and vice versa. The third type of process mining isenhancement. Here, the idea is to extend or improve an existing process model therebyusing information about the actual process recorded in some event log. Whereas con-formance checking measures the alignment between model and reality, this third typeof process mining aims at changing or extending the a-priori model. For instance, byusing timestamps in the event log one can extend the model to show bottlenecks, ser-vice levels, and throughput times.
Unlike traditional Business Process Management (BPM) techniques that use hand-made models [Weske 2007], process mining is based on facts. Based on observed be-havior recorded in event logs, intelligent techniques are used to extract knowledge.Therefore, we claim that process mining enables evidence-based BPM. Unlike existinganalysis approaches, process mining is process-centric (and not data-centric), truly in-telligent (learning from historic data), and fact-based (based on event data rather thanopinions).
Process mining is related to data mining. Whereas classical data mining techniquesare mostly data-centric [Hand et al. 2001], process mining is process-centric. Main-stream business process modeling techniques use notations such as the Business Pro-cess Modeling Notation (BPMN), UML activity diagrams, Event-driven Process chains(EPC), and various types of Petri nets [Aalst and Stahl 2011; Desel and Reisig 1998;Weske 2007]. These notations can be used model process processes with concurrency,choice, iteration, etc.
ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.
Process Mining: Overview and Opportunities 99:3
This paper introduces not only process mining as a new research field, but also fa-miliarizes the reader with the Process Mining Manifesto [TFPM 2011] released by theTask Force on Process Mining in October 2011. The growing interest in log-based pro-cess analysis motivated the establishment of a Task Force on Process Mining in 2009.This manifesto aims to promote the topic of process mining. Moreover, by defining a setof guiding principles and listing important challenges, this manifesto hopes to serveas a guide for software developers, scientists, consultants, business managers, and end-users. The goal is to increase the maturity of process mining as a new tool to improvethe (re)design, control, and support of operational business processes.
The remainder of this paper is organized as follows. Section 2 introduces the notionof an event log, used as input for process mining. Section 3 shows how process mod-els can be discovered from scratch using only raw event data. Section 4 discusses thesecond type of process mining: conformance checking. Section 5 elaborates on the thirdtype of process mining: enhancement. The guiding principles and challenges listed inthe manifesto are summarized in Section 6. Section 7 discusses tool support and showssome real-life examples. Section 8 concludes the paper.
2. EVENT LOGS AS A STARTING POINT FOR PROCESS MININGDigital event data is everywhere – in every sector, in every economy, in every organi-zation, and in every home – and will continue to grow exponentially [Manyika et al.2011]. The omnipresence of such data allows for new forms of process analysis, i.e.,based on observed facts rather than hand-made models. Starting point for processmining is an event log.
To introduce the basic process mining concepts we use the event log shown in Fig. 2(log is taken from Chapter 5 of [Aalst 2011]). This event log contains 1391 cases, i.e.,instances of some reimbursement process. There are 455 process instances followingtrace acdeh. Activities are represented by a single character: a = register request, b =examine thoroughly, c = examine casually, d = check ticket, e = decide, f = reinitiaterequest, g = pay compensation, and h = reject request. Hence, trace acdeh models areimbursement request that was rejected after a registration, examination, check, anddecision step. 455 cases followed this path consisting of five steps, i.e., the first line inthe table corresponds to 455× 5 = 2275 events. The whole log consists of 7539 events.
Note that events can have all kinds of additional attributes (timestamps, trans-actional information, resource usage, etc.). Consider for example one of the 1391 “a”events. Such an event refers to the execution of “register request” for some reimburse-ment. The event may have a timestamp, e.g., “23-01-2012:8.38”, and an attribute de-scribing the resources involved. Moreover, data attributes of the reimbursement (e.g.,name of customer or number of loyalty card) and data attributes of the registrationevent (e.g., the amount claimed or a booking reference) may have been recorded. Allsuch attributes can be used by process mining techniques. However, the backboneof process mining is the control-flow perspective. Therefore, for simplicity, events inFig. 2 are described by their activity names only. However, it is important to realizethat events can have various attributes, e.g., timestamps can be used for bottleneckanalysis and resource attributes can be used for organizational mining (e.g., findingallocation rules).
3. DISCOVERYThis section introduces the notion of process discovery, i.e., automatically constructmodels based on observed events.
ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.
99:4 W. van der Aalst
astart register
request
bexamine
thoroughly
cexamine casually
d
check ticket
decide
pay compensation
reject request
reinitiate request
e
g
h
f
end
M1
acdeh
abdeg
adceh
abdeh
acdeg
adceg
adbeh
acdefdbeh
adbeg
acdefbdeh
acdefbdeg
acdefdbeg
adcefcdeh
adcefdbeh
adcefbdeg
acdefbdefdbeg
adcefdbeg
adcefbdefbdeg
adcefdbefbdeh
adbefbdefdbeg
adcefdbefcdefdbeg
455
191
177
144
111
82
56
47
38
33
14
11
9
8
5
3
2
2
1
1
1
# trace
1391
astart register
request
bexamine
thoroughly
cexamine casually
dcheck ticket
decide
pay compensation
reject request
reinitiate request
e
g
h
f2
end
M2
f1reinitiate request
p1
p2
p3
p4
p5
p1
p2
p3
Fig. 2. One event log and two potential process models (M1 and M2) aiming to describe the observed be-havior.
3.1. Applications of Process DiscoveryOrganizations use procedures to handle cases. Sometimes such procedures are en-forced by the information system. However, in most cases, procedures are informaland may not have been documented at all. Moreover, even when procedures have beendocumented, reality may be very different. Therefore, it is important to discover theactual processes using event data. The discovered process models may be used
— for discussing problems among stakeholders (to reach consensus; it is important tohave a shared view of the real processes),
— for generating process improvement ideas (seeing the actual process and its problemsstimulates re-engineering efforts),
— for model enhancement (e.g., bottleneck analysis, see Section 5), and— for configuring a WFM/BPM system (the discovered process model can serve as a
template).
3.2. Learning Process Models From Event LogsProcess discovery techniques produce process models based on event logs such as theone shown in Fig. 2. For example, the classical α-algorithm produces model M1 forthis log. This process model is represented as a Petri net [Aalst and Stahl 2011; Deseland Reisig 1998]. A Petri net consists of places (start , p1, p2, p3, p4, p5, and end ) andtransitions (a, b, c, d, e, f , g, and h). Transitions may be connected to places and placesmay be connected to transitions. It is not allowed to connect a place to a place or atransition to a transition.
ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.
Process Mining: Overview and Opportunities 99:5
The state of a Petri net, also referred to as marking, is defined by the distribution oftokens over places. A transition is enabled if each of its input places contains a token.For example, in M1, transition a is enabled in the initial marking of M1, because theonly input place of a contains a token (black dot).
An enabled transition may fire thereby consuming a token from each of its inputplaces and producing a token for each of its output places. Firing a in the initial mark-ing corresponds to removing one token from start and producing two tokens (one forp1 and one for p2). After firing a, three transitions are enabled: b, c, and d. There isa non-deterministic choice between b and d. Firing b will disable c because the tokenis removed from the shared input place (and vice versa). Transition d is concurrentwith b and c, i.e., it can fire without disabling another transition. Transition e becomesenabled after d and b or c have occurred. Note that transition e in M1 is only enabledif both input places (p3 and p4) contain a token. After executing e, three transitionsbecome enabled: f , g, and h. These transitions are competing for the same token thusmodeling a choice. When g or h is fired, the process ends with a token in place end. Iff is fired, the process returns to the state just after executing a.
It is easy to check that all traces in the event log can be reproduced by M1. This doesnot hold for the second process model in Fig. 2. M2 is able to reproduce traces suchas acdeh (455 instances), abdeg (191 instances), and acdefbdeh (33 instances). Note thatM2 has two transitions corresponding to activity f . To refer to them they are named f1and f2. M2 also allows for behavior very different from what can be observed in the log,e.g., abeg and abdddddf1 bddddeh are possible according to the model but do not appear inthe log. There are also traces in the log that cannot be replayed by M2, e.g., adceh (177instances), adceg (82 instances), and adcefcdeh (9 instances) are not possible accordingto M2.
The two process models in Fig. 2 are visualized in terms of Petri nets. In fact, bothmodels are so-called WF-nets [Aalst et al. 2011]. A WF-net is a Petri net with onesource place and one sink place such that all places and transitions are on a path fromsource to sink. Both models in Fig. 2 have a source place named start and a sink placeend and all nodes are on a path from start to end.
In general, the notation used to visualize the result may be very different from therepresentation used during the actual discovery process. All mainstream BPM nota-tions (Petri nets, EPCs, BPMN, YAWL, UML activity diagrams, etc.) can be used toshow discovered processes such as M1 [Aalst 2011; Weske 2007].
3.3. Process Discovery AlgorithmsSince the mid-nineties several groups have been working on techniques for automatedprocess discovery based on event logs [Aalst et al. 2004; Aalst et al. 2007; Agrawalet al. 1998; Cook and Wolf 1998; Datta 1998; Dongen and Aalst 2004; 2005; Grecoet al. 2006; Weijters and Aalst 2003]. In [Aalst et al. 2003] an overview is given of theearly work in this domain. The idea to apply process mining in the context of work-flow management systems was introduced in [Agrawal et al. 1998]. In parallel, Datta[Datta 1998] looked at the discovery of business process models. Cook et al. investi-gated similar issues in the context of software engineering processes [Cook and Wolf1998]. Herbst [Herbst 2000] was one of the first to tackle more complicated processes,e.g., processes containing duplicate tasks.
Most of the classical approaches have problems dealing with concurrency. The α-algorithm [Aalst et al. 2004] is an example of a simple technique that takes concur-rency as a starting point. The α-algorithm scans the event log for particular patterns.For example, if activity a is followed by b but b is never followed by a, then it is assumedthat there is a causal dependency between a and b. To reflect this dependency, the cor-responding Petri net should have a place connecting a to b. We use the notation, a > b
ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.
99:6 W. van der Aalst
if and only if there is a trace σ = 〈t1, t2, t3, . . . tn〉 in the log and an i ∈ {1, . . . , n−1} suchthat ti = a and ti+1 = b. a → b if and only if a > b and b 6> a; a#b if and only if a 6> band b 6> a; and a‖b if and only if a > b and b > a. These four ordering relations are usedto create places connecting the different transitions in the Petri net. The α-algorithmis simple and efficient, but has problems dealing with complicated routing constructsand noise (like most of the other approaches described in literature).
Region-based approaches are able to express more complex control-flow structureswithout underfitting. State-based regions were introduced in 1989 [Ehrenfeucht andRozenberg 1989] and generalized in various ways [Cortadella et al. 1998]. In [Aalstet al. 2010; Dongen et al. 2007; Sole and Carmona 2010] it is shown how these state-based regions can be applied to process mining. In parallel, several authors appliedlanguage-based regions to process mining [Bergenthum et al. 2007; Werf et al. 2010].The basic idea of these approaches is to discover places. Note that the addition of placeslimits the behavior of the Petri net. The idea is to add places that do not exclude anyof the behavior seen in the event log.
For practical applications of process discovery it is essential that noise and incom-pleteness are handled well. Surprisingly, only few discovery algorithms focus on ad-dressing these issues. Notable exceptions are heuristic mining [Weijters and Aalst2003], fuzzy mining [Gunther and Aalst 2007], and genetic process mining [Medeiroset al. 2007].
ProM’s heuristic miner uses the algorithm described in [Weijters and Aalst 2003](see also Section 6.2 in [Aalst 2011]). The algorithm first builds a dependency graphbased on the frequencies of activities and the number of times one activity is followedby another activity. Based on predefined thresholds, dependencies are added to thedependency graph graph (or not). The dependency graph reveals the “backbone” of theprocess model. This backbone is used to discover the detailed split and join behaviorof nodes. If an activity has multiple input arcs, then the heuristic miner analyzes thelog to see whether the join is an AND-join, an XOR-join or an OR-join. In case of anOR-join, the detailed synchronization behavior is learned. If an activity has multipleoutput arcs, then the “split behavior” is learned in a similar fashion.
See Chapter 6 of [Aalst 2011] for a more elaborate introduction to the various processdiscovery approaches described in literature.
4. CONFORMANCEIn recent years, powerful process mining techniques have been developed that canautomatically construct a suitable process model given an event log. Whereas processdiscovery constructs a model without any a priori information (other than the eventlog), conformance checking uses a model and an event log as input. The model may havebeen made by hand or discovered through process discovery. For conformance checking,the modeled behavior and the observed behavior (i.e., event log) are compared.
4.1. Applications of Conformance CheckingConformance checking techniques relate events in the log to activities in the model,e.g., events are mapped to transition firings in the Petri net. This way it is possible tocompare the observed behavior in the event log and the modeled behavior. For exam-ple, one can quantify differences (e.g., “80% of the observed cases are possible accord-ing to the model”) and diagnose deviations (e.g., “in reality activity x is often skippedalthough the model does not allow for this”). Conformance checking can be used
— to check the quality of documented processes (asses whether they describe realityaccurately),
— to identify deviating cases and understand what they have in common,
ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.
Process Mining: Overview and Opportunities 99:7
— to identify process fragments where most deviations occur,— for auditing purposes,— to judge the quality of a discovered process model,— to guide evolutionary process discovery algorithms (e.g., genetic algorithms need to
continuously evaluate the quality of newly created models using conformance check-ing), and
— as a starting point for model enhancement.
The above list shows that conformance checking can be used for a variety of reasonsranging from evaluating a process discovery algorithm to auditing and compliancemonitoring. Note that auditors need to validate information about organizations bydetermining whether they execute business processes within certain boundaries set bymanagers, governments, and other stakeholders. Clearly, event logs provide valuableinput for this.
4.2. Diagnosing Differences Between Observed Behavior and Modeled BehaviorTypically, four quality dimensions for comparing model and log are considered: (a)fitness, (b) simplicity, (c) precision, and (d) generalization (Chapter 7 of [Aalst 2011]).
A model with good fitness allows for most of the behavior seen in the event log. Amodel has perfect fitness if all traces in the log can be replayed by the model frombeginning to end. Often fitness is described by a number between 0 (very poor fitness)and 1 (perfect fitness). Obviously, the simplest model that can explain the behaviorseen in the log is the best model. This principle is known as Occam’s Razor.
Fitness and simplicity alone are not sufficient to judge the quality of a discoveredprocess model. For example, it is very easy to construct an extremely simple Petri net(“flower model”) that is able to replay all traces in an event log (but also any other eventlog referring to the same set of activities). Similarly, it is often undesirable to have amodel that only allows for the exact behavior seen in the event log. Remember thatthe log contains only example behavior and that many traces that are possible maynot have been seen yet. (Note that in our simple example most traces are frequent, butoften there are many “one-of-a-kind” traces.)
A model is precise if it does not allow for “too much” behavior. Clearly, the “flowermodel” lacks precision. A model that is not precise is “underfitting”. Underfitting isthe problem that the model over-generalizes the example behavior in the log (i.e., themodel allows for behaviors very different from what was seen in the log).
A model should also generalize and not restrict behavior to just the examples seen inthe log. A model that does not generalize sufficiently is “overfitting”. Overfitting is theproblem that a very specific model is generated whereas it is obvious that the log onlyholds example behavior (i.e., the model explains the particular sample log, but a nextsample log of the same process may produce a completely different process model).
The four four quality dimensions for comparing model and log can be quantified invarious ways. See [Aalst 2011; Aalst et al. 2012; Adriansyah et al. 2011; Munoz-Gamaand Carmona 2011; Rozinat and Aalst 2008] for more details.
4.3. Conformance Checking AlgorithmsBasically, there are three approaches to conformance checking.
The first approach is to create an abstraction of the behavior in the log and an ab-straction of the behavior allowed by the model. An example is the notion of a footprintdescribed in Section 7.3 of [Aalst 2011]. A footprint is a matrix showing causal depen-dencies between activities. For example, the footprint of an event log may show that xis sometimes followed by y but never the other way around. If the footprint of the cor-responding model shows that x is never followed by y or that y is sometimes followed
ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.
99:8 W. van der Aalst
by x, then the footprints of event log and model disagree on the ordering relation of xand y.
The second approach replays the event log on the model. A naive approach to-wards conformance checking would be to simply count the fraction of cases that canbe “parsed completely” (i.e., the proportion of cases corresponding to firing sequencesleading from the initial state to the final state). This approach cannot distinguish be-tween an “almost fitting” case and a case that is completely unrelated to the modeledbehavior. A better approach is to continue replaying the event log on the model evenwhen transitions are not enabled. Simply “borrow tokens”, force the transition to fireanyway, and record the problem. In the end, the number of “borrowed tokens” and thenumber of “tokens left behind” (not consumed) indicate the fitness level. See [Rozinatand Aalst 2008] and Section 7.2 in [Aalst 2011].
The third, and most advanced, approach is to compute an optimal alignment betweeneach trace in the log and the most similar behavior in the model. Consider for examplethe following three alignments between the example log and model M2:
γ1 =a c d e ha c d e h
and γ2 =a d c e ha � c e h
and γ3 =a d c e f d b e ha � c e f2 � b e h
γ1 shows a perfect alignment: all moves of the trace in the event log (top part of align-ment) can be followed by moves of the model (bottom part of alignment). γ2 shows anoptimal alignment for trace adceh in the event log and model M2. The first move of thetrace in the event log can be followed by the model (event a). However, in the secondposition of the alignment, we see a move of the trace in the event log which cannotbe mimicked by the model. This move in just the log is denoted as (d,�). γ3 showsan optimal alignment for trace adcefdbeh in the event log and model M2. Here, we en-counter two situations where log and model cannot move together. Also note the move(f, f2), i.e., event f in the log corresponds to the execution of transition f2. Alignmentsγ2 and γ3 clearly show the reasons for non-conformance between model and log. Suchproblems can easily be quantified as shown in [Aalst et al. 2012; Adriansyah et al.2011].
Conformance can be viewed from two angles: (a) the model does not capture thereal behavior (“the model is wrong”) and (b) reality deviates from the desired model“the event log is wrong”). The first viewpoint is taken when the model is supposed tobe descriptive, i.e., capture or predict reality. The second viewpoint is taken when themodel is normative, i.e., used to influence or control reality.
5. ENHANCEMENTIt is also possible to extend or improve an existing process model using the event log. Anon-fitting process model can be corrected using the diagnostics provided by the align-ment of model and log. Moreover, event logs may contain information about resources,timestamps, and case data. For example, an event referring to activity “register re-quest” and case “992564” may also have attributes describing the person that regis-tered the request (e.g., “John”), the time of the event (e.g., “30-01-2012:14.55”), the ageof the customer (e.g., “45”), and the claimed amount (e.g., “650 euro”). After aligningmodel and log it is possible to replay the event log on the model. While replaying onecan analyze these additional attributes.
For example, it is possible to analyze waiting times in-between activities. Simplymeasure the time difference between causally related events and compute basic statis-tics such as averages, variances, and confidence intervals. This way it is possible toidentify the main bottlenecks [Aalst 2011].
Information about resources can be used to discover roles, i.e., groups of people fre-quently executing related activities. Here, standard clustering techniques can be used.
ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.
Process Mining: Overview and Opportunities 99:9
It is also possible to construct social networks based on the flow of work and analyze re-source performance (e.g., the relation between workload and service times). See [Songand Aalst 2008] for an overview of various process mining techniques analyzing theorganizational perspective based on event logs.
Standard classification techniques can be used to analyze the decision points in theprocess model [Rozinat and Aalst 2006]. For example, activity e (“decide”) has threepossible outcomes (“pay”, “reject”, and “redo”). Using the data known about the caseprior to the decision, we can construct a decision tree explaining the observed behavior.
Process mining is not restricted to offline analysis and can also be used for predic-tions and recommendations at runtime. For example, the completion time of a partiallyhandled customer order can be predicted using a discovered process model with timinginformation [Aalst et al. 2011].
6. PROCESS MINING MANIFESTOThe IEEE Task Force on Process Mining recently released a manifesto describing guid-ing principles and challenges [TFPM 2011]. The manifesto aims to increase the visibil-ity of process mining as a new tool to improve the (re)design, control, and support ofoperational business processes. It is intended to guide software developers, scientists,consultants, and end-users. Before summarizing the manifesto, we briefly introducethe task force.
6.1. Task Force on Process MiningThe growing interest in log-based process analysis motivated the establishment of theIEEE Task Force on Process Mining. The goal of this task force is to promote the re-search, development, education, and understanding of process mining. The task forcewas established in 2009 in the context of the Data Mining Technical Committee of theComputational Intelligence Society of the IEEE. Members of the task force include rep-resentatives of more than a dozen commercial software vendors (e.g., Pallas Athena,Software AG, Futura Process Intelligence, HP, IBM, Fujitsu, Infosys, and Fluxicon),ten consultancy firms (e.g., Gartner and Deloitte) and over twenty universities.
Concrete objectives of the task force are: to make end-users, developers, consultants,managers, and researchers aware of the state-of-the-art in process mining, to promotethe use of process mining techniques and tools, to stimulate new process mining ap-plications, to play a role in standardization efforts for logging event data, to organizetutorials, special sessions, workshops, panels, and to publish articles, books, videos,and special issues of journals. For example, in 2010 the task force standardized XES(www.xes-standard.org), a standard logging format that is extensible and supportedby the OpenXES library (www.openxes.org) and by tools such as ProM, XESame, Nitro,etc. See http://www.win.tue.nl/ieeetfpm/ for recent activities of the task force.
6.2. Guiding PrinciplesAs with any new technology, there are obvious mistakes that can be made when ap-plying process mining in real-life settings. Therefore, the six guiding principles listedin Table I aim to prevent users/analysts from making such mistakes. As an example,consider Guiding Principle GP4: “Events Should Be Related to Model Elements”. Itis a misconception that process mining is limited to control-flow discovery, other per-spectives such as the organizational perspective, the time perspective, and the dataperspective are equally important. However, the control-flow perspective (i.e., the or-dering of activities) serves as the layer connecting the different perspectives. There-fore, it is important to relate events in the log to activities in the model. Conformancechecking and model enhancement heavily rely on this relationship. Using Fig. 2, weshowed for example alignment γ3 which relates observed trace adcefdbeh to firing se-
ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.
99:10 W. van der Aalst
Table I. Six Guiding Principles Listed in the Manifesto
Event Data Should Be Treated as First-Class CitizensGP1 Events should be trustworthy, i.e., it should be safe to assume that the recorded events actu-
ally happened and that the attributes of events are correct. Event logs should be complete,i.e., given a particular scope, no events may be missing. Any recorded event should havewell-defined semantics. Moreover, the event data should be safe in the sense that privacyand security concerns are addressed when recording the event log.Log Extraction Should Be Driven by Questions
GP2 Without concrete questions it is very difficult to extract meaningful event data. Consider,for example, the thousands of tables in the database of an ERP system like SAP. Withoutquestions one does not know where to start.Concurrency, Choice and Other Basic Control-Flow Constructs Should be Sup-ported
GP3 Basic workflow patterns supported by all mainstream languages (e.g., BPMN, EPCs, Petrinets, BPEL, and UML activity diagrams) are sequence, parallel routing (AND-splits/joins),choice (XOR-splits/joins), and loops. Obviously, these patterns should be supported by pro-cess mining techniques.Events Should Be Related to Model Elements
GP4 Conformance checking and enhancement heavily rely on the relationship between elementsin the model and events in the log. This relationship may be used to “replay” the event logon the model. Replay can be used to reveal discrepancies between event log and model (e.g.,some events in the log are not possible according to the model) and can be used to enrichthe model with additional information extracted from the event log (e.g., bottlenecks areidentified by using the timestamps in the event log).Models Should Be Treated as Purposeful Abstractions of Reality
GP5 A model derived from event data provides a view on reality. Such a view should serve as apurposeful abstraction of the behavior captured in the event log. Given an event log, theremay be multiple views that are useful.Process Mining Should Be a Continuous Process
GP6 Given the dynamical nature of processes, it is not advisable to see process mining as a one-time activity. The goal should not be to create a fixed model, but to breathe life into processmodels such that users and analysts are encouraged to look at them on a daily basis.
quence acef2 beh inM2. After relating events to model elements, it is possible to “replay”the event log on the model [Aalst 2011]. Replay may be used to reveal discrepanciesbetween an event log and a model, e.g., some events in the log are not possible accord-ing to the model. Techniques for conformance checking can be used to quantify anddiagnose such discrepancies. Timestamps in the event log can be used to analyze thetemporal behavior during replay. Time differences between causally related activitiescan be used to add average/expected waiting times to the model. These examples illus-trate the importance of guiding principle GP4; the relation between events in the logand elements in the model serves as a starting point for different types of analysis.
6.3. ChallengesProcess mining is an important tool for modern organizations that need to managenon-trivial operational processes. On the one hand, there is an incredible growth ofevent data. On the other hand, processes and information need to be aligned perfectlyin order to meet requirements related to compliance, efficiency, and customer service.Despite the applicability of process mining there are still important challenges thatneed to be addressed; these illustrate that process mining is an emerging discipline.Table II lists the eleven challenges described in the manifesto [TFPM 2011].
As an example consider Challenge C4: “Dealing with Concept Drift”. The term con-cept drift refers to the situation in which the process is changing while being analyzed[Bose et al. 2011]. For instance, in the beginning of the event log two activities may beconcurrent whereas later in the log these activities become sequential. Processes maychange due to periodic/seasonal changes (e.g., “in December there is more demand” or“on Friday afternoon there are fewer employees available”) or due to changing condi-
ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.
Process Mining: Overview and Opportunities 99:11
Table II. Some of the Most Important Process mining Challenges Identified in the Manifesto
Finding, Merging, and Cleaning Event DataC1 When extracting event data suitable for process mining several challenges need to be ad-
dressed: data may be distributed over a variety of sources, event data may be incomplete,an event log may contain outliers, logs may contain events at different level of granularity,etc.Dealing with Complex Event Logs Having Diverse Characteristics
C2 Event logs may have very different characteristics. Some event logs may be extremely largemaking them difficult to handle whereas other event logs are so small that not enough datais available to make reliable conclusions.Creating Representative Benchmarks
C3 Good benchmarks consisting of example data sets and representative quality criteria areneeded to compare and improve the various tools and algorithms.Dealing with Concept Drift
C4 The process may be changing while being analyzed. Understanding such concept drifts is ofprime importance for the management of processes.Improving the Representational Bias Used for Process Discovery
C5 A more careful and refined selection of the representational bias is needed to ensure high-quality process mining results.Balancing Between Quality Criteria such as Fitness, Simplicity, Precision, andGeneralization
C6 There are four competing quality dimensions: (a) fitness, (b) simplicity, (c) precision, and (d)generalization. The challenge is to find models that score good in all four dimensions.Cross-Organizational Mining
C7 There are various use cases where event logs of multiple organizations are available foranalysis. Some organizations work together to handle process instances (e.g., supply chainpartners) or organizations are executing essentially the same process while sharing expe-riences, knowledge, or a common infrastructure. However, traditional process mining tech-niques typically consider one event log in one organization.Providing Operational Support
C8 Process mining is not restricted to off-line analysis and can also be used for online oper-ational support. Three operational support activities can be identified: detect, predict, andrecommend.Combining Process Mining With Other Types of Analysis
C9 The challenge is to combine automated process mining techniques with other analysis ap-proaches (optimization techniques, data mining, simulation, visual analytics, etc.) to extractmore insights from event data.Improving Usability for Non-Experts
C10 The challenge is to hide the sophisticated process mining algorithms behind user-friendlyinterfaces that automatically set parameters and suggest suitable types of analysis.Improving Understandability for Non-Experts
C11 The user may have problems understanding the output or is tempted to infer incorrectconclusions. To avoid such problems, the results should be presented using a suitable rep-resentation and the trustworthiness of the results should always be clearly indicated.
tions (e.g., “the market is getting more competitive”). Such changes impact processesand it is vital to detect and analyze them [Bose et al. 2011].
7. PROCESS MINING IN PRACTICEAlthough the manifesto lists many open challenges, existing process mining tech-niques can easily be applied in practice. At TU/e (Eindhoven University of Technology)we have applied process mining in over 100 organizations. To help the reader to getstarted with process mining, we briefly discuss tool support and show two case studiestaken from [Aalst 2011].
7.1. Tool SupportThe open-source tool ProM has been the de-facto standard for process mining duringthe last decade. Process discovery, conformance checking, social network analysis, or-ganizational mining, decision mining, history-based prediction and recommendation,
ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.
99:12 W. van der Aalst
etc. are all supported by ProM [Aalst 2011; Verbeek et al. 2010]. For example, dozensof different process discovery algorithms are supported by ProM. The functionality ofProM is unprecedented, i.e., there is no product offering a comparable set of processmining algorithms. However, the tool requires process mining expertise and is not sup-ported by a commercial organization. Hence, it has the advantages and disadvantagescommon for open-source software.
Fortunately, there is also a growing number of commercially available software prod-ucts offering process mining capabilities. Examples are: ARIS Process PerformanceManager (Software AG), Comprehend (Open Connect), Discovery Analyst (Stereo-LOGIC), Flow (Fourspark), Futura Reflect (Futura Process Intelligence), InterstageAutomated Process Discovery (Fujitsu), Process Discovery Focus (Iontas/Verint), Pro-cessAnalyzer (QPR), and Reflect|one (Pallas Athena).
All of the products mentioned support process discovery, i.e., constructing a processmodel based on an event log. For example, Futura Reflect supports genetic processmining as described in [Medeiros et al. 2007]. Some of the systems mentioned havedifficulties discovering concurrency, e.g., ARIS Process Performance Manager, Flow,and Interstage Automated Process Discovery. All systems take the timestamps in theevent log into account to be able to provide performance-related information, i.e., flowtimes and bottlenecks can be discovered.
None of the commercial software products provides comprehensive support for con-formance checking, i.e., the focus is on process discovery and performance measure-ment. However, ProM supports the different types of conformance checking describedin Section 4.3.
Some of these products embed process mining functionality in a larger system, e.g.,Pallas Athena embeds process mining in their BPM suite BPM|one. Other productsaim at simplifying process mining using an intuitive user interface.
7.2. Discovering “Spaghetti Processes”There is a continuum of processes ranging from highly structured processes (Lasagnaprocesses) to unstructured processes (Spaghetti processes). Figure 3 shows why un-structured processes are often called “Spaghetti processes”. The model was obtainedusing ProM’s heuristic miner [Weijters and Aalst 2003]. Hence, low frequent behaviorhas been filtered out. Nevertheless, the model is too difficult to comprehend. Note thatthis is not necessarily a problem of the discovery algorithm. Activities are only con-nected if they frequently followed one another in the event log. Hence, the complexityshown in Fig. 3 reflects reality and is not caused by the discovery algorithm.
Figure 3 is an extreme example used to illustrate the characteristics of a typicalSpaghetti process. Given the data set it is not surprising that the process is unstruc-tured; the 2765 patients did not form a homogeneous group and included individualswith very different medical problems. The process model in Fig. 3 can be simplifieddramatically by selecting a group of patients with similar problems or by selectingonly the most frequent activities. Nevertheless, its complexity exemplifies some of thechallenges mentioned in the manifesto (in particular C1, C2, C6, C10, and C11).
7.3. Analyzing “Lasagna Processes”Processes in municipalities are typically Lasagna processes. Figure 4 shows a so-called“WOZ process” discovered for a Dutch municipality. We applied the heuristic miner[Weijters and Aalst 2003] on an event log containing information about 745 objectionsagainst the so-called WOZ (“Waardering Onroerende Zaken”, i.e., Valuation of Real Es-tate) valuation. Dutch municipalities need to estimate the value of houses and apart-ments. The WOZ value is used as a basis for determining the real-estate property tax.The higher the WOZ value, the more tax the owner needs to pay. Therefore, Dutch mu-
ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.
Process Mining: Overview and Opportunities 99:13
B_Catheter a Demeure(start)2096
O_ECG dagelijks(schedule)
2191
0,996 1449
B_Halsinf./subclavia op OK(start)1294
0,998 755
O_ECG op aanvraag(schedule)
281
0,969 51
B_Drain(s) wond(start)167
0,917 48
B_Doorbewegen(start)129
0,981 56
B_Wondzorg open buik(start)
33
0,667 8
B_Nefrostomie catheter L(start)
7
0,667 4
O_Benzodiazepines(schedule)
1
0,5 1
O_CT-schedel(schedule)
30
0,667 9
B_Primo luchtmatras(start)
48
0,857 19
O_X arm(complete)
1
0,5 1
B_Supra Pubische blaascath(start)
23
0,5 2
B_Oogglazen(start)
3
0,5 2
B_Decubitus zorg stadium 1(start)
7
0,667 3
B_Decubitus zorg stadium 2a(start)
9
0,5 4
B_Ureter catheter L(start)
5
0,5 1
B_Decubitus zorg stadium 2b(start)
3
0,5 2
B_Halsinf./subclavia op Ok(start)772
B_Maagsonde(start)2430
0,992 657
B_Perifeer infuus(start)2837
0,936 2032
B_Wisselligging(start)306
0,958 78
C_-Asystolie(complete)
6
0,5 1
0,979 44
B_Bi-PAP(start)
6
0,5 2
B_Verwijderen Agraves(start)
5
0,5 1
B_IPPB(start)
8
0,5 2
B_Verband spalk(start)
7
0,5 3
B_Uro stoma(start)
12
0,5 2
B_Beademing(start)2187
0,982 1050
B_Catheter a demeure(start)534
0,998 532
B_Weanen(start)355
0,929 100
B_Tracheostomie - percutaan(start)
36
0,5 5
B_Reintubatie(complete)
73
0,8 20
B_Defibrilatie(complete)
12
0,5 2
B_Orthopaedische tractie(start)
2
0,5 1
C_Resp Insuff(complete)
6
0,5 3
B_Pacemaker inbrengen(complete)
7
0,5 3
B_Thoraxdrain(start)1863
0,999 1659
O_X-thorax dagelijks(schedule)
2308
0,962 414
B_Perifeer infuus 2(start)265
0,903 152
0,965 122
B_Swan Ganz op OK(start)117
0,958 25
B_Drain(s) redon(start)210
0,9 51
O_EMV score(schedule)
10
0,667 5
O_Echo nier blaas prostaat(schedule)
15
0,5 3
B_PCA pomp(start)
19
0,667 5
C_s1 Shock, Septisch(start)
24
0,5 4
O_Toxicologie(schedule)
2
0,5 1
O_Transthoracaal ECHO(schedule)
12
0,667 4
C_Decubitus stuit st. 3a(start)
1
0,5 1
C_Flebitis(start)
2
0,5 1
0,992 1718
B_Drain golf(start)
23
0,667 6
O_Pleura vocht kweek(schedule)
31
0,667 17
B_Pleura Punctie(complete)
3
0,5 2
C_Subcutaan emfyseem(complete)
1
0,5 1
0,969 2169
B_Basiszorg(start)2010
0,967 1169
O_Wegen 3x per week(schedule)
123
0,75 6
0,969 66
B_Beademing(complete)
1868
0,984 1564
B_Perifeer infuus(complete)
1573
0,909 1197
B_Arterie lijn op OK(complete)
1280
0,833 1024
M_MeasurementChemistry(complete)
19168
0,995 1716
O_ECG cito(schedule)
35
0,667 6
B_Medium care(start)768
0,889 195
B_Pacemaker standby(start)229
0,909 41
M_MeasurementDecubitus(complete)
824
0,923 130
B_Catheter epiduraal(start)170
0,975 56
O_Wond inspectie(schedule)
4
0,5 1
B_IABP in op OK(start)
56
0,889 15
O_CT thorax(schedule)
14
0,75 5
B_CAPD(start)
2
0,5 2
B_Arterie lijn op OK(start)2002
0,964 929
0,927 1518
B_Pacemaker AAN(start)158
0,95 33
C_Shock, Anaphylactisch(start)
6
0,5 1
B_Isolatie strikte(start)
4
0,5 1
B_Actief koelen(start)
2
0,5 1
C_Stridor(start)
2
0,5 1
C_Platzbauch(start)
4
0,5 1
M_MeasurementBloodGas(complete)
28252 1
21398
B_Actief warmte toevoegen(start)158
0,889 115
B_O2 masker/neusslang(start)1954
0,9 1359
C_Bacteriemie(start)
22
0,833 18
O_Bloedkweek 1(schedule)
412
0,968 326
B_Bi of Trilumen Catheter(start)101
0,8 49
C_-VT(start)
16
0,5 11
B_Arterie lijn op ICU(start)327
0,9 176
B_Perifeer infuus 2(complete)
143
0,889 106
C_-Asystolie(start)
16
0,7 10
B_Bronchiaal toilet(start)373
0,833 194
C_Trombopenie(start)
5
0,5 5
C_CVA(start)
13
0,923 13
C_Pneumonie (klinisch)(start)
4
0,5 2
O _ B EE(schedule)
291
0,995 250
C_ARDS(start)
12
0,75 9
C_Psychose/verward(start)
36
0,833 33
0,8 27
O_Wegen 3x per week(complete)
35
0,9 19
B_Cardioversie(start)
90
0,8 51
B_Bezoek: afw. tijden(complete)
40
0,8 32
O_Vancomycine dal / top(schedule)
30
0,667 25
B_Minitracheotomie(start)
4
0,667 4
B_Minitracheotomie(complete)
2
0,667 2
B_Medium care(complete)
390
0,969 320
B_Arterie lijn op ICU(complete)
184
0,803 158
C_Sufheid(start)
36
0,889 21
C_Anurie (<1ml/kg/24u)(start)
35
0,75 23
C_Ischemie, Myocard(start)
20
0,667 12
O_SDD keelkweek Ma/Do(schedule)
293
0,961 147
C_MI zeker(start)
46
0,875 41
B_Extubatie(start)202
0,974 161
B_Catheter a Demeure(complete)
150
0,861 83
B_Liescatheter(s)(start)
90
0,8 57
O_Wond kweek(schedule)
93
0,825 62
B_Halsinf./subclavia op IC(start)112
0,833 45
O_ECHO Buik(schedule)
28
0,643 18
O_EEG(schedule)
6
0,5 5
C_Bloeding waarvoor reOK(start)
48
0,75 38
O_Gentamycine dal / top(schedule)
122
0,875 95
C_Oligurie (< 5 ml/kg/24u)(start)
40
0,833 28
C_Beademingsafhankelijkheid(start)
27
0,875 19
B_Drain(s) sump(start)
5
0,667 3
O_CT-buik(schedule)
32
0,8 25
B_Bloedtoediening met druk(start)
5
0,667 5
B_Oogzalven / druppelen(complete)
56
0,8 41
B_Drain(s) wond(complete)
58
0,857 41
B_Fixateur Externe(start)
3
0,5 3
C_Hemi-beeld(start)
7
0,667 6
C_-VKF, atrium-flutter(complete)
52
0,75 31
C_DIS(start)
17
0,833 15
C_Resp Insuff(start)
82
0,75 62
B_Basiszorg(complete)
43
0,833 15
O_Pulmonalis angio(complete)
1
0,5 1
C_Febris e.c.i.(start)
6
0,667 6
O_Coronair angiogram(schedule)
6
0,667 3
B_PTCA(complete)
4
0,667 3
B_Liescatheter(s)(complete)
31
0,5 21
B_Vernevelaar(complete)
17
0,857 14
O_TEE(schedule)
84
0,833 44
C_Non oligurische nierinsuf(start)
13
0,75 10
B_Air fluid bed(complete)
28
0,955 27
B_Halsinf./subclavia op IC(complete)
50
0,667 35
C_Autoextubatie(start)
50
0,75 44
O_X been(schedule)
2
0,5 2
C_Pneumothorax(start)
31
0,8 22
B_Verpleegvorm boomstam(complete)
7
0,667 6
C_Para-valvulair lek na OK(start)
5
0,5 4
C_Bronchitis (klinisch)(start)
20
0,833 19
C_Acute Tubulus Necrose(start)
24
0,8 17
B_CVVH(complete)
55
0,679 44
B_Intermit. catheteriseren(complete)
16
0,909 14
C_Pancreatitis(complete)
1
0,5 1
C_Bronchitis -purulent(start)
13
0,8 13
B_Tracheostoma/Tube LOS(complete)
57
0,722 41
O_Kweek art. lijn(schedule)
14
0,5 7
B_Duo luchtmatras(complete)
57
0,762 38
C_Lijn sepsis(start)
9
0,667 8
O_Kweek liescatheter veneus(schedule)
10
0,625 6
C_Depressie(start)
1
0,5 1
B_Uritip(start)
3
0,5 2
O_ECG 3 x p.w.(complete)
10
0,667 9
B_Clysmeren(start)
14
0,667 9
B_IABP in op OK(complete)
53
0,75 38
C_MI mogelijk(start)
37
0,8 31
C_MI mogelijk(complete)
3
0,5 3
C_-SVT, paroxysmaal(start)
15
0,7 12
B_Low flow bed(start)
21
0,667 15
B_Low flow bed(complete)
10
0,8 10
B_Tracheostomie(start)
21
0,667 17
O_Kweek peritoneum(schedule)
7
0,667 3
O_Keel kweek(schedule)
19
0,75 12
C_Icterus (bili > 50 )(start)
7
0,75 5
O_Tobramycine dal / top(schedule)
19
0,667 15
C_s3 Shock, Hypovolaemisch(start)
7
0,75 7
O_Sigmoideoscopie(schedule)
3
0,5 1
C_Empyeem(start)
8
0,75 7
C_Urineweginfectie(start)
2
0,667 2
O_Echo perifere vaten(complete)
2
0,5 1
B_Buikligging(start)
18
0,667 13
B_Primo luchtmatras(complete)
14
0,857 9
C_Lekkage na plastiek(start)
5
0,667 3
C_Decubitus hak st. 2a(start)
3
0,5 2
C_-VF(start)
13
0,8 7
C_Hypoglycaemie(start)
25
0,8 20
B_Jejunumsonde(complete)
6
0,75 6
C_Hyperglycaemie >20mmol/l(start)
4
0,667 4
C_Subcutaan emfyseem(start)
7
0,667 5
C_Fistel bovenste tr dig(start)
3
0,5 1
C_Darmperforatie(start)
5
0,667 3
B_Vacuum therapie(start)
17
0,667 10
O_Fundus scopie(schedule)
1
0,5 1
O_Fundus scopie(complete)
1
0,5 1
B_Wondzorg open buik(complete)
10
0,833 8
C_Hepatitis, drug induced(start)
5
0,667 3
C_Hypoglycaemie(complete)
2
0,5 1
B_Beademing Niet Invasief(start)
8
0,667 5
B_Beademing Niet Invasief(complete)
7
0,75 4
C_Rhabdomyolysis(start)
1
0,5 1
B_CAVH(D)(start)
4
0,667 2
B_CAVH(D)(complete)
3
0,5 3
C_Aspiratie(start)
5
0,667 3
B_Buikligging(complete)
15
0,545 11
O_24 uurs urine Na Creat Ur(schedule)
1
0,5 1
O_Kweek perifeer infuus(schedule)
1
0,5 1
C_Abces(start)
2
0,667 2
B_Isolatie strikte(complete)
3
0,5 3
C_Critical illness polyneur(start)
3
0,5 3
B_Actief koelen(complete)
2
0,667 2
O_Huiduitstrijk Oksel Li /R(schedule)
2
0,5 1
C_Hypoxemie(start)
2
0,5 2
C_Ischemische hepatitis(start)
6
0,667 3
C_Candidosis invasief(start)
1
0,5 1
C_GI-bloeding(start)
9
0,625 7
C_Decubitus overig st. 1(start)
1
0,5 1
C_Autoextubatie(complete)
11
0,5 6
B_Pacemaker inbrengen(start)
7
0,8 5
C_Decubitus stuit st. 1(start)
1
0,5 1
C_Ischemische darm(start)
6
0,8 5
C_Pneumonie (mogelijk)(start)
1
0,5 1
B_PEP masker(complete)
5
0,75 4
C_Naadlekkage(start)
3
0,5 3
C_Lijnkweek positief(start)
2
0,5 2
C_Nosocomiale Pneumonie(start)
13
0,8 11
C_Loge Syndroom(start)
2
0,667 2
B_Fasciotomie(start)
2
0,667 2
B_Fasciotomie(complete)
2
0,5 1
C_Trombopenie(complete)
1
0,5 1
C_GI-bloeding(complete)
2
0,5 1
C_Pneumonie(start)
5
0,8 4
B_NO beademing(complete)
1
0,5 1
C_Tamponade(complete)
2
0,5 2
C_Maagretentie(>1500 ml/24)(start)
3
0,667 3
C_Beademingsafhankelijkheid(complete)
3
0,5 1
B_Isolatie aerogene(start)
1
0,5 1
B_Isolatie aerogene(complete)
1
0,5 1
C_Pleisterlaesie(start)
3
0,75 3
B_Necrotomie(complete)
5
0,5 1
C_Platzbauch(complete)
1
0,5 1
C_Peritonitis(start)
2
0,5 2
C_Geen plaats afd(start)
2
0,5 2
B_Empyeem spoeling(complete)
1
0,5 1
O_Methyl blauw/ fistulogram(complete)
2
0,5 1
C_Pleura-Effusie(start)
2
0,5 1
C_Colitis, pseudomembraneus(start)
1
0,5 1
C_Parotitis(start)
1
0,5 1
B_IPPB(complete)
2
0,667 2
B_Wondzorg open thorax(complete)
3
0,5 2
C_Coma(start)
2
0,5 1
B_Uritip(complete)
1
0,5 1
B_Isolatie Universeel(start)
3
0,5 3
C_ARDS(complete)
1
0,5 1
C_Hyperglycaemie >20mmol/l(complete)
1
0,5 1
B_Plasmaforese(complete)
2
0,5 1
C_TIA(start)
1
0,5 1
C_Cholecystitis, acalc(start)
1
0,5 1
B_Decubitus zorg stadium 3b(start)
1
0,5 1
C_Haemolyse(start)
1
0,5 1
B_Decubitus zorg stadium 4b(start)
1
0,5 1
C_Intra-peritoneaal Abces(start)
1
0,5 1
B_Supra Pubische blaascath(complete)
1
0,5 1
B_Verpleegvorm prikkelarm(complete)
1
0,5 1
0,889 105
B_Actief warmte toevoegen(complete)
150
0,975 147
B_Scleroseren GI bloeding(complete)
4
0,5 1
B_PEG catheter(start)
7
0,5 1
B_Donor Multi Orgaan(start)
5
0,667 2
0,9 1296
0,923 150
O_ECG dagelijks(complete)
374
0,964 57
C_Ischemie(start)
35
0,833 8
O_Wegen dagelijks(complete)
53
0,75 7
C_Hypotensie(start)
17
0,8 5
B_Tracheostomie(complete)
11
0,5 1
C_Bloedverlies > 50 ml/uur(complete)
2
0,5 1
B_Empyeem spoeling(start)
2
0,5 1
C_s3 Shock, Hypovolaemisch(complete)
2
0,5 1
B_Ontlastende LP bij druk(start)
1
0,5 1
B_Catheter spinaal(start)
1
0,5 1
B_Thoraxdrain(complete)
617
0,817 448
0,918 112
B_Catheter a demeure(complete)
18
0,75 6
C_Darmperforatie(complete)
2
0,5 2
O_Lab. 3x per week(complete)
5
0,5 3
B_Reanimatie(complete)
20
0,667 2
0,8 24
0,974 175
B_IABP in op ICU(complete)
12
0,5 4
C_Bloedverlies > 50 ml/uur(start)
47
0,833 20
B_Swan Ganz op ICU(complete)
15
0,667 3
B_Wondzorg overig(complete)
23
0,75 14
B_Rethoratocomie op OK(complete)
42
0,667 7
B_Amputatie Extremiteit(start)
3
0,5 2
B_Isolatie contact(complete)
3
0,5 3
B_PEP masker(start)
6
0,667 4
C_Psychose/verward(complete)
3
0,5 2
0,942 172
B_Halsinf./subclavia op OK(complete)
106
0,857 38
B_Pacemaker AAN(complete)
88
0,7 24
O_Doppler perifere vaten(complete)
2
0,5 1
B_Bi of Trilumen Catheter(complete)
29
0,5 3
B_PCA pomp(complete)
2
0,5 2
C_s2 Shock, Cardiaal(complete)
4
0,5 2
C_Sufheid(complete)
4
0,5 3
C_Lekkage na plastiek(complete)
1
0,5 1
M_MeasurementClinic(complete)
12474
0,978 995
0,935 929
1 9484
O_X-thorax cito(schedule)
60
0,833 29
0,955 316
B_Tracheostomie - percutaan(complete)
20
0,667 17
C_s2 Shock, Cardiaal(start)
47
0,833 32
0,98 153
O_SDD / SOD studie(schedule)
131
0,857 80
O_Doppler perifere vaten(schedule)
16
0,75 9
C_Bloeding waarvoor > 3 PC(start)
15
0,667 9
B_Wisselligging(complete)
64
0,667 45
C_Decompensatie na OK(start)
3
0,5 1
C_Sternumwondinfectie(start)
4
0,5 4
C_-Premature Slagen NNO(start)
3
0,5 2
B_Laparotomie(complete)
13
0,5 10
B_Jejunostomie(complete)
2
0,5 2
C_Tamponade(start)
7
0,5 4
B_Pleura Punctie(start)
3
0,5 2
B_CPAP(start)
18
0,75 12
B_Isolatie druppel(start)
15
0,667 13
C_Hemorrhoiden bloedend(start)
1
0,5 1
C_Ischemie waarvoor Re OK(start)
9
0,667 3
C_Endocarditis(start)
2
0,5 1
C_Cholecystitis, stenen(start)
2
0,5 2
C_Thrombo-embolie art(start)
2
0,5 2
C_Postanox encefalopat(start)
3
0,667 3
O_Fenytoine(schedule)
7
0,5 7
B_Decubitus zorg stadium 1(complete)
3
0,5 3
B_Decubitus behandeling(complete)
3
0,5 2
B_Isolatie Universeel(complete)
2
0,5 1
1 13945
C_-VKF, atrium-flutter(start)181
0,8 168
0,947 179
0,9 43
0,5 5
0,5 13
C_Ileus(start)
3
0,667 3
B_Isolatie druppel(complete)
3
0,5 3
0,5 3
O_Lithium(schedule)
1
0,5 1
C_Atelectase(start)
6
0,667 6
B_Vacuum therapie(complete)
6
0,667 6
B_Verband spalk(complete)
2
0,5 2
0,5 2
B_Decubitus behandeling(start)
4
0,5 4
O_BAL / Lavage(schedule)
6
0,5 6
C_Leucopenie(start)
1
0,5 1
O_Ascites kweek(schedule)
2
0,5 2
O_Coloscopie(schedule)
2
0,5 2
C_Pustuleuze afw(start)
1
0,5 1
O_Liquor kweek(schedule)
4
0,667 4
C_N Phrenicus Paralyse(start)
1
0,5 1
0,996 534
0,997 533
0,999 1282
0,964 91
B_Fysiotherapie(start)371
0,992 244
0,984 86
B_Mobiliseren(start)237
0,978 106
O_Gastro / Duodenscopie(schedule)
32
0,667 16
B_Bi-PAP(complete)
5
0,5 3
B_Verpleegvorm boomstam(start)
9
0,5 6
B_Verpleegvorm prikkelarm(start)
6
0,5 5
O_Virus serologie(schedule)
8
0,5 3
B_Verband gips(start)
3
0,667 3
C_Decubitus stuit st. 2a(start)
3
0,5 2
O_Paracetamol(schedule)
2
0,5 1
B_Isolatie Beschermend(start)
1
0,5 1
0,98 140
0,98 64
0,944 27
0,75 11
B_Wondzorg open thorax(start)
10
0,667 8
0,833 22
O_ECG cito(complete)
31
0,812 31
0,889 4
O_X-thorax cito(complete)
53
0,96 49
O_Bloedkweek 2(schedule)
258
0,951 230
O_Bloedkweek 1(complete)
403
0,955 124
O_Cito GRAM + sputumkweek(schedule)
97
0,941 23
O_Kweek bi/tri lumen cath.(schedule)
61
0,8 8
O_Bloedkweek 3(schedule)
14
0,667 7
O_Bloedkweek 2(complete)
252
0,944 150
O_Sputum kweek(schedule)
428
0,939 57
O_Faeces kweek(schedule)
63
0,833 5
0,938 15
O_Cito GRAM + bronchuskweek(schedule)
91
0,933 14
O_Kweek urinecatheter(schedule)
30
0,857 7
0,75 11
O_Wegen dagelijks(schedule)
158
0,9 99
O_Synacthen(schedule)
55
0,857 15
C_s1 Shock, Septisch(complete)
3
0,5 1
0,857 45
C_Myoclonieen(start)
4
0,5 1
C_Dwarslaesie(start)
1
0,5 1
O_Cito GRAM + sputumkweek(complete)
94
0,938 39
O_Bloedkweek 3(complete)
13
0,833 10
0,9 104
O_Keel kweek(complete)
19
0,5 3
0,8 12
0,875 22
B_Reanimatie(start)
26
0,857 15
0,8 87
B_IABP in op ICU(start)
17
0,667 2
B_Ballonneren(start)317
0,833 67
B_Sonde-Voeding(start)365
0,933 86
B_Anus Praeter Naturalis(start)
64
0,8 6
B_Bloedtoediening met druk(complete)
4
0,5 2
O_Kweek sheath(schedule)
7
0,5 2
B_Swan Ganz op ICU(start)
18
0,8 4
0,976 135
B_PTCA(start)
6
0,5 1
O_Ramsay-score(complete)
3
0,5 2
B_Decubitus zorg stadium 3a(start)
4
0,5 2
C_Polyurie (>40ml/kg/24u)(start)
1
0,5 1
0,667 10
C_-VT(complete)
2
0,667 2
0,5 2
0,857 8
0,9 289
B_IABP uit op ICU(start)
1
0,5 1
0,889 93
B_PEG catheter(complete)
3
0,5 1
C_-Brady / Aritmie(complete)
2
0,5 1
B_O2 masker/neusslang(complete)
213
B_Beademing gestart op ICU(start)
61
0,875 20
B_NO beademing(start)
1
0,5 1
B_Ontlastende LP bij druk(complete)
1
0,5 1
O_Urine kweek(schedule)
244
0,946 120
O_Urine kweek(complete)
236
0,97 106
O_Benzodiazepines(complete)
1
0,5 1
O_Kweek swan ganz(schedule)
5
0,5 1
O_Kweek overige(schedule)
49
O_Kweek overige(complete)
47
0,944 44
0,947 201
0,982 168
O_Wond kweek(complete)
88
0,889 7
O_Sigmoideoscopie(complete)
3
0,5 1
O_Kweek perifeer infuus(complete)
1
0,5 1
C_Exantheem / Rash(start)
4
0,5 1
0,833 44
B_Duo luchtmatras(start)192
B_Wondzorg overig(start)270
0,875 24
C_Decubitus hak st. 1(start)
2
0,5 1
0,667 2
0,929 386
B_Isolatie contact(start)
7
0,667 2
B_Scleroseren GI bloeding(start)
4
0,5 1
B_Pacemaker standby(complete)
130
0,857 128
C_-Brady / Aritmie(start)
22
0,5 4
0,815 83
B_Swan Ganz op OK(complete)
100
0,667 26
0,75 8
0,833 52
B_Vernevelaar(start)
25
0,75 12
0,971 54
0,833 317
B_Bezoek: afw. tijden(start)
70
0,857 14
0,909 47
B_Bezoek: waken(start)
52
0,8 14
B_Re OK(start)
11
0,5 2
O_X-thorax 3 x p.w.(complete)
4
0,5 1
B_Bezoek: kind. toegestaan(start)
1
0,5 1
0,917 14
0,909 91
B_Weanen(complete)
316
0,8 227
O_X TWK(schedule)
1
0,5 1
C_Wondinfectie(start)
3
0,5 1
B_Drain golf(complete)
6
0,5 1
0,8 78
B_Tracheostoma/Tube LOS(start)
85
0,75 53
O_IAP studie(schedule)
2
0,5 1
0,767 106
O_Lab. 3x per week(schedule)
10
0,5 1
O_Cystoscopie(schedule)
1
0,5 1
0,8 142
0,792 31
0,667 5
0,889 11
0,875 29
0,889 94
B_Mobiliseren(complete)
49
0,667 13
C_Oligurie (< 5 ml/kg/24u)(complete)
5
0,5 3
C_Fibro-proliferatieve ARDS(start)
5
0,667 3
C_-SVT, paroxysmaal(complete)
4
0,5 2
0,5 4
O _ B EE(complete)
290
0,982 282
O_EMV score(complete)
3
0,5 1
0,966 271
0,992 269
O_Pulmonalis angio(schedule)
1
0,5 1
C_Diabetes Insipides(start)
1
0,5 1
C_Convulsie(s)(start)
2
0,5 1
0,8 5
O_Sputum kweek(complete)
405
0,985 391
O_Kweek peritoneum(complete)
7
0,5 2
O_Virus serologie(complete)
8
0,5 1
O_I.V Catheter kweek overig(schedule)
29
0,75 6
0,965 170
C_s4 Shock, Onbekend(start)
6
0,5 2
C_Bronchitis (mogelijk)(start)
2
0,5 2
O_Huiduitstrijk Oksel Li /R(complete)
2
0,5 2
O_Ascites kweek(complete)
2
0,5 2
0,75 5
0,75 11
0,833 32
0,857 53
B_Isolatie beschermende(start)
1
0,5 1
0,667 6
0,75 105
B_Anus Praeter Naturalis(complete)
3
0,5 1
0,8 33
0,571 34
0,667 2
0,85 21
0,972 120
O_X-thorax op aanvraag(schedule)
157
0,872 141
0,823 53
O_X-thorax dagelijks(complete)
331
0,909 252
O_ECG 3 x p.w.(schedule)
27
0,706 14
0,8 72
0,889 6
B_Cardioversie(complete)
80
0,815 74
0,75 6
O_Methyl blauw/ fistulogram(schedule)
2
0,5 1
0,8 31
O_ECG op aanvraag(complete)
42
0,667 3
0,833 36
B_IABP uit op ICU(complete)
1
0,5 1
O_Faeces kweek(complete)
60
0,975 60
C_Candida kolonisatie(start)
1
0,5 1
0,955 54
O_Lumbaal Punctie(schedule)
5
0,5 1
0,667 17
O_Vancomycine dal / top(complete)
28
0,889 13
0,857 27
B_Ballonneren(complete)
216
0,75 97
0,667 2
B_Bronchiaal toilet(complete)
247
0,667 4
0,5 1
0,909 32
0,857 14
B_Sonde-Voeding(complete)
159
0,769 18
B_Catheter epiduraal(complete)
39
0,8 7
B_Intermit. Haemo Dialyse(complete)
14
0,667 4
O_X-thorax op aanvraag(complete)
28
0,667 8
O_Coloscopie(complete)
2
0,5 1
0,667 6
0,875 22
B_Drain(s) sump(complete)
2
0,5 1
B_CPAP(complete)
14
0,571 6
0,889 30
0,75 27
B_CVVH(start)
87
0,833 19
0,833 28
0,8 8
0,909 35
0,909 4
O_kweek pacemakerdraad(schedule)
3
0,5 1
O_Ramsay-score(schedule)
5
0,5 1
O_X-thorax 3 x p.w.(schedule)
22
0,615 13
0,667 11
C_Decompensatie geen OK(start)
4
0,5 1
C_Hepatitis, drug induced(complete)
1
0,5 1
0,667 18
C_Ischemie, Myocard(complete)
1
0,5 1
0,667 14
O_SDD rectumkweek Ma/Do(schedule)
300
0,959 282
O_SDD sputumkweek Ma/Do(schedule)
288
0,974 277
O_SDD rectumkweek Ma/Do(complete)
246
0,75 23
O_SDD sputumkweek Ma/Do(complete)
232
0,923 214
O_SDD keelkweek Ma/Do(complete)
240
0,974 208
O_SDD / SOD studie(complete)
37
0,833 21
0,766 112
0,875 40
C_Longbloeding(start)
3
0,5 1
0,984 203
B_Orthopaedische tractie(complete)
2
0,5 1
B_Decubitus zorg stadium 4a(start)
3
0,5 1
0,5 4
B_Extubatie(complete)
198
0,96 198
0,8 17
0,889 23
0,938 168
0,8 41
0,984 168
O_Lithium(complete)
1
0,5 1
0,812 167
O_IAP studie(complete)
1
0,5 1
0,909 33
C_Intra-peritoneaal Abces(complete)
1
0,5 1
B_Maagsonde(complete)
894
0,857 123
B_Doorbewegen(complete)
30
0,8 18
C_Hypertensie(start)
1
0,5 1
B_Decubitus zorg stadium 3a(complete)
2
0,5 1
0,929 15
0,923 16
B_Drain(s) redon(complete)
66
0,8 60
B_Necrotomie(start)
5
0,5 1
0,989 112
0,955 101
0,947 38
0,947 118
0,911 128
0,917 73
B_Fysiotherapie(complete)
16
0,667 5
B_Plasmaforese(start)
5
0,5 2
B_Intermit. catheteriseren(start)
28
0,769 16
B_Blaasspoelen(start)
12
0,75 4
0,667 2
B_Intermit. Haemo Dialyse(start)
43
0,833 29
B_Blaasspoelen(complete)
5
0,5 1
0,8 77
B_ E R C P(start)
2
0,5 1
0,923 10
0,889 71
0,952 70
0,947 9
0,889 45
0,75 18
0,875 49
0,833 97
0,8 1
C_Leverfalen(start)
2
0,5 1
O_ECHO Buik(complete)
26
0,938 25
O_Echo perifere vaten(schedule)
3
0,5 1
0,667 24
0,5 1
0,8 4
B_Oogzalven / druppelen(start)102
0,8 12
B_Bezoek: waken(complete)
27
0,667 19
O_Bronchoscopie(schedule)
28
0,75 3
O_Gastro / Duodenscopie(complete)
24
0,929 20
O_Pleurapunctie(schedule)
3
0,5 1
O_Bronchoscopie(complete)
26
0,909 25
O_Tracheaspoeling(schedule)
1
0,5 1
O_EEG(complete)
5
0,667 5
0,889 20
C_Addisson / Bijnier Insuff(start)117
0,667 33
C_Acute Lung Injury(start)
1
0,5 1
0,8 20
O_Tracheaspoeling(complete)
1
0,5 1
0,5 4
B_Jejunumsonde(start)
31
0,667 8
0,8 24
0,5 1
0,8 14
C_Bloeding waarvoor > 3 PC(complete)
4
0,5 1
0,75 43
0,9 42
C_Bloeding waarvoor reOK(complete)
12
0,667 3
B_Re OK(complete)
10
0,75 10
0,5 1
0,933 82
B_Brochusscopie(complete)
14
0,667 2
B_Intubatie(complete)
95
0,75 6
0,875 113
O_Gentamycine dal / top(complete)
115
0,932 115
0,912 99
0,833 37
0,833 23
0,875 20
0,667 8
C_Rethoratocomie(start)
6
0,667 4
0,5 3
0,7 13
B_Decubitus zorg stadium 4b(complete)
1
0,5 1
0,8 8
O_CT-buik(complete)
31
0,929 24
0,833 26
O_CT-schedel(complete)
26
0,75 3
C_Pancreatitis(start)
2
0,5 1
0,5 11
0,5 1
0,5 4
0,5 1
0,947 32
0,973 50
O_Kweek tracheostoma(schedule)
1
0,5 1
0,947 71
0,933 18
B_Verwijderen tampon(start)
1
0,5 1
O_Kweek tracheostoma(complete)
1
0,5 1
B_Verwijderen tampon(complete)
1
0,5 1
0,5 1
0,833 9
B _ E R C P(complete)
2
0,5 1
0,8 49
0,5 1
0,857 46
0,5 1
0,667 6
0,983 128
B_Jejunostomie(start)
22
0,667 4
0,667 5
B_Nefrostomie catheter R(start)
8
0,8 4
0,667 2
0,75 41
C_Aspiratie(complete)
1
0,5 1
C_s4 Shock, Onbekend(complete)
1
0,5 1
0,5 2
0,5 1
0,667 27
0,667 3
0,833 16
0,75 48
B_Reintubatie(start)
77
0,912 32
B_Intubatie(start)102
0,875 32
0,75 9
C_Stridor(complete)
1
0,5 1
0,8 67
B_Reintubatie na Autoext(complete)
14
0,667 2
0,667 10
B_Verwijderen Agraves(complete)
5
0,8 5
0,5 4
0,667 5
0,5 2
C_Shock, Anaphylactisch(complete)
3
0,667 3
0,857 27
O_Cito GRAM + bronchuskweek(complete)
86
0,962 86
0,864 40
B_Brochusscopie(start)
15
0,5 1
0,75 13
C_Atelectase(complete)
2
0,5 1
0,75 13
0,5 1
0,667 6
O_Sinus kweek(schedule)
5
0,667 3
O_Sinus kweek(complete)
5
0,5 4
0,5 1
0,5 1
0,667 2
O_Coronair angiogram(complete)
5
0,667 5
0,667 5
0,5 2
0,667 4
0,5 5
0,667 8
B_Amputatie Extremiteit(complete)
2
0,5 1
0,857 86
0,75 23
O_CT bekken(schedule)
1
0,5 1
0,8 17
O_X been(complete)
2
0,5 1
0,5 2
0,8 12
B_Beademing gestart op ICU(complete)
46
0,8 22
0,667 2
B_Oogglazen(complete)
2
0,5 1
C_Dehiscentie(start)
3
0,667 2
0,667 3
0,667 4
0,5 1
0,5 4
0,833 25
O_TEE(complete)
79
0,925 59
O_Synacthen(complete)
53
0,972 53
B_Air fluid bed(start)
42
0,9 40
0,75 11
0,95 22
0,824 21
0,909 30
0,857 19
0,8 25
C_reOK ivm pleuravocht(start)
1
0,5 1
0,875 16
0,5 9
0,75 29
B_Reintubatie na Autoext(start)
14
0,917 12
O_X TWK(complete)
1
0,5 1
0,5 1
0,75 7
0,5 1
O_X arm(schedule)
1
0,5 1
0,5 1
0,5 5
0,8 24
C_Pneumothorax(complete)
11
0,75 4
0,667 7
0,792 39
0,5 4
0,5 4
O_Kweek sheath(complete)
7
0,8 7
0,5 7
0,8 7
0,833 20
0,667 14
0,667 6
0,667 2
0,731 27
O_kweek pacemakerdraad(complete)
3
0,5 3
0,5 2
O_Kweek bi/tri lumen cath.(complete)
58
0,967 57
O_Kweek liescatheter art(schedule)
1
0,5 1
0,8 24
0,5 1
0,625 25
C_Decubitus overig st. 4b(start)
1
0,5 1
0,889 17
0,5 1
0,5 1
0,857 6
0,5 1
0,8 13
0,5 13
0,5 3
0,667 26
O_X b.o.z.(schedule)
10
O_X b.o.z.(complete)
10
0,833 10
0,75 10
0,667 18
O_Kweek art. lijn(complete)
12
0,833 12
0,8 12
0,941 52
O_Kweek liescatheter art(complete)
1
0,5 1
0,889 9
0,5 1
0,8 14
B_Decubitus zorg stadium 4a(complete)
1
0,5 1
O_Digoxine(schedule)
1
0,5 1
0,5 1
0,667 9
O_Kweek liescatheter veneus(complete)
10
0,833 10
0,75 8
0,5 1
0,5 10
0,5 2
0,8 15
C_Nosocomiale Pneumonie(complete)
2
0,5 1
0,5 1
0,5 3
0,8 7
0,75 6
0,667 5
B_Clysmeren(complete)
13
0,857 9
0,5 10
0,75 22
B_Rethoratocomie op OK(start)
43
0,75 6
0,75 47
0,8 30
0,667 1
0,667 12
0,667 20
0,667 1
0,667 15
B_Nefrostomie catheter L(complete)
1
0,5 1
0,75 3
0,8 9
0,667 6
B_Halsinf./subclavia op Ok(complete)
28
0,833 9
0,75 3
0,833 28
0,889 9
0,667 18
0,667 4
0,667 2
0,75 4
0,667 2
0,8 3
0,667 4
0,75 5
0,8 6
O_Echo nier blaas prostaat(complete)
15
0,917 15
0,8 10
0,667 2
0,75 7
B_Laparotomie(start)
13
0,625 12
C_Naadlekkage(complete)
1
0,5 1
0,667 5
0,667 1
0,667 6
O_Tobramycine dal / top(complete)
18
0,769 13
0,625 15
0,5 2
0,5 6
0,667 2
0,5 1
0,5 2
0,5 3
0,667 3
0,75 5
0,667 2
0,75 8
0,5 1
0,667 5
0,667 2
0,5 1
0,857 21
0,5 1
0,5 3
O_Kweek swan ganz(complete)
5
0,667 5
0,5 1
0,889 5
O_Pleura vocht kweek(complete)
26
0,824 21
0,75 8
O_Kweek urinecatheter(complete)
28
0,833 28
0,75 24
0,5 2
0,667 2
O_I.V Catheter kweek overig(complete)
27
0,929 25
0,833 23
C_Candidaemie(start)
3
0,5 1
0,5 2
0,667 3
0,667 16
0,8 14
0,5 1
0,5 1
0,9 10
0,5 1
0,667 4
0,5 2
0,5 1
C_Rethoratocomie(complete)
1
0,5 1
0,5 1
0,667 2
0,75 4
0,5 1
C_Decubitus stuit st. 2b(start)
2
0,5 1
0,5 1
O_Cystoscopie(complete)
1
0,5 1
0,5 1
0,667 5
O_CT thorax(complete)
14
0,833 14
0,75 11
O_Pleurapunctie(complete)
3
0,667 3
0,5 3
0,667 1
0,5 1
0,833 14
0,667 6
0,5 2
0,75 4
0,667 15
B_Defibrilatie(start)
14
0,75 8
C_-VF(complete)
5
0,667 4
0,8 11
0,5 6
0,8 24
0,667 3
0,667 4
0,5 4
0,8 5
0,5 1
0,75 5
0,5 1
0,5 1
0,5 1
0,857 20
0,909 5
0,667 5
0,5 1
0,667 14
O_Wond inspectie(complete)
1
0,5 1
0,5 1
0,5 1
0,75 3
0,5 1
0,75 5
0,5 2
0,75 7
0,5 1
0,5 1
0,5 1
0,5 1
C_Addisson / Bijnier Insuff(complete)
2
0,5 1
0,5 1
0,5 1
0,75 4
0,909 3
C_Bacteriemie(complete)
1
0,5 1
C_Empyeem(complete)
1
0,5 1
C_Bronchitis (klinisch)(complete)
1
0,5 1
C_Decompensatie na OK(complete)
1
0,5 1
C_Anurie (<1ml/kg/24u)(complete)
1
0,5 1
0,5 4
0,5 1
0,5 7
C_Ischemie waarvoor Re OK(complete)
2
0,5 1
0,5 3
0,5 1
0,667 5
0,5 2
0,5 2
0,5 2
0,5 1
0,5 6
0,667 6
0,5 1
0,667 4
0,667 2
0,75 4
0,5 1
0,5 1
0,5 1
0,5 1
O_Lumbaal Punctie(complete)
5
0,667 5
0,667 4
0,667 15
O_Toxicologie(complete)
2
0,667 2
0,5 2
0,5 2
O_24 uurs urine Na Creat Ur(complete)
1
0,5 1
0,5 1
0,5 1
0,5 1
0,5 1
B_T drain(start)
1
0,5 1
0,5 1
0,5 3
0,5 1
0,5 3
0,5 3
0,5 1
0,667 2
0,5 2
0,5 2
0,5 1
0,5 1
0,5 4
0,667 2
0,5 1
O_BAL / Lavage(complete)
6
0,75 5
O_Biopsie(schedule)
2
0,5 1
0,667 5
O_Biopsie(complete)
2
0,5 1
0,5 2
0,5 1
C_Thrombo-embolie art(complete)
1
0,5 1
0,5 1
0,667 5
0,5 1
0,5 1
O_Transthoracaal ECHO(complete)
10
0,75 10
0,5 8
0,5 2
0,5 1
0,667 9
0,5 1
0,5 1
0,5 1
0,5 3
0,75 9
0,667 4
0,5 4
0,667 1
0,5 1
0,5 1
0,667 2
0,5 3
0,5 9
0,8 14
0,667 4
0,5 1
0,5 1
0,5 3
0,75 7
0,5 1
0,5 1
0,5 2
0,5 1
0,5 3
0,5 1
0,5 3
0,5 2
O_Paracetamol(complete)
1
0,5 1
0,5 1
0,5 1
0,5 1
0,8 13
0,5 2
0,5 1
0,5 1
0,667 1
0,5 1
0,5 1
C_Convulsie(s)(complete)
2
0,5 1
0,5 1
0,5 1
0,5 1
0,667 5
0,5 1
0,5 1
0,5 1
0,5 3
0,5 1
0,5 1
0,5 2
0,5 1
B_Donor Weefsel(start)
1
0,5 1
0,5 1
0,5 2
0,667 5
0,667 2
0,5 1
B_Ureter catheter R(start)
4
0,667 2
0,5 1
0,5 1
0,5 1
0,5 1
0,75 8
B_Decubitus zorg stadium 2a(complete)
1
0,5 1
0,5 1
0,667 2
0,5 1
C_Decubitus hak st. 3a(start)
1
0,5 1
C_Decubitus overig st. 3a(start)
1
0,5 1
0,5 1
0,667 2
0,5 3
0,8 2
0,5 1
0,5 4
0,5 1
0,667 2
0,667 2
C_Hypotensie(complete)
1
0,5 1
0,5 1
0,5 1
0,5 1
0,5 2
0,5 1
0,5 1
0,5 1
0,5 2
0,5 1
0,5 1
0,5 1
0,25 2
0,667 3
O_Fenytoine(complete)
7
0,667 4
O_Liquor kweek(complete)
4
0,667 4
0,75 3
0,75 7
C_Colitis, pseudomembraneus(complete)
1
0,5 1
0,5 1
0,5 1
0,5 1
0,5 1
C_Lijn sepsis(complete)
1
0,5 1
0,5 1
0,5 1
0,667 2
0,5 1
0,5 2
0,5 1
0,5 1
0,667 2
B_Horizontaal(start)
1
0,5 1
0,5 2
0,5 1
0,5 1
0,5 1
0,5 2
0,5 2
0,5 1
0,5 1
0,5 3
0,5 1
0,5 1
0,5 1
0,5 2
0,5 1
0,5 1
0,5 1
0,5 1
B_Horizontaal(complete)
1
0,5 1
0,5 1
0,5 1
0,5 1
C_Druk necrose elders(start)
1
0,5 1
0,5 1
0,5 1
0,5 1
0,5 1
0,5 1
0,5 1
0,5 1
B_Isolatie Beschermend(complete)
1
0,5 1
0,5 1
0,5 1
0,5 1
0,5 1
0,5 1
0,5 1
0,5 1
O_Digoxine(complete)
1
0,5 1
Fig. 3. Spaghetti process describing the diagnosis and treatment of 2765 patients in a Dutch hospital. Theprocess model was constructed based on an event log containing 114,592 events. There are 619 differentactivities (taking event types into account) executed by 266 different individuals (doctors, nurses, etc.).
nicipalities need to handle many objections (i.e., appeals) of citizens that assert thatthe WOZ value is too high. Figure 4 shows the process of handling these objectionswithin a particular municipality. The diagram is not intended to be readable; it is onlyincluded to show the contrast with Fig. 3.
Domain: heus1complete
OZ02 Voorbereidenstart
OZ02 Voorbereidencomplete
OZ06 Stop vorderingstart
OZ06 Stop vorderingcomplete
OZ15 Zelf uitspraakstart OZ15 Zelf uitspraak
complete
OZ20 Administatiestart
OZ20 Administatiecomplete
OZ24 Start vorderingstart
OZ24 Start vorderingcomplete
OZ08 Beoordelenstart
OZ08 Beoordelencomplete
OZ12 Hertaxerenstart
OZ12 Hertaxerencomplete
OZ16 Uitspraakstart
OZ16 Uitspraakcomplete
OZ04 Incompleetstart
OZ04 Incompleetcomplete
OZ10 Horenstart
OZ10 Horencomplete
OZ09 Wacht Beoordstart
OZ09 Wacht Beoordcomplete
OZ18 Uitspr. wachtstart
OZ18 Uitspr. wachtcomplete
Fig. 4. WF-net discovered based on an event log of a Dutch municipality. The log contains events related to745 objections against the so-called WOZ valuation. These 745 objections generated 9583 events. There are13 activities. For 12 of these activities both start and complete events are recorded. Hence, the WF-net has25 transitions.
The discovered WF-net has a good fitness: 628 of the 745 cases can be replayedwithout encountering any problems. The fitness of the model and log at the event levelis 0.98876214. This value is based on the approach described in [Aalst 2011; Rozinatand Aalst 2008]. The high value shows that almost all recorded events are explainedby the model. Hence, the WOZ process is clearly a Lasagna process. Nevertheless, itwas interesting for the municipality to see the deviations highlighted in the model.Figure 5 shows a fragment of the diagnostics provided by the ProM’s conformancechecker.
The municipality’s log contains timestamps. Therefore, it is possible to replay theevent log while taking the timestamps into account. ProM can visualize the phases ofthe process that take most time. For example, the place in-between “OZ16 Uitspraakstart” (start of announcement of final judgment) and “OZ16 Uitspraak complete” (endof announcement of final judgment) was visited 436 times. The average time spent inthis place is 7.84 days. This indicates that activity “OZ16 Uitspraak” (final judgment)takes about a week. It is also possible to simply select two activities and measure thetime that passes in-between these activities. On average 202.73 days pass in-betweenthe completion of activity “OZ02 Voorbereiden” (preparation) and the completion of“OZ16 Uitspraak” (final judgment). Such examples illustrate that process mining –
ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.
99:14 W. van der Aalst
Fig. 5. Fragment of the WF-net annotated with diagnostics generated by ProM’s conformance checker.The WF-net and event log fit well (fitness is 0.98876214). Nevertheless, several low-frequent deviationsare discovered. For example. activity “OZ12 Hertaxeren” (re-evaluation of WOZ value) is started 23 timeswithout being enabled according to the model.
unlike classical Business Intelligence (BI) tools – helps organizations to “look inside”their processes. This is in stark contrast with contemporary BI tools that typicallyfocus on reporting and fancy looking dashboards.
8. CONCLUSIONThis paper introduced process mining as a new technology enabling evidence-basedprocess analysis. We introduced the three basic types of process mining (discovery, con-formance, and enhancement) using a small example and used some larger examples toillustrate the applicability in real-life settings. Nevertheless, there are still many openscientific challenges and most end-user organizations are not yet aware of the poten-tial of process mining. This triggered the development of the Process Mining Manifestoby an international task force involving 77 process mining experts representing 53 or-ganizations. This manifesto can be obtained from http://www.win.tue.nl/ieeetfpm/.The reader interested in process mining is also referred to the recent book on processmining [Aalst 2011]. Also visit www.processmining.org for sample logs, videos, slides,articles, and software.
ACKNOWLEDGMENTS
The author would like to thank all that contributed to the Process Mining Manifesto: Arya Adriansyah,Ana Karla Alves de Medeiros, Franco Arcieri, Thomas Baier, Tobias Blickle, Jagadeesh Chandra Bose, Pe-ter van den Brand, Ronald Brandtjen, Joos Buijs, Andrea Burattin, Josep Carmona, Malu Castellanos,Jan Claes, Jonathan Cook, Nicola Costantini, Francisco Curbera, Ernesto Damiani, Massimiliano de Leoni,Pavlos Delias, Boudewijn van Dongen, Marlon Dumas, Schahram Dustdar, Dirk Fahland, Diogo R. Ferreira,Walid Gaaloul , Frank van Geffen, Sukriti Goel, Christian Gunther, Antonella Guzzo, Paul Harmon, Arthurter Hofstede, John Hoogland, Jon Espen Ingvaldsen, Koki Kato, Rudolf Kuhn, Akhil Kumar, Marcello LaRosa, Fabrizio Maggi, Donato Malerba, Ronny Mans, Alberto Manuel, Martin McCreesh, Paola Mello, JanMendling, Marco Montali, Hamid Motahari Nezhad, Michael zur Muehlen, Jorge Munoz-Gama, Luigi Pon-tieri, Joel Ribeiro, Anne Rozinat, Hugo Seguel Perez, Ricardo Seguel Perez, Marcos Sepulveda, Jim Sinur,Pnina Soffer, Minseok Song, Alessandro Sperduti, Giovanni Stilo, Casper Stoel, Keith Swenson, MaurizioTalamo, Wei Tan, Chris Turner, Jan Vanthienen, George Varvaressos, Eric Verbeek, Marc Verdonk, Roberto
ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.
Process Mining: Overview and Opportunities 99:15
Vigo, Jianmin Wang, Barbara Weber, Matthias Weidlich, Ton Weijters, Lijie Wen, Michael Westergaard, andMoe Wynn.
REFERENCESAALST, W. VAN DER 2011. Process Mining: Discovery, Conformance and Enhancement of Business Processes.
Springer-Verlag, Berlin.AALST, W. VAN DER, ADRIANSYAH, A., AND DONGEN, B. VAN 2012. Replaying History on Process Models
for Conformance Checking and Performance Analysis. WIREs Data Mining and Knowledge Discovery.AALST, W. VAN DER, DONGEN, B., HERBST, J., MARUSTER, L., SCHIMM, G., AND WEIJTERS, A. 2003.
Workflow Mining: A Survey of Issues and Approaches. Data and Knowledge Engineering 47, 2, 237–267.
AALST, W. VAN DER, HEE, K. VAN, HOFSTEDE, A., SIDOROVA, N., VERBEEK, H., VOORHOEVE, M., ANDWYNN, M. 2011. Soundness of Workflow Nets: Classification, Decidability, and Analysis. Formal Aspectsof Computing 23, 3, 333–363.
AALST, W. VAN DER, REIJERS, H., WEIJTERS, A., DONGEN, B. VAN, MEDEIROS, A., SONG, M., AND VER-BEEK, H. 2007. Business Process Mining: An Industrial Application. Information Systems 32, 5, 713–732.
AALST, W. VAN DER, RUBIN, V., VERBEEK, H., DONGEN, B. VAN, KINDLER, E., AND GUNTHER, C. 2010.Process Mining: A Two-Step Approach to Balance Between Underfitting and Overfitting. Software andSystems Modeling 9, 1, 87–111.
AALST, W. VAN DER, SCHONENBERG, M., AND SONG, M. 2011. Time Prediction Based on Process Mining.Information Systems 36, 2, 450–475.
AALST, W. VAN DER AND STAHL, C. 2011. Modeling Business Processes: A Petri Net Oriented Approach. MITpress, Cambridge, MA.
AALST, W. VAN DER, WEIJTERS, A., AND MARUSTER, L. 2004. Workflow Mining: Discovering Process Mod-els from Event Logs. IEEE Transactions on Knowledge and Data Engineering 16, 9, 1128–1142.
ADRIANSYAH, A., DONGEN, B. VAN, AND AALST, W. VAN DER 2011. Conformance Checking using Cost-Based Fitness Analysis. In IEEE International Enterprise Computing Conference (EDOC 2011), C. Chiand P. Johnson, Eds. IEEE Computer Society, 55–64.
AGRAWAL, R., GUNOPULOS, D., AND LEYMANN, F. 1998. Mining Process Models from Workflow Logs. InSixth International Conference on Extending Database Technology. Lecture Notes in Computer ScienceSeries, vol. 1377. Springer-Verlag, Berlin, 469–483.
BERGENTHUM, R., DESEL, J., LORENZ, R., AND MAUSER, S. 2007. Process Mining Based on Regionsof Languages. In International Conference on Business Process Management (BPM 2007), G. Alonso,P. Dadam, and M. Rosemann, Eds. Lecture Notes in Computer Science Series, vol. 4714. Springer-Verlag, Berlin, 375–383.
BOSE, R., AALST, W. VAN DER, ZLIOBAITE, I., AND PECHENIZKIY, M. 2011. Handling Concept Drift in Pro-cess Mining. In International Conference on Advanced Information Systems Engineering (Caise 2011),H. Mouratidis and C. Rolland, Eds. Lecture Notes in Computer Science Series, vol. 6741. Springer-Verlag, Berlin, 391–405.
COOK, J. AND WOLF, A. 1998. Discovering Models of Software Processes from Event-Based Data. ACMTransactions on Software Engineering and Methodology 7, 3, 215–249.
CORTADELLA, J., KISHINEVSKY, M., LAVAGNO, L., AND YAKOVLEV, A. 1998. Deriving Petri Nets fromFinite Transition Systems. IEEE Transactions on Computers 47, 8, 859–882.
DATTA, A. 1998. Automating the Discovery of As-Is Business Process Models: Probabilistic and AlgorithmicApproaches. Information Systems Research 9, 3, 275–301.
DESEL, J. AND REISIG, W. 1998. Place/Transition Nets. In Lectures on Petri Nets I: Basic Models, W. Reisigand G. Rozenberg, Eds. Lecture Notes in Computer Science Series, vol. 1491. Springer-Verlag, Berlin,122–173.
DONGEN, B. VAN AND AALST, W. VAN DER 2004. Multi-Phase Process Mining: Building Instance Graphs.In International Conference on Conceptual Modeling (ER 2004), P. Atzeni, W. Chu, H. Lu, S. Zhou, andT. Ling, Eds. Lecture Notes in Computer Science Series, vol. 3288. Springer-Verlag, Berlin, 362–376.
DONGEN, B. AND AALST, W. VAN DER 2005. Multi-Phase Mining: Aggregating Instances Graphs into EPCsand Petri Nets. In Proceedings of the Second International Workshop on Applications of Petri Nets toCoordination, Workflow and Business Process Management, D. Marinescu, Ed. Florida InternationalUniversity, Miami, Florida, USA, 35–58.
DONGEN, B. VAN, BUSI, N., PINNA, G., AND AALST, W. VAN DER 2007. An Iterative Algorithm for Applyingthe Theory of Regions in Process Mining. In Proceedings of the Workshop on Formal Approaches to
ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.
99:16 W. van der Aalst
Business Processes and Web Services (FABPWS’07), W. Reisig, K. Hee, and K. Wolf, Eds. PublishingHouse of University of Podlasie, Siedlce, Poland, 36–55.
EHRENFEUCHT, A. AND ROZENBERG, G. 1989. Partial (Set) 2-Structures - Part 1 and Part 2. Acta Informat-ica 27, 4, 315–368.
GRECO, G., GUZZO, A., PONTIERI, L., AND SACCA, D. 2006. Discovering Expressive Process Models byClustering Log Traces. IEEE Transaction on Knowledge and Data Engineering 18, 8, 1010–1027.
GUNTHER, C. AND AALST, W. VAN DER 2007. Fuzzy Mining: Adaptive Process Simplification Based onMulti-perspective Metrics. In International Conference on Business Process Management (BPM 2007),G. Alonso, P. Dadam, and M. Rosemann, Eds. Lecture Notes in Computer Science Series, vol. 4714.Springer-Verlag, Berlin, 328–343.
HAND, D., MANNILA, H., AND SMYTH, P. 2001. Principles of Data Mining. MIT press, Cambridge, MA.HERBST, J. 2000. A Machine Learning Approach to Workflow Management. In Proceedings 11th European
Conference on Machine Learning. Lecture Notes in Computer Science Series, vol. 1810. Springer-Verlag,Berlin, 183–194.
TFPM – IEEE TASK FORCE ON PROCESS MINING. 2011. Process Mining Manifesto. In BPM Workshops.Lecture Notes in Business Information Processing Series, vol. 99. Springer-Verlag, Berlin.
MANYIKA, J., CHUI, M., BROWN, B., BUGHIN, J., DOBBS, R., ROXBURGH, C., AND BYERS, A. 2011. BigData: The Next Frontier for Innovation, Competition, and Productivity. McKinsey Global Institute.
MEDEIROS, A., WEIJTERS, A., AND AALST, W. VAN DER 2007. Genetic Process Mining: An ExperimentalEvaluation. Data Mining and Knowledge Discovery 14, 2, 245–304.
MUNOZ-GAMA, J. AND CARMONA, J. 2011. Enhancing Precision in Process Conformance: Stability, Confi-dence and Severity. In IEEE Symposium on Computational Intelligence and Data Mining (CIDM 2011),N. Chawla, I. King, and A. Sperduti, Eds. IEEE, Paris, France.
ROZINAT, A. AND AALST, W. VAN DER 2006. Decision Mining in ProM. In International Conference on Busi-ness Process Management (BPM 2006), S. Dustdar, J. Fiadeiro, and A. Sheth, Eds. Lecture Notes inComputer Science Series, vol. 4102. Springer-Verlag, Berlin, 420–425.
ROZINAT, A. AND AALST, W. VAN DER 2008. Conformance Checking of Processes Based on Monitoring RealBehavior. Information Systems 33, 1, 64–95.
SOLE, M. AND CARMONA, J. 2010. Process Mining from a Basis of Regions. In Applications and Theory ofPetri Nets 2010, J. Lilius and W. Penczek, Eds. Lecture Notes in Computer Science Series, vol. 6128.Springer-Verlag, Berlin, 226–245.
SONG, M. AND AALST, W. VAN DER 2008. Towards Comprehensive Support for Organizational Mining.Decision Support Systems 46, 1, 300–317.
VERBEEK, H., BUIJS, J., DONGEN, B. VAN, AND AALST, W. VAN DER 2010. ProM 6: The Process MiningToolkit. In Proc. of BPM Demonstration Track 2010, M. L. Rosa, Ed. CEUR Workshop ProceedingsSeries, vol. 615. 34–39.
WEIJTERS, A. AND AALST, W. VAN DER 2003. Rediscovering Workflow Models from Event-Based Data usingLittle Thumb. Integrated Computer-Aided Engineering 10, 2, 151–162.
WERF, J., DONGEN, B. VAN, HURKENS, C., AND SEREBRENIK, A. 2010. Process Discovery using IntegerLinear Programming. Fundamenta Informaticae 94, 387–412.
WESKE, M. 2007. Business Process Management: Concepts, Languages, Architectures. Springer-Verlag,Berlin.
ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.