28
Advances in rule-based process mining: applications for enterprise risk management and auditing Filip Caron, Jan Vanthienen, Bart Baesens KBI_1305

advances in rule base process mining.pdf

  • Upload
    kamanw

  • View
    236

  • Download
    1

Embed Size (px)

Citation preview

  • Advances in rule-based process mining: applications for enterprise risk management and auditing Filip Caron, Jan Vanthienen, Bart Baesens

    KBI_1305

  • Advances in Rule-Based Process Mining: Applications forEnterprise Risk Management and Auditing

    Filip Carona,, Jan Vanthienena, Bart Baesensa,b,c

    aDepartment of Decision Sciences and Information Management, Faculty of Business and Economics,KU Leuven, Naamsestraat 69, B-3000 Leuven, Belgium

    bVlerick Leuven Gent Management School, Vlamingenstraat 38, B-3000 Leuven, BelgiumcSchool of Management, University of Southampton, Highfield Southampton, SO17 1BJ, United

    Kingdom

    Abstract

    Process mining research has mainly focused on the development of process mining tech-niques, with process discovery algorithms in the center of attention. However, far lessresearch attention has been paid to the actual applicability of these process mining tech-niques in common business settings. Consequently, there only exists a partial fit betweenthe existing process mining techniques and the compliance checking & risk managementapplications.

    This research report contributes to the process mining and compliance checking re-search by proposing an effective and efficient rule-based approach for analyzing organi-zational information and processes. Additionally, a general content-based business ruletaxonomy has been developed as a source of business rules for the compliance checkingapproach. Furthermore, we also provide formal grounding for and an evaluation of therule-based approach.

    Keywords: Business Rules, Compliance Checking, Risk Management, Process Mining,Process-Aware Information Systems

    1. Introduction

    While the value creation abilities of an organization are increasingly determined bythe flexibility of their information systems and business processes, this flexibility mayalso pose significant risks that could have an enormous impact on achieving the corpo-rate objectives [56]. Additionally, contemporary organizations are faced with evermorerestricting directives, both external (e.g. government and trade associations) and internal(e.g. business policies and corporate social responsibility programs). Consequently, theorganizations management performs a risk assessment and implements appropriate riskresponses, e.g. control procedures. Well-known examples of control procedures includethe segregation of duties, the authorization rules and the inclusion of approval activities.

    Corresponding author, Telephone: +32 16 32 65 58, Fax: +32 16 32 66 24Email address: [email protected] (Filip Caron)

  • Shareholders and other stakeholders of the organizations will demand an independentassessment of the effectiveness of these risk responses, which is typically performed byauditors. Both the risk and control effectiveness assessments make use of (similar) com-pliance checking techniques.

    Several research contributions have suggested the use of process mining techniquesfor such an effectiveness assessment, i.e. to uncover potential compliance failures (e.g.[62, 38, 53]). Process mining refers to the set of techniques that analyzes event logsof process-aware information systems to acquire unique insights into the real processes[73, 69]. These event logs contain an untapped reservoir of knowledge about the businessoperations, detailed information on the business events (e.g. timestamps, case identifiers,originator identifiers, etc.) is recorded in a structured way.

    While compliance checking in the audit department has been suggested as a po-tential application for several process mining techniques, much less attention has beenpaid to fully adapting the techniques to the audits specific needs. Furthermore, othergovernance, risk and compliance activities are not taken into account. Moreover, thedescription of specific applications of process mining in a control (assessment) setting arerare and potential generalizations are not provided. This research report contributes tothe applied process mining research by:

    Proposing an effective and efficient rule-based compliance checking approach withprocess mining for analyzing organizational information and processes, with con-crete applications in both (audit) compliance checking and risk management.

    Presenting a general content-based business rule taxonomy, resulting in a set offrequently used rule patterns for control (assessment) and risk identification &assessment.

    Providing both a formal grounding for and an evaluation of the rule-based compli-ance checking with process mining.

    The outline of the research report is as follows: section 2 provides an overview ofcompliance checking with process mining, describes the partial/limited fit and introducesthe rule-based compliance approach. Section 3 presents a general content-based ruletaxonomy. In the body of the research report we discuss the rule-based compliancechecking approach with process mining: the details and formal grounding in section4, the identified rule patterns in section 5 and an evaluation of the opportunities andchallenges in section 6. The final section concludes the research report and presents anoutlook for future research in this area.

    2. Compliance Checking with Process Mining

    The study of compliance checking is an emerging research field which also holds im-portant challenges for the process-aware information systems. The major contributions ofcompliance research in process-aware information systems can be grouped along the dif-ferent phases of the business process management lifecycle, e.g. [43, 60, 42, 81]. However,this contribution will focus on the monitoring and analysis phase and more specificallyon process mining as a set of techniques for compliance checking.

    2

  • 2.1. Process Mining

    Process mining addresses the problem that most organizations, and consequentlyalso the persons with compliance checking and risk management activities, have verylimited information about what is actually happening in the business processes [73, 78,25]. Insights into the real behavior are acquired through analysis of the structuredinformation contained in information systems event logs. Process mining has receiveda vast amount of research attention resulting in a plethora of techniques, e.g. processdiscovery techniques [9, 16, 24, 63, 79, 33], techniques for the analysis of event log data [6,64, 65, 68], techniques for trace classifications [58, 20], process metrics [53, 18] and appliedresearch [67, 44]. Figure 1 schematically represents the process mining architecture.

    Information

    System(ERP, WFMS, CRM,)

    Event Logs

    Legislation,

    Policies,

    Directives

    Designed

    Process Model

    Discovery &

    VisualizationDelta Analysis

    Conformance

    Checking

    Rule-Based

    Property

    Verification

    Process Influencers Process Mining Techniques

    Induced Process

    Model

    Figure 1: Process Mining Architecture

    2.2. Traditional Compliance Checking with Process Mining

    Three broad sets of techniques are of main interest for compliance checking: processdiscovery & visualization, conformance checking & delta analysis and rule based processmining.

    The set of process discovery and visualization techniques enables the acquisi-tion of a full and precise insight in the real business process, summarized in one visual.Therefore process visualization can be considered as the ideal technique for process explo-ration and thus a first step in a compliance analysis [62]. In a compliance checking settingthe [63], the + + [16] and the heuristics miner [79] (only in combination with lowthresholds) are suitable to obtain a process model from the event log of an informationsystem. Whereas process visualization traditionally focused on the overall control-flow,the visualization of other business process aspects can be of benefit for a complianceanalyst, e.g. the performance sequence diagram analysis plug-in [70], the originator-by-task matrix [71], bottleneck analysis [54]. Process visualization as an auditing tool isadvocated, for example, in [37, 62, 1].

    Conformance checking and delta analysis both aim at detecting inconsistenciesbetween a prescriptive process model and its corresponding real-life process. The majordifference between both lies in the comparison base for the real-life process, conformancechecking uses the event log while delta analysis uses a derived process model [52, 53,

    3

  • 3, 10, 32]. The execution paths specified in the process model often contain multipleimplicit internal controls (e.g. approval is needed before paying a bill). Both the riskassessment and the compliance analysis solely rely on the recall/fitness dimension, whichdescribes the extent to which the behavior in the event log can be associated with validexecution paths in the prescriptive process model.

    Rule-based property verification approaches allow the analyst to verify specificquestions [65]. Examples of properties an analyst might want to check include the four-eyes principle and a set of ordering constraints. LTL-Checker plug-in [14] for ProMprovides the analyst with sixty application-independent and configurable templates ofcommon process properties (e.g. the existence of certain activities). Semantic LTL-Checker adds a functionality that allows the user to provide concepts as input to theparameters of LTL formulae [15]. SCIFF Checker uses configurable CLIMB rule tem-plates to verify process properties [45]. Several research contribution successfully appliedproperty verification based on the standard templates stored in the LTL-Checker plug-in,e.g. [37, 38, 74, 6].

    2.3. Partial fit between Compliance Checking and Existing Process Mining Techniques

    Existing process mining techniques are often not fully adapted to the risk man-agement and audit compliance checking setting. Process mining researchers haveprimarily focused on improving techniques for control-flow and to a far lesser extent forsocial/organizational analysis. Besides the obvious requirement to be able to analyze andperform controls on the additional data in the event logs (e.g. amounts), this researchfocus results in other important restrictions. Firstly, conformance metrics might provideonly a first impression of the overall conformance with the designed model. An in-depthevaluation of the problematic process parts, where the deviation from the designed modelis significant, will be required. Additionally, we can question the correctness of the de-signed business processes. Procedural business process models often contain control-flowdependencies that are not dictated by internal or external directives, i.e. overspecificationof the process model [47]. Consequently, process deviations affecting the conformancemeasures, do not necessarily violate any internal or external directive. Moreover, defin-ing all possible execution paths to deal with natural variations in a business environmentcan be challenging.

    Due to the assumption that an event log will not contain all possible behavior, pro-cess discovery and visualization techniques have to balance model precision and gener-ality. However, in a risk management and compliance checking setting generality is notimportant, it might might actually result in a cover up of (infrequent) harmful processdeviations.

    2.4. Comprehensive Rule-Based Compliance Checking with Process Mining

    To deal with the partial/limited fit between compliance checking and traditionalprocess mining techniques, this research report will propose a comprehensive rule-basedapproach. The compliance analyst will be presented with an extensive set of configurablebusiness rule patterns that can be used to describe the most common controls. Thesepatterns can be tested against a broad set of process information, including the eventlog of at least one business process. Other data sources may include the human resourcedata bases, the client data bases, etc. By taking into account multiple business processevent logs, the compliance analyst will be able to make reconciliations.

    4

  • In addition to the ability to take into account specific case data of a variety of datasources, this rule-based approach to compliance checking with process mining will berobust to noise and consequently a trade-off between generality and specificity will notbe made. As a result there will be no cover up of harmful low frequency process devi-ations. Furthermore, the approach enables the analyst to obtain a precise indicator ofthe effectiveness of the control or potential risk event. Finally, this compliance checkingapproach will not be confronted with the disadvantages of process overspecification.

    A more elaborate discussion of the approach can be found in section 4. The nextsection will focus on identifying the business rule categories that can be used for therule-based compliance checking approach.

    3. Business Rules as a Resource for Compliance Checking with Process Min-ing

    The Business rules research has mostly focused on improving the business rule lan-guages in terms of expressibility [31, 30], comprehensibility [77, 5] and formalisms [26, 65].Moreover, several contributions have indicated the relevance of business rules for model-ing directives and control objectives, e.g. [55]. This section discusses the existing businessrule classifications and proposes a new content-based taxonomy to fit a rule-based com-pliance checking approach used by business users.

    3.1. Existing Business Rule Classifications

    As a consequence of the increasing research attention for business rules, several busi-ness rule classifications were presented. Due to the fact that most of these classificationstarget a different audience (e.g. data base administrators, application developers, processprofessionals), they tend to use different bases of comparison.

    The BRS Rule Classification Scheme proposes a categorization based on the threepossible ways to react to an event , i.e. reject, produce and project [51]. Therejectors, typically called constraints, are business rules that tend to disallow an event ifthis would create a violation of the rule. Producer business rules automatically measuresomething, compute or derive a specific value or infer a term or fact. Projector businessrules automatically invoke specific actions when relevant events occur.

    In its GUIDE report the Business Rules Group distinguishes three main business rulecategories based on structural aspects, i.e. derivation, structural assertion and actionassertion [28]. Derivation business rules define how additional knowledge can be obtainedfrom existing knowledge. Structural assertions can be either the definition of a businessterm or a fact relating different terms to each other. The third category, the actionassertions, deals with constraining enterprise behavior in some way, examples includeauthorizations and integrity constraints. Similarly, Ross distinguishes the definitionalrules from the behavioral rules, which are business rules that can be violated directly(comparable to action assertions) [50].

    According to Wagner rules can be classified by their format and the elements fromwhich they are composed [77]. Derivation rules consist of one or more conditions and oneor more conclusions, whereas production rules combine one or more conditions with oneor multiple actions. Integrity rules or constraints are composed of a constraint modalityand a constraint assertion. Every reaction rules contains a triggering event, an optional

    5

  • condition and a triggered action and/or post-condition. Finally, transformation rulesfocus on controlling the change of a systems state.

    Declarative process modeling contributions (e.g. ConDec [47]) implicitly provide aclassification structure for process or control-flow oriented business rules. Amongthese business rules we distinguish types such as activity existence rules, relationshiprules and choice rules. The existence rules deal with the cardinality of one activity typewithin a process. While required order between activities can be defined with relationshiprules, the choice rules specify the necessity to choose between multiple activities.

    Comparably, data modeling and data base contributions have been structuring thedata oriented business rules. The Object-Role-Modeling methodology [31], for exam-ple, specifies different role-based fact types for typical data rules such as existential facts,elementary facts, uniqueness constraints, mandatory roles, value constraints, derivationsand ring constraints. In [13] a distinction is made between constraints and derivations.The constraints are further divided in state, transition and stimulus/response constraints,specifying respectively the legal values, the allowed value changes and a combination oftriggering event and action. The derivations on the other hand can be subdivided intoinferences and computations.

    Finally, multiple contributions propose the modality or level of enforcement usedfor a business rule as a potential classification basis, e.g. [29, 31]. While alethic businessrules must be satisfied, deontic rules are obligatory but can be violated (in extreme casesthey can be considered as guidelines).

    3.2. Limited Business Audience for Existing Classifications

    The useability of a specific business rule classification scheme will be primarily deter-mined by its intended audience. While most of the existing business rule classificationswill focus on technical aspects, this research report aims at providing adequate guidanceto the business users that need to translate regulations, directives and other elementsof the business environment into business rules. Therefore the business rule taxonomypresented in section 3.3 and further elaborated for a process mining setting in section 4classifies business rules based on their content type. Classifying business rulesbased on content and adding a template to each individual type, which is both under-standable and grounded in a formal definition, enables business users to perform controlsfrom a broad spectrum of common types.

    3.3. Business Rule Content Taxonomy

    Every business rule type deals with an abstract business matter (e.g. an activityauthorization), it has a content type. For use within a specific corporate setting, aspecific business context should be augmented (e.g. a manager is authorized to performa sign/approve activity). The business rule taxonomy presented in figure 2, groupsthe most common business rule types in contemporary organizations by their contenttype. Four main categories have been distinguished: concept-oriented business rules,fact-oriented business rules, activity-oriented business rules and organizational-orientedbusiness rules.

    Concept-oriented rules are business rules that specify something of importance tothe business. While common concepts (e.g. customer) are generally understood, othermore specific business concepts require a concept definition (e.g. gold customer) [28].

    6

  • Common Business

    Rule Content

    Taxonomy

    Concept-Oriented

    BR

    Concept Definition

    BR (D11)

    Concept Value BR

    (D12)

    Concept Derivation

    BR (D13)

    Concept State BR

    (D14)

    Fact-Oriented BR

    Fact Definition BR

    (D21)

    Multiplicity BR

    (D22)

    Subtyping BR (D23)

    Aggregation BR

    (D24)

    Ring BR (D25)

    Activity-Oriented

    BR

    Activity Existence

    BR (F31)

    Activity Co-

    Existence BR (F32)

    Activity Order BR

    (C33)

    Activity

    Precondition BR

    (C34)

    Activity

    Postcondition BR

    (C35)

    Time-Driven

    Activity BR (C36)

    Organizational-

    Oriented BR

    Authorization BR

    (O41)

    Visibility BR (O42)

    Originator

    Cardinality BR

    (O43)

    Deontic BR (O44)

    Figure 2: Common Business Rule Content Taxonomy

    The concept value and concept derivation business rule types group all the business rulesthat define restrictions to the acceptable value. Concept derivations specify allowablevalue sets, intervals or uniqueness constraints, whereas concept derivations specify howthe value of a concept can be inferred/computed from other concepts [13]. Conceptstate business rules on the other hand can specify both the initial concept state and thepossible state transitions.

    Secondly, relationships between concepts are specified and constrained by fact ori-ented business rules. Comparable to concepts, context specific facts require exact factdefinitions. Multiplicity business rules restricts the number of instances for a concepttype in a specific fact type, e.g. each American has exactly one social security num-ber. The subtyping business rules group all business rules that specify the aspects ofworking with different subclasses, i.e. subtyping definitions, exclusiveness constraintsand exhaustiveness constraints. Aggregation business rules group the business rules thatspecify different types of is part of or is composed of relationships between differentconcepts. Ring business rules are rules that specifies restrictions on a pair of compatibleconcepts (i.e. same concepts or concepts with one or more subtyping business rules) in afact type. A typical example of a ring business rule is if employee1 supervises employee2than it is impossible that employee2 supervises employee1. The fact-oriented businessrules are related to the ones typically discussed in data modeling, e.g. [31].

    The third set of business rules is composed of activity-oriented rules, all focusing

    7

  • on different aspects of the activities which are part of the business operations. Activityexistence business rules define restrictions on the (number of) occurrence(s) of a specificactivity in the context of one business case. Comparably, activity co-existence businessrules restrict the co-occurrence of multiple activities within the context of a specific busi-ness case. Activity order business rules define sequence restrictions on multiple activities.The previous activity-oriented business rules are heavily related to the set presented in[48]. Activity preconditions and activity postconditions are business rules that specify acondition that needs to be fulfilled respectively before and after the execution of a specificactivity. Time-driven activity rules impose a time restriction on the business operations,on the activity duration or on the time interval between particular activities.

    Finally, the organizational-oriented business rules are business rules that modelthe obligations, restrictions, permissions etc. of specific agents within the organization.Different types of agents can be defined ranging from a specific employee or role to entiredivisions. Authorization rules specify whether a particular type of agent is allowed toperform a particular type of activity [61]. Comparably, visibility rules define if is anagent of a particular agent type may access particular information. How many timesa (particular) agent of an agent type may execute a specific activity will be recordedwith an originator cardinality business rule. Deontic business rules specify when deonticassignments come into existence or cease to exist based on the occurrence/absence ofcertain business events [21]. Often deontic business rules are combined with a time-restriction.

    4. Comprehensive Rule-Based Compliance Checking with Process Mining

    The previous sections provide both the legitimation for a rule-based compliance check-ing approach with process mining and an overview of potentially interesting business ruletypes. This section will start with an elaborate discussion of the approach. Followed byan adaption of the general content-based rule taxonomy and a formal foundation for theapproach. Finally, some possible extensions including are discussed (i.e. scope, dualityof activities and artifacts and interprocess reconciliations).

    4.1. Architecture behind Comprehensive Rule-Based Compliance Checking with ProcessMining

    The comprehensive rule-based compliance checking approach with process mining isbased on an architecture consisting of three main building blocks: the business prove-nance (or process information) block, the regulation, policies & directives block andthe techniques block.

    Business provenance deals with the systematic and reliant recording of businessevents and (evolutions of) other business artifacts, it keeps track of both the currentstatus and the history. As business processes are becoming heavily supported by process-aware information systems, the event logs of these information systems become a valuablesource of information on the business operations. This process centric information can befurther enriched with information from other data sources, such as the customer manage-ment systems with the client base, the human resource data, etc. The accurate trackingas proposed by business provenance techniques is essential for compliance checking andwill enable root cause analyses for compliance failures.

    8

  • Legislation, Policies,

    Directives

    Business processes Information systemssupported by Event logsactions

    recorded inExternal Data

    Sources

    Rule-based internal

    controls

    Enriched

    process data

    Rule pattern

    Tested against

    Figure 3: Advanced rule-based process mining architecture

    Secondly, legislation, policies and other directives are the major purposes thatjustify the existence compliance checking. These directives constrain the business opera-tions of an organization. Their origins can be structurally different, varying from externalrules imposed by the government (e.g. Sarbanes-Oxley act), over standards set by tradeassociations, to internal policies specified in support of the organizations corporate socialresponsibility programs.

    The third architectural building block contains the techniques that are used to per-form compliance checking: configurable business rule patterns and process mining. Thedirectives, presented in the previous building block, form a control objective that is trans-lated into one or more specific controls. These controls can often be easily mapped ona set of business rules. Moreover, while the controls are devised for a specific organiza-tional setting, it can be concluded that most of them follow one specific rule structure(e.g. segregation of duties, activity existence patterns, arithmetic derivation patterns,etc.) from a limited rule structure set. For these frequently reoccurring rule structureswe will define a rule pattern which is easily understandable by business users, and can beeasily configured to the specific setting. Additionally, each rule pattern is mapped on aformal specification which allows an automated rule testing against the enriched processdata through process mining based compliance checkers, i.e. (process) event-orientedcompliance checkers such as the LTL-checker or any other query environment.

    4.2. Adapting Business Rule Content Taxonomy for Process Mining

    The advanced rule-based process mining architecture indicates the necessity of acomprehensive set of configurable business rule patterns. While the common businessrule content taxonomy presented in section 3.3 forms a strong base for this pattern set, wehave to argue that the taxonomy is not adapted to the typical event-oriented structureof an event log. Therefore this section adapts the classification structure to perfectlyfit the proposed rule-based compliance checking approach with process mining. Theclassification structure is based on two dimensions: the process mining perspectiveand the rule restriction focus.

    4.2.1. Process Mining Perspective Dimension

    The first dimension of the classification refers to the process mining perspective(PMP) that is used in the business rule. Four different perspectives on business processmodeling were introduced in [12] and can also be used to classify the business rule typesin this context.

    9

  • Functional process perspective (PMP1) that deals with the process elements (e.g.activities) that are being performed/occur in a process instance, as well as therelevant process artifacts linked to these process elements (e.g. an invoice artifactfor a pay activity). The business rules in this class form a subset of the businessrules in the F31 and F32 classes in the general taxonomy.

    Control-flow process perspective (PMP2) that covers the process behavior in termsof when process elements can be performed/occur in a process instance. Thisincludes a wide variety of ordering relations between processes elements, as wellas complex decision making conditions, entry and exit criteria, etc. Control-flowbusiness rules could be classified in either class C33, C34, C35 or C36.

    Organizational process perspective (PMP3) that focuses on the organization behindthe business process, which agent performs the different process elements in a pro-cess instance taking into account factors such as timing, environmental conditions,etc. Note that agent can have a broad interpretation, varying from a single person,over a department to a whole organization. This perspective is represented by theorganizational-oriented business rule classes in the general taxonomy.

    Data process perspective (also known as informational perspective) (PMP4) thatrepresents the informational elements (e.g. event data, case date, etc.) that areused, produced or manipulated during the process, as well as relationships amongthem. Business rules from this perspective belong to either the concept-oriented orthe fact-oriented business rules.

    These process mining perspectives can be grouped by their data requirements forevent logs. For business rules from the functional and control-flow it suffices that theevent log contains timestamps, event and activity identifiers. The organizational anddata perspective based business rules require additional event data, such as an originatoridentifier, an amount, etc.

    4.2.2. Rule Restriction Focus Dimension

    Secondly business rules can be classified along their main rule restriction focus(RRF). Five new and distinctive business rule restriction focuses are identified:

    Cardinality-based rules (RRF1) are business rules that restrict the number of al-lowed instances of a specific process element type (i.e. activities, process events,originators or other relevant event data) in a specific process instance.

    Coexistence rules (RRF2) can be defined as business rules that restrict the coexis-tence of process elements of different types over the execution of a specific processinstance.

    Dynamic data-driven rules (RRF3) specify the influence of certain data elements(i.e. case or event data) and their value on the occurrence of process elements in aspecific process instance.

    Relative time rules (RRF4) focus on specifying a time restriction on process el-ements relative to certain points in a process execution (e.g. start of a process,completion of a specific activity, etc.).

    Static property rules (RRF5) deal with specifying a specific property for a particulartype of process element at a predefined process state.

    10

  • While the first four rule restriction focuses deal with dynamic properties (i.e. history-based or future constraining), the last focus deals with static properties (i.e. propertiesin one specific process state). Both dimensions cover the wide spectrum of controls thatcan be implemented and evaluated with process mining techniques.

    4.3. Formal Specification of Rule Patterns

    While the business rule patterns are proposed with a need for comprehensibility by thebusiness actors in mind, specifying them unambiguously is crucial in a risk managementand compliance checking setting. An unambiguous interpretation of the businessrule patterns and consequently the specific configurations, can be obtained by formallyspecifying them. Moreover, the existing model checking approaches enable an automatedidentification of conflicting business rules and consequently conflicting controls. Sinceeach formal expression language has different semantics and a different expressive power,the definition of certain rule patterns may be unnaturally or even impossible in a specificformal specification language. Therefore the use of LTL is recommended for dynamicproperties, whereas first order logic is more suited for static properties. This sectionstarts with some preliminaries, followed by a discussion of the different uses of formallanguages and ends with some concluding remarks.

    4.3.1. Preliminaries: Specifying Business Processes, Business Events and Audit Trails

    In this subsection we provide a formal definition for some basic concepts needed toperform advanced rule-based process mining, i.e. business process, business event andaudit trail.

    Definition 1: Business processes. A business process can be formally representedby a process schema S, which is defined by the tuple (A,R,O,P, , pi,BR) where

    The basic constructs include: A = {a1, a2, a3, ..., an} that denotes the finite set ofall activities, R = {r1, r2, r3, ..., rn} that is used to refer the finite set of roles, O ={o1, o2, o3, ..., on} that represents the finite set of originators and P = {p1, p2, p3, ..., pn}that stands for the finite set of (all other) properties.

    represents the property-type assignment function, : P . A possible setof generally relevant property types is defined as = {Time, InstanceIdentifier,Activity, EventType, Originator, Role, String, Rational, Boolean}.

    pi is the property-set function that specifies the set of properties that is applicablefor each activity, pi : A 2P.

    BR the set of business rules specifying the relevant relations and constraints fora business process (e.g. precedence relations, required roles, etc.). These businessrules may be either implicit (e.g. implicit preconditions in petri nets) or explicit(e.g. precondition in a declartive ConDec model).

    During the execution of a business process a multitude of business events can beobserved. A business event is a relevant occurrence of something (e.g. start of a specificactivity) that happens at a specific time and is of special interest to the business.

    Definition 2: Business event. The events related to activities, which are consideredas the states in a process instance, are described as follows:

    11

  • An event is specified by the values of the related relevant data properties, as suchan event can be denoted as e : P V with e E = {e1, e2, e3, ..., en} the set of allevents and V the set of all possible values for the properties.

    The partial function at least assigns a value to the property activity, which indicatesthe activity related to the event, and all the related properties identified by thepi(ai), in accordance with the property-type assignment.

    Contemporary information systems store a multitude of information about theseevents in a structured way. Therefore, the resulting event logs (denoted by , , etc.)precisely describes the execution of all process instances, within a certain timeframe, i.e.the audit trail.

    Definition 3: Audit trail. An audit trail E is an event sequence, where Erepresents all traces composed of zero or more events of E. Additional properties:

    = e1, e2, ..., en denotes an audit trail with || = n the length of the trail andei = [i] (with 1 i n) the i-th event in the audit trail.

    i represents the suffix of starting at [i], consequently i = ei, ei+1, ..., enEach event log record contains the information of a specific event, including a specific

    event identifier = {i1, i2, i3, ..., in}. The event identifiers can be grouped by the caseor process instance identifier (x) using the following subset definition x : {e E| =x}. Since the exact structure of a specific event log is not known in advance, thiscontribution will use a generic and flexible definition of the function that is used toquery for events. For specific business rules the function notation will contain all therelevant event properties for that function, e.g. for activity order rules it should suffice toprovide the activity (ai) and event type (tj) this results in the following notation (ai, tj).However, when a specific role (rk) is required the function notation will additionallyinclude at least this role, consequently (ai, tj , rk) etc.

    4.3.2. Specifying Dynamic Process Aspects in Linear Temporal Logic

    Since business process management systems can be considered as reactive systems,patterns can be modeled using linear temporal logic (LTL). This results in formulae thatcan be interpreted over linear state sequences, which allows to define internal controlsthat can be interpreted over the entire set of events (i.e. the different states) of a singleprocess instance. However, there exists a crucial difference between a business processinstance and a reactive system, which is the trace length. Regular reactive systems areconsidered non-terminating, which is reflected in the infinite semantics of regular LTL.Therefore a bounded version of the LTL definition (in accordance with [22]) should beused. Secondly, whereas each element of a trace in regular LTL may consist of a set ofproperties, an audit trail contains exactly one event for each element [46].

    Definition 4: LTL formula. An LTL formula p over a subset of E is a function p : E {true, false}, with p denoting that the formula p satisfies trace (i.e. p() = true)and 2 p denoting that formula p does not satisfy trace (i.e. p() = false). For allLTL formulas p and q true, false, p, pq, pq, 2p, 3p, #p, p U q and p W q are LTLformulas as well. The semantics and syntax (in the Manna/Pnueli notation) of LTL aredefined as follows:

    12

  • Atomic proposition: proposition: p if and only if p = [1] Boolean connectives: not (): p if and only if 2 p, and (): pq if and

    only if p and q, or (): pq if and only if p or q, implication(): p q if and only if p q, equivalence (): p q if andonly if (p q) (p q), true (true): true if and only if p p andfalse (false): false if and only if 2 true

    Temporal connectives: next (#): #p if and only if 2 p, until (U): pUq if and only if (1in : (i q (1j T0)))

    4.3.4. Composed Business Rule Patterns

    In reality there will be a need for more sophisticated rule patterns than the atomicones presented in the rule pattern taxonomy. Most of the desired internal control (ef-fectiveness test) patterns can be obtained through a composition of multiple atomicrule patterns. These compositions can be obtained through the use of logical connectorssuch as ,,,, etc. Two types of combination can be identified:

    Combination of rule patterns of different typesFor example a combination of a simple response and precedence rule: A register

    13

  • insurance policy activity (a1) should be followed by a process premium paymentactivity (a2) and a process premium payment activity should be preceded by aregister insurance policy (must be tested for each process instance x)i.e. 2((x, a1, tc) 3(x, a2, tc)) (((x, a2, ts) (x, a2, tc)))W(x, a1, tc)

    Combination of rule patterns of the same typeFor example a combination of two precedence constraints: A pay for damagesactivity (a1) must be preceded by a evaluate claim activity (a2) or an provideexpert review activity (a3) (must be tested for each process instance x)i.e. (((x, a1, ts) (x, a1, tc)))W ((x, a2, tc) (x, a3, tc))

    4.4. Determining the Rule-Scope for Configured Rule Patterns

    Defining the scope of a business rule or control can be considered as specifying theapplicability range of that business rule in terms of a certain dimension. The advancedrule-based approach for process mining seems to imply the business process dimen-sion, for which we distinguish three levels global, (set of) business processes and (set of)business process instances.

    The global process scope requires that each process in an organizations business oper-ations complies with that rule. Typically a record financial transaction activity must beperformed by an agent with role accountant. Secondly, business rules may be imposed toa specific (set of) business processes. This might be the case for a claim handling processwhere the significant settlement proposals (e.g. above $10000) must be approved by amanager. An insurance company might have different insurance products different claimhandling procedures, but it is reasonable to assume that for each process a managementapproval will be needed for a significant settlement proposal. Thirdly, the business rulescope may be restricted to a (set of) business process instances with a specific char-acteristic. Within the context of an order-to-cash a business rule may be defined thatnon-recurring customers give an advance before their order can be processed.

    While the process dimension seems natural in the rule-based process mining setting,other dimensions could be valuable. Typical examples are the stakeholder dimension (e.g.customers, employees, management) or the department/devision dimension (e.g. rangingfrom a specific department to multiple divisions). Additionally, for some business rulesa multidimensional scope definition may be required.

    4.5. Extending the Rule-Based Compliance Checking Approach: Process Reconciliations& Document Driven Information Systems

    This section briefly discusses two interesting extensions for the proposed comprehen-sive rule-based compliance checking approach with process mining: the duality betweenprocess activities and process artifacts and the relationship between process elements ofdifferent processes which enable interprocess reconciliations.

    4.5.1. Duality Between Activities and Artifacts: Compliance Checking for DocumentDriven Information Systems

    The proposed rule patterns are all specified in terms of activities. However, in addi-tion to process-aware information systems there is a significant amount of contemporaryorganizations that use a document-driven (i.e. artifact-centered) information system.An activity-artifact coexistence business rule that specifies that a specific artifact must

    14

  • exist before an activity can be completed, implicitly defines a duality between pro-cess activities and process artifacts. Since process activities and process artifactsare interchangeable, the rule patterns can be easily adapted and used in a data-drivenenvironment. For example, testing whether a provide expert review activity has beenperformed before a pay for damages activity has been executed in a single process in-stance, is equal to testing whether an expert report was created before a payment recordwas recorded for a single business case.

    4.5.2. Interprocess Reconciliations

    A final remark deals with the origin of the events used for the compliance analysis. Un-til now process mining techniques always assumed that the relevant events were containedin one event log. However, contemporary organizations use a multitude of informationsystems which each produce a multitude of event logs for different processes. There mayexist crucial relationships between process elements of different processes thatare worthwhile to analyze for anyone in a control function. While the interpretation ofthe rule patterns can be easily extended to include cross-process relations, the implemen-tation between them remains far from trivial in cases where there is no exact mappingbetween case identifiers in multiple systems. Using data analysis techniques to interpretand compare event data might enable the creation of such a mapping.

    5. Populating the Taxonomy: Identifying Relevant Rule Patterns for Com-pliance Checking & Risk Management

    Specific sets of business rule types will be defined for each possible combination ofelements from both the process mining perspective dimension and the rule restrictionfocus dimension. When selecting the business rule types we took into account the processmining technical feasibility and the auditability of each business rule type. Hence itappears that important (detective) controls, such as direct supervision, could not beincluded in the taxonomy as they are not technically feasible and/or not auditable. Eachof the following subsections introduce the different identified business rule types for aspecific process mining perspective, within the subsections an ordering according to rulerestriction focus is maintained.

    5.1. Functional Perspective

    The functional perspective on business processes deals with the occurrence of processactivities (and related artifacts) in a process instance. In the business rule class wherethe focus is put on cardinality, we typically observe activity cardinality rule subtypes.These business rule subtypes represent the rules that implement a restriction on thenumber of occurrences of a certain activity in a specific process instance [23].

    Combining both the functional perspective and a coexistence focus, results in busi-ness rule types that describe the possibilities of coexistence between activities of differentactivity types and of coexistence between certain activity and event types. The subtypesrange from required coexistence to a forced non-coexistence between activity types andactivity-event type combinations [48]. A dynamic data-driven restriction focus re-sults in rule types that link the existence or absence of an activity of a certain type toa data-oriented condition. This data-oriented condition can be true from the beginning

    15

  • (e.g. if based on case data) or can become true during the process instance execution(e.g. specific event data). Relative time rules in the context of the functional processperspective can refer to the rules that link the existence or absence of a certain activityin a process instance to a relative time condition, such as before X time units startingfrom the process start.

    In addition to rule types that appeal to the dynamic aspect of a process, staticproperty rule types can also be distinguished. The event-artifact coexistence rule sub-types deal with the relation between activities and certain process related artifacts in aparticular process state. This rule type is related to the postcondition as specified in theWeb Service Modeling Ontology (WSMO) [49].

    5.2. Control-Flow Perspective

    The control-flow perspective on business processes focuses on the ordering of the pro-cess elements in a process instance. Coexistence rules within this process perspectivedeal with specifying ordering rules between activities of different activity types. Whereasthe non-overlapping activity rule subtype can be used for just avoiding the concurrentexecution of specific activities, the activity order rule subtypes define exact ordering re-lationships between activities of certain activity types. A lot of research effort towardsdefining activity order rules has been performed by Pesic et al., e.g. [48].

    The data-driven rule types consist of business rules that define data conditionswhich need to be satisfied prior to the start of an activity of a particular activity type,i.e. data-driven activity preconditions. These rule subtypes can be related to the precon-ditions as specified in the WSMO [49].

    A combination of relative time rules and the control-flow perspective results in twosubtypes of time-driven control-flow rules: the activity time rule subtype and the inter-activity time rule subtype. Business rules that belong to the activity time rule subtypespecify a condition on the allowable execution time of an activity of a specific activitytype (e.g. minimum or maximum duration). In contrast business rules of the inter-activity time rule subtype define comparable conditions on the allowable time intervalbetween activities of (different) activity types. Based on the timestamp an absolute timerule type can be specified for the static property focus.

    There is, however, no cardinality-based rule type specified for the control-flow per-spective. The main reason for this is the use of varying granularity: activities in coarsegrained business processes may be composed of subprocesses in finer grained representa-tions of the same business process. Therefore, one could also use the activity cardinalityrules in this context.

    5.3. Organizational Perspective

    The organizational perspective on business processes deals with the (human) resourcesaspect of the business processes, i.e. who performs which activity in the process. Whenfocusing on cardinality, the originator cardinality rule type was identified. This ruletype allows to put restrictions on the allowable number of executions of activities of anactivity type by a specific agent within the context of a single process instance [61]. Thebusiness rule class that is defined by the combination of the organizational perspectivewith a coexistence focus defines three main business rule types: the segregation of dutiesrule type, the binding of duties type and the temporal engagement rule type. Whereas

    16

  • Tab

    le1:

    Ad

    van

    ced

    Ru

    le-B

    ase

    dC

    ontr

    ols

    Cla

    ssifi

    cati

    on

    Cardinality-Based

    Rules

    CoexistenceRules

    Dynamic

    Data-DrivenRules

    RelativeTim

    eRules

    StaticProperty

    Rules

    Functional

    Perspective

    Activity

    Cardinality

    Rule

    Activityexistence

    rule

    Activityabsence

    rule

    Activityrange

    rule

    Activity

    CoexistenceRule

    Activityinclusionrule

    Activitysubstitution

    rule

    Responded

    existence

    rule

    Activitychoicerule

    Event-Activity

    CoexistenceRule

    Event-activity

    inclusionrule

    Event-activity

    exclusionrule

    Event-activitychoice

    rule

    Data-Driven

    ExistenceRule

    Data-drivenactivity

    inclusionrule

    Data-drivenactivity

    exclusionrule

    Tim

    e-Oriented

    ActivityExistence

    Rule

    Tim

    edactivity

    existence

    rule

    Tim

    edactivityabsence

    rule

    Event-Artifact

    CoexistenceRule

    Activity-artifact

    coexistence

    rule

    Non-activity

    event-artifact

    coexistence

    rule

    Control

    Flow

    Perspective

    Non-Overlapping

    ActivityRule

    ActivityOrderRule

    Sim

    pleresponse

    rule

    Sim

    pleprecedence

    rule

    Alternate

    response

    rule

    Alternate

    precedence

    rule

    Chain

    response

    rule

    Chain

    precedence

    rule

    Data-Driven

    Activity

    Preconditionrule

    Activitystart

    preconditionrule

    Activitycomplete

    preconditionrule

    Tim

    e-Driven

    Control-FlowRule

    Activitytimerule

    Inter-activitytimerule

    Absolute

    Tim

    eRule

    1

    Table 1: Advanced Rule-Based Controls Classification

    17

  • Tab

    le1:

    Ad

    van

    ced

    Ru

    le-B

    ase

    dC

    ontr

    ols

    Cla

    ssifi

    cati

    on

    (Conti

    nu

    ed)

    Cardinality-Based

    Rules

    CoexistenceRules

    Dynamic

    Data-DrivenRules

    RelativeTim

    eRules

    StaticProperty

    Rules

    Organization

    Perspective

    Originator

    Cardinality

    Rule

    Segregationof

    Duties

    Con

    flic

    tin

    gro

    les

    Dyn

    am

    icse

    greg

    ati

    on

    of

    du

    ties

    Ope

    rati

    on

    al

    segr

    egati

    on

    of

    du

    ties

    His

    tory

    -base

    dse

    greg

    ati

    on

    of

    du

    ties

    BindingofDuties

    Temporal

    EngagementRule

    Exogenous

    AuthorizationRule

    Originator

    Attribute

    Rule

    Sta

    tic

    ori

    gin

    ato

    ratt

    ribu

    teru

    leD

    ynam

    icori

    gin

    ato

    ratt

    ribu

    teru

    le

    TemporalDeontic

    Rule

    Tem

    pora

    lobl

    igati

    on

    rule

    Tem

    pora

    lpe

    rmis

    sion

    rule

    Tem

    pora

    lpro

    hib

    itio

    nru

    le

    Static

    AuthorizationRule

    Role

    -base

    dact

    ivit

    yau

    thori

    zati

    on

    rule

    Pro

    hib

    ited

    role

    -base

    dall

    ocati

    on

    rule

    Not-

    opti

    mal

    all

    ocati

    on

    rule

    Required

    Originator

    Attribute

    Rule

    DelegationRule

    Pro

    hib

    ited

    role

    acq

    uis

    itio

    nru

    leD

    eleg

    ati

    on

    au

    thori

    zati

    on

    rule

    Ret

    ract

    rest

    rict

    ion

    rule

    Data

    Perspective

    EventData

    Cardinality

    Rule

    Man

    dato

    ryev

    ent

    data

    rule

    Eve

    nt

    data

    mu

    ltip

    lici

    tyco

    nst

    rain

    t

    EventData

    CoexistenceRule

    Dis

    jun

    ctiv

    eev

    ent

    data

    rule

    Mu

    tuall

    yex

    clu

    sive

    even

    tdata

    rule

    DerivedEventData

    Rule

    Ari

    thm

    etic

    der

    ivati

    on

    rule

    Cla

    ssifi

    cati

    on

    rule

    EventData

    ComparisonRule

    Eve

    nt

    data

    equ

    ali

    tyru

    leE

    ven

    tdata

    excl

    usi

    on

    rule

    DynamicIntegrity

    Rule

    Tim

    e-ori

    ente

    din

    tegr

    ity

    rule

    Act

    ivit

    y-ori

    ente

    din

    tegr

    ity

    rule

    Eve

    nt-

    ori

    ente

    din

    tegr

    ity

    rule

    EventData

    Value

    Rule

    Eve

    nt

    data

    valu

    ese

    tru

    leE

    ven

    tdata

    valu

    era

    nge

    rule

    Eve

    nt

    data

    un

    iqu

    enes

    sru

    leIr

    refl

    exiv

    eev

    ent

    data

    rule

    EventData

    Form

    at

    Rule

    1

    Table 2: Advanced Rule-Based Controls Classification (Continued)

    18

  • the segregation of duties rule subtypes focus on avoiding the risks related to letting thesame agent perform all activities, the binding of duties rule type requires that a set ofactivities is performed by the same person. These rule subtypes have been extensivelyresearched, e.g. in [19, 39]. The temporal engagement rule type makes it possible toverify if a specified person had a particular role/function at a certain point in time.

    Business rule types with a dynamic data-driven focus enable the specification oforganizational preconditions. Exogenous authorization rules deal with conditions andfactors external to the process instance executions. Originator attribute rules definerestrictions on who can perform an activity based on a combination of event/case dataand originator specific data. The relative time rules main rule subtypes are linkedto temporal deontic rules. These rules are based on the concept of a deontic assignmentthat can represent amongst others the obligation or permission of an agent to perform aparticular activity while respecting a time constraint, e.g. [80].

    The static property rule types include the static authorization rule type, therequired originator attribute rule type and the delegation rule type. The static au-thorization subtypes allow for the specification of non-evolving authorizations controls.Role-based authorization and prohibited allocation rules define respectively the right toperform an activity and the prohibition to perform an activity. Both are actively beenresearched for example in [19]. Suboptimal allocations that may harm efficiency andeffectiveness can be traced down with non-optimal allocation rules. The second maintype of static property rules, the required originator attribute rules, deals with verifyingwhether an agent possesses certain characteristics needed to perform the activity. Fi-nally, the delegation rule subtypes cover the whole spectrum of concerns that are relatedto role delegation: from allowable allocations, over delegation authorizations to retractpolicies [76].

    While the main organizational rules will focus on restricting the rights of specific in-dividuals or individuals of a certain role, several rule types could be specified for differentagents (e.g. an organizational entity).

    5.4. Data Perspective

    The data or informational perspective on business processes represents the informa-tional elements that are used, produced or manipulated during the process, as well asthe relationships among them. A distinction is made between data elements that relateto events (i.e. event data, e.g. invoice amount in event related to a pay for damagesactivity) and data elements that are specified for a specific business process instance (i.e.case data, e.g. claim for car insurance or premium customer involved).

    In the context of cardinality-based rules, the event data cardinality rule subtypescan be distinguished. Each of these subtypes deals with restricting the allowed numberof instances of a certain data element type for a single instance of a specific event type.The coexistence rules, on the other hand, specify co-occurrence restrictions for dataelements of different types within the context of a single instance of a specific event type.

    The dynamic data driven rule types define the value of specific event data elementsin terms of the value of other data elements. Whereas the derived event data rule subtypesprovide some sort of expression to determine the value of a data element, the event datacomparison rule subtypes define value comparison restrictions (e.g. is equal, is not equal,is larger than, etc.) based on the value of other data elements and possibly over multiple

    19

  • events. These rules are related to the set comparison constraints in the Object-RoleModeling Approach, e.g. [31]. Relative time rules in the data perspective mainlyfocus on specifying the dynamic integrity of data elements value. Admissible changes inthe data elements value are specified in terms of exact time, the execution of an activity[29] or the occurrence of an event.

    Included in the static property rule types for the data perspective are both businessrule types that specify acceptable event data values (in terms of sets, ranges, etc.) andthat impose the event data format, see also [31].

    6. Evaluating Opportunities and Challenges

    This section discusses the opportunities for strategic advantages that can be obtainedwith rule-based compliance checking approaches for process mining over more traditionalauditing and risk management techniques. In addition to the fact that the auditorscredibility is generally higher when they use new technologies [59], we distinguish threeopportunities: high issue detection effectiveness (pursuing absolute assurance), obtainingpersuasive evidence and achieving complete auditor independence. These advantages allcontribute to a more efficient and effective control environment. Furthermore, we discussthe challenges related to event log data quality & preprocessing, to distortions in interpre-tation & pattern design and to the implementation of continuous auditing/monitoring.

    6.1. Opportunity 1: High Issue Detection Effectiveness - Approaching Absolute Assur-ance

    The major enabler for process mining as a computer assisted auditing technique(CAAT) or as a risk event identification approach will be the increased level of assur-ance. Assurance is a measure used to indicate the level of certainty that an agent hasobtained about his statements, related to the controls (e.g. the absence of approval is-sues). Different levels of assurance have been identified: limited, reasonable and absoluteassurance [34]. However, no strict definitions have been provided in the literature.

    Process mining techniques enable the agent to perform the tests on the full populationand thereby offering (near) absolute assurance, with a marginally higher cost in termsof processor time compared to sample-based testing with process mining.

    6.2. Opportunity 2: Obtaining Persuasive Evidence

    In addition to the high issue detection effectiveness, the persuasiveness of the evidenceobtained through process mining techniques is expected to be high. This persuasivenessis strongly related to the competence and the sufficiency of the evidence that can beattained. According to [2], the competence of the evidence is mainly determined bythe independence of the provider, the evaluators direct knowledge, the degree of objec-tivity and the timeliness (referring both to the period covered and the ability to reducethe time-delay). While most of these determinants can be easily related to the basicadvantages of process mining as discussed in the previous sections, a continuous moni-toring / auditing approach is required in order to outperform the traditional monitoring/ auditing techniques on time-delay related timeliness. Due to the facts that the wholepopulation of transactions can be efficiently inspected and that the risk of overlookingan issue is extremely low, the sufficiency of the evidence will be optimal.

    20

  • 6.3. Opportunity 3: Realizing Complete Auditor Independence

    The value of an audit (report) is largely dependent on the independence of the personsinvolved in the auditing process from the organization that is being audited [2]. Twomajor architecture types have been proposed in the context of analytic CAATs; inter-nal (e.g. embedded auditing modules architecture [27]) and external (e.g. monitoringand control architecture [75] and process mining based auditing) auditing modules. Incontrast to the internal auditing module architectures that directly affect the clients in-formation systems, the external auditing module architectures only use the output of theclients software. This gives the latter a major advantage from the auditor independencepoint of view.

    Firstly, the functionality of the audit modules is designed and maintained by theauditor (or his/her firm), without any possible interference from the client or his sys-tem administrators. Secondly, architectures based on external auditing modules can fullyguarantee a non-existence of a-priori knowledge with the client about the auditingapproach and procedures. In contrast, the implementation of internal auditing moduleswould require cooperation of the client, which might result in a-priori knowledge withthe client. Consequently, the client could use this information to conduct fraudulentbehavior that would not trigger any alarms with the auditor. Other advantages of exter-nal over internal auditing modules include: the absence of legal liability for side-effects(e.g. decrease in performance) [40] of the modules in the clients information systemand the reusability of the modules as they are not written in information system specificlanguages (e.g. ERP programming languages) [17].

    6.4. Challenge 1: Event Log Data Quality & Preprocessing

    Compliance and risk analyses require event logs that meet high quality standards inorder to derive meaningful conclusions. Four main quality criteria can be distinguished:the recording of the events should be done in a trustworthy, securely, systematicand accurate manner. The data in an event-log of an information system must berecorded in a trustworthy manner, e.g. ID fraud or backdating must be impossible.Additionally, these recordings must be kept securely, which refers to the prevention ofany tampering with the data after the recording. The third quality criteria, systematicrecording, reflects the need for a timely recording of every important business event (i.e.completeness). In addition to start and end events for activities, the recording of denialevents (resulting from trying to perform unauthorized activities) and cancel events mightbe interesting from a control/audit perspective as they might indicate potential fraudattempts. Accuracy can be regarded as a measure that reflects the correctness of therepresentation in the event-log compared to a real-life event. This can be determinedboth on syntactical as semantical level. In [36], for example, it is argued that majorERP systems can contain missing and faulty data values. While trustworthy & securelyrecording looks at the possibilities for humans to influence the recorded data, accuracyfocuses on the ability of the provenance system to correctly record the data (i.e. noisebecause of technical errors).

    The actual preprocessing challenges are: the selection of a suitable instancelevel, dealing with convergence/divergence and attribute selection. Instancelevel selection is partially determined by the scope of the compliance or risk analysis (e.g.audit), i.e. the business object under evaluation. However, the major determinant will

    21

  • be the granularity of the available events (e.g. availability of order related events versusorder line related events). Related to the instance selection challenge are convergence anddivergence issues [57]. Convergence refers to the phenomenon in which the same activityis executed on multiple process instances at once, e.g. a client pays for multiple ordersat once. Divergence on the other hand refers to the business situation in which the sameactivity is executed multiple times for one process instance, e.g. for each order line of asingle order a delivery activity is recorded. Another important preprocessing decision isthe attribute selection. When too many attributes are included in the event log, the logbecomes unnecessarily large, hard to handle, difficult to load in various tools, etc. Onthe contrary, if too few attributes are included some analyses will become impossible.The scope of the analysis should help clear this up.

    6.5. Challenge 2: Distortions in Interpretation and Pattern Design

    Comparable to other compliance and risk analysis approaches interpretation and de-sign distortions might occur. Legislation, policies and directives can remain vague andambiguous, therefore assumptions are often made in the interpretation phase [41]. Dur-ing the actual configuration or design of the specific set of business rules, the analystsmight be confronted with expressibility limitations for a small number of extremelyrare and context specific internal controls. Additionally, the completeness assump-tion of an event log, every possible behavior is covered in the event log, can often bechallenged. This is especially important for the interpretation of risk assessment analysisresults.

    Compared to the other process mining approaches to risk and compliance analysis,however, the number of possible distortions remains limited. Both conformance checkingand delta analysis are confronted with process overspecification, implicit rules that arenot specified by any directive are included in the process model. Visualization and deltaanalysis techniques additionally face the trade-off between specificity and generalizationof behavior that is made by process mining algorithms.

    6.6. Challenge 3: Implementing Rule-Based Continuous Auditing/Monitoring

    Traditionally, an important time-delay can be observed between the occurrence ofimportant business events and the monitoring/audit report due to the periodic natureof most monitoring/auditing models. In the process mining based management/auditmodels this periodic nature and the related time-delay can also be observed. As a resultof this delay, the information contained in the management/audit report mightbecome less useful or beneficial for its user and therefore significantly affects his/herability to make well-grounded business decisions [4].

    Both continuous monitoring (implemented by management) and continuous auditingfocus on reducing this time-delay by providing reporting/assurance simultaneouslywith, or a short period of time after, the occurrence of events related to the subjectunder investigation [8]. Consequently continuous monitoring and auditing systems mayunlock interesting opportunities; e.g. timely meeting regulatory requirements, promptlyidentifying irregularities or enhancing a stakeholders ability to make decisions.

    Developing continuous auditing/monitoring models based on process mining can be aninteresting challenge. The architectural methodology that could be used for implementingcontinuous auditing with process mining techniques is based on a monitoring control

    22

  • layer (MCL) [75]. This monitoring control layer would consist of adapted process miningtechniques, which are using the event logs/event streams of the information system asan input [72].

    7. Conclusion & Outlook

    Process mining research has been characterized by a narrow research focus on theo-retical improvements and especially by the advancement of process discovery techniques.Therefore only a partial fit between, on the one hand, compliance checking & risk man-agement and, on the other hand, process mining techniques did exist. Aspects of thisrather limited fit include: the ignorance of case and event data, the reasonable doubtabout the correctness of designed process models including overspecification (for confor-mance checking & delta analysis), the need for balance between precision and generalitypotentially whipping out suspicious behavior (for process discovery), etc.

    In this research report we proposed a comprehensive rule-based compliance checkingand risk management approach as a possible solution to eliminate the limited fit. Theapproach enables analysts to uncover compliance failures as well as to identify and assesspotential risks. Improvements can be found in the ability to take additional data intoaccount, the reduction of possible distortions (including over specification), the ability todeal with noise (no need for generalization), etc. Additionally, a rule-based compliancechecking and risk management approach provides information on a potential compliancerisk, whereas recall/precision metrics only provide a process-wide indicator. This con-tribution proposed a extensive content-based business rule classification, resulting in anextensive set of rule patterns that is fit to be used in a common business setting. Whereasthe comprehensibility of the rule patterns is high due to the use of native English, the for-mal grounding removes every room for interpretation. Finally, an evaluation containingthe major opportunities (i.e. effectiveness, persuasive evidence and audit independence)and challenges (i.e. data quality & preprocessing, distortions in interpretation & pat-tern design and continuous monitoring/auditing) was presented. A logical future step inour research is to further test this approach on real-life cases and to tackle the identifiedchallenges. Focus will be placed on the development of a continuous monitoring/auditingapproach based on process mining techniques.

    Acknowledgements

    We would like to thank the K.U.Leuven research council for financial support un-der grant OT/10/010: Business Process Mining: New Techniques and Evaluation Met-rics and the Flemish research council for financial support under the Odysseus grantB.0915.09.

    References

    [1] R. Agrawal, C. Johnson, J. Kiernan, and F. Leymann. Taming compliance with sarbanes-oxleyinternal controls using database technology. In Proceedings of the 22nd International Conferenceon Data Engineering, pages 9292. IEEE, 2006.

    [2] A.A. Arens, R.J. Elder, and M.S. Beasley. Auditing and assurance services: An integrated approach.Pearson Education, 2005.

    23

  • [3] J. Bang-Jensen and G.Z. Gutin. Digraphs: theory, algorithms and applications. Springer Verlag,2010.

    [4] ISACA Standards Board. Continuous auditing: Is it fantasy or reality? Information SystemsControl Journal, 5, 2002.

    [5] F. Casati, S. Castano, M. Fugini, I. Mirbel, and B. Pernici. Using patterns to design rules inworkflows. IEEE Transactions on Software Engineering, 26(8):760785, 2000.

    [6] F. Chesani, P. Mello, M. Montali, F. Riguzzi, M. Sebastianis, and S. Storari. Checking complianceof execution traces to business rules. In Business Process Management Workshops, pages 134145.Springer, 2009.

    [7] E.M. Clarke, O. Grumberg, and D.A. Peled. Model checking. MIT Press, January 2000.[8] Wood Committee. Research report: continuous auditing. Technical report, Canadian Institute of

    Chartered Accountants & American Institute of Certified Public Accountants, 1999.[9] J.E. Cook and A.L. Wolf. Discovering models of software processes from event-based data. ACM

    Transactions on Software Engineering and Methodology, 7(3):215249, 1998.[10] J.E. Cook and A.L. Wolf. Software process validation: quantitatively measuring the correspondence

    of a process to a model. ACM Transactions on Software Engineering and Methodology, 8(2):147176, 1999.

    [11] COSO. Enterprise risk management - integrated framework. Technical report, Committee of Spon-soring Organizations of the Treadway Commission, 2004.

    [12] B. Curtis, M.I. Kellner, and J. Over. Process modeling. Communications of the ACM, 35(9):7590,1992.

    [13] C.J. Date. What not how: the business rules approach to application development. Addison-WesleyProfessional, 2000.

    [14] H. de Beer. The LTL checker plugins: A reference manual. Technical report, Eindhoven Universityof Technology, 2004.

    [15] A.K.A. de Medeiros, W.M.P. van der Aalst, and C. Pedrinaci. Semantic process mining tools: corebuilding blocks. In 16th European Conference on Information Systems, pages 19531964. Citeseer,2008.

    [16] A.K.A. de Medeiros, B.F. van Dongen, W.M.P. van der Aalst, and A. Weijters. Process mining: ex-tending the -algorithm to mine short loops. Technical report, Eindhoven University of Technology,2004.

    [17] R.S. Debreceny, G.L. Gray, J.J.J. Ng, K.S.P. Lee, and W.F. Yau. Embedded audit modules inenterprise resource planning systems: implementation and functionality. Journal of InformationSystems, 19:7, 2005.

    [18] R. Dijkman, M. Dumas, B. Van Dongen, R. Kaarik, and J. Mendling. Similarity of business processmodels: Metrics and evaluation. Information Systems, 36(2):498516, 2011.

    [19] D. Ferraiolo, D.R. Kuhn, and R. Chandramouli. Role-based access control. Artech House Publishers,2003.

    [20] D.R. Ferreira. Applied sequence clustering techniques for process mining. Handbook of Researchon Business Process Modeling. IGI Global, 2009.

    [21] D. Follesdal and R. Hilpinen. Deontic logic: An introduction, chapter Deontic Logic: Introductoryand Systematic Readings, pages 135. D. Reidel Publishing Company, Dordrecht, 1971.

    [22] D. Giannakopoulou and K. Havelund. Automata-based verification of temporal properties on run-ning programs. In Proceedings of the 16th Annual Conference on Automated Software Engineering,pages 412416. Published by the IEEE Computer Society, 2001.

    [23] S. Goedertier, R. Haesen, and J. Vanthienen. Rule-based business process modelling and enactment.International Journal of Business Process Integration and Management, 3(3):194207, 2008.

    [24] S. Goedertier, D. Martens, J. Vanthienen, and B. Baesens. Robust process discovery with artificialnegative events. The Journal of Machine Learning Research, 10:13051340, 2009.

    [25] R. Gopal, J.R. Marsden, and J. Vanthienen. Information mining-reflections on recent advancementsand the road ahead in data, text, and media mining. Decision Support Systems, 2011.

    [26] G. Governatori and Z. Milosevic. Dealing with contract violations: formalism and domain specificlanguage. In Proceedings of the International Enterprise Distributed Objet Computing Conference,pages 4757, 2005.

    [27] S.M. Groomer and U.S. Murthy. Continuous auditing of database applications: An embedded auditmodule approach. Journal of Information Systems, 3(2):5370, 1989.

    [28] Business Rules Group. Defining business rules: What are they really?(3rd edition). Technicalreport, Business Rules Group, 2000.

    [29] Object Management Group. Semantics of business vocabulary and rules (SBVR). Technical report,

    24

  • Technical Report dtc/060302, 2006.[30] G. Guizzardi. Ontological foundations for structural conceptual models. 2005.[31] T. Halpin. Object-role modeling (ORM/NIAM). Handbook on Architectures of Information Sys-

    tems, pages 81103, 2006.[32] S.M. Huang, D.C. Yen, Y.C. Hung, Y.J. Zhou, and J.S. Hua. A business process gap detecting

    mechanism between information system process flow and internal control flow. Decision SupportSystems, 47(4):436454, 2009.

    [33] S.Y. Hwang and W.S. Yang. On the discovery of process models from their instances* 1. DecisionSupport Systems, 34(1):4157, 2002.

    [34] IFAC. Handbook of international auditing, assurance and ethics pronouncements. InternationalFederation of Accountants, 2008.

    [35] IIA. Position statement: The role of internal audit in enterprise-wide risk management. Technicalreport, The Institute of Internal Auditors, 2004.

    [36] J.E. Ingvaldsen and J.A. Gulla. Preprocessing support for large scale process mining of SAP trans-actions. In Proceedings of the 2007 International Conference on Business Process Management,pages 3041. Springer, 2007.

    [37] M. Jans, N. Lybaert, and K. Vanhoof. Internal fraud risk reduction: Results of a data mining casestudy. International Journal of Accounting Information Systems, 11(1):1741, 2010.

    [38] M. Jans, N. Lybaert, K. Vanhoof, and J.M. van der Werf. A business process mining applicationfor internal transaction fraud mitigation. Expert Systems with Applications, 38(10):1335113359,2011.

    [39] K. Knorr and H. Stormer. Modeling and analyzing separation of duties in workflow environments.Trusted Information, pages 199212, 2002.

    [40] J.R. Kuhn Jr and S.G. Sutton. Continuous auditing in ERP system environments: The currentstate and future directions. Journal of Information Systems, 24(1).

    [41] K.J. Lee and B. Jeon. Analysis of best practice policy and benchmarking behavior for governmentknowledge management. Knowledge Management in Electronic Government, pages 7079, 2004.

    [42] Y. Liu, S. Muller, and K. Xu. A static compliance-checking framework for business process models.IBM Systems Journal, 46(2):335361, 2007.

    [43] R. Lu, S. Sadiq, and G. Governatori. Compliance aware business process design. In BusinessProcess Management Workshops, pages 120131. Springer, 2008.

    [44] R.S. Mans, M.H. Schonenberg, M. Song, W.M.P. Aalst, and P.J.M. Bakker. Application of pro-cess mining in healthcarea case study in a dutch hospital. Biomedical Engineering Systems andTechnologies, 25:425438.

    [45] M. Montali. Declarative process pining. Specification and Verification of Declarative Open Inter-action Models, pages 343365, 2010.

    [46] M. Pesic. Constrained-based workflow management systems: Shifting control to users. PhD thesis,Eindhoven University of Technology, 2008.

    [47] M. Pesic, M.H. Schonenberg, N. Sidorova, and W.M.P. Van Der Aalst. Constraint-based workflowmodels: Change made easy. In Proceedings of the 2007 OTM Confederated international conferenceon the move to meaningful internet systems, pages 7794. Springer-Verlag, 2007.

    [48] M. Pesic and W.M.P. van der Aalst. A declarative approach for flexible business processes man-agement. In Business Process Management Workshops, pages 169180. Springer, 2006.

    [49] D. Roman, U. Keller, H. Lausen, J. de Bruijn, R. Lara, M. Stollberg, A. Polleres, C. Feier, C. Bussler,and D. Fensel. Web service modeling ontology. Applied Ontology, 1(1):77106, 2005.

    [50] R. Ross. Business Rule Conepts (3rd Edition). Business Rule Solutions, 2009.[51] R.G. Ross. Principles of the business rule approach. Addison-Wesley Professional, 2003.[52] A. Rozinat and W.M.P. van der Aalst. Conformance testing: Measuring the fit and appropriateness

    of event logs and process models. In Business Process Management Workshops, pages 163176.Springer, 2006.

    [53] A. Rozinat and W.M.P. van der Aalst. Conformance checking of processes based on monitoringreal behavior. Information Systems, 33(1):6495, 2008.

    [54] V. Rubin, C. Gunther, W.M.P. van der Aalst, E. Kindler, B. van Dongen, and W. Schafer. Processmining framework for software processes. Software Process Dynamics and Agility, pages 169181,2007.

    [55] S. Sadiq, G. Governatori, and K. Namiri. Modeling control objectives for business process compli-ance. Business Process Management, pages 149164, 2007.

    [56] S.W. Sadiq, M.E. Orlowska, and W. Sadiq. Specification and validation of process constraints forflexible workflows. Information Systems, 30(5):349378, 2005.

    25

  • [57] I.E.A. Segers. Investigating the application of process mining for auditing purposes. Masters thesis,Technische Universiteit Eindhoven, 2007.

    [58] M. Song, C.W. Gunther, and W.M.P. Aalst. Trace clustering in process mining. In Business ProcessManagement Workshops, pages 109120. Springer, 2009.

    [59] S. Sutton, R. Young, and P. Mckenzie. An analysis of potential legal liability incurred through auditexpert systems. Intelligent Systems in Accounting, Finance and Management, 4:191204, 1994.

    [60] N. Syed Abdullah, S. Sadiq, and M. Indulska. Emerging challenges in information systems researchfor regulatory compliance management. In Advanced Information Systems Engineering, volume6051/2010.

    [61] K. Tan, J. Crampton, and C.A. Gunter. The consistency of task-based authorization constraintsin workflow. In Proceedings of the 17th IEEE Computer Security Foundations Workshop, pages155169. IEEE, 2004.

    [62] W.M.P. van Aalst, K.M. Van Hee, J.M. van Werf, and M. Verdonk. Auditing 2.0: Using processmining to support tomorrows auditor. Computer, 43(3):9093, 2010.

    [63] W. Van der Aalst, T. Weijters, and L. Maruster. Workflow mining: Discovering process modelsfrom event logs. IEEE Transactions on Knowledge and Data Engineering, 16(9):11281142, 2004.

    [64] W.M.P. van der Aalst. Business alignment: Using process mining as a tool for delta analysis andconformance testing. Requirements Engineering Journal, 10(3):198211, 2005.

    [65] W.M.P. van der Aalst, H.T. De Beer, and B.F. van Dongen. Process mining and verification ofproperties: An approach based on temporal logic. On the Move to Meaningful Internet Systems2005: CoopIS, DOA, and ODBASE, 3760/2005:130147.

    [66] W.M.P. Van Der Aalst and M. Pesic. DecSerFlow: Towards a truly declarative service flow language.Web Services and Formal Methods, pages 123, 2006.

    [67] W.M.P. van der Aalst, H.A. Reijers, A.J.M.M. Weijters, B.F. van Dongen, A.K. Alves de Medeiros,M. Song, and H.M.W. Verbeek. Business process mining: An industrial application. InformationSystems, 32(5):713732, 2007.

    [68] W.M.P. Van der Aalst and M. Song. Mining social networks: Uncovering interaction patterns inbusiness processes. Business Process Management, pages 244260, 2004.

    [69] W.M.P. van der Aalst and B. Van Dongen. Discovering workflow performance models from timedlogs. Engineering and Deployment of Cooperative Information Systems, 2480/2002:107110.

    [70] W.M.P. van der Aalst, B. van Dongen, C. Gunther, R. Mans, A. de Medeiros, A. Rozinat, V. Rubin,M. Song, H. Verbeek, and A. Weijters. ProM 4.0: Comprehensive support for real process analysis.Petri Nets and Other Models of ConcurrencyICATPN 2007, pages 484494, 2007.

    [71] W.M.P. Van der Aalst, B.F. Van Dongen, C. Gunther, A. Rozinat, H.M.W. Verbeek, and A. Wei-jters. ProM: the process mining toolkit. In Proceedings of the International Conference on BusinessProcess Management, pages 14, 2009.

    [72] W.M.P. van der Aalst, K. van Hee, J.M. van der Werf, A. Kumar, and M. Verdonk. Conceptualmodel for on line auditing. Decision Support Systems, 50, 2010.

    [73] W.M.P. van der Aalst and A. Weijters. Process mining: a research agenda. Computers in Industry,53(3):231244, 2004.

    [74] M. van Giessel and M.H. Jansen-Vullers. Process mining in sap r/3. Masters thesis, EindhovenUniversity of Technology, Eindhoven, 2004.

    [75] M.A. Vasarhelyi, M.G. Alles, and A. Kogan. Principles of analytic monitoring for continuousassurance. Journal of Emerging Technologies in Accounting, 1(1):121, 2004.

    [76] K. Venter and M. Olivier. The delegation authorization model: A model for the dynamic delega-tion of authorization rights in a secure workflow management system. Proceedings of InformationSecurity South Africa (ISSA02), 2002.

    [77] G. Wagner. Rule modeling and markup. Reasoning Web, pages 251274, 2005.[78] A. Weijters and W.M.P. van der Aalst. Process mining: discovering workflow models from event-

    based data. In Proceedings of the 13th Belgium-Netherlands Conference on Artificial Intelligence(BNAIC 2001), pages 283290. Citeseer, 2001.

    [79] A. Weijters, W.M.P. van der Aalst, and A.K.A. de Medeiros. Process mining with the heuristicsminer-algorithm. Technische Universiteit Eindhoven, Tech. Rep. WP, 166, 2006.

    [80] P. Yolum and M.P. Singh. Reasoning about commitments in the event calculus: An approach forspecifying and executing protocols. Annals of Mathematics and Artificial Intelligence, 42(1):227253, 2004.

    [81] M. zur Muehlen and M. Rosemann. Integrating risks in business process models. In Proceedings of16th Australasian Conference on Information Systems, 2005.

    26

  • FACULTY OF ECONOMICS AND BUSINESS Naamsestraat 69 bus 3500

    3000 LEUVEN, BELGI tel. + 32 16 32 66 12 fax + 32 16 32 67 91

    [email protected] www.econ.kuleuven.be

    KBI_1305AuditBusinessRules