Decision Mining Revisited Discovering Overlapping Rules Felix Mannhardt, Massimiliano de Leoni, Hajo A. Reijers, Wil M.P. van der Aalst

Decision Mining Revisited - Discovering Overlapping Rules

Page 1: Decision Mining Revisited - Discovering Overlapping Rules

Decision Mining Revisited - Discovering Overlapping Rules

Felix Mannhardt, Massimiliano de Leoni, Hajo A. Reijers, Wil M.P. van der Aalst

Page 2: Decision Mining Revisited - Discovering Overlapping Rules

Scope: Mining decision rules from event logs

Figure: example process with activities and data.
Activities: Apply, Extensive Check, Simple Check, Request Information, Receive Information, Grant, Reject
Data: Amount, Eligibility, Income, Category
Legend: Activity, Data

Page 3: Decision Mining Revisited - Discovering Overlapping Rules

Control-flow – Petri net defines order & possible choices

Figure: Petri net of the process.
Transitions: Apply, Extensive Check, Simple Check, Request Information, Receive Information, Grant, Reject
Annotations: Exclusive Choice, Sequence, Exclusive Choice

Page 4: Decision Mining Revisited - Discovering Overlapping Rules

Data-perspective – Data Petri Net modelling decisions

Figure: Data Petri net of the process.
Transitions: Apply, Extensive Check, Simple Check, Request Information, Receive Information, Grant, Reject
Variables: Amount, Eligibility, Rating
Guards: [Eligibility = No], [Eligibility = Yes]
Annotations: Decision point, Data recording, Decision rule

Page 5: Decision Mining Revisited - Discovering Overlapping Rules

Comparing the Petri net notation to DMN

DMN 1.1 was released in 2016.

Widely adopted by tool vendors.

Decision Table

U   Eligibility   Outcome
1   Yes           Grant
2   No            Reject

Decision Rule / Guard

Grant:  [Eligibility = Yes]
Reject: [Eligibility = No]

Page 6: Decision Mining Revisited - Discovering Overlapping Rules

Why are overlapping rules needed?

Incomplete Information

• Not recorded
• Process context
• Confidential
• ...

• Expert approval
• Deferred choice
• Randomized check
• Inconsistent human behavior
• ...

Page 7: Decision Mining Revisited - Discovering Overlapping Rules

Goal: Discover rules which may overlap

Figure: Process Model + Event Log -> Overlapping Rule Discovery -> Process Model with Overlapping Decision Rules

Page 8: Decision Mining Revisited - Discovering Overlapping Rules

Decision point - Mutually-exclusive rule

Guards at the decision point:

Grant:  [Eligibility = Yes]
Reject: [Eligibility = No]

Observation instances from an event log

Count   Eligibility   Outcome
5x      "No"          Reject
20x     "Yes"         Grant

Page 9: Decision Mining Revisited - Discovering Overlapping Rules

Decision point – Overlapping rule

Figure: Data Petri net with the variables Amount and Rating, written by Apply, and a decision point with three guarded activities.

Request Information: [Rating = Unknown OR Rating = Bad AND Amount = High]
Extensive Check:     [Rating = Bad]
Simple Check:        [Rating = Good OR Rating = Bad AND Amount = Low]

Alternative Decision Table Notation

C   Rating    Amount   Activity
1   Good      -        Simple Check
2   Bad       -        Extensive Check
3   Bad       Low      Simple Check
4   Bad       High     Request Information
5   Unknown   -        Request Information
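To make the overlap concrete, the three guards can be evaluated against a single assignment of Rating and Amount. The snippet below is a minimal sketch: the guards are copied from the slide, while the `GUARDS` dictionary and the `enabled_activities` helper are hypothetical names introduced only for illustration.

```python
# Guards copied from the slide. The dictionary layout and the helper name
# are illustrative assumptions, not part of any existing tool.
GUARDS = {
    "Request Information": lambda r, a: r == "Unknown" or (r == "Bad" and a == "High"),
    "Extensive Check": lambda r, a: r == "Bad",
    "Simple Check": lambda r, a: r == "Good" or (r == "Bad" and a == "Low"),
}


def enabled_activities(rating, amount):
    """Return every activity whose guard holds for the given values."""
    return sorted(name for name, guard in GUARDS.items() if guard(rating, amount))


if __name__ == "__main__":
    for rating, amount in [("Good", "Low"), ("Bad", "High"), ("Bad", "Low"), ("Unknown", "High")]:
        print(rating, amount, enabled_activities(rating, amount))
```

For Rating = Bad two guards hold at once, which is exactly the overlap: the model allows either activity and leaves the final choice open.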

Page 10: Decision Mining Revisited - Discovering Overlapping Rules

Proposed Discovery Method

Figure: Process Model + Event Log -> Overlapping Rule Discovery -> Process Model With Overlapping Rules

foreach Decision Point:

1) Collect Instances
2) 1st Classification
3) Collect Misclassified
4) 2nd Classification
5) Build Rules

Page 11: Decision Mining Revisited - Discovering Overlapping Rules

1) Collect Instances

Event Log -> collect -> Observation instances

Observation instances

Count   Rating    Amount   Outcome
6x      Good      Low      Simple
6x      Good      High     Simple
6x      Bad       High     Extensive
4x      Bad       High     Request
6x      Bad       Low      Extensive
4x      Bad       Low      Simple
6x      Unknown   High     Request

• Cyclic Behavior
• Noise (Missing / Additional Events)
• Unassigned values
• Inconsistent recording

Alignment-based method

Page 12: Decision Mining Revisited - Discovering Overlapping Rules

2) 1st Classification & 3) Misclassified Instances

Instances

Count   Rating    Amount   Outcome
6x      Good      Low      Simple
6x      Good      High     Simple
6x      Bad       High     Extensive
4x      Bad       High     Request
6x      Bad       Low      Extensive
4x      Bad       Low      Simple
6x      Unknown   High     Request

Decision Tree (split on Rating)

Rating = Good      -> Simple      12 OK
Rating = Bad       -> Extensive   12 OK, 8 NOK
Rating = Unknown   -> Request     6 OK
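The OK / NOK counts can be reproduced with a small sketch. It assumes a one-level tree that predicts the majority outcome per Rating value, which stands in for a real decision tree learner here. The function names `majority_split` and `misclassified` are hypothetical.

```python
from collections import Counter, defaultdict

# Observation instances from the slide: (count, Rating, Amount, Outcome).
INSTANCES = [
    (6, "Good", "Low", "Simple"),
    (6, "Good", "High", "Simple"),
    (6, "Bad", "High", "Extensive"),
    (4, "Bad", "High", "Request"),
    (6, "Bad", "Low", "Extensive"),
    (4, "Bad", "Low", "Simple"),
    (6, "Unknown", "High", "Request"),
]


def majority_split(instances, column):
    """Map each value of `column` (1 = Rating, 2 = Amount) to its majority outcome."""
    votes = defaultdict(Counter)
    for row in instances:
        votes[row[column]][row[3]] += row[0]
    return {value: counts.most_common(1)[0][0] for value, counts in votes.items()}


def misclassified(instances, column, tree):
    """Return the rows whose recorded outcome differs from the tree's prediction."""
    return [row for row in instances if tree[row[column]] != row[3]]


first_tree = majority_split(INSTANCES, 1)
wrong = misclassified(INSTANCES, 1, first_tree)

if __name__ == "__main__":
    print(first_tree)
    print(wrong, "NOK:", sum(row[0] for row in wrong))
```

The eight misclassified instances all sit in the Rating = Bad leaf. They are the input to the second classification.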

Page 13: Decision Mining Revisited - Discovering Overlapping Rules

4) 2nd Classification

Instances (misclassified by the 1st classification)

Count   Rating   Amount   Outcome
4x      Bad      High     Request
4x      Bad      Low      Simple

2nd Decision Tree (split on Amount)

Amount = High   -> Request
Amount = Low    -> Simple
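Why the second tree splits on Amount rather than Rating can be checked with a minimal sketch. It assumes the split attribute is the one with the fewest majority-vote errors, a simplification of real decision tree induction. The helper name `best_stump` is hypothetical.

```python
from collections import Counter, defaultdict

# Misclassified instances from the slide: (count, Rating, Amount, Outcome).
MISCLASSIFIED = [
    (4, "Bad", "High", "Request"),
    (4, "Bad", "Low", "Simple"),
]
ATTRIBUTES = {"Rating": 1, "Amount": 2}  # attribute name -> tuple index


def best_stump(instances, attributes):
    """Return (attribute, leaves, errors) for the single split with fewest errors."""
    best = None
    for name, column in attributes.items():
        votes = defaultdict(Counter)
        for row in instances:
            votes[row[column]][row[3]] += row[0]
        leaves = {value: c.most_common(1)[0][0] for value, c in votes.items()}
        # Errors are the instances outside the majority class of each leaf.
        errors = sum(sum(c.values()) - max(c.values()) for c in votes.values())
        if best is None or errors < best[2]:
            best = (name, leaves, errors)
    return best


attribute, second_tree, errors = best_stump(MISCLASSIFIED, ATTRIBUTES)

if __name__ == "__main__":
    print(attribute, second_tree, errors)
```

Rating cannot separate the two groups because both have Rating = Bad, so Amount is chosen and classifies all eight instances correctly.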

Page 14: Decision Mining Revisited - Discovering Overlapping Rules

5) Build Overlapping Decision Rules

1st Decision Tree (split on Rating)

Rating = Good      -> Simple
Rating = Bad       -> Extensive
Rating = Unknown   -> Request

2nd Decision Tree (split on Amount)

Amount = High   -> Request
Amount = Low    -> Simple

Compiled to overlapping rules

If Rating = Good then Simple
If Rating = Unknown then Request
If Rating = Bad then Extensive
If Rating = Bad AND Amount = High then Request
If Rating = Bad AND Amount = Low then Simple
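The compilation step can be sketched as follows. The sketch assumes that each leaf of the second tree is conjoined with the first-tree leaf whose misclassified instances it was learned from (Rating = Bad), and that rules with the same outcome are joined with OR. The function names and the dictionary encoding of the trees are hypothetical.

```python
from collections import defaultdict

# Leaves of the two trees from the slides, encoded as value -> outcome.
FIRST_TREE = {"Good": "Simple", "Unknown": "Request", "Bad": "Extensive"}
SECOND_TREE = {"High": "Request", "Low": "Simple"}
MISCLASSIFIED_LEAF = "Bad"  # first-tree leaf that produced the misclassified instances


def compile_rules(first_tree, second_tree, leaf):
    """List (condition, outcome) pairs: first-tree rules plus refined second-tree rules."""
    rules = [(f"Rating = {rating}", outcome) for rating, outcome in first_tree.items()]
    rules += [
        (f"Rating = {leaf} AND Amount = {amount}", outcome)
        for amount, outcome in second_tree.items()
    ]
    return rules


def guards_per_outcome(rules):
    """Join all conditions that lead to the same outcome with OR."""
    grouped = defaultdict(list)
    for condition, outcome in rules:
        grouped[outcome].append(condition)
    return {outcome: " OR ".join(conditions) for outcome, conditions in grouped.items()}


rules = compile_rules(FIRST_TREE, SECOND_TREE, MISCLASSIFIED_LEAF)
guards = guards_per_outcome(rules)

if __name__ == "__main__":
    for outcome, guard in guards.items():
        print(f"{outcome}: [{guard}]")
```

Because the first-tree rule for Rating = Bad is kept alongside the refined rules, the resulting guards overlap instead of partitioning the instances.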

Page 15: Decision Mining Revisited - Discovering Overlapping Rules

Resulting Data-aware Process Model

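Figure: Data Petri net with the variables Amount and Rating, written by Apply, and the discovered guards at the decision point.

Request Information: [Rating = Unknown OR Rating = Bad AND Amount = High]
Extensive Check:     [Rating = Bad]
Simple Check:        [Rating = Good OR Rating = Bad AND Amount = Low]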

Page 16: Decision Mining Revisited - Discovering Overlapping Rules

Trade-off: Precise and fitting model

Observation instances

Count   Rating    Amount   Outcome
6x      Good      Low      Simple
6x      Good      High     Simple
6x      Bad       High     Extensive
4x      Bad       High     Request
6x      Bad       Low      Extensive
4x      Bad       Low      Simple
6x      Unknown   High     Request

Model without rules: Imprecise [Underfitting]

Request Information: (no guard)
Extensive Check:     (no guard)
Simple Check:        (no guard)

Model with overlapping rules: Good Trade-off

Request Information: [Rating = Unknown OR Rating = Bad AND Amount = High]
Extensive Check:     [Rating = Bad]
Simple Check:        [Rating = Good OR Rating = Bad AND Amount = Low]

Model with mutually-exclusive rules: Unfitting

Request Information: [Rating = Unknown]
Extensive Check:     [Rating = Bad]
Simple Check:        [Rating = Good]
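The labels can be illustrated with two rough proxies computed on the observation instances. These are simplified stand-ins, not the measures used in the evaluation. The fitness proxy is the share of instances whose recorded outcome is allowed by the guards. The precision proxy is the average share of enabled activities that were actually observed for the same attribute values. All function names are hypothetical.

```python
# Observation instances from the slide: (count, Rating, Amount, Outcome).
INSTANCES = [
    (6, "Good", "Low", "Simple"),
    (6, "Good", "High", "Simple"),
    (6, "Bad", "High", "Extensive"),
    (4, "Bad", "High", "Request"),
    (6, "Bad", "Low", "Extensive"),
    (4, "Bad", "Low", "Simple"),
    (6, "Unknown", "High", "Request"),
]
TOTAL = sum(row[0] for row in INSTANCES)
EXCLUSIVE = {"Good": "Simple", "Bad": "Extensive", "Unknown": "Request"}


def no_guards(rating, amount):
    """Model without rules: every activity is always enabled."""
    return {"Simple", "Extensive", "Request"}


def exclusive(rating, amount):
    """Mutually-exclusive rules: exactly one activity per Rating value."""
    return {EXCLUSIVE[rating]}


def overlapping(rating, amount):
    """Overlapping rules: Rating = Bad also enables the Amount-based alternative."""
    enabled = exclusive(rating, amount)
    if rating == "Bad":
        enabled.add("Request" if amount == "High" else "Simple")
    return enabled


def fitness(model):
    """Proxy: share of instances whose recorded outcome the model allows."""
    return sum(c for c, r, a, o in INSTANCES if o in model(r, a)) / TOTAL


def precision(model):
    """Proxy: weighted share of enabled activities that were also observed."""
    observed = {}
    for _, r, a, o in INSTANCES:
        observed.setdefault((r, a), set()).add(o)
    score = sum(c * len(model(r, a) & observed[(r, a)]) / len(model(r, a))
                for c, r, a, _ in INSTANCES)
    return score / TOTAL


if __name__ == "__main__":
    for name, model in [("no guards", no_guards), ("exclusive", exclusive),
                        ("overlapping", overlapping)]:
        print(f"{name}: fitness={fitness(model):.2f} precision={precision(model):.2f}")
```

Under these proxies the model without rules fits every instance but enables unobserved activities, the mutually-exclusive model rejects the 8 instances that deviate from the majority, and the overlapping model fits all 38 instances.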

Page 17: Decision Mining Revisited - Discovering Overlapping Rules

Evaluation – Measures

Precision: How much unobserved behavior is modelled?

Fitness: How much observed behavior is modelled?

Image source (CC BY-SA): https://en.wikipedia.org/wiki/Precision_and_recall#/media/File:Precisionrecall.svg

Page 18: Decision Mining Revisited - Discovering Overlapping Rules

Evaluation – Setup

Compared Methods

Method   Description                      Expected Precision   Expected Fitness
WO       Without rules                    Poor                 Good
DTF      Mutually-exclusive approach      Good                 Poor
DTT      Naïve overlapping approach       Poor                 Good
DTO      Presented overlapping approach   Balanced             Balanced

Datasets

Dataset      # Traces   # Events   # Attributes   # Decisions
Road Fines   150,000    500,000    9              5
Hospital     1,000      15,000     39             11

Page 19: Decision Mining Revisited - Discovering Overlapping Rules

Evaluation – Example rules in the hospital data

Figure: fragment of the hospital process model.
Labels: Triage, S-p5, Intensive Care, Normal Care, skip, Infusions, Tests, Release
Attributes: Lactate (L), Hypotensie (H)

Method   Intensive Care     Normal Care   Skip                          Assessment
DTO      L > 0 ∧ H = true   L > 0         L ≤ 0 ∨ (L > 0 ∧ H = false)   Good trade-off
DTT      true               L > 0         L ≤ 0                         Imprecise
DTF      false              L > 0         L ≤ 0                         Unfitting

Page 20: Decision Mining Revisited - Discovering Overlapping Rules

Evaluation – Precision & Fitness

Fitness

• Fitness: how often the rules are violated
• DTO improves fitness over DTF (mutually-exclusive)

Precision

• Precision: how strict the rules are
• DTO improves precision over WO
• DTO does sacrifice precision vs. DTF

Page 21: Decision Mining Revisited - Discovering Overlapping Rules

Conclusion & Future Work

• Method: Discovery of overlapping rules using event logs
  • Based on decision tree induction
  • ProM framework: MultiPerspectiveExplorer (http://www.promtools.org)
• Results: Trade-off fitness & precision
  • Improves the model fitness over standard trees
  • Improves the model precision over naïve approach
• Future work
  • Better experimental validation
  • Manage the complexity of discovered rules
  • Imbalanced distributions

Page 22: Decision Mining Revisited - Discovering Overlapping Rules

Questions?


@fmannhardt - http://promtools.org

Multi-Perspective Explorer