Mining Project-Oriented Business Processes
Saimir Bala, Cristina Cabanillas, Jan Mendling, Andreas Rogge-Solti, Axel Polleres
Motivation
- Imagine a train crashes because of an engineering error and a lot of people get injured
- You are a national railway system administrator, say ABC
- You might be in trouble!
Mining Project-Oriented Business Processes Motivation 2 / 22
Agenda
- Problem
- Project-Oriented Business Processes
- Approach
- Conclusion
Who is responsible?
- Are you, as ABC, responsible for the accident?!
- Show that your work complies with safety regulations
- E.g. in the railway domain: EN50128, EN50129, EN50126
How to provide evidence of compliance?
- Analyze the work in retrospect
- The company does not use a BPM engine to execute its processes:
  - No process designed a priori
  - Rather a project that is handled ad hoc by engineers
- An expert (auditor) analyzes the existing documentation and manually checks whether everything was done properly
  - Spreadsheets, word-processor documents, diagrams, version control system (VCS) data
Idea: mine project-oriented business processes
- Does the accident have something to do with the software?
Project-Oriented Business Processes
Classic business process     Project-oriented business process
Engine                       No engine
Recursive, cyclic            One time with fixed goals and resources
Many instances               One prototype/product
Process model (e.g. BPMN)    Plan (e.g. Gantt chart)
Activities                   Work packages
Subprocesses                 Sub-work packages
Process mining
State of the art: reduction to process mining
- Mining a process from software repositories (Kindler et al., 2006)
State of the art: visualization I
- Dotted chart (Song & van der Aalst, 2007)
State of the art: visualization II
- Storylines (Ogawa & Ma, 2010)
Mining VCS logs
- Input: VCS logs (e.g. from Git, Subversion, etc.)
- Output: Gantt chart
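As an illustration of the input side, here is a minimal sketch that turns a Git log into per-file events. It assumes the log was exported with `git log --pretty=format:'%H|%an|%aI' --name-status`; the `|` separator, the `Event` record and the `parse_log` function are hypothetical choices made for this example and are not taken from the authors' tool.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    commit: str      # commit hash
    author: str      # resource that performed the change
    time: datetime   # author timestamp of the commit
    change: str      # first letter of the Git status (A, M, D, R, ...)
    path: str        # file touched by the commit


def parse_log(text: str) -> list[Event]:
    """Parse 'hash|author|ISO date' header lines followed by 'status<TAB>path' lines."""
    events = []
    commit = author = when = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines only separate commits
        if "\t" in line:
            if commit is None:
                continue  # file line before any header: ignore
            fields = line.split("\t")
            # For renames Git prints 'R100<TAB>old<TAB>new'; keep the new path.
            events.append(Event(commit, author, when, fields[0][0], fields[-1]))
        else:
            commit, rest = line.split("|", 1)
            author, stamp = rest.rsplit("|", 1)
            when = datetime.fromisoformat(stamp)
    return events


# Hand-written sample in the assumed format (not real project data).
SAMPLE = (
    "c1|Alice|2014-01-10T09:00:00+00:00\n"
    "A\tdocs/spec.tex\n"
    "M\tsrc/main.py\n"
    "\n"
    "c2|Bob|2014-01-11T15:30:00+00:00\n"
    "M\tdocs/spec.tex\n"
)
events = parse_log(SAMPLE)
```

Each commit yields one event per touched file, which is why a log can contain more events than commits.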
Challenges
- Timing (how big is the activity in reality compared with what we see in the log?)
- Aggregation (how can we aggregate events into activities, and how can we see the project from a coarser-grained point of view?)
- Coverage (how efficiently was the time used?)
Assumptions
1. Meaningful tree structure
2. Members perform local changes
3. Systematic commits
Visualization of a project
- Aggregation (data from the SHAPE project)
  - Time span: Jan 2014 – Jan 2015
  - 8 people
  - 156 objects (files and directories)
  - 226 commits, generating 453 events
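One way to picture the aggregation step is to group file-level events by a prefix of their path, so that each directory becomes one bar with a start, an end and an event count. The sketch below relies on the first assumption (a meaningful tree structure); the `aggregate` function, its `depth` parameter and the sample data are hypothetical and are not claimed to be the authors' algorithm.

```python
from datetime import datetime
from pathlib import PurePosixPath


def aggregate(events, depth):
    """Group (path, time) events by the first `depth` path components.

    Returns {prefix: (first_event_time, last_event_time, event_count)}.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    spans = {}
    for path, when in events:
        # Paths shallower than `depth` are kept whole.
        key = "/".join(PurePosixPath(path).parts[:depth])
        start, end, count = spans.get(key, (when, when, 0))
        spans[key] = (min(start, when), max(end, when), count + 1)
    return spans


# Illustrative events, not data from the SHAPE project.
events = [
    ("docs/spec.tex", datetime(2014, 1, 10)),
    ("docs/fig/a.png", datetime(2014, 1, 20)),
    ("src/main.py", datetime(2014, 2, 1)),
]
coarse = aggregate(events, depth=1)  # one bar per top-level directory
fine = aggregate(events, depth=2)    # one bar per second-level object
```

Lowering `depth` gives the coarser-grained view of the project, raising it moves back toward individual files.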
Correction of activity starting times
- Adjustment and coverage
Evaluation on open source projects
Log            Duration  Idle periods  Files     Commits   t̂c       χ
(file name)    (days)    (number)      (number)  (number)  (hours)  (%)
Our work       24        0             89        63        9        100
Whitehall      1279      6             6539      15566     2        95
Petitions      834       17            1562      914       13       59
Study          624       13            7501      736       11       58
The Guardian   1667      59            12889     621       30       44
Book           414       15            154       592       5        32
Papers         1859      55            1791      649       20       30
Requirements   771       22            505       231       17       21
Yelp           206       6             24        54        20       20
Adobe          1076      13            356       237       24       15
- More real-world logs at https://github.com/showcases
Limitations and Future work
Limitations
- Strong assumptions on the structure
- The approach does not take into account the amount of change made to documents
- Checking rules

Future work
- Use statistical methods to improve the quality of the discovered projects
- Discover the type of work/project by using comments written by users
- User assessment of the quality of the discovered Gantt charts
Conclusion
- We help the auditor to analyze the project
  - Different levels of abstraction (aggregation)
  - Time and resource of events
  - Work effort measure (coverage)
- We used project VCS logs
  - Output as Gantt chart
Source code: https://github.com/s41m1r/MiningCVS
References
- Kindler, E., Rubin, V. & Schäfer, W. (2006). Activity Mining for Discovering Software Process Models. Software Engineering 79, 175–180.
- Ogawa, M. & Ma, K.-L. (2010). Software evolution storylines. In Proceedings of the 5th International Symposium on Software Visualization (pp. 35–42).
- Song, M. & van der Aalst, W. M. (2007). Supporting process mining by showing events at a glance. In 7th Annual Workshop on Information Technologies and Systems (pp. 139–145).
- Baier, T., Mendling, J. & Weske, M. (2014). Bridging abstraction layers in process mining. Information Systems, 46, 123–139.
Part I: Appendix
Expected active time between commits
Expected active time between commits t̂c is given as follows.
(1) $\hat{t}_c = \frac{\sum_{a \in A_f} (\omega(a) - \alpha'(a))}{\sum_{a \in A_f} (c(a) - 1)}$

with
- ω(a): End time of activity a
- α'(a): Time of the first event of the activity a
- c(a): Number of commits in activity a
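The ratio can be sketched in a few lines: sum the observed spans of all activities and divide by the total number of gaps between consecutive commits. This sketch assumes that each activity is given as the list of its commit timestamps and that the end time ω(a) coincides with the last of them; the function name and the sample values are hypothetical.

```python
from datetime import datetime, timedelta


def expected_active_time(activities):
    """Return the expected active time between commits as a timedelta.

    `activities` is a list of lists of commit timestamps, one list per activity.
    Returns None when no activity has more than one commit.
    """
    total_span = timedelta(0)
    total_gaps = 0
    for times in activities:
        if not times:
            continue
        total_span += max(times) - min(times)  # omega(a) - alpha'(a)
        total_gaps += len(times) - 1           # c(a) - 1
    if total_gaps == 0:
        return None
    return total_span / total_gaps


# Illustrative timestamps, not taken from the evaluated logs.
t0 = datetime(2014, 1, 10, 9, 0)
activities = [
    [t0, t0 + timedelta(hours=2), t0 + timedelta(hours=6)],  # span 6 h, 2 gaps
    [t0, t0 + timedelta(hours=4)],                           # span 4 h, 1 gap
]
t_hat_c = expected_active_time(activities)
```

In this example the spans add up to 10 hours over 3 gaps, so the estimate is 3 hours 20 minutes.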
Coverage factor
Definition (Coverage)
The coverage χ of work packages by activities is a function χ : W → [0, 1] and is defined as follows.

(2) $\chi(w) = \frac{\sum_{a \in \beta^{-1}(w)} (\omega(a) - \alpha(a))}{\tau(w)}$

where τ(w) is the duration of work package w.
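A small sketch of the coverage computation, assuming the activities mapped to a work package are given as (start, end) pairs that do not overlap, so that the result stays within [0, 1]. The `coverage` function and the dates are hypothetical illustrations.

```python
from datetime import datetime, timedelta


def coverage(activities, wp_start, wp_end):
    """Fraction of the work package duration covered by its activities.

    `activities` is a list of (start, end) datetimes for the activities
    mapped to the work package; they are assumed not to overlap.
    """
    tau = wp_end - wp_start
    if tau <= timedelta(0):
        raise ValueError("work package must have a positive duration")
    active = sum((end - start for start, end in activities), timedelta(0))
    return active / tau


# Illustrative work package of 10 days with 2 + 3 days of activity.
wp_start = datetime(2014, 1, 1)
wp_end = datetime(2014, 1, 11)
activities = [
    (datetime(2014, 1, 1), datetime(2014, 1, 3)),
    (datetime(2014, 1, 6), datetime(2014, 1, 9)),
]
chi = coverage(activities, wp_start, wp_end)
```

Here five of the ten days are covered by activities, giving a coverage of 0.5.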
Average idle time
Let n_c be the number of commits per work package. We compute the average idle time as follows.

(3) $t_{\mathrm{Idle}} = \frac{\tau - n_c \cdot \hat{t}_c}{n}, \quad n > 0$

where n is the number of idle periods in the work package, and τ is the time duration of the work package.
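The formula can be checked with a short worked example. The sketch below uses made-up values (a 30-day work package, 40 commits, 6 hours of expected active time per commit, 4 idle periods); the function name is hypothetical.

```python
from datetime import timedelta


def average_idle_time(tau, n_commits, t_hat_c, n_idle):
    """Average idle time: (tau - n_c * t_hat_c) / n, defined for n > 0."""
    if n_idle <= 0:
        raise ValueError("the number of idle periods must be positive")
    return (tau - n_commits * t_hat_c) / n_idle


# Illustrative values, not taken from the evaluated logs.
tau = timedelta(days=30)       # duration of the work package (720 hours)
t_hat_c = timedelta(hours=6)   # expected active time between commits
t_idle = average_idle_time(tau, n_commits=40, t_hat_c=t_hat_c, n_idle=4)
```

With these numbers the active time is 40 · 6 = 240 hours, leaving 480 idle hours spread over 4 idle periods, that is 120 hours (5 days) per idle period.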