A UNIFIED FRAMEWORK FOR THE COMPREHENSION OF SOFTWARE’S TIME DIMENSION

By Omar Benomar

Collaborators: Dr. Houari Sahraoui, Dr. Pierre Poulin, Dr. Hani Abdeen and Mohamed Aymen Saied

Work to be presented at ICSE-NIER’15

Overview

• Introduction

• Time Representation in Software Comprehension

• Unification Approach

• Application: Entity Collaboration

• Application: Phase Identification

• Conclusion

Introduction

• Software maintenance
  • Modification after deployment
  • Longest in terms of time
  • Resource consuming
    • 50-80% of total development cost (Coleman et al. 1994)
  • Different types
• Comprehension task
  • 50% of maintenance effort (Corbi et al. 1989)
  • Several software dimensions
  • Mental model of software

Intro Time Repr. Unification Entity Collab. Phase Identif. Conclusion

Introduction

• Comprehension even more central in software development
• Software development
  • Teams in different locations
  • Long period of time: team restructuring
  • Rarely from scratch
• Software engineering tools and techniques
  • Two research communities
    • Evolution comprehension
    • Execution comprehension

Time Representation

• Software comprehension involving time
  • Study of evolution
  • Analysis of execution
• Time representation
  • Visualization
    • Axis
    • Graphical attribute
    • Animation
  • Automatic approaches
    • Sequence of events
  • Comprehension models

Cited works: Wu et al. 2004; Bohnet et al. 2009; Wettel and Lanza 2008; Isaacs et al. 2014; Langelier et al. 2008; Dugerdil and Alam 2008; Xing and Stroulia 2004; Pirzadeh et al. 2010; Girba and Ducasse 2005; Lienhard et al. 2007

Time Representation

• Considered from two different contexts
  • Studied in separate manners
  • Belong to different research communities
• Different in appearance
• Close examination reveals similarities
  • Unifying perspective not studied before

Research proposition

“Software comprehension problems involving time dimension should be analyzed using a unified framework, which enables easier communication of solutions between evolution comprehension and execution comprehension research”

Approach Overview

Comprehension of Software’s Time Dimension
• Execution Comprehension and Evolution Comprehension: Unification
• Common Comprehension Model, used to express:
  • Entity Collaboration (Software Visualization)
    • Heat Maps for Classes’ roles in Use-Case Scenarios
    • Heat Maps for Developers Contributions Analysis
  • Phase Identification (Search-Based Optimization)
    • Genetic Algorithm for Execution Phases Detection
    • Genetic Algorithm for Evolution Phases Identification

Unified Comprehension Model

Application: Entity Collaboration

Comprehension of Software’s Time Dimension
• Execution Comprehension and Evolution Comprehension: Unification
• Common Comprehension Framework, used to express:
  • Entity Collaboration (Software Visualization)
    • Heat Maps for Analysis of Classes’ roles in Use-Case Scenarios
    • Heat Maps for Developers Contributions

Entity Collaboration

Evolution
• Software history
  • Developers’ contributions to classes
• Commits
  • Source code changes
• Visual analysis
  • Changes by each developer
  • Combination of multiple developers’ contributions

Execution
• Execution trace of use cases
  • Classes’ roles in use cases
• Method executions
  • Class activity
• Visual analysis
  • Degree of class activity in use case
  • Combination of classes’ activities

Problem representation
• Sequence
• Event
• Entity (subject or object)
• Property
• Entity contribution
• EC Aggregation
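
The problem representation above can be illustrated with a small data model. This is a minimal sketch, not the framework's actual implementation: the names `Event`, `entity_contribution` and `aggregate`, the `size` property, the choice of a plain sum for aggregation, and the numbers in the sample history are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Event:
    """One event of a sequence: a commit (evolution) or a method execution (execution)."""
    subject: str  # entity that acts: a developer, or a use-case scenario
    obj: str      # entity acted upon: the class that is changed or executed
    properties: dict = field(default_factory=dict)


def entity_contribution(sequence, subject, weight="size"):
    """Sum a numeric property over the events of one subject, grouped by object entity."""
    contribution = defaultdict(float)
    for event in sequence:
        if event.subject == subject:
            contribution[event.obj] += event.properties.get(weight, 1.0)
    return dict(contribution)


def aggregate(contributions):
    """EC aggregation, assumed here to be a plain sum of entity contributions."""
    total = defaultdict(float)
    for contribution in contributions:
        for entity, value in contribution.items():
            total[entity] += value
    return dict(total)


# Made-up history: each event is a commit by a developer on a class.
history = [
    Event("Bob", "ZoomDrawingView", {"size": 3}),
    Event("Jane", "JavaDrawApplet", {"size": 2}),
    Event("Bob", "JavaDrawApp", {"size": 5}),
    Event("Jane", "JavaDrawApp", {"size": 1}),
]
bob = entity_contribution(history, "Bob")
jane = entity_contribution(history, "Jane")
combined = aggregate([bob, jane])
```

The same two functions would apply unchanged to an execution trace if the subject were a use-case scenario and each event a method execution on a class.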

Heat Map in VERSO

• VERSO
  • 3D graphical representation: classes, packages
• Heat map
  • 2D graphical elements
  • Heat distribution metaphor
  • Mapping entity contribution to colors
  • Heat map color ramp must not interfere with existing colors
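
Mapping an entity contribution to a color could look like the sketch below. The yellow-to-red ramp, the linear interpolation and the function name `heat_color` are assumptions; the slides do not specify the ramp VERSO actually uses.

```python
def heat_color(contribution, max_contribution):
    """Map a contribution to an RGB triple on a yellow-to-red ramp.

    Returns None when the entity has no contribution, so the caller can
    leave the element without any heat map color.
    """
    if contribution <= 0 or max_contribution <= 0:
        return None
    t = min(contribution / max_contribution, 1.0)
    # Linear interpolation between yellow (255, 255, 0) and red (255, 0, 0).
    return (255, round(255 * (1.0 - t)), 0)
```

Returning `None` for an entity without any event matches the idea that such an element receives no heat map color at all.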

Heat Map Example

• 1 package

• 3 classes: A, B, C

• Class A not involved in any event (no heat map color)

• Class B contribution larger than that of class C

Heat Map Visualization

• Element placement
  • Concrete representations
    • Elements have fixed natural positions
  • Software is intangible
    • Elements do not have predefined positions
• Element placement to ease the visual analysis
  • Similar heat map colors closer

Element Placement

• Treemap Layout
  • Provides 2 degrees of freedom: sibling classes and sibling packages
• Layout optimization by Simulated Annealing
  • Solution = Element placement
  • Neighbour solution = Sibling packages & classes swapping (exploit 2 degrees of freedom)
  • Fitness function = Manhattan distances between interesting classes
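
The optimization described above can be sketched as follows. This is a simplified illustration, not the VERSO implementation: instead of a real treemap, each package is a column and each class a row inside it, and the cooling schedule (`t0`, `alpha`, `steps`) and all function names are assumptions.

```python
import math
import random


def positions(layout):
    """Grid position of each class: column = package order, row = order in package."""
    return {cls: (col, row)
            for col, (_, classes) in enumerate(layout)
            for row, cls in enumerate(classes)}


def cost(layout, hot):
    """Sum of pairwise Manhattan distances between the interesting (hot) classes."""
    pos = positions(layout)
    pts = [pos[c] for c in hot]
    return sum(abs(a[0] - b[0]) + abs(a[1] - b[1])
               for i, a in enumerate(pts) for b in pts[i + 1:])


def neighbour(layout, rng):
    """Swap two sibling packages, or two sibling classes inside one package."""
    new = [(name, list(classes)) for name, classes in layout]
    swappable = [i for i, (_, classes) in enumerate(new) if len(classes) > 1]
    if len(new) > 1 and (not swappable or rng.random() < 0.5):
        i, j = rng.sample(range(len(new)), 2)
        new[i], new[j] = new[j], new[i]
    elif swappable:
        classes = new[rng.choice(swappable)][1]
        i, j = rng.sample(range(len(classes)), 2)
        classes[i], classes[j] = classes[j], classes[i]
    return new


def anneal(layout, hot, steps=2000, t0=5.0, alpha=0.995, seed=0):
    """Simulated annealing: accept worse layouts with a probability that cools down."""
    rng = random.Random(seed)
    current, best = layout, layout
    temperature = t0
    for _ in range(steps):
        candidate = neighbour(current, rng)
        delta = cost(candidate, hot) - cost(current, hot)
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
            if cost(current, hot) < cost(best, hot):
                best = current
        temperature *= alpha
    return best


layout = [("p1", ["A", "B", "C"]), ("p2", ["D", "E"]), ("p3", ["F", "G", "H"])]
hot = ["C", "F", "H"]
best = anneal(layout, hot)
```

Because only siblings are swapped, every class stays in its own package; the search only reorders elements within the hierarchy.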

Element Placement Optimization

Initial solution (random) Optimized solution

Visual Analysis

• Heat map
  • Entity contribution representation or an aggregation
• Interactive visualization features
  • 3D camera: zooming, scene navigation, synchronization
  • Visual cluttering options: 3D box properties, invisible boxes
  • Color scale manager: color filtering, color re-mapping, histogram
• Complex analysis tasks with heat maps comparison
  • Color weaving
  • Flipping
  • Multiple windows

Filtering Options

Filtering Re-mapping

Color Weaving

Flipping

Multiple Windows

Case Studies

• Case study 1: JHotDraw GUI framework
  • Evolution visualization
  • 4 versions: 5.2, 5.3, 5.4.1 and 6.0.1
  • 171 to 498 classes, 14 to 36 packages
  • 8 developers
• Case study 2: Pooka email client
  • Execution visualization
  • 301 classes, 32 packages
  • 37 execution traces
  • 3 use cases: Read email, Inbox actions, Search mail

Visual Analysis Example (Evolution)

• Developers’ questions (Fritz and Murphy 2010):
  A. Who is working on what?
  B. Who has made changes to my classes?
  C. Which classes have been changed most?
• Generate developers’ contribution heat maps, one per developer (questions A, B)
• Generate an aggregate heat map for all developers’ contributions (question C)

Visual Analysis Example (Evolution)

• Bob’s contribution to JHotDraw 5.4.1
• Jane’s contribution to JHotDraw 5.4.1
• Heat map comparison to reveal collaborations
• Examples
  • Class ZoomDrawingView in package contrib.zoom (Bob only)
  • Class JavaDrawApplet in package applet (Jane only)
  • Class JavaDrawApp in package samples.javadraw (Bob more than Jane)

Visual Analysis Example (Execution)

• Identification of classes responsible for email attachment management

• Scenario 1: Open email + close email

• Scenario 2: Open email + open attachment + close email

• 31 classes involved in Scenario 2 out of 301 in Pooka

• Narrowed down to 6 classes

• New classes: Attachment, AttachmentPane and AttachmentBundle

• More active: Pooka, PookaManager, MessageInfo

Application: Phase Identification

Comprehension of Software’s Time Dimension
• Execution Comprehension and Evolution Comprehension: Unification
• Common Comprehension Framework, used to express:
  • Phase Identification (Search-Based Optimization)
    • Genetic Algorithm for Execution Phases Detection
    • Genetic Algorithm for Evolution Phases Identification

Phase Identification

Evolution
• Software history
  • Development stages of software
• Commits to classes
  • Changes to source code over time
• Heuristic search
  • Similarity of changes within phase commits
  • Dissimilarity of changes between two successive phases

Execution
• Execution trace
  • Execution parts relating to features
• Method executions
  • Object lifetimes
• Heuristic search
  • Similar objects activity within a phase
  • Different objects from one phase to another

Problem representation
• Sequence
• Event
• Entity (subject or object)
• Property
• Phase

Optimization Problem

• Phases
  • Subsets of consecutive sequence events
• Phase identification
  • Search optimal decomposition of sequence events
• Solution
  • Partition of sequence events into phases
• Near-optimal solution
  • Solution parts relate to meaningful phases

Search Space

• All possible decompositions of sequence of events into phases

• Solutions can have any number of phases

• $C_l^k$ possible solutions:
  • $l$ is the number of potential phase switching positions (in the order of the number of events)
  • $k$ is the actual number of phase transitions

• Genetic algorithm to find the best solution to phase identification problem

Solution Encoding

• Solutions
  • Vector of integers
  • Event index (cut position)
• Cut position
  • Execution context
    • Method return followed by method call
  • Evolution context
    • Last commit of each development day
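
The encoding above can be made concrete with a short sketch. The helper names `candidate_cuts` and `phases` are hypothetical, the sketch assumes commits are sorted chronologically, and it assumes a cut at index `c` ends a phase after event `c`.

```python
from datetime import datetime


def candidate_cuts(commit_times):
    """Evolution context: index of the last commit of each development day."""
    cuts = []
    for i, t in enumerate(commit_times):
        is_last_of_day = (i + 1 == len(commit_times)
                          or commit_times[i + 1].date() != t.date())
        if is_last_of_day:
            cuts.append(i)
    return cuts


def phases(cuts, n_events):
    """Decode a vector of cut positions into (start, end) event-index ranges.

    Cuts are assumed to be smaller than n_events - 1, so no phase is empty.
    """
    bounds = sorted(set(cuts))
    starts = [0] + [c + 1 for c in bounds]
    ends = bounds + [n_events - 1]
    return list(zip(starts, ends))


# Made-up commit timestamps spread over three development days.
times = [
    datetime(2015, 1, 1, 9), datetime(2015, 1, 1, 17),
    datetime(2015, 1, 2, 10),
    datetime(2015, 1, 4, 8), datetime(2015, 1, 4, 12),
]
allowed = candidate_cuts(times)          # indices where a phase may end
solution = allowed[:-1]                  # cut after every day except the last
decoded = phases(solution, len(times))
```

A solution with no cut at all decodes to a single phase covering the whole sequence.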

Genetic Algorithm

• Initial population of solutions is created randomly
  • Different number of phases
  • No maximum number of phases
• Each iteration
  • New population
    • Elect fittest solution (Elitism)
    • Select good parents for reproduction (Selection)
      • Roulette-wheel
      • Tournament selection
    • Generate solutions
      • Existing genetic material (Crossover)
      • New genetic material (Mutation)
• Fitness based on heuristics
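
One iteration of such a genetic algorithm might be sketched as below. This is an illustration under assumptions, not the authors' code: the tournament size, the mutation rate, and the convention that `crossover(a, b, rng)` returns a single child and `mutate(child, rng)` a new solution are all hypothetical.

```python
import random


def tournament(population, fitness, rng, size=3):
    """Tournament selection: the fittest of a few randomly drawn solutions."""
    contenders = rng.sample(population, size)
    return max(contenders, key=fitness)


def roulette_wheel(population, fitness, rng):
    """Roulette-wheel selection: pick with probability proportional to fitness.

    Assumes fitness values are non-negative and not all zero.
    """
    weights = [fitness(s) for s in population]
    return rng.choices(population, weights=weights, k=1)[0]


def next_generation(population, fitness, crossover, mutate, rng, mutation_rate=0.1):
    """Build a new population of the same size, keeping the fittest solution."""
    elite = max(population, key=fitness)          # elitism
    offspring = [elite]
    while len(offspring) < len(population):
        a = tournament(population, fitness, rng)  # selection
        b = tournament(population, fitness, rng)
        child = crossover(a, b, rng)              # existing genetic material
        if rng.random() < mutation_rate:
            child = mutate(child, rng)            # new genetic material
        offspring.append(child)
    return offspring
```

Either selection function can be used for picking parents; the sketch uses the tournament variant inside `next_generation` only to keep the example short.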

Heuristics for Phase Identification

Evolution
• Based on structural changes
• Two successive phases have minimum common committed classes
• Two successive phases have minimum common types of changes
• Classes undergo similar changes within a phase

Execution
• Based on object lifetimes
• Two successive phases have minimum common active objects
• Few objects come from previous phases
• Objects are created at the beginning of a phase and destroyed before its end

Fitness Function

• Heuristics translated to measurable properties

Evolution
• Entity Coupling: Ratio of common committed entities in two phases (ETCp)
• Change-Type Coupling: Ratio of common change types between two phases (CTCp)
• Change-Importance Cohesion: Variance of change importance in a phase (CICh)
• Development-Rate Cohesion: Variance of mean time between phase commits (DRCh)

Execution
• Object_metric: score for objects’ lifetimes within two phases
• Phase_coupling: Ratio of common active objects in two phases
• Thin_cut: Ratio of objects traversing the cut position

Fitness Function (evolution)

• Evaluation of a given solution’s quality, where $sol$ is the solution (history partition) to be evaluated
• Metric values have different ranges and are not normalized
  • Use geometric mean to compare solutions’ qualities
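
A short example may help show why a geometric mean suits metrics with different ranges. The metric vectors below are made-up numbers, and how the four evolution metrics are actually combined and oriented is not recoverable from the slide, so this only illustrates the general property: rescaling one metric changes every solution's geometric mean by the same factor, so the ranking of solutions is preserved.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive metric values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical metric vectors for two candidate solutions.
sol_a = [0.8, 120.0, 3.0]
sol_b = [0.6, 200.0, 2.0]

# Express the second metric in another unit (divide it by 1000).
rescaled_a = [sol_a[0], sol_a[1] / 1000, sol_a[2]]
rescaled_b = [sol_b[0], sol_b[1] / 1000, sol_b[2]]

prefers_a = geometric_mean(sol_a) > geometric_mean(sol_b)
still_prefers_a = geometric_mean(rescaled_a) > geometric_mean(rescaled_b)

# An arithmetic mean is dominated by the metric with the largest range.
arithmetic_prefers_a = sum(sol_a) / 3 > sum(sol_b) / 3
```

In this made-up case the arithmetic mean prefers the second solution only because its large-range metric is bigger, while the geometric mean ranks the solutions the same way whatever unit that metric is expressed in.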

Fitness Function (execution)

• Evaluation of a given solution’s quality, where $sol$ is the solution (trace partition) to be evaluated and $a$, $b$ and $c$ are weights assigned to each component
• All the values are normalized in $[0, 1]$ and the algorithm maximizes the solution’s fitness

Crossover Operator

• Single point crossover

• Randomly pick a cut position in the two parent solutions

• New cut position produces four parts

• Perform crossover between the two parents
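
The crossover steps above could be sketched as follows, assuming solutions are sorted vectors of cut positions. The function name and the way the crossover point is drawn (a random event index up to the largest cut of either parent) are assumptions.

```python
import random


def single_point_crossover(parent_a, parent_b, rng):
    """Split both parents at one random event index and exchange their tails."""
    upper = max(parent_a + parent_b, default=0)
    point = rng.randint(0, upper)
    # The cut point splits each parent in two, which gives four parts.
    head_a = [c for c in parent_a if c <= point]
    tail_a = [c for c in parent_a if c > point]
    head_b = [c for c in parent_b if c <= point]
    tail_b = [c for c in parent_b if c > point]
    return head_a + tail_b, head_b + tail_a


child_1, child_2 = single_point_crossover([2, 5, 9], [3, 4, 8, 12], random.Random(0))
```

Since every head value is at most the crossover point and every tail value is above it, both children stay sorted without any repair step.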

Mutation Operator

• Three different mutations with equal probability

• Add a new cut position (A1)

• Remove a cut position (A2)

• Perturb a cut position (A3)
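
The three mutations could be sketched as below. The `candidates` parameter (the allowed cut positions) and the reading of "perturb" as moving one cut to another free candidate position are assumptions; the slide does not define how far a perturbed cut may move.

```python
import random


def mutate(cuts, candidates, rng):
    """Apply one of three mutations, chosen with equal probability."""
    cuts = sorted(cuts)
    free = [c for c in candidates if c not in cuts]
    move = rng.choice(["add", "remove", "perturb"])
    if move == "add" and free:
        cuts.append(rng.choice(free))        # A1: add a new cut position
    elif move == "remove" and cuts:
        cuts.remove(rng.choice(cuts))        # A2: remove a cut position
    elif move == "perturb" and cuts and free:
        cuts.remove(rng.choice(cuts))        # A3: move one cut elsewhere
        cuts.append(rng.choice(free))
    return sorted(cuts)


mutated = mutate([10, 40], candidates=range(0, 100, 10), rng=random.Random(1))
```

When the chosen mutation cannot be applied (for example, removing a cut from a solution without cuts), the sketch returns the solution unchanged.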

Evaluation

Evolution
• Applications: ArgoUML, JFreeChart, ICEFaces
• Data: 2 to 11 years of development; 882 to 9150 commits; 1088 to 2748 classes; 4 to 15 releases
• Tool: ChangeDistiller (1) for structural diff; Subversion

Execution
• Applications: JHotDraw, Pooka
• Data: 7 execution scenarios; 63162 to 105069 execution events; Ex: Initialization, Open File, Draw Circle, Save File
• Tool: JVMTI (2) API to record method entry/exit

(1) http://www.ifi.uzh.ch/seal/research/tools/changeDistiller.html
(2) http://docs.oracle.com/javase/6/docs/technotes/guides/jvmti

Evaluation

Evolution
• Reference: Official release dates
• Measure
  • Distance: days between computed and reference evolution events
  • Stability: similarity between computed solutions
  • Recall of discovered release events

Execution
• Reference: Tags representing end of each feature in the trace
• Measure
  • Precision and recall of phases
  • Precision and recall of events

$precision_{event}(DE, AE) = \frac{|DE \cap AE|}{|DE|}$, $recall_{event}(DE, AE) = \frac{|DE \cap AE|}{|AE|}$

$precision_{phase} = \frac{|Detected \cap Actual|}{|Detected|}$, $recall_{phase} = \frac{|Detected \cap Actual|}{|Actual|}$
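
The precision and recall measures above can be computed with plain set operations. This is a minimal sketch: the function name is hypothetical, the example sets are made up, and how a detected phase is matched to an actual phase is not specified in the slide, so the sets are simply assumed to contain comparable identifiers.

```python
def precision_recall(detected, actual):
    """Precision and recall of a detected set against the actual (reference) set."""
    detected, actual = set(detected), set(actual)
    common = detected & actual
    precision = len(common) / len(detected) if detected else 0.0
    recall = len(common) / len(actual) if actual else 0.0
    return precision, recall


# Made-up event identifiers for one phase.
detected_events = {1, 2, 3, 4}
actual_events = {2, 3, 4, 5, 6}
precision, recall = precision_recall(detected_events, actual_events)
```

Here three of the four detected events are correct and three of the five actual events are found.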

Results (Evolution)

• Different runs of the algorithm generate similar solutions
• Most software releases discovered (45/49)
• Software release events identified with good accuracy (days)

Results (Execution)

• Detected most phases with high precision
• Initialization phase difficult to detect, better results when omitted

Conclusion

• Propose a unified view of software comprehension problems involving time dimension
• Define a common comprehension framework
• Express entity collaboration comprehension with our model in both contexts
  • Evolution: Developers’ contributions to software evolution
  • Execution: Classes’ roles in executions of use-case scenarios
• Use software visualization with the same metaphor for analysis and comprehension
• Express phase identification problem with our model in both contexts
  • Evolution: Phases in terms of development activities
  • Execution: Phases relating to high-level software features
• Use genetic algorithm to detect both evolution and execution phases

Future Work

• Specific research directions
  • User study to evaluate heat map visualization
  • Object profiles in execution traces
  • Evolution phases patterns across different software releases
• Future perspectives
  • Use of unified framework to express and analyze
    • Debugging and profiling in evolution
    • ‘Co-change’ in execution

Publications

• O. Benomar, H. Sahraoui, and P. Poulin. A unified framework for the comprehension of software’s time dimension. In International Conference on Software Engineering, (to appear) ICSE-NIER, 2015.
• O. Benomar, H. Sahraoui, and P. Poulin. Visualizing software dynamicities with heat maps. In Working Conference on Software Visualization, VISSOFT, 2013.
• O. Benomar, H. Sahraoui, and P. Poulin. Detecting program execution phases using heuristic search. In International Symposium on Search-Based Software Engineering, SSBSE, 2014.
• O. Benomar, H. Abdeen, H. Sahraoui, P. Poulin and M. A. Saied. Detection of software evolution phases based on development activities. In International Conference on Program Comprehension, (to appear) ICPC, 2015.

Thank You