28
The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm 1 DIRO, Université de Montréal, Canada 2 CS, Missouri University of Science and Technology, USA 3 IT Department, ABMMC, Qatar Ali Ouni 1,2 , Marouane Kessentini 2 , Houari Sahraoui 1 , and Mohamed Salah Hamdi 3 Amsterdam, The Netherlands July 06-10, 2013

The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

Embed Size (px)

Citation preview

Page 1: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

The Use of Development History in Software

Refactoring Using a Multi-Objective

Evolutionary Algorithm

1DIRO, Université de Montréal, Canada2CS, Missouri University of Science and Technology, USA

3IT Department, ABMMC, Qatar

Ali Ouni1,2, Marouane Kessentini2, Houari Sahraoui1,

and Mohamed Salah Hamdi3

Amsterdam, The Netherlands

July 06-10, 2013

Page 2: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

2

Motivating exampleEmployee

ID

Name

FamilyName

Natinality

DateOfBirth

Sex

. . .

. . .

getPhoneNumber()

calculateLocalTax()

getAge()

calculateSalary()

setMaritalStatus()

getCurrentNatinality()

. . .

. . .

Car

IdNumber

TowingCapacity

OwnerName

. . .

getHistoryReport()

getTowingCapacity()

setInsuranceNum()

. . .

defect: Blob Position

PositionId

Grade

CompanyName

. . .

getPosition()

setGrade()

. . .

Page 3: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

3

Outline

• Problem statement

• Approach : Multi-Objective Refactoring

• Evaluation

• Conclusion

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

Page 4: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

4

• Design defect introduced during the initial

design or during evolution

– Anomalies, anti-patterns, bad smells…

– Design situations that adversely affect the development of a

software

– Examples: Blob, spagheti code, functional decomposition, ...

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

Design defects

Page 5: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

5

• Refactoring to correct them and to improve code

quality– The process of improving a code after it has been written by

changing the internal structure of the code without changing the external behavior (Fowler et al., ‘99)

– Examples: Move method, extract class, move attribute, ...

• Refactoring implementation may produce

semantic errors

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

Refactoring

Page 6: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

6

Problem statement

• Automate the refactoring task

• Existing approaches– See the refactoring as a single-objective problem

– Improve the internal structure

– The semantics is not a major concern

– Produce semantic errors / incoherencies

– The consistency with development history is not

considered when searching for new refactorings

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

Page 7: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

7

Problem statement

• Hypothesis1. Code elements which undergo changes in the past, at

approximately the same time, bear a good probability for being

semantically related.

2. Code elements that experienced a huge number of refactoring

in the past have a good chance for refactoring in the future.

3. Recorded refactorings applied in the past can be used to

propose new ones in similar contexts.

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

Page 8: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

8

Approach Overview

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

Source code

with defects Search-based Refactoring

(NSGA-II)

Defects detection

rules

Refactoring

operations

Proposed

refactorings

Software version archive

(change log, recorded refactorings …)

Page 9: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

9

Multi-Objective Refactoring

• See the refactoring task as a multi-objective

optimization problem

– Improve software quality : defects correction

– Maximize the use of development/maintenance

history

– Preserve semantic coherence

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

Meta Heuristic Search Using Multi-Objective

Optimization (NSGA-II)

Page 10: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

10

NSGA-II overview

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

• NSGA-II: Non-dominated Sorting Genetic Algorithm (K. Deb et al., ’02)

Parent

Population

Offspring

Population

Non-dominated

sorting

F1

F2

F3

F4Crowding distance

sorting

Population

in next

generation

Page 11: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

11

NSGA-II adaptation

• Representation of the individuals

• Creation of a population of individuals

• Creation of new individuals using genetic operators

(crossover and mutation)

• Definition of fitness functions

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

Page 12: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

12

– Individual = Refactoring solution (sequence of refactoring

operations)

– Controlling parameters

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

Representation of individuals

MM PUF EC MF IC PDM

Ref Refactorings Controlling parametersMM move method (sourceClass, targetClass, method)

MF move field (sourceClass, targetClass, field)

PUF pull up field (sourceClass, targetClass, field)

PUM pull up method (sourceClass, targetClass, method)

PDF push down field (sourceClass, targetClass, field)

PDM push down method (sourceClass, targetClass, method)

IC inline class (sourceClass, targetClass)

EC extract class (sourceClass, newClass)

Page 13: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

13

– Population: set of refactoring solutions

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

Population creation

MM PUF EC MF IC PDM . . .

MM PUF EC MF IC PDM . . .

MM PUF EC MF IC PDM . . .

MM PUF EC MF IC PDM . . .

MM PUF EC MF IC PDM MF PUM MM

MM PUF EC MF IC PDM MM

MM PUF EC MF IC PDM . . .

MM PUF EC MF IC PDM . . .

Page 14: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

14

Genetic Operators

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

Crossover

MM PUF EC MF IC PDM

MF IC PUM EC MM

K=3

Crossover

MM PUF EC EC MM

MF IC PUM MF IC PDM

Parent 1

Parent 2

Child 1

Child 2

MM PUF EC MF IC PDM

MutationParent Child

MM IC EC MM IC PDM

K=2, j=4

Mutation

Page 15: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

15

Objective functions

1. Quality- Maximize the number of corrected defects (ICPC’11)

2. Semantics- Minimize semantic errors (ICSM’12)

3. Development/maintenance history reuse– Maximize the use of past maintenance and development history to

find refactoring opportunities

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

Page 16: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

16

Quality Objective Function

• For each candidate refactoring solution:

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

oringore_refactefects_befdetected_d#

defectscorrected_#Quality

Page 17: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

17

Semantics Objective Function

• Minimize semantic errors– Vocabulary-based similarity (cosine similarity)

– Dependency-based similarity (shared method call, shared

fields)

• Intuition :– The meaningfulness of proposed refectorings increase when

applied to semantically connected elements.

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

Page 18: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

18

• Average of three measures1. Score that characterizes the co-change of elements that will be

refactored.

2. Number of changes applied in the past to the same code

elements to modify

3. Similarity with good refactorings applied in the past to similar

code fragments

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

History-based Objective Function

Page 19: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

19

Software version archive

1. Score that characterizes the co-change of elements that

will be refactored.

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

Commit 2

C

Commit 1

A

B

F

GCommit 3

D

B

A

Commit 4

B

C

A

Commit 5

F

D

C

E

Changed files

Co-change

matrix (CCM)

Changed files

A B C D E F G

A 3 3 1 1 0 1 1

B 3 3 1 1 0 1 1

C 1 1 3 1 1 1 0D 1 1 1 2 1 1 0

E 0 0 1 1 1 1 0

F 1 1 1 1 1 2 1

G 1 1 0 0 0 1 1

History-based Objective Function

Page 20: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

20

2. Number of changes applied in the past to the same

code elements to modify

where t(e) is the number of times that the code element(s) e was refactored in the past and

n is the size of the list of possible refactoring operations.

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

n

i

i etROmeasurehistory1

)()(2_

History-based Objective Function

Page 21: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

21

3. Similarity with previous refactorings applied to similar

code fragments

where n is the number of possible refactorings to use

m is the number of times that refactoring has been applied in the past

2 if the same refactoring has been applied in the past

w = 1 if a compatible refactoring has been applied in the past

0 otherwise

Context. Problem statement. Multi-Objective Refactoring. Validation. Conclusion.

n

i

swROmeasurehistory1

)(3_

History-based Objective Function

Page 22: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

22

• Two research questions

– RQ1. To what extent the reuse of software development history

can improve the results of refactoring suggestion?

– RQ2. How do the proposed multi-objective approach performs

compared to random search, mono-objective approach and

other existing work ?

Evaluation

Context. Problem statement. Multi-Objective. Refactoring. Validation. Conclusion.

Page 23: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

23

Evaluation

• Data: Two large open source Java projects

Context. Problem statement. Multi-Objective. Refactoring. Validation. Conclusion.

Systems # classes # defects KLOC# revision

commits

Xerces v2.7.0 991 66 240 7493

JFreeChart v1.0.9521 57 170 2009

Page 24: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

24

Evaluation

• Method: Two metrics

– defect correction ratio (DCR)

– refactoring precision (RP)

Context. Problem statement. Multi-Objective. Refactoring. Validation. Conclusion.

gsrefactorin applying before defects #

defects corrected #DCR

gsrefactorin proposed#

gsrefactorin meaningful #RP

Page 25: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

25

Results & Comparison

Context. Problem statement. Multi-Objective. Refactoring. Validation. Conclusion.

Systems Approach Algorithm Correction ratio

(DCR)

Meaningful

refactorings

(RP)

JFreeChart

v1.10.2

Our approach NSGA-II 86% (49|57) 94% (197|210)

Ouni et al. CSMR'13 NSGA-II 82% (47|57) 86% (202|234)

Harman et al. GECCO'07 Pareto opt. N.A 66% (192|289)

Kessentini et al. ICPC'11 GA 89% (51/57) 62% (147|236)

Ouni et al. ICSM'12 NSGA-II 84% (48/57) 77% (157|203)

Xerces-J

v2.7.0

Our approach NSGA-II 80% (53|66) 96% (282|294)

Ouni et al. CSMR'13 NSGA-II 79% (52|66) 93% (219|236)

Harman et al. GECCO'07 Pareto opt. N.A 63% (251|396)

Kessentini et al. ICPC'11 GA 88% (58/66) 69% (212|304)

Ouni et al. ICSM'12 NSGA-II 83% (55/66) 81% (186|228)

Page 26: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

26

Results and comparison

Context. Problem statement. Multi-Objective. Refactoring. Validation. Conclusion.

• Comparison– NSGA-II

– GA (mono-objective)

– Random Search

0

10

20

30

40

50

60

70

80

90

100

JFreeChart Xerces

NSGA-II Mono-objective Random search

DCR

0

10

20

30

40

50

60

70

80

90

100

JFreeChart Xerces

NSGA-II Mono-objective Random search

RP

Defect correction ratio (DCR) Refactoring precision (RP)

Page 27: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

27

Conclusion • A novel search-based approach for refactoring suggestion

– The use of maintenance/development history in software refactoring

• Three objectives to optimize– Quality

– Semantics preservation

– The use of development history

• Evaluation– Two large open-source systems

– Our approach succeeded in correcting the majority of defects while

preserving the semantic coherence of the original program

• Future Work– Use collected refactorings from different systems and calculates a similarity

with not only the refactoring type but also the context (code fragments)

– Test our approach with other other EA and other projects

Context. Problem statement. Multi-Objective. Refactoring. Validation. Conclusion.

Page 28: The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

28

Thanks for your attention