Detecting Bad Smells in Source Code using Change History Information

DESCRIPTION

Code smells are symptoms of poor implementation choices. Previous studies found that these smells make source code more difficult to maintain and may also increase its fault-proneness. Several approaches identify smells using code analysis techniques. However, many code smells are intrinsically characterized by how code elements change over time, so relying solely on structural information may not be sufficient to detect all smells accurately. We propose an approach that detects five code smells, namely Divergent Change, Shotgun Surgery, Parallel Inheritance, Blob, and Feature Envy, by exploiting change history information mined from versioning systems. We applied the approach, named HIST (Historical Information for Smell deTection), to eight software projects written in Java and, wherever possible, compared it with existing state-of-the-art smell detectors based on source code analysis. The results indicate that HIST’s precision ranges between 61% and 80%, and its recall between 61% and 100%. More importantly, the results confirm that HIST can identify code smells that cannot be identified by approaches based solely on code analysis.

Page 1: Detecting Bad Smells in Source Code using Change History Information

Fabio Palomba

DETECTING BAD SMELLS IN SOURCE CODE USING CHANGE HISTORY INFORMATION

University of Salerno
Fisciano, 27/09/2013

Candidate:

Fabio Palomba
@fabiopalomba3

ADVISORS:

Andrea De Lucia Gabriele Bavota

Saturday, 28 September 2013

Page 2: Detecting Bad Smells in Source Code using Change History Information

TODAY, I’LL SPEAK ABOUT...

Bad Code: causes and effects
Software Evolution and Code Quality

HIST: Historical Information for Smell deTection
A Method to Detect Bad Smells using Historical Information

Page 3: Detecting Bad Smells in Source Code using Change History Information

SOFTWARE EVOLUTION AND CODE QUALITY

PART I

Page 4: Detecting Bad Smells in Source Code using Change History Information

Cohesion

Coupling

W. Stevens, G. Myers, and L. Constantine. Structured design. IBM Systems Journal, 13(2):115 - 139, 1974.

Page 6: Detecting Bad Smells in Source Code using Change History Information

DEFECTS

Victor R. Basili, Lionel C. Briand, and Walcélio L. Melo. A validation of object-oriented design metrics as quality indicators. IEEE Transactions on Software Engineering, 22(10):751 - 761, 1996.

L. C. Briand, J. Wüst, J. W. Daly, and V. D. Porter. Exploring the relationship between design measures and software quality in object-oriented systems. Journal of Systems and Software (JSS), 51(3):245 - 273, 2000.

Page 7: Detecting Bad Smells in Source Code using Change History Information

COMPREHENSIBILITY

M. Abbes, F. Khomh, Y.-G. Guéhéneuc, and G. Antoniol, “An empirical study of the impact of two antipatterns, blob and spaghetti code, on program comprehension,” in 15th European Conference on Software Maintenance and Reengineering, CSMR 2011.

Page 8: Detecting Bad Smells in Source Code using Change History Information

PRODUCTIVITY

Tibor Gyimóthy, Rudolf Ferenc, and István Siket. Empirical validation of object-oriented metrics on open source software for fault prediction. IEEE Transactions on Software Engineering (TSE), 31(10):897 - 910, 2005.

Page 9: Detecting Bad Smells in Source Code using Change History Information

MAINTENANCE COSTS ARE 2 - 100 TIMES GREATER THAN DEVELOPMENT COSTS

R. D. Banker, S. M. Datar, C. F. Kemerer, D. Zweig. Software complexity and maintenance costs. Communications of the ACM, v.36 n.11, pages 81 - 94, 1993.

Page 10: Detecting Bad Smells in Source Code using Change History Information

HIST: HISTORICAL INFORMATION FOR SMELL DETECTION

PART II

Page 11: Detecting Bad Smells in Source Code using Change History Information

An AntiPattern is a literary form that describes a commonly occurring solution to a problem that generates decidedly negative consequences.

William H. Brown, Raphael C. Malveau, Hays W. McCormick, Thomas J. Mowbray - AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis

22 14

BAD SMELLS: DEFINITION

DEVELOPMENT ANTIPATTERNS

Page 13: Detecting Bad Smells in Source Code using Change History Information

DETECTION VIA STRUCTURAL ANALYSIS

“Code smells are structural characteristics of software that may indicate a code or design problem.”

Many bad smells are intrinsically characterized by how code elements change over time, rather than by structural properties!

F. Fontana et al. - “Automatic detection of bad smells in code: An experimental assessment”, Journal of Object Technology

Page 14: Detecting Bad Smells in Source Code using Change History Information

Blob

Feature Envy

Divergent Change

Shotgun Surgery

Parallel Inheritance Hierarchies

Page 15: Detecting Bad Smells in Source Code using Change History Information

HIST PROCESS

Page 17: Detecting Bad Smells in Source Code using Change History Information

EXTRACTING CHANGE HISTORY

Changes at method level are captured using a code analyzer developed in the Markos European Project*

* http://markosproject.berlios.de

Page 20: Detecting Bad Smells in Source Code using Change History Information

DETECTION ALGORITHMS

Divergent change occurs when one class is commonly changed in different ways for different reasons.

Classes having at least two sets of methods changing together but independently from methods in the other sets

F. Palomba, G. Bavota, M. Di Penta, R. Oliveto, A. De Lucia and D. Poshyvanyk, “Detecting Bad Smells in Source Code Using Change History Information”, in the 28th IEEE/ACM International Conference on Automated Software Engineering (ASE’13), 2013
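The Divergent Change rule on this slide can be illustrated with a small sketch. It is not the HIST implementation: it assumes that pairs of co-changing methods are already available (for example from rule mining over the change history), that methods are named 'Class.method', and that "sets of methods changing together but independently" can be approximated by connected components of the co-change graph inside one class. The function names and the sample data are hypothetical.

```python
from collections import defaultdict


def class_of(method):
    """'pkg.Class.method' -> 'pkg.Class' (naming scheme assumed for this sketch)."""
    return method.rsplit(".", 1)[0]


def divergent_change_classes(cochange_pairs):
    """Map each flagged class to its independent groups of co-changing methods."""
    graphs = defaultdict(lambda: defaultdict(set))
    for a, b in cochange_pairs:
        if a != b and class_of(a) == class_of(b):  # only co-changes inside one class
            graphs[class_of(a)][a].add(b)
            graphs[class_of(a)][b].add(a)

    flagged = {}
    for cls, graph in graphs.items():
        seen, groups = set(), []
        for start in list(graph):
            if start in seen:
                continue
            group, stack = set(), [start]
            while stack:  # depth-first walk over one connected component
                node = stack.pop()
                if node not in group:
                    group.add(node)
                    stack.extend(graph[node] - group)
            seen |= group
            groups.append(group)
        if len(groups) >= 2:  # at least two sets changing independently
            flagged[cls] = groups
    return flagged


# Hypothetical co-change pairs, not taken from the study.
pairs = [
    ("Report.load", "Report.parse"),
    ("Report.parse", "Report.validate"),
    ("Report.render", "Report.print"),
    ("Cache.get", "Cache.put"),
]
flagged = divergent_change_classes(pairs)
```

In this sample, Report has two independent groups of methods and is flagged, while Cache has a single group and is not.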

Page 23: Detecting Bad Smells in Source Code using Change History Information

MINING VERSION HISTORIES THROUGH ASSOCIATION RULE DISCOVERY

[Figure: changes occurring in snapshots S1 to S8 for files A to E]

Thomas Zimmermann, Peter Weißgerber, Stephan Diehl, Andreas Zeller: Mining Version Histories to Guide Software Changes. ICSE 2004: 563-572

Annie T. T. Ying, Gail C. Murphy, Raymond T. Ng, Mark Chu-Carroll: Predicting Source Code Changes by Mining Change History. IEEE Trans. Software Eng. 30(9): 574-586 (2004)
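As a rough illustration of association rule discovery over a version history, the sketch below counts how often two files change in the same snapshot (support) and how often a change to one file comes with a change to the other (confidence). The snapshots are hypothetical and do not reproduce the figure on the slide; the thresholds min_support and min_confidence are assumptions, since the slides do not state them.

```python
from collections import Counter
from itertools import combinations


def mine_cochange_rules(transactions, min_support=2, min_confidence=0.8):
    """Return {(lhs, rhs): (support, confidence)} for single-item rules lhs -> rhs."""
    item_count = Counter()
    pair_count = Counter()
    for changed in transactions:
        items = sorted(set(changed))
        item_count.update(items)
        pair_count.update(combinations(items, 2))

    rules = {}
    for (a, b), support in pair_count.items():
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # confidence: share of the changes to lhs that also touched rhs
            confidence = support / item_count[lhs]
            if confidence >= min_confidence:
                rules[(lhs, rhs)] = (support, confidence)
    return rules


# Hypothetical snapshots: each set holds the files changed together.
snapshots = [
    {"A", "C"},
    {"B", "D"},
    {"A", "C"},
    {"B", "D", "E"},
    {"A", "C", "D"},
]
rules = mine_cochange_rules(snapshots)
```

With this data, A and C imply each other, and B implies D, but D does not imply B because D also changes without B.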

Page 25: Detecting Bad Smells in Source Code using Change History Information

DETECTION ALGORITHMS

You have a Shotgun Surgery when every time you make a kind of change, you have to make a lot of little changes to a lot of different classes.

Classes containing at least one method changing together with methods contained in more than δ (δ = 3) different classes

F. Palomba, G. Bavota, M. Di Penta, R. Oliveto, A. De Lucia and D. Poshyvanyk, “Detecting Bad Smells in Source Code Using Change History Information”, in the 28th IEEE/ACM International Conference on Automated Software Engineering (ASE’13), 2013
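A minimal sketch of the Shotgun Surgery rule, under stated assumptions: co-changing method pairs are given as input, methods are named 'Class.method', and co-change is treated as symmetric. Only the threshold δ = 3 comes from the slide; the function name and the sample pairs are hypothetical.

```python
from collections import defaultdict

DELTA = 3  # threshold δ reported on the slide


def class_of(method):
    """'pkg.Class.method' -> 'pkg.Class' (naming scheme assumed for this sketch)."""
    return method.rsplit(".", 1)[0]


def shotgun_surgery_classes(cochange_pairs, delta=DELTA):
    """Classes with a method that co-changes with methods of more than delta other classes."""
    other_classes = defaultdict(set)  # method -> other classes it co-changes with
    for a, b in cochange_pairs:
        if class_of(a) != class_of(b):
            other_classes[a].add(class_of(b))
            other_classes[b].add(class_of(a))
    return {class_of(m) for m, classes in other_classes.items() if len(classes) > delta}


# Hypothetical co-change pairs, not taken from the study.
pairs = [
    ("Config.setLocale", "Ui.refresh"),
    ("Config.setLocale", "Db.reload"),
    ("Config.setLocale", "Log.reset"),
    ("Config.setLocale", "Mail.format"),
    ("Ui.refresh", "Ui.paint"),
]
smelly = shotgun_surgery_classes(pairs)
```

Here Config.setLocale changes together with methods of four other classes, which exceeds δ, so Config is reported.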

Page 26: Detecting Bad Smells in Source Code using Change History Information

DETECTION ALGORITHMS

You have Parallel Inheritance Hierarchies when every time you make a subclass of one class, you also have to make a subclass of another.

Pairs of classes for which the addition of a subclass implies the addition of a subclass for the other class

F. Palomba, G. Bavota, M. Di Penta, R. Oliveto, A. De Lucia and D. Poshyvanyk, “Detecting Bad Smells in Source Code Using Change History Information”, in the 28th IEEE/ACM International Conference on Automated Software Engineering (ASE’13), 2013
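One possible reading of the Parallel Inheritance rule is sketched below. It assumes the history has already been reduced to, for each commit, the set of superclasses that gained a new subclass, and it interprets "implies" as a high co-occurrence confidence in at least one direction. The thresholds, the function name and the data are hypothetical and not taken from the slides.

```python
from collections import Counter
from itertools import combinations


def parallel_hierarchy_pairs(subclass_additions, min_support=2, min_confidence=0.8):
    """subclass_additions: one set per commit with the superclasses that gained a subclass."""
    single, joint = Counter(), Counter()
    for parents in subclass_additions:
        parents = sorted(set(parents))
        single.update(parents)
        joint.update(combinations(parents, 2))

    pairs = set()
    for (a, b), together in joint.items():
        # best confidence of "a gains a subclass -> b gains one" in either direction
        confidence = max(together / single[a], together / single[b])
        if together >= min_support and confidence >= min_confidence:
            pairs.add((a, b))
    return pairs


# Hypothetical history: superclasses that received a new subclass in each commit.
additions = [
    {"Shape", "ShapeRenderer"},
    {"Shape", "ShapeRenderer"},
    {"Shape", "ShapeRenderer", "Logger"},
    {"Logger"},
]
parallel = parallel_hierarchy_pairs(additions)
```

In this sample every new Shape subclass arrives with a new ShapeRenderer subclass, so that pair is reported, while Logger co-occurs only once and is ignored.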

Page 27: Detecting Bad Smells in Source Code using Change History Information

DETECTION ALGORITHMS

A Blob is a class implementing several responsibilities, having a large number of attributes, operations and dependencies with data classes.

Classes modified (in any way) in more than α% (α = 8) of commits involving at least one other class

F. Palomba, G. Bavota, M. Di Penta, R. Oliveto, A. De Lucia and D. Poshyvanyk, “Detecting Bad Smells in Source Code Using Change History Information”, in the 28th IEEE/ACM International Conference on Automated Software Engineering (ASE’13), 2013
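The Blob rule can be sketched as a simple frequency count. The sketch assumes each commit is available as the set of classes it modified and that the percentage is taken over all commits in the history; the slide does not spell out the denominator, so that choice is an assumption. Only α = 8 comes from the slide.

```python
from collections import Counter

ALPHA = 8.0  # threshold α (percent) reported on the slide


def blob_classes(commits, alpha=ALPHA):
    """commits: list of sets of class names modified together in one commit."""
    if not commits:
        return set()
    hits = Counter()
    for classes in commits:
        if len(classes) >= 2:  # the commit involves at least one other class
            hits.update(set(classes))
    # assumption: the share is computed over all commits in the history
    return {c for c, n in hits.items() if 100.0 * n / len(commits) > alpha}


# Hypothetical history of 20 commits.
commits = [{"Manager", "Dao"}, {"Manager", "View"}] + [{"Util"}] * 18
blobs = blob_classes(commits)
```

Manager appears in 2 of 20 commits together with another class (10%), which is above α, so it is the only class reported.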

Page 28: Detecting Bad Smells in Source Code using Change History Information

DETECTION ALGORITHMS

A Feature Envy occurs when a method is more interested in a class other than the one it is actually in.

Methods involved in commits with methods of another class of the system β% (β = 70) more often than in commits with methods of their own class

F. Palomba, G. Bavota, M. Di Penta, R. Oliveto, A. De Lucia and D. Poshyvanyk, “Detecting Bad Smells in Source Code Using Change History Information”, in the 28th IEEE/ACM International Conference on Automated Software Engineering (ASE’13), 2013
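A hedged sketch of the Feature Envy rule follows. It assumes commits are sets of changed methods named 'Class.method' and reads "β% more" as: the number of commits shared with the envied class must exceed (1 + β/100) times the number of commits shared with the method's own class. That reading, the function names and the data are assumptions; only β = 70 comes from the slide.

```python
from collections import defaultdict

BETA = 70.0  # threshold β (percent) reported on the slide


def class_of(method):
    """'pkg.Class.method' -> 'pkg.Class' (naming scheme assumed for this sketch)."""
    return method.rsplit(".", 1)[0]


def feature_envy_methods(commits, beta=BETA):
    """Map each envious method to the class it changes with most often."""
    with_class = defaultdict(lambda: defaultdict(int))  # method -> class -> commit count
    for methods in commits:
        for m in methods:
            for cls in {class_of(o) for o in methods if o != m}:
                with_class[m][cls] += 1

    envious = {}
    for m, counts in with_class.items():
        own = counts.get(class_of(m), 0)
        others = {c: n for c, n in counts.items() if c != class_of(m)}
        if others:
            target = max(others, key=others.get)
            if others[target] > own * (1 + beta / 100.0):
                envious[m] = target
    return envious


# Hypothetical commits, not taken from the study.
commits = [
    {"Order.total", "Customer.discount"},
    {"Order.total", "Customer.discount", "Customer.tier"},
    {"Order.total", "Customer.tier"},
    {"Order.total", "Order.addLine"},
    {"Customer.discount", "Customer.tier"},
]
envious = feature_envy_methods(commits)
```

Order.total changes with Customer methods in three commits and with Order methods in one, so it is reported as envious of Customer.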

Page 29: Detecting Bad Smells in Source Code using Change History Information

EMPIRICAL EVALUATION

Page 30: Detecting Bad Smells in Source Code using Change History Information

CASE STUDY DESIGN

RQ1: What is the performance of HIST in detecting bad smells?

RQ2: How does HIST compare with the techniques based on structural analysis?

RQ1
Systems: Apache Tomcat, Apache Ant, JEdit, 5 Android APIs
Metrics: Precision, Recall, F-Measure

RQ2
Systems: Apache Tomcat, Apache Ant, JEdit, 5 Android APIs
Metrics: Precision, Recall, F-Measure, Correct_ti ∩ Correct_tj, Correct_ti \ Correct_tj

F. Palomba, G. Bavota, M. Di Penta, R. Oliveto, A. De Lucia and D. Poshyvanyk, “Detecting Bad Smells in Source Code Using Change History Information”, in the 28th IEEE/ACM International Conference on Automated Software Engineering (ASE’13), 2013
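The metrics listed for the two research questions can be computed as in the sketch below. Precision, recall and F-measure follow their standard definitions. For the overlap between two techniques, the slide only names the sets Correct_ti ∩ Correct_tj and Correct_ti \ Correct_tj; expressing them as shares of the union of correct detections is an assumption of this sketch, and all sample data is hypothetical.

```python
def precision_recall_f(detected, actual):
    """Standard precision, recall and F-measure over sets of smell instances."""
    correct = detected & actual
    precision = len(correct) / len(detected) if detected else 0.0
    recall = len(correct) / len(actual) if actual else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def overlap(correct_i, correct_j):
    """Shares of (i ∩ j, i \\ j, j \\ i) over the union of correct detections (assumed normalization)."""
    union = correct_i | correct_j
    if not union:
        return 0.0, 0.0, 0.0
    return (
        len(correct_i & correct_j) / len(union),
        len(correct_i - correct_j) / len(union),
        len(correct_j - correct_i) / len(union),
    )


# Hypothetical oracle and detector output.
actual = {"c1", "c2", "c3", "c4", "c5"}
detected = {"c1", "c2", "c3", "c9"}
p, r, f = precision_recall_f(detected, actual)
shares = overlap({"c1", "c2", "c3"}, {"c3", "c4"})
```

In this sample, three of four detections are correct (precision 0.75) and three of five actual instances are found (recall 0.6).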

Page 31: Detecting Bad Smells in Source Code using Change History Information

RESULTS

Values are F-Measure (Precision - Recall).

Bad smell               HIST               Code analysis technique
Divergent Change        76 % (73 - 79)     10 % (20 - 7)
Shotgun Surgery         89 % (80 - 100)    0 % (0 - 0)
Parallel Inheritance    61 % (61 - 61)     7 % (4 - 45)
Blob                    68 % (76 - 61)     50 % (52 - 49)
Feature Envy            76 % (71 - 81)     63 % (68 - 60)

F. Palomba, G. Bavota, M. Di Penta, R. Oliveto, A. De Lucia and D. Poshyvanyk, “Detecting Bad Smells in Source Code Using Change History Information”, in the 28th IEEE/ACM International Conference on Automated Software Engineering (ASE’13), 2013

Page 32: Detecting Bad Smells in Source Code using Change History Information

48 %

41 %

17 %

39 %

35 %

20 %

40 % 43 % 17 %

RESULTS

F. Palomba, G. Bavota, M. Di Penta, R. Oliveto, A. De Lucia and D. Poshyvanyk, “Detecting Bad Smells in Source Code Using Change History Information”, in the 28th IEEE/ACM International Conference on Automated Software Engineering (ASE’13), 2013

0 % 93 %

100 % 0 %

7 %

0 %

Bad smell    HIST ∩ CA    HIST \ CA    CA \ HIST

Divergent Change

Shotgun Surgery

Parallel Inheritance

Blob

Feature Envy

Page 33: Detecting Bad Smells in Source Code using Change History Information

Conclusions & Future work

Page 34: Detecting Bad Smells in Source Code using Change History Information

CONCLUSION

“This paper seems to bring some fresh air into an area that has not seen breakthroughs for some time.”
[One of the ASE Reviewers]

+ PRECISION
+ F-MEASURE
+ RECALL
+ Historical analysis for smell detection
- Historical information is needed

Page 35: Detecting Bad Smells in Source Code using Change History Information

CAN WE DEFINE A HYBRID APPROACH TO DETECT BAD SMELLS?

Page 36: Detecting Bad Smells in Source Code using Change History Information

CAN WE USE HIST TO DETECT OTHER BAD SMELLS?

Spaghetti Code

Long Method

Complex Class

Refused Bequest

Duplicate Code

Primitive Obsession

Middle Man

Data Clump

Page 37: Detecting Bad Smells in Source Code using Change History Information

Fabio Palomba
fabiopalomba13

http://www.linkedin.com/pub/fabio-palomba/4a/542/60

University of Salerno

DETECTING BAD SMELLS IN SOURCE CODE USING CHANGE HISTORY INFORMATION

Thank you!
Questions and/or comments