How does context affect the distribution of software maintainability metrics? Feng Zhang, Audris Mockus, Ying Zou, Foutse Khomh, and Ahmed E. Hassan


DESCRIPTION

Software metrics have many uses, e.g., defect prediction, effort estimation, and benchmarking an organization against peers and industry standards. In all these cases, metrics may depend on the context, such as the programming language. Here we aim to investigate if the distributions of commonly used metrics do, in fact, vary with six context factors: application domain, programming language, age, lifespan, the number of changes, and the number of downloads. For this preliminary study we select 320 nontrivial software systems from SourceForge. These software systems are randomly sampled from nine popular application domains of SourceForge. We calculate 39 metrics commonly used to assess software maintainability for each software system and use the Kruskal-Wallis test and the Mann-Whitney U test to determine if there are significant differences among the distributions with respect to each of the six context factors. We use Cliff’s delta to measure the magnitude of the differences and find that all six context factors affect the distribution of 20 metrics and the programming language factor affects 35 metrics. We also briefly discuss how each context factor may affect the distribution of metric values. We expect our results to help software benchmarking and other software engineering methods that rely on these commonly used metrics to be tailored to a particular context.


Page 1: How does Context Affect the Distribution of Software Maintainability Metrics?

How does context affect the distribution of software maintainability metrics?

Feng Zhang, Audris Mockus, Ying Zou, Foutse Khomh, and Ahmed E. Hassan

Page 2

Software Metrics

Numerous Software

Page 3

Various Uses of Software Metrics

Page 4

Contexts!

Motivation

Page 5

In Software Engineering Area?

Page 6

What are the Contexts of Software?

Age (AG)

Number of Changes (NC)

Life Span (LS)

Number of Downloads (ND)

Application Domain (AD)

Programming Language (PL)

Page 7

39 Software Maintainability Metrics

Complexity (14 metrics)

Abstraction (5 metrics)

Coupling (8 metrics)

Cohesion (4 metrics)

Encapsulation (4 metrics)

Documentation (4 metrics)

Page 8

Data Collection

56,833

824

Page 9

Data Cleaning

824 -> 618 -> 506 -> 478 -> 390 -> 320

Page 10

Data Description

320 Software Systems

Application Domain        Systems
Build Tools               31
Code Generators           26
Communications            23
Framework                 29
Games / Entertainment     49
Internet                  19
Network                   16
Software Development      41
System Administrator      29
Build & CodeGen           14
Comm & Internet           13
Comm & Network            7
Games & Internet          7
Internet & SW Dev         9
SW Dev & Sys Admin        7

Programming Language      Systems
C                         57
C++                       85
C#                        18
Java                      146
Pascal                    14

Page 11

Research Questions

Page 12

Separately

RQ1. Analysis Methods

Page 13

RQ1. Analysis Methods (cont’)

For example

Groups: C, Java, Pascal, C++, C#

Metric 1: values for each of the five groups, compared with the Kruskal-Wallis test

...

Metric n: values for each of the five groups, compared with the Kruskal-Wallis test
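The per-metric comparison on this slide can be sketched as follows. This is a minimal illustration, assuming SciPy is available; the metric values and the 0.05 significance level are made up for the example and do not come from the study.

```python
from scipy.stats import kruskal

# Hypothetical values of one metric, grouped by programming language.
# These numbers are illustrative only, not data from the study.
metric_by_language = {
    "C": [12.0, 15.5, 9.8, 20.1, 14.2],
    "C++": [8.1, 7.4, 10.2, 6.9, 9.5],
    "C#": [5.2, 6.1, 4.8, 7.0, 5.5],
    "Java": [4.9, 5.8, 6.3, 5.1, 4.4],
    "Pascal": [18.3, 22.7, 19.9, 25.4, 21.0],
}

# One Kruskal-Wallis test per metric, across all groups of one context factor.
statistic, p_value = kruskal(*metric_by_language.values())

alpha = 0.05  # assumed significance level
distribution_differs = p_value < alpha
print(f"H = {statistic:.2f}, p = {p_value:.4f}, differs = {distribution_differs}")
```

Repeating the same call for each of the n metrics, and for each context factor separately, gives one test result per metric and context.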

Page 14

Complexity (8/14 metrics)

Abstraction (1/5 metrics)

Coupling (5/8 metrics)

Cohesion (2/4 metrics)

Encapsulation (1/4 metrics)

Documentation (3/4 metrics)

YES!! the Contexts Matter!

51% of metrics are impacted by all six contexts
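The 51% figure follows from the per-category counts on this slide (8 + 1 + 5 + 2 + 1 + 3 = 20 of 39 metrics). A short check of that arithmetic, using only the numbers shown above:

```python
# (metrics impacted by all six contexts, metrics in the category), from the slide.
impacted_by_all_six = {
    "Complexity": (8, 14),
    "Abstraction": (1, 5),
    "Coupling": (5, 8),
    "Cohesion": (2, 4),
    "Encapsulation": (1, 4),
    "Documentation": (3, 4),
}

total_impacted = sum(hit for hit, _ in impacted_by_all_six.values())
total_metrics = sum(size for _, size in impacted_by_all_six.values())
share = total_impacted / total_metrics

print(f"{total_impacted}/{total_metrics} = {share:.0%}")  # prints "20/39 = 51%"
```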

Page 15

And among the six contexts …

at least 72% of metrics are impacted by a single context

Page 16

Does it mean ALL six contexts should be considered all the time?

Page 17

Research Question 2

Page 18

RQ2. Analysis Methods

Page 19

RQ2. Analysis Methods (cont’)

Groups: C, Java, Pascal, C++, C#

Metric i: values for each pair of groups, compared with the Mann-Whitney U test
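The pairwise step for a single metric can be sketched like this. It is a minimal illustration assuming SciPy; the metric values are hypothetical, and any correction for multiple comparisons is left out because the slide does not specify one.

```python
from itertools import combinations

from scipy.stats import mannwhitneyu

# Hypothetical values of metric i, grouped by programming language
# (illustrative numbers only, not data from the study).
metric_by_language = {
    "C": [12.0, 15.5, 9.8, 20.1, 14.2],
    "C++": [8.1, 7.4, 10.2, 6.9, 9.5],
    "C#": [5.2, 6.1, 4.8, 7.0, 5.5],
    "Java": [4.9, 5.8, 6.3, 5.1, 4.4],
    "Pascal": [18.3, 22.7, 19.9, 25.4, 21.0],
}

# One two-sided Mann-Whitney U test per pair of groups: 5 groups give 10 pairs.
pairwise_p = {}
for first, second in combinations(metric_by_language, 2):
    _, p_value = mannwhitneyu(
        metric_by_language[first],
        metric_by_language[second],
        alternative="two-sided",
    )
    pairwise_p[(first, second)] = p_value

for (first, second), p_value in pairwise_p.items():
    print(f"{first} vs {second}: p = {p_value:.4f}")
```

Groups whose pairwise tests show no significant difference are candidates for being merged into one benchmarking group.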

Page 20

RQ2. Analysis Methods (cont’)

Cohen’s standard    Small    Medium   Large
Cohen’s d           0.20     0.50     0.80
% of non-overlap    14.7%    33.0%    47.4%
Cliff’s delta       0.147    0.330    0.474
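A minimal sketch of how Cliff's delta and the thresholds above might be applied. The function names are hypothetical, and labelling values below 0.147 as "negligible" is an assumption (a common convention) rather than something stated on the slide.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: (#pairs with x > y minus #pairs with x < y) / (|xs| * |ys|)."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def magnitude(delta):
    """Map |delta| to a label using the slide's thresholds (0.147, 0.330, 0.474)."""
    size = abs(delta)
    if size < 0.147:
        return "negligible"  # assumed label for values below the 'small' threshold
    if size < 0.330:
        return "small"
    if size < 0.474:
        return "medium"
    return "large"


# Hypothetical metric values for two groups (illustrative only).
group_a = [1, 2, 3]
group_b = [4, 5, 6]
delta = cliffs_delta(group_a, group_b)
print(delta, magnitude(delta))  # prints "-1.0 large"
```

The sign of delta indicates which group tends to have the larger values; the magnitude label uses only the absolute value.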

Page 21

RQ2. Findings for each Category of Metrics

Page 22

Metric AD PL AG LS NC ND

TLOC - - - - -

TNF - - - - -

TNC - - -

TNM - - -

TNS - - - - - -

CLOC - - - - - -

NOM - - - - - -

NIM - - - - - -

NIV - - - - - -

WMC - - - - - -

NMP - - - - - -

CC - - - - - -

NPATH - - - - - -

MNL - - - - - -

Contexts Impacting ‘Complexity’

AD: Application Domain

PL: Programming Language

NC: Number of Changes

Page 23

Metric AD PL AG LS NC ND

CF - - - - -

CBO - - - -

ICP - - - - - -

MPC - - - - - -

RFC - - - -

NMI - - - - -

FANIN - - - - - -

FANOUT - - - - - -

Contexts Impacting ‘Coupling’

AD: Application Domain

PL: Programming Language

NC: Number of Changes

Page 24

Metric AD PL AG LS NC ND

LCOM ✓ - - - - -

TCC - - - - - -

LCC - - - - - -

ICH - - - - - -

Contexts Impacting ‘Cohesion’

AD: Application Domain

Page 25

Metric AD PL AG LS NC ND

NACI - - - - -

MIF - - - - -

IFANIN ✓ ✓ - - - -

NOC - - - - - -

DIT - - - - -

Contexts Impacting ‘Abstraction’

AD: Application Domain

PL: Programming Language

Page 26

Metric AD PL AG LS NC ND

RPA ✓ - - - - -

RPM - - - - - -

RSA - - - - - -

RSM ✓ - - - - -

Contexts Impacting ‘Encapsulation’

AD: Application Domain

Page 27

Metric AD PL AG LS NC ND

CLC - - - - - -

RCCC ✓ ✓ - - - -

CLM - - - - - -

RCCM - - - - - -

Contexts Impacting ‘Documentation’

AD: Application Domain

PL: Programming Language

Page 28

Summary of RQ2 Findings

Metric Category AD PL AG LS NC ND

Complexity ✓ ✓ - - ✓ -

Coupling ✓ ✓ - - ✓ -

Cohesion ✓ - - - - -

Abstraction ✓ ✓ - - - -

Encapsulation ✓ - - - - -

Documentation ✓ ✓ - - - -

AD: Application Domain

PL: Programming Language

NC: Number of Changes

Page 29

Guidelines for Benchmarking Maintainability Metrics

Metric Category Context Groups

Complexity

AD (2) (Framework); and others

PL (3) (C); (Pascal); and others

NC (3) (low NC); (moderate NC); and (high NC)

Coupling

AD (3) (Communication, Network); (Build Tools, Code Generators); and others

PL (3) (Pascal); (Java); and others

NC (3) (low NC); (moderate NC); and (high NC)

Cohesion

AD (2) (Communication, Network); and others

Abstraction

AD (4) (Communication, Network); (Games); (Build Tools, Code Generators); and others

PL (3) (Java); (C++); and others

Encapsulation

AD (3) (Build Tools); (Communication, Network); and others

Documentation

AD (2) (Build Tools, Code Generators); and others

PL (2) (Java); and others
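One way to apply these guidelines is a small lookup that maps a system's context value to the group it should be benchmarked against. The sketch below is hypothetical: `NAMED_GROUPS` and `benchmark_group` are illustrative names, the groups are transcribed from the slide, and the NC groups are omitted because the slide gives no thresholds for low, moderate, and high.

```python
# Named groups per (metric category, context factor), transcribed from the slide.
# Any value not in a named group falls into "others". NC is omitted (no thresholds given).
NAMED_GROUPS = {
    ("Complexity", "AD"): [{"Framework"}],
    ("Complexity", "PL"): [{"C"}, {"Pascal"}],
    ("Coupling", "AD"): [{"Communication", "Network"}, {"Build Tools", "Code Generators"}],
    ("Coupling", "PL"): [{"Pascal"}, {"Java"}],
    ("Cohesion", "AD"): [{"Communication", "Network"}],
    ("Abstraction", "AD"): [
        {"Communication", "Network"},
        {"Games"},
        {"Build Tools", "Code Generators"},
    ],
    ("Abstraction", "PL"): [{"Java"}, {"C++"}],
    ("Encapsulation", "AD"): [{"Build Tools"}, {"Communication", "Network"}],
    ("Documentation", "AD"): [{"Build Tools", "Code Generators"}],
    ("Documentation", "PL"): [{"Java"}],
}


def benchmark_group(category, context, value):
    """Return the group label for a context value, or None if the context
    is not listed as a grouping factor for this metric category."""
    groups = NAMED_GROUPS.get((category, context))
    if groups is None:
        return None
    for group in groups:
        if value in group:
            return " + ".join(sorted(group))
    return "others"


print(benchmark_group("Coupling", "AD", "Network"))   # Communication + Network
print(benchmark_group("Complexity", "PL", "Java"))    # others
print(benchmark_group("Cohesion", "PL", "Java"))      # None
```

Under this sketch, a Java system's complexity metrics would be compared against the "others" language group, while its coupling metrics would be compared against other Java systems.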
