Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and...

Preview:

Citation preview

Cohesion and Coupling OptimisationBALANCING IMPROVEMENT AND DISRUPTION

MATHEUS PAIXAO, MARK HARMAN, YUANYUAN ZHANG, YIJUN YU

What I have (not) done

The ultimate solution for software modularisation

Full understanding of developers behaviour

Identification of the best metrics for software modularisation

matheus.paixao.14@ucl.ac.uk 2

What I have done

Thorough empirical study of structural cohesion and coupling optimisation

Identification of disruption as an important overlooked issue

Multi-objective approach to balance structural improvement and disruption

matheus.paixao.14@ucl.ac.uk 3

Structural Dependencies

4matheus.paixao.14@ucl.ac.uk

p1

c1 c2 c3

c4 c5

p2

c6 c7

c8

p3

c9

Function Call

Data Access

Inheritance

Interface Implementation

Modularisation drivers

Semantic and Structural cohesion/coupling metrics better describe developers’ implementations [1][2]

matheus.paixao.14@ucl.ac.uk 5

[2] Candela, I., Bavota, G., Russo, B., & Oliveto, R. (2016). Using Cohesion and Coupling for Software Remodularization : Is It Enough ? ACM Transactions on Software Engineering and Methodology, 25(3), 1–28.

[1] Bavota, G., Dit, B., Oliveto, R., Di Penta, M., Poshyvanyk, D., & De Lucia, A. (2013). An empirical study on the developers’ perception of software coupling. In 2013 35th International Conference on Software Engineering (ICSE) (pp. 692–701). San Francisco: IEEE.

Literature Review

6matheus.paixao.14@ucl.ac.uk

Metrics Validation

LongitudinalEvaluation

DisruptionAnalysis

largest empirical study of automated software

re-modularisation to date

Software Systems Under Study

matheus.paixao.14@ucl.ac.uk 7

Modularisation Quality - MQ

matheus.paixao.14@ucl.ac.uk 8

p1

c1 c2 c3

c4 c5

p2

c6 c7

c8

p3

c9

𝑀𝐹(𝑝1) = 0.72

𝑀𝐹(𝑝2) = 0.66

𝑀𝐹(𝑝3) = 0.00

𝑀𝑄 = 1.38

RQ1: Is there any evidence that open source software systems respect structural measurements of cohesion and coupling?

matheus.paixao.14@ucl.ac.uk 9

RQ1.1: Purely Random Distribution RQ1.2: k-Random Neighbourhood Search

RQ1.3: Systematic 1-Neighbourhood Search

Cohesion

99.99%

0.1001%

MQ

99.99%

0.0866%

RQ2: What is the relationship between raw cohesion and the MQ metric?

matheus.paixao.14@ucl.ac.uk 10

-100.00%

-50.00%

0.00%

50.00%

100.00%

150.00%

200.00%

250.00%

300.00%

350.00%

400.00%

Bunch Solutions

MQ Difference Cohesion Difference

Bunch Search Solutions

Pivot 2.0.2

13

Packages

58

All Systems

493.11%

RQ2: What is the relationship between raw cohesion and the MQ metric?

matheus.paixao.14@ucl.ac.uk 11

Package-constrained Search Solutions

0.00%

10.00%

20.00%

30.00%

40.00%

50.00%

60.00%

70.00%

Package-constrained Solutions

MQ Difference Cohesion Difference

matheus.paixao.14@ucl.ac.uk 12

DisMoJo

Disruption metric based on MoJoFM[2]

matheus.paixao.14@ucl.ac.uk 13

[3] Zhihua Wen, & Tzerpos, V. (2004). An effectiveness measure for software clustering algorithms. In Proceedings. 12th IEEE International Workshop on Program Comprehension, 2004. (pp. 194–203).

Disruption Analysis Workflow

matheus.paixao.14@ucl.ac.uk 14

Releases

Re-modularisationapproach

ImprovedModularisations

DisMoJo

DisruptionValues

BunchPackage-constrained

30 executions

RQ3: What is the disruption caused by search based approaches for optimising software modularisation?

matheus.paixao.14@ucl.ac.uk 15

Bunch Package-constrained

Mean

80.39% 57.82%

Release 1.2

p4

Longitudinal Disruption Analysis

matheus.paixao.14@ucl.ac.uk 16

Release 1.1

p1

c1

p2

p3

c2 c3

c4 c5

c9

c6 c7

c8

p1

c1

p2

p3

c2 c3

c4 c5

c9

c6 c7

c8

c10 c11 c12

Longitudinal Disruption Analysis

matheus.paixao.14@ucl.ac.uk 17

Lower Bound

Mean

4.32% 30.99%

Upper Bound

Balancing modularity improvement and disruption

matheus.paixao.14@ucl.ac.uk 18

RQ5: What is the modularity improvement provided by the multiobjective search for acceptable disruption levels?

matheus.paixao.14@ucl.ac.uk 19

Package-free

Lower Bound

1.66% in MQ

0.13% in Cohesion

Upper Bound

150.38% in MQ

2.38% in Cohesion

Package-constrained

Lower Bound

3.36% in MQ

0.72% in Cohesion

Upper Bound

59.25% in MQ

23.49% in Cohesion

Conclusion

Software systems respect modularity measurements

Search based approaches for re-modularisation cause large disruption

Multiobjective search can be used to find clear and constant trade-off between modularity improvement and disruption

Modularity can be improved within lower and upper bounds of acceptable disruption performed by developers

matheus.paixao.14@ucl.ac.uk 20

Backup – System’s Selection Criteria

At least 10 subsequent official releases

No general libraries and APIs

Java systems

matheus.paixao.14@ucl.ac.uk 21

Backup - Software Systems Under Study

matheus.paixao.14@ucl.ac.uk 22

it’s an ordinal metric[1]

value has no meaning

it’s not normalised

Backup - Modularisation Metrics

matheus.paixao.14@ucl.ac.uk 23

avoids god packages

MQ

[4] Stanley Stevens. (1946). On the Theory of Scales of Measurement. American Association for the Advancement of Science, 103(2684), 677–680.

it’s an interval metric[1]

easy to understand

it’s normalised

leads to god packages

Cohesion

inflation effect

Backup – RQ1 table of results

matheus.paixao.14@ucl.ac.uk 24

Backup – RQ2 table of results

matheus.paixao.14@ucl.ac.uk 25

Backup – Classes distribution

matheus.paixao.14@ucl.ac.uk 26

RQ2.3: Package-constrained Search Solutions

30.06% cohesion improvement

34.15%

Biggest package

Smallest package

Original Implementations

0.32%

Biggest package

Smallest package

Package-constrained solutions

1.33%

19.80%

Backup – RQ3 table of results

matheus.paixao.14@ucl.ac.uk 27

Backup - RQ3: Best MQ and best cohesion disruption

matheus.paixao.14@ucl.ac.uk 28

Bunch Package-constrained

Mean

80.39% 57.82%

Best MQ

79.67% 55.30%

Best Cohesion

78.77% 54.33%

Backup – RQ4 Two-Archive GA parameters

Population Size: number of classes (N)

Single point crossover (0.8 if N < 100; 1.0 otherwise)

Swap mutation (0.004 x log2N)

Tournament selection (size 2)

50N Generations

matheus.paixao.14@ucl.ac.uk 29

Backup – RQ5 natural disruption

matheus.paixao.14@ucl.ac.uk 30

Backup – RQ5 modularity improvement within acceptable disruption

matheus.paixao.14@ucl.ac.uk 31

Recommended