Cohesion and Coupling Optimisation: Balancing Improvement and Disruption
Matheus Paixao, Mark Harman, Yuanyuan Zhang, Yijun Yu
What I have (not) done
- The ultimate solution for software modularisation
- Full understanding of developers' behaviour
- Identification of the best metrics for software modularisation
matheus.paixao.14@ucl.ac.uk
What I have done
- Thorough empirical study of structural cohesion and coupling optimisation
- Identification of disruption as an important overlooked issue
- Multi-objective approach to balance structural improvement and disruption
Structural Dependencies
[Figure: example system with packages p1, p2 and p3 containing classes c1 to c9, connected by four kinds of structural dependency: function call, data access, inheritance and interface implementation.]
Modularisation drivers
- Semantic and structural cohesion/coupling metrics better describe developers' implementations [1][2]
[1] Bavota, G., Dit, B., Oliveto, R., Di Penta, M., Poshyvanyk, D., & De Lucia, A. (2013). An empirical study on the developers' perception of software coupling. In 2013 35th International Conference on Software Engineering (ICSE) (pp. 692-701). San Francisco: IEEE.
[2] Candela, I., Bavota, G., Russo, B., & Oliveto, R. (2016). Using Cohesion and Coupling for Software Remodularization: Is It Enough? ACM Transactions on Software Engineering and Methodology, 25(3), 1-28.
Largest empirical study of automated software re-modularisation to date
- Literature Review
- Metrics Validation
- Disruption Analysis
- Longitudinal Evaluation
Software Systems Under Study
Modularisation Quality - MQ
[Figure: the example system (packages p1 to p3, classes c1 to c9) annotated with a score per package.]
MF(p1) = 0.72, MF(p2) = 0.66, MF(p3) = 0.00
MQ = 1.38
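A minimal sketch of how an MQ value like the one above could be computed. It assumes the widely used Bunch-style definition, where each package gets a modularisation factor MF = i / (i + j/2) (i = intra-package dependencies, j = dependencies crossing the package boundary, MF = 0 when i = 0) and MQ is the sum of the MFs. The slide does not state the formula, and the toy graph below is hypothetical, not the system drawn in the figure.

```python
from collections import defaultdict


def modularisation_quality(edges, package_of):
    """Return (MQ, {package: MF}) for an unweighted dependency graph.

    edges:      iterable of (source_class, target_class) dependencies
    package_of: dict mapping each class to its package
    Assumed definition: MF = i / (i + j / 2), MQ = sum of all MFs.
    """
    intra = defaultdict(int)  # dependencies with both ends in the package
    inter = defaultdict(int)  # dependencies crossing the package boundary
    for src, dst in edges:
        p_src, p_dst = package_of[src], package_of[dst]
        if p_src == p_dst:
            intra[p_src] += 1
        else:
            inter[p_src] += 1
            inter[p_dst] += 1

    mf = {}
    for pkg in set(package_of.values()):
        i, j = intra[pkg], inter[pkg]
        mf[pkg] = 0.0 if i == 0 else i / (i + j / 2)
    return sum(mf.values()), mf


# Hypothetical toy system: five classes in three packages.
toy_packages = {"a": "p1", "b": "p1", "c": "p2", "d": "p2", "e": "p3"}
toy_edges = [("a", "b"), ("c", "d"), ("b", "c"), ("d", "e")]
toy_mq, toy_mf = modularisation_quality(toy_edges, toy_packages)
```

A package with no internal dependencies, such as p3 in the toy system, contributes 0 to MQ regardless of how many dependencies cross its boundary.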
RQ1: Is there any evidence that open source software systems respect structural measurements of cohesion and coupling?
- RQ1.1: Purely Random Distribution
- RQ1.2: k-Random Neighbourhood Search
- RQ1.3: Systematic 1-Neighbourhood Search
[Figure: results for Cohesion and MQ; values shown: 0.1001%, 0.0866%, 99.99%, 99.99%.]
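The systematic 1-neighbourhood search of RQ1.3 can be sketched as follows, under stated assumptions: a neighbour is any modularisation obtained by moving exactly one class to another existing package, and raw cohesion is taken to be the number of dependencies that stay inside a package. Both definitions and all names below are illustrative assumptions, not the study's implementation.

```python
def cohesion(edges, package_of):
    """Assumed raw cohesion: count of dependencies whose ends share a package."""
    return sum(1 for src, dst in edges if package_of[src] == package_of[dst])


def one_neighbourhood(package_of):
    """Yield every assignment that differs from the original by one class move."""
    packages = sorted(set(package_of.values()))
    for cls, current in package_of.items():
        for target in packages:
            if target != current:
                neighbour = dict(package_of)
                neighbour[cls] = target
                yield neighbour


def share_of_worse_neighbours(edges, package_of, metric=cohesion):
    """Fraction of 1-neighbours that score strictly lower than the original."""
    base = metric(edges, package_of)
    scores = [metric(edges, n) for n in one_neighbourhood(package_of)]
    return sum(score < base for score in scores) / len(scores)


# Hypothetical toy system: five classes in three packages.
toy_packages = {"a": "p1", "b": "p1", "c": "p2", "d": "p2", "e": "p3"}
toy_edges = [("a", "b"), ("c", "d"), ("b", "c"), ("d", "e")]
toy_share = share_of_worse_neighbours(toy_edges, toy_packages)
```

A high share suggests the original modularisation sits near a local optimum of the metric. The `metric` parameter accepts any function with the same signature, so the same enumeration could be repeated for MQ.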
RQ2: What is the relationship between raw cohesion and the MQ metric?
Bunch Search Solutions
[Figure: bar chart of MQ difference and cohesion difference for Bunch solutions across all systems, with a callout for Pivot 2.0.2 (package counts 58 and 13). The axis runs from -100% to 400%; one bar is labelled 493.11%.]
RQ2: What is the relationship between raw cohesion and the MQ metric?
Package-constrained Search Solutions
[Figure: bar chart of MQ difference and cohesion difference for package-constrained solutions; the axis runs from 0% to 70%.]
DisMoJo
- Disruption metric based on MoJoFM [3]
[3] Wen, Z., & Tzerpos, V. (2004). An effectiveness measure for software clustering algorithms. In Proceedings of the 12th IEEE International Workshop on Program Comprehension (pp. 194-203).
Disruption Analysis Workflow
[Figure: each release is passed to a re-modularisation approach (Bunch or package-constrained) for 30 executions; DisMoJo compares the improved modularisations with the release to produce disruption values.]
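A rough sketch of the final step of this workflow, turning per-execution results into a mean disruption value. MoJoFM is defined by Wen and Tzerpos as (1 - mno(A, B) / max(mno(∀A, B))) × 100%, where mno is the minimum number of move and join operations needed to turn partition A into partition B. The sketch assumes DisMoJo is the complement of MoJoFM, which is an assumption: the slides only say that the metric is based on MoJoFM. The mno values are hypothetical inputs that would come from an external MoJo implementation.

```python
from statistics import mean


def mojofm(mno, max_mno):
    """MoJoFM in percent; 100 means the two partitions are identical."""
    if max_mno <= 0:
        raise ValueError("max_mno must be positive")
    return (1 - mno / max_mno) * 100


def dismojo(mno, max_mno):
    """Assumed disruption: the complement of MoJoFM, in percent."""
    return 100 - mojofm(mno, max_mno)


def mean_disruption(mno_per_execution, max_mno):
    """Average disruption over repeated executions of a search approach."""
    return mean(dismojo(mno, max_mno) for mno in mno_per_execution)


# Hypothetical mno values for three executions against one release.
example_mean = mean_disruption([2, 4, 6], max_mno=10)
```

In the workflow above the list would hold 30 entries per release, one for each execution.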
RQ3: What is the disruption caused by search based approaches for optimising software modularisation?
        Package-constrained   Bunch
Mean    80.39%                57.82%
Longitudinal Disruption Analysis
[Figure: the example system at Release 1.1 (packages p1 to p3, classes c1 to c9) and at Release 1.2 (packages p1 to p4, classes c1 to c12).]
Longitudinal Disruption Analysis
        Lower Bound   Upper Bound
Mean    4.32%         30.99%
Balancing modularity improvement and disruption
RQ5: What is the modularity improvement provided by the multiobjective search for acceptable disruption levels?
                          Package-free   Package-constrained
Lower Bound   MQ          1.66%          3.36%
              Cohesion    0.13%          0.72%
Upper Bound   MQ          150.38%        59.25%
              Cohesion    2.38%          23.49%
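One way to read RQ5 is as a two-step selection: keep only the non-dominated trade-offs between improvement and disruption, then pick the best improvement whose disruption stays under a chosen bound. The sketch below illustrates that idea with a plain Pareto filter; it is not the study's two-archive GA, and the candidate solutions are hypothetical. The bounds 4.32% and 30.99% are the mean lower and upper bounds reported for the longitudinal analysis.

```python
def pareto_front(solutions):
    """Keep non-dominated (improvement, disruption) pairs.

    Improvement is maximised and disruption is minimised.
    """
    front = []
    for i, (imp_i, dis_i) in enumerate(solutions):
        dominated = any(
            imp_j >= imp_i and dis_j <= dis_i and (imp_j > imp_i or dis_j < dis_i)
            for j, (imp_j, dis_j) in enumerate(solutions)
            if j != i
        )
        if not dominated:
            front.append((imp_i, dis_i))
    return sorted(front, key=lambda s: s[1])


def best_within(front, max_disruption):
    """Highest-improvement solution whose disruption stays within the bound."""
    feasible = [s for s in front if s[1] <= max_disruption]
    return max(feasible, key=lambda s: s[0]) if feasible else None


# Hypothetical (MQ improvement %, disruption %) pairs from one search run.
candidates = [(1.0, 2.0), (5.0, 10.0), (4.0, 12.0), (20.0, 30.0), (60.0, 80.0)]
front = pareto_front(candidates)
at_lower_bound = best_within(front, 4.32)
at_upper_bound = best_within(front, 30.99)
```

The candidate (4.0, 12.0) is dropped because (5.0, 10.0) improves more while disrupting less.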
Conclusion
- Software systems respect modularity measurements
- Search based approaches for re-modularisation cause large disruption
- Multiobjective search can be used to find a clear and constant trade-off between modularity improvement and disruption
- Modularity can be improved within the lower and upper bounds of disruption that developers already introduce between releases
Backup – System Selection Criteria
- At least 10 subsequent official releases
- No general libraries and APIs
- Java systems
Backup - Software Systems Under Study
Backup - Modularisation Metrics
MQ                            Cohesion
avoids god packages           leads to god packages
it's not normalised           it's normalised
value has no meaning          easy to understand
it's an ordinal metric [4]    it's an interval metric [4]
inflation effect
[4] Stevens, S. S. (1946). On the Theory of Scales of Measurement. Science, 103(2684), 677-680.
Backup – RQ1 table of results
Backup – RQ2 table of results
Backup – Classes distribution
RQ2.3: Package-constrained Search Solutions
[Figure: share of classes in the biggest and smallest package. Original implementations: 34.15% and 0.32%. Package-constrained solutions: 19.80% and 1.33%, with a 30.06% cohesion improvement.]
Backup – RQ3 table of results
Backup - RQ3: Best MQ and best cohesion disruption
                 Package-constrained   Bunch
Mean             80.39%                57.82%
Best MQ          79.67%                55.30%
Best Cohesion    78.77%                54.33%
Backup – RQ4 Two-Archive GA parameters
- Population size: number of classes (N)
- Single point crossover (0.8 if N < 100; 1.0 otherwise)
- Swap mutation (0.004 x log2(N))
- Tournament selection (size 2)
- 50N generations
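The parameter rules above all scale with the number of classes N, so they can be derived in one place. The sketch below assumes the crossover and mutation figures are probabilities; the dataclass and function names are hypothetical and not taken from the study's tooling.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GAParameters:
    population_size: int
    crossover_probability: float  # single point crossover
    mutation_probability: float   # swap mutation
    tournament_size: int
    generations: int


def ga_parameters(n_classes):
    """Derive the GA settings listed above from the number of classes N."""
    if n_classes < 2:
        raise ValueError("need at least two classes to modularise")
    return GAParameters(
        population_size=n_classes,
        crossover_probability=0.8 if n_classes < 100 else 1.0,
        mutation_probability=0.004 * math.log2(n_classes),
        tournament_size=2,
        generations=50 * n_classes,
    )


# Hypothetical systems with 64 and 128 classes.
small = ga_parameters(64)
large = ga_parameters(128)
```

For a system with 64 classes this gives a population of 64, 3,200 generations and a mutation probability of 0.004 x 6 = 0.024.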
Backup – RQ5 natural disruption
Backup – RQ5 modularity improvement within acceptable disruption