
Data-Driven Analysis of Technical Debt Based on Open-Source Software Projects



  1. Data-Driven Analysis of Technical Debt Based on Open-Source Software Projects. Georgios Digkas (1, 2); (1) University of Groningen, Netherlands; (2) University of Macedonia, Greece. 8/15/2018

  2. Self Introduction

  3. Technical Debt (source: https://twitter.com/carnage4life)

  4. Technical Debt Types; Symptoms of Technical Debt (source: https://conference.eurostarsoftwaretesting.com)

  5. TD Tools (Li, Zengyang et al.)

  6. SonarQube TD Evaluation: Rules … Technical Debt

  7. SonarQube Rules / Issues (source: docs.sonarqube.org)

  8. ECSA 2017: The evolution of Technical Debt in the Apache Ecosystem

  9. › What are the most frequent types of TD?
     › What are the most costly to fix types of TD?
     › How does TD evolve over time?

  10. Focus on Apache Ecosystem

  11. Evolution of TD

  12. Evolution of Normalized TD. Normalized TD = TD / NCLOC
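The definition on slide 12 can be sketched in a few lines. This is a minimal illustration, not the author's tooling: it assumes TD is expressed as remediation effort in minutes (the unit SonarQube reports), and the snapshot values are made up to show how total TD can grow while normalized TD falls.

```python
def normalized_td(td_minutes: float, ncloc: int) -> float:
    """Normalized TD = TD / NCLOC (non-comment lines of code)."""
    if ncloc <= 0:
        raise ValueError("ncloc must be positive")
    return td_minutes / ncloc


# Hypothetical snapshots as (TD in minutes, NCLOC); the numbers are invented.
snapshots = [(12000, 40000), (15000, 60000), (16500, 75000)]

# Total TD rises across snapshots, yet TD per line of code declines.
series = [normalized_td(td, ncloc) for td, ncloc in snapshots]
```

Comparing `series` with the raw TD values shows why the two trends on slides 11 and 12 can point in opposite directions.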

  13. The most frequent types of TD

      | # | Issue | % |
      |---|-------|---|
      | 1 | String literals should not be duplicated | 7.0 |
      | 2 | The members of an interface declaration or class should appear in a pre-defined order | 5.6 |
      | 3 | Exception handlers should preserve the original exceptions | 4.8 |
      | 4 | The diamond operator ("<>") should be used | 4.4 |
      | 5 | Generic exceptions should never be thrown | 4.2 |
      | 6 | Statements should be on separate lines | 3.7 |
      | 7 | Control flow statements "if", "for", "while", "switch" and "try" should not be nested too deeply | 3.5 |
      | 8 | Sections of code should not be "commented out" | 3.2 |
      | 9 | Source files should not have any duplicated blocks | 2.4 |
      | 10 | "@Override" should be used on overriding and implementing methods | 2.4 |

  14. The most costly to fix types of TD

      | # | Issue | % |
      |---|-------|---|
      | 1 | Source files should not have any duplicated blocks | 13.8 |
      | 2 | String literals should not be duplicated | 9.2 |
      | 3 | Generic exceptions should never be thrown | 8.4 |
      | 4 | Cognitive Complexity of methods should not be too high | 5.0 |
      | 5 | Exception handlers should preserve the original exceptions | 4.8 |
      | 6 | Methods should not be too complex | 3.7 |
      | 7 | Control flow statements "if", "for", "while", "switch" and "try" should not be nested too deeply | 3.5 |
      | 8 | The members of an interface declaration or class should appear in a pre-defined order | 2.8 |
      | 9 | Dead stores should be removed | 2.4 |
      | 10 | Standard outputs should not be used directly to log anything | 2.2 |

  15. The most costly to fix types of TD

      | # | Issue | |
      |---|-------|---|
      | 1 | Source files should not have any duplicated blocks | 672 |
      | 2 | String literals should not be duplicated | 446 |
      | 3 | Generic exceptions should never be thrown | 408 |
      | 4 | Cognitive Complexity of methods should not be too high | 246 |
      | 5 | Exception handlers should preserve the original exceptions | 232 |
      | 6 | Methods should not be too complex | 179 |
      | 7 | Control flow statements "if", "for", "while", "switch" and "try" should not be nested too deeply | 170 |
      | 8 | The members of an interface declaration or class should appear in a pre-defined order | 135 |
      | 9 | Dead stores should be removed | 115 |
      | 10 | Standard outputs should not be used directly to log anything | 107 |

  16. Takeaways
      › Technical Debt
      › Normalized Technical Debt
      › Most frequent: low-level coding problems
      › Most expensive types of TD are higher level
        • duplicated code
        • ad-hoc exception handling
      › A minority of problem types is responsible for the majority of estimated TD
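The last takeaway can be checked against the percentages on slide 14. The sketch below assumes that the "%" column is each issue type's share of the total estimated TD; the slides do not state this explicitly, so treat the interpretation as an assumption.

```python
# Shares (%) of the ten costliest issue types, copied from slide 14.
# Assumption: each value is that type's share of total estimated TD.
top10_share = [13.8, 9.2, 8.4, 5.0, 4.8, 3.7, 3.5, 2.8, 2.4, 2.2]


def cumulative(values):
    """Running total of the shares, rounded to one decimal place."""
    total = 0.0
    running = []
    for value in values:
        total += value
        running.append(round(total, 1))
    return running


cumulative_share = cumulative(top10_share)
```

Under that reading, the last element of `cumulative_share` is the combined share of the ten listed types, which comes to a little over half of the estimated TD.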

  17. SANER 2018: How Do Developers Fix Issues and Pay Back Technical Debt in the Apache Ecosystem?

  18. › Is TD paid back?
      › Which TD types are paid back more often?
      › What is the survivability of those issues?

  19. Open and closed issues per project

  20. Fixed issues per issue type

  21. Issues with the highest fixing rate

      | Issue | F |
      |-------|---|
      | Conditionally executed blocks should be reachable | 59 |
      | * Replace Map.get/test with single method call | 58 |
      | * Deprecated elements should have both the annotation and the Javadoc tag | 57 |
      | Unused "private" fields should be removed | 56 |
      | Boolean expressions should not be gratuitous | 55 |
      | * Synchronized classes … should not be used | 53 |
      | * Constructors should not be used to instantiate "String" and primitive-wrapper classes | 52 |
      | Dead stores should be removed | 50 |
      | * @Override should be used on overriding [...] | 48 |
      | Unused "private" methods should be removed | 47 |

  22. Issues whose resolution has yielded the higher benefit

      | # | Issue | CiR |
      |---|-------|-----|
      | 1 | Source files should not have any duplicated blocks | 8 |
      | 2 | Cognitive Complexity of methods should not be too high | 4 |
      | 3 | Generic exceptions should never be thrown | 4 |
      | 4 | String literals should not be duplicated | 1 |
      | 5 | Exception handlers should preserve the original exceptions | 1 |
      | 6 | Control flow statements should not be nested too deeply | 1 |
      | 7 | Synchronized classes … should not be used | 3 |
      | 8 | Methods should not be too complex | 3 |
      | 9 | Standard outputs should not be used directly to log anything | 1 |
      | 10 | Sections of code should not be "commented out" | 8 |

  23. Research Question 5

  24. Takeaways
      › Variation in the fixing rate
      › Variation in the survivability
        • 10% fixed within the first month
        • 50% in the first year
      › Some of the issues can take up to 10 years to be fixed
      › Issues related to duplication and exception handling are frequently encountered and rarely fixed by developers
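Figures such as "10% fixed within the first month" can be derived from issue open and close dates. The following is a simplified sketch, not the study's method: the function name and the sample data are hypothetical, and it counts still-open issues in the denominator, which is an assumption about how the percentages were computed. A full survival analysis would treat open issues as censored observations.

```python
from datetime import date


def fraction_fixed_within(issues, days):
    """Share of all issues that were closed at most `days` days after opening.

    `issues` is a list of (opened, closed) date pairs; closed is None
    for issues that are still open.
    """
    fixed = sum(
        1
        for opened, closed in issues
        if closed is not None and (closed - opened).days <= days
    )
    return fixed / len(issues)


# Hypothetical issue lifetimes, invented for illustration only.
issues = [
    (date(2015, 1, 1), date(2015, 1, 20)),  # fixed after 19 days
    (date(2015, 1, 1), date(2015, 9, 1)),   # fixed after 243 days
    (date(2015, 1, 1), date(2017, 3, 1)),   # fixed after more than two years
    (date(2015, 1, 1), None),               # still open
]

within_month = fraction_fixed_within(issues, 30)
within_year = fraction_fixed_within(issues, 365)
```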

  25. Limitations of my Studies
      › OSS projects by ASF
      › Java
      › SonarQube
      › Weekly analysis of the commits
      › Architectural decisions not known/accessible
      › Commit policy

  26. Work In Progress

  27. New Source Code TD (Observation)
      › The analysis of several open-source projects by ASF revealed that the quality of some projects degrades over time (conforming to the software ageing phenomenon).
      › However, for the majority of Apache projects their normalized Technical Debt (TD/NCLOC) tends to decrease over time.

  28. New Source Code TD (OQ): When normalized TD is decreasing in a project, is it because TD is repaid or because the new code is clean, or both? To what extent is each factor responsible?
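One way to make the two factors in this question concrete is to compare the observed normalized TD with two counterfactuals: repayment without new code, and new code without repayment. This is only an illustrative sketch; the function, its parameters, and the numbers are hypothetical, and it ignores deleted code.

```python
def normalized_td_scenarios(td_old, ncloc_old, td_repaid, td_new, ncloc_new):
    """Normalized TD (TD / NCLOC) under four hypothetical scenarios."""
    return {
        # Starting point: existing code only.
        "baseline": td_old / ncloc_old,
        # Existing TD partly repaid, no code added.
        "only_repayment": (td_old - td_repaid) / ncloc_old,
        # Code added with its own TD, nothing repaid.
        "only_new_code": (td_old + td_new) / (ncloc_old + ncloc_new),
        # Both effects together.
        "actual": (td_old - td_repaid + td_new) / (ncloc_old + ncloc_new),
    }


# Invented example values: TD in minutes, size in NCLOC.
scenarios = normalized_td_scenarios(
    td_old=10000, ncloc_old=50000, td_repaid=1000, td_new=500, ncloc_new=10000
)
```

With these made-up inputs both counterfactuals fall below the baseline, so each factor alone would already lower normalized TD; the gap between each counterfactual and the baseline gives a rough sense of its contribution.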

  29. New Source Code TD (Claim)
      › There might be limited value in trying to get rid of existing TD.
      › One should aim instead at writing clean, TD-free code.
      › If (as Google reports) the existing code base is renewed annually at a rate of, say, 20%, and if the new code is clean, then after 6, 7, or 8 years all code will be essentially TD-free.
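The renewal arithmetic in this claim can be checked with a small sketch. It assumes a model that is not stated in the slides: each year a uniformly random 20% of the whole code base is rewritten, so code that was already renewed can be rewritten again. Under that assumption the share of original code left after n years is (1 - 0.2)^n, which is roughly 26%, 21%, and 17% after 6, 7, and 8 years.

```python
def legacy_fraction(renewal_rate: float, years: int) -> float:
    """Fraction of the original code still present after `years`.

    Assumes a random `renewal_rate` share of ALL code is rewritten each
    year (a modelling assumption, not something the slides specify).
    """
    if not 0.0 <= renewal_rate <= 1.0:
        raise ValueError("renewal_rate must be between 0 and 1")
    return (1.0 - renewal_rate) ** years


# Remaining share of original code for the horizons named in the claim.
remaining = {years: legacy_fraction(0.20, years) for years in (6, 7, 8)}
for years, share in remaining.items():
    print(f"after {years} years: {share:.1%} of the original code remains")
```

If renewal instead always targeted the oldest code, the whole base would be replaced after 1 / 0.2 = 5 years, so how close the code gets to "essentially TD-free" depends on which renewal model holds.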

  30. New Source Code TD (Objective): Analyze newly added source code (per commit) for the purpose of evaluation with respect to the amount of technical debt that is introduced, from the point of view of software developers, in the context of OSS (industrial) development.

  31. New Source Code TD (Potential RQs)
      › RQ1: For TD violations that are removed, in which exact ways has the removal occurred?
        • Was the sloppy code removed or refactored?
        • Did the removal happen intentionally, or was it a side effect?
      › RQ2: For new code that might be 'clean', exactly how clean is it?
      › RQ3: If the new code is not totally clean, what types of TD are newly introduced?
      › RQ4: Is the normalized TD of NEW code higher/lower than the normalized TD of existing code?
      › RQ5: Is the normalized TD of NEW code lower in projects which improve along evolution, compared to those that deteriorate?

  32. SLR on Architectural Smells

  33. SLR (Objective): Analyze the research state of the art on software smells for the purpose of understanding with respect to: (a) their applicability on the architecture level, (b) the research intensity on them, (c) their detection by tools, and (d) their classification based on their elements, level of granularity, and relevance to software evolution, from the point of view of researchers and practitioners, in the context of software development.

  34. SLR (RQs)
      1. Which smells can be defined (identified) at the architecture level?
         1.1. Which are the pure architecture smells?
         1.2. Which code or design flaws (i.e. smells, violations or antipatterns) could also be applied at the architecture level?
         1.3. Are there similarities among architectural smells that have been defined with different names in the literature?

  35. SLR (RQs)
      2. Which architectural smells have attracted the most research attention?
      3. Which architectural smells are detectable by tools?
      4. How can we classify architectural smells with respect to: (a) the affected architectural element (interface, component and connector), (b) the portion of the system that is involved (or the whole system), and (c) the development history of the project (to indicate whether or not it is considered when identifying the architectural smell)?

  36. Thank you for your attention
