
How Many of All Bugs Do We Find? A Study of Static Bug Detectors

  1. How Many of All Bugs Do We Find? A Study of Static Bug Detectors
     Andrew Habib, Michael Pradel
     TU Darmstadt, Germany
     software-lab.org

  3. Static Bug Detection
     Error Prone
     - General framework
     - Scalable static analysis
     - Set of checkers for specific bug patterns

  6. How Many Bugs Do They Find?
     Given a representative set of real-world bugs, how many of them do static bug detectors find?
     This talk: Empirical study with 594 real-world Java bugs and 3 popular static checkers

  7. Real-World Bugs
     - 594 bugs from 15 popular Java projects
     - Extended version of Defects4J data set
     - Why this set?
       - Gathered independently
       - Used in other bug-related studies (*)
       - Contains real fixes by developers
     (*) Just et al., 2014 (mutation testing); Shamshiri et al., 2015 (test generation); Pearson et al., 2017 (fault localization); Martinez et al., 2017 (program repair)

  8. Defects4J: Files Involved in Bug
     [Bar chart: number of bugs (y-axis) by number of buggy files (x-axis: 1, 2, 3, 4, 5, 6, 7, 11). Bar labels, tallest to shortest: 501, 64, 12, 10, 4, 1, 1, 1]

  9. Defects4J: Size of Bug Fix
     [Bar chart: number of bugs (y-axis) by diff size between buggy and fixed versions in LoC (x-axis bins: 1-4, 5-9, 10-14, 15-19, 20-24, 25-49, 50-74, 75-99, 100-199, 200-1,999). Bar labels, tallest to shortest: 296, 128, 54, 44, 29, 29, 6, 6, 1, 1]

  10. Previous Approach
      How to determine which bugs are found? [Thung et al., 2012]
      - Get diff between buggy and fixed code
      - Run tool on code with buggy lines
      - If warning on buggy line: Bug found
      - Result: 50% – 95% of all bugs found
      - Limitation:
        - No check that warning points to bug
        - One tool flags up to 57% of all lines
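To see why a purely line-based criterion can overestimate detection, here is a back-of-the-envelope model. It is an assumption made for illustration and not part of the talk: if a tool flags a fraction p of all lines and the flagged lines were independent of the bug location, a fix touching k lines would coincide with at least one warning with probability 1 - (1 - p)^k. The class and method names are hypothetical.

```java
/** Illustrative model only: assumes warnings fall on lines independently of the bug location. */
final class CoincidentalMatch {
    /**
     * Probability that at least one of buggyLines lines carries a warning
     * when a tool flags flaggedFraction of all lines purely by chance.
     */
    static double probability(double flaggedFraction, int buggyLines) {
        return 1.0 - Math.pow(1.0 - flaggedFraction, buggyLines);
    }

    public static void main(String[] args) {
        // 0.57 is the "up to 57% of all lines" figure from the slide;
        // a fix touching 4 lines is an arbitrary example value.
        System.out.printf("%.3f%n", probability(0.57, 4));
    }
}
```

Under this simplified model, a four-line fix would coincide with a warning about 97% of the time even if no warning had anything to do with the bug, which is the concern behind the "no check that warning points to bug" limitation.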

  15. Methodology: Overview
      Bugs + fixes, Bug detectors
      -> Automated filtering of warnings (Diff-based, Fixed warnings-based, Combined)
      -> Candidates for detected bugs
      -> Manual inspection of candidates
      -> Detected bugs

  16. Methodology: Diff-based
      Bugs + fixes, Bug detectors
      -> Automated filtering of warnings: Diff-based
      -> Candidates for detected bugs
      -> Manual inspection of candidates
      -> Detected bugs

  21. Methodology: Diff-based
      1) Identify lines changed to fix bug
      2) Intersect with lines with warning
      [Figure: buggy file next to fixed file; legend: modified line, removed line, newly inserted line]

  23. Methodology: Diff-based
      1) Identify lines changed to fix bug
      2) Intersect with lines with warning
      [Figure: buggy file next to fixed file; legend: warnings by bug detector, candidate for detected bug]
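The two diff-based steps can be sketched as a simple set intersection. This is a minimal illustration, not the authors' implementation: the `LineWarning` type, the checker names, and the choice to represent the diff as a set of line numbers in the buggy file are all assumptions, and the sketch leaves open how newly inserted lines (which exist only in the fixed file) are mapped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class DiffBasedFilter {
    /** Keeps only the warnings whose line is among the lines changed by the fix. */
    static List<LineWarning> candidates(Set<Integer> changedLines, List<LineWarning> warnings) {
        List<LineWarning> result = new ArrayList<>();
        for (LineWarning w : warnings) {
            if (changedLines.contains(w.line)) {
                result.add(w);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical checker names and line numbers, for illustration only.
        List<LineWarning> warnings = List.of(
                new LineWarning("MissingOverride", 10),
                new LineWarning("UnusedVariable", 42));
        // Suppose the fix changed lines 10 and 11 of the buggy file.
        for (LineWarning w : candidates(Set.of(10, 11), warnings)) {
            System.out.println(w.checker + " at line " + w.line);
        }
    }
}

/** Hypothetical warning model: a checker name plus the flagged line in the buggy file. */
final class LineWarning {
    final String checker;
    final int line;

    LineWarning(String checker, int line) {
        this.checker = checker;
        this.line = line;
    }
}
```

A warning that survives this filter is only a candidate: it sits on a changed line, but nothing yet says it describes the bug.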

  26. Example:
          public Dfp multiply(final int x) {
            return multiplyFast(x);
          }
      Warning: Missing @Override
      Bug fix (slide diff annotations: -1, +1):
          public Dfp multiply(final int x) {
            if (x >= 0 && x < RADIX) {
              return multiplyFast(x);
            } else {
              return multiply(newInstance(x));
            }
          }
      Candidate for detected bug

  27. Methodology: Fixed Warnings-based
      Bugs + fixes, Bug detectors
      -> Automated filtering of warnings: Fixed warnings-based
      -> Candidates for detected bugs
      -> Manual inspection of candidates
      -> Detected bugs

  31. Methodology: Fixed Warnings-based
      1) Compare warnings before and after fix
      2) Warning that disappears was for bug
      [Figure: buggy file next to fixed file; legend: warnings by bug detector, candidate for detected bug]
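The before/after comparison amounts to a set difference over warnings. The sketch below assumes each warning can be reduced to a stable string key (for example, checker plus enclosing method) so that warnings are comparable even when line numbers shift between versions; that keying scheme and all names are assumptions for illustration, not taken from the talk.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class FixedWarningsFilter {
    /** Returns the warnings reported on the buggy version that are no longer reported after the fix. */
    static Set<String> candidates(Set<String> beforeFix, Set<String> afterFix) {
        Set<String> disappeared = new LinkedHashSet<>(beforeFix); // copy, so inputs stay untouched
        disappeared.removeAll(afterFix);
        return disappeared;
    }

    public static void main(String[] args) {
        // Hypothetical warning keys of the form "<checker> in <method>".
        Set<String> before = Set.of(
                "Chaining constructor ignores argument in Week(Date, TimeZone)",
                "Missing @Override in toString()");
        Set<String> after = Set.of("Missing @Override in toString()");
        System.out.println(candidates(before, after));
    }
}
```

Unlike the diff-based filter, this one does not need line positions at all, which is why the two can surface different candidates.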

  33. Example
          public Week(Date time, TimeZone zone) {
            this(time, RegularTimePeriod.DEFAULT_TIME_ZONE, Locale.getDefault());
          }
      Warning: Chaining constructor ignores argument
      Bug fix:
          public Week(Date time, TimeZone zone) {
            this(time, zone, Locale.getDefault());
          }
      Candidate for detected bug

  34. Methodology: Combined
      Bugs + fixes, Bug detectors
      -> Automated filtering of warnings: Combined = Diff-based + Fixed warnings-based
      -> Candidates for detected bugs
      -> Manual inspection of candidates
      -> Detected bugs
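Reading "Combined = Diff-based + Fixed warnings-based" as a union of the two candidate sets, a minimal sketch could look as follows. The union interpretation, the string identifiers for candidates, and the class name are assumptions made for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class CombinedCandidates {
    /** Union of both candidate sets; a candidate found by either filter is kept once. */
    static Set<String> union(Set<String> diffBased, Set<String> fixedWarningsBased) {
        Set<String> combined = new LinkedHashSet<>(diffBased);
        combined.addAll(fixedWarningsBased);
        return combined;
    }

    public static void main(String[] args) {
        // Hypothetical candidate identifiers.
        Set<String> diffBased = Set.of("bug-1/warning-a", "bug-2/warning-b");
        Set<String> fixedBased = Set.of("bug-2/warning-b", "bug-3/warning-c");
        System.out.println(union(diffBased, fixedBased).size() + " candidates to inspect");
    }
}
```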

  35. Results

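  38. Warnings to Inspect

      | Tool        | All warnings, per bug: Min | Max | Avg  | All warnings: Total | Candidates only |
      |-------------|----------------------------|-----|------|---------------------|-----------------|
      | Error Prone | 0                          | 148 | 7.58 | 4,402               | 53              |
      | Infer       | 0                          | 36  | 0.33 | 198                 | 32              |
      | SpotBugs    | 0                          | 47  | 1.1  | 647                 | 68              |
      | Total       |                            |     |      | 5,247               | 153             |

      97% of all warnings are removed by the automated filtering step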
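The 97% figure follows directly from the totals in the table: 153 candidates remain out of 5,247 warnings. A small check of that arithmetic, with a hypothetical class name:

```java
final class WarningReduction {
    /** Share of warnings (in percent) discarded when only candidates are kept. */
    static double removedPercent(int totalWarnings, int candidates) {
        return 100.0 * (totalWarnings - candidates) / totalWarnings;
    }

    public static void main(String[] args) {
        int total = 4402 + 198 + 647;   // Error Prone + Infer + SpotBugs, from the table
        int candidates = 53 + 32 + 68;  // candidates per tool, from the table
        System.out.println(total + " warnings, " + candidates + " candidates");
        System.out.println(Math.round(removedPercent(total, candidates)) + "% removed");
    }
}
```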

  39. Manual Inspection
      Distinguish coincidental matches from actually detected bugs
      Candidate = (bug, warning)
      - Full match
      - Partial match
      - Mismatch

  41. Manual Inspection: Example
          public Dfp multiply(final int x) {
            return multiplyFast(x);
          }
      Warning: Missing @Override
      Bug fix:
          public Dfp multiply(final int x) {
            if (x >= 0 && x < RADIX) {
              return multiplyFast(x);
            } else {
              return multiply(newInstance(x));
            }
          }
      Mismatch

  43. Manual Inspection: Example (2)
          public Week(Date time, TimeZone zone) {
            this(time, RegularTimePeriod.DEFAULT_TIME_ZONE, Locale.getDefault());
          }
      Warning: Chaining constructor ignores argument
      Bug fix:
          public Week(Date time, TimeZone zone) {
            this(time, zone, Locale.getDefault());
          }
      Full match

  44. Most Bugs are Missed
      Three tools together: Detect 27 of 594 bugs (less than 5%)
      [Venn diagram of detected bugs for SpotBugs, Error Prone, and Infer; region counts as extracted: 6, 14, 2, 0, 0, 2, 3]
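As a quick consistency check on this slide, the seven Venn region counts add up to the 27 detected bugs, and 27 of 594 is indeed below 5%. The sketch deliberately does not assign the counts to particular tools, since the extracted text does not preserve which region is which; the class name is hypothetical.

```java
import java.util.stream.IntStream;

final class DetectionRate {
    /** Sums the disjoint regions of the Venn diagram to get the number of distinct detected bugs. */
    static int totalDetected(int... vennRegions) {
        return IntStream.of(vennRegions).sum();
    }

    /** Detected bugs as a percentage of all studied bugs. */
    static double percent(int detected, int allBugs) {
        return 100.0 * detected / allBugs;
    }

    public static void main(String[] args) {
        int detected = totalDetected(6, 14, 2, 0, 0, 2, 3); // region counts from the slide
        System.out.println(detected + " of 594 bugs");
        System.out.println(percent(detected, 594) < 5.0 ? "less than 5%" : "5% or more");
    }
}
```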

  45. Why are Most Bugs Missed?
      Manual inspection of random sample of 20 missed bugs: 14 are domain-specific
      - Unrelated to any of the supported bug patterns
      - Application-specific algorithms
      - Forgot to handle special case
      - Difficult to decide whether behavior is intended
