
Software Debugging: Past, Present, and Future. Alessandro (Alex) Orso, School of Computer Science, College of Computing, Georgia Institute of Technology. http://www.cc.gatech.edu/~orso/ Partially supported by: NSF, Google, IBM, and MSR


  1. Statistical Debugging. Intuition: debugging techniques can leverage multiple executions. • Tarantula • CBI • Ochiai • Many others! [timeline: 1960–2017, marker at 2006]

  2. Statistical Debugging. Intuition: debugging techniques can leverage multiple executions. • Tarantula • CBI • Ochiai • Causal inference based • Many others! [timeline: 1960–2017, marker at 2010]

  3. Statistical Debugging. Intuition: debugging techniques can leverage multiple executions. • Tarantula • CBI • Ochiai • Causal inference based • IR-based techniques [timeline: 1960–2017, marker at 2008]

  4. IR-Based Techniques
Bug ID: 90018
Summary: Native tooltips left around on CTabFolder.
Description: Hover over the PartStack CTabFolder inside Eclipse until some native tooltip is displayed, for example, the maximize button. When the tooltip appears, change perspectives using the keybinding. The CTabFolder gets hidden, but its tooltip is permanently displayed and never goes away, even if that CTabFolder is disposed (I'm assuming) when the perspective is closed.

  5. IR-Based Techniques
Source code file: CTabFolder.java (shown alongside the bug report from slide 4)

public class CTabFolder extends Composite {
    // tooltip
    int[] toolTipEvents = new int[] {SWT.MouseExit, SWT.MouseHover,
        SWT.MouseMove, SWT.MouseDown, SWT.DragDetect};
    Listener toolTipListener;
    ...
    /* Returns <code>true</code> if the CTabFolder only displays the selected tab
     * and <code>false</code> if the CTabFolder displays multiple tabs. */
    ...
    void onMouseHover(Event event) {
        showToolTip(event.x, event.y);
    }

    void onDispose() {
        inDispose = true;
        hideToolTip();
        ...
    }
}
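The IR-based idea sketched on these slides treats the bug report as a query and ranks source files by textual similarity. Below is a minimal, hypothetical sketch using TF-IDF weights and cosine similarity; the file contents and the exact weighting scheme are illustrative, not those of any specific tool:

```python
# Hedged sketch of IR-based fault localization: rank source files by
# textual similarity (TF-IDF cosine) to the bug-report text.
# The two "files" below are invented stand-ins for this example.
import math
import re
from collections import Counter

def tokens(text):
    # Naive tokenizer: lowercase alphabetic runs only.
    return re.findall(r"[a-z]+", text.lower())

def tfidf_vectors(docs):
    # Term frequency weighted by a simple inverse document frequency.
    toks = {name: Counter(tokens(text)) for name, text in docs.items()}
    df = Counter()
    for tc in toks.values():
        df.update(tc.keys())
    n = len(docs)
    return {name: {w: c * math.log(1 + n / df[w]) for w, c in tc.items()}
            for name, tc in toks.items()}

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

report = "Native tooltips left around on CTabFolder; tooltip never goes away"
files = {
    "CTabFolder.java": "CTabFolder tooltip listener onMouseHover "
                       "showToolTip hideToolTip dispose",
    "Grid.java": "grid layout cell row column resize paint",
}
vecs = tfidf_vectors({"report": report, **files})
ranking = sorted(files, key=lambda f: cosine(vecs["report"], vecs[f]),
                 reverse=True)
```

Because the report shares identifiers ("CTabFolder", "tooltip") with the faulty file, that file ranks first; this lexical overlap between report vocabulary and code identifiers is what IR-based techniques exploit.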

  6. Statistical Debugging (1960–2017). Intuition: debugging techniques can leverage multiple executions. • Tarantula • CBI • Ochiai • Causal inference based • IR-based techniques • ... • Many others!
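The spectrum-based scores named above (Tarantula, Ochiai) can be made concrete with a small sketch. This is an illustrative Python implementation of the two suspiciousness formulas over a hypothetical coverage matrix; the tests, statements, and verdicts are invented for the example:

```python
# Hedged sketch of spectrum-based fault localization scores.
# coverage: test name -> set of covered statement ids
# verdicts: test name -> True (pass) / False (fail)
from math import sqrt

def suspiciousness(coverage, verdicts):
    """Return per-statement (Tarantula, Ochiai) suspiciousness scores."""
    total_fail = sum(1 for v in verdicts.values() if not v)
    total_pass = sum(1 for v in verdicts.values() if v)
    stmts = set().union(*coverage.values())
    scores = {}
    for s in stmts:
        failed = sum(1 for t, cov in coverage.items()
                     if s in cov and not verdicts[t])
        passed = sum(1 for t, cov in coverage.items()
                     if s in cov and verdicts[t])
        fail_ratio = failed / total_fail if total_fail else 0.0
        pass_ratio = passed / total_pass if total_pass else 0.0
        denom = fail_ratio + pass_ratio
        tarantula = fail_ratio / denom if denom else 0.0
        och_denom = sqrt(total_fail * (failed + passed))
        ochiai = failed / och_denom if och_denom else 0.0
        scores[s] = (tarantula, ochiai)
    return scores

coverage = {"t1": {1, 2, 3}, "t2": {1, 3, 4}, "t3": {1, 2, 4}}
verdicts = {"t1": True, "t2": True, "t3": False}  # only t3 fails
scores = suspiciousness(coverage, verdicts)
ranked = sorted(scores, key=lambda s: scores[s][1], reverse=True)
```

Statements executed mostly by the failing run rank high; statement 3, covered only by passing tests, drops to the bottom, which is the intuition behind leveraging multiple executions.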

  7. Additional Techniques (1960–2017)
• Contracts (e.g., Meyer et al.)
• Counterexample-based (e.g., Groce et al., Ball et al.)
• Tainting-based (e.g., Leek et al.)
• Debugging of field failures (e.g., Jin et al.)
• Predicate switching (e.g., Zhang et al.)
• Fault localization for multiple faults (e.g., Steimann et al.)
• Debugging of concurrency failures (e.g., Park et al.)
• Automated data structure repair (e.g., Rinard et al.)
• Finding patches with genetic programming
• Domain-specific fixes (tests, web pages, comments, concurrency)
• Identifying workarounds/recovery strategies (e.g., Gorla et al.)
• Formula-based debugging (e.g., Jose et al., Ermis et al.)
• ...

  8. Additional Techniques (1960–2017). Not meant to be comprehensive! (same list as slide 7)

  9. Present: Can We Debug at the Push of a Button?

  10. Automated Debugging (rank-based): 1) 2) 3) 4) …

  11. Automated Debugging (rank-based): 1) 2) 3) 4) … "Here is a list of places to check out."

  12. Automated Debugging Conceptual Model: 1) 2) 3) 4) … "Ok, I will check out your suggestions one by one."

  13. Automated Debugging Conceptual Model: ✔ 1) ✔ 2) ✔ 3) 4) … "Found the bug!"

  14. Performance of Automated Debugging Techniques [chart: % of program to be examined to find fault (x-axis, 0–100) vs. % of faulty versions (y-axis, 0–100); spectra-based techniques on the Space and Siemens subjects]

  15. Performance of Automated Debugging Techniques. Mission Accomplished? [same chart: spectra-based techniques on the Space and Siemens subjects]

  16. Assumption #1: Locating a bug in 10% of the code is a great result 100 LOC ➡ 10 LOC 10,000 LOC ➡ 1,000 LOC 100,000 LOC ➡ 10,000 LOC

  17. Assumption #2: Programmers exhibit perfect bug understanding Do you see a bug?

  18. Assumption #3: Programmers inspect a list linearly and exhaustively Good for comparison, but is it realistic?

  19. Assumption #3: Programmers inspect a list linearly and exhaustively. Good for comparison, but is it realistic? Does the conceptual model make sense? Have we really evaluated it?

  20. Assumption #3: Programmers inspect a list linearly and exhaustively. Good for comparison, but is it realistic? Does the conceptual model make sense? Have we really evaluated it? Are we headed in the right direction?

  21. Are we headed in the right direction?
"Are Automated Debugging Techniques Actually Helping Programmers?" C. Parnin and A. Orso, ISSTA 2011
"Evaluating the Usefulness of IR-Based Fault Localization Techniques" Q. Wang, C. Parnin, and A. Orso, ISSTA 2015

  22. What do we know about automated debugging? Studies on tools Human studies

  23. What do we know about automated debugging? Studies on tools vs. human studies. "Let's see… over 50 years of research on automated debugging: 1962. Symbolic Debugging (UNIVAC FLIT); 1981. Weiser, Program Slicing; 1999. Delta Debugging; 2001. Statistical Debugging."

  24. What do we know about automated debugging? Studies on tools vs. human studies: Weiser, Kusumoto, Sherwood, Ko, DeLine

  25. Are these Techniques and Tools Actually Helping Programmers?
RQ1: Do programmers who use automated debugging tools locate bugs faster than programmers who do not use such tools?
RQ2: Is the effectiveness of debugging with automated tools affected by the faulty statement's rank?
RQ3: Do developers navigate a list of statements ranked by suspiciousness in the order provided?
RQ4: Does perfect bug understanding exist?

  26. Are these Techniques and Tools Actually Helping Programmers?
User studies: spectra-based fault localization; IR-based fault localization.
RQ1: Do programmers who use automated debugging tools locate bugs faster than programmers who do not use such tools?
RQ2: Is the effectiveness of debugging with automated tools affected by the faulty statement's rank?
RQ3: Do developers navigate a list of statements ranked by suspiciousness in the order provided?
RQ4: Does perfect bug understanding exist?

  27. Experimental Protocol: Setup
Participants: 34 developers; MS students; different levels of expertise (low, medium, high).
Tools: • Rank-based tool (Eclipse plug-in, logging) • Eclipse debugger

  28. Experimental Protocol: Setup
Software subjects: • Tetris (~2.5 KLOC) • NanoXML (~4.5 KLOC)

  29. Tetris Bug (Easier)

  30. NanoXML Bug (Harder)

  31. Experimental Protocol: Setup
Tasks: • Fault in Tetris • Fault in NanoXML • 30 minutes per task • Questionnaire at the end

  32. Experimental Protocol: Studies and Groups

  33. Experimental Protocol: Studies and Groups. Part 1: groups A and B.

  34. Experimental Protocol: Studies and Groups. Part 2: groups C and D (rank changed from 7 to 35 and from 83 to 16).

  35. Study Results: groups A, B, C, D on Tetris and NanoXML.

  36. Study Results. Tetris, A vs. B: not significantly different.

  37. Study Results. A vs. B and C vs. D: not significantly different, on both Tetris and NanoXML.

  38. Study Results. Stratifying participants: Tetris, A vs. B: significantly different for high performers; all other comparisons: not significantly different.

  39. Study Results. Analysis of results and questionnaires… (same comparisons as slide 38)

  40. Findings
RQ1: Do programmers who use automated debugging tools locate bugs faster than programmers who do not use such tools? Experts are faster when using the tool ➡ Yes (with caveats)
RQ2: Is the effectiveness of debugging with automated tools affected by the faulty statement's rank? Changes in rank have no significant effects ➡ No
RQ3: Do developers navigate a list of statements ranked by suspiciousness in the order provided? Programmers do not visit each statement in the list; they search
RQ4: Does perfect bug understanding exist? Perfect bug understanding is generally not a realistic assumption

  41. Future: Where Shall We Go Next?

  42. Feedback-based Debugging (Humans in the Loop). Intuition: we should amplify, rather than replace, human skills.

  43. [Diagram: passing and failing test executions feed SFL, which outputs a ranked list: 1) 2) 3) 4) …]

  44. [Same SFL diagram] Assumption: Programmers exhibit perfect bug understanding. Assumption: Programmers inspect a list linearly and exhaustively.

  45. [Same SFL diagram] ✗ Assumption: Programmers exhibit perfect bug understanding. ✗ Assumption: Programmers inspect a list linearly and exhaustively.

  46. [Diagram: a conjecture of the fault cause labels partial executions as Pass or Fail; these feed SFL together with the passing and failing runs, yielding the ranked list 1) 2) 3) 4) …]

  47. Conjecture of Fault Cause ➡ Partial Execution (Pass/Fail) ➡ SFL ➡ 1) 2) 3) 4) …

  48. Conjecture of Fault Cause ➡ Method-level Execution (Pass/Fail) ➡ SFL ➡ 1) 2) 3) 4) …

  49. Swift: Conjecture of Fault Cause ➡ Method-level Execution (Pass/Fail) ➡ SFL ➡ 1) 2) 3) 4) … High-level query: (state, params) ➡ method ➡ (return, state'); answers: ✓ | ✗ | ?
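The query loop on this slide can be simulated in a few lines. This is a deliberately simplified sketch, not Swift's actual algorithm: a hypothetical oracle stands in for the programmer's ✓/✗/? answers, and the re-ranking Swift performs with virtual test cases is reduced to simply moving past ✓-labeled methods:

```python
# Hedged sketch of a Swift-style feedback loop: the ranked list is refined
# using the programmer's verdicts on method-level input/output pairs.
# The "oracle" below is a stand-in for the programmer and knows the bug.

def feedback_loop(ranking, oracle):
    """Query methods in rank order. A '✗' verdict (wrong output) identifies
    the faulty method; a '✓' or '?' verdict moves on to the next candidate.
    A real tool would also fold each answer back into the spectrum as a
    virtual test case and re-rank."""
    queries = 0
    remaining = list(ranking)
    while remaining:
        method = remaining.pop(0)   # most suspicious unverified method
        queries += 1
        verdict = oracle(method)    # '✓' looks right, '✗' wrong, '?' unsure
        if verdict == "✗":
            return method, queries
    return None, queries

# Hypothetical method-level ranking produced by SFL.
ranking = ["parse", "resolve", "emit", "checksum"]
oracle = lambda m: "✗" if m == "emit" else "✓"
found, queries = feedback_loop(ranking, oracle)
```

Even in this toy run the fault is confirmed after a handful of high-level queries rather than a line-by-line inspection, which is the effect the slide's preliminary numbers (average ranking 66.3 vs. 4.3 queries) suggest at scale.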

  50. Example

  51. Iteration 1 ✓ ✗ ?

  52. Iteration 2 ✓ ✗ ?

  53. Iteration 3 ✓ ? ✗

  54. Swift: Conjecture of Fault Cause ➡ Method-level Execution (Pass/Fail) ➡ SFL ➡ 1) 2) 3) 4) … Answers to high-level queries ((state, params) ➡ method ➡ (return, state'); ✓ | ✗ | ?) become Virtual Test Cases.

  55. Swift: Preliminary empirical results: • 20 faults from 3 open-source projects • Average ranking: 66.3 • Average # of queries: 4.3

  56. Formula-based Debugging (AKA Failure Explanation). Intuition: executions can be expressed as formulas that we can reason about. • Cause Clue Clauses • Error invariants

  57. [Example program with an Input and an Assertion]

  58. The semantics of the program is encoded as a formula: Input ⋀ A ⋀ B ⋀ C ⋀ Assertion. For a failing execution (✘), this formula is unsatisfiable.

  59. Input ⋀ A ⋀ B ⋀ C ⋀ Assertion is unsatisfiable for the failing input (✘). A MAX-SAT solver finds a largest satisfiable subset of the clauses (the MAX-SAT set, ✔); the complement of that set (the CoMSS) identifies the suspicious statements.
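The CoMSS computation on this slide can be illustrated with a brute-force sketch on a toy trace formula. The program, clause names, and integer domain below are invented for the example; real formula-based debuggers (e.g., the Jose et al. approach cited on slide 56) use SMT and partial MAX-SAT solvers instead of enumeration:

```python
# Hedged sketch of formula-based debugging: encode a failing execution as
# Input ⋀ A ⋀ B ⋀ C ⋀ Assertion, then use brute-force partial MAX-SAT to
# find the largest subset of statement clauses consistent with the (hard)
# Input and Assertion. The complement (CoMSS) is the suspicious set.
# Toy program:  x = 3;  A: y = x + 1 (buggy, intent was x + 2);
#               B: z = y * 2;  C: w = z - y;  assert w == 5
from itertools import combinations, product

VARS = ("x", "y", "z", "w")
hard = [lambda e: e["x"] == 3,          # Input (kept, must hold)
        lambda e: e["w"] == 5]          # Assertion (kept, must hold)
soft = {"A": lambda e: e["y"] == e["x"] + 1,
        "B": lambda e: e["z"] == e["y"] * 2,
        "C": lambda e: e["w"] == e["z"] - e["y"]}

def satisfiable(clauses):
    # Exhaustive search over a small integer domain; a real tool asks an
    # SMT solver instead.
    for vals in product(range(12), repeat=len(VARS)):
        env = dict(zip(VARS, vals))
        if all(c(env) for c in clauses):
            return True
    return False

# Largest subset of soft clauses satisfiable together with the hard ones.
for k in range(len(soft), -1, -1):
    mss = next((set(names) for names in combinations(soft, k)
                if satisfiable(hard + [soft[n] for n in names])), None)
    if mss is not None:
        break
comss = set(soft) - mss   # suspicious statements
```

With all three statement clauses the formula is unsatisfiable, so the MAX-SAT set has size two and one statement lands in the CoMSS. In this toy instance dropping any single statement restores satisfiability, so {A} (the actual bug), {B}, and {C} are all CoMSSes; tools in this family enumerate and rank such sets, often across several failing inputs.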
