Statistical Debugging (timeline: 1960–2017)
• Intuition: debugging techniques can leverage multiple executions
• Tarantula
• CBI
• Ochiai
• Causal inference based
• IR-based techniques
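To make the intuition concrete, here is a minimal, self-contained sketch (my own toy code, not any of these tools' implementations) of how spectrum-based techniques such as Tarantula and Ochiai score statements from per-test coverage. The statement ids, coverage sets, and test outcomes are hypothetical.

```java
import java.util.*;

/** Toy spectrum-based fault localization: score statements from passing/failing test coverage. */
public class SpectrumFL {

    /** Coverage of one test: which statement ids it executed, and whether it passed. */
    record TestRun(Set<Integer> coveredStatements, boolean passed) {}

    /** Ochiai suspiciousness: ef / sqrt(totalFailed * (ef + ep)). */
    static double ochiai(int ef, int ep, int totalFailed) {
        double denom = Math.sqrt((double) totalFailed * (ef + ep));
        return denom == 0 ? 0 : ef / denom;
    }

    /** Tarantula suspiciousness: (ef/F) / (ef/F + ep/P). */
    static double tarantula(int ef, int ep, int totalFailed, int totalPassed) {
        double failRatio = totalFailed == 0 ? 0 : (double) ef / totalFailed;
        double passRatio = totalPassed == 0 ? 0 : (double) ep / totalPassed;
        double denom = failRatio + passRatio;
        return denom == 0 ? 0 : failRatio / denom;
    }

    public static void main(String[] args) {
        // Hypothetical coverage for statements 1..4; statement 3 is the seeded fault.
        List<TestRun> runs = List.of(
            new TestRun(Set.of(1, 2, 3), false),
            new TestRun(Set.of(1, 3, 4), false),
            new TestRun(Set.of(1, 2, 4), true),
            new TestRun(Set.of(1, 4), true));

        int totalFailed = (int) runs.stream().filter(r -> !r.passed()).count();
        int totalPassed = runs.size() - totalFailed;

        // Count, per statement, how many failing (index 0) and passing (index 1) tests cover it.
        Map<Integer, int[]> counts = new TreeMap<>();
        for (TestRun r : runs)
            for (int stmt : r.coveredStatements())
                counts.computeIfAbsent(stmt, s -> new int[2])[r.passed() ? 1 : 0]++;

        // Rank statements by Ochiai score, most suspicious first.
        counts.entrySet().stream()
            .sorted((a, b) -> Double.compare(
                ochiai(b.getValue()[0], b.getValue()[1], totalFailed),
                ochiai(a.getValue()[0], a.getValue()[1], totalFailed)))
            .forEach(e -> System.out.printf("stmt %d: ochiai=%.2f tarantula=%.2f%n",
                e.getKey(),
                ochiai(e.getValue()[0], e.getValue()[1], totalFailed),
                tarantula(e.getValue()[0], e.getValue()[1], totalFailed, totalPassed)));
    }
}
```

Running this prints statement 3, the seeded fault, at the top of the ranking (Ochiai score 1.0), since it is covered by all failing tests and no passing test.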
IR-Based Techniques
Bug ID: 90018
Summary: Native tooltips left around on CTabFolder.
Description: Hover over the PartStack CTabFolder inside eclipse until some native tooltip is displayed. For example, the maximize button. When the tooltip appears, change perspectives using the keybinding. The CTabFolder gets hidden, but its tooltip is permanently displayed and never goes away. Even if that CTabFolder is disposed (I'm assuming) when the perspective is closed.
IR-Based Techniques
Source code file: CTabFolder.java (shown next to Bug ID 90018 from the previous slide)

public class CTabFolder extends Composite {
    // tooltip
    int[] toolTipEvents = new int[] {SWT.MouseExit, SWT.MouseHover,
        SWT.MouseMove, SWT.MouseDown, SWT.DragDetect};
    Listener toolTipListener;

    /*
     * Returns <code>true</code> if the CTabFolder only displays the selected tab
     * and <code>false</code> if the CTabFolder displays multiple tabs.
     */
    …
    void onMouseHover(Event event) {
        showToolTip(event.x, event.y);
    }

    void onDispose() {
        inDispose = true;
        hideToolTip();
        …
    }
}
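The general idea behind IR-based fault localization is to treat the bug report as a query and rank source files by textual similarity to it. Below is a minimal sketch of that idea using plain term-frequency vectors and cosine similarity (real tools typically add TF-IDF weighting, stemming, and structural information). The file contents are abbreviated, hypothetical stand-ins, not the actual Eclipse sources.

```java
import java.util.*;

/** Toy IR-based fault localization: rank source files by textual similarity to a bug report. */
public class IrFaultLocalization {

    /** Split camelCase identifiers and non-alphanumeric separators, then lowercase. */
    static List<String> tokenize(String text) {
        String split = text.replaceAll("([a-z0-9])([A-Z])", "$1 $2");
        return Arrays.stream(split.toLowerCase().split("[^a-z0-9]+"))
                     .filter(t -> !t.isEmpty())
                     .toList();
    }

    /** Raw term frequencies of a token list. */
    static Map<String, Double> termFrequencies(List<String> tokens) {
        Map<String, Double> tf = new HashMap<>();
        for (String t : tokens) tf.merge(t, 1.0, Double::sum);
        return tf;
    }

    /** Cosine similarity between two sparse term-weight vectors. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, na = 0, nb = 0;
        for (var e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
            na += e.getValue() * e.getValue();
        }
        for (double v : b.values()) nb += v * v;
        return (na == 0 || nb == 0) ? 0 : dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    public static void main(String[] args) {
        // Hypothetical, heavily abbreviated corpus: file name -> file content.
        Map<String, String> files = Map.of(
            "CTabFolder.java", "public class CTabFolder extends Composite { Listener toolTipListener; "
                + "void onMouseHover(Event event) { showToolTip(event.x, event.y); } "
                + "void onDispose() { hideToolTip(); } }",
            "PartStack.java", "public class PartStack { void setPerspective(Perspective p) { updateTabs(); } }",
            "KeyBindingService.java", "public class KeyBindingService { void pressed(Event e) { switchPerspective(e); } }");

        String bugReport = "Native tooltips left around on CTabFolder. Hover over the PartStack CTabFolder "
            + "until some native tooltip is displayed. When the tooltip appears, change perspectives using "
            + "the keybinding; the CTabFolder gets hidden, but its tooltip is permanently displayed and never goes away.";

        // Rank files by cosine similarity between the bug report and each file's text.
        Map<String, Double> queryVec = termFrequencies(tokenize(bugReport));
        files.entrySet().stream()
             .sorted((x, y) -> Double.compare(
                 cosine(queryVec, termFrequencies(tokenize(y.getValue()))),
                 cosine(queryVec, termFrequencies(tokenize(x.getValue())))))
             .forEach(e -> System.out.printf("%-22s %.3f%n", e.getKey(),
                 cosine(queryVec, termFrequencies(tokenize(e.getValue())))));
    }
}
```

For this toy corpus the ranking puts CTabFolder.java first, since it shares the most vocabulary with the report.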
Statistical Debugging (timeline: 1960–2017)
• Tarantula, CBI, Ochiai, causal inference based, IR-based techniques, ...
• Many others!
Additional Techniques (not meant to be comprehensive!)
• Contracts (e.g., Meyer et al.)
• Counterexample-based (e.g., Groce et al., Ball et al.)
• Tainting-based (e.g., Leek et al.)
• Debugging of field failures (e.g., Jin et al.)
• Predicate switching (e.g., Zhang et al.)
• Fault localization for multiple faults (e.g., Steimann et al.)
• Debugging of concurrency failures (e.g., Park et al.)
• Automated data structure repair (e.g., Rinard et al.)
• Finding patches with genetic programming
• Domain-specific fixes (tests, web pages, comments, concurrency)
• Identifying workarounds/recovery strategies (e.g., Gorla et al.)
• Formula-based debugging (e.g., Jose et al., Ermis et al.)
• ...
Present: Can We Debug at the Push of a Button?
Automated Debugging (rank based)
The tool hands the developer a ranked list of suspicious statements (1, 2, 3, 4, ...): "Here is a list of places to check out."

Automated Debugging: Conceptual Model
The developer: "OK, I will check out your suggestions one by one."
The developer works down the list (✔ 1, ✔ 2, ✔ 3, ...) until: "Found the bug!"
Performance of Automated Debugging Techniques
[Plot: % of faulty versions (y-axis, 0–100) vs. % of program to be examined to find the fault (x-axis, 0–100), for spectra-based techniques on the Space and Siemens subjects.]
Mission accomplished?
Assumption #1: Locating a bug in 10% of the code is a great result
100 LOC ➡ 10 LOC
10,000 LOC ➡ 1,000 LOC
100,000 LOC ➡ 10,000 LOC
Assumption #2: Programmers exhibit perfect bug understanding
Do you see a bug?
Assumption #3: Programmers inspect a list linearly and exhaustively
Good for comparison, but is it realistic?
Does the conceptual model make sense? Have we really evaluated it?
Are we headed in the right direction?
Are we headed in the right direction?
"Are Automated Debugging Techniques Actually Helping Programmers?" C. Parnin and A. Orso, ISSTA 2011
"Evaluating the Usefulness of IR-Based Fault Localization Techniques" Q. Wang, C. Parnin, and A. Orso, ISSTA 2015
What do we know about automated debugging?
Two axes: studies on tools vs. human studies.
Studies on tools: let's see... over 50 years of research on automated debugging:
• 1962: Symbolic debugging (UNIVAC FLIT)
• 1981: Weiser, program slicing
• 1999: Delta debugging
• 2001: Statistical debugging
Human studies of debugging: Weiser, Kusumoto, Sherwood, Ko, DeLine.
Are these Techniques and Tools Actually Helping Programmers?
User studies of spectra-based and IR-based fault localization.
RQ1: Do programmers who use automated debugging tools locate bugs faster than programmers who do not use such tools?
RQ2: Is the effectiveness of debugging with automated tools affected by the faulty statement's rank?
RQ3: Do developers navigate a list of statements ranked by suspiciousness in the order provided?
RQ4: Does perfect bug understanding exist?
Experimental Protocol: Setup
Participants: 34 developers (MS students) with different levels of expertise (low, medium, high)
Tools:
• Rank-based tool (Eclipse plug-in, with logging)
• Eclipse debugger
Experimental Protocol: Setup
Software subjects:
• Tetris (~2.5 KLOC)
• NanoXML (~4.5 KLOC)
Tetris Bug (Easier)
NanoXML Bug (Harder)
Experimental Protocol: Setup
Tasks:
• Fault in Tetris
• Fault in NanoXML
• 30 minutes per task
• Questionnaire at the end
Experimental Protocol: Studies and Groups
Part 1: groups A and B
Part 2: groups C and D, with modified rankings (faulty statement's rank changed: 7 ➡ 35 and 83 ➡ 16)
Study Results
Groups A and B (Part 1) and C and D (Part 2, modified ranks), on Tetris and NanoXML:
• Tetris, A vs. B: not significantly different overall; significantly different for high performers when stratifying participants
• NanoXML, A vs. B: not significantly different
• Tetris, C vs. D: not significantly different
• NanoXML, C vs. D: not significantly different
Plus analysis of results and questionnaires...
Findings
RQ1: Do programmers who use automated debugging tools locate bugs faster than programmers who do not use such tools?
Experts are faster when using the tool ➡ Yes (with caveats)
RQ2: Is the effectiveness of debugging with automated tools affected by the faulty statement's rank?
Changes in rank have no significant effects ➡ No
RQ3: Do developers navigate a list of statements ranked by suspiciousness in the order provided?
Programmers do not visit each statement in the list; they search
RQ4: Does perfect bug understanding exist?
Perfect bug understanding is generally not a realistic assumption
Future: Where Shall We Go Next?
Feedback-based Debugging (Humans in the Loop)
Intuition: we should amplify, rather than replace, human skills
1)" 2)" Pass 3)" SFL 4)" …" Fail
1)" 2)" Pass 3)" SFL 4)" …" Fail Assumption: Programmers exhibit Assumption: Programmers inspect a perfect bug understanding list linearly and exhaustively
1)" 2)" Pass 3)" SFL 4)" …" ✗ ✗ Fail Assumption: Programmers exhibit Assumption: Programmers inspect a perfect bug understanding list linearly and exhaustively
Swift: feedback-based debugging on top of SFL.
Starting from the SFL ranking, the developer forms a conjecture about the cause of the fault; the tool selects a partial, method-level execution related to that conjecture and presents it as a high-level query: given the entry state and parameters, are the method's return value and resulting state (state') correct?
The developer answers ✓ (correct), ✗ (incorrect), or ? (don't know).
Example
Iteration 1 (developer answers ✓ / ✗ / ?)
Iteration 2 (✓ / ✗ / ?)
Iteration 3 (✓ / ✗ / ?)
Swift: the developer's answers to the high-level queries are converted into virtual test cases, which are fed back into SFL alongside the original passing and failing executions to refine the ranking.
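To illustrate the loop, here is a hypothetical sketch of how the query-and-answer cycle could be represented. The class and method names are my own invention, not Swift's actual API, and the re-ranking step is only indicated; answers labeled incorrect behave like failing virtual tests, and correct ones like passing tests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch of a feedback-based debugging loop in the spirit of the slide:
 * method-level executions are shown as high-level queries (entry state + params ->
 * return value + exit state), and the developer's answers become "virtual test cases"
 * that can be fed back into spectrum-based fault localization. Illustrative only.
 */
public class FeedbackDebuggingSketch {

    enum Answer { CORRECT, INCORRECT, UNKNOWN }

    /** One method-level execution captured from a failing run. */
    record MethodExecution(String method, Map<String, Object> entryState, List<Object> params,
                           Object returnValue, Map<String, Object> exitState) {}

    /** A labeled execution behaves like a small extra test for the code it covers. */
    record VirtualTestCase(MethodExecution execution, boolean passed) {}

    /** Whoever answers the queries: in the real setting, the developer. */
    interface Oracle { Answer judge(MethodExecution query); }

    /**
     * Query executions in suspiciousness order; stop when the developer flags one as
     * incorrect (the search is then narrowed around that method), and return the
     * collected virtual test cases so SFL can re-rank with the extra evidence.
     */
    static List<VirtualTestCase> refine(List<MethodExecution> rankedExecutions, Oracle developer) {
        List<VirtualTestCase> virtualTests = new ArrayList<>();
        for (MethodExecution query : rankedExecutions) {
            Answer answer = developer.judge(query);
            if (answer == Answer.UNKNOWN) continue;           // "?" gives no new evidence
            virtualTests.add(new VirtualTestCase(query, answer == Answer.CORRECT));
            if (answer == Answer.INCORRECT) break;            // first wrong result narrows the search
        }
        return virtualTests;
    }

    public static void main(String[] args) {
        // Tiny hypothetical run: two captured executions, the second one is wrong.
        List<MethodExecution> ranked = List.of(
            new MethodExecution("push", Map.of("size", 0), List.of(42), null, Map.of("size", 1)),
            new MethodExecution("pop",  Map.of("size", 1), List.of(),   null, Map.of("size", 1)));

        Oracle scripted = q -> q.method().equals("pop") ? Answer.INCORRECT : Answer.CORRECT;
        refine(ranked, scripted).forEach(vt ->
            System.out.println(vt.execution().method() + " -> " + (vt.passed() ? "pass" : "fail")));
    }
}
```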
Swift: preliminary empirical results
• 20 faults from 3 open-source projects
• Average ranking: 66.3
• Average # of queries: 4.3
Formula-based Debugging (AKA Failure Explanation)
• Intuition: executions can be expressed as formulas that we can reason about
• Cause Clue Clauses
• Error invariants
A failing execution is encoded as a trace formula: Input ⋀ A ⋀ B ⋀ C ⋀ Assertion, where A, B, and C capture the semantics of the executed program statements. Because the run violates the assertion, the formula is unsatisfiable.
Handing the formula to a MAX-SAT solver yields a maximal satisfiable subset of the statement clauses; the complement of that set (the CoMSS) identifies statements whose relaxation would let the assertion hold, and those statements are reported as potential fault locations.
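For illustration, here is a toy instance (my own example, not the one used in the talk). Consider the failing run

in = 2; x = in; y = x + 1; z = y * 2; assert z == 5;

where the intended statement was z = y + 2. Its trace formula over the integers is

(in = 2) ⋀ (x = in) ⋀ (y = x + 1) ⋀ (z = 2y) ⋀ (z = 5)

with the first and last conjuncts playing the roles of Input and Assertion, and the middle three the roles of A, B, and C. The conjunction is unsatisfiable, since it forces both z = 6 and z = 5. Treating Input and Assertion as hard clauses and A, B, C as soft clauses, a partial MAX-SAT solver can satisfy at most {A, B}; the complement {C}, i.e., the buggy statement z = y * 2, is reported as the potential fault location.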