Class 20
• Fault localization (cont'd)
• Test-data generation
• Exam review: Nov 3, after class to 7:30
  • Responsible for all material up through Nov 3 (through test-data generation)
  • Send questions beforehand so all can prepare
• Exam: Nov 10
• Final project presentations: Dec 1, 3; 4:35-6:45
• Assign (see Schedule for links)
  • Problem Set 9 (discuss)
  • Readings

Fault Localization Using Tarantula
• What information does Tarantula use to compute the suspiciousness (and ranking) of statements in the program?
• How is this information used?
• Are there other ways to compute the suspiciousness using this information?
• What information other than statement coverage could be used for fault localization?
• Do you think statement coverage would have worked for tritype?
• How could we use fault localization to identify which changes are most suspicious after a build?
Improving Fault-localization Efficiency
[Figure: sequential debugging process. Execute all tests on P; debug the failed tests to obtain P'; execute and debug again to obtain P''; ... ; all tests pass and Pi is failure-free.]
• Are all failing tests caused by the same fault?
• Can we associate groups of tests with different faults?
• Can we reduce debugging effort by considering these groups individually?
• Can we reduce debugging effort by considering these groups simultaneously?

Improving Fault-localization Efficiency
(• = statement executed by the test, . = not executed)
t1    t2    t3    t4    t5    t6    t7    t8    t9    t10   Statement
3,3,5 1,2,3 3,2,2 5,5,5 1,1,4 5,3,4 3,2,1 2,1,3 5,4,2 5,2,6 mid() { int x,y,z,m;
•     •     •     •     •     •     •     •     •     •     1: read("Enter 3 integers:",x,y,z);
•     •     •     •     •     •     •     •     •     •     2: m = z;
•     •     •     •     •     •     •     •     •     •     3: if (y<z)
•     •     .     .     •     •     .     •     .     •     4:   if (x<y)
.     •     .     .     .     .     .     .     .     .     5:     m = y;
•     .     .     .     •     •     .     •     .     •     6:   else if (x<z)
•     .     .     .     •     .     .     •     .     •     7:     m = y;
.     .     •     •     .     .     •     .     •     .     8: else
.     .     •     •     .     .     •     .     •     .     9:   if (x>y)
.     .     •     .     .     .     •     .     •     .     10:     m = z;
.     .     .     •     .     .     .     .     .     .     11:   else if (x>z)
.     .     .     .     .     .     .     .     .     .     12:     m = x;
•     •     •     •     •     •     •     •     •     •     13: print("Middle number is:", m); }
P     P     P     P     P     P     F     F     F     F     Pass/fail status
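The pass/fail row of the mid() example can be checked by running the program on the ten inputs. The following is a minimal sketch, assuming a direct Python port of the slide's pseudocode; the function names and the median-based oracle `true_middle` are assumptions of this sketch, not part of the slides.

```python
def mid_as_on_slide(x, y, z):
    """Python port of the slide's mid(), statement for statement.

    The numbers in the comments are the slide's statement numbers.
    The statements are kept exactly as shown, faults included.
    """
    m = z                    # 2
    if y < z:                # 3
        if x < y:            # 4
            m = y            # 5
        elif x < z:          # 6
            m = y            # 7
    else:                    # 8
        if x > y:            # 9
            m = z            # 10
        elif x > z:          # 11
            m = x            # 12
    return m                 # 13 (the slide prints m instead of returning it)


def true_middle(x, y, z):
    """Oracle assumed for this sketch: the median of the three inputs."""
    return sorted((x, y, z))[1]


# Test inputs copied from the slide.
TESTS = {
    "t1": (3, 3, 5), "t2": (1, 2, 3), "t3": (3, 2, 2), "t4": (5, 5, 5),
    "t5": (1, 1, 4), "t6": (5, 3, 4), "t7": (3, 2, 1), "t8": (2, 1, 3),
    "t9": (5, 4, 2), "t10": (5, 2, 6),
}

# "P" when the program's answer matches the oracle, "F" otherwise.
STATUS = {
    name: "P" if mid_as_on_slide(*args) == true_middle(*args) else "F"
    for name, args in TESTS.items()
}

print(" ".join(STATUS.values()))  # P P P P P P F F F F
```

Under this oracle the run reproduces the slide's status row: t1 to t6 pass and t7 to t10 fail.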
Debugging Process
[Figure: two debugging chains. First chain: execute all tests on P, debug the failed tests to obtain P', then P'', ... , until all tests pass (Pi is failure-free). Second chain: execute tests on P, some tests fail, debug to obtain P', ... , until all tests pass (Pj is failure-free).]
Debugging Process
[Figure: three debugging chains starting from P. Each chain repeats execute, some tests fail, debug, until all tests pass; the chains end in failure-free programs Pi, Pj, and Pk.]

Hierarchy of Bugs
• Faults often dominate each other
• Failing test cases are first caused by a set of initial faults
• Once initial faults are fixed, other faults manifest themselves
[Figure: Faults 1 through 8 arranged along a time axis.]
Debugging Process
[Figure: three debugging chains starting from P, ending in failure-free programs Pi, Pj, and Pk.]
Potential benefits:
• Reduced time to failure-free program
• Less "noise" in locating each fault
• Better utilization of developer effort
Potential costs:
• Overhead to partition test cases
• Multiple debuggers (developers)

Debugging Process
Crucial problem:
• Partitioning failed tests into groups of similar behavior that focus on different faults
• Fault-focusing clusters of failed test cases
Fault-focusing Clusters: Overview
[Figure: test cases t01 through t10; the failing tests are grouped into clusters {t07, t09} and {t08, t10}.]
Fault-focusing clusters:
• Clusters of failing test cases
• Clusters failing in similar way
• Each cluster targeting a different fault

Fault-focusing Clusters, Specialized Test Suites
Specialized test suites: fault-focusing clusters combined with passing test cases
[Figure: each cluster of failing tests ({t07, t09} and {t08, t10}) combined with the passing tests t01 through t06.]
Fault-focusing Clusters, Specialized Test Suites
Specialized test suites: fault-focusing clusters combined with passing test cases
[Figure: Developer 1 receives t07, t09 plus t01 through t06; Developer 2 receives t08, t10 plus t01 through t06.]
• Find faults one at a time using specialized test suites
• Find faults at the same time (in parallel) using specialized test suites
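Building a specialized test suite amounts to taking one cluster of failing tests and adding the passing tests to it. A minimal sketch follows, using the cluster membership shown on the slide; the function name `specialized_suites` and the dictionary layout are assumptions made for illustration.

```python
def specialized_suites(clusters, passing):
    """Combine each fault-focusing cluster with all passing test cases.

    clusters: mapping from a label to the failing tests in that cluster.
    passing:  the passing test cases, shared by every specialized suite.
    """
    return {
        label: sorted(set(failing) | set(passing))
        for label, failing in clusters.items()
    }


PASSING = ["t01", "t02", "t03", "t04", "t05", "t06"]
CLUSTERS = {
    "Developer 1": ["t07", "t09"],
    "Developer 2": ["t08", "t10"],
}

SUITES = specialized_suites(CLUSTERS, PASSING)
for label, suite in SUITES.items():
    print(label, suite)
```

Each developer then runs fault localization on their own suite, either one after the other or in parallel.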
Fault-focusing Clusters
[Figure: pipeline with stages Execution, Clustering, and Fault Localization; labels on the flow: failed test cases, execution information, specialized test suites, suspiciousness and ranks.]

Clustering by behavior models
Dynamic information
• profiles (branch, method-method, …)
• only failed tests
Statistical analysis, machine learning
• generate models for each execution
• cluster models
Fault-localization for stopping point
Clustering Behavior Models
• Models: discrete-time Markov chains (DTMCs) from profiles (branch, method, …)
• Clustering: iterative, merging the two most similar clusters according to Sim1
• Sim1: sum of absolute differences between matching transitions in the DTMCs being compared
Most difficult problem of clustering is determining a good stopping criterion.
What is a good stopping point for the clustering for fault-focused clusters?
[Figure: clustering dendrogram. t07 and t09 merge into t07-09; t08 and t10 merge into t08-10; these merge into t07-09-08-10.]
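One step of the Sim1-based clustering can be sketched as follows. This is a minimal illustration under several assumptions that are not on the slides: each model is estimated from a single statement trace (the slides use branch or method profiles), a transition missing from one model counts as probability 0, and the names `dtmc_from_trace`, `sim1`, and `closest_pair` are hypothetical. How two merged clusters are turned into a new model is left out because the slides do not specify it.

```python
from itertools import combinations


def dtmc_from_trace(trace):
    """Estimate DTMC transition probabilities from one execution trace."""
    counts = {}
    for src, dst in zip(trace, trace[1:]):
        outgoing = counts.setdefault(src, {})
        outgoing[dst] = outgoing.get(dst, 0) + 1
    model = {}
    for src, outgoing in counts.items():
        total = sum(outgoing.values())
        for dst, n in outgoing.items():
            model[(src, dst)] = n / total
    return model


def sim1(a, b):
    """Sum of absolute differences between matching transitions.

    Smaller values mean more similar models.
    Assumption: a transition absent from one model has probability 0.
    """
    return sum(abs(a.get(t, 0.0) - b.get(t, 0.0)) for t in set(a) | set(b))


def closest_pair(models):
    """Return the names of the two most similar models (lowest Sim1)."""
    return min(
        combinations(sorted(models), 2),
        key=lambda pair: sim1(models[pair[0]], models[pair[1]]),
    )


# Statement traces of the four failing tests on the mid() example
# (illustrative stand-ins for the profiles mentioned on the slide).
TRACES = {
    "t07": [1, 2, 3, 8, 9, 10, 13],
    "t08": [1, 2, 3, 4, 6, 7, 13],
    "t09": [1, 2, 3, 8, 9, 10, 13],
    "t10": [1, 2, 3, 4, 6, 7, 13],
}
MODELS = {name: dtmc_from_trace(trace) for name, trace in TRACES.items()}

print(closest_pair(MODELS))                  # ('t07', 't09')
print(sim1(MODELS["t07"], MODELS["t08"]))    # 8.0
```

With these traces, t07 and t09 have identical models and are merged first, which matches the first merge in the dendrogram.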
Fault Localization for Stopping Point
Sim2 = |A ∩ B| / |A ∪ B|
[Figure: clustering dendrogram. t07 and t09 merge into t07-09; t08 and t10 merge into t08-10; these merge into t07-09-08-10. A rank is computed for cluster t07-09-08-10 using test cases t01 through t10.]
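Reading the Sim2 formula on this slide as the size of the intersection of two sets A and B divided by the size of their union (the Jaccard index), it can be sketched as below. The slide does not say what A and B contain; the example assumes they are sets of suspicious statement numbers produced by fault localization for two clusters, and the sets shown are illustrative values, not results from the slides.

```python
def sim2(a, b):
    """Jaccard similarity: |A intersect B| / |A union B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumption of this sketch: two empty sets count as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical sets of suspicious statements for illustration only.
merged_cluster = {7, 8, 9, 10}
left_cluster = {8, 9, 10}
right_cluster = {4, 6, 7}

print(sim2(merged_cluster, left_cluster))   # 0.75
print(sim2(left_cluster, right_cluster))    # 0.0
```

A low Sim2 between two clusters suggests they point at different statements, which is the kind of signal a stopping criterion could use.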
Tarantula: Fault Localization
(• = statement executed by the test, . = not executed)
t1    t2    t3    t4    t5    t6    t7    t8    t9    t10   Susp.  Statement
3,3,5 1,2,3 3,2,2 5,5,5 1,1,4 5,3,4 3,2,1 2,1,3 5,4,2 5,2,6        mid() { int x,y,z,m;
•     •     •     •     •     •     •     •     •     •     0.50   1: read("Enter 3 integers:",x,y,z);
•     •     •     •     •     •     •     •     •     •     0.50   2: m = z;
•     •     •     •     •     •     •     •     •     •     0.50   3: if (y<z)
•     •     .     .     •     •     .     •     .     •     0.43   4:   if (x<y)
.     •     .     .     .     .     .     .     .     .     0.00   5:     m = y;
•     .     .     .     •     •     .     •     .     •     0.50   6:   else if (x<z)
•     .     .     .     •     .     .     •     .     •     0.60   7:     m = y;  //bug
.     .     •     •     .     .     •     .     •     .     0.60   8: else
.     .     •     •     .     .     •     .     •     .     0.60   9:   if (x>y)
.     .     •     .     .     .     •     .     •     .     0.75   10:     m = z;  //bug
.     .     .     •     .     .     .     .     .     .     0.00   11:   else if (x>z)
.     .     .     .     .     .     .     .     .     .     0.00   12:     m = x;
•     •     •     •     •     •     •     •     •     •     0.50   13: print("Middle number is:", m); }
P     P     P     P     P     P     F     F     F     F            Pass/fail status

Fault Localization for Stopping Point
Sim2 = |A ∩ B| / |A ∪ B|
[Figure: clustering dendrogram (t07, t09 into t07-09; t08, t10 into t08-10; both into t07-09-08-10) with two ranks: one for cluster t07-09-08-10 using tests t01 through t10, and one for cluster t07-09 using tests t01 through t06, t07, t09. Rank list shown: 10, 9, 8, 7.]
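The suspiciousness column of the Tarantula table can be reproduced with a short script. This sketch uses the Tarantula formula from the assigned reading (the slide itself does not spell it out): the fraction of failing tests that execute a statement, divided by the sum of that fraction and the fraction of passing tests that execute it. The coverage sets are reconstructed by tracing mid() on each input and agree with the per-statement counts on the slide; the variable names are assumptions of this sketch.

```python
ALL = {"t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9", "t10"}
PASSED = {"t1", "t2", "t3", "t4", "t5", "t6"}
FAILED = {"t7", "t8", "t9", "t10"}

# Statement number -> tests that execute it (reconstructed by tracing mid()).
COVERAGE = {
    1: ALL, 2: ALL, 3: ALL,
    4: {"t1", "t2", "t5", "t6", "t8", "t10"},
    5: {"t2"},
    6: {"t1", "t5", "t6", "t8", "t10"},
    7: {"t1", "t5", "t8", "t10"},
    8: {"t3", "t4", "t7", "t9"},
    9: {"t3", "t4", "t7", "t9"},
    10: {"t3", "t7", "t9"},
    11: {"t4"},
    12: set(),
    13: ALL,
}


def suspiciousness(covered_by):
    """Tarantula score: fail ratio / (pass ratio + fail ratio)."""
    pass_ratio = len(covered_by & PASSED) / len(PASSED)
    fail_ratio = len(covered_by & FAILED) / len(FAILED)
    if pass_ratio + fail_ratio == 0:
        return 0.0  # statement never executed; treated as not suspicious
    return fail_ratio / (pass_ratio + fail_ratio)


SCORES = {stmt: round(suspiciousness(tests), 2) for stmt, tests in COVERAGE.items()}

# Most suspicious first; ties keep statement order because sorted() is stable.
RANKING = sorted(SCORES, key=lambda stmt: -SCORES[stmt])

for stmt in RANKING:
    print(f"{stmt:2d}  {SCORES[stmt]:.2f}")
```

Statement 10 comes out on top at 0.75, followed by statements 7, 8, and 9 at 0.60, matching the table.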