
Comparative Causality: Explaining the Differences Between Executions

William N. Sumner, Xiangyu Zhang
{wsumner,xyzhang}@cs.purdue.edu
ICSE 2013, 22 May 2013

Background: Debugging requires understanding how a program behaves.


  5. Example – Altered Meaning

     | Statement         | Buggy       | Trial     | Correct     |
     |-------------------|-------------|-----------|-------------|
     | 1) x ← input()    | x ← 0       | x ← 1     | x ← 1       |
     | 2) y ← input()    | y ← 2       | y ← 2     | y ← 1       |
     | 3) z ← input()    | z ← 6       | z ← 3     | z ← 3       |
     | 4) if y>1 & z<6:  | if False:   | if True:  | if False:   |
     | 5)     y ← 5      |             | y ← 5     |             |
     | 6) else: y ← y+1  | else: y ← 3 |           | else: y ← 2 |
     | 7) print(y)       | print(3)    | print(5)  | print(2)    |

     The trial replays the correct run with y replaced by its buggy value.
     ● New control flow unlike original runs
     ● Occurs in a large portion of real bugs
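The altered-meaning example can be replayed directly. The sketch below is only an illustration of the slide's seven-line program in Python: the function name `program`, the returned branch label, and the input dictionaries are assumptions made for this example, not part of the authors' tool.

```python
def program(x, y, z):
    """The slide's example; returns (branch taken, value printed)."""
    if y > 1 and z < 6:      # statement 4
        y = 5                # statement 5
        branch = "then"
    else:
        y = y + 1            # statement 6
        branch = "else"
    return branch, y         # statement 7 prints y

buggy_inputs = {"x": 0, "y": 2, "z": 6}
correct_inputs = {"x": 1, "y": 1, "z": 3}

buggy = program(**buggy_inputs)
correct = program(**correct_inputs)

# Trial: replay the correct run, substituting only the buggy value of y.
trial_inputs = {**correct_inputs, "y": buggy_inputs["y"]}
trial = program(**trial_inputs)

print(buggy, trial, correct)
```

Both original runs take the else branch, while the trial takes the then branch and prints a value that neither original run produced. That is the confounding the slide describes.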

  8. Confounding of Explanations
     Behavior not found in original executions:
     ● includes irrelevant information
     ● excludes necessary information
     Solution: Dual Slicing
     ● Identify & extract execution differences relevant to the failure
       – Those that differ across executions
     ● Run trials on the extracted program

  13. Dual Slicing
      ● A slice of two executions at once
        – Includes dependences that differ across executions
        – Skips dependences that are the same

      | First execution | Second execution |
      |-----------------|------------------|
      | 1) x ← 0        | 1) x ← 1         |
      | 2) y ← 1        | 2) y ← 1         |
      | 3) print(x+y)   | 3) print(x+y)    |

      [Figure: dependence graphs over statements 1, 2 and 3 of the two executions]
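A minimal sketch of the idea, under simplifying assumptions that are not from the slides: each statement executes at most once per run, and a run is given as a dictionary from statement id to `(value, dependences)`. The function name `dual_slice` and this trace format are hypothetical. A real implementation would also have to align execution points across the two runs.

```python
def dual_slice(trace_a, trace_b, criterion):
    """Collect statements that differ across two runs and reach `criterion`.

    Each trace maps a statement id to (value, deps): the value produced in
    that run and the ids of the statements it depended on. A statement that
    is absent from a trace did not execute in that run.
    """
    in_slice, worklist = set(), [criterion]
    while worklist:
        stmt = worklist.pop()
        if stmt in in_slice:
            continue
        a, b = trace_a.get(stmt), trace_b.get(stmt)
        if a == b:
            continue  # same value and same dependences: skip it
        in_slice.add(stmt)
        # Follow the dependences observed in either run.
        for entry in (a, b):
            if entry is not None:
                worklist.extend(entry[1])
    return in_slice

# The two runs from the slide: x differs, y does not.
run_a = {1: (0, ()), 2: (1, ()), 3: (1, (1, 2))}
run_b = {1: (1, ()), 2: (1, ()), 3: (2, (1, 2))}
slice_ids = dual_slice(run_a, run_b, criterion=3)
print(sorted(slice_ids))
```

Starting from the print at statement 3, the slice keeps statement 1 (x differs) and drops statement 2 (y is 1 in both runs).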

  17. Dual Slicing

      | Statement         | Buggy       | Correct     |
      |-------------------|-------------|-------------|
      | 1) x ← input()    | x ← 0       | x ← 1       |
      | 2) y ← input()    | y ← 2       | y ← 1       |
      | 3) z ← input()    | z ← 6       | z ← 3       |
      | 4) if y>1 & z<6:  | if False:   | if False:   |
      | 5)     y ← 5      |             |             |
      | 6) else: y ← y+1  | else: y ← 3 | else: y ← 2 |
      | 7) print(y)       | print(3)    | print(2)    |

      ● Identify differences affecting the failure
      Extract:
      2) y ← input()
      6) y ← y+1
      7) print(y)

  20. Example – Extracted Meaning

      | Statement       | Buggy    | Trial    | Correct  |
      |-----------------|----------|----------|----------|
      | 2) y ← input()  | y ← 2    | y ← 2    | y ← 1    |
      | 6) y ← y+1      | y ← 3    | y ← 3    | y ← 2    |
      | 7) print(y)     | print(3) | print(3) | print(2) |

      Trial can now correctly blame y
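To see why extraction matters, the same patch can be tried on the full program and on the extracted one. This is an illustrative sketch only: `full` and `extracted` are hypothetical names for the seven-statement example and for statements 2, 6 and 7.

```python
def full(x, y, z):
    """All seven statements of the example; returns the printed value."""
    if y > 1 and z < 6:
        y = 5
    else:
        y = y + 1
    return y

def extracted(y):
    """Only the extracted statements 2, 6 and 7; returns the printed value."""
    return y + 1

buggy_out = full(x=0, y=2, z=6)
correct_out = full(x=1, y=1, z=3)

# Same patch in both cases: the correct run with y set to the buggy value 2.
trial_full = full(x=1, y=2, z=3)
trial_extracted = extracted(y=2)

print(buggy_out, correct_out, trial_full, trial_extracted)
```

On the full program the patched run prints a value seen in neither original run. On the extracted program it reproduces the buggy output, so the trial can attribute the failure to y.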

  27. Data Confounding

      | Statement         | Buggy       | Trial       | Correct     |
      |-------------------|-------------|-------------|-------------|
      | 1) x ← [0,1,2,3]  | x ← …       |             | x ← …       |
      | 2) y ← input()    | y ← 2       | y ← 2       | y ← 1       |
      | 3) z ← input()    | z ← 3       | z ← 2       | z ← 2       |
      | 4) x[z] ← 5       | x[3] ← 5    | x[2] ← 5    | x[2] ← 5    |
      | 5) print(x[y])    | print(x[2]) | print(x[2]) | print(x[1]) |
      | Output            | 2           | 5           | 1           |

      ● Control flow is not the only source of confounding
      What should we blame here?
      ● Either new control flow or new data flow can cause confounding.
      ● Removing them is crucial.
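The data-flow case can be sketched the same way. In the snippet below, the `writer` bookkeeping is an assumption added for illustration: it records which statement last wrote each array cell, so that the definition reaching the final read is visible.

```python
def program(y, z):
    """The slide's array example; returns (printed value, defining statement)."""
    x = [0, 1, 2, 3]                          # statement 1
    writer = {i: 1 for i in range(len(x))}    # statement that last wrote each cell
    x[z] = 5                                  # statement 4
    writer[z] = 4
    return x[y], writer[y]                    # statement 5 prints x[y]

buggy = program(y=2, z=3)
correct = program(y=1, z=2)

# Trial: the correct run with y patched to its buggy value.
trial = program(y=2, z=2)

print(buggy, trial, correct)
```

In both original runs the printed cell was defined by statement 1. In the trial it is defined by statement 4, a data dependence that exists in neither original execution.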

  29. Execution Omission
      ● A failure is not just incorrect behavior, it is missing correct behavior.
        – Also known as execution omission
        – Cannot be explained by reproducing faulty behavior

  36. Execution Omission

      | Statement        | Buggy      | Trial      | Correct    |
      |------------------|------------|------------|------------|
      | 1) x ← input()   | x ← 0      | x ← 0      | x ← 4      |
      | 2) y ← input()   | y ← 2      | y ← 5      | y ← 5      |
      | 3) if x > 3:     | if False:  | if False:  | if True:   |
      | 4)     print(y)  |            |            | print(5)   |
      | 5) print('*')    | print('*') | print('*') | print('*') |

      What should we blame here?
      ● x alone reproduces the failure!
      ● Does x alone explain the bug?
        – Can you fix the bug by only fixing x?
      We can run a symmetric trial to find out!

  40. Execution Omission

      | Statement        | Buggy      | Trial      | Correct    |
      |------------------|------------|------------|------------|
      | 1) x ← input()   | x ← 0      | x ← 4      | x ← 4      |
      | 2) y ← input()   | y ← 2      | y ← 2      | y ← 5      |
      | 3) if x > 3:     | if False:  | if True:   | if True:   |
      | 4)     print(y)  |            | print(2)   | print(5)   |
      | 5) print('*')    | print('*') | print('*') | print('*') |

      ● Fixing x does not fix the missing behavior!
      ● x alone does not explain the bug.

  45. Execution Omission
      What if we try both x and y?

      | Statement        | Buggy      | Trial      | Correct    |
      |------------------|------------|------------|------------|
      | 1) x ← input()   | x ← 0      | x ← 4      | x ← 4      |
      | 2) y ← input()   | y ← 2      | y ← 5      | y ← 5      |
      | 3) if x > 3:     | if False:  | if True:   | if True:   |
      | 4)     print(y)  |            | print(5)   | print(5)   |
      | 5) print('*')    | print('*') | print('*') | print('*') |

      ● Fixing x and y together can fix the bug
      Symmetric trials at each step explain the missed behavior, too.
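The two directions of a trial can be checked side by side. The sketch below is an assumption-laden illustration, not the authors' algorithm: `explains` is a hypothetical helper that patches a set of inputs from the buggy run into the correct run (does it reproduce the failure?) and from the correct run into the buggy run (does it repair the failure?).

```python
from itertools import combinations

def program(x, y):
    """The slide's omission example; returns the list of printed lines."""
    out = []
    if x > 3:               # statement 3
        out.append(str(y))  # statement 4
    out.append("*")         # statement 5
    return out

buggy_in = {"x": 0, "y": 2}
correct_in = {"x": 4, "y": 5}
buggy_out = program(**buggy_in)
correct_out = program(**correct_in)

def explains(names):
    """Return (reproduces, repairs) for patching the inputs in `names`."""
    to_buggy = {**correct_in, **{n: buggy_in[n] for n in names}}
    to_correct = {**buggy_in, **{n: correct_in[n] for n in names}}
    reproduces = program(**to_buggy) == buggy_out
    repairs = program(**to_correct) == correct_out
    return reproduces, repairs

results = {
    names: explains(names)
    for size in (1, 2)
    for names in combinations(("x", "y"), size)
}
for names, outcome in results.items():
    print(names, outcome)
```

Patching x alone reproduces the failure but does not repair it, because the repaired run prints the stale y. Only x and y together pass both checks, which matches the slides' conclusion.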

  48. Comparative Causality
      Explaining a failure:
      ● Reproducing the failure is not enough.
      ● Requires explaining why both executions differ from each other
      ● Dual slicing ensures
        – That we compare behaviors from the two executions
      ● Symmetric comparison explains
        – Why the buggy execution did something wrong
        – Why the buggy execution didn't do something right

  49. Real Results
      ● Implemented with LLVM
      ● 20 KLOC
      ● Automatically explains bugs in C programs.
        – 20kloc - 400kloc
        – 400kinst – 2.24minst

  55. Real Results

      | Program | Size | CC Trials | CC Time | CC Stmts | CC Root | Old Trials | Old Time | Old Stmts | Old Root | Precision | Recall |
      |---------|------|-----------|---------|----------|---------|------------|----------|-----------|----------|-----------|--------|
      | find    | 73k  | 15        | 12      | 6        | X       | 1260       | 253      | 0         | -        | 0         | 0      |
      | gnuplot | 144k | 33        | 44      | 10       | X       | 469        | 141      | 10        | X        | 1         | 1      |
      | gnuplot | 139k | 323       | 200     | 48       | X       | 208        | 51       | 0         | -        | 0         | 0      |
      | gnuplot | 134k | 337       | 961     | 129      | -       | 1888       | 950      | 121       | -        | 0.97      | 0.91   |
      | gnuplot | 134k | 130       | 140     | 33       | -       | 3012       | 931      | 38        | -        | 0.87      | 1      |
      | grep    | 12k  | 186       | 114     | 62       | -       | 1012       | 8263     | 23        | -        | 0.96      | 0.35   |
      | grep    | 12k  | 327       | 156     | 69       | -       | 1734       | 183      | 32        | -        | 1         | 0.46   |
      | grep    | 12k  | 78        | 49      | 27       | X       | 1546       | 168      | 23        | -        | 0.96      | 0.81   |
      | make    | 30k  | 62        | 342     | 27       | X       | 543        | 416      | 17        | -        | 1         | 0.63   |
      | tar     | 20k  | 8         | 22      | 3        | X       | 221        | 50       | 3         | X        | 1         | 1      |
      | tar     | 24k  | 125       | 124     | 48       | X       | 332        | 110      | 0         | -        | 0         | 0      |
      | tar     | 20k  | 121       | 53      | 20       | X       | 296        | 66       | 0         | -        | 0         | 0      |
      | tar     | 20k  | 28        | 43      | 10       | X       | 2117       | 439      | 5         | -        | 1         | 0.5    |
      | tar     | 21k  | 87        | 80      | 23       | X       | 709        | 165      | 15        | X        | 0.73      | 0.48   |
      | tar     | 21k  | 15        | 22      | 4        | X       | 1283       | 228      | 4         | X        | 1         | 1      |
      | Avg.    |      | 125       | 157     | 34       |         | 1109       | 828      | 20        |          | 0.22      | 0.26   |

      Precise reasoning is more efficient! Averages, CC vs. Old:
      ● Time: 3 min vs. 14 min
      ● Trials: 125 vs. 1100
      ● Statements: 35 vs. 20
