mutation testing

Mutation Testing (Reid Holmes)



  1. Mutation Testing Reid Holmes

  2. Key questions: Is a test suite sufficiently broad? Sufficiently deep?

  3. Test suite depth: Mutation testing

  4. Program -> Generate Mutants

  5. Program -> Generate Mutants

  6. Program -> Generate Mutants -> Mutant

  7. Program -> Generate Mutants -> Mutant

  8. Program -> Generate Mutants -> Mutant

  9. Program -> Generate Mutants -> Mutant; Mutant + Test Suite -> Execute Suites -> Kill Score

  10. What mutations? Flip boolean; boundaries (<, >=, etc.); remove conditional; increment to decrement

  11. Mutation operators: Conditional Boundary
      < -> <=
      <= -> <
      > -> >=
      >= -> >
      Example: if (a<b) {..} -> if (a<=b) {..}
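A minimal Java sketch of the conditional boundary operator. The names `BoundaryMutantDemo`, `isLess` and `isLessMutant` are hypothetical and not from the slides. The mutant differs from the original only when a == b, so only a test that uses equal inputs can distinguish (kill) it.

```java
final class BoundaryMutantDemo {
    // Original predicate: strictly less than.
    static boolean isLess(int a, int b) {
        return a < b;
    }

    // Mutant produced by the conditional boundary operator: < becomes <=.
    static boolean isLessMutant(int a, int b) {
        return a <= b;
    }

    public static void main(String[] args) {
        // Away from the boundary both versions agree, so this input cannot kill the mutant.
        System.out.println(isLess(1, 2) == isLessMutant(1, 2)); // true
        // Exactly on the boundary they disagree, so a test here kills the mutant.
        System.out.println(isLess(2, 2) == isLessMutant(2, 2)); // false
    }
}
```

A suite that never exercises a == b leaves this mutant alive, which is the kind of gap the kill score is meant to expose.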

  12. Mutation operators: Negate Conditionals
      == -> !=
      != -> ==
      …
      Example: if (a==b) {..} -> if (a!=b) {..}

  13. Mutation operators: Remove Conditionals
      Example: if (a==b) {..} -> if (true) {..}

  14. Mutation operators: Math
      + -> -
      * -> /
      | -> &
      …
      Example: int a = b + c; -> int a = b - c;

  15. Mutation operators: Increments/Decrements
      ++ -> --
      -- -> ++
      Example: i++ -> i--

  16. Mutation operators: Inline Constant
      Example: int i = 0; -> int i = 3;

  17. Mutation operators: Return mutator
      Example: return o; -> return null;

  18. Mutation operators: Skip void calls

      void somethingImportant(){..}

      int foo() {
        int i = 5;
        somethingImportant();
        return i;
      }

      ->

      int foo() {
        int i = 5;
        // somethingImportant();
        return i;
      }
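A small Java sketch of why a skipped void call is hard to kill. The slide does not say what `somethingImportant()` does, so the call counter used here is an assumption made purely for illustration, as are the names `VoidCallMutantDemo`, `fooMutant` and `callCount`.

```java
final class VoidCallMutantDemo {
    // Hypothetical side effect: count how often somethingImportant() ran.
    private int calls = 0;

    void somethingImportant() {
        calls++;
    }

    // Original method from the slide.
    int foo() {
        int i = 5;
        somethingImportant();
        return i;
    }

    // Mutant: the void call is skipped.
    int fooMutant() {
        int i = 5;
        // somethingImportant();
        return i;
    }

    int callCount() {
        return calls;
    }

    public static void main(String[] args) {
        VoidCallMutantDemo original = new VoidCallMutantDemo();
        VoidCallMutantDemo mutant = new VoidCallMutantDemo();
        // A test that checks only the return value cannot tell the two apart.
        System.out.println(original.foo() == mutant.fooMutant()); // true
        // A test that checks the side effect kills the mutant.
        System.out.println(original.callCount() == mutant.callCount()); // false
    }
}
```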

  19. public float avg(float[] data){
        float sum = 0;
        for (float num : data){ sum += num; }
        return sum * data.length;
      }

  20. Test suite: ✔ assertEq(avg([1]), 1);

      public float avg(float[] data){
        float sum = 0;
        for (float num : data){ sum += num; }
        return sum * data.length;
      }

  21. Test suite: ✖ assertEq(avg([1]), 1);

      public float avg(float[] data){
        float sum = 1;
        for (float num : data){ sum += num; }
        return sum * data.length;
      }

  22. Test suite: ✖ assertEq(avg([1]), 1);

      public float avg(float[] data){
        float sum = 0;
        for (float num : data){ sum -= num; }
        return sum * data.length;
      }

  23. Test suite: ✔ assertEq(avg([1]), 1);

      public float avg(float[] data){
        float sum = 0;
        for (float num : data){ sum += num; }
        return sum / data.length;
      }

  24. Test suite: ✔ assertEq(avg([1]), 1); Kill score: 66%
      ✖ sum = 0 -> sum = 1
      ✖ sum += num -> sum -= num
      ✔ sum * length -> sum / length
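The 66% kill score can be reproduced with a short Java sketch. The `avg` body and the three mutants are taken from the slides; the harness around them (`AvgKillScore`, `testPasses`, `mutants`, `killScorePercent`) is hypothetical scaffolding, not a real mutation testing tool. A mutant counts as killed when the single test `avg([1]) == 1` fails on it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class AvgKillScore {
    // Program under test, as on the slides (it multiplies where it should divide).
    static float avg(float[] data) {
        float sum = 0;
        for (float num : data) { sum += num; }
        return sum * data.length;
    }

    // Mutant 1 (inline constant): sum = 0 -> sum = 1.
    static float mutantInitOne(float[] data) {
        float sum = 1;
        for (float num : data) { sum += num; }
        return sum * data.length;
    }

    // Mutant 2 (math): sum += num -> sum -= num.
    static float mutantSubtract(float[] data) {
        float sum = 0;
        for (float num : data) { sum -= num; }
        return sum * data.length;
    }

    // Mutant 3 (math): sum * length -> sum / length.
    static float mutantDivide(float[] data) {
        float sum = 0;
        for (float num : data) { sum += num; }
        return sum / data.length;
    }

    // The only test in the suite: assertEq(avg([1]), 1).
    static boolean testPasses(Function<float[], Float> impl) {
        return impl.apply(new float[] {1f}) == 1f;
    }

    static List<Function<float[], Float>> mutants() {
        List<Function<float[], Float>> list = new ArrayList<>();
        list.add(AvgKillScore::mutantInitOne);
        list.add(AvgKillScore::mutantSubtract);
        list.add(AvgKillScore::mutantDivide);
        return list;
    }

    // Percentage of mutants on which the test fails (integer division truncates).
    static int killScorePercent(List<Function<float[], Float>> mutants) {
        int killed = 0;
        for (Function<float[], Float> m : mutants) {
            if (!testPasses(m)) { killed++; }
        }
        return 100 * killed / mutants.size();
    }

    public static void main(String[] args) {
        System.out.println(testPasses(AvgKillScore::avg));   // true: the test passes on the original
        System.out.println(killScorePercent(mutants()));     // 66: two of three mutants killed
    }
}
```

The surviving mutant is the one that replaces `*` with `/`: on a single-element input the two operators give the same result, so this test cannot distinguish them.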

  25. Test suite: ✔ assertEq(avg([1]), 1); New test: ✔ assertEq(avg([1,1]), 1);
      ✖ sum = 0 -> sum = 1
      ✖ sum += num -> sum -= num
      ✔ sum * length -> sum / length (should have been / not * all along)

  26. Test suite: ✔ assertEq(avg([1]), 1); New test: ✔ assertEq(avg([1,1]), 1);

      public float avg(float[] data){
        float sum = 0;
        for (float num : data){ sum += num; }
        return sum / data.length;
      }
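A short Java sketch of what the second test adds. Both `avg` variants come from the slides; the class and helper names (`AvgSecondTest`, `avgOriginal`, `avgFixed`, `suitePasses`) are hypothetical. With a two-element input, multiplying and dividing by the length no longer give the same result, so the suite now separates the two versions.

```java
import java.util.function.Function;

final class AvgSecondTest {
    // Original program from the slides: multiplies by the length.
    static float avgOriginal(float[] data) {
        float sum = 0;
        for (float num : data) { sum += num; }
        return sum * data.length;
    }

    // Corrected program from the slides: divides by the length.
    static float avgFixed(float[] data) {
        float sum = 0;
        for (float num : data) { sum += num; }
        return sum / data.length;
    }

    // The two-test suite: avg([1]) == 1 and avg([1,1]) == 1.
    static boolean suitePasses(Function<float[], Float> impl) {
        boolean first = impl.apply(new float[] {1f}) == 1f;
        boolean second = impl.apply(new float[] {1f, 1f}) == 1f;
        return first && second;
    }

    public static void main(String[] args) {
        System.out.println(suitePasses(AvgSecondTest::avgOriginal)); // false: avg([1,1]) returns 4
        System.out.println(suitePasses(AvgSecondTest::avgFixed));    // true
    }
}
```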

  27. Test suite: ✔ assertEq(avg([1]), 1); New test: ✔ assertEq(avg([1,1]), 1);
      Given the expected return value of this function, the new test should pass; run against the original program, it instead reveals a fault in the program itself.

  28. Mutation assumptions
      1) Competent Programmer Hypothesis: Most programs are nearly correct.
      2) Coupling Hypothesis: Big bugs are composed of a series of small errors.

  29. Assessing quality of the test suites

  30. Mutation testing: “If the program works … on specified data, then it will always work on any data.” (Hoare)

  31. Past studies: programmatic oracle; synthetic; correctness focus; small programs; few faults; few mutants

  32. Venue                    ISSTA 1996   ICSE 2005   FSE 2014
      KLOC                     1            6           321
      Faults                   12           38          357
      Mutants                  24           1,100       230,000
      Tests                    generated    generated   developer-written & generated
      Coverage controlled      ✖            ✖           ✔
      Examines shortcomings    ✖            ✖           ✔

  33. Do stronger tests detect more mutants?
      Is mutant detection correlated with fault detection?
      Can mutants describe all real faults?

  34. Experimental method: Define Candidates; Compilable Faults; Triggering Tests; Analyze Misses; Generate Suites

  35. Experimental results: Do stronger tests detect more mutants?
      Statement coverage: 40% increased, 60% unchanged
      Mutant detection: 73% increased, 27% unchanged

  36. Experimental results: What kinds of faults are not represented by mutants?
      73% Increased
      10% Weak/missing: if (x) { … return; } ; if (x) { … // del }
      17% No operator: if (cK.length != sD[0].length) ; if (cK.length != getCatCount())

  37. Mutation takeaway: A correlation exists between mutant detection and real fault detection.

  38. Impact on testing
      Adding tests can be more impactful than increasing coverage
      Mutants can serve as effective proxies for real faults
      Mutants can describe many real faults
      Kill score is a better predictor of test quality than coverage
      60% of real faults are already covered
      Stronger coverage criteria offer little additional insight
