automated unit test generation during software development

AUTOMATED UNIT TEST GENERATION DURING SOFTWARE DEVELOPMENT: A Controlled Experiment and Think-aloud Observations. ISSTA 2015. José Miguel Rojas, j.rojas@sheffield.ac.uk. Joint work with Gordon Fraser and Andrea Arcuri.


  1. AUTOMATED UNIT TEST GENERATION DURING SOFTWARE DEVELOPMENT: A Controlled Experiment and Think-aloud Observations. ISSTA 2015. José Miguel Rojas, j.rojas@sheffield.ac.uk. Joint work with Gordon Fraser and Andrea Arcuri.

  2. “Testing is a widespread validation approach in industry, but it is still largely ad hoc, expensive, and unpredictably effective.” A. Bertolino, “Software Testing Research: Achievements, Challenges, Dreams,” Future of Software Engineering, IEEE, 2007.

  3. “Testing is a widespread validation approach in industry, but it is still largely ad hoc, expensive, and unpredictably effective.” A. Bertolino, “Software Testing Research: Achievements, Challenges, Dreams,” Future of Software Engineering, IEEE, 2007. “Test case generation has a strong impact on the effectiveness and efficiency of testing.” “…one of the most active research topics in software testing for several decades, resulting in many different approaches and tools.” S. Anand, E. K. Burke, T. Y. Chen, J. Clark, M. B. Cohen, W. Grieskamp, M. Harman, M. J. Harrold, P. McMinn, “An orchestrated survey of methodologies for automated software test case generation,” J. Systems and Software, Elsevier, 2013.

  4. BACK IN ISSTA 2013… G. Fraser, M. Staats, P. McMinn, A. Arcuri and F. Padberg, “Does automated white-box test generation really help software testers?”

  5. BACK IN ISSTA 2013… ARE UNIT TEST GENERATION TOOLS HELPFUL TO DEVELOPERS WHILE THEY ARE CODING? G. Fraser, M. Staats, P. McMinn, A. Arcuri and F. Padberg, “Does automated white-box test generation really help software testers?”

  6. CODE COVERAGE

  7. CODE COVERAGE; TIME SPENT ON TESTING

  8. CODE COVERAGE; TIME SPENT ON TESTING; IMPLEMENTATION QUALITY

  9. CONTROLLED EXPERIMENT

  10. CONTROLLED EXPERIMENT [Diagram labels: Golden Implementation and Test Suite]

  11. CONTROLLED EXPERIMENT [Diagram labels: Golden Implementation and Test Suite; Class Template]

  12. CONTROLLED EXPERIMENT [Diagram labels: Golden Implementation and Test Suite; Class Template; Implementation and Test Suite]

  14. CONTROLLED EXPERIMENT [Diagram labels: Golden Implementation and Test Suite; Class Template; Implementation and Test Suite; Manual]

  15. CONTROLLED EXPERIMENT [Diagram labels: Golden Implementation and Test Suite; Class Template; Implementation and Test Suite; Manual; 1 hour]

  16. CONTROLLED EXPERIMENT [Diagram labels: Golden Implementation and Test Suite; Class Template; Implementation and Test Suite; Manual; 1 hour; 41]

  18. CONTROLLED EXPERIMENT [Diagram labels: Golden Implementation and Test Suite; Class Template; Implementation and Test Suite; Manual; 1 hour; 41; 2]

  19. CONTROLLED EXPERIMENT [Diagram labels: Golden Implementation and Test Suite; Class Template; Implementation and Test Suite; Manual; 1 hour; 41; 2; 4]

  20. DOES USING EVOSUITE DURING SOFTWARE DEVELOPMENT LEAD TO TEST SUITES WITH HIGHER CODE COVERAGE? RQ 1

  21. CODE COVERAGE participants’ test suites run on their own implementations. [Bar chart: branch coverage (%), Assisted vs. Manual, for FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap. Bar labels as extracted, assignment to bars not recoverable: 83%, 63%, 57%, 50%, 41%, 39%, 38%, 26%.]

  22. CODE COVERAGE Times coverage was checked. [Bar chart: number of times coverage was checked, Assisted vs. Manual, for FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap. Bar labels as extracted, assignment to bars not recoverable: 9.6, 9, 6.4, 6, 5.9, 5.3, 4, 1.9.]

  23. CODE COVERAGE participants’ test suites run on their own implementations (ListPopulation). [Line chart: branch coverage (%) against time (0 to 60 min) for Manual, Assisted, and EvoSuite.]

  27. CODE COVERAGE participants’ test suites run on golden implementations. [Bar chart: branch coverage (%), Assisted vs. Manual, for FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap. Bar labels as extracted, assignment to bars not recoverable: 50%, 42%, 41%, 37%, 35%, 30%, 28%, 21%.]

  28. CODE COVERAGE participants’ test suites run on golden implementations, over time. [Four line charts, one each for FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap: branch coverage (%) against time (0 to 60 min) for Assisted, Manual, and EvoSuite-generated.]

  29. CODE COVERAGE participants’ test suites run on golden implementations, over time. [Four line charts, one each for FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap: branch coverage (%) against time (0 to 60 min) for Assisted, Manual, and EvoSuite-generated.] Coverage can be higher when using EvoSuite, depending on how the generated tests are used.

  30. DOES USING EVOSUITE DURING SOFTWARE DEVELOPMENT LEAD TO DEVELOPERS SPENDING MORE OR LESS TIME ON TESTING? RQ 2

  31. TESTING EFFORT Number of test runs. [Bar chart: Assisted vs. Manual, for FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap. Bar labels as extracted, assignment to bars not recoverable: 13.7, 12.9, 11, 8.2, 7.8, 7.2, 5.6, 4.4.]

  32. TESTING EFFORT Minutes spent on testing. [Bar chart: Assisted vs. Manual, for FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap. Bar labels as extracted, assignment to bars not recoverable: 25, 20, 18.5, 15.8, 14.3, 12.6, 9.3, 7.7.]

  33. TESTING EFFORT Minutes spent on testing. [Bar chart: Assisted vs. Manual, for FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap. Bar labels as extracted, assignment to bars not recoverable: 25, 20, 18.5, 15.8, 14.3, 12.6, 9.3, 7.7.] Using EvoSuite reduces the time spent on testing.

  34. DOES USING EVOSUITE DURING SOFTWARE DEVELOPMENT LEAD TO SOFTWARE WITH FEWER BUGS? RQ 3

  35. IMPLEMENTATION QUALITY Golden test suites run on participants’ implementations. [Bar chart: number of failures plus errors, Assisted vs. Manual, for FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap. Bar labels as extracted, assignment to bars not recoverable: 15.6, 14.3, 6.4, 6.3, 6.1, 5.3, 4.3, 4.2.]

  36. IMPLEMENTATION QUALITY Golden test suites run on participants’ implementations. [Bar chart: number of failures plus errors, Assisted vs. Manual, for FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap. Bar labels as extracted, assignment to bars not recoverable: 15.6, 14.3, 6.4, 6.3, 6.1, 5.3, 4.3, 4.2.] Using EvoSuite during development did not lead to better implementations.

  37. DOES SPENDING MORE TIME WITH EVOSUITE AND ITS TESTS LEAD TO BETTER IMPLEMENTATIONS? RQ 4

  38. PRODUCTIVITY [Bar chart: correlation with number of failures plus errors, for three measures (time spent with EvoSuite, number of runs, time spent on tests), across FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap. Value labels as extracted, assignment to bars not recoverable: 0.35, 0.35, 0.32, 0.02, -0.29, -0.03, -0.22, -0.49.]

  39. PRODUCTIVITY [Bar chart: correlation with number of failures plus errors, for three measures (time spent with EvoSuite, number of runs, time spent on tests), across FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap. Value labels as extracted, assignment to bars not recoverable: 0.35, 0.35, 0.32, 0.02, -0.29, -0.03, -0.22, -0.49.] Implementation quality improves the more time developers spend with EvoSuite-generated tests.

  40. Using automated unit test generation does impact developers’ productivity,

  41. Using automated unit test generation does impact developers’ productivity, but…

  42. Using automated unit test generation does impact developers’ productivity, but… how to make the most out of unit test generation tools?

  43. THINK ALOUD OBSERVATIONS K. A. Ericsson and H. A. Simon, Protocol Analysis: Verbal Reports as Data (revised edition). MIT Press, 1993. J. Hughes and S. Parkes, “Trends in the use of verbal protocol analysis in software engineering research,” Behaviour and Information Technology, vol. 22, no. 2, pp. 127–140, 2003.

  44. THINK ALOUD OBSERVATIONS [Diagram labels: Subject] K. A. Ericsson and H. A. Simon, Protocol Analysis: Verbal Reports as Data (revised edition). MIT Press, 1993. J. Hughes and S. Parkes, “Trends in the use of verbal protocol analysis in software engineering research,” Behaviour and Information Technology, vol. 22, no. 2, pp. 127–140, 2003.

  45. THINK ALOUD OBSERVATIONS [Diagram labels: Observer; Subject] K. A. Ericsson and H. A. Simon, Protocol Analysis: Verbal Reports as Data (revised edition). MIT Press, 1993. J. Hughes and S. Parkes, “Trends in the use of verbal protocol analysis in software engineering research,” Behaviour and Information Technology, vol. 22, no. 2, pp. 127–140, 2003.
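
The RQ 4 slides report correlations between effort measures (for example, time spent with EvoSuite) and the number of failures plus errors. The slides do not say which correlation coefficient was used, so the sketch below assumes Pearson's r purely for illustration. The class name `CorrelationSketch`, the method `pearson`, and the sample numbers are hypothetical and are not taken from the study or from EvoSuite.

```java
// Hypothetical helper for illustration; not part of EvoSuite or the study's tooling.
final class CorrelationSketch {

    /** Pearson correlation coefficient of two equally long samples. */
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two samples of equal length >= 2");
        }
        // Sample means.
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;

        // Sums of co-deviations and squared deviations.
        double cov = 0.0, varX = 0.0, varY = 0.0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        if (varX == 0.0 || varY == 0.0) {
            throw new IllegalArgumentException("correlation is undefined for a constant sample");
        }
        return cov / Math.sqrt(varX * varY);
    }

    public static void main(String[] args) {
        // Made-up numbers, NOT data from the experiment.
        double[] minutesWithGeneratedTests = {5, 10, 15, 20, 25};
        double[] failuresPlusErrors = {9, 8, 6, 5, 2};
        double r = pearson(minutesWithGeneratedTests, failuresPlusErrors);
        System.out.println("r = " + r);
    }
}
```

A negative r in this setting means that more time spent goes together with fewer failures plus errors, which is the direction the RQ 4 takeaway describes. A rank-based coefficient could be swapped in if the underlying data were not roughly linear.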
