AUTOMATED UNIT TEST GENERATION DURING SOFTWARE DEVELOPMENT A Controlled Experiment and Think-aloud Observations ISSTA 2015 José Miguel Rojas j.rojas@sheffield.ac.uk Joint work with Gordon Fraser and Andrea Arcuri
“Testing is a widespread validation approach in industry, but it is still largely ad hoc, expensive, and unpredictably effective.” A. Bertolino, “Software Testing Research: Achievements, Challenges, Dreams,” Future of Software Engineering, IEEE, 2007.
“Test case generation has a strong impact on the effectiveness and efficiency of testing.” “…one of the most active research topics in software testing for several decades, resulting in many different approaches and tools.” S. Anand, E. K. Burke, T. Y. Chen, J. Clark, M. B. Cohen, W. Grieskamp, M. Harman, M. J. Harrold, P. McMinn, “An orchestrated survey of methodologies for automated software test case generation,” J. Systems and Software, Elsevier, 2013.
BACK IN ISSTA 2013… “Does automated white-box test generation really help software testers?,” G. Fraser, M. Staats, P. McMinn, A. Arcuri and F. Padberg
ARE UNIT TEST GENERATION TOOLS HELPFUL TO DEVELOPERS WHILE THEY ARE CODING?
CODE COVERAGE
TIME SPENT ON TESTING
IMPLEMENTATION QUALITY
CONTROLLED EXPERIMENT
[Diagram: Golden Implementation and Test Suite; Class Template; participant's Implementation and Test Suite; Manual; 1 hour. Numbers shown on the slide: 41, 2, 4.]
DOES USING EVOSUITE DURING SOFTWARE DEVELOPMENT LEAD TO TEST SUITES WITH HIGHER CODE COVERAGE? RQ 1
CODE COVERAGE: participants’ test suites run on their own implementations
[Bar chart: Branch Coverage (0% to 100%), Assisted vs. Manual, for FilterIterator, FixedOrderComparator, ListPopulation, PredicatedMap. Bar labels as extracted (pairing with classes not recoverable): 83%, 63%, 57%, 50%, 41%, 39%, 38%, 26%.]
CODE COVERAGE: times coverage was checked
[Bar chart: Times Coverage was checked (0 to 10), Assisted vs. Manual, for FilterIterator, FixedOrderComparator, ListPopulation, PredicatedMap. Bar labels as extracted (pairing with classes not recoverable): 9.6, 9, 6.4, 6, 5.9, 5.3, 4, 1.9.]
CODE COVERAGE: participants’ test suites run on their own implementations
[Line chart, ListPopulation: Branch Coverage (0% to 100%) over Time (0 to 60 min) for Manual, Assisted, and EvoSuite.]
CODE COVERAGE: participants’ test suites run on golden implementations
[Bar chart: Branch Coverage (0% to 100%), Assisted vs. Manual, for FilterIterator, FixedOrderComparator, ListPopulation, PredicatedMap. Bar labels as extracted (pairing with classes not recoverable): 50%, 42%, 41%, 37%, 35%, 30%, 28%, 21%.]
CODE COVERAGE: participants’ test suites run on golden implementations, over time
[Line charts, one panel each for FilterIterator, FixedOrderComparator, ListPopulation, PredicatedMap: Branch Coverage (0% to 100%) over Time (0 to 60 min) for Assisted, Manual, and EvoSuite-generated.]
Coverage can be higher when using EvoSuite, depending on how the generated tests are used.
DOES USING EVOSUITE DURING SOFTWARE DEVELOPMENT LEAD TO DEVELOPERS SPENDING MORE OR LESS TIME ON TESTING? RQ 2
TESTING EFFORT: number of test runs
[Bar chart: Assisted vs. Manual, for FilterIterator, FixedOrderComparator, ListPopulation, PredicatedMap. Bar labels as extracted (pairing with classes not recoverable): 13.7, 12.9, 11, 8.2, 7.8, 7.2, 5.6, 4.4.]
TESTING EFFORT: minutes spent on testing
[Bar chart: Assisted vs. Manual, for FilterIterator, FixedOrderComparator, ListPopulation, PredicatedMap. Bar labels as extracted (pairing with classes not recoverable): 25, 20, 18.5, 15.8, 14.3, 12.6, 9.3, 7.7.]
Using EvoSuite reduces the time spent on testing.
DOES USING EVOSUITE DURING SOFTWARE DEVELOPMENT LEAD TO SOFTWARE WITH FEWER BUGS? RQ 3
IMPLEMENTATION QUALITY: golden test suites run on participants’ implementations
[Bar chart: Number of Failures+Errors, Assisted vs. Manual, for FilterIterator, FixedOrderComparator, ListPopulation, PredicatedMap. Bar labels as extracted (pairing with classes not recoverable): 15.6, 14.3, 6.4, 6.3, 6.1, 5.3, 4.3, 4.2.]
Using EvoSuite during development did not lead to better implementations.
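To make the "Failures+Errors" metric concrete, here is a minimal sketch of scoring an implementation against a golden test suite. The study itself ran golden test suites against participants' Java classes. This Python analogue, the `ParticipantStack` class, its seeded defects, and the `check_*` functions are all hypothetical. Following the usual unit-testing convention, a violated assertion counts as a failure and any other exception counts as an error.

```python
class ParticipantStack:
    """Hypothetical participant implementation with two seeded defects."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        # Defect: removes the oldest item instead of the newest.
        return self._items.pop(0)

    def peek(self):
        # Defect: raises IndexError when empty; the assumed spec wants None.
        return self._items[-1]


# Hypothetical golden checks, standing in for a golden test suite.
def check_pop_returns_newest():
    s = ParticipantStack()
    s.push(1)
    s.push(2)
    assert s.pop() == 2  # violated assertion: counted as a failure


def check_peek_on_empty_is_none():
    s = ParticipantStack()
    assert s.peek() is None  # raises IndexError: counted as an error


def check_peek_returns_last_pushed():
    s = ParticipantStack()
    s.push(7)
    assert s.peek() == 7  # passes


def run_golden_suite(checks):
    """Run each check and return (failures, errors)."""
    failures = errors = 0
    for check in checks:
        try:
            check()
        except AssertionError:
            failures += 1  # expectation not met
        except Exception:
            errors += 1    # unexpected exception
    return failures, errors


failures, errors = run_golden_suite([
    check_pop_returns_newest,
    check_peek_on_empty_is_none,
    check_peek_returns_last_pushed,
])
score = failures + errors  # lower means a better implementation
```

Here one check fails, one errors, and one passes, so the hypothetical implementation scores 2.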
DOES SPENDING MORE TIME WITH EVOSUITE AND ITS TESTS LEAD TO BETTER IMPLEMENTATIONS? RQ 4
PRODUCTIVITY
[Bar chart: Correlation with number of failures plus errors (-0.50 to 0.40) for Time spent with EvoSuite, Number of runs, and Time spent on tests, across FilterIterator, FixedOrderComparator, ListPopulation, PredicatedMap. Bar labels as extracted (pairing with classes and series not recoverable): 0.35, 0.35, 0.32, 0.02, -0.03, -0.22, -0.29, -0.49.]
Implementation quality improves the more time developers spend with EvoSuite-generated tests.
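As an illustration of how such a correlation can be read, the sketch below computes a correlation coefficient between time spent with the tool and the number of failures plus errors. The slides do not say which coefficient was used, so Pearson's r is an assumption here, and the sample data are invented for illustration only. A negative value means that more time goes together with fewer failures and errors, that is, with better implementations.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Invented data for five hypothetical participants (not from the study).
minutes_with_tool = [5, 10, 15, 20, 25]
failures_plus_errors = [12, 9, 10, 6, 4]

r = pearson_r(minutes_with_tool, failures_plus_errors)
print(f"r = {r:.2f}")  # negative: more time, fewer failures plus errors
```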
Using automated unit test generation does impact developers’ productivity,
…but how to make the most out of unit test generation tools?
THINK ALOUD OBSERVATIONS
[Diagram: Observer and Subject]
K. A. Ericsson and H. A. Simon, Protocol Analysis: Verbal Reports as Data (revised edition). MIT Press, 1993.
J. Hughes and S. Parkes, “Trends in the use of verbal protocol analysis in software engineering research,” Behaviour and Information Technology, vol. 22, no. 2, pp. 127–140, 2003.