
Regression Testing vs. Development Testing - PowerPoint PPT Presentation



  1. Regression Testing vs. Development Testing

     Development Testing
     • Developed first version of software
     • Adequately tested the first version
     • Modified the software; version 2 now needs to be tested
     • How to test version 2?
     • Approaches
       – Retest entire software from scratch
       – Only test the changed parts, ignoring unchanged parts since they have already been tested
         • Could modifications have adversely affected unchanged parts of the software?

     Regression Testing
     • During regression testing, an established test set may be available for reuse
     • Approaches
       – Retest all
       – Selective retest (selective regression testing) ← Main focus of research

     Selective Retesting
     (Figure: the test suite T is split into tests to rerun and tests not to rerun)
     • Tests to rerun
       – Select those tests that will produce different output when run on P'
         • Modification-revealing test cases
       – It is impossible to always find the set of modification-revealing test cases
         (we cannot predict when P' will halt for a test)
       – Select modification-traversing test cases instead: a test qualifies if it executes
         a new or modified statement in P', or misses a statement in P' that it executed in P
         (see the sketch below)
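
A minimal sketch of how modification-traversing selection might be implemented, assuming per-test statement-coverage data for P is already available; the function and variable names are illustrative, not taken from the slides or from any particular tool.

```python
# Illustrative sketch: select modification-traversing tests from coverage data.
# Assumes we already know, for each test, the statements it executed in the
# original program P, plus the statements that are new/modified in P' and the
# statements deleted from P. All names here are hypothetical.

def select_modification_traversing(coverage_in_p, changed_in_p_prime, deleted_from_p):
    """coverage_in_p: dict mapping test id -> set of statement ids executed in P.
    changed_in_p_prime: statement ids that are new in or modified for P'.
    deleted_from_p: statement ids that existed in P but are absent from P'.
    Returns the ids of tests to rerun."""
    selected = set()
    for test, executed in coverage_in_p.items():
        # A test is modification-traversing if it reaches changed code in P'
        # (approximated here by its coverage of the corresponding sites in P)
        # or if it executed a statement that no longer exists in P'.
        if executed & changed_in_p_prime or executed & deleted_from_p:
            selected.add(test)
    return selected

if __name__ == "__main__":
    coverage = {"t1": {1, 2, 3}, "t2": {1, 4}, "t3": {5, 6}}
    print(select_modification_traversing(coverage, changed_in_p_prime={4}, deleted_from_p={6}))
    # -> {'t2', 't3'}
```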

  2. (Figure-only slide)

  3. Cost of Regression Testing
     (Figure: retest all has Cost = C_y; analysis plus selective retest has Cost = C_x;
     in the example the selective approach picks T' = {t2, t3})
     • We want C_x < C_y
     • Key is the test selection algorithm/technique
     • We want to maintain the same "quality of testing"

     Selective-retest Approaches
     • Safe approaches
       – Select every test that may cause the modified program to produce different output
         than the original program
         • E.g., every test that, when executed on P, executed at least one statement that
           has been deleted from P, or at least one statement that is new in or modified for P'
     • Coverage-based approaches
       – Rerun tests that could produce different output than the original program; use some
         coverage criterion as a guide

     Selective-retest Approaches (continued)
     • Data-flow coverage-based approaches
       – Select tests that exercise data interactions that have been affected by modifications
         (see the sketch below)
         • E.g., select every test in T that, when executed on P, executed at least one def-use
           pair that has been deleted from P', or at least one def-use pair that has been
           modified for P'
     • Minimization approaches
       – Select a minimal set of tests that must be run to meet some structural coverage
         criterion
         • E.g., require that every program statement added to or modified for P' be executed
           (if possible) by at least one of the selected tests
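
The def-use criterion above can be made concrete with a small sketch: assuming each test's execution of P has been profiled down to the def-use pairs it exercises, select every test that touches a pair deleted from or modified for P'. The pair representation (statement ids plus a variable name) is an assumption made for illustration.

```python
from typing import Dict, Set, Tuple

# A def-use pair is represented here as (def_stmt, use_stmt, variable);
# this representation is an illustrative assumption, not the paper's.
DefUse = Tuple[int, int, str]

def select_by_def_use(du_coverage: Dict[str, Set[DefUse]],
                      deleted_or_modified: Set[DefUse]) -> Set[str]:
    """Select every test in T that, when run on P, exercised at least one
    def-use pair that was deleted from or modified for P'."""
    return {test for test, pairs in du_coverage.items()
            if pairs & deleted_or_modified}

if __name__ == "__main__":
    coverage = {
        "t1": {(3, 7, "x"), (4, 9, "y")},
        "t2": {(4, 9, "y")},
        "t3": {(12, 15, "z")},
    }
    print(select_by_def_use(coverage, {(3, 7, "x"), (12, 15, "z")}))  # {'t1', 't3'}
```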

  4. Selective-retest Approaches (continued)
     • Ad-hoc/random approaches
       – Time constraints
       – No test selection tool available
       – E.g., randomly select n test cases from T (see the sketch below)

     Factors to Consider
     • Testing costs
     • Fault-detection ability
     • Test suite size vs. fault-detection ability
     • Specific situations where one technique is superior to another

     Open Questions
     • How do techniques differ in terms of their ability to
       – reduce regression testing costs?
       – detect faults?
     • What tradeoffs exist between test suite size reduction and fault-detection ability?
     • When is one technique more cost-effective than another?
     • How do factors such as program design, location and type of modifications, and test
       suite design affect the efficiency and effectiveness of test selection techniques?

     Experiment
     • Hypotheses
       – Non-random techniques are more effective than random techniques but are much more
         expensive
       – The composition of the original test suite greatly affects the costs and benefits of
         test selection techniques
       – Safe techniques are more effective and more expensive than minimization techniques
       – Data-flow coverage-based techniques are as effective as safe techniques, but can be
         more expensive
       – Data-flow coverage-based techniques are more effective than minimization techniques
         but are more expensive
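
As an illustration of the ad-hoc/random approach, here is a minimal Random(n) selector that keeps n% of the suite; the percentage parameter, the seeding, and the test names are my own illustrative choices.

```python
import random

def random_n(test_suite, percent, seed=None):
    """Illustrative Random(n): randomly keep `percent`% of the tests in T."""
    rng = random.Random(seed)
    k = max(1, round(len(test_suite) * percent / 100))
    return rng.sample(list(test_suite), k)

if __name__ == "__main__":
    suite = [f"t{i}" for i in range(1, 21)]   # 20 hypothetical tests
    print(random_n(suite, 25, seed=0))        # 5 tests chosen at random
```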

  5. Measure
     • Costs and benefits of several test selection algorithms
     • Developed two models
       – One calculates the cost of using the technique w.r.t. the retest-all technique
       – One calculates the fault-detection effectiveness of the resulting test suite

     Modeling Cost
     • Did not have implementations of all techniques
       – Had to simulate them
     • Experiment was run on several machines (185,000 test cases)
       – Results not comparable
     • Simplifying assumptions
       – All test cases have uniform costs
       – All sub-costs can be expressed in equivalent units
         • Human effort, equipment cost

     Modeling Cost (continued)
     • Cost of regression test selection
       – Cost = A + E(T')
       – where A is the cost of analysis
       – and E(T') is the cost of executing and validating the tests in T'
       – Note that E(T) is the cost of executing and validating all tests, i.e., the
         retest-all approach
       – Relative cost of executing and validating = |T'| / |T|

     Modeling Fault-detection
     • Per-test basis
       – Given a program P and its modified version P'
       – Identify those tests that are in T and reveal a fault in P', but that are not in T'
       – Normalize the above quantity by the number of fault-revealing tests in T
     • Problem
       – Multiple tests may reveal a given fault
       – This penalizes selection techniques that discard such tests, even though discarding
         them does not reduce fault-detection effectiveness
     (See the sketch below.)
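
To make the two models concrete, here is a small sketch computing the relative execution cost |T'|/|T| and the per-test fault-detection measure described above (fault-revealing tests of T that are missing from T', normalized by the number of fault-revealing tests in T); the function names and example data are illustrative only.

```python
def relative_cost(selected, original):
    """Relative cost of executing and validating: |T'| / |T| (ignores the analysis cost A)."""
    return len(selected) / len(original)

def per_test_fd_loss(original, selected, fault_revealing):
    """Fraction of fault-revealing tests in T that were discarded from T'."""
    revealing_in_t = set(original) & set(fault_revealing)
    missed = revealing_in_t - set(selected)
    return len(missed) / len(revealing_in_t) if revealing_in_t else 0.0

if __name__ == "__main__":
    T = ["t1", "t2", "t3", "t4"]
    T_prime = ["t2", "t3"]
    revealing = ["t1", "t2"]                 # both reveal the same fault in P'
    print(relative_cost(T_prime, T))         # 0.5
    print(per_test_fd_loss(T, T_prime, revealing))  # 0.5: t1 was discarded even though
    # t2 still reveals the fault, illustrating the penalization problem on the slide
```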

  6. Modeling Fault-detection (continued)
     • Per-test-suite basis
       – Three options
         • The test suite is inadequate: no test in T is fault-revealing, and thus no test in
           T' is fault-revealing
         • Same fault-detection ability: some test in both T and T' is fault-revealing
         • Test selection compromises fault detection: some test in T is fault-revealing, but
           no test in T' is fault-revealing
       – Measure: 100 - (percentage of cases in which T' contains no fault-revealing tests)

     Experimental Design
     • 6 C programs
     • Test suites for the programs
     • Several modified versions

     Test Suites and Versions
     • Given a test pool for each program
       – Black-box test cases
         • Category-partition method
       – Additional white-box test cases
         • Created by hand
       – Each (executable) statement, edge, and def-use pair in the base program was exercised
         by at least 30 test cases
     • Nature of modifications
       – Most cases: a single modification
       – Some cases: 2-5 modifications

     Versions and Test Suites
     • Two sets of test suites for each program
       – Edge-coverage based
         • 1000 edge-coverage-adequate test suites
         • To obtain test suite T for program P (from its test pool): for each edge in P's
           CFG, choose (randomly) from those tests of the pool that exercise the edge, with
           no repeats (see the sketch below)
       – Non-coverage based
         • 1000 non-coverage-based test suites
         • To obtain the k-th non-coverage-based test suite for program P: determine n, the
           size of the k-th coverage-based test suite, then choose tests randomly from the
           test pool for P and add them to T until T contains n test cases
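
A minimal sketch of the edge-coverage-adequate suite construction described above, assuming we already have, for each test in the pool, the set of CFG edges it exercises; the data structures are hypothetical stand-ins for the study's test pools.

```python
import random

def build_edge_coverage_suite(pool_edges, cfg_edges, rng=random):
    """pool_edges: dict mapping test id -> set of CFG edges it exercises.
    cfg_edges: the edges of the program's CFG.
    For each edge, randomly pick one pool test that exercises it; a test
    already chosen is not added a second time."""
    suite = []
    chosen = set()
    for edge in cfg_edges:
        candidates = [t for t, edges in pool_edges.items() if edge in edges]
        if not candidates:
            continue  # edge not exercised by any test in the pool
        pick = rng.choice(candidates)
        if pick not in chosen:
            chosen.add(pick)
            suite.append(pick)
        # if pick was already chosen, the edge is covered by a test in the suite
    return suite

if __name__ == "__main__":
    pool = {"t1": {"a", "b"}, "t2": {"b", "c"}, "t3": {"c"}, "t4": {"a", "c"}}
    print(build_edge_coverage_suite(pool, ["a", "b", "c"], rng=random.Random(1)))
```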

  7. Another look at the subjects
     • For each program
       – 1000 edge-coverage-based test suites
       – 1000 non-coverage-based test suites

     Test Selection Tools
     • Minimization technique
       – Select a minimal set of tests that cover modified edges (see the sketch below)
     • Safe technique
       – DejaVu (we discussed the details earlier in this lecture)
     • Data-flow coverage-based technique
       – Select tests that cover modified def-use pairs
     • Random technique
       – Random(n) randomly selects n% of the test cases
     • Retest-all

     Variables
     • The subject program
       – 6 programs, each with a variety of modifications
     • The test selection technique
       – Safe, data-flow, minimization, random(25), random(50), random(75), retest-all
     • Test suite composition
       – Edge-coverage adequate
       – Random

     Measured Quantities
     • Each run
       – Program P
       – Version P'
       – Selection technique M
       – Test suite T
     • Measured
       – The ratio of tests in the selected test suite T' to the tests in the original test
         suite T
       – Whether one or more tests in T' reveals the fault in P'
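
The slides do not spell out how the minimization technique picks its tests, so the following is only a plausible greedy sketch: repeatedly take the test that covers the most still-uncovered modified edges until every coverable modified edge is covered. Test names and coverage data are invented for illustration.

```python
def minimize_for_modified_edges(edge_coverage, modified_edges):
    """edge_coverage: dict mapping test id -> set of CFG edges it covers.
    modified_edges: edges added to or modified for P'.
    Greedy set cover: a small (not necessarily minimal) suite covering every
    modified edge that at least one test can reach."""
    uncovered = set(modified_edges) & set().union(*edge_coverage.values())
    selected = []
    while uncovered:
        best = max(edge_coverage, key=lambda t: len(edge_coverage[t] & uncovered))
        gained = edge_coverage[best] & uncovered
        if not gained:
            break
        selected.append(best)
        uncovered -= gained
    return selected

if __name__ == "__main__":
    cov = {"t1": {"e1", "e2"}, "t2": {"e2"}, "t3": {"e3", "e4"}}
    print(minimize_for_modified_edges(cov, {"e2", "e3"}))   # ['t1', 't3']
```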

  8. Dependent variables
     • Average reduction in test suite size
     • Fault-detection effectiveness
       – 100 - (percentage of test suites in which T' does not reveal a fault in P')
     (See the sketch below.)

     Number of runs
     • For each subject program, from the test suite universe
       – Selected 100 edge-coverage-adequate test suites
       – And 100 random test suites
     • For each test suite
       – Applied each test selection method
       – Evaluated the fault-detection capability of the resulting test suite

     Fault-detection Effectiveness
     (Figure: box plots; the y-axis is 100 - percentage of test suites in which T' does not
     reveal a fault in P')

     How to read the graphs
     • The entire structure represents a data distribution
     • The box's height spans the central 50% of the data (lower quartile, median, upper
       quartile)
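
To tie the two dependent variables together, here is a small sketch that, given the per-run selection results, computes the average suite-size reduction and the fault-detection effectiveness figure used on these slides (100 minus the percentage of runs whose T' misses the fault); the input record format is my own illustrative choice, not the study's data format.

```python
def dependent_variables(runs):
    """runs: list of dicts, one per (suite, technique) run, each with
    't_size', 't_prime_size', and 't_prime_reveals_fault' fields."""
    avg_reduction = sum(1 - r["t_prime_size"] / r["t_size"] for r in runs) / len(runs)
    missed = sum(1 for r in runs if not r["t_prime_reveals_fault"])
    effectiveness = 100 - 100 * missed / len(runs)
    return avg_reduction, effectiveness

if __name__ == "__main__":
    runs = [
        {"t_size": 20, "t_prime_size": 5,  "t_prime_reveals_fault": True},
        {"t_size": 20, "t_prime_size": 15, "t_prime_reveals_fault": True},
        {"t_size": 10, "t_prime_size": 4,  "t_prime_reveals_fault": False},
    ]
    print(dependent_variables(runs))   # (average reduction ~0.53, effectiveness ~66.7)
```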

  9. How to read the graphs
     (Figure-only slide)

     Fault-detection Effectiveness
     (Figure-only slide)

  10. Conclusions
     • Minimization produces the smallest and the least effective test suites
     • Random selection of slightly larger test suites yielded suites that were equally good
       as far as fault detection is concerned
     • Safe and data-flow techniques showed nearly equivalent average behavior and analysis
       costs
       – Data-flow may be useful for other aspects of regression testing
     • Safe methods found all faults (for which they had fault-revealing tests) while
       selecting, on average, 74% of the test cases

     Conclusions (continued)
     • In certain cases, the safe method could not reduce test suite size at all
       – On average, slightly larger random test suites could be nearly as effective
     • Results were sensitive to
       – the selection methods used
       – the programs
       – the characteristics of the changes
       – the composition of the test suites
