Comparing the Effectiveness of Testing Techniques

Comparing the Effectiveness of Testing Techniques: a paper by Elaine J. Weyuker, presented by Matthias Kegele, June 8, 2011. Outline: comparison relations, probabilistic measures, limitations, empirical comparison, conclusions.


  1. Comparing the Effectiveness of Testing Techniques, a paper by Elaine J. Weyuker, presented by Matthias Kegele, June 8, 2011

  2. Agenda
     ◮ Comparison relations
     ◮ Probabilistic relations
     ◮ Limitations of formal analysis
     ◮ Empirical comparison of criteria
     ◮ Conclusions

  3. On comparing effectiveness of testing techniques
     ◮ ultimate goal: higher dependability of SUT
     ◮ definition of effective
     ◮ finding faults: quality over quantity?
     ◮ real life versus in vitro

  4. Comparison relations
     ◮ subsumption relation
     ◮ power relation
     ◮ BETTER relation

  5. Subsumption relation (1)
     ◮ natural way
     ◮ program P, test set TS, test criteria $C_1$ and $C_2$
     ◮ $C_1$ subsumes $C_2$
     ◮ $\forall P,\, TS:\ \mathit{satisfies}(TS, C_1) \Rightarrow \mathit{satisfies}(TS, C_2)$

  6. Subsumption relation (2)
     ◮ incomparability of testing criteria
     ◮ can be misleading
     ◮ wide spectrum of test suites that satisfy a given criterion
     ◮ little or no guidance on how to choose a test suite
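The subsumption definition on the two slides above can be checked by brute force on a toy input domain. The following is a minimal sketch, assuming a fixed program whose inputs are the integers 0 to 3 and two made-up criteria expressed as predicates on a test set; the names `subsumes`, `needs_even_and_odd`, and `needs_even` are hypothetical and not from the paper.

```python
from itertools import combinations

def all_test_sets(domain):
    """Yield every non-empty subset of a finite input domain as a frozenset."""
    items = sorted(domain)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def subsumes(c1, c2, domain):
    """True if every test set that satisfies c1 also satisfies c2."""
    return all(c2(ts) for ts in all_test_sets(domain) if c1(ts))

# Hypothetical toy criteria (assumptions for illustration only).
def needs_even_and_odd(ts):
    return any(x % 2 == 0 for x in ts) and any(x % 2 == 1 for x in ts)

def needs_even(ts):
    return any(x % 2 == 0 for x in ts)

DOMAIN = {0, 1, 2, 3}
print(subsumes(needs_even_and_odd, needs_even, DOMAIN))  # True
print(subsumes(needs_even, needs_even_and_odd, DOMAIN))  # False
```

The check says nothing about which satisfying test set a tester actually picks, which is the weakness the slide points out.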

  7. Power relation (1)
     ◮ detection of failure
     ◮ $C_1$ is at least as powerful as $C_2$
     ◮ $\mathit{detects}(C_2, \mathit{failure}) \Rightarrow \mathit{detects}(C_1, \mathit{failure})$

  8. Power relation (2)
     ◮ incomparability remains a problem
     ◮ even when $C_1$ is at least as powerful as $C_2$:
        ◮ $C_2$ may expose some failures more often than $C_1$
        ◮ there may be failures that neither $C_1$ nor $C_2$ finds

  9. BETTER relation (1)
     ◮ a test case tc is required by criterion C to test P if every test set that satisfies C contains it: $\mathit{requires}(C, tc) \Leftrightarrow \forall ts.\ \mathit{satisfies}(ts, C) \Rightarrow tc \in ts$
     ◮ $C_1$ is BETTER than $C_2$
     ◮ $\forall tc.\ \mathit{requires}(C_2, tc) \Rightarrow \mathit{requires}(C_1, tc)$
     ◮ relevant test sets (monotonic)

  10. BETTER relation (2)
     ◮ incomparability remains a problem
     ◮ very few criteria require a specific test case
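Read this way, the test cases a criterion requires are exactly the inputs common to every satisfying test set. A minimal sketch under that reading, using a hypothetical four-element domain and made-up criteria (none of the names come from the paper):

```python
from itertools import combinations

def all_test_sets(domain):
    """Yield every non-empty subset of a finite input domain as a frozenset."""
    items = sorted(domain)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def required_test_cases(criterion, domain):
    """Inputs that appear in every test set satisfying the criterion."""
    satisfying = [ts for ts in all_test_sets(domain) if criterion(ts)]
    if not satisfying:
        # No satisfying test set: every input is vacuously required.
        return frozenset(domain)
    return frozenset.intersection(*satisfying)

def is_better(c1, c2, domain):
    """True if every test case required by c2 is also required by c1."""
    return required_test_cases(c2, domain) <= required_test_cases(c1, domain)

# Hypothetical criteria (assumptions for illustration only).
def needs_0_and_3(ts):
    return 0 in ts and 3 in ts

def needs_0(ts):
    return 0 in ts

def needs_even(ts):
    return any(x % 2 == 0 for x in ts)

DOMAIN = {0, 1, 2, 3}
print(sorted(required_test_cases(needs_0_and_3, DOMAIN)))  # [0, 3]
print(sorted(required_test_cases(needs_even, DOMAIN)))     # []
print(is_better(needs_0_and_3, needs_0, DOMAIN))           # True
print(is_better(needs_0, needs_0_and_3, DOMAIN))           # False
```

The `needs_even` criterion requires no specific test case at all, so the relation holds against it only vacuously; this matches the slide's remark that few criteria require a specific test case.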

  11. Probabilistic measures
     ◮ covers relation
     ◮ properly covers relation
     ◮ expected number of failures detected

  12. Input domains (1)
     ◮ program P with domain D (possible inputs)
     ◮ partition into subsets: subdomains $D_i$
     ◮ test set: one test case for each subdomain

  13. Input domains (2)
     ◮ program P has domain $D = \{0, 1, 2, 3, 4\}$
     ◮ failure-causing input: 0
     ◮ $C_1$ requires one test case from subdomain $\{0, 1, 2\}$ and one from $\{3, 4\}$
     ◮ $C_2$ requires one test case from subdomain $\{0, 1, 2\}$ and one from $\{0, 3, 4\}$
     ◮ test sets that satisfy $C_1$: (0,3), (0,4), (1,3), (1,4), (2,3), (2,4)
     ◮ test sets that satisfy $C_2$: (0,0), (0,3), (0,4), (1,0), (1,3), (1,4), (2,0), (2,3), (2,4)
     ◮ P(find failure with $C_1$) = $\frac{1}{3}$
     ◮ P(find failure with $C_2$) = $\frac{5}{9}$
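The two probabilities in this example can be reproduced by enumerating the test sets. The sketch below assumes that one test case is drawn uniformly and independently from each subdomain; the function name `detection_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def detection_probability(subdomains, failing_inputs):
    """Fraction of test sets (one input per subdomain) that hit a failing input."""
    test_sets = list(product(*subdomains))
    hits = sum(1 for ts in test_sets if any(tc in failing_inputs for tc in ts))
    return Fraction(hits, len(test_sets))

# Subdomains and failing input taken from the slide example.
C1 = [(0, 1, 2), (3, 4)]
C2 = [(0, 1, 2), (0, 3, 4)]
FAILING = {0}

print(detection_probability(C1, FAILING))  # 1/3
print(detection_probability(C2, FAILING))  # 5/9
```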

  14. Covers relation (1)
     ◮ $C_1$ covers $C_2$
     ◮ $SD_C(P, S)$: subdomains used for generating test sets satisfying criterion C
     ◮ $\forall D \in SD_{C_2}(P, S)$ there exist $D_1, \ldots, D_n \in SD_{C_1}(P, S)$ such that $D_1 \cup \ldots \cup D_n = D$
     ◮ universally covers: $\forall (P, S)$: $C_1$ covers $C_2$
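For finite subdomains the covers condition can be tested directly: a subdomain D is a union of some $C_1$ subdomains exactly when the union of all $C_1$ subdomains contained in D equals D. A minimal sketch under that observation; the function name and the second pair of subdomain collections are assumptions made for illustration.

```python
def covers(sd_c1, sd_c2):
    """True if every subdomain in sd_c2 is a union of subdomains from sd_c1."""
    sd_c1 = [frozenset(d) for d in sd_c1]
    for target in map(frozenset, sd_c2):
        # Only C1 subdomains lying inside the target can take part in the union.
        parts = [d for d in sd_c1 if d <= target]
        if frozenset().union(*parts) != target:
            return False
    return True

# Subdomains from the earlier slide example: neither criterion covers the other.
SLIDE_C1 = [{0, 1, 2}, {3, 4}]
SLIDE_C2 = [{0, 1, 2}, {0, 3, 4}]
print(covers(SLIDE_C1, SLIDE_C2))  # False
print(covers(SLIDE_C2, SLIDE_C1))  # False

# Hypothetical collections where the relation holds.
SD_A = [{0, 1}, {0, 2}]
SD_B = [{0, 1}, {0, 2}, {0, 1, 2}]
print(covers(SD_A, SD_B))  # True
```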

  15. Covers relation (2)
     ◮ $M = P(\text{test set exposes at least one fault})$
     ◮ specification S; subdomain $D_i$ has size $d_i$ and contains $m_i$ failure-causing inputs (test cases)
     ◮ $M(C, P, S) = 1 - \prod_{i=1}^{n} \left(1 - \frac{m_i}{d_i}\right)$

  16. Covers relation (3)
     ◮ deficit: covering does not guarantee a higher detection probability
     ◮ $\exists\, C_1, C_2, P, S$: $C_1$ covers $C_2$, yet $M(C_1, P, S) < M(C_2, P, S)$
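The measure M and the shortcoming of the covers relation can be made concrete with a small calculation. The sketch below implements $M = 1 - \prod_i (1 - m_i/d_i)$ and applies it to a hypothetical pair of subdomain collections, chosen for illustration and not taken from the slides, in which $C_1$ covers $C_2$ but has the lower value of M.

```python
from fractions import Fraction

def m_measure(subdomains, failing_inputs):
    """Probability that one uniform draw per subdomain exposes a failure."""
    miss = Fraction(1)
    for d in subdomains:
        m_i = len(set(d) & failing_inputs)   # failure-causing inputs in D_i
        miss *= 1 - Fraction(m_i, len(d))    # chance this draw misses them
    return 1 - miss

# Hypothetical example: {0,1,2} = {0,1} ∪ {0,2}, so C1 covers C2.
SD_C1 = [{0, 1}, {0, 2}]
SD_C2 = [{0, 1}, {0, 2}, {0, 1, 2}]
FAILING = {1, 2}

print(m_measure(SD_C1, FAILING))  # 3/4
print(m_measure(SD_C2, FAILING))  # 11/12
```

Here $C_2$ draws one additional test case from $\{0, 1, 2\}$, which is why it scores higher even though it is the covered criterion.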

  17. Properly covers relation
     ◮ $C_1$ properly covers $C_2$ $\Rightarrow$ $M(C_1, P, S) \geq M(C_2, P, S)$
     ◮ each subdomain of $C_2$ is covered by a union of $C_1$ subdomains
     ◮ no $C_1$ subdomain occurs more often in the covering than it does in $SD_{C_1}$ (cf. example)
     ◮ properly universally covers: $\forall (P, S)$: $C_1$ properly covers $C_2$
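One way to read the multiplicity condition is to treat $SD_{C_1}$ as a multiset and ask whether all $C_2$ subdomains can be covered while using each entry of $SD_{C_1}$ at most once overall. The brute-force sketch below follows that reading, which is an interpretation of the slide and not a quotation of the paper's definition; function names and example data are hypothetical, and the search is only practical for very small inputs.

```python
from itertools import combinations, product

def coverings(target, sd_c1):
    """Index tuples of sd_c1 entries whose union equals target."""
    idx = [i for i, d in enumerate(sd_c1) if d <= target]
    found = []
    for size in range(1, len(idx) + 1):
        for combo in combinations(idx, size):
            if frozenset().union(*(sd_c1[i] for i in combo)) == target:
                found.append(combo)
    return found

def properly_covers(sd_c1, sd_c2):
    """True if sd_c2 can be covered using each sd_c1 entry at most once."""
    sd_c1 = [frozenset(d) for d in sd_c1]  # list position = one multiset entry
    options = [coverings(frozenset(t), sd_c1) for t in sd_c2]
    for choice in product(*options):
        used = [i for combo in choice for i in combo]
        if len(used) == len(set(used)):    # no entry reused across coverings
            return True
    return False

SD_C1 = [{0, 1}, {0, 2}]
SD_C2 = [{0, 1}, {0, 2}, {0, 1, 2}]
print(properly_covers(SD_C1, SD_C2))          # False
print(properly_covers(SD_C1 + SD_C1, SD_C2))  # True
```

In the first call each $C_1$ subdomain would be needed twice, once for itself and once for $\{0, 1, 2\}$; duplicating the entries removes the conflict.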

  18. Expected number of failures detected
     ◮ $E(C, P, S) = \sum_{i=1}^{n} \frac{m_i}{d_i}$
     ◮ $C_1$ properly covers $C_2$ $\Rightarrow$ $E(C_1, P, S) \geq E(C_2, P, S)$
     ◮ ability to rank criteria (next slide)
     ◮ properly covers $\neq$ will find more faults $\Rightarrow$ just more likely
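The expected-failures measure is a plain sum of per-subdomain failure rates. A minimal sketch, applied to the subdomains of the earlier input-domain example with failing input 0; the function name is hypothetical.

```python
from fractions import Fraction

def expected_failures(subdomains, failing_inputs):
    """E = sum of m_i / d_i over all subdomains (one uniform draw each)."""
    total = Fraction(0)
    for d in subdomains:
        m_i = len(set(d) & failing_inputs)  # failure-causing inputs in D_i
        total += Fraction(m_i, len(d))
    return total

C1 = [{0, 1, 2}, {3, 4}]
C2 = [{0, 1, 2}, {0, 3, 4}]
FAILING = {0}

print(expected_failures(C1, FAILING))  # 1/3
print(expected_failures(C2, FAILING))  # 2/3
```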

  19. [Figure slide: ranking of criteria announced on the previous slide; the figure is not preserved in this transcript]

  20. Limitations of formal analysis
     ◮ relations compare idealized versions of testing strategies, not practical ones
     ◮ missing risk analysis: high-consequence faults, trivial faults
     ◮ no provision for human variability: experience, expertise, acquired intuition
     ◮ cost of testing: is it beneficial to use certain criteria?

  21. Empirical comparison of criteria (1)
     ◮ formal scientific experiment:
        ◮ synthetic faults seeded into a program
        ◮ small programs
        ◮ not representative
     ◮ case study:
        ◮ large industrial software system
        ◮ containing real faults
        ◮ need for modelling the system (expensive)
        ◮ may not be representative of a wider class of programs

  22. Empirical comparison of criteria (2)
     ◮ repeating similar case studies $\Rightarrow$ body of knowledge
     ◮ different sorts of systems, development environments, test personnel, languages
     ◮ severity of faults in case studies?

  23. Conclusions (1)
     ◮ formal comparison relations
        ◮ intuitive but profound problems
        ◮ increase of dependability?
        ◮ $C_1$ is better than $C_2$, but is $C_1$ good?

  24. Conclusions (2)
     ◮ probabilistic measures
        ◮ more appropriate, still serious flaws
        ◮ absolute instead of relative comparison possible?
     ◮ empirical studies
        ◮ how it works in practice
        ◮ test ultimate goal of achieving higher dependability

  25. Thank you for your attention! Questions?
