Regression Testing Software is constantly modified Bug fixes - PowerPoint PPT Presentation

Time-Aware Test Suite Prioritization
Kristen R. Walcott, Gregory M. Kapfhammer, Mary Lou Soffa, Robert S. Roos
University of Virginia, Allegheny College
International Symposium on Software Testing and Analysis
Portland, Maine, July 17-20, 2006

Regression Testing
• Software is constantly modified
  • Bug fixes
  • Addition of functionality
• After making changes, test using regression test suite
  • Provides confidence in correct modifications
  • Detects new faults
• High cost of regression testing
  • More modifications → larger test suite
  • May execute for days, weeks, or months
  • Testing costs are very high

Reducing the Cost
• Cost-saving techniques
  • Selection: Use a subset of the test cases
  • Prioritization: Reorder the test cases
• Prioritization methods
  • Initial ordering
  • Reverse ordering
  • Random ordering
  • Based on fault detection ability

Ordering Tests with Fault Detection
• Idea: First run the test cases that will find faults first
• Complications:
  • Different tests may find the same fault
  • Do not know which tests will find faults
• Use coverage to estimate fault finding ability

Prioritization Example
Prioritized Test Suite (with some fault information)

Test                     T2     T1      T4     T5     T6     T3
Faults found             1      7       3      3      3      2
Execution time (min.)    1      9       4      4      4      3
Faults found / minute    1.0    0.778   0.75   0.75   0.75   0.667

• Retesting generally has a time budget
• Is this prioritization best when the time budget is considered?
Contribution: A test prioritization technique that intelligently incorporates a time budget
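The faults-per-minute ordering in the example above can be reproduced in a few lines of Python. This is a minimal sketch: the dictionary layout and the name `order_by_fault_rate` are illustrative assumptions, and the fault counts are available only because the example assumes fault information is known in advance.

```python
# Toy data from the slide: test name -> (faults found, execution time in minutes).
TESTS = {
    "T1": (7, 9),
    "T2": (1, 1),
    "T3": (2, 3),
    "T4": (3, 4),
    "T5": (3, 4),
    "T6": (3, 4),
}


def order_by_fault_rate(tests):
    """Sort test names by faults found per minute, highest rate first.

    sorted() is stable, so tests with equal rates keep their input order.
    """
    return sorted(tests, key=lambda t: tests[t][0] / tests[t][1], reverse=True)


ordering = order_by_fault_rate(TESTS)
rates = [round(TESTS[t][0] / TESTS[t][1], 3) for t in ordering]
print(ordering)
print(rates)
```

The ordering ignores both the time budget and the overlap between the faults that different tests find, which is the gap the contribution statement points at.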

Fault Aware Prioritization

FAULTS / TEST CASE   f1   f2   f3   f4   f5   f6   f7   f8
T1                   X    X    -    X    X    X    X    X
T2                   X    -    -    -    -    -    -    -
T3                   X    -    -    -    X    -    -    -
T4                   -    X    X    -    -    -    X    -
T5                   -    -    -    X    -    X    -    X
T6                   -    X    -    X    -    X    -    -

TESTING GOAL: Find as many faults as soon as possible

Time Budget: 12 minutes
Faults detected by each test:
T1: f1 f2 f4 f5 f6 f7 f8
T2: f1
T3: f1 f5
T4: f2 f3 f7
T5: f4 f6 f8
T6: f2 f4 f6

Fault-based Prioritization

Test                     T1   T4   T5   T6   T3   T2
Faults found             7    3    3    3    2    1
Execution time (min.)    9    4    4    4    3    1

Finds 7 unique faults in 9 minutes

Time Budget: 12 minutes
Faults detected by each test:
T1: f1 f2 f4 f5 f6 f7 f8
T2: f1
T3: f1 f5
T4: f2 f3 f7
T5: f4 f6 f8
T6: f2 f4 f6

Naïve Time-based Prioritization

Test                     T2   T3   T5   T4   T6   T1
Faults found             1    2    3    3    3    7
Execution time (min.)    1    3    4    4    4    9

Finds 8 unique faults in 12 minutes

Time Budget: 12 minutes
Faults detected by each test:
T1: f1 f2 f4 f5 f6 f7 f8
T2: f1
T3: f1 f5
T4: f2 f3 f7
T5: f4 f6 f8
T6: f2 f4 f6

Average-based Prioritization

Test                     T2   T1   T4   T5   T6   T3
Faults found             1    7    3    3    3    2
Execution time (min.)    1    9    4    4    4    3

Finds 7 unique faults in 10 minutes

Time Budget: 12 minutes
Faults detected by each test:
T1: f1 f2 f4 f5 f6 f7 f8
T2: f1
T3: f1 f5
T4: f2 f3 f7
T5: f4 f6 f8
T6: f2 f4 f6

Intelligent Time-Aware Prioritization

Test                     T5   T4   T3   T1   T2   T6
Faults found             3    3    2    7    1    3
Execution time (min.)    4    4    3    9    1    4

Finds 8 unique faults in 11 minutes
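The four results above can be checked mechanically. The sketch below replays each ordering against the 12-minute budget and counts the unique faults found. The function name `run_within_budget` and the stopping rule (execution halts at the first test that would exceed the budget) are assumptions chosen because they reproduce the slide totals. Where the flattened slide text leaves the relative order of the three 4-minute tests unclear, one consistent order is assumed.

```python
# Fault matrix and execution times (minutes) from the slides.
FAULTS = {
    "T1": {"f1", "f2", "f4", "f5", "f6", "f7", "f8"},
    "T2": {"f1"},
    "T3": {"f1", "f5"},
    "T4": {"f2", "f3", "f7"},
    "T5": {"f4", "f6", "f8"},
    "T6": {"f2", "f4", "f6"},
}
MINUTES = {"T1": 9, "T2": 1, "T3": 3, "T4": 4, "T5": 4, "T6": 4}


def run_within_budget(ordering, budget):
    """Run tests in order until the next one would exceed the budget.

    Returns (number of unique faults found, minutes used).
    Assumption: execution stops at the first test that does not fit.
    """
    found, used = set(), 0
    for test in ordering:
        if used + MINUTES[test] > budget:
            break
        used += MINUTES[test]
        found |= FAULTS[test]
    return len(found), used


ORDERINGS = {
    "fault-based": ["T1", "T4", "T5", "T6", "T3", "T2"],
    "naive time-based": ["T2", "T3", "T5", "T4", "T6", "T1"],
    "average-based": ["T2", "T1", "T4", "T5", "T6", "T3"],
    "time-aware": ["T5", "T4", "T3", "T1", "T2", "T6"],
}

results = {name: run_within_budget(order, 12) for name, order in ORDERINGS.items()}
for name, (faults, minutes) in results.items():
    print(f"{name}: {faults} unique faults in {minutes} minutes")
```

Counting unique faults with a set is what separates this from summing per-test fault counts: T2 and T1 together report 8 faults but only 7 distinct ones.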

Time-Aware Prioritization
• Time-aware prioritization (TAP) combines:
  • Fault finding ability (overlapping coverage)
  • Test execution time
• Time constrained test suite prioritization problem: 0/1 knapsack problem
  • Use genetic algorithm heuristic search technique
• Genetic algorithm
  • Fitness ideally calculated based on faults
  • A fault cannot be found if code is not covered
  • Fitness function based on test suite and test case code coverage and execution time

Prioritization Infrastructure
[Figure: genetic algorithm workflow]
• Inputs: Program, Test suite, Number tuples/iteration, Maximum # of iterations, Percent of test suite execution time, Crossover probability, Mutation probability, Addition probability, Deletion probability, Test adequacy criteria, Program coverage weight
• Genetic algorithm stages: Create initial population, Calculate fitnesses, Selection (select best tuples), Crossover (Tuple 1, Tuple 2), Mutation, Addition, Deletion, Add new tuples, Next generation
• Output: Final test tuple

Fitness Function
• Use coverage information to estimate "goodness" of test case
  • Block coverage
  • Method coverage
• Fitness function components
  1. Overall coverage
  2. Cumulative coverage of test tuple
  3. Time required by test tuple
  • If over time budget, receives very low fitness
[Figure: Primary Fitness. Test Suite 1: 70% coverage (Preferred!) vs. Test Suite 2: 40% coverage]
[Figure: Secondary Fitness. Ordering T2: 80%, T1: 40% vs. ordering T1: 40%, T2: 80%]
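A rough sketch of a fitness function with these components is given below. It is not the authors' formula: the weighting scheme, the way cumulative coverage is averaged over prefixes of the tuple, the value `-1.0` for over-budget tuples, and every name (`coverage`, `fitness`, `weight`, the block sets) are assumptions made for illustration.

```python
def coverage(tests, covered, total_blocks):
    """Fraction of all blocks covered by the given tests taken together."""
    blocks = set()
    for test in tests:
        blocks |= covered[test]
    return len(blocks) / total_blocks


def fitness(tuple_, covered, minutes, total_blocks, budget, weight=100.0):
    """Score a test tuple; higher is better.

    primary   = weight * overall coverage of the whole tuple
    secondary = mean coverage over every prefix of the tuple, which
                rewards orderings that reach high coverage early
    A tuple that exceeds the time budget gets a very low fitness.
    """
    if sum(minutes[t] for t in tuple_) > budget:
        return -1.0
    if not tuple_:
        return 0.0
    primary = weight * coverage(tuple_, covered, total_blocks)
    prefixes = [
        coverage(tuple_[:i], covered, total_blocks)
        for i in range(1, len(tuple_) + 1)
    ]
    secondary = sum(prefixes) / len(prefixes)
    return primary + secondary


# Hypothetical program with 10 blocks: T1 covers 40%, T2 covers 80%.
COVERED = {"T1": set(range(0, 4)), "T2": set(range(2, 10))}
MINUTES = {"T1": 2, "T2": 3}

t2_first = fitness(("T2", "T1"), COVERED, MINUTES, total_blocks=10, budget=5)
t1_first = fitness(("T1", "T2"), COVERED, MINUTES, total_blocks=10, budget=5)
too_slow = fitness(("T2", "T1"), COVERED, MINUTES, total_blocks=10, budget=4)
print(t2_first, t1_first, too_slow)
```

Both orderings reach the same overall coverage, so the primary term ties and the secondary term prefers running the 80% test first. The default `weight` is chosen so that the primary term dominates in this small example.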

Creation of New Test Tuples
Crossover
• Vary test tuples using recombination
• If recombination causes duplicate test case execution, replace duplicate test case with one that is unused
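One way to realize this step is one-point crossover followed by a repair pass. The slide does not say which recombination scheme is used, so the cut-point choice, the names `crossover` and `repair`, and the decision to drop a duplicate when no unused test remains are all assumptions.

```python
import random


def repair(child, all_tests, rng):
    """Replace repeated test cases with randomly chosen unused ones."""
    unused = [t for t in all_tests if t not in child]
    seen, result = set(), []
    for test in child:
        if test in seen:
            if not unused:
                continue  # nothing left to substitute: drop the duplicate
            test = unused.pop(rng.randrange(len(unused)))
        seen.add(test)
        result.append(test)
    return result


def crossover(parent_a, parent_b, all_tests, rng):
    """One-point crossover of two test tuples, returning two repaired children."""
    shortest = min(len(parent_a), len(parent_b))
    if shortest < 2:
        return list(parent_a), list(parent_b)  # too short to cut
    cut = rng.randint(1, shortest - 1)
    child_1 = parent_a[:cut] + parent_b[cut:]
    child_2 = parent_b[:cut] + parent_a[cut:]
    return repair(child_1, all_tests, rng), repair(child_2, all_tests, rng)


ALL_TESTS = ["T1", "T2", "T3", "T4", "T5", "T6"]
rng = random.Random(0)  # seeded so the run is repeatable
child_1, child_2 = crossover(
    ["T1", "T2", "T3", "T4"], ["T4", "T3", "T2", "T1"], ALL_TESTS, rng
)
print(child_1, child_2)
```

The repair pass keeps each child free of repeated test cases, so no test would be executed twice within one tuple.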

Creation of New Test Tuples
• Mutation
  • For each test case in tuple
    • Select random number, R
    • If R < mutation probability, replace test case
• Addition - Append random unused test case
• Deletion - Remove random test case
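The three operators can be sketched as small functions. The slide does not state what a mutated test case is replaced with, so replacing it with a random unused test is an assumption here, as are the function names.

```python
import random


def mutate(tuple_, all_tests, p_mutation, rng):
    """Replace each test, with probability p_mutation, by a random unused test."""
    result = list(tuple_)
    for i in range(len(result)):
        if rng.random() < p_mutation:
            unused = [t for t in all_tests if t not in result]
            if unused:
                result[i] = rng.choice(unused)
    return result


def add_test(tuple_, all_tests, rng):
    """Append one random test that the tuple does not already contain."""
    result = list(tuple_)
    unused = [t for t in all_tests if t not in result]
    if unused:
        result.append(rng.choice(unused))
    return result


def delete_test(tuple_, rng):
    """Remove one randomly chosen test from the tuple."""
    result = list(tuple_)
    if result:
        del result[rng.randrange(len(result))]
    return result


ALL_TESTS = ["T1", "T2", "T3", "T4", "T5", "T6"]
rng = random.Random(1)  # seeded so the run is repeatable
original = ["T1", "T2", "T3"]

mutated = mutate(original, ALL_TESTS, p_mutation=1.0, rng=rng)
unchanged = mutate(original, ALL_TESTS, p_mutation=0.0, rng=rng)
longer = add_test(original, ALL_TESTS, rng)
shorter = delete_test(original, rng)
print(mutated, unchanged, longer, shorter)
```

Each operator returns a new list and leaves its input untouched, which makes it straightforward to apply them with separate probabilities inside a generation loop.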

Experimentation Goals
• Analyze trends in average percent of faults detected (APFD)
• Determine if time-aware prioritizations outperform selected set of other prioritizations
• Identify time and space overheads

Experiment Design
• GNU/Linux workstations
  • 1.8 GHz Intel Pentium 4
  • 1 GB main memory
• JUnit test cases used for prioritization
• Case study applications
  • Gradebook
  • JDepend
• Faults seeded into applications
  • 25, 50, and 75 percent of 40 errors

Evaluation Metrics
• Average percent of faults detected (APFD)
  T = test tuple
  g = number of faults in program under test
  n = number of test cases
  reveal(i, T) = position of the first test in T that exposes fault i

  $$\mathrm{APFD}(T, P) = 1 - \frac{\sum_{i=1}^{g} \mathrm{reveal}(i, T)}{n g} + \frac{1}{2n}$$

• Peak memory usage
• User and system time
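The APFD definition translates directly into code. The sketch below applies it to the six-test example from the earlier slides. The function name `apfd` is illustrative, and the code assumes every fault is exposed by at least one test in the ordering.

```python
# Fault matrix from the earlier example slides.
FAULTS = {
    "T1": {"f1", "f2", "f4", "f5", "f6", "f7", "f8"},
    "T2": {"f1"},
    "T3": {"f1", "f5"},
    "T4": {"f2", "f3", "f7"},
    "T5": {"f4", "f6", "f8"},
    "T6": {"f2", "f4", "f6"},
}


def apfd(ordering, faults_by_test):
    """APFD = 1 - (sum of reveal(i, T)) / (n * g) + 1 / (2 * n).

    reveal(i, T) is the 1-based position of the first test that exposes
    fault i. Assumes each fault is exposed by some test in the ordering.
    """
    all_faults = set().union(*faults_by_test.values())
    n, g = len(ordering), len(all_faults)
    total = 0
    for fault in all_faults:
        total += next(
            position
            for position, test in enumerate(ordering, start=1)
            if fault in faults_by_test[test]
        )
    return 1 - total / (n * g) + 1 / (2 * n)


time_aware = apfd(["T5", "T4", "T3", "T1", "T2", "T6"], FAULTS)
fault_based = apfd(["T1", "T4", "T5", "T6", "T3", "T2"], FAULTS)
print(round(time_aware, 4), round(fault_based, 4))
```

The formula as written counts test positions, not minutes, so on its own it does not reflect how long each test takes to run.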

TAP APFD Values
Block coverage preferred:
• 11% better in Gradebook
• 13% better in JDepend

TAP Time Overheads
More generations with smaller populations:
• Took less time
• Same quality results

Gradebook: Intelligent vs. Random

JDepend: Intelligent vs. Random

Other Prioritizations
• Random prioritizations redistribute fault-revealing test cases
• Other prioritizations
  • Initial ordering
  • Reverse ordering
  • Fault-aware
    • Impossible to implement
    • Good watermark for comparison

Gradebook: Alternative Prioritizations

% total time   # Faults   Initial   Reverse   TAP    Fault aware
0.25           10         -0.6      -0.2      0.43   0.7
0.25           20         -0.9      -0.2      0.41   0.7
0.25           30         -0.9      -0.0      0.46   0.5
0.50           10         -0.04     0.1       0.74   0.9
0.50           20         -0.2      0.2       0.74   0.9
0.50           30         -0.3      0.3       0.72   0.8
0.75           10         0.3       0.5       0.73   0.9
0.75           20         0.1       0.4       0.71   0.9
0.75           30         0.04      0.5       0.70   0.9

• Time-aware prioritization up to 120% better than other prioritizations

Conclusions and Future Work
• Analyzes a test prioritization technique that accounts for a testing time budget
• Time intelligent prioritization had up to 120% APFD improvement over other techniques
• Future Work
  • Make fitness calculation faster
  • Distribute fitness function calculation
  • Exploit test execution histories
  • Create termination condition based on prior prioritizations
  • Analyze other search heuristics

Thank you!
Time-Aware Prioritization (TAP) Research:
• http://www.cs.virginia.edu/~krw7c/TimeAwarePrioritization.htm
