BIO PRESENTATION SUPPLEMENTAL T15, September 22, 2005, 1:30 PM
Defect Prediction with Reliability Growth Modeling
Michael Allegra, GXS
Better Software Conference & Expo 2005, September 19-22, 2005
Hyatt Regency San Francisco Airport, San Francisco, California, USA
Michael Allegra
Michael Allegra has devoted most of his 14-year career in software testing to leading and managing teams at IBM's Global Services division. He was responsible for delivering high-quality software for Fortune 500 customers and managing a billing test group responsible for $250 million in annual revenue. Michael has led process improvement activities for his organization involving CMM and industry best practices. He recently left IBM and joined GXS, where he manages a test group for the leading provider of messaging applications and services.
Defect Prediction with Reliability Growth Modeling
Michael Allegra
IBM/GXS
Agenda
• How will this technology help me?
• Overview of reliability growth models
• Describe a model applicable to most software development
• CASRE tool setup (it's free!)
• Entering and formatting the data with the provided templates
• Interpreting the results
• Summary
Why should I care?
• Software reliability models can tell us:
– The number of latent defects
– When testing is 'good enough'
– The effectiveness of the current test cycle
– Whether service-level/availability targets are valid
– Help desk staffing needs
Reliability Growth Modeling
Typical formulas
Reliability Growth Modeling
Typical formulas
Fahgetaboutit! We aren't covering that!
Basic Concepts
• Skip the rocket science stuff!
• Reliability Growth Modeling (RGM) is a subset of the Software Reliability Engineering (SRE) methodology.
• SRE covers a wide range of technical topics that cannot be covered in this brief tutorial.
• RGM can be useful without all the SRE steps by following some basic principles.
Model Types
• Defect prediction models are static or dynamic
– Static
• Defects per KLOC, function point, class, etc.
• Based on historical projects
• Example: a test team historically finds 5 defects/FP, and 1 defect escapes for every 7 found in test; so 21 FP in a release means 105 defects found in the test phase and 15 escapes, or latent defects (see the sketch after this slide).
• Useful for ballpark resource planning
• Not always accurate due to assumptions
• Proven best for module-level predictions
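A minimal sketch of the static-model arithmetic from the example above. All numbers are the slide's illustrative values, not real data; substitute your own historical ratios:

```python
# Ballpark static prediction from historical ratios.
function_points = 21
defects_per_fp = 5            # historical defects found per FP in test
escapes_per_found = 1 / 7     # one escape per seven defects found in test

found_in_test = function_points * defects_per_fp           # 105
latent_escapes = round(found_in_test * escapes_per_found)  # 15
print(f"Found in test: {found_in_test}, latent escapes: {latent_escapes}")
```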
Dynamic Models
• Use defect data from the actual project as it moves through the development lifecycle
• Best suited for product- or service-level reliability, not modules
• Two basic models:
– Full life cycle: Rayleigh model
– Test (qualification) phase: reliability growth models
Dynamic Models: Rayleigh
• A type of Weibull distribution (shape parameter of 2)
• Models the entire development life cycle
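A minimal sketch of the Rayleigh cumulative-defect curve. The total defect count K and the time t_peak at which the discovery rate peaks are hypothetical values chosen only for illustration:

```python
import numpy as np

# Rayleigh cumulative-defect curve: a Weibull CDF with shape parameter 2.
# K = total defects injected over the lifecycle (hypothetical),
# t_peak = time at which the defect-discovery rate peaks (hypothetical).
def rayleigh_cumulative(t, K, t_peak):
    return K * (1.0 - np.exp(-((t / t_peak) ** 2) / 2.0))

months = np.arange(1, 13)  # a hypothetical 12-month development lifecycle
print(rayleigh_cumulative(months, K=100, t_peak=5.0).round(1))
```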
Dynamic Models: Reliability Growth
• Reliability growth is present when failure intensity decreases as test time increases
• Used during the independent test phase
– Function test, system test, etc.
– When testing most closely resembles end-user activity
• Two types of reliability growth models:
– Time between failures (when is the next failure?)
– Fault count (how many failures exist?)
Recommended Model
• Yamada S-shaped model
• Fault detection rate starts flat and builds, reflecting the tester learning curve and test environment stability
[Chart: Yamada S-shaped model, cumulative defects (0 to 80) vs. test interval (1 to 21), showing an S-shaped growth curve]
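The Yamada (delayed S-shaped) model's mean value function is m(t) = a * (1 - (1 + b*t) * e^(-b*t)), where a is the expected total defect count and b is the defect detection rate. A minimal sketch with illustrative parameter values:

```python
import numpy as np

# Yamada delayed S-shaped mean value function:
#   m(t) = a * (1 - (1 + b*t) * exp(-b*t))
# a = expected total defects, b = defect detection rate.
def yamada_mean(t, a, b):
    return a * (1.0 - (1.0 + b * t) * np.exp(-b * t))

t = np.arange(1, 22)  # 21 test intervals, as in the chart above
print(yamada_mean(t, a=75.0, b=0.25).round(1))  # slow start, then S-shaped growth
```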
Model Assumptions
• Software is tested in a manner similar to how customers will use it
• The number of defects at the start of formal testing is unknown
• Fixes are clean (they do not introduce new defects)
• Discovered defects are fixed quickly and are not present in future test intervals
• There is a finite number of faults
Recommended Project Types
• At least 10 test intervals
• Expecting at least 50 defects
• One month or more of test time
• For iterative development:
– New defects must be allocated to the newly delivered functions
– See John Musa's book for details
Team must agree on a defect definition!
• The IBM definition of a defect is 'any non-conformance to the agreed requirements, acceptance criteria, specifications, plans, standards or other input requirements.'
• Defects that are 'opened' and subsequently 'cancelled' by a tester should not be included.
• Any behavior of the product that would result in a customer reporting a defect after the product is released should be considered a defect.
• Consideration should also be given to the following types of defects:
– Documentation
– Build
– Packaging
– Installation
– Performance
– Usability
CASRE
• Computer Aided Software Reliability Estimation
• Developed by the Jet Propulsion Laboratory
• Yes, it's free!
• Primarily developed for scholars, researchers, etc.
• The demo will cover a basic scenario step by step
CASRE
• Requires MS Windows
• Download CASRE 3.0 from:
– www.openchannelfoundation.org/projects/CASRE_3.0
– You will need to register and fax back a license agreement.
• Installation instructions are included on the web site.
• Verify the software starts without error.
– Note: make sure the 'casrev3.ini' file is in your Windows directory (c:\winnt)
Process Flow, High Level
1) Update data file per test interval (Excel)
2) Import to CASRE
3) Select/run model (CASRE)
4) Import results to Excel, then format
5) Interpret results/prediction
Create the Data File
• Start this after you have about 5 test intervals.
• Before you begin, gather a defect list that contains the following information:
– Date each defect was opened (sort the list by date)
– Number of defects per severity per date
– Interval size per increment (hours/day, days/week, etc.)
• Format the data file as shown in the template
• Save as type 'interval#.dat' (e.g., Oct18.dat)
• Isolate yourself from interruption while creating it!
Create Defect Data File
[Screenshot of the template data file; callouts indicate the test interval, test interval time, number of defects, test time, and severity fields]
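A sketch of how such a file might be generated from a defect log. The weekly grouping, the file name, and the whitespace-separated row layout (interval number, defects found, test time) are assumptions for illustration; verify the exact column order and format against the provided templates and the CASRE user guide:

```python
from collections import Counter
from datetime import date

# Hypothetical defect log: (date_opened, severity). In practice, export
# this from your defect tracker, sorted by date.
defects = [
    (date(2004, 9, 27), 2), (date(2004, 9, 28), 1),
    (date(2004, 10, 4), 3), (date(2004, 10, 5), 2),
]

# Group defects into weekly test intervals starting 9/26.
start = date(2004, 9, 26)
counts = Counter((d - start).days // 7 + 1 for d, _ in defects)

hours_per_interval = 40  # test time per interval; adjust to your effort data
with open("Oct18.dat", "w") as f:
    for interval in range(1, max(counts) + 1):
        # One row per interval: interval number, defects found, test time.
        f.write(f"{interval} {counts.get(interval, 0)} {hours_per_interval}\n")
```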
CASRE
• Start CASRE and open the .dat file
• Verify the data imported correctly
• Set up and execute the model:
– Select the data range
– Select the future prediction intervals
– Choose the model: 'Yamada S-Shaped'
• Select the model results
• Check 'Goodness of Fit'
CASRE
DEMO IN CASRE
Import and Format Results
Model Results: Predicted vs. Actual Defects
[Chart: 'Total Defects Predicted vs. Actual', predicted defects uncovered with 5 additional test days; predicted cumulative failures (56) vs. actual cumulative failures (53), plotted by test day from 9/26 to 12/5]
If there are significant differences past the first 1/3 of the test cycle, then the testing effectiveness must be reevaluated:
• If the actuals continue to run well below the prediction, the test coverage is probably insufficient, or there are not enough people testing given the current rate of execution.
• If the actuals are above the prediction, you may need to re-check the parameters in your data file and in CASRE.
Model Results: Predicted Failure Rate
[Chart: 'Predicted Failure Rate by Test Interval', failures per test hour (0.00 to 0.50), estimated at each test iteration from 9/26 to 12/5]
Push for additional test time if the predicted failure rate:
• is still increasing
• has not started to decline significantly
• remains at a level that is unacceptable to the team and business organization
Consider reducing test time if the predicted failure rate:
• is at a level that is acceptable to the business, i.e., 'we will not find enough defects to justify the cost of the remaining test intervals'
• has leveled off at a low rate and the test coverage has hit all major functions, so the rate is not likely to increase
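The predicted failure rate is the derivative of the Yamada mean value function: lambda(t) = a * b^2 * t * e^(-b*t). A sketch that flags the intervals where the predicted rate has dropped below a hypothetical acceptability threshold (all parameter and threshold values here are illustrative):

```python
import numpy as np

# Failure intensity of the Yamada delayed S-shaped model: the derivative
# of m(t) = a*(1 - (1 + b*t)*exp(-b*t)) is lambda(t) = a * b**2 * t * exp(-b*t).
def yamada_intensity(t, a, b):
    return a * b**2 * t * np.exp(-b * t)

t = np.arange(1, 22)
rate = yamada_intensity(t, a=75.0, b=0.25)  # illustrative parameters
acceptable = 1.0  # hypothetical threshold, failures per test interval
print(t[rate < acceptable])  # intervals where the predicted rate is acceptable
```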
Model Results: Total Defects Predicted
[Chart: 'Estimated Total Defects in Product as Testing Progressed', 95% confidence interval and most likely estimate at each test iteration: high 79, most likely 61, low 53]
After a flattening of the estimated total defects, if the rate begins to rise again, it can mean the scope of the testing is not sufficient and test coverage should be increased.
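CASRE produces these estimates and intervals itself; as a rough outside check, one can fit the Yamada curve to the cumulative defect counts and read an approximate interval off the fit. A sketch using least squares (a simplification of CASRE's maximum-likelihood approach; the data below is made up):

```python
import numpy as np
from scipy.optimize import curve_fit

def yamada_mean(t, a, b):
    return a * (1.0 - (1.0 + b * t) * np.exp(-b * t))

# Hypothetical cumulative defect counts per test interval.
t = np.arange(1, 12)
cum_defects = np.array([2, 6, 13, 22, 31, 39, 44, 48, 50, 52, 53])

# Least-squares fit of the mean value function to the observed counts.
(a_hat, b_hat), pcov = curve_fit(yamada_mean, t, cum_defects, p0=[60.0, 0.3])
half_width = 1.96 * np.sqrt(pcov[0, 0])  # rough 95% interval on total defects
print(f"Total defects: {a_hat:.0f} "
      f"(95% CI roughly {a_hat - half_width:.0f} to {a_hat + half_width:.0f})")
```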
Model Results: Defects Predicted vs. Actual
[Chart: 'Actual Latent Defects', high, most likely, and low bands (45 to 80 defects), post-GA defects plotted from 12/16/03 to 1/27/04]
• Exited test with 53 uncovered defects
• Model predicted 61 defects most likely
• 62 defects total after 2 months of release
• No further defects reported to date
The model proved to be very accurate!
Recommendations

More Recommendations