Quality Assurance: Test Development & Execution
Implementing Testing
Ian S. King
Test Development Lead, Windows CE Base OS Team
Microsoft Corporation

Test Schedule

What makes a good tester?
- Analytical
  - Ask the right questions
  - Develop experiments to get answers
- Methodical
  - Follow experimental procedures precisely
  - Document observed behaviors, their precursors and environment
- Brutally honest
  - You can't argue with the data

Phases of testing
- Unit testing (may be done by developers)
- Component testing
- Integration testing
- System testing
- Usability testing

How do test engineers fail?
- Desire to "make it work"
  - Impartial judge, not "handyman"
- Trust in opinion or expertise
  - Trust no one – the truth (data) is in there
- Failure to follow defined test procedure
  - How did we get here?
- Failure to document the data
- Failure to believe the data

Testability
- Can all of the feature's code paths be exercised through APIs, events/messages, etc.?
  - Unreachable internal states
- Can the feature's behavior be programmatically verified?
- Is the feature too complex to test?
  - Consider configurations, locales, etc.
- Can the feature be tested in a timely way with available resources?
  - Long test latency = late discovery of faults
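To make the "programmatically verified" bullet concrete, here is a minimal Python sketch of a unit-level test of the kind a developer might write during unit testing. The word_count function and its expected values are hypothetical stand-ins for a real feature, not part of the original deck.

```python
import unittest

def word_count(text):
    """Hypothetical feature under test: count whitespace-separated words."""
    return len(text.split())

class WordCountTests(unittest.TestCase):
    def test_typical_sentence(self):
        # Valid case: well-defined input, expected output stated up front
        self.assertEqual(word_count("the quick brown fox"), 4)

    def test_empty_string(self):
        # Null input: the spec must say what "no input" means
        self.assertEqual(word_count(""), 0)

    def test_extra_whitespace(self):
        # Boundary-ish case: repeated separators
        self.assertEqual(word_count("  spaced   out  "), 2)

if __name__ == "__main__":
    unittest.main()
```

Each test states its input and expected output explicitly, which is exactly what makes the behavior verifiable without a human in the loop.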

What color is your box?
- Black box testing
  - Treats the SUT as atomic
  - Study the gazinta's and gozouta's (inputs and outputs)
  - Best simulates the customer experience
- White box testing
  - Examine the SUT internals
  - Trace data flow directly (in the debugger)
  - Bug report contains more detail on source of defect
  - May obscure timing problems (race conditions)

Designing Good Tests
- Well-defined inputs and outputs
  - Consider environment as inputs
  - Consider 'side effects' as outputs
- Clearly defined initial conditions
- Clearly described expected behavior
- Specific – small granularity provides greater precision in analysis
- Test must be at least as verifiable as SUT

Types of Test Cases (see the first sketch below)
- Valid cases
  - What should work?
- Invalid cases
  - Ariane V – data conversion error (http://www.cs.york.ac.uk/hise/safety-critical-archive/1996/0055.html)
- Boundary conditions
  - Fails in September?
- Null input
- Error conditions
  - Distinct from invalid input

Manual Testing
- Definition: a test that requires direct human intervention with the SUT
- Necessary when:
  - GUI is present
  - Behavior is premised on physical activity (e.g. card insertion)
- Advisable when:
  - Automation is more complex than the SUT
  - SUT is changing rapidly (early development)

Automated Testing
- Good: replaces manual testing
- Better: performs tests difficult for manual testing (e.g. timing-related issues)
- Best: enables other types of testing (regression, perf, stress, lifetime)
- Risks:
  - Time investment to write automated tests
  - Tests may need to change when features change

Types of Automation Tools: Record/Playback (see the second sketch below)
- Record a "proper" run through the test procedure (inputs and outputs)
- Play back inputs, compare outputs with recorded values
- Advantage: requires little expertise
- Disadvantage: little flexibility - easily invalidated by product change
- Disadvantage: updating requires manual involvement
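The test-case categories above (valid, invalid, boundary, null, error) can be captured as a small data-driven test. The sketch below uses a hypothetical parse_day_of_month function as the SUT; the "09" case reflects one common reading of the "Fails in September?" bullet, where a zero-padded value is accidentally parsed as octal.

```python
def parse_day_of_month(text):
    """Hypothetical SUT: parse a zero-padded day-of-month string into an int 1-31."""
    value = int(text, 10)        # base 10 explicitly, so "08" and "09" parse correctly
    if not 1 <= value <= 31:
        raise ValueError(f"day out of range: {value}")
    return value

# Valid and boundary cases: input -> expected output.
VALID_CASES = {"15": 15, "01": 1, "31": 31, "09": 9}   # "09" is the September trap

# Invalid, null, and error cases: all should raise rather than return a wrong value.
FAILING_CASES = ["", "0", "32", "banana"]

def run_cases():
    for text, expected in VALID_CASES.items():
        assert parse_day_of_month(text) == expected, f"wrong result for {text!r}"
    for text in FAILING_CASES:
        try:
            parse_day_of_month(text)
        except ValueError:
            continue
        raise AssertionError(f"expected a ValueError for {text!r}")
    print("all test cases passed")

if __name__ == "__main__":
    run_cases()
```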

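A toy illustration of the record/playback idea, before the scripted variants on the next slides: record the outputs of a known-good run, then replay the same inputs and diff against the recording. The feature_under_test function and the baseline.json file name are invented for the example.

```python
import json

def feature_under_test(x):
    """Hypothetical SUT behavior to be captured and replayed."""
    return x * x + 1

def record(inputs, path):
    # Capture the "known good" outputs from a proper run.
    baseline = {str(x): feature_under_test(x) for x in inputs}
    with open(path, "w") as f:
        json.dump(baseline, f)

def playback(path):
    # Re-drive the same inputs and compare against the recorded outputs.
    with open(path) as f:
        baseline = json.load(f)
    failures = [x for x, expected in baseline.items()
                if feature_under_test(int(x)) != expected]
    return failures

if __name__ == "__main__":
    record(range(5), "baseline.json")
    print("mismatches:", playback("baseline.json"))
```

The brittleness the slide warns about is visible here: any legitimate change to the feature's output invalidates the recording and forces a manual re-record.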
Types of Automation Tools: Scripted Record/Playback
- Fundamentally the same as simple record/playback
- The record of inputs/outputs made during a manual test is converted to a script
- Advantage: existing tests can be maintained as programs
- Disadvantage: requires more expertise
- Disadvantage: fundamental changes can ripple through MANY scripts

Types of Automation Tools: Script Harness
- Tests are programmed as modules, then run by the harness
- Harness provides control and reporting
- Advantage: tests can be very flexible
- Disadvantage: requires considerable expertise and abstract process

Types of Automation Tools: Verb-Based Scripting (see the first sketch below)
- A module is programmed to invoke product behavior at a low level – associated with a 'verb'
- Tests are designed using the defined set of verbs
- Advantage: great flexibility
- Advantage: changes are usually localized to a given verb
- Disadvantage: requires considerable expertise and abstract process

Test Corpus
- Body of data that generates known results
- Can be obtained from
  - Real world – demonstrates customer experience
  - Test generator – more deterministic
- Caveats
  - Bias in data generation
  - Don't share the test corpus with developers!

Instrumented Code: Test Hooks (see the second sketch below)
- Code that enables non-invasive testing
- Code remains in the shipping product
- May be enabled through
  - Special API
  - Special argument or argument value
  - Registry value or environment variable
- Example: Windows CE IOCTLs
- Risk: silly customers...

Instrumented Code: Diagnostic Compilers
- Create an 'instrumented' SUT for testing
  - Profiling – where does the time go?
  - Code coverage – what code was touched?
    - Really evaluates the testing, NOT code quality
  - Syntax/coding style – discover bad coding
    - lint, the original syntax checker
  - Complexity
    - Very esoteric, often disputed (religiously)
    - Example: function point counting
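A compact sketch of verb-based scripting as described above: each verb wraps one piece of low-level product behavior, and a test is just a sequence of verbs. The create/append/verify verbs and the dictionary standing in for the SUT are illustrative assumptions, not the deck's actual tooling.

```python
# Hypothetical verb implementations: each wraps low-level product behavior.
def create_file(state, name):
    state[name] = ""

def append_text(state, name, text):
    state[name] = state.get(name, "") + text

def verify_contents(state, name, expected):
    assert state.get(name) == expected, f"{name!r}: {state.get(name)!r} != {expected!r}"

VERBS = {
    "create": create_file,
    "append": append_text,
    "verify": verify_contents,
}

# A test case is just a sequence of (verb, arguments); the harness stays generic.
TEST_SCRIPT = [
    ("create", ("log.txt",)),
    ("append", ("log.txt", "hello")),
    ("verify", ("log.txt", "hello")),
]

def run_script(script):
    state = {}   # stands in for the real SUT; a product change is localized to one verb
    for verb, args in script:
        VERBS[verb](state, *args)
    print("script passed")

if __name__ == "__main__":
    run_script(TEST_SCRIPT)
```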

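One way a test hook can be wired in, as a sketch: the hook ships with the product but only activates when an environment variable is set, matching the "registry value or environment variable" option above. The QA_TEST_HOOK variable and the process_order function are hypothetical.

```python
import json
import os

# Hypothetical test hook: when QA_TEST_HOOK is set in the environment, the
# feature also emits its internal state so a test can verify it non-invasively.
# The hook code ships in the product but stays inert for ordinary users.

def process_order(items):
    total = sum(price for _, price in items)
    discounted = total * 0.9 if total > 100 else total

    if os.environ.get("QA_TEST_HOOK") == "1":
        # Expose otherwise-unreachable internal state for the test harness.
        print(json.dumps({"raw_total": total, "discounted": discounted}))

    return discounted

if __name__ == "__main__":
    os.environ["QA_TEST_HOOK"] = "1"   # a test run sets this; customers normally would not
    process_order([("widget", 60.0), ("gadget", 70.0)])
```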
Environment Management Tools
- Predictably simulate real-world situations
- MemHog
- DiskHog
- Data Channel Simulator

Instrumented platforms
- Example: App Verifier
  - Supports 'shims' to instrument standard system calls such as memory allocation
  - Tracks all activity and reports errors such as unreclaimed allocations, multiple frees, use of freed memory, etc.
- Win32 includes 'hooks' for platform instrumentation

Test Monkeys (see the first sketch below)
- Generate random input, watch for crash or hang
- Typically 'hooks' the UI through the message queue
- Primarily to catch "local minima" in the state space (logic "dead ends")
- Useless unless the state at the time of failure is well preserved!

Finding and Managing Bugs

What is a bug?
- Formally, a "software defect"
  - SUT fails to perform to spec
  - SUT causes something else to fail
  - SUT functions, but does not satisfy usability criteria
- If the SUT works to spec and someone wants it changed, that's a feature request

What are the contents of a bug report? (see the second sketch below)
- Repro steps – how did you cause the failure?
- Observed result – what did it do?
- Expected result – what should it have done?
- Any collateral information: return values/output, debugger, etc.
- Environment
  - Test platforms must be reproducible
  - "It doesn't do it on my machine"
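A bare-bones test monkey in the spirit of the slide above: hammer the SUT with random input and stop at the first crash, keeping enough state to reproduce it. The feature_under_test function and its planted defect are invented for the example; a real monkey would also watch for hangs with a timeout.

```python
import random

def feature_under_test(command):
    """Hypothetical SUT entry point driven with random input."""
    if command.startswith("z") and len(command) > 6:
        raise RuntimeError("simulated crash")   # the defect the monkey should find
    return len(command)

def monkey(iterations=10_000, seed=1234):
    rng = random.Random(seed)          # fixed seed so a failure can be reproduced
    history = []                       # preserve recent inputs: state at failure matters
    for i in range(iterations):
        cmd = "".join(rng.choice("abcxyz") for _ in range(rng.randint(1, 10)))
        history.append(cmd)
        try:
            feature_under_test(cmd)
        except Exception as exc:
            print(f"crash on iteration {i}: {exc!r}")
            print("last inputs:", history[-5:])
            return
    print("no crash observed")

if __name__ == "__main__":
    monkey()
```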

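The bug-report fields listed above map naturally onto a small record type. The sketch below is illustrative only, and the sample values (build number, platform) are made up.

```python
from dataclasses import dataclass, field

@dataclass
class BugReport:
    """Illustrative bug record mirroring the report fields on the slide above."""
    title: str
    repro_steps: list[str]      # how did you cause the failure?
    observed_result: str        # what did it do?
    expected_result: str        # what should it have done?
    environment: str            # reproducible test platform, build, locale, ...
    collateral: list[str] = field(default_factory=list)  # return values, debugger output, logs

if __name__ == "__main__":
    report = BugReport(
        title="Save dialog crashes on empty file name",
        repro_steps=["Open the editor", "Choose File > Save", "Leave the name blank", "Press OK"],
        observed_result="Application crashes with an access violation",
        expected_result="Dialog shows a 'file name required' message",
        environment="Hypothetical build 2104, Windows CE emulator, en-US locale",
    )
    print(report)
```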
Ranking bugs
- Severity
  - Sev 1: crash, hang, data loss
  - Sev 2: blocks feature, no workaround
  - Sev 3: blocks feature, workaround available
  - Sev 4: trivial (e.g. cosmetic)
- Priority
  - Pri 1: Fix immediately
  - Pri 2: Fix before next release outside the team
  - Pri 3: Fix before ship
  - Pri 4: Fix if nothing better to do ☺

A Bug's Life

Regression Testing
- Good: rerun the test that failed
  - Or write a test for what you missed
- Better: rerun related tests (e.g. component level)
- Best: rerun all product tests
  - Automation can make this feasible!

Tracking Bugs (see the sketch below)
- Raw bug count
  - The slope is a useful predictor
- Ratio by ranking
  - How bad are the bugs we're finding?
- Find rate vs. fix rate
  - One step forward, two back?
- Management choices
  - Load balancing
  - Review of development quality

When can I ship?
- Test coverage is sufficient
- Bug slope and find vs. fix rates lead to convergence
- Severity mix is primarily low-sev
- Priority mix is primarily low-pri

To beta, or not to beta
- Quality bar for a beta release: features mostly work if you use them right
- Pro:
  - Get early customer feedback on the design
  - Real-world workflows find many important bugs
- Con:
  - Do you have time to incorporate beta feedback?
  - A beta release takes time and resources
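The tracking and ship-readiness questions above (find rate vs. fix rate, severity mix) reduce to simple arithmetic over the bug database. The weekly counts below are invented sample data, used only to show the calculation.

```python
from collections import Counter

# Invented weekly snapshot: bugs found and bugs fixed per week, plus the
# severity of each currently open bug (Sev 1 = worst, Sev 4 = cosmetic).
found_per_week = [40, 35, 28, 20, 12]
fixed_per_week = [22, 30, 30, 26, 18]
open_severity = Counter({1: 0, 2: 2, 3: 9, 4: 23})

# Find rate vs. fix rate: convergence means the backlog growth keeps shrinking.
backlog_delta = [found - fixed for found, fixed in zip(found_per_week, fixed_per_week)]
converging = all(later <= earlier for earlier, later in zip(backlog_delta, backlog_delta[1:]))

# Severity mix: a ship-ready product has mostly low-severity bugs left open.
total_open = sum(open_severity.values())
low_sev_share = (open_severity[3] + open_severity[4]) / total_open

print("weekly backlog change:", backlog_delta)
print("find/fix converging:  ", converging)
print(f"low-severity share:    {low_sev_share:.0%}")
```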

Developer Preview
- Different quality bar than beta
- Goals
  - Review of feature set
  - Review of API set by technical consumers
- Customer experience
  - Known conflicts with previous version
  - Known defects, even crashing bugs
  - Setup/uninstall not completed
