Managing Defects in HPC Software Development


  1. Managing Defects in HPC Software Development
     Presented to the OLCF Webinar Series
     Tom Evans, ORNL, PI, ExaSMR ECP Applications Project
     November 1, 2017

  2. Before we start
     • Since I cannot see anyone in this presentation format, feel free to totally veg out, use profane gestures, etc.
     • I am not proselytizing; these are some techniques that have worked well for us over the last 20+ years. If you violently disagree, see item (1).
     • I will try to keep this short and sweet. In the end there is only one concept I would like you to take away from this, assuming item (1) does not apply.
     • I promise that there will be no distracting manager clip-art, sliding images, dissolves, etc.
     • If you require sparkly things in the presentation to keep you awake, please refer back to item (1).
     Defects. HPC Software

  7. Outline
     1. Research and Software Development
     2. The Complete Development Lifecycle
     3. Unit Testing
     4. Design-by-Contract™
     5. Summary

  8. Research and HPC Code
     Challenge: Manage SQE with discovery
     Posit: Consider a new algorithm implemented in a multidimensional, parallel code.
     • Theory predicts second-order convergence.
     • Computational results are first-order instead of second-order.
     • Is this a code bug or an error in analysis?
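One way to put a number on the "first-order instead of second-order" question is to measure the observed order of convergence from the errors at two step sizes. The sketch below is not from the talk: it uses finite-difference derivatives of sin(x) as a stand-in problem, and the function names are hypothetical. A central difference should show an observed order near 2 and a forward difference near 1.

```python
import math


def forward_difference(f, x, h):
    """First-order accurate approximation of f'(x)."""
    return (f(x + h) - f(x)) / h


def central_difference(f, x, h):
    """Second-order accurate approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def observed_order(approx, f, exact, x, h_coarse, h_fine):
    """Estimate p in error ~ C * h**p from the errors at two step sizes."""
    err_coarse = abs(approx(f, x, h_coarse) - exact)
    err_fine = abs(approx(f, x, h_fine) - exact)
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)


# Stand-in problem (an assumption for illustration): d/dx sin(x) at x = 1.
X0 = 1.0
EXACT = math.cos(X0)
order_central = observed_order(central_difference, math.sin, EXACT, X0, 1e-2, 5e-3)
order_forward = observed_order(forward_difference, math.sin, EXACT, X0, 1e-2, 5e-3)
print(f"central: {order_central:.2f}, forward: {order_forward:.2f}")
```

A check like this can run as a regression test, so a drop from the predicted order is flagged when it is introduced rather than discovered later in a full parallel run.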

  12. Research and HPC Code
     • In other words, SQE and methods research are not only compatible, they are essential.
     • This is especially true for parallel scientific software, which is much more difficult to design, test, and analyze than serial software.
     • In this case we are interested in performing software verification.
     • Software verification is a method for removing defects at code construction time.

  13. What is SQE?
     • SQE (software quality engineering) is the practice of managing the cost and quality of a software product
     • Guiding principle: the cost of defect resolution increases with time from defect introduction
     • Things fall apart
       ◮ Defects in model development
       ◮ Defects in algorithmic selection
       ◮ Defects in requirements
       ◮ Defects in implementation

  14. How to mitigate defects
     • There are many methods for defect management
     • Three techniques we use for software verification in an HPC environment
       ◮ The complete development lifecycle
       ◮ Unit testing
       ◮ Design-by-Contract™
     • This list is by no means exhaustive (or a complete SQE process)
       ◮ Notably missing: reviews
       ◮ We do them, they work, but I'm not here to talk about them
     • However, taken together these can help catch defects before they become an unbearable expense

  15. Requirements Management in Scientific Software
     • Requirements can be very difficult to pin down in scientific software development:
       ◮ the vector keeps changing as new things are learned
       ◮ as a community we often know what we want, but aren't necessarily good at saying it
     • Software verification helps disambiguate language-based requirements into functional specifications
     • As requirements change, software verification helps ensure that the software is keeping pace
     • Agility is key in scientific software development:
       ◮ rapid prototyping
       ◮ testing new methods, algorithms, and features

  16. Complete Development Lifecycle
     • The developer is responsible for the complete implementation of a feature, including:
       ◮ Requirements
       ◮ Derivation
       ◮ Construction
       ◮ Deployment
     • Documentation and verification are implicit in each phase
     • Reviews and team collaboration are essential
     Developers are responsible for all phases of code development.

  17. Unit Testing
     Unit testing is a form of software verification
     • It ensures that each part of the software performs its contracted task
     • The effectiveness of unit testing is greatly enhanced by the following two code design practices:
       ◮ Acyclic code design
       ◮ Design-by-Contract™ (see later)
     We practice a method of unit testing in which the unit test is written either before, or concurrently with, the executable code.
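As a rough illustration of how unit tests and contracts can work together, here is a minimal sketch. It assumes a hypothetical one-dimensional `SlabCell` class and hypothetical `require`/`ensure` helpers; it is not the RTK_Cell code from the talk. The tests pin down the contracted behavior of `distance_to_boundary`, and the contract checks reject calls that violate its preconditions.

```python
def require(condition, message):
    """Precondition: a failure here means the caller broke the contract."""
    if not condition:
        raise AssertionError("precondition failed: " + message)


def ensure(condition, message):
    """Postcondition: a failure here means the implementation is wrong."""
    if not condition:
        raise AssertionError("postcondition failed: " + message)


class SlabCell:
    """Hypothetical 1-D cell spanning [lo, hi], used only for illustration."""

    def __init__(self, lo, hi):
        require(lo < hi, "lo < hi")
        self.lo = lo
        self.hi = hi

    def distance_to_boundary(self, x, mu):
        """Distance from x to the cell edge along direction cosine mu."""
        require(self.lo <= x <= self.hi, "position lies inside the cell")
        require(mu != 0.0, "direction cosine is nonzero")
        edge = self.hi if mu > 0.0 else self.lo
        distance = (edge - x) / mu
        ensure(distance >= 0.0, "distance is non-negative")
        return distance


# Unit tests, written against the contract rather than the implementation.
def test_moving_right():
    assert SlabCell(0.0, 2.0).distance_to_boundary(0.5, 1.0) == 1.5


def test_moving_left():
    assert SlabCell(0.0, 2.0).distance_to_boundary(0.5, -1.0) == 0.5


def test_oblique_direction():
    assert SlabCell(0.0, 2.0).distance_to_boundary(0.5, 0.5) == 3.0


def test_position_outside_cell_is_rejected():
    try:
        SlabCell(0.0, 2.0).distance_to_boundary(3.0, 1.0)
    except AssertionError:
        return
    raise RuntimeError("expected a contract violation")
```

The test functions use plain `assert` statements, so they can be called directly or collected by a test runner such as pytest.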

  18. Acyclic Code Design
     • There are no physical or logical cyclic dependencies
     • Allows hierarchical testing
     [Figure: class dependency diagram. Classes shown: RTK_Cell, RTK_Array<T>, RTK_Geometry<T>, RTK_Core_Geometry (binds T to RTK_Array<RTK_Array<RTK_Cell>>), Boundary_Mesh, Tallier, Domain_Transporter, Source_Transporter, DR_Source_Transporter, DD_Source_Transporter, Solver, Eigenvalue_Solver, Fixed_Source_Solver. Template parameters shown: Geometry, Physics.]
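An acyclic dependency graph can be sorted so that every component is tested after the things it depends on, which is what makes hierarchical testing possible. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) on the geometry chain named on the slide. The dictionary layout and the `hierarchical_test_order` function are assumptions for illustration, not part of the presented code.

```python
from graphlib import CycleError, TopologicalSorter

# Each key depends on the components listed in its value.
# The chain follows the binding shown on the slide:
# RTK_Core_Geometry <T:RTK_Array<RTK_Array<RTK_Cell>>>.
DEPENDS_ON = {
    "RTK_Cell": [],
    "RTK_Array<RTK_Cell>": ["RTK_Cell"],
    "RTK_Array<RTK_Array<RTK_Cell>>": ["RTK_Array<RTK_Cell>"],
    "RTK_Core_Geometry": ["RTK_Array<RTK_Array<RTK_Cell>>"],
}


def hierarchical_test_order(depends_on):
    """Return components lowest level first; raises CycleError on a cycle."""
    return list(TopologicalSorter(depends_on).static_order())


def has_cycle(depends_on):
    """True if the dependency graph cannot be ordered hierarchically."""
    try:
        hierarchical_test_order(depends_on)
    except CycleError:
        return True
    return False


for level, name in enumerate(hierarchical_test_order(DEPENDS_ON), start=1):
    print(f"test level {level}: {name}")
```

If a low-level class were changed to depend on a high-level one, `has_cycle` would return `True` and no bottom-up test order would exist.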

  19. An Example: Reactor Geometry
     Figure: Small modular reactor core model.

  20. An Example: Reactor Geometry
     1. Sample starting neutron
     2. Sample distance to collision: $d_{\mathrm{col}} = -\dfrac{\log(\xi)}{\sigma(r, E)}$
     3. Calculate distance to boundary
     4. Move particle (track length $l_k$); process collision
     5. Tally state data: $\phi = \dfrac{1}{V} \sum_k l_k$
     6. Repeat 2–5
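The six steps above can be sketched as a toy Monte Carlo loop. This is a minimal sketch under strong assumptions that are not in the talk: a one-dimensional slab of the given width with unit cross-sectional area, a constant cross section `sigma`, directions restricted to +1 or -1, and a fixed absorption probability at each collision. All function and parameter names are hypothetical.

```python
import math
import random


def sample_distance_to_collision(sigma, rng):
    """Step 2: d_col = -log(xi) / sigma, with xi uniform on (0, 1]."""
    xi = 1.0 - rng.random()  # random() is in [0, 1), so xi is never 0
    return -math.log(xi) / sigma


def distance_to_boundary(x, mu, width):
    """Step 3: distance from x to the slab edge for mu = +1 or -1."""
    return (width - x) if mu > 0.0 else x


def track_length_flux(n_histories, sigma, absorb_prob, width=1.0, seed=0):
    """Track-length estimate of the flux, phi = (1/V) * sum_k l_k, per history."""
    rng = random.Random(seed)
    total_track = 0.0
    for _ in range(n_histories):
        # Step 1: sample the starting neutron (uniform position, random direction).
        x = rng.random() * width
        mu = rng.choice((-1.0, 1.0))
        while True:
            d_col = sample_distance_to_collision(sigma, rng)
            d_bnd = distance_to_boundary(x, mu, width)
            step = min(d_col, d_bnd)
            x += mu * step          # Step 4: move the particle
            total_track += step     # Step 5: tally the track length l_k
            if d_bnd <= d_col:
                break               # particle leaked out of the slab
            if rng.random() < absorb_prob:
                break               # collision ended the history
            mu = rng.choice((-1.0, 1.0))  # scattered; Step 6: repeat 2 to 5
    volume = width  # unit cross-sectional area assumed
    return total_track / (n_histories * volume)


flux = track_length_flux(20000, sigma=5.0, absorb_prob=0.5)
print(f"flux per source particle: {flux:.4f}")
```

With a nearly transparent slab (very small `sigma`) every particle streams straight to a boundary, so the estimate should approach 0.5 for a unit-width slab, which gives a simple analytic check for a unit test.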

  25. First Level: RTK_Cell
     • Here is the class diagram for the RTK_Geometry part of the code
     [Figure: class diagram]
     • RTK_Geometry<T>: initalize(), distance_to_boundary(), move_across_surface(), move_within_cell(), position(), direction(), change_direction(), reflect(), boundary_state()
     • RTK_Core_Geometry: binds RTK_Geometry <T:Core>
     • RTK_Array<T>: initialize(), distance_to_boundary(), update_state(), cross_surface(), find_object(), matid()
     • Lattice: binds RTK_Array <T:RTK_Cell>
     • Core: binds RTK_Array <T:Lattice>
     • RTK_Cell: initialize(), distance_to_boundary(), update_state(), cross_surface(), matid()
