Recovering Numerical Reproducibility in Hydrodynamics Simulations


  1. 23rd IEEE Symposium on Computer Arithmetic, July 10-13, 2016, Santa Clara, USA. Recovering Numerical Reproducibility in Hydrodynamics Simulations. Philippe Langlois, Rafife Nheili, Christophe Denis. DALI, Université de Perpignan Via Domitia; LIRMM, UMR 5506 CNRS, Université de Montpellier; CMLA, ENS Cachan, France.

  8. Recovering numerical reproducibility. Reproducibility: bitwise identical results for every p-parallel run, p ≥ 1. Reproducibility ≠ Accuracy. Failures reported in numerical simulation for energy [10], dynamic weather science [2], molecular dynamics [9], fluid dynamics [8]. How to debug? How to test? How to validate? How to obtain legal agreements? [Figure: executions (sequential, 2 procs, 4 procs, p procs) of the non-reproducible original code and of the reproducible code; labels: accuracy, reproducibility.]
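As an illustration of what "bitwise identical" means in practice, here is a minimal Python sketch that compares two result vectors bit for bit. The helper names (`bits`, `bitwise_identical`) and the sample values are hypothetical and do not come from Telemac; the two sample results only mimic two runs that add the same numbers in a different order.

```python
import struct

def bits(x: float) -> int:
    """Return the 64-bit IEEE 754 pattern of a Python float as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def bitwise_identical(run_a, run_b) -> bool:
    """True only if both runs hold the same values, bit for bit."""
    return len(run_a) == len(run_b) and all(
        bits(x) == bits(y) for x, y in zip(run_a, run_b)
    )

# Hypothetical results: the same three numbers added in two different orders.
sequential = [(0.1 + 0.2) + 0.3]
parallel = [0.1 + (0.2 + 0.3)]

print(bitwise_identical(sequential, sequential))
print(bitwise_identical(sequential, parallel))
```

Comparing bit patterns rather than using a tolerance is the design choice here: the definition on the slide asks for identical results, not close ones.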

  9. Hydrodynamics simulation. One industrial-scale simulation code: simulation of free-surface flows in 1D-2D-3D hydrodynamics; 300 000 lines of open source Fortran 90; 20 years, 4000 registered users, EDF R&D + international consortium. Telemac 2D [3]: 2D hydrodynamics, Saint-Venant equations; finite element method, triangular element mesh, sub-domain decomposition for parallel resolution; mesh node unknowns: water depth (H) and velocity (U, V).

  10. Telemac2D: the simplest gouttedo simulation. The gouttedo simulation test case: 2D simulation of a water drop falling in a square basin. Unknown: water depth for a 0.2 sec time step. Triangular mesh: 8978 elements and 4624 nodes. Expected numerical reproducibility (time step = 1, 2, ...). [Plots: Sequential; Parallel p = 2.]

  25. A white plot displays a non-reproducible value. [Animation over time steps 1 to 15 comparing the Sequential and Parallel p = 2 plots.] Numerical reproducibility? (time steps 1 to 14) NO numerical reproducibility! (time step = 15)

  26. Telemac2D: gouttedo. NO numerical reproducibility! [Plots: Sequential; Parallel p = 2.]

  29. Today's issues. Case study: industrial-scale software openTelemac-Mascaret; finite element simulation, domain decomposition, linear system solving; 2 modules: Tomawac, Telemac2D. Feasibility: How to recover reproducibility? What are the sources of non-reproducibility? Do existing techniques apply, and how easily? Compensation yields reproducibility here! Efficiency: How much to pay for reproducibility? An extra cost of ×1.2 to ×2.3, which decreases as the problem size increases. OK to debug, to validate and even to simulate!

  30. Outline. 1 Motivation. 2 Reproducibility failure in a finite element simulation: sequential and parallel FE assembly; sources of non-reproducibility in Telemac2D. 3 Recovering reproducibility: reproducible parallel FE assembly; reproducible algebraic operations; reproducible conjugate gradient; reproducible Telemac2D. 4 Efficiency. 5 Conclusion.

  33. Parallel reduction and compensation techniques. Floating-point addition is not associative: the computed value depends on the operation order. A parallel reduction of undefined order generates a reproducibility failure: in general, (a ⊕ b) ⊕ (c ⊕ d) ≠ ((a ⊕ b) ⊕ c) ⊕ d. Compensate rounding errors with error-free transformations: with e1, e2, e3 the rounding errors of the three additions in the pairwise order and f1, f2, f3 those of the sequential order, ((a ⊕ b) ⊕ (c ⊕ d)) ⊕ ((e1 ⊕ e2) ⊕ e3) = (((a ⊕ b) ⊕ c) ⊕ d) ⊕ ((f1 ⊕ f2) ⊕ f3). The compensation should be repeated for sums that are too ill-conditioned.
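The order dependence and its compensation can be reproduced on four numbers. The sketch below is an illustration only, not the authors' code: it uses Knuth's TwoSum as the error-free transformation (an assumption, the slides do not name one), and the values of a, b, c, d are chosen by hand so that the pairwise and sequential orders round differently.

```python
def two_sum(a, b):
    """Error-free transformation: returns s = fl(a + b) and e with a + b = s + e."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e

# Hand-picked values (assumption): adding 1.0 to 1e16 is lost to rounding.
a, b, c, d = 1e16, 1.0, -1e16, 1.0

# Pairwise (tree) order: (a + b) + (c + d), with rounding errors e1, e2, e3.
ab, e1 = two_sum(a, b)
cd, e2 = two_sum(c, d)
tree, e3 = two_sum(ab, cd)
tree_comp = tree + ((e1 + e2) + e3)

# Sequential order: ((a + b) + c) + d, with rounding errors f1, f2, f3.
s1, f1 = two_sum(a, b)
s2, f2 = two_sum(s1, c)
seq, f3 = two_sum(s2, d)
seq_comp = seq + ((f1 + f2) + f3)

print(tree, seq)            # plain results: the two orders disagree
print(tree_comp, seq_comp)  # compensated results: the two orders agree
```

On this input a single compensation step is enough for both orders to return the exact sum. As the slide notes, a more ill-conditioned sum may need the compensation to be repeated.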

  35. Finite element assembly: the sequential case. The assembly step, $V(i) = \sum_{\text{elements}} W_{el}(i)$: compute the inner node values V(i) by accumulating the local W_el for every element el that contains i. The assembly loop:

       for p = 1,np                 //p: triangular local number (np=3)
         for el = 1,nel
           i = IKLE(el,p)           //<-- LOOP INDEX INDIRECTION
           V(i) = V(i) + W(el,p)    //i: domain global number
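The assembly loop can be tried on a toy mesh. The following Python sketch mirrors the loop structure of the slide under stated assumptions: indices are 0-based instead of Fortran's 1-based, the function name `assemble` is hypothetical, and the two-triangle mesh and the elementary values in `w` are made up for illustration.

```python
def assemble(ikle, w, n_nodes, np_local=3):
    """Accumulate elementary contributions W(el, p) into nodal values V(i)."""
    v = [0.0] * n_nodes
    for p in range(np_local):          # p: local node number in the triangle
        for el in range(len(ikle)):    # el: element number
            i = ikle[el][p]            # index indirection: local -> global node
            v[i] += w[el][p]           # order of accumulation follows the loops
    return v

# Made-up mesh: two triangles sharing the edge between global nodes 1 and 2.
ikle = [[0, 1, 2], [1, 3, 2]]
w = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]

v = assemble(ikle, w, n_nodes=4)
print(v)
```

Because of the indirection through `ikle`, the order in which contributions reach a given node depends on the element numbering, which is what makes the accumulation order sensitive to how the mesh is split.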

  36. Finite element assembly: the parallel case. Interface point assembly: communications and reductions, $V(i) = \sum_{D_k} V_{D_k}(i)$ for sub-domains $D_k$, $k = 1 \ldots p$. Inner nodes of the sequential mesh become interface points between sub-domains in the parallel case. Exact arithmetic: sequential, $V(i) = a$; parallel, $V_{D_1}(i) = b$ and $V_{D_2}(i) = c$; interface point assembly, $V(i) = b + c = a$.

  37. Finite element assembly: the parallel case. Interface point assembly: communications and reductions, $V(i) = \sum_{D_k} V_{D_k}(i)$ for sub-domains $D_k$, $k = 1 \ldots p$. Floating point arithmetic: sequential, $V(i) = \hat{a}$; parallel, $V_{D_1}(i) = \hat{b}$ and $V_{D_2}(i) = \hat{c}$; interface point assembly, $V(i) = \hat{b} \oplus \hat{c} \neq \hat{a}$.
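The contrast between exact and floating point interface assembly can be checked on one node. This Python sketch is a toy model, not Telemac code: the contributions 0.1, 0.2, 0.3 and their split between two sub-domains are made up, and `fractions.Fraction` stands in for exact arithmetic.

```python
from fractions import Fraction

def accumulate(contribs, zero):
    """Add contributions one by one, in the given order."""
    total = zero
    for w in contribs:
        total = total + w
    return total

# Made-up contributions to one node, split between sub-domains D1 and D2.
w_d1 = [0.1]
w_d2 = [0.2, 0.3]

# Floating point: sequential value a versus interface assembly b + c.
a = accumulate(w_d1 + w_d2, 0.0)
b = accumulate(w_d1, 0.0)
c = accumulate(w_d2, 0.0)
print(a, b + c)

# Exact arithmetic on the same binary inputs: both orders agree.
zero = Fraction(0)
a_exact = accumulate([Fraction(x) for x in w_d1 + w_d2], zero)
b_exact = accumulate([Fraction(x) for x in w_d1], zero)
c_exact = accumulate([Fraction(x) for x in w_d2], zero)
print(a_exact == b_exact + c_exact)
```

The inputs are identical in both halves of the sketch; only the rounding after each addition makes the sequential and the interface results differ.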
