Petascale Computational Fluid Dynamics with Python on GPUs


  1. Petascale Computational Fluid Dynamics with Python on GPUs • F.D. Witherden, P.E. Vincent • Department of Aeronautics, Imperial College London

  2. Introduction • Computational fluid dynamics (CFD) is the bedrock of several high-tech industries. • Practitioners want to perform unsteady, scale-resolving simulations in the vicinity of complex geometries.

  3. Image courtesy of A.S. Ayer

  4. The Need for FLOP/s • From "The Opportunities and Challenges of Exascale Computing", US DOE, fall 2010.

  5. R_max != R_peak • FLOP/s are great… • if you can get them. • Most commercial codes struggle to get ~10% of peak on CPUs.

  6. PyFR • A high-order compressible Navier-Stokes solver for unstructured grids. • Designed from the ground up to run on NVIDIA GPUs. • Written entirely in Python!

  7. The Py in PyFR • Leverages PyCUDA and mpi4py. • Makes extensive use of run-time code generation. • All compute performed on device. • Overhead from the Python interpreter < 1%.
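
Run-time code generation means that kernel source is produced as text while the program runs, specialised to the problem at hand, and only then compiled. The sketch below illustrates the idea with Python's standard `string.Template`. The kernel name `axpy_kernel`, the template and the `render_kernel` helper are hypothetical and are not taken from PyFR. The sketch stops at producing CUDA C source so that it runs without a GPU; with PyCUDA, a string like this could be passed to `pycuda.compiler.SourceModule` for compilation.

```python
from string import Template

# Hypothetical CUDA C template; $dtype and $n are filled in at run time.
KERNEL_TEMPLATE = Template("""
__global__ void axpy_kernel($dtype a, const $dtype* x, $dtype* y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < $n)
        y[i] = a * x[i] + y[i];
}
""")


def render_kernel(dtype: str, n: int) -> str:
    """Return CUDA C source specialised for one data type and array length."""
    if dtype not in ("float", "double"):
        raise ValueError(f"unsupported dtype: {dtype}")
    return KERNEL_TEMPLATE.substitute(dtype=dtype, n=n)


source = render_kernel("double", 100000)
print(source)
```

Because the array length and data type are baked into the source, the compiler sees them as constants, which is one reason generated kernels can be faster than fully generic ones.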

  9. The FR in PyFR • Uses the flux reconstruction (FR) approach; • can recover well-known schemes including nodal discontinuous Galerkin (DG) methods. • Lots of element-local structured compute.

  10. The FR in PyFR • Majority of operations are block-by-panel type matrix multiplications: C = A B, with A of size M × K, B of size K × N and C of size M × N • where N ~ 10^5 and N ≫ (M, K).
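
A small pure-Python sketch of the block-by-panel shape may help: a small block A multiplies a wide panel B, and every column of the result depends only on A and the matching column of B. The helper names `matmul` and `matmul_by_column_chunks` are hypothetical, and N is 8 here rather than ~10^5 so the example stays readable.

```python
def matmul(a, b):
    """Multiply an M x K matrix by a K x N matrix, both given as lists of rows."""
    m, k, n = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must agree"
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def matmul_by_column_chunks(a, b, chunk):
    """Same product, computed one column chunk of the panel B at a time."""
    n = len(b[0])
    c = [[] for _ in a]
    for start in range(0, n, chunk):
        b_chunk = [row[start:start + chunk] for row in b]
        for c_row, part in zip(c, matmul(a, b_chunk)):
            c_row.extend(part)
    return c


# Small block A (M = 2, K = 3) and a wider panel B (K = 3, N = 8).
A = [[1, 0, 2],
     [0, 3, 1]]
B = [[j + p for j in range(8)] for p in range(3)]

C = matmul(A, B)
assert C == matmul_by_column_chunks(A, B, chunk=3)
print(len(C), "x", len(C[0]))
```

The chunked version gives the same result as the direct one, which is what lets the long N dimension be split freely across threads while the small block A is reused for every column.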

  11. The FR in PyFR • In parallel, only simple halo exchanges are required between MPI ranks.
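
The communication pattern of a halo exchange can be sketched serially, without MPI. In this toy model each list stands in for one rank's partition of a 1D domain, with one ghost cell at each end; the function name `exchange_halos` and the data layout are hypothetical. In a real run the copies would be messages between ranks, for example sent with mpi4py.

```python
def exchange_halos(parts):
    """Fill each partition's ghost cells from its neighbours' edge values.

    Each partition is a list laid out as [left ghost, interior..., right ghost].
    Partitions at the ends of the domain keep their outer ghost unchanged.
    """
    for rank, part in enumerate(parts):
        if rank > 0:
            part[0] = parts[rank - 1][-2]   # neighbour's last interior value
        if rank < len(parts) - 1:
            part[-1] = parts[rank + 1][1]   # neighbour's first interior value


# Three stand-in "ranks", each owning two interior values; None marks an unset ghost.
parts = [[None, 1, 2, None],
         [None, 3, 4, None],
         [None, 5, 6, None]]
exchange_halos(parts)
print(parts)
```

Only the edge values cross partition boundaries; the interior values never move, which is why the exchange stays cheap relative to the element-local compute.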

  12. The FR in PyFR • FR is a great fit for modern hardware. • Previous GTC talks have outlined the key tenets of an efficient multi-GPU capable implementation: • GTC 2014: "PyFR: Technical Challenges of Bringing Next Generation Fluid Dynamics to GPUs" • GTC 2015: "GiMMiK: Generating Bespoke Matrix Multiplication Kernels"
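
The idea of a bespoke multiplication kernel can be sketched as follows: when the matrix A is fixed and known at run time, a generator can write out code with the entries of A as literals and skip the zero entries altogether. The sketch below generates Python source rather than CUDA so that it runs anywhere, and it handles a single column of B (a matrix-vector product). The function `generate_matvec` is hypothetical and is not GiMMiK's API.

```python
def generate_matvec(a, name="bespoke_matvec"):
    """Generate Python source for y = A x with the entries of A hard-coded.

    Zero entries are skipped, so the generated code does no work for them.
    """
    lines = [f"def {name}(x):", "    return ["]
    for row in a:
        terms = [f"{val!r} * x[{j}]" for j, val in enumerate(row) if val != 0]
        lines.append("        " + (" + ".join(terms) or "0.0") + ",")
    lines.append("    ]")
    return "\n".join(lines)


A = [[0.5, 0.0, 0.0],
     [0.0, 2.0, -1.0]]

source = generate_matvec(A)
print(source)

# Compile the generated source at run time and fetch the new function.
namespace = {}
exec(source, namespace)
bespoke_matvec = namespace["bespoke_matvec"]
print(bespoke_matvec([2.0, 3.0, 4.0]))
```

For this A, the generated function contains three multiplications instead of the six a dense product would perform.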

  13. PyFR Scaling • Evaluated on the Piz Daint cluster at CSCS. • Test case is a NACA 0021 aerofoil at a high angle of attack. • Animation courtesy of J.S. Park

  14. PyFR Strong Scaling • [Plot: % of peak FLOP/s (axis 0 to 100) against number of K20X GPUs (50, 100, 200, 400).]

  15. PyFR Weak Scaling • [Plot: % of peak FLOP/s (axis 0 to 100) against number of K20X GPUs (2, 4, 8, 40, 80, 160, 2000); annotation: 1.31 PFLOP/s.]
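
The weak scaling figures can be turned into a per-GPU rate with simple arithmetic. The sketch below assumes that the 1.31 PFLOP/s annotation belongs to the 2000-GPU data point, which is a reading of the plot and not something the slide text states. The peak value is also an assumption: 1.31 TFLOP/s is the double-precision peak NVIDIA quotes for the K20X, and the slides do not say which precision or peak figure the plot uses.

```python
def per_gpu_flops(total_flops: float, n_gpus: int) -> float:
    """Sustained FLOP/s per GPU from an aggregate figure."""
    return total_flops / n_gpus


def fraction_of_peak(sustained: float, peak: float) -> float:
    """Sustained rate as a fraction of a stated peak rate."""
    return sustained / peak


total = 1.31e15   # 1.31 PFLOP/s, from the weak scaling slide
n_gpus = 2000     # largest GPU count on the same slide (assumed to match)

sustained = per_gpu_flops(total, n_gpus)
print(f"{sustained / 1e9:.0f} GFLOP/s per GPU")

# ASSUMPTION: vendor-quoted double-precision peak for one K20X;
# the precision used in the runs is not stated in the slides.
assumed_peak = 1.31e12
print(f"{fraction_of_peak(sustained, assumed_peak):.0%} of the assumed peak")
```

Swapping in a different peak figure changes only `assumed_peak`; the per-GPU sustained rate follows directly from the two numbers on the slide.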

  16. So The Solver Scales • There’s a lot more to a code than just the solver… • and it all needs to scale.

  17. Traditional Visualisation • Traditional visualisation pipeline with PyFR:

  19. Traditional Visualisation • Disk I/O… • like device ↔ host transfers, only slower • …much slower! • [Bar chart: bandwidth in MiB/s (axis 0 to 7000) for device ↔ host transfers and for disk.]

  20. In-situ Visualisation • Cut out the middle men…

  21. In-situ Visualisation • Cut out the middle men… • Using ParaView Catalyst it is possible to avoid disk I/O…

  22. In-situ Visualisation • Pipeline with Catalyst… • [Diagram: solution → triangle list] • majority of processing performed on the host with VTK.

  23. In-situ Visualisation • Can we do better? • Yes!

  24. In-situ Visualisation • Interface with PyFR using the plugin infrastructure. • [Diagram: PyFR plugin → CUDA pointer → C++ shared library]
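
The pointer hand-off in this design can be illustrated with Python's standard `ctypes` module. This is only a stand-in: a host buffer plays the role of solver-owned device memory, and a Python function plays the role of the C++ shared library, whose actual interface is not given in the slides. The name `consumer_sum` is hypothetical. What the sketch shows is that passing a raw address lets the consumer read the data in place, without a copy.

```python
import ctypes


def consumer_sum(address: int, n: int) -> float:
    """Stand-in for the library side: view n doubles at `address` without copying."""
    view = (ctypes.c_double * n).from_address(address)
    return sum(view)


# Producer side: a host buffer standing in for solver-owned device memory.
n = 4
buffer = (ctypes.c_double * n)(1.0, 2.0, 3.0, 4.0)
address = ctypes.addressof(buffer)

print(consumer_sum(address, n))

# Writes through the view are visible to the owner, confirming no copy was made.
(ctypes.c_double * n).from_address(address)[0] = 10.0
print(buffer[0])
```

With a real device pointer the address would refer to GPU memory, so the consumer would also need to run on the device; that is the point of pairing the plugin with a device-side visualisation library.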

  25. In-situ Visualisation • Pipeline with Catalyst and VTK-m… • [Diagram: solution → triangle list] • all compute performed on the device.

  27. In-situ Visualisation • Kitware: Utkarsh Ayachit, T.J. Corona, David DeMarle, Berk Geveci, Robert Maynard, Robert O’Bara, Patrick O’Leary • NVIDIA: Bhushan Desam, Tom Fogal, Peter Messmer, Jeremy Purches • ORNL: Jack Wells • Zenotech: Mark Allan, Jamil Appa, Andrei Cimpoeru, David Standingford • Imperial College: Arvind Iyer, Jin Seok Park, Brian Vermeire

  28. In-situ Visualisation Animation courtesy of A.S. Ayer

  30. Summary • Funded and supported by • Any questions? • E-mail: freddie.witherden08@imperial.ac.uk • Website: http://pyfr.org
