Multi-scale Application Software Development Ecosystem on ARM


  1. Multi-scale Application Software Development Ecosystem on ARM. Dr. Xiaohu Guo, STFC Hartree Centre, UK

  2. STFC's Sites: Daresbury Laboratory, Daresbury Science and Innovation Campus, Warrington, Cheshire • Rutherford Appleton Laboratory, Harwell Science and Innovation Campus, Didcot, Oxfordshire • Polaris House, Swindon, Wiltshire • UK Astronomy Technology Centre, Edinburgh, Scotland • Chilbolton Observatory, Stockbridge, Hampshire • Isaac Newton Group of Telescopes, La Palma • Joint Astronomy Centre, Hawaii

  3. Overview • Multiscale simulation framework • Our early porting experience on the Isambard Arm ThunderX2 system • Discussion and future work

  4. Multiple Scales of Materials Modelling: FF mapping via DL_FIELD • Coarse graining via DL_CGMAP • MC via DL_MONTE • MS&MD via DL_POLY • DPD & LB via DL_MESO • KMC via DL_AKMC • QM/MM bridging via ChemShell

  5. Multi-scale Simulation Software Ecosystem

  6. User Community: Annual Downloads & Valid e-Mail List Size • 2010: DL_POLY (2+3+MULTI), 1,000 (list end) • 2017: DL_POLY_4, 4,200 (list start 2011) • 2016 downloads by web registration: UK 19.2%, EU excl. UK 18.7%, USA 11.4%, India 10.3%, China 9.4%, France 5.9%, London 5.5%, Sofia 2.0%, Beijing 1.8% (chart legend: DL_POLY_2, DL_POLY_3, DL_POLY_4, DL_POLY_C web registrations)

  7. DL_POLY: MD code (thanks to Dr. Ilian Todorov) • Drug polymorphs & discovery • Membranes' processes • DNA strands • Protein solvation, dynamics & binding • Dynamics at interfaces & of phase transformations • Dynamic processes in crystalline & amorphous solids: damage and recovery • Metal-organic & organic frameworks

  8. DL_MESO: Mesoscale Simulation Toolkit • General-purpose, highly scalable mesoscopic simulation software (developed for CCP5/UKCOMES): Lattice Boltzmann Equation (LBE) and Dissipative Particle Dynamics (DPD) • >800 academic registrations (science and engineering) • Extensively used for the Computer Aided Formulation (CAF) project with a TSB-funded industrial consortium • Thanks to Dr. Michael Seaton

  9. CFD software in the macro-scale region • Importance: Hartree Centre key technologies, aligned with SCD missions and STFC global challenge schemes • Methods: FEM and SPH/ISPH • Applications and projects: nuclear, Schlumberger oil reservoir modelling, Manchester Bob, wave impact on a BP oil rig, tsunami, NERC ocean roadmap, EPSRC MAGIC, CCP-WSI

  10. Concurrent Coupling Toolkit: MUI (Multiscale Universal Interface), a data exchange interface based on data points; example application: DPD and SPH coupling. Reference: Yu-Hang Tang et al., Multiscale Universal Interface: A concurrent framework for coupling heterogeneous solvers, Journal of Computational Physics, Volume 297, 2015, Pages 13-31
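MUI itself is a header-only C++ library in which each solver pushes point samples into a shared interface, commits them for a time frame, and fetches interpolated values pushed by its partner. The sketch below is not the MUI API: the Interface class and its method names are illustrative stand-ins that show the push/commit/fetch pattern in a single process.

```cpp
// Minimal sketch of the push/commit/fetch coupling pattern that MUI provides.
// NOT the MUI API: Interface and its methods are illustrative stand-ins for
// point-cloud data exchange between two coupled solvers.
#include <cmath>
#include <iostream>
#include <map>
#include <vector>

struct Sample { double x; double value; };          // 1D point + field value

class Interface {
    std::map<int, std::vector<Sample>> frames_;     // committed data per time step
    std::vector<Sample> pending_;                   // pushed but not yet committed
public:
    void push(double x, double value) { pending_.push_back({x, value}); }
    void commit(int step) { frames_[step] = pending_; pending_.clear(); }

    // Gaussian-weighted spatial interpolation of the frame committed at 'step'.
    double fetch(double x, int step, double h = 0.1) const {
        double num = 0.0, den = 0.0;
        for (const Sample& s : frames_.at(step)) {
            double w = std::exp(-(s.x - x) * (s.x - x) / (2.0 * h * h));
            num += w * s.value;
            den += w;
        }
        return den > 0.0 ? num / den : 0.0;
    }
};

int main() {
    Interface iface;

    // "Solver A" (e.g. DPD) pushes its boundary field at step 0, then commits.
    for (int i = 0; i <= 10; ++i) {
        double x = 0.1 * i;
        iface.push(x, std::sin(x));                 // field value at particle position
    }
    iface.commit(0);

    // "Solver B" (e.g. SPH) fetches interpolated values at its own points.
    for (double x : {0.05, 0.55, 0.95})
        std::cout << "u(" << x << ") ~ " << iface.fetch(x, 0) << "\n";
}
```

In a real MUI coupling the two solvers run as separate MPI programs, and the fetch interpolates across the process boundary using configurable spatial and temporal samplers.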

  11. Algorithms Abstraction and Programming Implementation • Mesh-based methods (FEM, FDM, FVM): unstructured mesh topology management, pre/post processing, basic math operators, FEM matrix assembly, mesh adaptivity • Particle-based methods (MD, DPD, SPH/ISPH): pre/post processing, nearest neighbour list search (a cell-list sketch follows below), basic particle math operators, particle refinement • Shared components: sparse/dense linear solvers, DDM/DLB, mesh/particle reordering • Programming models and languages: MPI, OpenMP, CUDA, OpenCL, OpenACC; C/C++, Fortran, Python
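As a concrete instance of the nearest neighbour list search shared by MD, DPD and SPH/ISPH, here is a minimal cell-list sketch; the names and parameters are hypothetical, and production codes such as DL_POLY and DL_MESO use domain-decomposed linked-cell variants of the same idea.

```cpp
// Illustrative cell-list neighbour search: bin particles into cells of side
// rcut, then test only particles in the 3x3 block of surrounding cells.
#include <cmath>
#include <iostream>
#include <map>
#include <utility>
#include <vector>

struct Particle { double x, y; };

// Return, for each particle, the indices of all other particles within rcut.
std::vector<std::vector<int>> neighbour_lists(const std::vector<Particle>& p, double rcut) {
    std::map<std::pair<int, int>, std::vector<int>> cells;   // cell -> particle indices
    auto cell_of = [rcut](const Particle& q) {
        return std::make_pair((int)std::floor(q.x / rcut), (int)std::floor(q.y / rcut));
    };
    for (int i = 0; i < (int)p.size(); ++i)
        cells[cell_of(p[i])].push_back(i);

    std::vector<std::vector<int>> nbrs(p.size());
    const double r2 = rcut * rcut;
    for (int i = 0; i < (int)p.size(); ++i) {
        auto c = cell_of(p[i]);
        for (int dx = -1; dx <= 1; ++dx)          // only neighbouring cells can
            for (int dy = -1; dy <= 1; ++dy) {    // contain particles within rcut
                auto it = cells.find({c.first + dx, c.second + dy});
                if (it == cells.end()) continue;
                for (int j : it->second) {
                    double ddx = p[i].x - p[j].x, ddy = p[i].y - p[j].y;
                    if (j != i && ddx * ddx + ddy * ddy < r2)
                        nbrs[i].push_back(j);
                }
            }
    }
    return nbrs;
}

int main() {
    std::vector<Particle> p = {{0.0, 0.0}, {0.3, 0.1}, {2.0, 2.0}};
    auto nbrs = neighbour_lists(p, 0.5);
    std::cout << "particle 0 has " << nbrs[0].size() << " neighbour(s)\n";  // expects 1
}
```

Binning into cells of side rcut reduces the neighbour search from O(N^2) pair tests to a scan over a constant number of nearby cells per particle.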

  12. Porting the Software Framework on the ARM Platform

  13. Isambard system specification (PI: Prof. Simon McIntosh-Smith, University of Bristol / GW4 Alliance) • 10,752 Armv8 cores (168 x 2 x 32) • Cavium ThunderX2, 32 cores at 2.1 GHz • Cray XC50 'Scout' form factor • High-speed Aries interconnect • Cray HPC-optimised software stack: CCE, Cray MPI, math libraries, CrayPAT, ... • Phase 2 (the Arm part): delivered Oct 22nd, handed over Oct 29th, accepted Nov 9th!

  14. Performance on mini-apps (node level comparisons) Thanks to Prof. Simon McIntosh-Smith

  15. Single node performance results https://github.com/UoB-HPC/benchmarks Thanks to Prof. Simon McIntosh-Smith

  16. Early DL_POLY Performance Results

  17. Early DL_MESO Performance Results

  18. Early ISPH Performance Results

  19. Performance comparison with our Scafell Pike system

  20. Current Arm software ecosystem • Three mature compiler suites: GNU (gcc, g++, gfortran); Arm HPC Compilers, based on LLVM (armclang, armclang++, armflang); Cray Compiling Environment (CCE) • Three mature sets of math libraries: OpenBLAS + FFTW; Arm Performance Libraries (BLAS, LAPACK, FFT); Cray LibSci + Cray FFTW (a portable CBLAS example follows below) • Multiple performance analysis and debugging tools: Arm Forge (MAP + DDT, formerly Allinea); CrayPAT / perftools, CCDB, gdb4hpc, etc.; TAU, Scalasca, Score-P, PAPI, MPE
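One practical benefit of having three interchangeable maths libraries is that code written against the standard BLAS/CBLAS interface moves between them without source changes; only the link line differs. A minimal sketch, assuming a CBLAS header is provided by whichever library is loaded (OpenBLAS, Arm Performance Libraries or Cray LibSci):

```cpp
// Portable BLAS usage: the same cblas_dgemm call works whether the binary is
// linked against OpenBLAS, Arm Performance Libraries or Cray LibSci, since all
// expose the standard CBLAS interface. Only the link flags change.
#include <cblas.h>
#include <iostream>
#include <vector>

int main() {
    const int n = 2;
    std::vector<double> A = {1.0, 2.0, 3.0, 4.0};   // 2x2, row-major
    std::vector<double> B = {5.0, 6.0, 7.0, 8.0};
    std::vector<double> C(n * n, 0.0);

    // C = 1.0 * A * B + 0.0 * C
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                n, n, n, 1.0, A.data(), n, B.data(), n, 0.0, C.data(), n);

    std::cout << "C = [" << C[0] << ", " << C[1] << "; "
              << C[2] << ", " << C[3] << "]\n";      // expect [19, 22; 43, 50]
}
```

The exact link flags depend on the installed modules: for example selecting -lopenblas, linking the Arm PL library, or simply relying on the Cray compiler wrappers to pull in LibSci.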

  21. More ARM productivity features needed! • The ARM processor does not trap integer divide by zero: an architectural decision, no signal is thrown, and the result is simply zero (1/0 == 0); floating-point divide by zero does trap with SIGFPE • You need the latest autoconf and automake: update your config.guess and config.sub • Weak memory model: your lock-free threading implementation may not work here (see the sketch below) • How can we use NVIDIA GPUs? • More math libraries? • DD/DLB libraries? • Sparse linear solvers? Threaded libraries in particular?
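The weak-memory point deserves emphasis: a producer/consumer hand-off written with plain loads and stores may appear to work on x86's stronger ordering yet reorder on Armv8. A minimal sketch, using only the C++ standard library, of the portable fix with release/acquire atomics:

```cpp
// Producer/consumer hand-off that is safe on Arm's weak memory model.
// With a plain (non-atomic) 'ready' flag, the consumer could observe
// ready == true before the payload store becomes visible on Armv8, even
// though the same code may happen to work on x86.
#include <atomic>
#include <iostream>
#include <thread>

int payload = 0;
std::atomic<bool> ready{false};

void producer() {
    payload = 42;                                   // ordinary store
    ready.store(true, std::memory_order_release);   // publishes the payload
}

void consumer() {
    while (!ready.load(std::memory_order_acquire))  // pairs with the release store
        std::this_thread::yield();
    std::cout << "payload = " << payload << "\n";   // guaranteed to print 42
}

int main() {
    std::thread t1(producer), t2(consumer);
    t1.join();
    t2.join();
}
```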

  22. Software Ecosystem on Isambard

  23. Motivation: Performance Optimization Space

  24. Summary and conclusion • These are early results, generated quickly in the first few days, with no time yet to tune scaling etc.; we expect them to improve further as we continue to work on the codes • The software stack has been robust, reliable and high quality (both the commercial and open-source parts)

  25. Thanks. Questions?

  26. GROMACS scalability, up to 8,192 cores (thanks to Prof. Simon McIntosh-Smith). http://gw4.ac.uk/isambard/
