Meshes/no meshes, radiation and tasks: numerical algorithms for the future of astrophysical simulations
Bert Vandenbroucke (bv7@st-andrews.ac.uk)
with Pedro Gonnet (Google), Matthieu Schaller (Leiden) and Josh Borrow (Durham)
Slide 2 of 51 Why do we use simulations? Sedov, L. I., Similarity and Dimensional Methods in Mechanics, 10th edition (CRC Press, 1993), pp. 261-290
Slide 3 of 51 Why do we use simulations? X-ray: NASA/CXC/Rutgers/J.Hughes; Optical: NASA/STScI
Slide 4 of 51 Why do we use simulations? Gravity + hydrodynamics + gas physics (cooling/heating) + star formation + stellar feedback + AGN physics. Image: EAGLE Project
Slide 5 of 51 SOME HISTORY...
Slide 6 of 51 Holmberg experiment Holmberg, 1941
Slide 7 of 51 Hydrodynamics Grid-based 1D methods. However (Gingold & Monaghan, 1977), the cell count scales with dimension: 20 cells in 1D become 400 cells in 2D and 8,000 (!) cells in 3D; 1,000 cells in 1D become 1,000,000 cells in 2D and 1,000,000,000 cells in 3D.
Slide 8 of 51 Smoothed Particle Hydrodynamics Gingold & Monaghan, 1977
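As a reminder of the basic idea, here is a minimal Python sketch of the SPH density estimate, rho_i = sum_j m_j W(|r_i - r_j|, h): every particle's density is a kernel-weighted sum over its neighbours. The brute-force double loop and the cubic spline kernel are illustrative assumptions; production codes use a proper neighbour search and a variety of kernels.

```python
import numpy as np

def cubic_spline_kernel(r, h):
    """3D cubic spline kernel with compact support 2h."""
    q = r / h
    sigma = 1.0 / (np.pi * h**3)  # 3D normalisation constant
    if q < 1.0:
        return sigma * (1.0 - 1.5 * q**2 + 0.75 * q**3)
    if q < 2.0:
        return sigma * 0.25 * (2.0 - q)**3
    return 0.0

def sph_density(positions, masses, h):
    """Kernel-weighted density estimate for every particle."""
    n = len(positions)
    rho = np.zeros(n)
    for i in range(n):
        for j in range(n):
            r = np.linalg.norm(positions[i] - positions[j])
            rho[i] += masses[j] * cubic_spline_kernel(r, h)
    return rho
```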
Slide 9 of 51 SPH approximately 13,000 (!) particles Pongracic, 1987
Slide 10 of 51 Adaptive Meshes Berger & Colella, 1989; Ruffert, 1992; Saftly et al., 2013. 3 × 64³ = 786,432 cells
Slide 11 of 51 Santa Barbara cluster comparison Dark matter Gas Frenk et al., 1999
Slide 12 of 51 Blob test Agertz et al., 2007
Slide 13 of 51 Shearing layers test: SPH (GADGET-2) vs. AMR (RAMSES)
Slide 14 of 51 Bulk movements BV, public PhD presentation
Slide 15 of 51 Galilean invariance: Lagrangian vs. Eulerian, with an extra boost velocity. BV & De Rijcke, 2016
Slide 16 of 51 Angular momentum Hopkins, 2015 Springel, 2010
Slide 17 of 51 Moving mesh Springel, 2010; BV & De Rijcke, 2016
Slide 18 of 51 Moving mesh BV, public PhD presentation
Slide 19 of 51 Moving mesh
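For concreteness, a minimal sketch of the moving-mesh idea (in the spirit of Springel, 2010): build a Voronoi grid from a set of generator points, do the hydro on that grid, then drift the generators with the local fluid velocity so the grid follows the flow. The 2D setup, point count and velocities below are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import Voronoi

rng = np.random.default_rng(42)
generators = rng.random((100, 2))            # mesh-generating points
velocities = rng.normal(0.0, 0.1, (100, 2))  # stand-in fluid velocities
dt = 0.01

for step in range(10):
    grid = Voronoi(generators)  # cells are the Voronoi regions
    # ... exchange fluxes across the (moving) cell faces here ...
    generators += velocities * dt  # Lagrangian drift of the generators
```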
Slide 20 of 51 A NOTE ABOUT COMPUTERS
Slide 21 of 51 Do computers get faster?
Slide 22 of 51 Do computers get faster?
Slide 23 of 51 Do computers get faster? Verstocken et al., 2017
Slide 24 of 51 Do computers get faster? YES, but only if your software
• is incredibly parallel:
  – a huge number of independent computations
  – the same operation for many different values
• uses very little memory at a time
• generates very little output to hard disk
Slide 25 of 51 Do simulations get faster? 51,000,000 SPH particles with GADGET-2 (Gonnet, 2013). Short answer: NO
Slide 26 of 51 Algorithmic problems Neighbour finding with an octree Gonnet, 2013
Slide 27 of 51 Algorithmic changes Replace the tree with a regular grid to
• exploit symmetries
• split the data into small chunks
• explicitly avoid conflicts
NON-TRIVIAL! Gonnet, 2013 (see the sketch below)
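A minimal sketch of the grid-based neighbour search: bin the particles into cells at least one interaction range wide, so all neighbours of a particle live in its own cell or one of the 26 adjacent cells, and each cell pair becomes an independent chunk of work. The data layout here is an illustrative assumption; SWIFT's actual implementation is considerably more involved.

```python
import numpy as np
from collections import defaultdict

def build_cells(positions, cell_width):
    """Sort particle indices into a dict keyed by integer cell coordinates."""
    cells = defaultdict(list)
    for i, pos in enumerate(positions):
        key = tuple((pos // cell_width).astype(int))
        cells[key].append(i)
    return cells

def cell_pairs(cells):
    """Yield (cell, neighbouring cell) pairs; each is an independent work chunk."""
    offsets = [(dx, dy, dz) for dx in (-1, 0, 1)
               for dy in (-1, 0, 1) for dz in (-1, 0, 1)]
    for key in cells:
        for dx, dy, dz in offsets:
            other = (key[0] + dx, key[1] + dy, key[2] + dz)
            if other in cells and other >= key:  # count each pair only once
                yield key, other
```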
Slide 28 of 51 Task-based parallelism Gonnet, 2013
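Below is a minimal Python sketch of the task-based idea (in the spirit of SWIFT/QuickSched): a task becomes runnable once everything it waits for has completed, and independent tasks run concurrently on a thread pool. The Task class, the scheduler and the toy dependency graph are illustrative assumptions, not SWIFT's API.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class Task:
    def __init__(self, name, work, waits_for=()):
        self.name = name
        self.work = work               # callable doing the actual work
        self.unlocks = []              # tasks that wait for this one
        self.wait_count = len(waits_for)
        for dep in waits_for:
            dep.unlocks.append(self)

def run_tasks(tasks, n_threads=4):
    lock = threading.Lock()
    remaining = len(tasks)
    done = threading.Event()
    pool = ThreadPoolExecutor(n_threads)

    def execute(task):
        nonlocal remaining
        task.work()
        with lock:  # a finished task may unlock its dependents
            for dep in task.unlocks:
                dep.wait_count -= 1
                if dep.wait_count == 0:
                    pool.submit(execute, dep)
            remaining -= 1
            if remaining == 0:
                done.set()

    for task in tasks:  # seed the pool with tasks that wait for nothing
        if task.wait_count == 0:
            pool.submit(execute, task)
    done.wait()
    pool.shutdown()

# Toy example: two independent density tasks unlock one force task.
a = Task("density_A", lambda: print("density A"))
b = Task("density_B", lambda: print("density B"))
f = Task("force", lambda: print("force"), waits_for=(a, b))
run_tasks([a, b, f])
```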
Slide 29 of 51 Task-based parallelism SPH mode (equivalent to GADGET-2)
Slide 30 of 51 SWIFT vs. GADGET-2 Schaller, 2017; Gonnet, 2013
Slide 31 of 51 MESH-FREE HYDRODYNAMICS
Slide 32 of 51 Moving messes BV & De Rijcke, 2016
Slide 33 of 51 Mesh-free volumes: mesh-free vs. moving-mesh vs. SPH. Hopkins, 2015
Slide 34 of 51 Mesh-free hydrodynamics combines the best of both families: from the mesh-free (SPH) side it takes the volumes, neighbours and "faces" (predictable, hence efficient); from the finite-volume side it takes the hydro, with second-order reconstruction and a Riemann problem per "face" (accurate).
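The finite-volume ingredient can be sketched in a few lines: the exchange between two neighbouring volumes is a flux obtained from an (approximate) Riemann solver at their shared "face". The 1D Euler setup, the adiabatic index and the simple HLL wave-speed estimates are illustrative assumptions (the actual scheme is second order and operates on the mesh-free "faces").

```python
import numpy as np

GAMMA = 5.0 / 3.0  # adiabatic index (assumed monatomic gas)

def primitive(U):
    """Conserved (rho, rho*v, E) -> primitive (rho, v, P)."""
    rho, mom, E = U
    v = mom / rho
    P = (GAMMA - 1.0) * (E - 0.5 * rho * v * v)
    return rho, v, P

def flux(U):
    """Physical Euler flux of a conserved state."""
    rho, v, P = primitive(U)
    return np.array([rho * v, rho * v * v + P, (U[2] + P) * v])

def hll_flux(UL, UR):
    """HLL approximate Riemann solver for the face between two volumes."""
    rhoL, vL, PL = primitive(UL)
    rhoR, vR, PR = primitive(UR)
    cL = np.sqrt(GAMMA * PL / rhoL)  # sound speeds
    cR = np.sqrt(GAMMA * PR / rhoR)
    SL = min(vL - cL, vR - cR)       # simple wave-speed estimates
    SR = max(vL + cL, vR + cR)
    if SL >= 0.0:
        return flux(UL)
    if SR <= 0.0:
        return flux(UR)
    return (SR * flux(UL) - SL * flux(UR) + SL * SR * (UR - UL)) / (SR - SL)
```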
Slide 35 of 51 SWIZMO: GADGET-2 SPH + GIZMO mesh-free
Slide 36 of 51 SWIZMO
Slide 37 of 51 Is mesh-free worth it? The SWIFT hydro scheme comparison project
Slide 38 of 51 RADIATION
Slide 39 of 51 Radiation NGC 604 HST, 1995
Slide 40 of 51 Post-processing BV et al., submitted to MNRAS https://github.com/bwvdnbro/CMacIonize N. Sartorio BV & Wood, submitted to A&C
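At the core of Monte Carlo photoionization codes like CMacIonize sits a very simple sampling step: each photon packet is propagated over a randomly drawn optical depth, tau = -ln(1 - u) with u uniform in [0, 1). A minimal sketch, assuming a uniform medium (real codes accumulate tau cell by cell through the density grid):

```python
import numpy as np

rng = np.random.default_rng()

def packet_path_length(opacity):
    """Distance a photon packet travels before it interacts."""
    tau = -np.log(1.0 - rng.random())  # randomly sampled optical depth
    return tau / opacity               # path length in a uniform medium
```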
Slide 41 of 51 Radiation hydrodynamics (RHD) BV & Wood, submitted to A&C
Slide 42 of 51 Algorithmic problems
Slide 43 of 51 A task-based alternative with three task types: PACKET GENERATION (in: nothing; out: grid (part)), PACKET PROPAGATION (in: grid (part); out: grid (part)) and PACKET REEMISSION (in: grid (part); out: grid (part))
Slide 44 of 51 A task-based alternative. Each grid (part) has 27 buffers:
• 1 internal buffer
• 6 direct (face) neighbours
• 12 indirect (edge) neighbours
• 8 indirect (corner) neighbours
The packet traversal task takes an INPUT buffer and deposits photon packets in the 27 OUTPUT buffers according to the outgoing direction (absorbed photons are put in the internal OUTPUT buffer), as in the sketch below.
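A minimal sketch of that 27-buffer bookkeeping, assuming the exit direction of a packet is encoded as (dx, dy, dz) with each component in {-1, 0, 1} (names are illustrative):

```python
def buffer_index(dx, dy, dz):
    """Map an outgoing direction to one of 27 buffers;
    (0, 0, 0) is the internal buffer (index 13)."""
    return (dx + 1) * 9 + (dy + 1) * 3 + (dz + 1)

def deposit(buffers, packet, dx, dy, dz):
    """Put a packet in the OUTPUT buffer matching its exit direction;
    absorbed packets (dx = dy = dz = 0) go to the internal buffer."""
    buffers[buffer_index(dx, dy, dz)].append(packet)
```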
Slide 45 of 51 Example: serial run. 60 × 60 × 60 cell grid, 60 subgrids; 10⁷ photon packets, no reemission; 1 core, total run time: 151.757 s
Slide 46 of 51 Example: parallel run. 60 × 60 × 60 cell grid, 60 subgrids; 10⁷ photon packets, no reemission; 32 cores, total run time: 10.1375 s
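That is a speedup of 151.757 s / 10.1375 s ≈ 15× on 32 cores, i.e. a parallel efficiency of roughly 47%.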
Slide 47 of 51 Strong scaling: naive parallelization vs. better parallelization
Slide 48 of 51 A few lessons I learned so far
• One grid vs. many subgrids is not an issue: equally efficient
• Measuring performance is cheap!
• Measuring load balance is better than guessing it
• Lots of extra tricks (subgrid copies, premature launching...), BUT mostly computer science
Slide 49 of 51 CONCLUSIONS
Slide 50 of 51 Which method should I use?
Slide 51 of 51 Which code should I use?