Understanding Storage System Challenges for Parallel Scientific Simulations

Brad Settlemyer, Los Alamos National Laboratory. LA-UR-19-24811. Operated by Triad National Security, LLC for the U.S. Department of Energy's NNSA.


  1. Operated by Triad National Security, LLC for the U.S. Department of Energy's NNSA

  2. Understanding Storage System Challenges for Parallel Scientific Simulations
     Brad Settlemyer, Los Alamos National Laboratory
     LA-UR-19-24811

  3. Outline
     • Intro to Computational Science
     • VPIC Overview
       • PIC Introduction
       • VPIC Scientific Workflow
       • VPIC I/O Workloads
     • Real VPIC I/O Challenges
     4/23/19

  4. A Brief Introduction to Computational Science

  5. The Traditional Scientific Method
     • A method for understanding the physical world
     • Begins with observation
     • Some parts of the physical world are not well suited to observation
       • Galaxy formations/collisions
       • Climate models
       • Asteroid collisions
       • Fluid dynamics

  6. Incorporating Simulation into The Scientific Method
     • Computer-based simulation enables new scientific inquiry
       • Long time-scales
       • Complex interactions
       • Dangerous interactions
     • Computational Challenges

  7. Incorporating Simulation into The Scientific Method
     • Computer-based simulation enables new scientific inquiry
       • Long time-scales
       • Complex interactions
       • Dangerous interactions
     • Computational Challenges
       • Tightly-coupled simulations imply bulk-synchronous I/O
       • A single job may require months of compute time

  8. 1. Create Mesh (Computational Science Workflow)
     [Figure: example mesh types (fixed mesh, adaptive mesh, mesh deformation) with example applications (valves and cylinders, turbulent combustion, shock propagating in fluid)]

  9. 2. Calculate Physics (Computational Science Workflow)
     • Often takes weeks or months
     • Figure shows particle-in-cell (PIC) method
     • Many other methods
       • Finite Element Methods
       • Finite Difference Methods
       • Monte Carlo Methods
     • The actual scientific question being answered typically favors one method or another

  10. 3. Generate Data (Computational Science Workflow)
      • Simulation pauses when all processes reach some interesting point in the simulation
        • Save state to protect against a failure (checkpoint/restart)
        • Save state for later analysis
      • Machine failures and scientific insight occur at different frequencies
      • Once I/O is complete, simulation resumes
      [Figure labels: Compute/Lustre Clients, IOBB Routers, PFS (Infiniband), Lustre OSS, OST, Lustre MDS]
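The pause, dump, and resume cycle on this slide can be sketched as a time-step loop with two independent dump intervals, one for checkpoints and one for analysis output. This is a minimal single-process sketch: the function name and the interval values are hypothetical, and a real bulk-synchronous code would synchronize all processes (for example with an MPI barrier) before each dump.

```python
def run_simulation(num_steps, checkpoint_every, analysis_every):
    """Return the ordered list of events for a toy bulk-synchronous run.

    All names and intervals here are illustrative assumptions, not VPIC's API.
    """
    events = []
    for step in range(1, num_steps + 1):
        events.append(("compute", step))          # advance the physics by one step
        # In a parallel code every process would synchronize here before I/O.
        if step % checkpoint_every == 0:
            events.append(("checkpoint", step))   # full state, protects against failure
        if step % analysis_every == 0:
            events.append(("analysis", step))     # state saved for later analysis
        # Computation resumes only after any dump for this step has completed.
    return events


events = run_simulation(num_steps=12, checkpoint_every=4, analysis_every=6)
dumps = [e for e in events if e[0] != "compute"]
print(dumps)
```

Keeping the two intervals separate reflects the slide's point that failure protection and scientific insight are needed at different frequencies.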

  12. 4. Analyze Data (Computational Science Workflow)
      • Scientists analyze/visualize simulation output
        • Test and validate hypotheses
        • Source of new phenomena observations!
      • Automatic and in-situ analysis emerging as relevant to some scientific fields

  13. What makes HPC computing unique and difficult?
      • Simulation Scale
        • Frequently billions or trillions of mesh cells (1.5 PB simulations on Trinity)
        • Simulations run for weeks or months
          • Longest simulation on Trinity: 7 months
          • Longest I’ve heard of: 18 months
      • Universe tends toward disorder (entropy increases)
        • As simulation progresses, high % of memory is frequently modified
        • Tight-coupling, frequent communication due to boundary condition exchanges and load balancing over time
      • Large storage system requirements
        • Checkpoint/restart bursts to support long running jobs
        • Capacity to store large quantities of restart dumps and analysis data

  14. An Overview of VPIC

  15. Quick Particle-In-Cell (PIC) Overview
      • Particles model material
        • Millions of particles per process
        • Trillions of particles per simulation
      • Fixed Mesh
        • Method extends to 3D well
        • Each process maintains a contiguous chunk of the mesh
        • Updates fields and materials
      • Solves the Maxwell-Boltzmann kinetic equations
      • Applications in astrophysics, fusion, plasma interactions
      [Figure: particles on a fixed mesh divided among Node 0 through Node 3, shown at times t0 and t1]
      PIC Introduction: https://www.youtube.com/watch?v=CmhSWPpa_6w
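To illustrate "each process maintains a contiguous chunk of the mesh", the sketch below maps a particle position to the rank that owns that part of a fixed 2D mesh and regroups particles after they move. The 2 x 2 rank layout, the unit domain, and the function names are assumptions made for illustration; this is not VPIC's actual decomposition. Positions are assumed to lie inside the domain.

```python
def owning_rank(x, y, domain=(1.0, 1.0), ranks=(2, 2)):
    """Map a particle position to the rank that owns its chunk of a fixed mesh.

    Assumes a rectangular domain split into ranks[0] x ranks[1] equal,
    contiguous chunks (an illustrative layout, not VPIC's decomposition).
    """
    rx = min(int(x / domain[0] * ranks[0]), ranks[0] - 1)
    ry = min(int(y / domain[1] * ranks[1]), ranks[1] - 1)
    return ry * ranks[0] + rx


def migrate(particles, **kwargs):
    """Group (x, y) particle positions by owning rank after a position update."""
    by_rank = {}
    for x, y in particles:
        by_rank.setdefault(owning_rank(x, y, **kwargs), []).append((x, y))
    return by_rank


# A particle that crosses a chunk boundary between t0 and t1 changes owner.
at_t0 = migrate([(0.45, 0.25)])
at_t1 = migrate([(0.55, 0.25)])
print(at_t0, at_t1)
```

Particles that change owner between time steps are one source of the frequent communication that tightly coupled simulations require.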

  16. Why do I/O researchers use VPIC?
      • Excellent scaling
        • Demonstrated across 4096 Trinity nodes (32k processes)
      • Flexible code
        • Popular CS languages (engine is 16k SLOC C/C++)
        • Supports MPI, OpenMP, and Pthreads
        • Can be field dominant or particle dominant
        • Can be compute/comm/memory intensive

  17. VPIC’s Simulation Science Workflow
      [Figure: simulation science pipeline]
      • Phases: S1 Setup/Parameterize/Create Mesh; S2 Simulate Physics (job begin to job end); S3 Down-Sample; S4 Post-Process; S5 Viz
      • Data retention time: Forever, Campaign, Temporary
      • Data products shown: sim input deck, initial mesh setup, checkpoint dump, time-step data set, sampled data set, analysis data set
      • Frequencies shown: 5–15x per pipeline, 4–8x per week, 5–10x per week

  18. VPIC Checkpoint/Restart
      • Essential for simulations running for long duration over thousands of nodes
      • Basic paradigms: N-N, N-M, N-1
      • Typically the largest consumer of bandwidth/capacity
        • In general must store both the particles and the fields
          • Why?! Performance!
        • Approximately 80% of system memory
      • VPIC uses N-N file organization for checkpoint/restart
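The three checkpoint paradigms differ in how N processes map onto files: one file per process (N-N), M shared files (N-M), or a single shared file (N-1). The sketch below returns the file name and byte offset a given rank would write to under each paradigm. The file names, the fixed per-rank size, and the function itself are hypothetical and do not describe VPIC's on-disk format.

```python
def checkpoint_target(rank, nprocs, bytes_per_rank, paradigm, nfiles=1):
    """Return (file name, byte offset) where `rank` writes its checkpoint state.

    File names and the fixed bytes_per_rank are illustrative assumptions.
    """
    if paradigm == "N-N":            # one file per process
        return (f"ckpt.{rank}", 0)
    if paradigm == "N-1":            # all processes share one file
        return ("ckpt.shared", rank * bytes_per_rank)
    if paradigm == "N-M":            # processes grouped into nfiles files
        ranks_per_file = -(-nprocs // nfiles)   # ceiling division
        group, slot = divmod(rank, ranks_per_file)
        return (f"ckpt.group{group}", slot * bytes_per_rank)
    raise ValueError(f"unknown paradigm: {paradigm}")


for paradigm in ("N-N", "N-M", "N-1"):
    print(paradigm, checkpoint_target(5, 8, 100, paradigm, nfiles=2))
```

N-N avoids coordinating offsets within a shared file, at the cost of creating one file per process for every dump.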

  19. HPC Checkpoint Workload
      • LANL’s Trinity Platform
        • Computer: Memory (2 PiB, PiB/s)
        • Burst Buffer (3.5 PB, 2.5 TB/s)
        • Lustre PFS (78 PB, 1.1 TB/s)
        • Campaign/Archive
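A rough estimate of checkpoint drain time follows from these figures combined with the earlier statement that a checkpoint is approximately 80% of system memory. The sketch below is back-of-the-envelope arithmetic only: it assumes the full quoted bandwidth is achieved and ignores metadata cost, contention, and stragglers.

```python
PIB = 2**50          # bytes in one pebibyte
TB = 10**12          # bytes in one terabyte

memory_bytes = 2 * PIB          # Trinity memory capacity from the slide
checkpoint_fraction = 0.80      # "approximately 80% of system memory"
checkpoint_bytes = checkpoint_fraction * memory_bytes


def drain_seconds(nbytes, bandwidth_bytes_per_s):
    """Ideal transfer time at a sustained bandwidth (an optimistic assumption)."""
    return nbytes / bandwidth_bytes_per_s


burst_buffer_s = drain_seconds(checkpoint_bytes, 2.5 * TB)
lustre_s = drain_seconds(checkpoint_bytes, 1.1 * TB)
print(f"burst buffer: {burst_buffer_s / 60:.1f} min")
print(f"Lustre PFS:   {lustre_s / 60:.1f} min")
```

Under these assumptions a full checkpoint takes about 12 minutes to reach the burst buffer and about 27 minutes to reach the Lustre file system.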

  20. VPIC Time Step Data Sets
      • Types of data
        • Particles (32–48 bytes each)
        • Fields (typically <1k, but could be much more)
        • Cell Materials (often 0 bytes)
      • 2 primary methods for data reduction
        • Sampling (mean, spatial average, etc.)
        • Decimation
      • Scientist typically determines the processing methods needed
        • Frequently not well optimized
        • Bandwidth bound and performed on front ends
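The two reduction methods can be sketched on a toy particle list. The (position, energy) pair layout, the 1D cells, and the function names are assumptions for illustration; real VPIC particles carry more fields, and the reduction a scientist chooses depends on the question being asked.

```python
def decimate(particles, keep_every):
    """Decimation: keep every keep_every-th particle and drop the rest."""
    return particles[::keep_every]


def cell_mean_energy(particles, num_cells, domain_length):
    """Sampling by spatial average: mean energy per cell along one axis.

    Each particle is a (position, energy) pair; the 1D layout is an
    illustrative assumption.
    """
    totals = [0.0] * num_cells
    counts = [0] * num_cells
    for position, energy in particles:
        cell = min(int(position / domain_length * num_cells), num_cells - 1)
        totals[cell] += energy
        counts[cell] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


def dump_bytes(num_particles, bytes_per_particle):
    """Raw size of a particle dump before any reduction."""
    return num_particles * bytes_per_particle


particles = [(0.1, 1.0), (0.2, 3.0), (0.6, 5.0), (0.9, 7.0)]
print(decimate(particles, 2))
print(cell_mean_energy(particles, num_cells=2, domain_length=1.0))
print(dump_bytes(10**12, 32), dump_bytes(10**12, 48))
```

At 32 to 48 bytes per particle, a trillion-particle dump is 32 to 48 TB before reduction, which is why sampling or decimation is applied before analysis.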

  21. VPIC Visualization
      • Format the data into a parallel visualization format
        • ParaView, EnSight, VisIt, etc.
      • Visualization workflows are typically bound on read performance
        • Interactivity defeats pre-fetching algorithms
        • Viewing doesn’t always occur along the contiguous dimension
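The cost of viewing along a non-contiguous dimension can be made concrete by counting how many separate contiguous runs a reader must fetch to load one 2D slice of a 3D array. The sketch assumes row-major (C order) storage and axis sizes greater than one; the function name is hypothetical.

```python
def contiguous_runs(shape, fixed_axis):
    """Count contiguous runs needed to read one 2D slice of a 3D array.

    Assumes row-major (C order) storage, so the last axis is contiguous.
    Returns (number_of_runs, elements_per_run).
    """
    nx, ny, nz = shape
    if fixed_axis == 0:      # fix x: the whole (ny, nz) plane is one run
        return (1, ny * nz)
    if fixed_axis == 1:      # fix y: one run of nz elements per x
        return (nx, nz)
    if fixed_axis == 2:      # fix z: every element is its own run
        return (nx * ny, 1)
    raise ValueError("fixed_axis must be 0, 1, or 2")


for axis in range(3):
    print(axis, contiguous_runs((1024, 1024, 1024), axis))
```

For a 1024-cubed array, a slice at fixed x is a single large read, while a slice at fixed z needs over a million single-element reads, which is the access pattern that defeats prefetching.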

  22. Real VPIC I/O Challenges

  23. Tracking the Trajectory of High Energy Particles
      Assumptions:
      • Simulation has trillions of particles
      • Highest energy particles only known at simulation end
      • Insufficient memory to track the history of each particle
      Goal:
      • Determine if the trajectory of the high-energy particles follows Fermi acceleration between magnetic islands
      • Highly selective queries
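The query pattern this slide describes can be sketched in two passes: find the highest-energy particles in the final dump, then look those few IDs up in every earlier dump. The dump layout (a dictionary from particle ID to a (position, energy) pair) and the function names are assumptions for illustration, not a VPIC output format.

```python
def top_energy_ids(final_dump, k):
    """IDs of the k highest-energy particles in the last time-step dump."""
    ranked = sorted(final_dump, key=lambda pid: final_dump[pid][1], reverse=True)
    return ranked[:k]


def trace_back(dumps, ids):
    """Collect each selected particle's position from every dump, in time order."""
    return {pid: [dump[pid][0] for dump in dumps if pid in dump] for pid in ids}


# Two toy time-step dumps: particle ID -> (position, energy).
dumps = [
    {1: (0.0, 1.0), 2: (0.5, 1.0), 3: (0.9, 1.0)},
    {1: (0.1, 2.0), 2: (0.4, 9.0), 3: (0.8, 3.0)},
]
selected = top_energy_ids(dumps[-1], k=1)
print(trace_back(dumps, selected))
```

With trillions of particles and only a handful selected, each per-dump lookup touches a tiny fraction of the stored data, which is what makes the workload highly selective.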

  24. Spatial distribution of particles within energy band
      Assumptions:
      • Simulation has trillions of particles
      • Energy distribution changing over time
      Goals:
      • Filter particles by energy band to examine the spatial location of energy bands
      • Scan intensive workload
      Image and problem from “Parallel I/O, Analysis, and Visualization of a Trillion Particle Simulation,” Byna, et al.
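In contrast to the selective trajectory query, filtering by energy band reads every particle. The sketch below scans a toy particle list and counts the particles inside a band per spatial bin. The (position, energy) pair layout, the 1D bins, and the function name are illustrative assumptions.

```python
def spatial_histogram(particles, band, num_bins, domain_length):
    """Scan every particle and count those inside an energy band per spatial bin.

    particles: iterable of (position, energy); band: (low, high), low inclusive.
    """
    low, high = band
    counts = [0] * num_bins
    for position, energy in particles:       # full scan: no index is assumed
        if low <= energy < high:
            b = min(int(position / domain_length * num_bins), num_bins - 1)
            counts[b] += 1
    return counts


particles = [(0.1, 1.0), (0.2, 5.0), (0.7, 6.0), (0.9, 2.0)]
print(spatial_histogram(particles, band=(4.0, 10.0), num_bins=2, domain_length=1.0))
```

Because the energy distribution changes over time, the scan has to be repeated for each time step of interest.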

  25. The tip of the iceberg …
      • Where are the largest clusters of similarly charged particles (i.e. magnetic islands)?
      • Which particles have most recently moved between magnetic islands?
      • Which particles are moving as groups and how are they moving?
      • Is it possible to develop a taxonomy of formations that occur during a magnetic reconnection?
      • And more …

  26. Conclusions
      • VPIC is an excellent resource for I/O researchers
        • Open source
        • Popular programming languages (subsets)
        • Doesn’t require exotic compilers
        • Highly scalable
        • Important scientific problems
      • VPIC scientists have real I/O problems
        • A VPIC researcher has consumed all of the Trinity storage system’s inodes
        • Extremely small writes are an unsolved problem
        • Data analysis performance severely limits current insight

  27. Thanks!
