Operated by Triad National Security, LLC for the U.S. Department of Energy's NNSA
Los Alamos National Laboratory LA-UR-19-24811
Understanding Storage System Challenges for Parallel Scientific Simulations
Brad Settlemyer
Los Alamos National Laboratory
Outline
• Intro to Computational Science
• VPIC Overview
  • PIC Introduction
  • VPIC Scientific Workflow
  • VPIC I/O Workloads
• Real VPIC I/O Challenges
4/23/19 | 3
A Brief Introduction to Computational Science
The Traditional Scientific Method
• A method for understanding the physical world
• Begins with observation
• Some parts of the physical world are not well suited to observation
  • Galaxy formations/collisions
  • Climate models
  • Asteroid collisions
  • Fluid dynamics
Incorporating Simulation into The Scientific Method
• Computer-based simulation enables new scientific inquiry
  • Long time-scales
  • Complex interactions
  • Dangerous interactions
• Computational Challenges
  • Tightly-coupled simulations imply bulk-synchronous I/O
  • A single job may require months of compute time
1. Create Mesh (Computational Science Workflow)
• Mesh types: Fixed Mesh, Adaptive Mesh, Mesh deformation
• Figure panels: (Valves, cylinders), (Turbulent combustion), (Shock propagating in fluid)
2. Calculate Physics (Computational Science Workflow)
• Often takes weeks or months
• Figure shows particle-in-cell (PIC) method
• Many other methods
  • Finite Element Methods
  • Finite Difference Methods
  • Monte Carlo Methods
• The actual scientific question being answered typically favors one method or another
3. Generate Data (Computational Science Workflow)
• Simulation pauses when all processes reach some interesting point in the simulation
  • Save state to protect against a failure (checkpoint/restart)
  • Save state for later analysis
• Machine failures and scientific insight occur at different frequencies
• Once I/O is complete, simulation resumes
[Figure: compute nodes and Lustre clients connected through routers (Infiniband) to the Lustre PFS (OSS, OST, MDS)]
4. Analyze Data (Computational Science Workflow)
• Scientists analyze/visualize simulation output
  • Test and validate hypotheses
  • Source of new phenomena observations!
• Automatic and in-situ analysis emerging as relevant to some scientific fields
What makes HPC computing unique and difficult?
• Simulation Scale
  • Frequently billions or trillions of mesh cells (1.5 PB simulations on Trinity)
• Simulations run for weeks or months
  • Longest simulation on Trinity: 7 months
  • Longest I’ve heard of: 18 months
• Universe tends toward disorder (entropy increases)
  • As simulation progresses, high % of memory is frequently modified
  • Tight-coupling, frequent communication due to boundary condition exchanges and load balancing over time
• Large storage system requirements
  • Checkpoint/restart bursts to support long running jobs
  • Capacity to store large quantities of restart dumps and analysis data
An Overview of VPIC
Quick Particle-In-Cell (PIC) Overview
• Particles model material
  • Millions of particles per process
  • Trillions of particles per simulation
• Fixed Mesh
  • Method extends to 3D well
  • Each process maintains a contiguous chunk of the mesh
  • Updates fields and materials
• Solves the Maxwell-Boltzmann kinetic equations
• Applications in astrophysics, fusion, plasma interactions
[Figure: mesh partitioned across Node 0 through Node 3, shown at times t0 and t1]
PIC Introduction: https://www.youtube.com/watch?v=CmhSWPpa_6w
Why do I/O researchers use VPIC?
• Excellent scaling
  • Demonstrated across 4096 Trinity nodes (32k processes)
• Flexible code
  • Popular CS languages (engine is 16k SLOC C/C++)
  • Supports MPI, OpenMP, and Pthreads
  • Can be field dominant or particle dominant
  • Can be compute/comm/memory intensive
VPIC’s Simulation Science Workflow
[Figure: Simulation Science Pipeline, showing data products by data retention time]
• Pipeline stages (Phases S1 to S5): Setup/Parameterize/Create Mesh, Job Begin, Simulate Physics, Job End, Down-Sample, Post-Process, Viz
• Data retention time: Forever, Campaign, Temporary
• Data products: Sim Input Deck, Initial Mesh Setup, Checkpoint Dump, Time-step Data Set, Sampled Data Set, Analysis Data Set
• Frequencies shown: 5 - 15x per pipeline, 4 – 8x per week, 5 – 10x per week
VPIC Checkpoint/Restart
• Essential for simulations running for long duration over thousands of nodes
• Basic paradigms: N-N, N-M, N-1
• Typically the largest consumer of bandwidth/capacity
  • In general must store both the particles and the fields
  • Why?! Performance!
  • Approximately 80% of system memory
• VPIC uses N-N file organization for checkpoint/restart
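The three paradigms differ in how N writing processes map onto files. The sketch below is only an illustration: the helper `checkpoint_target` and the file names are hypothetical (not VPIC's actual naming), and it assumes every rank writes a fixed-size region.

```python
def checkpoint_target(rank, n_ranks, bytes_per_rank, pattern, m_files=1):
    """Return (file name, byte offset) where `rank` writes its state.

    Hypothetical helper for illustration; assumes fixed-size per-rank state.
    """
    if pattern == "N-N":
        # One file per process: no shared-file coordination, but N files.
        return f"checkpoint.{rank}", 0
    if pattern == "N-1":
        # All processes share one file; each writes at its own offset.
        return "checkpoint.shared", rank * bytes_per_rank
    if pattern == "N-M":
        # Processes are grouped; each group of ranks shares one of M files.
        ranks_per_file = -(-n_ranks // m_files)  # ceiling division
        file_index = rank // ranks_per_file
        slot = rank % ranks_per_file
        return f"checkpoint.{file_index}", slot * bytes_per_rank
    raise ValueError(f"unknown pattern: {pattern}")


if __name__ == "__main__":
    for pattern in ("N-N", "N-M", "N-1"):
        print(pattern, checkpoint_target(5, 8, 100, pattern, m_files=2))
```

With N-N, the file count grows with the process count, so a 32k-process job creates 32k files per checkpoint under this scheme.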
HPC Checkpoint Workload
LANL’s Trinity Platform
• Computer Memory (2 PiB, PiB/s)
• Burst Buffer (3.5 PB, 2.5 TB/s)
• Lustre PFS (78 PB, 1.1 TB/s)
• Campaign/Archive
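A back-of-the-envelope estimate of what these figures imply for a single checkpoint. It assumes a full-machine job that dumps roughly 80% of the 2 PiB of memory (the fraction quoted on the checkpoint/restart slide) and that each tier sustains its listed bandwidth, which real workloads rarely achieve. The function name is hypothetical.

```python
PIB = 2**50   # bytes in one pebibyte
TB = 10**12   # bytes in one terabyte


def checkpoint_seconds(memory_bytes, fraction, bandwidth_bytes_per_s):
    """Time to write `fraction` of memory at a sustained bandwidth."""
    return memory_bytes * fraction / bandwidth_bytes_per_s


if __name__ == "__main__":
    tiers = {"Burst Buffer": 2.5 * TB, "Lustre PFS": 1.1 * TB}
    for name, bandwidth in tiers.items():
        seconds = checkpoint_seconds(2 * PIB, 0.8, bandwidth)
        print(f"{name}: {seconds / 60:.1f} minutes")
```

Under these assumptions the dump takes about 12 minutes to the burst buffer and about 27 minutes to Lustre, which is why checkpoint bursts dominate bandwidth requirements.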
VPIC Time Step Data Sets
• Types of data
  • Particles (32 – 48 bytes each)
  • Fields (typically <1k, but could be much more)
  • Cell Materials (often 0 bytes)
• 2 primary methods for data reduction
  • Sampling (mean, spatial average, etc.)
  • Decimation
• Scientist typically determines the processing methods needed
  • Frequently not well optimized
  • Bound on bandwidth and performed on front ends
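A small sketch of the sizes and reductions involved, assuming a simulation with one trillion particles and treating the data as plain Python lists. The function names are hypothetical, and the reductions shown (keep every k-th particle, average fixed-size blocks of cells) are only the simplest forms of decimation and spatial averaging.

```python
def dump_bytes(n_particles, bytes_per_particle):
    """Raw size of the particle portion of one time-step dump."""
    return n_particles * bytes_per_particle


def decimate(particles, stride):
    """Decimation: keep every `stride`-th particle record."""
    return particles[::stride]


def spatial_average(values, block):
    """Sampling: replace each block of adjacent cell values with its mean."""
    averaged = []
    for start in range(0, len(values), block):
        chunk = values[start:start + block]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


if __name__ == "__main__":
    trillion = 10**12
    print(dump_bytes(trillion, 32) / 10**12, "TB at 32 bytes per particle")
    print(dump_bytes(trillion, 48) / 10**12, "TB at 48 bytes per particle")
    print(decimate(list(range(10)), 4))
    print(spatial_average([1.0, 3.0, 5.0, 7.0], 2))
```

At 32 to 48 bytes per particle, a trillion-particle dump is 32 to 48 TB before any reduction.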
VPIC Visualization
• Format the data into a parallel visualization format
  • ParaView, EnSight, VisIt, etc.
• Visualization workflows are typically bound on read performance
  • Interactivity defeats pre-fetching algorithms
  • Viewing doesn’t always occur along the contiguous dimension
Real VPIC I/O Challenges
Tracking the Trajectory of High Energy Particles
Assumptions:
• Simulation has trillions of particles
• Highest energy particles only known at simulation end
• Insufficient memory to track the history of each particle
Goal:
• Determine if the trajectory of the high-energy particles follows Fermi acceleration between magnetic islands
• Highly selective queries
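The query pattern can be sketched as two passes: pick the highest-energy particle IDs from the final dump, then read every earlier dump looking for just those IDs. This is an illustration only. It assumes each dump is an in-memory mapping of particle ID to a `(position, energy)` pair, which is a hypothetical layout, and both function names are made up.

```python
import heapq


def top_energy_ids(final_dump, k):
    """IDs of the k highest-energy particles in the last time-step dump."""
    return set(heapq.nlargest(k, final_dump, key=lambda pid: final_dump[pid][1]))


def trajectories(dumps, wanted_ids):
    """Scan every dump in time order and collect positions of wanted IDs.

    Without an index, each dump is read in full even though only a tiny
    fraction of the records match.
    """
    paths = {pid: [] for pid in wanted_ids}
    for dump in dumps:
        for pid, (position, _energy) in dump.items():
            if pid in wanted_ids:
                paths[pid].append(position)
    return paths


if __name__ == "__main__":
    dumps = [
        {1: (0.0, 1.0), 2: (5.0, 1.0), 3: (9.0, 1.0)},
        {1: (0.5, 2.0), 2: (5.5, 9.0), 3: (9.1, 1.5)},
    ]
    hot = top_energy_ids(dumps[-1], 1)
    print(trajectories(dumps, hot))
```

The full scan in `trajectories` is what makes the query expensive at scale: the result is tiny, but the work is proportional to the total number of particle records written.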
Spatial distribution of particles within energy band
Assumptions
• Simulation has trillions of particles
• Energy distribution changing over time
Goals
• Filter particles by energy band to examine the spatial location of energy bands
• Scan intensive workload
Image and problem from “Parallel I/O, Analysis, and Visualization of a Trillion Particle Simulation,” Byna, et al.
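In contrast to the selective trajectory query, this workload touches every particle in a dump and keeps a large share of them. A minimal one-dimensional sketch, assuming particles are `(x, energy)` pairs and that space is binned into cells of a fixed width; the function name and record layout are hypothetical.

```python
from collections import Counter


def spatial_histogram(particles, e_min, e_max, cell_width):
    """Count particles with e_min <= energy < e_max in each spatial cell."""
    counts = Counter()
    for x, energy in particles:
        if e_min <= energy < e_max:
            counts[int(x // cell_width)] += 1
    return counts


if __name__ == "__main__":
    particles = [(0.2, 1.0), (0.7, 5.0), (1.4, 5.5), (1.9, 6.0), (3.1, 9.0)]
    print(spatial_histogram(particles, 5.0, 7.0, 1.0))
```

Because the energy distribution changes over time, the filter has to be rerun on each time-step dump of interest, so cost is driven by scan bandwidth.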
The tip of the iceberg …
• Where are the largest clusters of similarly charged particles (i.e. magnetic islands)?
• Which particles have most recently moved between magnetic islands?
• Which particles are moving as groups and how are they moving?
• Is it possible to develop a taxonomy of formations that occur during a magnetic reconnection?
• And more …
Conclusions
• VPIC is an excellent resource for I/O researchers
  • Open source
  • Popular programming languages (subsets)
  • Doesn’t require exotic compilers
  • Highly scalable
  • Important scientific problems
• VPIC scientists have real I/O problems
  • A VPIC researcher has consumed all of the Trinity storage system’s inodes
  • Extremely small writes are an unsolved problem
  • Data analysis performance severely limits current insight
Thanks!