Data Management, In-Situ Workflows and Extreme Scales
Manish Parashar, Ph.D., Director, Rutgers Discovery Informatics Institute (RDI²), Distinguished Professor, Department of Computer Science
Philip Davis, Shaohua Duan, Yubo Qin, Melissa Romanus, Pradeep Subedi, Zhe Wang
ROSS 2018 @ HPDC'18, Tempe, AZ, USA, June 12, 2018
Outline
• Extreme-scale simulation-based science: opportunities and challenges
• Rethinking the simulations-to-insights pipeline: data staging and in-situ workflows
• Runtime management for data staging and in-situ workflows
  – Data placement
  – Resilience
• Conclusion
Science / Society Transformed by Compute & Data
• The scientific process has evolved to include computation & data
• Nearly every field of discovery is transitioning from "data poor" to "data rich"
• Examples from the slide graphic: Oceanography: OOI; Neuroscience: EEG, fMRI; Economics: POS terminals; Fusion: KSTAR; Sociology: the Web; Physics: LHC; Biology: sequencing; Astronomy: LSST; plus crisis management, the Internet of Things, and personalized medicine
Moving Aggressively Towards Exascale • Create systems that can apply exaflops of computing power to exabytes of data • Improve HPC application developer productivity • Establish hardware technology for future HPC systems • ….
Moving Aggressively Towards Exascale
Source: "Hyperion (IDC) Paints a Bullish Picture of HPC Future," by John Russell
Extreme Scales => Extreme Challenges
• Exponential increase in parallelism
  – Extreme core counts, concurrency
• Diversity in emerging memory and storage technologies
  – New memory technologies
  – Increasing performance gap between memory and disks
• Growing data volumes, increasing data costs
  – Data access costs vary widely with location
  – Variability and heterogeneity in data movement costs (performance, energy)
• Increasingly heterogeneous machine architectures
  – Complex CPU + accelerator architectures
  – Proliferation of accelerators
• Diverse and complex application/user requirements
  – Complex application workflows; complex mapping onto heterogeneous systems
  – Large numbers of domain scientists and non-experts
• Reliability, energy efficiency, correctness, …
Scientific Discovery through Simulations: A Big Data Problem
• Scientific simulations running on current high-end computing systems generate huge amounts of data!
  – If a single core produces 2MB/minute on average, one of these machines could generate simulation data at ~170TB per hour -> ~4PB per day -> ~1.4EB per year (see the back-of-the-envelope sketch below)
• Successful scientific discovery depends on a comprehensive understanding of this enormous simulation data
• How do we enable computational scientists to efficiently manage and explore extreme-scale data, i.e., "find the needles in the haystack"?
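The per-hour, per-day, and per-year figures above follow from simple unit conversions. The short C program below reproduces them; the ~1.4 million core count is not stated on the slide and is assumed here only so that 2MB/minute per core yields the quoted ~170TB/hour.

```c
/* Back-of-the-envelope data-rate estimate (illustrative only).
 * Assumes ~1.4 million cores, each producing ~2 MB of simulation
 * output per minute; the core count is an assumption, not a figure
 * from the talk. */
#include <stdio.h>

int main(void)
{
    const double mb_per_core_per_min = 2.0;  /* assumed per-core output rate */
    const double cores = 1.4e6;              /* assumed machine size */

    double tb_per_hour = mb_per_core_per_min * cores * 60.0 / 1e6; /* MB -> TB */
    double pb_per_day  = tb_per_hour * 24.0 / 1e3;                 /* TB -> PB */
    double eb_per_year = pb_per_day * 365.0 / 1e3;                 /* PB -> EB */

    printf("~%.0f TB/hour, ~%.1f PB/day, ~%.1f EB/year\n",
           tb_per_hour, pb_per_day, eb_per_year);
    return 0;
}
```

Running it prints roughly 168 TB/hour, 4 PB/day and 1.5 EB/year, consistent with the rounded numbers on the slide.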
Traditional Simulation -> Insight Workflows Break Down
Figure: traditional data analysis pipeline (the simulation cluster writes raw data to storage servers, which feed separate analysis/visualization machines)
• Traditional simulation -> insight pipeline
  – Run large-scale simulation workflows on large supercomputers
  – Dump data to parallel disk systems
  – Export data to archives
  – Move data to users' sites, usually selected subsets
  – Perform data manipulations and analysis on mid-size clusters
  – Collect experimental / observational data
  – Move to analysis sites
  – Perform comparison with experimental/observational data to validate simulation data
The Cost of Data Movement
• Moving data between node memory and persistent storage is slow! (a widening performance gap)
• The energy cost of moving data is a significant concern:
  Energy_to_move_data = bitrate × length² / cross_section_area_of_wire
Source: K. Yelick, "Software and Algorithms for Exascale: Ten Ways to Waste an Exascale Computer"
We Need to Rethink Extreme-Scale Simulation Workflows!
• In the traditional data analysis pipeline, the costs of data movement (power and performance) are increasing and dominating!
• We need to rethink extreme-scale simulation workflows:
  – Reduce data movement
  – Move computation/analytics closer to the data
  – Add value to simulation data along the I/O path
• => In-situ workflows, in-transit processing
Some Recent Research Addressing In-Situ
• Swift/T – Workflow coordination; all applications share an MPI context, which is split by an execution wrapper
• Catalyst and Libsim – Embed analysis/viz in simulation processes using time division
• Damaris – Leverages dedicated cores in multicore nodes to offload data management tasks
• FlowVR – Independent task coordination across processes
• Decaf – Decoupled dataflow middleware for in-situ workflows
• Bredala – Semantic data redistribution of complex data structures for in-situ applications
• SuperGlue – Standardizing glue components for HPC workflows
• Landrush – Leverages heterogeneous compute node resources, like GPUs, to run in-situ workflows
• ADIOS – Flexible I/O abstractions for end-to-end data pipelines
• Mercury – RPC and bulk message passing across applications
• FlexPath – Communication between MPI applications using a reliable transport
• DataSpaces
Rethinking the Data Management Pipeline: Hybrid Staging + In-Situ & In-Transit Execution
• Reduce data movement
• Move computation/analytics closer to the data source
• Process and transform data along the data path
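One common way to realize this hybrid model is to partition a job's MPI ranks into a simulation group and an in-transit staging/analysis group. The MPI sketch below is only illustrative and is not taken from the talk; the 90/10 rank split is an arbitrary assumption.

```c
/* Illustrative sketch: split MPI_COMM_WORLD into a simulation group and
 * an in-transit staging/analysis group. The 90/10 split is an assumption
 * made for illustration, not a recommendation from the talk. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Reserve roughly the last 10% of ranks (at least one) for staging. */
    int staging_ranks = (size / 10 > 0) ? size / 10 : 1;
    int color = (size > 1 && rank >= size - staging_ranks) ? 1 : 0;

    MPI_Comm group_comm;
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &group_comm);

    if (color == 0)
        printf("rank %d: simulation group (computes and ships data)\n", rank);
    else
        printf("rank %d: staging group (receives data, runs in-transit analysis)\n", rank);

    MPI_Comm_free(&group_comm);
    MPI_Finalize();
    return 0;
}
```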
DataSpaces: Extreme-Scale Data Staging Service
The DataSpaces abstraction: a virtual shared-space programming abstraction
• Simple API for coordination, interaction and messaging (a usage sketch follows below)
• Distributed, associative, in-memory object store
• Online data indexing, flexible querying
• Adaptive cross-layer runtime management
• Hybrid in-situ/in-transit execution
• Efficient, high-throughput/low-latency asynchronous data transport
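To make the "simple API" point concrete, the sketch below shows a producer publishing one timestep of a 2-D variable into the space and a consumer retrieving it. It follows the put/get and lock calls of the DataSpaces 1.x C API as paraphrased from the public examples; exact signatures differ between releases, and the variable name, lock name, and bounding boxes here are made up.

```c
/* Sketch of coupling two applications through the DataSpaces staging area.
 * Calls follow the style of the DataSpaces 1.x C API (dspaces_put/get plus
 * read/write locks); signatures are paraphrased and may differ by version.
 * dspaces_init()/dspaces_finalize() calls are omitted for brevity. */
#include <stdint.h>
#include <mpi.h>
#include "dataspaces.h"

/* Producer: write this rank's 8x8 block of "temperature" for timestep ts. */
void producer_step(MPI_Comm comm, double *field, unsigned int ts)
{
    uint64_t lb[2] = {0, 0}, ub[2] = {7, 7};   /* local bounding box */

    dspaces_lock_on_write("field_lock", &comm);
    dspaces_put("temperature", ts, sizeof(double), 2, lb, ub, field);
    dspaces_unlock_on_write("field_lock", &comm);
}

/* Consumer: read the same region back, addressed only by name/version/box. */
void consumer_step(MPI_Comm comm, double *buf, unsigned int ts)
{
    uint64_t lb[2] = {0, 0}, ub[2] = {7, 7};

    dspaces_lock_on_read("field_lock", &comm);
    dspaces_get("temperature", ts, sizeof(double), 2, lb, ub, buf);
    dspaces_unlock_on_read("field_lock", &comm);
}
```

Because objects are indexed by (name, version, bounding box), the consumer never needs to know which producer rank wrote which portion of the domain; the staging servers resolve the query.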
The DataSpaces Staging Abstraction
• In-memory storage distributed across a set of cores/nodes
• In-staging data processing, querying, sharing and exchange
• Runtime data coupling, data staging, and online data analysis and processing
Design Space for Staging
• Location of the compute resources
  – Same cores as the simulation (in situ)
  – Some (dedicated) cores on the same nodes
  – Some dedicated nodes on the same machine
  – Dedicated nodes on an external resource
• Data access, placement, and persistence
  – Direct access to simulation data structures
  – Shared memory access via hand-off / copy
  – Shared memory access via non-volatile near-node storage (NVRAM)
  – Data transfer to dedicated nodes or external resources
• Synchronization and scheduling (see the sketch below)
  – Execute synchronously with the simulation every n-th simulation time step
  – Execute asynchronously
Figure: staging options 1 (sharing cores with the simulation), 2 (using distinct cores on the same node), and 3 (processing data on remote staging nodes), each with a DRAM/NVRAM/SSD/hard disk hierarchy reached over the network
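The synchronization/scheduling choice above can be seen in a driver loop: either run analysis in situ, synchronously, every n-th step, or hand the data off asynchronously and keep computing. The sketch below is purely illustrative; the kernel, analysis, and staging functions are placeholder stubs and the 10-step interval is an assumption.

```c
/* Illustrative simulation driver showing the two scheduling options:
 * synchronous in-situ analysis every Nth step vs. asynchronous hand-off
 * to a staging service. All kernels below are placeholder stubs. */
#include <stdio.h>

#define ANALYSIS_INTERVAL 10   /* assumed: act on every 10th step */

static void advance_timestep(double *field)          { field[0] += 1.0; }
static void analyze_in_situ(const double *field)     { printf("in-situ analysis, field[0]=%g\n", field[0]); }
static void stage_async(const double *field, int ts) { printf("staged step %d, field[0]=%g\n", ts, field[0]); }

static void run(double *field, int num_steps, int use_in_situ)
{
    for (int ts = 1; ts <= num_steps; ts++) {
        advance_timestep(field);

        if (ts % ANALYSIS_INTERVAL != 0)
            continue;

        if (use_in_situ)
            analyze_in_situ(field);   /* synchronous: the simulation waits */
        else
            stage_async(field, ts);   /* asynchronous: analysis overlaps later steps */
    }
}

int main(void)
{
    double field[1] = {0.0};
    run(field, 100, 1);   /* in-situ analysis every 10th step */
    return 0;
}
```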
Extreme-Scale Storage Hierarchies: Devices
• SRAM: latency ~1X
• DRAM: latency ~10X
• 3D-RAM: latency ~100X
• NAND SSDs: latency ~100,000X
• Disks: latency ~10,000,000X
Extreme-Scale Storage Hierarchies: Architectures
• Non-volatile memory attached to compute nodes or to burst-buffer nodes
• Storage nodes accessed via a PFS (Lustre) or object stores (DAOS)
Time-Sensitivity of Data Storage in Scientific Workflows (figure credit: Gary Grider, LANL)
Outline
• Extreme-scale simulation-based science: opportunities and challenges
• Rethinking the simulations-to-insights pipeline: data staging and in-situ workflows
• Runtime management for data staging and in-situ workflows
  – Data placement
  – Resilience
• Conclusion
In-Staging Data Management
• Limited DRAM capacity and decreasing bandwidth relative to growing data sizes mean staging must use multiple memory levels
• The effectiveness of staging is sensitive to data placement across the staging cores/nodes and the levels of the memory hierarchy
  – Data access latency can significantly impact the overall performance of the workflows
• Efficient data placement is challenging because of the complex and dynamic data exchange/access patterns exhibited by the different components of a workflow, and by different workflows
Example: Managing Multi-Tiered Data Staging in DataSpaces
• A multi-tiered data staging approach that leverages both DRAM and SSD to support code coupling and data management in data-intensive simulation workflows
• An efficient utility-based, application-aware data placement mechanism (illustrated in the sketch below)
  – Application-aware: utilizing temporal and spatial data access attributes
  – Adaptive: placing data objects dynamically based on data read patterns
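As a rough illustration of what utility-based, application-aware placement can look like, the sketch below scores each data object from a temporal attribute (recent read frequency) and a spatial attribute (overlap with frequently queried regions), keeping high-utility objects in DRAM and spilling the rest to SSD. The weights, threshold, and attribute names are invented for illustration and are not the actual DataSpaces policy.

```c
/* Illustrative utility-based placement: objects whose predicted access
 * utility exceeds a threshold stay in DRAM, the rest spill to SSD.
 * Scoring weights and the threshold are made-up illustration values. */
#include <stdio.h>

enum tier { TIER_DRAM, TIER_SSD };

struct data_object {
    const char *name;
    double read_freq;      /* temporal attribute: recent reads per step */
    double overlap_ratio;  /* spatial attribute: overlap with hot query regions */
};

static double utility(const struct data_object *obj)
{
    /* assumed weights: temporal locality counts slightly more */
    return 0.6 * obj->read_freq + 0.4 * obj->overlap_ratio;
}

static enum tier place(const struct data_object *obj, double dram_threshold)
{
    return utility(obj) >= dram_threshold ? TIER_DRAM : TIER_SSD;
}

int main(void)
{
    struct data_object objs[] = {
        {"temperature", 0.9, 0.8},   /* frequently read, heavily queried */
        {"checkpoint",  0.1, 0.2},   /* written once, rarely read back   */
    };
    for (int i = 0; i < 2; i++)
        printf("%s -> %s\n", objs[i].name,
               place(&objs[i], 0.5) == TIER_DRAM ? "DRAM" : "SSD");
    return 0;
}
```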