Many-Task Applications in the Integrated Plasma Simulator
Samantha S. Foley, Wael R. Elwasif, David E. Bernholdt, Aniruddha G. Shet (Oak Ridge National Laboratory)
Randall Bramley (Indiana University)
Motivation
• Computational science is moving from single SPMD codes to loosely coupled MPMD applications
• MPMD viewed through a many-task computing (MTC) paradigm:
  • Some degree of data and task coupling
  • Varying parallelism and runtime between tasks
  • Modest number of tasks, executed in a time-stepped style
• Mismatches in runtime and parallelism, together with dependencies between tasks, lead to poor load balancing
MTAGS - SC10, Nov. 15, 2010
The Integrated Plasma Simulator (IPS)
• The Integrated Plasma Simulator (IPS) is a component framework for fusion energy simulation, developed for the Center for Simulation of RF Wave Interactions with Magnetohydrodynamics (SWIM)
• One of three US DOE SciDAC 2 projects to explore integrated fusion simulation
• Primary directive: “Explore the targeted coupled physics interactions while constituent codes evolve independently, minimizing impact on long lived codes and other research/production use”.
• Code refactoring and/or rewriting ruled out.
IPS Landscape
• Existing physics codes
• Little prior experience with coupling in the fusion community
• Loose coupling and modest data communication
• Target platforms are leadership-class facilities (Cray)
Solution: evolutionary development of a lightweight Python framework that allows the underlying codes to remain unchanged, provides a flexible execution environment, and supports loosely coupled simulation composition strategies with file-based data coupling
[Figure labels: Framework; Framework Services; Component Adapter and State Adapter around each Physics App.; Plasma State]
IPS: Architecture
[Figure labels: Tasks (Parallel Physics Codes); Resource Manager; Task Manager]
IPS: Levels of Parallelism
1. Tasks are parallel codes
2. Tasks of a single component can run concurrently
3. Tasks of multiple components can run concurrently
4. Multiple simulations can run concurrently within the same batch allocation and framework instance
These levels of parallelism can be used to improve resource utilization efficiency
[Figure labels: Batch Allocation; Head Node; IPS Framework; Simulation A and Simulation B, each with Driver, Comp 1, Comp 2, Comp 3]
RM & TM in the IPS
[Figure labels: Batch Allocation; Framework; Simulation A with Driver, Comp 1, Comp 2, Comp 3; RM; TM; Queue of Tasks]
Resource Usage Simulator (RUS)
• We created RUS to examine the resource utilization and efficiency of IPS simulations
• Accurately simulates task and resource management in the IPS
• Random variation of task execution times
• RUS provides the ability to examine how the multiple levels of parallelism and the characteristics of the tasks interact
• Focus on the multiple-simulations capability
• Ultimately, this tool will be used to inform how IPS simulations can be configured for resource efficiency
SWIM Scenarios
• TNT Scenario
  • TORIC: 4 processes, 97 ± 2 seconds
  • NUBEAM: 16 processes, 115 ± 15 seconds
  • TSC: 1 process, 130 ± 40 seconds
• ANT Scenario
  • AORSA: 1024 processes, 1020 ± 5 seconds
  • NUBEAM: 512 processes, 1020 ± 300 seconds
  • TSC: 1 process, 130 ± 40 seconds
[Figure: cores-versus-time task diagrams for each scenario, with tasks labeled T, N, T (TNT) and A, N, T (ANT)]
Multiple Simulation Task Interleaving
• Single simulation
  • 43% resource efficiency
  • 8 steps completed
• Two simulations
  • 64% resource efficiency
  • 12 total steps
• Four simulations
  • 86% resource efficiency
  • 16 total steps
• More physics can be done in the same time and with the same resources using the MTC capability
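The single-simulation figure can be checked by hand. The slides do not define resource efficiency, so the calculation below assumes it means busy core-seconds divided by allocated core-seconds. It also assumes the mean TNT task values, tasks run one after another within a step, and a 16-core allocation (the size of the largest task). Here p_i and t_i are the cores and mean runtime of task i, and P is the allocation size.

```latex
E \;=\; \frac{\sum_i p_i\, t_i}{P \sum_i t_i}
  \;=\; \frac{4 \cdot 97 + 16 \cdot 115 + 1 \cdot 130}{16\,(97 + 115 + 130)}
  \;=\; \frac{2358}{5472}
  \;\approx\; 0.43
```

Under these assumptions the result agrees with the 43% quoted for a single simulation: most of the allocation sits idle while the 4-core and 1-core tasks run.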
Resource Utilization - TNT
[Figure: plots of resource efficiency and average time per simulation; annotated point: 16 cores, 4 simulations, 86% efficiency. Inset: cores-versus-time task diagram with N (16p), T (4p), T (1p)]
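How interleaving raises efficiency can be illustrated with a small event-driven sketch. This is not RUS: it is a minimal model written for this transcript, and several things in it are assumptions. Runtimes are fixed at the TNT scenario's mean values (RUS adds random variation), each simulation runs its tasks strictly in the order TORIC, NUBEAM, TSC, waiting tasks are launched first-come, first-fit, and efficiency is taken to be busy core-seconds over allocated core-seconds. The names `TNT_STEP` and `simulate` are hypothetical.

```python
import heapq
from collections import deque

# Mean task requirements for one TNT time step: (name, cores, seconds).
TNT_STEP = [("TORIC", 4, 97.0), ("NUBEAM", 16, 115.0), ("TSC", 1, 130.0)]


def simulate(step_tasks, n_sims, steps_per_sim, total_cores):
    """Return (makespan_seconds, efficiency) for interleaved simulations."""
    n_tasks = len(step_tasks) * steps_per_sim
    next_task = [0] * n_sims        # index of each simulation's next task
    ready = deque(range(n_sims))    # simulations waiting to launch a task
    running = []                    # min-heap of (finish_time, sim_id, cores)
    free, clock, busy = total_cores, 0.0, 0.0

    while ready or running:
        # Launch every waiting task that fits, scanning in arrival order.
        for _ in range(len(ready)):
            sim = ready.popleft()
            _, cores, seconds = step_tasks[next_task[sim] % len(step_tasks)]
            if cores <= free:
                free -= cores
                busy += cores * seconds
                next_task[sim] += 1
                heapq.heappush(running, (clock + seconds, sim, cores))
            else:
                ready.append(sim)   # keeps waiting; relative order preserved
        if not running:
            raise ValueError("a task needs more cores than the allocation has")
        # Advance the clock to the next completion and release its cores.
        clock, sim, cores = heapq.heappop(running)
        free += cores
        if next_task[sim] < n_tasks:
            ready.append(sim)

    return clock, busy / (total_cores * clock)


if __name__ == "__main__":
    for sims, steps in [(1, 8), (2, 6), (4, 4)]:
        makespan, eff = simulate(TNT_STEP, sims, steps, total_cores=16)
        print(f"{sims} sim(s), {sims * steps} total steps: "
              f"{makespan:.0f} s, efficiency {eff:.0%}")
```

With 16 cores, this sketch yields roughly 43%, 64%, and 86% efficiency for one, two, and four simulations, completing 8, 12, and 16 total steps in about 2740 seconds each. That agreement with the slide values holds only under the stated assumptions and says nothing about how RUS is implemented.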
Resource Utilization - ANT
• >90% efficiency achievable for all multi-simulation cases
• Peak efficiencies occur at multiples of the cores needed to run each task
  • E.g., 1540 cores allows 1 instance of each task to run concurrently
[Figure: plots of resource efficiency and average time per simulation. Inset: cores-versus-time task diagram with A (1024p), N (512p), T (1p)]
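The 1540-core example follows from adding up the per-task core counts listed for the ANT scenario. The short check below is only an illustration; the dictionary and function names are hypothetical.

```python
# Cores requested by each ANT task, taken from the SWIM Scenarios slide.
ANT_TASK_CORES = {"AORSA": 1024, "NUBEAM": 512, "TSC": 1}


def cores_for_one_of_each(task_cores):
    """Cores needed to run one instance of every task at the same time."""
    return sum(task_cores.values())


demand = cores_for_one_of_each(ANT_TASK_CORES)  # 1024 + 512 + 1 = 1537
spare = 1540 - demand                           # cores left idle at 1540
```

One instance of each task needs 1537 cores, so a 1540-core allocation is just large enough to hold all three at once, with 3 cores to spare.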
Study of Resource Utilization Trends
• Using RUS, we examine how resource utilization efficiency changes across variations of the SWIM workloads
  • What happens to resource utilization when multiple instances of the same simulation execute concurrently?
  • What happens to resource utilization when the runtime or parallelism of the tasks is varied?
• We performed four studies on the two scenarios:
  1. Time scaling of TSC
  2. Time scaling of NUBEAM
  3. Weak parallel scaling of NUBEAM
  4. Strong parallel scaling of NUBEAM
• The following graphs show the highest efficiency peak for a given number of simulations versus the experiment variation (time or parallelism)
Scaling Trends
[Figure: cores-versus-time task diagram for TNT with N (16p), T (4p), T (1p)]
Time Scaling of TSC
[Figure: TNT and ANT panels. Insets: cores-versus-time task diagrams with N (16p), T (4p), T (1p) for TNT and A (1024p), N (512p), T (1p) for ANT]
Time Scaling of NUBEAM
[Figure: TNT and ANT panels. Insets: cores-versus-time task diagrams with N (16p), T (4p), T (1p) for TNT and A (1024p), N (512p), T (1p) for ANT]
Weak Scaling of NUBEAM
Weak scaling = increase work, increase parallelism, same runtime
[Figure: TNT and ANT panels. Insets: cores-versus-time task diagrams with N (16p), T (4p), T (1p) for TNT and A (1024p), N (512p), T (1p) for ANT]
Strong Scaling of NUBEAM
Strong scaling = same work, increase parallelism, decrease runtime
[Figure: TNT and ANT panels. Insets: cores-versus-time task diagrams with N (16p), T (4p), T (1p) for TNT and A (1024p), N (512p), T (1p) for ANT]
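The two scaling definitions can be written as transformations of a task's core count and runtime. The sketch below is illustrative only: the `Task` type and function names are hypothetical, and the strong-scaling case assumes ideal speedup (runtime divided exactly by the scaling factor), which the slides do not claim.

```python
from typing import NamedTuple


class Task(NamedTuple):
    name: str
    cores: int
    seconds: float


def weak_scale(task, factor):
    """More work on proportionally more cores; runtime unchanged."""
    return task._replace(cores=task.cores * factor)


def strong_scale(task, factor):
    """Same work on more cores; runtime shrinks (ideal speedup assumed)."""
    return task._replace(cores=task.cores * factor,
                         seconds=task.seconds / factor)


def core_seconds(task):
    """Total work performed by the task, in core-seconds."""
    return task.cores * task.seconds


# NUBEAM as listed for the TNT scenario (mean runtime).
nubeam = Task("NUBEAM", 16, 115.0)
weak = weak_scale(nubeam, 2)      # 32 cores, 115 s: work doubles
strong = strong_scale(nubeam, 2)  # 32 cores, 57.5 s: work unchanged
```

Under weak scaling the task's core-seconds grow with the factor, while under ideal strong scaling they stay constant and only the shape of the task in the cores-versus-time diagram changes.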
General Observations for Many-Task Execution
• Interleaving multiple simulations is an effective way to increase resource utilization efficiency
• Even small numbers of interleaved simulations (3 or 4) are sufficient for significant improvements in resource efficiency
• Modest increases in allocation size produce high efficiencies
• Local maxima at larger allocation sizes tend to be lower than the first or second peak
• Large differences in the parallelism of tasks provide more opportunities for effective resource utilization
• However, to improve resource efficiency, it is more important for the tasks to match in parallelism than in runtime
Future Work
• Examine different SWIM simulation scenarios
• Validate and improve the model using data from IPS runs
• Study the impact of concurrent task execution in a single simulation
• Study, develop, and include in RUS models for overheads such as task launch time, I/O, and component and framework activities
• Develop the capability to use RUS as a recommendation system for configuring IPS simulations to maximize resource utilization
• Explore the impact of different scheduling algorithms and policies
Summary
• The IPS provides a flexible and lightweight execution environment and coupling framework for MPMD fusion energy applications
• Characteristics of fusion tasks lead to poor resource utilization
• Using RUS, we showed how executing small numbers of simultaneous simulations can dramatically improve resource utilization
• Through simulation of the resource utilization of real and synthetic workloads, we are able to extract some preliminary guidelines for constructing more efficient coupled simulations using a many-task approach
Questions?