
A Study of Network Quality of Service in Many-Core MPI Applications
Lee Savoie (1), David Lowenthal (1), Bronis de Supinski (2), Kathryn Mohror (2)
(1) The University of Arizona, (2) Lawrence Livermore National Laboratory

Introduction
• Core counts are increasing in high performance computing (HPC)
• Many machines already include many-core accelerators
• Many-core nodes process more data
• The network must work harder to transfer data between nodes

Network Contention
“There goes the neighborhood: performance degradation due to nearby jobs” (Bhatele et al., SC 13)

Fat-tree Contention
• HPC systems with many-core nodes need better network management

Quality of Service (QoS)
• Most networks provide QoS mechanisms for network management
• In InfiniBand:
  • Packets are marked with a service level (SL)
  • Each SL has a priority
[Figure: traffic entering the network, labeled "SL 1, priority 1" and "SL 2, priority 3"]

Research Question
• Can we improve the performance of contending jobs on HPC systems using QoS?
  • This will enable HPC systems to handle the increased data demands of many-core nodes.
• This work focuses on per-job QoS
  • Each job runs in a separate service level
  • Each job is guaranteed a minimum amount of bandwidth

Experimental Setup
• 300 node machine
  • Left 20 nodes free in case of failures
  • No other jobs running
• Service levels with priorities 2286:254:9:1
• Applications
  • QBox
  • Crystal Router
  • MILC
  • pF3D
• Micro-benchmarks
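To get a feel for how skewed the 2286:254:9:1 priorities are, the sketch below converts them into fractional shares. It assumes the priorities act as relative arbitration weights on a saturated link where every service level always has traffic queued. The slides do not state how the weights map to bandwidth, so the numbers are illustrative only. The function name `weight_shares` and the SL numbering by list position are hypothetical.

```python
from fractions import Fraction

def weight_shares(weights):
    """Return each weight as an exact fraction of the total weight."""
    total = sum(weights)
    return [Fraction(w, total) for w in weights]

# Priorities from the slide. Reading them as bandwidth weights is an assumption.
priorities = [2286, 254, 9, 1]
shares = weight_shares(priorities)

# The SL index here is just the position in the list, not a real SL number.
for sl, share in enumerate(shares, start=1):
    print(f"SL {sl}: {float(share):.4%} of a saturated link")
```

Under that assumption the top level would receive about 89.6% of the link and the bottom level about 0.04%, since the weights sum to 2550.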

Micro-Benchmarks
• Flood-Pairs
• Nearest-Neighbor
• All-to-all
• Random-Pairs

Methodology
• Ran 4 jobs at a time
  • 70 nodes each
  • 22 ranks per node
  • Assigned nodes to jobs randomly
  • Repeated tests several times with different node assignments
  • Restarted each job when it completed to maintain contention profile until all jobs completed at least once
• Ran the following tests
  • Ideal: each job running in isolation
  • Default: all jobs in the same service level
  • All assignments of jobs to 4 service levels
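The test matrix above can be sketched in a few lines. This is a minimal illustration, not the authors' harness: it assumes "all assignments" means one-to-one mappings of the 4 jobs onto the 4 service levels (consistent with each job running in a separate service level), and it treats the last 20 node IDs as the spares, which the slides do not specify. Job names, SL identifiers, and function names are placeholders.

```python
import random
from itertools import permutations

TOTAL_NODES = 300      # machine size from the setup slide
SPARE_NODES = 20       # left free in case of failures (which nodes is assumed)
NODES_PER_JOB = 70
RANKS_PER_NODE = 22
JOBS = ["job0", "job1", "job2", "job3"]   # placeholder job names
SERVICE_LEVELS = [0, 1, 2, 3]             # placeholder SL identifiers

def random_node_assignment(rng):
    """Shuffle the usable nodes and give each job a disjoint block of 70."""
    usable = list(range(TOTAL_NODES - SPARE_NODES))
    rng.shuffle(usable)
    return {
        job: sorted(usable[i * NODES_PER_JOB:(i + 1) * NODES_PER_JOB])
        for i, job in enumerate(JOBS)
    }

def qos_assignments():
    """Every one-to-one mapping of jobs to service levels (4! = 24)."""
    return [dict(zip(JOBS, perm)) for perm in permutations(SERVICE_LEVELS)]

nodes = random_node_assignment(random.Random(0))
configs = qos_assignments()
print(len(configs), "QoS configurations, plus the Ideal and Default runs")
print(NODES_PER_JOB * RANKS_PER_NODE, "ranks per job")
```

With these numbers the four jobs occupy all 280 usable nodes, each job has 1540 ranks, and there are 24 per-job QoS configurations to compare against the Ideal and Default runs.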

Results: Micro-Benchmarks
• Per-job QoS is insufficient to improve performance.

Flood-pairs Rank Timing
• Only a few ranks need to be prioritized.

Nearest-neighbor Rank Timing
[Figure: per-rank timing; labels: High Priority, Contended]

Per-Rank QoS
• Prioritizing an entire job gives high priority to some ranks that are already fast.
• This slows down other jobs, erasing any throughput improvement.
• What if we prioritize only the slowest ranks?
  • Requires prioritizing only ~10% of ranks
  • Same performance as prioritizing the entire job
  • Expect significant reduction in impact on other jobs
  • This is the subject of ongoing research
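As a rough sketch of the selection step only, the code below picks the slowest 10% of ranks from a table of per-rank times. The slides do not say how slow ranks are identified or how a rank would be moved to a higher-priority service level, so the function `slowest_ranks`, its `percent` parameter, and the timing values are all hypothetical.

```python
def slowest_ranks(rank_times, percent=10):
    """Return the IDs of the slowest `percent` of ranks, slowest first."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    # Integer ceiling of len * percent / 100, with at least one rank selected.
    count = max(1, (len(rank_times) * percent + 99) // 100)
    by_time = sorted(rank_times, key=rank_times.get, reverse=True)
    return by_time[:count]

# Made-up communication times in seconds for 20 ranks; not measured data.
rank_times = {rank: 1.0 for rank in range(20)}
rank_times[3] = 4.2
rank_times[17] = 3.9

high_priority = slowest_ranks(rank_times)
print(high_priority)  # [3, 17]
```

For a 1540-rank job, the same rule would select 154 ranks for the high-priority service level and leave the rest at the default.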

Related Work
• QoS has been studied for a long time
• Jokanovic et al. (2012) came to opposite conclusions
  • Segregate jobs into SLs with different priorities
  • 59% contention reduction
• Possible reasons for the difference:
  • Simulation vs hardware
  • Future vs current hardware
  • Different service levels

Different Service Levels
• QoS in HPC deserves more research

Conclusion
• Many-core nodes will require efficient networks to move data around
• Simple, per-job QoS is unlikely to improve performance
  • Differs from previous work
• Per-rank QoS is more promising
• Further research is needed to understand QoS in HPC
lsavoie@cs.arizona.edu
http://www.cs.arizona.edu/people/lsavoie/

Backup

Per-Job QoS
[Figure: without QoS, Job 1, Job 2, and Job 3 share the network equally; with QoS, Job 1 has priority 1, Job 2 has priority 3, and Job 3 has priority 2]

Related Work
• QoS has been applied to:
  • The internet [Blake 1998]
  • Video streaming [Ke 2005, Kumwilaisak 2003]
  • Clouds and data centers [Voith 2012]
  • Wireless networks [Andrews 2001]
• Divide traffic across SLs with the same priority to avoid head-of-line blocking [Subramoni 2010, Guay 2011]
  • We use service levels with different priorities
• Other methods of dealing with contention
  • Adaptive routing [Jain 2014]
  • Job placement [Yang 2016, Jokanovic 2015]
  • These methods are complementary to ours and insufficient on their own

Results: Applications
• Per-job QoS is insufficient to improve performance.
