MPI: 25 Years of Progress
Anthony Skjellum, University of Tennessee at Chattanooga (Tony-skjellum@utc.edu)
Formerly: LLNL, MSU, MPI Software Technology, Verari/Verarisoft, UAB, and Auburn University
Co-authors: Ron Brightwell (Sandia); Rossen Dimitrov (Intralinks)
Outline
• Background
• Legacy
• About Progress
• MPI Taxonomy
• A glimpse at the past
• A look toward the future
Progress
• 25 years ago we as a community set out to standardize parallel programming
• It worked ☺
• An amazing "collective operation" (hmm… still not complete)
• Something about the other kind of progress, too: moving data independently of user calls to MPI …
Community
• This was close to the beginning …
As we all know (agree?)
• MPI defines progress as a "weak" requirement
• MPI implementations don't have to move data independently of when MPI is called
• Implementations may do so, but none is required to
• No internally concurrent schedule (e.g., a progress thread) is needed to comply
• For instance: do all the data movement at "Waitall" … predictable, if data movement is only required to happen there! (see the sketch below)
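A minimal C sketch of what this weak rule permits; compute() is a hypothetical placeholder for MPI-free application work:

```c
/* Sketch: what weak progress permits. "compute" is a placeholder. */
#include <mpi.h>

extern void compute(void);   /* application work; makes no MPI calls */

void weak_progress_example(const double *buf, int count, int peer)
{
    MPI_Request req;
    MPI_Isend(buf, count, MPI_DOUBLE, peer, /* tag */ 0,
              MPI_COMM_WORLD, &req);

    compute();   /* a weak-progress library may transfer nothing here */

    /* ... so all of the data movement may happen inside this call: */
    MPI_Wait(&req, MPI_STATUS_IGNORE);
}
```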
How programs/programmers achieve progress
• The MPI library runs the progress engine whenever you make most MPI calls
• The MPI library does it for you
  ▼ In the transport; MPI just shepherds lightly
  ▼ In an internal thread (or threads) scheduled periodically
• You kick the progress engine yourself ("self help"; sketched below)
  ▼ You call MPI_Test() sporadically in your user thread
  ▼ You schedule and call MPI_Test() in a helper thread
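A sketch of the first "self help" variant, polling MPI_Test() from the compute loop; work_chunk() is a hypothetical placeholder that returns nonzero while work remains:

```c
/* "Self help": poll MPI_Test between work chunks to kick the
 * progress engine. work_chunk() is a placeholder. */
#include <mpi.h>

extern int work_chunk(void);   /* returns nonzero while work remains */

void overlap_with_polling(MPI_Request *req)
{
    int done = 0;
    while (work_chunk()) {
        if (!done)
            MPI_Test(req, &done, MPI_STATUS_IGNORE);  /* drives progress */
    }
    if (!done)
        MPI_Wait(req, MPI_STATUS_IGNORE);
}
```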
Desirements
• Overlap communication and computation
• Predictability / low jitter
• Later: overlap of communication, computation, and I/O
• Proviso: you must have the memory bandwidth
MPI Implementation Taxonomy (Dimitrov)
• Message completion notification
  ▼ Asynchronous (blocking)
  ▼ Synchronous (polling)
• Message progress
  ▼ Asynchronous (independent)
  ▼ Synchronous (polling)
[Matrix figure: crossing the two axes yields implementation classes such as independent-blocking, independent-polling, and all-polling]
Segmentation
• A common technique for implementing overlap through pipelining (sender-side sketch below)
[Figure: a message of m elements sent whole ("entire message") vs. split into segments of m/s elements, each interleaved with compute ("segmented message")]
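A C sketch of the sender side of this pipeline, assuming s divides m evenly and s ≤ 64; compute_on_segment() is a hypothetical placeholder for the work overlapped with each segment:

```c
/* Segmentation sketch (sender side): split an m-element message into
 * s segments and pipeline each Isend against a chunk of computation. */
#include <mpi.h>

extern void compute_on_segment(int i);   /* placeholder for overlapped work */

void segmented_send(const double *msg, int m, int s, int peer)
{
    MPI_Request reqs[64];        /* assumes s <= 64 */
    int seg = m / s;             /* assumes s divides m evenly */

    for (int i = 0; i < s; i++) {
        MPI_Isend(msg + (long)i * seg, seg, MPI_DOUBLE, peer,
                  /* tag */ i, MPI_COMM_WORLD, &reqs[i]);
        compute_on_segment(i);   /* overlap while segment i is in flight */
    }
    /* Under a weak-progress library, sprinkling MPI_Test calls into
     * compute_on_segment() may be needed to get real overlap. */
    MPI_Waitall(s, reqs, MPI_STATUSES_IGNORE);
}
```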
Optimal Segmentation
[Figure: execution time T(s) vs. number of segments s; the curve falls from T_no-overlap at s = 1 to T_best at s_b, then rises again toward s_m]
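One simple pipeline model that reproduces this curve (an assumption for illustration, not taken from the slide): write c and w for the unsegmented communication and computation times, and t_0 for the fixed per-segment overhead.

```latex
% Assumed pipeline model: c = comm time, w = compute time,
% t_0 = fixed per-segment overhead.
\[
  T_{\text{no overlap}} = c + w, \qquad
  T(s) \approx \max(c, w) + \frac{\min(c, w)}{s} + s\,t_0 .
\]
% Minimizing over s gives the best segment count and T_best:
\[
  s_b \approx \sqrt{\min(c, w)/t_0}, \qquad
  T_{\text{best}} = T(s_b).
\]
```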
Performance Gain from Overlapping
• Effect of overlapping on FFT global phase, p = 2
[Chart: max execution time [sec] vs. number of segments (1, 2, 4, 8, 16, 32, 64) for n = 1M, 2M, 4M]

  size   speedup
  1M     1.41
  2M     1.43
  4M     1.43
Performance Gain from Overlapping (cont.)
• Effect of overlapping on FFT global phase, p = 4
[Chart: max execution time [sec] vs. number of segments (1, 2, 4, 8, 16, 32, 64) for n = 1M, 2M, 4M]

  size   speedup
  1M     1.31
  2M     1.32
  4M     1.33
Performance Gain from Overlapping (cont.)
• Effect of overlapping on FFT global phase, p = 8
[Chart: max execution time [sec] vs. number of segments (1, 2, 4, 8, 16, 32, 64) for n = 1M, 2M, 4M]

  size   speedup
  1M     1.32
  2M     1.32
  4M     1.33
Effect of Message-Passing Library on Overlapping
• Comparison between blocking and polling modes of MPI, n = 2M, p = 2
[Chart: execution time [sec] vs. number of segments (1, 2, 4, 8, 16, 32, 64), blocking vs. polling]
Effect of Message-Passing Library on Overlapping (cont.)
• Comparison between blocking and polling modes of MPI, n = 2M, p = 8
[Chart: execution time [sec] vs. number of segments (1, 2, 4, 8, 16, 32, 64), blocking vs. polling]
Observations/Upshots
• Completion notification method affects the latency of short messages (i.e., < 4 KB on the legacy system studied)
• Notification method did not affect the bandwidth of long messages
• Short-message programs
  ▼ Strong progress, polling notification
• Long-message programs
  ▼ Strong progress, blocking notification
Future (soon?)
• MPIs should support overlap and notification modes well
• Overlap is worth at most a factor of 2 (3 if you include I/O)
• It is valuable in real algorithmic situations
• Arguably growing in value at exascale
• We need to expose this capability broadly, without the "self help" model
Thank you
• 25 years of progress
• And still going strong …
• Collective!
• Nonblocking?
• Persistent!
• Fault tolerant?