

1. MPIBlib: Benchmarking MPI Communications for Parallel Computing on Homogeneous and Heterogeneous Clusters
Alexey Lastovetsky, Vladimir Rychkov, Maureen O'Flynn
{Alexey.Lastovetsky, Vladimir.Rychkov, Maureen.OFlynn}@ucd.ie
Heterogeneous Computing Laboratory
School of Computer Science and Informatics, University College Dublin, Belfield, Dublin 4, Ireland
http://hcl.ucd.ie
The 15th European PVM/MPI Users' Group conference, September 9, 2008, Dublin, Ireland

3. Motivation
◮ Accurate estimation of the execution time of MPI communication operations plays an important role in the optimization of parallel applications:
  ◮ design of parallel applications
  ◮ tuning of collective communication operations
  ◮ heterogeneous platforms
◮ Existing MPI benchmarking suites: mpptest, NetPIPE, IMB (formerly PMB), SKaMPI, MPIBench
  ◮ measure the execution time of a fixed set of MPI functions (except SKaMPI)
  ◮ each implements a single timing method
  ◮ offer little interpretation of results: standalone executables and plotting

5. Motivation (continued)
◮ Communication performance modeling is a form of result interpretation: the parameter estimation procedure determines which communication experiments are required and how many experimental results are needed.
◮ Results of experiments should be available dynamically - an MPI benchmarking library
◮ The communication operations measured by the benchmarking suite should be customizable - user-defined communication experiments
◮ The efficiency of measurement is crucial for modeling at runtime (less accurate results can be acceptable) - a selection of timing methods

7. Related work: benchmark methodology
Gropp, W., Lusk, E.: Reproducible Measurements of MPI Performance Characteristics. In: Dongarra, J., Luque, E., Margalef, T. (eds.) EuroPVM/MPI 1999. LNCS, vol. 1697, pp. 11-18. Springer (1999)
◮ Repeating the communication operation multiple times to obtain a reliable estimate of its execution time
◮ Selecting message sizes adaptively to eliminate artifacts in the output graph
◮ Testing the communication operation under different conditions: cache effects, communication/computation overlap, communication patterns, non-blocking communication, etc.
◮ Features common to MPI benchmarking suites:
  ◮ computing the average, minimum, and maximum execution time of a series of identical communication experiments to obtain accurate results;
  ◮ measuring the communication time for different message sizes - the number of measurements can be fixed, or increased adaptively for messages where the time fluctuates rapidly;
  ◮ performing simple statistical analysis by finding averages, variations, and errors.
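The adaptive selection of message sizes mentioned above can be sketched roughly as follows. This is not the algorithm of any particular suite: `measure` stands in for an actual benchmark run, and the refinement criterion (relative deviation from linear interpolation at the midpoint) is an assumption chosen for illustration.

```python
def refine_sizes(measure, sizes, tol=0.05, max_points=64):
    """Adaptively insert message sizes where the timing curve is
    poorly resolved: if the measured time at an interval's midpoint
    deviates from linear interpolation by more than `tol` (relative),
    keep the midpoint and refine both halves."""
    times = {s: measure(s) for s in sizes}
    work = list(zip(sizes, sizes[1:]))          # intervals to examine
    while work and len(times) < max_points:
        lo, hi = work.pop()
        mid = (lo + hi) // 2
        if mid in (lo, hi):                     # interval too small
            continue
        t_mid = measure(mid)
        t_lin = times[lo] + (times[hi] - times[lo]) * (mid - lo) / (hi - lo)
        if abs(t_mid - t_lin) > tol * t_mid:    # curve not linear here
            times[mid] = t_mid
            work += [(lo, mid), (mid, hi)]
    return sorted(times.items())

# Usage with a synthetic timing function: latency + bandwidth term,
# plus a jump at 32 KB mimicking an eager/rendezvous protocol switch.
def fake_time(nbytes):
    t = 1e-5 + nbytes * 1e-9
    return t + (5e-5 if nbytes >= 32 * 1024 else 0.0)

points = refine_sizes(fake_time, [0, 102400])
```

With a smooth curve the sweep stays coarse; extra points cluster automatically around the protocol-switch artifact, which is exactly the effect the methodology aims for.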

8. Scheduling the communication experiment
◮ A series of communications: successive repetitions of the same operation can overlap, so the averaged time of a back-to-back series differs from that of a single isolated run.
[Figure: Intel MPI Benchmarks results for Scatter and Gather, execution time (sec) vs message size (0-100 KB): single run (min, max) vs multi-run average (avg).]
◮ Isolation of the communication operations from each other: the barrier, reduce, or short acknowledgments used to separate repetitions may themselves overlap with the communications being measured.

10. Timing methods (based on MPI_Wtime)
◮ General - the time between two events:
  ◮ on a single designated processor (root)
  ◮ on all participating processors (max)
  ◮ on different processors (global)
Global timing is the most accurate, but the costliest if an MPI global timer is not supported by the platform (regular clock synchronization is then required).
◮ Operation-specific
Supinski, B. de, Karonis, N.: Accurately Measuring MPI Broadcasts in a Computational Grid. In: The 8th International Symposium on High Performance Distributed Computing, pp. 29-37 (1999)
[Figure: an operation-specific timing scheme illustrated on processes 0-3.]
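The difference between the three general timing methods can be illustrated with simulated per-process timestamps. This is a sketch, not MPIBlib code: the function names are hypothetical, and the timestamps are assumed to come from already-synchronized clocks, which is precisely the requirement that makes global timing costly in practice.

```python
def root_time(starts, finishes, root=0):
    # Time between two events on a single designated processor.
    return finishes[root] - starts[root]

def max_time(starts, finishes):
    # Each processor times itself locally; take the longest duration.
    return max(f - s for s, f in zip(starts, finishes))

def global_time(starts, finishes):
    # Earliest start to latest finish, measured on different
    # processors - requires globally synchronized clocks.
    return max(finishes) - min(starts)

# Simulated timestamps (seconds) for a 4-process operation in which
# the root finishes early (as in a scatter) and processes start at
# slightly different moments:
starts   = [0.00, 0.01, 0.01, 0.02]
finishes = [0.02, 0.05, 0.07, 0.09]
```

On these numbers, root timing reports only the root's 0.02 s, max timing the longest local duration 0.07 s, and global timing the true span 0.09 s, showing why the global method is the most accurate of the three.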

11. MPIBlib benchmarking suite
◮ Implemented as a library - can be integrated into applications
◮ Provides general and operation-specific timing methods
◮ Supports extension of the set of communication operations to be measured
Input accuracy parameters:
◮ minimum/maximum numbers of repetitions: if min_reps == max_reps, a fixed number of measurements
◮ confidence level and error of estimation: if min_reps < max_reps, the number of measurements depends on the statistics
Output accuracy parameters:
◮ number of repetitions actually performed
◮ confidence interval
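The accuracy parameters above suggest a stopping rule of roughly the following shape. This is a sketch of the idea, not MPIBlib's actual API: repeat the experiment at least min_reps times, then keep repeating until the confidence-interval half-width drops below the requested relative error or max_reps is reached. The normal-quantile approximation to the Student-t critical value is an assumption, acceptable for moderate repetition counts.

```python
from statistics import mean, stdev, NormalDist

def benchmark(run, min_reps=5, max_reps=100, cl=0.95, eps=0.05):
    """Repeat `run` (which returns one execution time) until the
    confidence interval at level `cl` is within relative error `eps`,
    bounded by min_reps and max_reps."""
    z = NormalDist().inv_cdf(0.5 + cl / 2)       # two-sided critical value
    times = [run() for _ in range(min_reps)]
    while len(times) < max_reps:
        m = mean(times)
        half = z * stdev(times) / len(times) ** 0.5   # CI half-width
        if half <= eps * m:                      # tight enough: stop
            break
        times.append(run())
    m = mean(times)
    half = z * stdev(times) / len(times) ** 0.5
    # Output accuracy parameters: estimate, CI half-width, repetitions
    return m, half, len(times)

# Usage with a noisy synthetic timer around 1 ms:
import random
random.seed(0)
est, ci, reps = benchmark(lambda: 1e-3 * random.uniform(0.9, 1.1))
```

With min_reps == max_reps this degenerates to a fixed number of measurements, matching the first input mode described on the slide.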

12. Different timing methods on a 16-node heterogeneous cluster
[Figure: Scatter and Gather execution time (sec) vs message size (0-100 KB), measured with the root, max, and global timing methods.]

Benchmarking cost (0-100 KB, 1 KB stride, 1 repetition):

  Timing method   Scatter (sec)   Gather (sec)
  Global          28.7            44.7
  Maximum         0.8             15.6
  Root            0.8             15.7
