MPI & MPICH Presenter: Naznin Fauzia CSE 788.08 Winter 2012
Outline • MPI-1 standards • MPICH-1 • MPI-2 • MPICH-2 • MPI-3
Overview • MPI (Message Passing Interface) • Specification for a standard library for message passing • Defined by the MPI Forum • Designed for high performance • on both massively parallel machines and on workstation clusters • Widely available • both freely available and vendor-supplied implementations
Goals • To develop a widely used standard for writing message-passing programs. • Establish a practical, portable, efficient, and flexible standard for message passing. • Design an application programming interface (not necessarily for compilers or a system implementation library). • Allow efficient communication: avoid memory-to-memory copying, allow overlap of computation and communication, and allow offload to a communication co-processor, where available. • Allow for implementations that can be used in a heterogeneous environment. • Allow convenient C and Fortran 77 bindings for the interface. • Assume a reliable communication interface: the user need not cope with communication failures; such failures are dealt with by the underlying communication subsystem.
Example

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    /* Initialize MPI */
    MPI_Init(&argc, &argv);

    /* Find out my identity in the default communicator */
    int my_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    int number;
    if (my_rank == 0) {
        number = -1;
        MPI_Send(&number, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (my_rank == 1) {
        MPI_Recv(&number, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Process 1 received number %d from process 0\n", number);
    }

    /* Shut down MPI */
    MPI_Finalize();
    return 0;
}
MPI-1 • Point-to-point communication • basic, pairwise communication (i.e., send and receive) • Collective operations • process-group collective communication operations (i.e., barrier, broadcast, scatter, gather, reduce); see the sketch after this slide • Process groups & communication contexts • how groups of processes are formed and manipulated, how unique communication contexts are obtained, and how the two are bound together into a communicator (i.e., MPI_COMM_WORLD) • Process topologies • explains a set of utility functions meant to assist in the mapping of process groups (a linearly ordered set) to richer topological structures such as multi-dimensional grids
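A minimal sketch of the collective operations named above, using the standard MPI-1 C bindings; the broadcast value and the choice of MPI_SUM are illustrative, not from the slides:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value, sum;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Broadcast: rank 0's value is copied to every process */
    value = (rank == 0) ? 42 : 0;
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* Reduce: combine each process's rank into a sum on rank 0 */
    MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    /* Barrier: every process waits here until all have arrived */
    MPI_Barrier(MPI_COMM_WORLD);

    if (rank == 0)
        printf("broadcast value = %d, sum of ranks = %d\n", value, sum);

    MPI_Finalize();
    return 0;
}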
MPI-1 contd. • Bindings for Fortran 77 and C • gives specific syntax in Fortran 77 and C, for all MPI functions, constants, and types. • Environmental Management and inquiry • explains how the programmer can manage and make inquiries of the current MPI environment • Profiling interface • ability to put performance profiling calls into MPI without the need for access to the MPI source code
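To make the profiling interface concrete: every MPI routine is also callable under a PMPI_ name, so a tool can interpose its own version of, say, MPI_Send and still reach the real implementation, all without access to the library source. A minimal sketch (the timing and output are illustrative; the signature follows the MPI-2 C binding):

#include <mpi.h>
#include <stdio.h>

/* Wrapper that shadows the library's MPI_Send; the real routine
   stays reachable through its PMPI_ alias. */
int MPI_Send(void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    double t0 = MPI_Wtime();
    int err = PMPI_Send(buf, count, datatype, dest, tag, comm);
    printf("MPI_Send to rank %d took %f s\n", dest, MPI_Wtime() - t0);
    return err;
}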
MPICH • Freely available implementation of the MPI specification • Argonne National Laboratory, Mississippi State University • Portability and high performance • “CH” => “Chameleon” • Symbol of adaptability • Other implementations – LAM, CHIMP-MPI, Unify, etc. • Focus on the workstation environment
Portability of MPICH • Distributed-memory parallel supercomputers • Intel Paragon, IBM SP2, Meiko CS-2, Thinking Machines CM-5, Ncube-2, Cray T3D • Shared-memory architectures • SGI Onyx, Challenge, Power Challenge, IBM SMPs, the Convex Exemplar, the Sequent Symmetry • Networks of workstations • Ethernet-connected Unix workstations (possibly from multiple vendors) • Sun, DEC, HP, SGI, IBM, Intel
MPICH Architecture • ADI (Abstract Device Interface) • Central mechanism for portability • Many implementations of the ADI • MPI functions are implemented in terms of ADI macros and functions • Not MPI-specific – can be used for any high-level message-passing library
ADI • A set of function definitions • Four sets of functions: • Specifying a message to be sent or received • Moving data between the API and the message-passing hardware • Managing lists of pending messages (both sent and received) • Providing basic information about the execution environment (e.g., how many tasks there are)
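To make the four groups concrete, here is a purely hypothetical sketch of what such an interface could look like; these names are invented for illustration – the real ADI routines in MPICH are MPID_-prefixed and have different signatures:

/* Hypothetical ADI-style declarations (illustrative names only;
   not the actual MPICH ADI symbols) */

/* 1. Specifying a message to be sent or received */
int adi_post_send(void *buf, int len, int dest, int tag);
int adi_post_recv(void *buf, int len, int src, int tag);

/* 2. Moving data between the API and the message-passing hardware */
int adi_device_transfer(void *buf, int len);

/* 3. Managing lists of pending messages (sent and received) */
int adi_test_pending(int handle);

/* 4. Basic information about the execution environment */
int adi_num_tasks(void);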
Upper Layer / Lower Layer • (diagram slides showing the MPICH layering; figures not reproduced)
Features of MPICH • Groups • An ordered list of process identifiers • Stored as an integer array • A process's rank in a group is its index in the list • Communicators • MPICH intracommunicators and intercommunicators use the same structure • Both have a local group and a remote group – identical (intra) or disjoint (inter) • Send and receive contexts – equal (intra) or different (inter) • Contexts are integers; see the sketch after this slide
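As a concrete illustration of the group/communicator relationship (standard MPI calls, not MPICH internals; the choice of taking the lower half of the ranks is arbitrary):

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int size, i;
    MPI_Group world_group, half_group;
    MPI_Comm half_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* assumes >= 2 processes */

    /* Extract the ordered group behind MPI_COMM_WORLD */
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);

    /* Subgroup containing ranks 0 .. size/2 - 1 */
    int n = size / 2;
    int *ranks = malloc(n * sizeof(int));
    for (i = 0; i < n; i++) ranks[i] = i;
    MPI_Group_incl(world_group, n, ranks, &half_group);

    /* Bind the group to a fresh context, yielding a new communicator
       (MPI_COMM_NULL is returned on processes outside the group) */
    MPI_Comm_create(MPI_COMM_WORLD, half_group, &half_comm);

    if (half_comm != MPI_COMM_NULL) MPI_Comm_free(&half_comm);
    MPI_Group_free(&half_group);
    MPI_Group_free(&world_group);
    free(ranks);
    MPI_Finalize();
    return 0;
}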
Features of MPICH • Collective operations • Implemented on top of point-to-point operations • Some vendor-specific collective operations (Meiko, Intel, and Convex) • Job startup • The MPI Forum did not standardize the mechanism for starting jobs • mpirun: mpirun -np 12 myprog
Features of MPICH • Command-line arguments and standard I/O • mpirun -np 64 myprog -myarg 13 < data.in > results.out • mpirun -np 64 -stdin data.in myprog -myarg 13 > results.out • Useful commands • mpicc -c myprog.c
MPE (Multi-Processing Environment) Extension Library • Parallel X graphics – routines to provide all processes with access to a shared X display • Logging – time-stamped event trace files (see the sketch after this slide) • Sequential sections – one process at a time, in rank order • Error handling – MPI_Errhandler_set
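A sketch of MPE's logging routines as documented with MPICH; the event numbers, state name, color, and log-file prefix are arbitrary choices:

#include <mpi.h>
#include <mpe.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    MPE_Init_log();

    /* User-chosen event numbers 1 and 2 bracket a "compute" state */
    MPE_Describe_state(1, 2, "compute", "red");

    MPE_Log_event(1, 0, "start compute");
    /* ... application work to be traced ... */
    MPE_Log_event(2, 0, "end compute");

    /* Merge per-process buffers into one time-stamped trace file */
    MPE_Finish_log("myprog");

    MPI_Finalize();
    return 0;
}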
Contributions of MPICH • MPICH has succeeded in popularizing the MPI standard • Encouraging vendors to provide MPI to their customers • By helping to create demand • By offering them a convenient starting point
MPI-2 • Parallel I/O • Dynamic process management • One-sided communication • New language bindings – C++ & F90
Sequential I/O • (figure of processes 0–3 omitted) • Good for small process counts (~100) and small datasets (~MB) • Not good for large process counts (~100K) and large datasets (~TB)
Parallel I/O • (figure of processes P0 … P(n-1) accessing a shared FILE omitted) • Multiple processes of a parallel program accessing data from a common file • Each process accesses a chunk of data using an individual file pointer • MPI_File_open, MPI_File_seek, MPI_File_read, MPI_File_close; see the sketch after this slide
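A minimal sketch of the four calls listed above; the file name, chunk size, and read-only mode are assumptions for illustration:

#include <mpi.h>

#define CHUNK 1024

int main(int argc, char **argv) {
    int rank;
    char buf[CHUNK];
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* All processes open the same file collectively */
    MPI_File_open(MPI_COMM_WORLD, "data.in", MPI_MODE_RDONLY,
                  MPI_INFO_NULL, &fh);

    /* Each process moves its individual file pointer to its own chunk */
    MPI_File_seek(fh, (MPI_Offset)rank * CHUNK, MPI_SEEK_SET);
    MPI_File_read(fh, buf, CHUNK, MPI_CHAR, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}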
One-Sided Communication • Remote Memory Access (RMA) • Window – a specific region of process memory made available for RMA by other processes • MPI_Win_create – called by all processes within a communicator • Origin: the process that performs the call • Target: the process whose memory is accessed • Communication calls • MPI_Get: remote read • MPI_Put: remote write • MPI_Accumulate: remote update
One-sided communication • (figure contrasting two-sided MPI_Send/MPI_Recv with one-sided MPI_Get/MPI_Put omitted)
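A minimal active-target RMA sketch using fence synchronization (all standard MPI-2 calls; assumes at least two processes, and the transferred value is illustrative):

#include <mpi.h>

int main(int argc, char **argv) {
    int rank, buf = 0, value = 99;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Every process exposes one int of its memory as an RMA window */
    MPI_Win_create(&buf, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    /* Fences collectively open and close an access epoch */
    MPI_Win_fence(0, win);
    if (rank == 0)
        /* Origin (rank 0) writes directly into target rank 1's window */
        MPI_Put(&value, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
    MPI_Win_fence(0, win); /* after this, rank 1's buf == 99 */

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}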
Dynamic process management • MPI-1 • Does not specify how processes are created • Does not allow processes to enter or leave a running parallel application • MPI-2 • Start new processes, send them signals, find out when they die, and establish communication between two processes; see the sketch after this slide
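A minimal sketch of MPI-2 process creation; the executable name "worker" and the count of 4 are assumptions:

#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Comm children;
    int errcodes[4];

    MPI_Init(&argc, &argv);

    /* Collectively start 4 new copies of "worker"; the result is an
       intercommunicator linking the parents to the spawned children */
    MPI_Comm_spawn("worker", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                   0, MPI_COMM_WORLD, &children, errcodes);

    /* Parents and children can now exchange messages via 'children' */

    MPI_Finalize();
    return 0;
}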
MPICH-2 • ADI-3 – provides routines to support MPI-1 & MPI-2 • Two types of RMA operations • Active target – the target process must call an MPI routine • Origin calls MPI_Win_start/MPI_Win_complete • Target calls MPI_Win_post/MPI_Win_wait • Passive target – the target process is not required to call any MPI routine • Origin calls MPI_Win_lock/MPI_Win_unlock; see the sketch after this slide
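A passive-target fragment to contrast with the fence example above: only the origin makes MPI calls. It assumes the same integer window 'win' and at least two processes:

/* Origin (rank 0) updates rank 1's window; rank 1 makes no RMA call */
if (rank == 0) {
    int value = 7;
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
    MPI_Put(&value, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
    MPI_Win_unlock(1, win); /* returns once the put is complete at the target */
}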
MPICH-2 • Dynamic processes • There are no absolute, global process ids • No data structure maps a process rank to a “global rank” (i.e., a rank in MPI_COMM_WORLD) • All communication is handled locally, in terms of possible virtual connections to other processes • Arrays of virtual connections, indexed by rank
MPI-3 • Improved scalability • Better support for multi-core nodes, clusters & applications • Proposed => MPI_Count (a count type larger than int) • Extension of collective operations • Include non-blocking collectives (see the sketch after this slide) • Sparse collective operations • MPI_Sparse_gather
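Non-blocking collectives, as eventually standardized in MPI-3, return a request so that communication can overlap computation; a minimal sketch (the function below and its buffers are illustrative):

#include <mpi.h>

/* Start a global sum, do unrelated work, then wait for the result */
void overlapped_sum(double *local, double *global, int n) {
    MPI_Request req;
    MPI_Iallreduce(local, global, n, MPI_DOUBLE, MPI_SUM,
                   MPI_COMM_WORLD, &req);
    /* ... computation that does not depend on 'global' ... */
    MPI_Wait(&req, MPI_STATUS_IGNORE);
}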
MPI-3 • Extension of one-sided communication • To support RMA to arbitrary locations, with no constraints on memory (such as symmetric allocation or collective window creation) • RMA operations that are imprecise (such as access to overlapping storage) must be permitted, even if the behavior is undefined • The required level of consistency, atomicity, and completeness should be flexible • Read-modify-write and compare-and-swap operations are needed for efficient algorithms • MPI_Get_accumulate, MPI_Compare_and_swap (see the sketch after this slide) • Backward compatibility
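A fragment showing compare-and-swap as it appears in the final MPI-3 standard; the window 'win' (an integer window created earlier, as in the fence example) and the target rank are assumptions:

/* Atomically: if rank 1's window word equals 'compare', replace it with
   'origin'; the old target value is always returned in 'result'. */
int origin = 1, compare = 0, result;
MPI_Win_lock(MPI_LOCK_SHARED, 1, 0, win);
MPI_Compare_and_swap(&origin, &compare, &result, MPI_INT, 1, 0, win);
MPI_Win_unlock(1, win);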
References • http://www.mcs.anl.gov/research/projects/mpi/ • http://www.mpi-forum.org • W. Gropp et al., “A High-Performance, Portable Implementation of the MPI Message Passing Interface Standard” • Al Geist et al., “MPI-2: Extending the Message Passing Interface” • MPICH Abstract Device Interface, Version 3.3 Reference Manual • http://meetings.mpi-forum.org/presentations/MPI_Forum_SC10.ppt.pdf • http://wissrech.ins.uni-bonn.de/teaching/seminare/technum/pdfs/iseringhausen_mpi2.pdf • www.sdsc.edu/us/training/workshops/docs