An Input/Output LIbrary for cluster of SMP
Adrien Lebre, Yves Denneulin
{Adrien.Lebre,Yves.Denneulin}@imag.fr
ID-IMAG (UMR 5132) Laboratory, Grenoble, France
BULL - HPC, Échirolles, France
6th May 2005
aIOLi - CCGRID05 - May 2005 - Adrien Lebre - © Bull-ID LIPS 2004
Plan
1 Introduction
    Context
    Parallel Input/Output
2 aIOLi system
    Preamble
    Principles
    Technical aspects
3 Results
    POSIX vs aIOLi
    MPI I/O vs aIOLi
4 Conclusion
Context

Environment
  - Cluster of SMPs
  - Linux
  - High Performance Computing
I/O-intensive applications
  - CPU-bound applications ⇒ I/O-bound applications
  - Remote hard drive I/O
Parallel I/O
  - Handling concurrent accesses to the same resource (a file)
  - Accesses differ in size and in offset
  - Example: matrix product
Parallel I/O: Example

Matrix product
  - Specific parts to fetch, according to the data distribution (columns, rows, BLOCK/BLOCK, BLOCK/CYCLIC, ...)
  - Several requests at the same time: disjoint or contiguous
  - "Lethal" behavior for the I/O subsystem

Requests issued by the processes of one SMP client:
  P0: read(fd,buf,1024); //file position=0
      read(fd,buf,1024); //file position=4096
      ...
  P1: read(fd,buf,1024); //file position=1024
      read(fd,buf,1024); //file position=5120
      ...
  P2: read(fd,buf,1024); //file position=2048
      read(fd,buf,1024); //file position=6144
      ...
  P3: read(fd,buf,1024); //file position=3072
      read(fd,buf,1024); //file position=7168
      ...

  - The first 4 requests are independent, but contiguous from a global point of view.
  - Once these 4 requests have been processed, what about the new requests? Contiguous or disjoint?
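The file positions in the example follow a simple interleaved pattern. A minimal sketch of that arithmetic, assuming 4 processes that each read 1024 bytes per round; the helper name `request_offset` and its parameters are hypothetical and are not part of aIOLi:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper (not an aIOLi function): file offset of the
 * `round`-th request issued by process `rank`, when `nprocs` processes
 * each read `chunk` bytes per round in an interleaved layout. */
static off_t request_offset(int rank, int nprocs, int round, size_t chunk)
{
    assert(rank >= 0 && rank < nprocs && round >= 0);
    /* Each round covers nprocs * chunk bytes; rank selects the slot. */
    return ((off_t)round * nprocs + rank) * (off_t)chunk;
}
```

With `nprocs = 4` and `chunk = 1024`, round 0 yields offsets 0, 1024, 2048 and 3072 for P0 to P3, and round 1 yields 4096 to 7168: each round is one contiguous 4096-byte range even though every process only sees its own 1024-byte request.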
Parallel I/O

Requirements / constraints
  - Methods for disjoint data (readv) ⇒ complexity of the API
  - Collective operations ⇒ synchronization mechanisms
  - Logical view (the files) ⇒ physical placement (block devices)
Available solutions - related work
  - Many parallel file systems: more or less efficient, but hardware dependent
      "Cluster compliant": PVFS, NFSparallel, GPFS, Lustre
      Designed for "Parallel I/O": PIOUS, VESTA ...
  - Libraries: focus on portability
      Many exist; MPI I/O is the reference.
      Sophisticated API ⇒ development overhead / language bindings
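To give an idea of the extra API surface that vectored calls bring, here is a minimal sketch using the POSIX `readv` call. The wrapper name `read_into_two_buffers` is hypothetical. Note that `readv` scatters one contiguous file range into several memory buffers; disjoint file regions still need one call (or one seek) per region.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical wrapper: read one contiguous range, starting at the current
 * file position of `fd`, into two separate memory buffers with a single
 * readv() call. Returns the total number of bytes read, or -1 on error. */
static ssize_t read_into_two_buffers(int fd, void *a, size_t a_len,
                                     void *b, size_t b_len)
{
    struct iovec iov[2];

    assert(a != NULL && b != NULL);
    iov[0].iov_base = a;   /* first destination buffer  */
    iov[0].iov_len  = a_len;
    iov[1].iov_base = b;   /* second destination buffer */
    iov[1].iov_len  = b_len;
    return readv(fd, iov, 2);
}
```

Compared with a plain `read(fd,buf,1024)`, the caller has to build and maintain an `iovec` array, which is the kind of complexity the slide refers to.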
Context summary

[Figure: SMP clients, each running processes P1 ... Pn, and IO server 1, IO server 2, ..., IO server n]
aIOLi system: Objectives

Supply Parallel I/O algorithms ⇒ efficiency
  - scheduling policies
  - aggregating accesses
  - overlapping accesses
Only through the use of the ubiquitous POSIX calls ⇒ simplicity
  - open/creat/lseek/read/write/close
Minimal overhead
  - avoid expensive synchronisation mechanisms (barrier, ...)
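The idea of aggregating accesses can be illustrated with a small sketch: pending requests are sorted by offset, and requests that touch or overlap are merged into one larger request. This is only an illustration under stated assumptions; `struct io_req` and `aggregate` are hypothetical names and do not describe aIOLi's actual data structures or scheduling policy.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical request descriptor; not aIOLi's actual data structure. */
struct io_req {
    off_t  offset;  /* start position in the file */
    size_t len;     /* number of bytes requested  */
};

/* qsort comparator: order requests by increasing file offset. */
static int cmp_offset(const void *a, const void *b)
{
    const struct io_req *x = a, *y = b;
    return (x->offset > y->offset) - (x->offset < y->offset);
}

/* Sort `reqs` by offset and merge, in place, requests that touch or
 * overlap. Returns the number of merged requests, stored at the front
 * of the array. */
static size_t aggregate(struct io_req *reqs, size_t n)
{
    size_t out = 0;

    if (n == 0)
        return 0;
    assert(reqs != NULL);
    qsort(reqs, n, sizeof *reqs, cmp_offset);
    for (size_t i = 1; i < n; i++) {
        off_t cur_end = reqs[out].offset + (off_t)reqs[out].len;
        if (reqs[i].offset <= cur_end) {
            /* Contiguous or overlapping: extend the current request. */
            off_t end = reqs[i].offset + (off_t)reqs[i].len;
            if (end > cur_end)
                reqs[out].len = (size_t)(end - reqs[out].offset);
        } else {
            /* Gap in the file: start a new merged request. */
            reqs[++out] = reqs[i];
        }
    }
    return out + 1;
}
```

Applied to four 1024-byte requests at offsets 3072, 0, 2048 and 1024, this yields a single request of 4096 bytes at offset 0, which could then be served by one POSIX `read` instead of four.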
Evaluation of "the Linux" I/O stack

1 GB file decomposition on an SMP (kernel 2.4.27, IDPOT cluster, NFS version 3, mpich 1.2.5)

[Figure: completion time (sec, 0 to 300) versus access granularity (KBytes: 1, 4, 8, 16, 32, 64, 128, 512, 1024, 4096); curves labelled 1, 2, 4, 8 and "1 randomize"]

Observations
  - 1 process ⇒ sequential read (optimal)
  - More processes ⇒ lower performance
  - 1 process with random accesses ⇒ better performance for large accesses than the parallel approach