Awareness of MPI Virtual Process Topologies on the Single-Chip Cloud Computer Steffen Christgau, Bettina Schnor Potsdam University Institute of Computer Science Operating Systems and Distributed Systems HIPS@IPDPS, 21. May 2012
Outline
1. The Single-Chip Cloud Computer (SCC)
2. RCKMPI: MPI on the SCC
3. Topology Awareness for RCKMPI: The Concept
4. Topology Awareness for RCKMPI: Evaluation
Summary and Future Work
Bettina Schnor (Potsdam University) MPI Topology Awareness on the SCC Frame 2 of 19
1. The Single-Chip Cloud Computer (SCC)
Many-Core Architecture Research Community (MARC)
- established by Intel: research by universities together with Intel
- world-wide community (dominated by European institutions)
- website, regular symposia (every 6 months, 5 up to now)
- Our group is a MARC member
  - Focus: application scalability
  - Experiences with parallel ASP and climate simulation
Single-Chip Cloud Computer
[Figure: SCC architecture. Each tile holds Core 0 and Core 1, a 256 KB L2 cache per core, a 16 KB MPB and a router; the chip also shows the memory controllers MC 0 to MC 3, the VRC and the SIF.]
- 24 tiles, 48 P54C cores, connected via a Network-on-Chip, no cache coherence
- fast 16 KB SRAM on each tile: the Message Passing Buffer (MPB)
2. RCKMPI: MPI on the SCC
- fork of MPICH2
[Figure: MPICH2 software stack, top to bottom: Application; MPI (Message Passing Interface); MPICH2 (MPI implementation) next to ROMIO (MPI-IO implementation) with ADIO; Abstract Device Interface (ADI3) and Process Management Interface; devices: Channel-3 (CH3), BG, Cray; CH3 channels: SCCMPB, SCCSHM, SCCMulti, Nemesis, Sock.]
SCCMPB
- uses the fast Message Passing Buffer of each tile as shared memory
- divides it into n equal-size Exclusive Write Sections (EWS) ⇒ remote write, local read
[Figure: MPI_COMM_WORLD with MPI ranks 0 to n running on cores 0 to n; the MPB of core/rank 0 is split into write sections for rank 1, ..., rank n-1, rank n.]
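The division into Exclusive Write Sections can be illustrated with a little arithmetic. The following is a minimal sketch, not RCKMPI's actual code: it assumes that each core owns 8 KB of MPB (the 16 KB per tile shared by two cores) and that sections are aligned to 32-byte cache lines. The names `ews_size` and `ews_offset` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 16 KB of MPB per tile, shared evenly by the two cores. */
#define MPB_BYTES_PER_CORE 8192u
/* Assumption: sections are aligned to 32-byte cache lines. */
#define CACHE_LINE_BYTES 32u

/* Size in bytes of one Exclusive Write Section when n processes are started.
 * The MPB is split into n equal sections, rounded down to whole cache lines. */
size_t ews_size(int n)
{
    assert(n > 0);
    size_t lines = MPB_BYTES_PER_CORE / CACHE_LINE_BYTES;
    return (lines / (size_t)n) * CACHE_LINE_BYTES;
}

/* Byte offset, inside any remote MPB, of the section that `writer` may
 * write to. Only `writer` writes there; only the MPB owner reads it. */
size_t ews_offset(int writer, int n)
{
    assert(writer >= 0 && writer < n);
    return (size_t)writer * ews_size(n);
}
```

Under these assumptions, two processes get 4096 bytes per section, while 48 processes get only 160 bytes (five cache lines) each, so the space available per message shrinks as more processes are started.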
Comparison of different CH3 devices at maximum Manhattan distance 8
[Figure: bandwidth in MByte/s (logarithmic axis, 0.001 to 100) over message size in bytes (4 B to 4 MiB) for the RCKMPI sccmulti, sccmpb and sccshm CH3 devices.]
Bandwidths for Manhattan distance 0, 5 and 8 (two processes started)
[Figure: bandwidth in MByte/s (0 to 90) over message size in bytes (4 B to 4 MiB) for the core pairs 00 and 01, 00 and 10, and 00 and 47.]
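The three core pairs correspond to the three distances if cores are numbered tile by tile on a mesh that is six tiles wide. The sketch below assumes exactly that mapping (two consecutive core IDs per tile, tiles numbered row by row on a 6 x 4 mesh); the function name `manhattan_distance` is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_CORES      48
#define CORES_PER_TILE 2
#define MESH_COLS      6   /* assumption: 6 x 4 tile mesh, numbered row by row */

/* Number of router hops between the tiles that host two cores. */
int manhattan_distance(int core_a, int core_b)
{
    assert(core_a >= 0 && core_a < NUM_CORES);
    assert(core_b >= 0 && core_b < NUM_CORES);

    int tile_a = core_a / CORES_PER_TILE;
    int tile_b = core_b / CORES_PER_TILE;

    int dx = abs(tile_a % MESH_COLS - tile_b % MESH_COLS);
    int dy = abs(tile_a / MESH_COLS - tile_b / MESH_COLS);
    return dx + dy;
}
```

With this mapping, cores 00 and 01 share a tile (distance 0), cores 00 and 10 sit at opposite ends of one mesh row (distance 5), and cores 00 and 47 sit in opposite corners (distance 8).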
Bandwidths for maximum Manhattan distance 8 and a varied number of MPI processes
[Figure: bandwidth in MByte/s (0 to 70) over message size in bytes (4 B to 4 MiB) for 2, 12, 24 and 48 MPI processes.]
Remember: SCCMPB uses the fast Message Passing Buffer of each tile as shared memory (384 KB in total).
[Figure: MPI_COMM_WORLD with MPI ranks 0 to n running on cores 0 to n; the MPB of core/rank 0 is split into write sections for rank 1, ..., rank n-1, rank n.]
The MPB is divided equally into n sections ⇒ the section size depends on the number of started MPI processes.
3. Topology Awareness for RCKMPI: The Concept
The bandwidth between two RCKMPI processes depends on the number of started MPI processes, because a fully connected network between all MPI processes is maintained.
Application Behaviour
[Figure: Task Interaction Graph of process 1, process 2, ..., process n.]
Goal: The bandwidth between communicating processes, the so-called neighbors in the Task Interaction Graph, should be increased.
Requirements:
1. An improved MPB layout must consider both communication neighbors and group communication (barriers, broadcasts, gather/scatter, ...).
2. Each MPI process has to know its new offset within all remote MPBs.
Putting things together ...
[Figure: MPB layouts.
Original layout: one write section per process (proc. 1, proc. 2, ..., proc. n-1, proc. n), each holding a channel header and message payload.
New layout without topology information: the channel headers (p 1, p 2, ..., p n) come first, followed by a payload area divided among proc. 1, proc. 2, ..., proc. n-1, proc. n.
New layout with topology information: the channel headers (p 1, p 2, ..., p n) come first, followed by a payload area divided among p1, proc. 2 and pn.]
An internal barrier is used for the recalculation phase of the new MPB addresses.
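One way to read the layout with topology information is sketched below. It is an illustration under stated assumptions, not the RCKMPI implementation: every process keeps a small header slot in each remote MPB, and the remaining payload area is shared only among the neighbors declared by the topology. The 32-byte cache line, the rule that non-neighbors receive no payload area, and the names `section_t` and `layout_mpb` are all assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 32u   /* assumption: 32-byte cache lines */

/* Where one writer may place data inside a remote MPB (hypothetical type). */
typedef struct {
    size_t header_off;   /* offset of the writer's channel header slot   */
    size_t payload_off;  /* offset of the writer's payload section       */
    size_t payload_len;  /* 0 for writers that are not neighbors         */
} section_t;

/* Lay out an MPB of mpb_bytes for n writers: n header slots of
 * header_lines cache lines each, then a payload area split evenly
 * (in whole cache lines) among the writers flagged in is_neighbor.
 * Returns 0 on success, -1 if the headers alone do not fit. */
int layout_mpb(size_t mpb_bytes, int n, size_t header_lines,
               const int *is_neighbor, section_t *out)
{
    assert(n > 0 && is_neighbor != NULL && out != NULL);

    size_t header_bytes = header_lines * CACHE_LINE;
    size_t header_area = (size_t)n * header_bytes;
    if (header_area > mpb_bytes)
        return -1;

    int neighbors = 0;
    for (int i = 0; i < n; i++)
        if (is_neighbor[i])
            neighbors++;

    size_t payload_lines = (mpb_bytes - header_area) / CACHE_LINE;
    size_t per_neighbor = neighbors > 0
        ? (payload_lines / (size_t)neighbors) * CACHE_LINE
        : 0;

    size_t next = header_area;   /* payload area starts after all headers */
    for (int i = 0; i < n; i++) {
        out[i].header_off = (size_t)i * header_bytes;
        out[i].payload_off = next;
        out[i].payload_len = is_neighbor[i] ? per_neighbor : 0;
        next += out[i].payload_len;
    }
    return 0;
}
```

Because every process must compute the same offsets for every remote MPB, all processes have to finish this recalculation before any of them writes with the new layout, which is where the internal barrier comes in.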
MPI offers an API to specify a virtual process topology:

```c
#include <mpi.h>

#define NUM_DIMS 2

int main(int argc, char **argv)
{
    int numProcs;
    int grid_dims[NUM_DIMS] = {0};  /* 0 lets MPI_Dims_create choose the extent */
    int grid_periods[NUM_DIMS];
    MPI_Comm comm_topo;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numProcs);

    /* for a grid, set all items of grid_periods to 0 */
    for (int i = 0; i < NUM_DIMS; i++)
        grid_periods[i] = 0;

    MPI_Dims_create(numProcs, NUM_DIMS, grid_dims);
    MPI_Cart_create(MPI_COMM_WORLD, NUM_DIMS, grid_dims,
                    grid_periods, 1 /* allow rank reordering */, &comm_topo);

    MPI_Comm_free(&comm_topo);
    MPI_Finalize();
    return 0;
}
```
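A Cartesian topology tells the MPI library which ranks are communication neighbors. The sketch below shows, without calling MPI, which ranks are adjacent in a non-periodic 2D grid. It relies on the row-major rank ordering that MPI uses for Cartesian communicators; the names `grid_neighbors`, `grid_neighbors_of` and `NO_NEIGHBOR` are hypothetical, and `NO_NEIGHBOR` merely plays the role of `MPI_PROC_NULL`.

```c
#include <assert.h>

#define NO_NEIGHBOR (-1)   /* stands in for MPI_PROC_NULL in this sketch */

/* Ranks adjacent to one process in a non-periodic 2D grid. */
typedef struct {
    int up, down, left, right;
} grid_neighbors;

/* Neighbors of `rank` in a rows x cols grid with row-major rank order,
 * i.e. rank = row * cols + col. */
grid_neighbors grid_neighbors_of(int rank, int rows, int cols)
{
    assert(rows > 0 && cols > 0);
    assert(rank >= 0 && rank < rows * cols);

    int row = rank / cols;
    int col = rank % cols;

    grid_neighbors nb;
    nb.up    = row > 0        ? rank - cols : NO_NEIGHBOR;
    nb.down  = row < rows - 1 ? rank + cols : NO_NEIGHBOR;
    nb.left  = col > 0        ? rank - 1    : NO_NEIGHBOR;
    nb.right = col < cols - 1 ? rank + 1    : NO_NEIGHBOR;
    return nb;
}
```

In an MPI program the same information is available from `MPI_Cart_shift` on the Cartesian communicator. In an 8 x 6 grid of 48 processes, for example, each process has at most four neighbors out of 47 possible peers, which is the information a topology-aware MPB layout can exploit.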
4. Topology Awareness for RCKMPI: Evaluation
[Figure: bandwidth in MByte/s (0 to 30) over message size in bytes (4 B to 4 MiB) for three configurations: enhanced RCKMPI with 1D topology (48 procs, 2 cache lines), enhanced RCKMPI with 1D topology (48 procs, 3 cache lines), and enhanced RCKMPI without topology (48 procs).]