Overview of Common Strategies for Parallelization
Ivan Girotto – igirotto@ictp.it
International Centre for Theoretical Physics (ICTP)
Cinvestav Abacus, 16 Feb 2018


  1. Overview of Common Strategies for Parallelization — Ivan Girotto, igirotto@ictp.it, International Centre for Theoretical Physics (ICTP). Cinvestav Abacus, 16 Feb 2018

  2. Serial Programming. A problem is broken into a discrete series of instructions. Instructions are executed one after another. Only one instruction may execute at any moment in time. [Diagram: Program and Data reside in Memory; the CPU issues Load/Store operations to Memory]

  3. Parallel Programming. [Diagram: two CPUs, each with its own Memory, connected by communication]

  4. Concurrency. The first step in developing a parallel algorithm is to decompose the problem into tasks that can be executed concurrently.
• A problem is broken into discrete parts that can be solved concurrently
• Each part is further broken down to a series of instructions
• Instructions from each part execute simultaneously on different processors
• An overall control / coordination mechanism is employed
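The steps above can be sketched in pure Python with the standard library's `concurrent.futures` (a minimal sketch; the function names are mine, and in CPython the GIL prevents threads from giving a real CPU speedup, so this illustrates the decomposition and coordination, not performance):

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(data, lo, hi):
    # One task: a series of instructions on its own part of the problem.
    return sum(x * x for x in data[lo:hi])

def parallel_sum_of_squares(data, ntasks=4):
    n = len(data)
    # Decompose the problem into ntasks discrete parts.
    bounds = [(t * n // ntasks, (t + 1) * n // ntasks) for t in range(ntasks)]
    # The executor is the overall control / coordination mechanism:
    # it runs the tasks concurrently and collects their results.
    with ThreadPoolExecutor(max_workers=ntasks) as pool:
        futures = [pool.submit(partial_sum, data, lo, hi) for lo, hi in bounds]
        return sum(f.result() for f in futures)

print(parallel_sum_of_squares(list(range(10))))  # 285, same as the serial sum
```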

  5. What is a Parallel Program. [Diagram: two processes, ranks 0 and 1, each executing: init → read and distribute data (communication) → compute on sub-domain A / sub-domain B → reduce data → update sub-domain (communication) → terminate]

  6. Fundamental Steps of Parallel Design
• Identify portions of the work that can be performed concurrently
• Map the concurrent pieces of work onto multiple processes running in parallel
• Distribute the input, output and intermediate data associated with the program
• Manage accesses to data shared by multiple processors
• Synchronize the processors at the various stages of the parallel program execution

  7. Types of Parallelism
• Functional (or task) parallelism: different people are performing different tasks at the same time
• Data parallelism: different people are performing the same task, but on different, equivalent and independent objects
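The two kinds of parallelism above can be contrasted in a small sketch (assumed example using the standard library's `concurrent.futures`; the data and operations are mine, chosen only for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

data = [3, 1, 4, 1, 5, 9]

# Data parallelism: every worker applies the SAME operation
# to a different, independent piece of the data.
with ThreadPoolExecutor() as pool:
    squares = list(pool.map(lambda x: x * x, data))

# Functional (task) parallelism: workers perform DIFFERENT
# operations at the same time, here on the same input.
with ThreadPoolExecutor() as pool:
    f_min = pool.submit(min, data)
    f_max = pool.submit(max, data)
    f_sum = pool.submit(sum, data)
    results = (f_min.result(), f_max.result(), f_sum.result())

print(squares)   # [9, 1, 16, 1, 25, 81]
print(results)   # (1, 9, 23)
```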

  8. Process Interactions
• The effective speed-up obtained by parallelization depends on the amount of overhead we introduce in making the algorithm parallel
• There are mainly two key sources of overhead:
1. Time spent in inter-process interactions (communication)
2. Time some processes may spend being idle (synchronization)

  9. Barrier and Synchronization. [Diagram: processes wait at a barrier — “all here?”]

  10. Limitations of Parallel Computing
• The fraction of serial code limits the parallel speedup
• The degree to which tasks/data can be subdivided limits concurrency and parallel execution
• Load imbalance:
  • parallel tasks have different amounts of work
  • CPUs are partially idle
  • redistributing work helps but has limitations
• Communication and synchronization overhead
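The first bullet is Amdahl's law: if a fraction s of the program is serial, the speedup on p processors is 1 / (s + (1 − s)/p), which can never exceed 1/s. A small sketch (the function name is mine):

```python
def amdahl_speedup(serial_fraction, nprocs):
    # Amdahl's law: the serial fraction s caps the achievable
    # speedup at 1/s, no matter how many CPUs are used.
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / nprocs)

# With only 5% serial code the speedup saturates well below
# the processor count: the limit is 1 / 0.05 = 20.
for p in (2, 8, 64, 1024):
    print(p, amdahl_speedup(0.05, p))
```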

  11. Shared Resources
• In parallel programming, developers must manage exclusive access to shared resources
• Resources come in different forms:
– concurrent read/write (including parallel write) to shared memory locations
– concurrent read/write (including parallel write) to shared devices
– a message that must be sent and received

  12. [Diagram: a race condition on shared data. Thread 1 and Thread 2 each execute load a; add a 1; store a on a shared variable a = 10. Both threads load 10 into their private data before either stores, so both store 11 and the shared value ends at 11 instead of 12 — one update is lost]
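The lost update on the slide can be replayed deterministically (a sketch of one possible interleaving, not a real multithreaded run; in real code the fix is to make the load–add–store sequence atomic, e.g. with a lock):

```python
def interleaved_increment(shared):
    # Deterministic replay of the interleaving on the slide.
    r1 = shared["a"]      # thread 1: load a  (reads 10 into private data)
    r2 = shared["a"]      # thread 2: load a  (also reads 10)
    shared["a"] = r1 + 1  # thread 1: store a (writes 11)
    shared["a"] = r2 + 1  # thread 2: store a (overwrites with 11)
    return shared["a"]

print(interleaved_increment({"a": 10}))  # 11, not 12: one increment is lost
```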

  13. Parallelism - 101
• There are two main reasons to write a parallel program:
• access to a larger amount of memory (aggregated, going bigger)
• reduced time to solution (going faster)

  14. Scalable Programming

  15. NETWORK

  16. Granularity
• Granularity is determined by the decomposition level (the number of tasks) into which we want to divide the problem
• The degree to which tasks/data can be subdivided limits concurrency and parallel execution
• Parallelization has to become “topology aware”:
§ coarse-grained and fine-grained parallelization has to be mapped to the topology to reduce memory and I/O contention
§ make your code modular to enable different levels of granularity and consequently become more “platform adaptable”

  17. Static Data Partitioning. The simplest data decomposition schemes for dense matrices are 1-D block distribution schemes.

  18. Block Array Distribution Schemes. Block distribution schemes can be generalized to higher dimensions as well. The degree to which tasks/data can be subdivided limits concurrency and parallel execution!
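A 2-D block distribution over a process grid can be sketched as follows (a minimal sketch assuming the matrix dimensions divide evenly over the grid; the function name is mine):

```python
def block_2d(nrows, ncols, prows, pcols, prow, pcol):
    """Return the (row_range, col_range) owned by process (prow, pcol)
    in a prows x pcols process grid, assuming even divisibility."""
    br = nrows // prows  # block height
    bc = ncols // pcols  # block width
    return (range(prow * br, (prow + 1) * br),
            range(pcol * bc, (pcol + 1) * bc))

# An 8x8 matrix on a 2x2 process grid: process (1, 0) owns the
# lower-left 4x4 block.
rows, cols = block_2d(8, 8, 2, 2, 1, 0)
print(list(rows), list(cols))  # [4, 5, 6, 7] [0, 1, 2, 3]
```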

  19. 1D Distribution of a 3D Domain

  20. Distributed Data vs Replicated Data
• A replicated data distribution is useful if it helps to reduce the communication among processes, at the cost of bounding scalability
• Distributed data is the ideal data distribution, but not always applicable to all data-sets
• Usually complex applications are a mix of those techniques => distribute large data sets; replicate small data

  21. Global vs Local Indexes
• In sequential code you always refer to global indexes
• With distributed data you must handle the distinction between global and local indexes (and possibly implement utilities for transparent conversion)
Local index:  1 2 3 | 1 2 3 | 1 2 3
Global index: 1 2 3 | 4 5 6 | 7 8 9
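The transparent-conversion utilities mentioned above can be sketched for the 1-based, block-of-3 layout on the slide (the function names are mine):

```python
def local_to_global(rank, local_idx, block):
    # 1-based indices, as on the slide: each process owns `block` elements.
    return rank * block + local_idx

def global_to_local(global_idx, block):
    # Returns (owner rank, 1-based local index) for a global index.
    rank, rem = divmod(global_idx - 1, block)
    return rank, rem + 1

print(local_to_global(2, 1, 3))  # 7: first element on rank 2
print(global_to_local(5, 3))     # (1, 2): global index 5 lives on rank 1
```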

  22. Collaterals to Domain Decomposition /1. Are all of the domain's dimensions always multiples of the number of tasks/processes we are willing to use?
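They usually are not, so the distribution must handle the remainder. One common sketch (the function name is mine) spreads the leftover elements over the first ranks, so local sizes differ by at most one:

```python
def block_range(n, nprocs, rank):
    # Distribute n elements over nprocs processes; the remainder
    # r = n % nprocs is given to the first r ranks, one extra each.
    base, rem = divmod(n, nprocs)
    lo = rank * base + min(rank, rem)
    hi = lo + base + (1 if rank < rem else 0)
    return lo, hi  # half-open interval [lo, hi)

# 10 elements over 4 processes -> local sizes 3, 3, 2, 2
print([block_range(10, 4, r) for r in range(4)])
# [(0, 3), (3, 6), (6, 8), (8, 10)]
```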

  23. Again on Domain Decomposition

  24. [Diagram: process P0 holds the full data]

  25. call MPI_BCAST( ... ) [Diagram: P0 (root) broadcasts the data to P1, P2, P3]

  26. call evolve( d]act ) [Diagram: P0, P1, P2, P3 each compute on their own copy]

  27. call MPI_Gather( ..., ..., ... ) [Diagram: P1, P2, P3 send their contributions back to P0 (root)]

  28. Replicated Data
• Compute the domain (and workload) distribution among processes
• Master-slave: P0 drives all processes
• Large amount of data communication
– at each step P0 distributes data to all processes and collects the contribution of each process
• Problem size scaling is limited by memory capacity
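The distribute → compute → collect cycle of slides 24–28 can be simulated in pure Python without MPI (a sketch only: `evolve` here is a stand-in update of my own choosing, not the transport code's routine, and the "processes" are plain function calls):

```python
def evolve(chunk):
    # Stand-in for the compute phase: a trivial per-element update.
    return [x + 1 for x in chunk]

def step(root_data, nprocs):
    n = len(root_data)
    # Root distributes: worker r receives its slice of the data
    # (the MPI_BCAST / scatter phase on the slides).
    slices = [root_data[r * n // nprocs:(r + 1) * n // nprocs]
              for r in range(nprocs)]
    # Each worker computes on its own slice.
    partial = [evolve(s) for s in slices]
    # Root gathers the contributions back (the MPI_Gather phase).
    return [x for part in partial for x in part]

print(step([0, 1, 2, 3, 4, 5, 6, 7], 4))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Note how the root touches the full array twice per step; this per-step traffic is exactly the "large amount of data communication" the slide warns about.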

  29. Collaterals to Domain Decomposition /2

  30. The Transport Code - Parallel Version. call evolve( d]act ) [Diagram: P0, P1, P2, P3 each own a sub-domain]

  31. Data Exchange Among Processes. [Diagram: P0, P1, P2, P3 exchange data with their neighbours]

  32. proc_down = mod(proc_me - 1 + nprocs, nprocs)
proc_up = mod(proc_me + 1, nprocs)
[Diagram: P0, P1, P2, P3 arranged in a ring]
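The same neighbour arithmetic in Python (a sketch mirroring the Fortran `mod()` expressions above; the function name is mine):

```python
def ring_neighbours(proc_me, nprocs):
    # Periodic (ring) topology: the +nprocs term keeps the "down"
    # neighbour non-negative when proc_me is 0.
    proc_down = (proc_me - 1 + nprocs) % nprocs
    proc_up = (proc_me + 1) % nprocs
    return proc_down, proc_up

print(ring_neighbours(0, 4))  # (3, 1): rank 0 wraps around to rank 3
print(ring_neighbours(3, 4))  # (2, 0): rank 3 wraps around to rank 0
```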

  33. Sendrecv
