WestGrid – Compute Canada - Online Workshop 2017
Introduction to Parallel Programming using OpenMP
Shared Memory Parallel Programming, Part I
Dr. Ali Kerrache, WestGrid, Univ. of Manitoba, Winnipeg
E-mail: ali.kerrache@umanitoba.ca
Part I: Fundamental Basics of Parallel Programming using OpenMP. Tuesday, January 31, 2017, 12:00 to 14:00 CST
Part II: Intermediate and Some Advanced Parallel Programming using OpenMP. Tuesday, February 21, 2017, 12:00 to 14:00 CST
Part III: Introduction to Molecular Dynamics Simulations. Tuesday, March 14, 2017, 12:00 to 14:00 CST
What do you need?
Basic knowledge of:
- C / C++ and/or Fortran
- Compilers: GNU, Intel, …
- Compile, debug & run a program.
Access to Grex:
- Compute Canada account.
- WestGrid account.
Utilities:
- Text editor: vim, nano, …
- ssh client: PuTTY, MobaXterm, …
Slides & examples (available): https://www.westgrid.ca/events/intro_openmp_part_1_0
How to participate in this workshop?
Login to Grex:
    $ ssh your_user_name@grex.westgrid.ca
    [your_user_name@tatanka ~] $
    [your_user_name@bison ~] $
Copy the examples to your current working directory (the final "." stands for the current directory):
    $ cp -r /global/scratch/workshop/openmp-wg-2017 .
    $ cd openmp-wg-2017 && ls
Reserve a compute node and export the number of threads (bash):
    $ sh reserve_omp_node.sh
    $ export OMP_NUM_THREADS=4
Introduction to Parallel Computing Using OpenMP
Outline:
- Introduction
- Parallelism and Concurrency.
- Types of Parallel Machines.
- Models of Parallel Programming.
- Definition and construction of OpenMP.
- OpenMP syntax and directives.
- Simple OpenMP program (Hello World).
- Loops in OpenMP: work sharing.
- False sharing and race condition.
- critical and atomic constructs.
- reduction construct.
- Conclusions.
Introduction to Parallel Computing Using OpenMP
Objectives:
- Introduce simple ways to parallelize programs.
- From a serial to a parallel program using OpenMP.
- OpenMP directives (C/C++ and Fortran): compiler directives, runtime library, environment variables.
- OpenMP by examples:
  - Compile & run an OpenMP program.
  - Create threads & split the work over the available threads.
  - Work sharing: loops and sections in OpenMP.
  - Some OpenMP constructs.
- Write and optimize an OpenMP program.
Introduction to Parallel Computing Using OpenMP
Serial Programming:
- Develop a program.
- Performance & optimization?
But in the real world:
- Run multiple programs.
- Large & complex problems.
- Time consuming.
Solution:
- Use parallel machines.
- Use multi-core machines.
Example (figure: execution time on 1 core vs. 4 cores): after parallelization the work executes in parallel. With 4 cores, the execution time is reduced by a factor of up to 4 (the ideal case).
What is Parallel Programming? The ability to obtain the same amount of computation with multiple cores at low frequency (fast).
Why Parallel?
- Reduce the execution time.
- Run multiple programs.
Parallelism & Concurrency
Concurrency: condition of a system in which multiple tasks are logically active at the same time, but they may not necessarily run in parallel.
Parallelism (a subset of concurrency): condition of a system in which multiple tasks are active at the same time and run in parallel.
What do we mean by parallel machines?
Types of Parallel Machines
Distributed Memory Machines (figure: CPU 0 to CPU 3, each attached to its own memory MEM 0 to MEM 3):
- Each processor has its own memory.
- The variables are independent.
- Communication by passing messages (network).
Shared Memory Machines (figure: CPU 0 to CPU 3 attached to one SHARED MEMORY):
- All processors share the same memory.
- The variables can be shared or private.
- Communication via shared memory.
What are the different types of shared memory machines?
Shared Memory Machines
SMP: Symmetric Multi-Processor. Shared address space with equal-time access for each processor.
NUMA: Non-Uniform Memory Access multi-processor. Different regions of memory have different access costs.
[Figure: threads 0 to 3, each with its own private variables, all accessing a common set of shared variables.]
What kind of parallel programming?
Parallel Programming Models
Shared Memory Machines (figure: CPU 0 to CPU 3 attached to one SHARED MEMORY):
- Multi-Threading, Multi-Core Computers.
- Used in shared memory machines.
- Communication via shared memory.
- Portable, easy to program and use.
- Not very scalable.
- OpenMP based.
Distributed Memory Machines (figure: CPU 0 to CPU 3, each with its own memory MEM 0 to MEM 3):
- Multi-Processing, Parallel Computers.
- Used in distributed memory machines.
- Communication by message passing.
- Difficult to program.
- Scalable.
- MPI based.
Hybrid: MPI + OpenMP
Definition of OpenMP
OpenMP is an API used to divide the computational work in a program and to add parallelism to a serial program (create threads) in order to speed up the execution.
Supported by many compilers: Intel (ifort, icc), GNU (gcc, gfortran, …), …
Languages: C/C++, Fortran.
Compilers: http://www.openmp.org/resources/openmp-compilers/
OpenMP has three components:
- Compiler Directives: directives added to a serial program. Interpreted at compile time.
- Runtime Library: routines executed at run time.
- Environment Variables: settings introduced after compile time to control & execute the OpenMP program.
Construction of OpenMP program
[Diagram: layers of an OpenMP program, from top to bottom]
- Application / Serial program / End user
- OpenMP: Compiler Directives, Runtime Library, Environment Variables
- Compilation / Runtime Library / Operating System
- Thread creation & parallel execution: Thread 0, Thread 1, Thread 2, Thread 3, Thread 4, …
What is the OpenMP programming model?
OpenMP: Fork – Join parallelism model
Start from the serial program: define the regions to parallelize, then add OpenMP directives.
[Figure: serial regions alternating with parallel regions. Each parallel region starts with a FORK and ends with a JOIN; one parallel region contains a nested parallel region.]
- Serial region: master thread.
- Parallel region: all threads.
The master thread spawns a team of threads as needed.
Parallelism is added incrementally: that is, the sequential program evolves into a parallel program.
OpenMP has simple syntax
Most of the constructs in OpenMP are compiler directives or pragmas.
For C and C++, the pragmas take the form:
    #pragma omp construct [clause [clause]…]
Example:
    #include <omp.h>
    #pragma omp parallel
    {
        Block of C/C++ code
    }
For Fortran, the directives take one of the forms:
    !$OMP construct [clause [clause]…]
    C$OMP construct [clause [clause]…]
    *$OMP construct [clause [clause]…]
Example:
    use omp_lib
    !$omp parallel
        Block of Fortran code
    !$omp end parallel
Header files and modules:
- For C/C++, include the header file: #include <omp.h>
- For Fortran 90, use the module: use omp_lib
- For F77, include the header file: include 'omp_lib.h'
Parallel regions & Structured blocks
Most OpenMP constructs apply to structured blocks.
Structured block: a block with one point of entry at the top and one point of exit at the bottom. The only "branches" allowed are STOP statements in Fortran and exit() in C/C++.
Structured block:
    #pragma omp parallel
    {
        int id = omp_get_thread_num();
    more: res[id] = do_big_job(id);
        if (conv(res[id])) goto more;
    }
    printf("All done\n");
Non structured block (jumps into and out of the parallel region):
    if (go_now()) goto more;
    #pragma omp parallel
    {
        int id = omp_get_thread_num();
    more: res[id] = do_big_job(id);
        if (conv(res[id])) goto done;
        goto more;
    }
    done: if (!really_done()) goto more;
Compile & Run an OpenMP Program
To compile and enable OpenMP:
- GNU: add -fopenmp to the C/C++ & Fortran compilers.
- Intel compilers: add -openmp (-fopenmp is also accepted; newer versions use -qopenmp).
- PGI Linux compilers: add -mp
- Intel compilers on Windows: add /Qopenmp
Environment variables: OMP_NUM_THREADS
If OMP_NUM_THREADS is not set, OpenMP typically spawns one thread per hardware thread.
    $ export OMP_NUM_THREADS=value    (bash shell)
    $ setenv OMP_NUM_THREADS value    (tcsh shell)
value: number of threads [for example 4]
To run:
    $ ./name_your_exec_program or ./a.out
From serial to an OpenMP program
Simple serial program in C/C++ and Fortran
C/C++ program (file: Example_00/hello_c_seq.c):
    #include <stdio.h>
    int main() {
        printf("Hello World\n");
    }
Compile the code:
    $ icc hello_c_seq.c
    $ gcc hello_c_seq.c
Run the code:
    $ ./a.out
Fortran program (file: Example_00/hello_f90_seq.f90):
    program Hello
      implicit none
      write(*,*) "Hello World"
    end program Hello
Compile the code:
    $ ifort hello_f90_seq.f90
    $ gfortran hello_f90_seq.f90
Run the code:
    $ ./a.out
Simple OpenMP Program
C/C++ (file: Example_00/helloworld_c_omp.c):
    /* header */
    #include <omp.h>
    #include <stdio.h>
    int main() {
        /* compiler directive */
        #pragma omp parallel
        {
            printf("Hello World\n");
        }
    }
Fortran (file: Example_00/helloworld_f90_omp.f90):
    program Hello
      ! module
      use omp_lib
      implicit none
      ! compiler directives
      !$omp parallel
      write(*,*) "Hello World"
      !$omp end parallel
    end program Hello
C and C++ use exactly the same constructs. Slight differences between C/C++ and Fortran.
Runtime Library:
- Thread rank: omp_get_thread_num()
- Number of threads: omp_get_num_threads()
- Set number of threads: omp_set_num_threads()
- Compute time: omp_get_wtime()