Parallel programming 03
Walter Boscheri (walter.boscheri@unife.it)
University of Ferrara - Department of Mathematics and Computer Science
A.Y. 2018/2019 - Semester I
Outline
1. Introduction to OpenMP
2. OpenMP directives
3. OpenMP synchronization
4. OpenMP syntax and main commands
5. OpenMP optimization
6. OpenMP SIMD
7. Exercise
1. Introduction to OpenMP

OpenMP overview
- OpenMP is a standard programming model for shared memory parallel programming
- portable across shared memory architectures
- FORTRAN binding
- OpenMP is a standard
- OpenMP is the easiest approach to multi-threaded programming
Where to use OpenMP
[Figure: schematic regions in the (problem size, # CPUs) plane: small problems are best run scalar; OpenMP pays off for moderate numbers of CPUs; MPI is suited to large numbers of CPUs; using too many CPUs for a given problem size leaves the run dominated by overhead.]
OpenMP programming model
- OpenMP is a shared memory model
- the workload is distributed among threads
- variables can be either shared among all threads or duplicated for each thread
- threads communicate by sharing variables
- unintended sharing of data can lead to race conditions: the program's outcome changes as the threads are scheduled differently
- use synchronization to protect against data conflicts and to control race conditions
OpenMP execution model
- execution begins as a single process (master thread)
- at the start of a parallel construct, the master thread creates a team of threads
- at the completion of a parallel construct, the threads in the team synchronize (implicit barrier)
- only the master thread continues execution
Block of code to be executed by multiple threads in parallel:
    !$OMP PARALLEL
    ...
    !$OMP END PARALLEL
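As a minimal sketch of this fork-join behaviour (a hypothetical demo program, not from the slides; OMP_GET_THREAD_NUM and OMP_GET_NUM_THREADS are standard library routines):

    ! Hypothetical demo: fork-join execution model
    PROGRAM hello_omp
      USE omp_lib
      IMPLICIT NONE
      PRINT *, 'Before the parallel construct: master thread only'
      !$OMP PARALLEL
      ! every thread of the team executes this block
      PRINT *, 'Hello from thread ', OMP_GET_THREAD_NUM(), &
               ' of ', OMP_GET_NUM_THREADS()
      !$OMP END PARALLEL   ! implicit barrier: the team synchronizes here
      PRINT *, 'After the parallel construct: master thread only'
    END PROGRAM hello_omp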
2. OpenMP directives

OpenMP directive format (FORTRAN)
- include file for the library routines:
    USE omp_lib
  or
    INCLUDE 'omp_lib.h'
- OpenMP sentinel: !$OMP
- conditional compilation: !$
- integration in Visual Studio 2019 [figure: project settings dialog enabling OpenMP support]
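A short sketch of conditional compilation (hypothetical program): lines prefixed with the !$ sentinel are compiled only when OpenMP is enabled, so the same source also builds as a serial program:

    PROGRAM cond_comp
      !$ USE omp_lib                        ! library interface, only under OpenMP
      IMPLICIT NONE
      INTEGER :: nthreads
      nthreads = 1                          ! serial default
      !$ nthreads = OMP_GET_MAX_THREADS()   ! executed only when compiled with OpenMP
      PRINT *, 'Running with up to ', nthreads, ' thread(s)'
    END PROGRAM cond_comp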
OpenMP runtime library routines
- get the number of processors NPRCS which are available:
    NPRCS = OMP_GET_NUM_PROCS()
- set the number of threads NCPU that have to be used:
    CALL OMP_SET_NUM_THREADS(NCPU)
- wall clock timers which provide the elapsed time:
    t_start = OMP_GET_WTIME()
    ! work to be measured
    t_end = OMP_GET_WTIME()
    PRINT *, 'Work took ', t_end - t_start, ' seconds'
- the function OMP_GET_WTICK() returns the number of seconds between two successive clock ticks
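Putting the timer routines together, a minimal sketch (the loop body is only a placeholder workload, and the parallel loop directive is introduced on the following slides):

    PROGRAM timing_demo
      USE omp_lib
      IMPLICIT NONE
      INTEGER, PARAMETER :: n = 10000000
      INTEGER :: i
      REAL(8) :: t_start, t_end
      REAL(8), ALLOCATABLE :: a(:)
      ALLOCATE(a(n))
      t_start = OMP_GET_WTIME()
      !$OMP PARALLEL DO
      DO i = 1, n
         a(i) = SQRT(REAL(i,8))   ! placeholder work to be measured
      ENDDO
      !$OMP END PARALLEL DO
      t_end = OMP_GET_WTIME()
      PRINT *, a(n)
      PRINT *, 'Work took ', t_end - t_start, ' seconds'
      PRINT *, 'Timer resolution: ', OMP_GET_WTICK(), ' seconds'
    END PROGRAM timing_demo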
OpenMP variables
Private and shared variables:
- private(list): declares the variables in list to be private to each thread in a team
- shared(list): makes the variables that appear in list shared among all the threads in a team
N.B. if not specified, the default is shared. Exceptions:
- local variables in called sub-programs are private
- the loop control variable of a parallel DO loop is private
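A small sketch of these rules with hypothetical variables: tmp needs a private copy per thread, the arrays are shared, and the loop counter i is private by default:

    PROGRAM private_demo
      IMPLICIT NONE
      INTEGER, PARAMETER :: n = 100
      INTEGER :: i
      REAL(8) :: tmp, x(n), y(n)
      x = 1.0d0
      !$OMP PARALLEL DO PRIVATE(tmp) SHARED(x,y)
      DO i = 1, n                ! the loop control variable i is private by default
         tmp  = 2.0d0*x(i)       ! each thread writes its own copy of tmp
         y(i) = tmp + x(i)
      ENDDO
      !$OMP END PARALLEL DO
      PRINT *, y(1), y(n)
    END PROGRAM private_demo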
Worksharing and synchronization
Which thread executes which statement or operation, and when?
⇓
Worksharing constructs, master and synchronization constructs
This is how to organize the parallel work!
OpenMP worksharing constructs
- divide the execution of the enclosed code region among the members of the team
- must be enclosed dynamically within a parallel region
- they do not launch new threads
- no implied barrier on entry
OpenMP SECTIONS directive
    !$OMP PARALLEL
    !$OMP SECTIONS
      a = ...
      b = ...
    !$OMP SECTION
      c = ...
      d = ...
    !$OMP SECTION
      e = ...
      f = ...
    !$OMP SECTION
      g = ...
      h = ...
    !$OMP END SECTIONS
    !$OMP END PARALLEL
Each SECTION block is executed once by one thread of the team, so the four blocks can run concurrently.
OpenMP DO directive
    !$OMP PARALLEL
    a = 5
    !$OMP DO
    DO i = 1, 20
       c(i) = b(i) + a*i
    ENDDO
    !$OMP END DO
    !$OMP END PARALLEL
The statement a = 5 is executed by every thread of the team, while the loop iterations are distributed among the threads (e.g. with four threads: i=1,5; i=6,10; i=11,15; i=16,20).
OpenMP DO directive
Clauses for !$OMP DO:
- private(list): declares the variables in list to be private to each thread in a team
- shared(list): makes the variables that appear in list shared among all the threads in a team
- collapse(n), with constant integer n: the iterations of the following n nested loops are collapsed into one larger iteration space
    !$OMP PARALLEL DO PRIVATE(j) COLLAPSE(2)
    DO i = 1, 4
       DO j = 1, 100
          a(i,j) = b(j) + 4
       ENDDO
    ENDDO
    !$OMP END PARALLEL DO
OpenMP DO directive
Clauses for !$OMP DO:
- reduction(operator: list): performs a reduction on the variables that appear in list with the operator operator, which can be one of: +, -, *, .AND., .OR., MAX, MIN. The variables must be shared. At the end of the reduction, the shared variable is updated to the result of combining its original value with the final value of each of the private copies, using the specified operator.
    sm = 0
    !$OMP PARALLEL DO PRIVATE(r) REDUCTION(+:sm)
    DO i = 1, 20
       r = work(i)
       sm = sm + r
    ENDDO
    !$OMP END PARALLEL DO
OpenMP DO directive
Clauses for !$OMP DO:
- nowait: there is an implicit barrier at the end of !$OMP DO unless nowait is specified; if it is, the threads do not synchronize at the end of the parallel loop
- schedule(type [,chunk]), where type can be:
  - static: iterations are divided into pieces of a size specified by chunk
  - dynamic: iterations are broken into pieces of a size specified by chunk; as each thread finishes a piece of the iteration space, it dynamically obtains the next set of iterations
  - guided: the chunk size is reduced in an exponentially decreasing manner with each dispatched piece of the iteration space; chunk specifies the smallest piece
  - auto: scheduling is delegated to the compiler and/or runtime system
A dynamic schedule pays off when the iterations have uneven cost, as in the sketch below.
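A minimal sketch, assuming an inner loop whose cost grows with i; with SCHEDULE(DYNAMIC,4) a thread that finishes early simply grabs the next chunk of four iterations instead of idling:

    PROGRAM schedule_demo
      IMPLICIT NONE
      INTEGER, PARAMETER :: n = 1000
      INTEGER :: i, j
      REAL(8) :: c(n)
      !$OMP PARALLEL DO SCHEDULE(DYNAMIC,4) PRIVATE(j)
      DO i = 1, n
         c(i) = 0.0d0
         DO j = 1, i*i            ! iteration cost grows strongly with i
            c(i) = c(i) + 1.0d0/REAL(j,8)
         ENDDO
      ENDDO
      !$OMP END PARALLEL DO
      PRINT *, c(n)
    END PROGRAM schedule_demo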
OpenMP WORKSHARE directive
The WORKSHARE directive allows parallelization of array expressions and FORALL statements.
    !$OMP WORKSHARE
    A = B
    FORALL(i=1:N, j=1:N, A(i,j).NE.0.0) B(i,j) = 1.0/A(i,j)
    !$OMP END WORKSHARE
- the work inside the block is divided into separate units of work
- each unit of work is executed only once
- the units of work are assigned to threads in any manner
- similar to PARALLEL DO without explicit loops
OpenMP TASK directive
When a thread encounters a TASK construct, a task is generated from the code of the associated structured block. TASK defines an explicit task.
    !$OMP TASK [clause]
    ! block of work
    !$OMP END TASK
Clauses:
- untied
- default(shared | none | private)
- private(list)
- shared(list)
OpenMP TASK directive
- the encountering thread may execute the task immediately, or may defer its execution
- the number of tasks is not limited by the number of threads!
- completion of a task can be guaranteed using task synchronization constructs:
    !$OMP TASKWAIT
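A classic sketch of recursive tasks combined with TASKWAIT (the Fibonacci example is an assumption, not taken from these slides); one thread creates the root task while the whole team works on the generated tasks:

    PROGRAM task_demo
      IMPLICIT NONE
      !$OMP PARALLEL
      !$OMP SINGLE               ! one thread creates the root task (see next slide)
      PRINT *, 'fib(20) = ', fib(20)
      !$OMP END SINGLE
      !$OMP END PARALLEL
    CONTAINS
      RECURSIVE INTEGER FUNCTION fib(n) RESULT(r)
        INTEGER, INTENT(IN) :: n
        INTEGER :: a, b
        IF (n < 2) THEN
           r = n
        ELSE
           !$OMP TASK SHARED(a)
           a = fib(n-1)          ! child task 1
           !$OMP END TASK
           !$OMP TASK SHARED(b)
           b = fib(n-2)          ! child task 2
           !$OMP END TASK
           !$OMP TASKWAIT        ! wait for both child tasks before combining
           r = a + b
        END IF
      END FUNCTION fib
    END PROGRAM task_demo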
OpenMP SINGLE directive
The block is executed by only one thread in the team, which need not be the master thread.
    !$OMP SINGLE
    ! block of work
    !$OMP END SINGLE
- implicit barrier at the end of the SINGLE construct (unless nowait is specified)
- to reduce the overhead, one can combine several parallel parts (DO, WORKSHARE, SECTIONS) and sequential parts (SINGLE) in one parallel region (!$OMP PARALLEL ... !$OMP END PARALLEL), as in the sketch below
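A sketch of such a combined parallel region with hypothetical arrays:

    PROGRAM single_demo
      IMPLICIT NONE
      INTEGER, PARAMETER :: n = 100
      INTEGER :: i
      REAL(8) :: a(n), b(n), c(n), d(n)
      b = 1.0d0; c = 2.0d0
      !$OMP PARALLEL
      !$OMP DO
      DO i = 1, n
         a(i) = b(i) + c(i)
      ENDDO
      !$OMP END DO
      !$OMP SINGLE
      PRINT *, 'Intermediate result: ', a(1)   ! executed by one thread only
      !$OMP END SINGLE                          ! implicit barrier here
      !$OMP DO
      DO i = 1, n
         d(i) = 2.0d0*a(i)
      ENDDO
      !$OMP END DO
      !$OMP END PARALLEL
      PRINT *, d(n)
    END PROGRAM single_demo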
3. OpenMP synchronization

OpenMP synchronization
- implicit barrier:
  - at the beginning and end of parallel constructs
  - at the end of all other constructs
  - implicit synchronization can be removed by means of the nowait clause
- explicit: CRITICAL directive
    !$OMP CRITICAL
    ! block of work
    !$OMP END CRITICAL
A thread waits at the beginning of a critical region until no other thread in the team is executing a critical region with the same name. All unnamed CRITICAL directives map to the same unspecified name.
OpenMP CRITICAL directive
    cnt = 0
    a = 5
    !$OMP PARALLEL
    !$OMP DO
    DO i = 1, 20
       IF (b(i).EQ.0) THEN
          !$OMP CRITICAL
          cnt = cnt + 1
          !$OMP END CRITICAL
       ENDIF
       c(i) = b(i) + a*i
    ENDDO
    !$OMP END DO
    !$OMP END PARALLEL
The iterations are split among the threads (e.g. i=1,5; i=6,10; i=11,15; i=16,20 with four threads); the update of the shared counter cnt is protected by the critical region, so only one thread at a time executes it.
Race conditions
    !$OMP PARALLEL SECTIONS
    A = A + B
    !$OMP SECTION
    B = A + C
    !$OMP SECTION
    C = B + A
    !$OMP END PARALLEL SECTIONS
- the result varies unpredictably depending on the specific order in which the sections are executed
- wrong answers are produced without warning