
Static Analysis of OpenMP Data Mapping for Target Offloading



  1. Static Analysis of OpenMP Data Mapping for Target Offloading. Prithayan Barua, Vivek Sarkar. Georgia Institute of Technology.

  2. Acknowledgements: Shirako Jun, Tsang Whitney, Paudel Jeeva, Chen Wang. OMPSan: Static Verification of OpenMP's Data Mapping Constructs. IWOMP 2019.

  3. Outline: (1) Introduction: OpenMP Target Offloading. (2) Our Solution: Basic Idea, Analysis, Interpret OpenMP Clauses. (3) Evaluation: Example Analysis, Experiment Results. (4) Conclusion.


  5. OpenMP Target Offloading: programming heterogeneous systems using the OpenMP programming model. The host can offload computations to target devices. Each target device has a corresponding data environment. The host can update the data between host and devices using data mapping clauses.

  6. OpenMP Target Offloading: using OpenMP for target offloading. Example 1, how to offload computations:

         #define N 10
         int A[N], sum=0;                              // L2
         #pragma omp target data map(tofrom:A[0:N])
         {                                             // L4
           #pragma omp target                          // L7
           for (int i=0; i<N; i++) {                   // L8
             A[i]=i;                                   // L9
           }
           #pragma omp target reduction(+:sum)         // L11
           for (int i=0; i<N; i++) {                   // L12
             sum += A[i];                              // L13
           }
         }                                             // L14

  7. OpenMP Target Offloading: semantics of target data map. Example 1, L2: the Example 1 listing, with the execution marker (▷) at L2 and no annotations yet.

  8. OpenMP Target Offloading: semantics of target. Example 1, L4: the Example 1 listing with two annotations. The opening brace of the target data region is marked "Copy 'A[0:N]' to device." and its closing brace is marked "Copy 'A[0:N]' from device to host."

  9. OpenMP Target Offloading: semantics of target. Example 1, L8:

         #define N 10
         int A[N], sum=0;                              // L2
         #pragma omp target data map(tofrom:A[0:N])
         {                                             // L4: Copy 'A[0:N]' to device.
           #pragma omp target                          // L7
           for (int i=0; i<N; i++) {                   // L8: Execute on device.
             A[i]=i;                                   // L9
           }                                           // Leave 'A[0:N]' on device.
           #pragma omp target reduction(+:sum)         // L11
           for (int i=0; i<N; i++) {                   // L12: Execute on device.
             sum += A[i];                              // L13
           }                                           // Leave 'A[0:N]' and 'sum' on device.
         }                                             // L14: Copy 'A[0:N]' from device to host.

  10. OpenMP Target Offloading: semantics of target. Example 1, L12: the same annotated listing as on slide 9, with the execution marker (▷) at L12. The second loop also executes on the device, and 'A[0:N]' and 'sum' are left on the device afterwards.

  11. OpenMP Target Offloading: semantics of target data map. Example 1, L14: the same annotated listing as on slide 9, with the execution marker (▷) at the closing brace of the target data region, where 'A[0:N]' is copied from device to host.

  12. OpenMP Target Offloading: Example 2, execute the L11 loop on the host:

          #define N 10
          int A[N], sum=0;                              // L2
          #pragma omp target data map(tofrom:A[0:N])
          {                                             // L4
            #pragma omp target                          // L7
            for (int i=0; i<N; i++) {                   // L8
              A[i]=i;                                   // L9
            }
            // #pragma omp target reduction(+:sum)      // L11
            for (int i=0; i<N; i++) {                   // L12
              sum += A[i];                              // L13
            }
          }                                             // L14

  13. OpenMP Target Offloading: Disaster!! Wrong output. Example 2, L12:

          #define N 10
          int A[N], sum=0;                              // L2
          #pragma omp target data map(tofrom:A[0:N])
          {                                             // L4: Allocate 'A[0:N]' on device.
            #pragma omp target                          // L7
            for (int i=0; i<N; i++) {                   // L8: Execute on device
              A[i]=i;                                   // L9
            }                                           // Leave 'A[0:N]' on device.
                                                        // L11
            for (int i=0; i<N; i++) {                   // L12: Execute on host
              sum += A[i];                              // L13: Access host copy of stale 'A'!
            }
          }                                             // L14: Copy 'A[0:N]' from device to host.
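
The stale read in Example 2 can be reproduced without any device. The following is a minimal sketch, not OpenMP code: two plain arrays stand in for the host and device copies of A, and the names `mapped_array` and `example2_host_sum` are hypothetical. It also assumes the host copy starts as zeros, whereas the slide leaves A uninitialized.

```c
#include <assert.h>
#include <string.h>

#define N 10

/* Hypothetical model: plain arrays stand in for the two copies of A. */
typedef struct {
    int host[N];
    int device[N];
} mapped_array;

/* Replays Example 2 and returns the sum computed by the host loop (L12-L13). */
int example2_host_sum(void)
{
    mapped_array A;
    memset(&A, 0, sizeof A);                  /* assumption: host copy starts as zeros */

    memcpy(A.device, A.host, sizeof A.host);  /* enter target data: tofrom copies host to device */

    for (int i = 0; i < N; i++)               /* L8-L9 run on the device: only the device copy changes */
        A.device[i] = i;

    int sum = 0;
    for (int i = 0; i < N; i++)               /* L12-L13 run on the host: reads the stale host copy */
        sum += A.host[i];

    memcpy(A.host, A.device, sizeof A.host);  /* leave target data: tofrom copies device to host */
    assert(A.host[N - 1] == N - 1);           /* the host copy is up to date only from here on */

    return sum;
}
```

In this model the host loop sums the untouched host copy, so the function returns 0 instead of 0 + 1 + ... + 9 = 45.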

  14. OpenMP Target Offloading: But why? Default solution: the OpenMP specifications.

  15. OpenMP Target Offloading: understanding the data map usage. Data map specification; our flowchart to explain the specification.
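
The flowchart itself did not survive in this transcript. As a rough stand-in, the sketch below models the reference-count rule that the OpenMP 4.5 map clause follows: data is copied to the device only when a variable is first mapped, and copied back only when its reference count drops to zero. This is a simplified sketch, not the authors' flowchart: it ignores the `always` modifier and the `release` and `delete` map types, and every name in it (`mapped_array`, `map_enter`, `map_exit`, `update_from`, `host_sum_with_nested_map`) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define N 10

typedef enum { MAP_ALLOC, MAP_TO, MAP_FROM, MAP_TOFROM } map_type;

/* Hypothetical model of one mapped array and its device copy. */
typedef struct {
    int host[N];
    int device[N];
    int refcount;                 /* 0 means "not present on the device" */
} mapped_array;

static bool copies_in(map_type t)  { return t == MAP_TO   || t == MAP_TOFROM; }
static bool copies_out(map_type t) { return t == MAP_FROM || t == MAP_TOFROM; }

/* Entry to a region with a map clause: copy in only if newly mapped. */
void map_enter(mapped_array *a, map_type t)
{
    if (a->refcount == 0 && copies_in(t))
        memcpy(a->device, a->host, sizeof a->host);
    a->refcount++;
}

/* Exit from that region: copy out only if this was the last reference. */
void map_exit(mapped_array *a, map_type t)
{
    assert(a->refcount > 0);
    a->refcount--;
    if (a->refcount == 0 && copies_out(t))
        memcpy(a->host, a->device, sizeof a->device);
}

/* Models "target update from": copies regardless of the reference count. */
void update_from(mapped_array *a)
{
    assert(a->refcount > 0);
    memcpy(a->host, a->device, sizeof a->device);
}

/* An outer target data map(tofrom:A) encloses a target map(from:A) that
   writes A on the device; the host then sums A inside the outer region. */
int host_sum_with_nested_map(bool use_update)
{
    mapped_array a = { .refcount = 0 };   /* remaining members are zeroed */

    map_enter(&a, MAP_TOFROM);            /* outer target data */
    map_enter(&a, MAP_FROM);              /* nested target: already present */
    for (int i = 0; i < N; i++)
        a.device[i] = i;
    map_exit(&a, MAP_FROM);               /* refcount 2 -> 1: no copy back */

    if (use_update)
        update_from(&a);                  /* forces the copy to the host */

    int sum = 0;
    for (int i = 0; i < N; i++)
        sum += a.host[i];

    map_exit(&a, MAP_TOFROM);             /* refcount 1 -> 0: copy back */
    return sum;
}
```

With `use_update` set to false the host sum is computed from stale data even though the nested region asked for `from`; with it set to true the sum is 45. The model covers only a single array and a single device.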

  16. OpenMP Target Offloading: one possible fix. Example 3:

          #define N 10
          int A[N], sum=0;                              // L2
          #pragma omp target data map(tofrom:A[0:N])
          {                                             // L4
            #pragma omp target map(from:A[0:N])         // L7
            for (int i=0; i<N; i++) {                   // L8
              A[i]=i;                                   // L9
            }
            // Force Copy 'A[0:N]' to host.
            #pragma omp target update from(A[0:N])      // L11
            for (int i=0; i<N; i++) {                   // L12
              sum += A[i];                              // L13
            }
          }                                             // L14

  17. OpenMP Target Offloading: Memory Optimization. Naive Jacobi iteration:

          while ( error > tol && iter < iter_max ) {
            error = 0.0;
            #pragma omp target map(tofrom:Anew) map(tofrom:A) map(tofrom:error)
            for ( int j = 1; j < n-1; j++)
              for ( int i = 1; i < m-1; i++ ) {
                Anew[j][i] = 0.25 * ( A[j][i+1] + A[j][i-1]
                                    + A[j-1][i] + A[j+1][i]);
                error = fmax( error, fabs(Anew[j][i] - A[j][i]));
              }
            #pragma omp target map(tofrom:Anew) map(tofrom:A)
            for ( int j = 1; j < n-1; j++)
              for ( int i = 1; i < m-1; i++ )
                A[j][i] = Anew[j][i];
            iter++;
          }

  18. OpenMP Target Offloading: Memory Optimization. Remove redundant memory copies:

          #pragma omp target data map(to:Anew) map(tofrom:A)
          while ( error > tol && iter < iter_max ) {
            error = 0.0;
            #pragma omp target map(tofrom:error)
            for ( int j = 1; j < n-1; j++)
              for ( int i = 1; i < m-1; i++ ) {
                Anew[j][i] = 0.25 * ( A[j][i+1] + A[j][i-1]
                                    + A[j-1][i] + A[j+1][i]);
                error = fmax( error, fabs(Anew[j][i] - A[j][i]));
              }
            #pragma omp target
            for ( int j = 1; j < n-1; j++)
              for ( int i = 1; i < m-1; i++ )
                A[j][i] = Anew[j][i];
            iter++;
          }
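
To make the saving concrete, the sketch below counts whole-array transfers for the two Jacobi versions. It is a back-of-the-envelope model under stated assumptions, not a measurement: it assumes A and Anew are n x m arrays of `double` (the slides do not show the declarations), it ignores the scalar `error`, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Naive version: each of the two target regions maps Anew and A as tofrom
   with no enclosing data region, so every iteration performs
   2 arrays x 2 directions x 2 regions = 8 whole-array copies. */
size_t naive_array_copies(size_t iterations)
{
    return 8 * iterations;
}

/* Optimized version: the enclosing target data region copies Anew and A to
   the device once and copies A back once. The inner target regions find
   both arrays already present, so they add no array copies. */
size_t optimized_array_copies(size_t iterations)
{
    (void)iterations;             /* the count does not depend on it */
    return 3;
}

/* Bytes moved for a given number of whole-array copies (assumes double). */
size_t bytes_moved(size_t copies, size_t n, size_t m)
{
    assert(n > 0 && m > 0);
    return copies * n * m * sizeof(double);
}
```

For 100 iterations this model gives 800 array copies for the naive version against 3 for the optimized one; `bytes_moved` turns either count into a transfer volume for a given grid size.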
