
Input Space Splitting for OpenCL



  1. Input Space Splitting for OpenCL Simon Moll, Johannes Doerfert, Sebastian Hack Saarbrücken Graduate School of Computer Science Saarland University Saarbrücken, Germany October 29, 2015

  2. OpenCL: Execution Model [Background; figure not recoverable]

  3. OpenCL: Parallelized & Vectorized [Background; figure not recoverable]

  4. Vectorization (SIMD): Perform the same operation for multiple vector lanes simultaneously.

  5. Vectorization (SIMD): Perform the same operation for multiple vector lanes simultaneously.

     Vector Patterns
       <i, i+1, i+2, i+3>   Consecutive: contiguous entries
       <i, i, i, i> → i     Uniform: single entry
       <i, j, 7, ->         Divergent: unrelated entries

     Two example loops:
         for (i = 0; i < 16; i++)       for (i = 0; i < 16; i += 2)
             O[i] = I[i] + 2;               O[i] = I[i] + 1;
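The three vector patterns can be made concrete with a small classifier. This is only an illustrative sketch: the names `shape_t` and `classify_lanes` are hypothetical and do not come from the slides, and a real vectorizer derives these shapes statically rather than by inspecting lane values at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the three patterns listed on the slide. */
typedef enum { SHAPE_UNIFORM, SHAPE_CONSECUTIVE, SHAPE_DIVERGENT } shape_t;

/* Classify the values one scalar variable takes across `lanes` vector lanes. */
shape_t classify_lanes(const int *v, size_t lanes) {
    assert(v != NULL && lanes > 0);
    int uniform = 1, consecutive = 1;
    for (size_t k = 1; k < lanes; ++k) {
        if (v[k] != v[0]) uniform = 0;                /* not <i, i, i, i>       */
        if (v[k] != v[0] + (int)k) consecutive = 0;   /* not <i, i+1, i+2, ...> */
    }
    if (uniform) return SHAPE_UNIFORM;
    if (consecutive) return SHAPE_CONSECUTIVE;
    return SHAPE_DIVERGENT;                           /* anything else          */
}
```

Under this three-way scheme, the indices of the `i++` loop come out as consecutive across lanes, while the stride-2 indices of the `i += 2` loop are neither uniform nor consecutive and fall into the divergent class.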

  6. Diverging Control Flow [CFG figure with blocks a, b, c, d, e, f]

       Thread   Trace
       1        a b c e f
       2        a b d e f
       3        a b c e b c e f
       4        a b c e b d e f

     Different threads execute different code paths.

  7. Diverging Control Flow [original and linearized CFG figures with blocks a to f]

       Thread   Trace
       1        a b c d e b c d e f
       2        a b c d e b c d e f
       3        a b c d e b c d e f
       4        a b c d e b c d e f

     Different threads execute different code paths.
     Execute everything, mask out results of inactive threads (using predication, blending).
     Control flow to data flow conversion on ASTs [Allen & Kennedy ’83]
     Whole-Function Vectorization on SSA CFGs [K & H ’11]
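"Execute everything, mask out results" can be illustrated with a scalar emulation of four lanes. The sketch below is an assumption-laden toy: the branch bodies (negate vs. double) and the names `branchy` and `linearized` are invented for illustration, and inputs are assumed small enough that the arithmetic does not overflow.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4   /* assumed vector width */

/* Reference: every lane follows its own path through the branch. */
void branchy(const int *in, int *out) {
    assert(in != NULL && out != NULL);
    for (size_t k = 0; k < LANES; ++k) {
        if (in[k] < 0) out[k] = -in[k];       /* "then" block */
        else           out[k] = in[k] * 2;    /* "else" block */
    }
}

/* Linearized: both blocks run for all lanes, a mask selects the result. */
void linearized(const int *in, int *out) {
    assert(in != NULL && out != NULL);
    int mask[LANES], then_val[LANES], else_val[LANES];
    for (size_t k = 0; k < LANES; ++k) mask[k] = in[k] < 0;       /* predicate  */
    for (size_t k = 0; k < LANES; ++k) then_val[k] = -in[k];      /* all lanes  */
    for (size_t k = 0; k < LANES; ++k) else_val[k] = in[k] * 2;   /* all lanes  */
    for (size_t k = 0; k < LANES; ++k)                            /* blend      */
        out[k] = mask[k] ? then_val[k] : else_val[k];
}
```

Both functions produce the same output, but the linearized one always pays for both blocks, which is the cost the following slides try to avoid.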

  8. Non-Divergent Control Flow [original and linearized CFG figures with blocks a to f]

     Idea: optimize cases where threads do not diverge

       Thread   Trace
       1        a b c e b d e f
       2        a b c e b d e f
       3        a b c e b d e f
       4        a b c e b d e f

  9. Non-Divergent Control Flow [original and linearized CFG figures with blocks a to f]

     Idea: optimize cases where threads do not diverge

       Thread   Trace
       1        a b c e b d e f
       2        a b c e b d e f
       3        a b c e b d e f
       4        a b c e b d e f

     Option 1: Insert dynamic predicate-tests & branches to skip paths
       - “Branch on superword condition code” (BOSCC) [Shin et al. PACT’07]
       - Additional overhead for dynamic test
       - Does not help against increased register pressure
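The dynamic test of Option 1 can be sketched in the same scalar-emulation style. This is a rough illustration of the idea, not the BOSCC implementation from the cited paper: the names `any_lane` and `boscc_abs_or_double` and the branch bodies are hypothetical, and the return value exists only to make the skipped work observable.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4   /* assumed vector width */

/* Dynamic test: is at least one lane active under this mask? */
static int any_lane(const int *mask) {
    for (size_t k = 0; k < LANES; ++k)
        if (mask[k]) return 1;
    return 0;
}

/* Predicated execution with a guard around each block.
 * Returns how many of the two blocks were actually executed. */
int boscc_abs_or_double(const int *in, int *out) {
    assert(in != NULL && out != NULL);
    int mask[LANES], not_mask[LANES];
    int executed = 0;
    for (size_t k = 0; k < LANES; ++k) {
        mask[k] = in[k] < 0;
        not_mask[k] = !mask[k];
    }
    if (any_lane(mask)) {                      /* skip if no lane wants "then" */
        for (size_t k = 0; k < LANES; ++k)
            if (mask[k]) out[k] = -in[k];
        ++executed;
    }
    if (any_lane(not_mask)) {                  /* skip if no lane wants "else" */
        for (size_t k = 0; k < LANES; ++k)
            if (not_mask[k]) out[k] = in[k] * 2;
        ++executed;
    }
    return executed;
}
```

When all lanes agree, only one block runs, but the `any_lane` checks are paid on every execution, and both blocks still have to be present in the generated code.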

  10. Non-Divergent Control Flow [original and linearized CFG figures with blocks a to f]

      Idea: optimize cases where threads do not diverge

        Thread   Trace
        1        a b c e b d e f
        2        a b c e b d e f
        3        a b c e b d e f
        4        a b c e b d e f

      Option 2: Statically prove non-divergence of certain blocks
        - Non-divergent blocks can be excluded from linearization
        - Less executed code, less register pressure
        - More conservative than dynamic test

  11. Non-Divergent Control Flow [original and linearized CFG figures with blocks a to f]

      Idea: optimize cases where threads do not diverge

        Thread   Trace
        1        a b c e f
        2        a b c e f
        3        a b c e b d e f
        4        a b c e b d e f
        5        a b c e b d e f
        6        a b c e b d e f

      Option 3: Statically split non-divergence inputs
        - Code versions with improved divergence properties
        - Orthogonal to both other options ⟹ combination possible

  12-18. 2D Convolution [Motivation; figure over an x/y grid, shown in seven animation steps; not recoverable]

  19. 2D Convolution

          int left = x - 2;
          int right = x + 2;
          int top = y - 2;
          int bottom = y + 2;
          int sum = 0;
          for (int i = left; i <= right; ++i)
            for (int j = top; j <= bottom; ++j)
              sum += input[j][i] * mask[j - top][i - left];
          output[y][x] = sum;

  20. 2D Convolution

          auto left = x - 2;
          auto right = x + 2;
          int top = y - 2;
          int bottom = y + 2;
          int sum = 0;
          for (auto i = left; i <= right; ++i)
            for (int j = top; j <= bottom; ++j)
              sum += input[j][i] * mask[j - top][i - left];
          output[y][x] = sum;

  21-22. 2D Convolution [figure over an x/y grid, shown in two animation steps; not recoverable]

  23. 2D Convolution

          int left = MAX(0, x - 2);
          int right = MIN(width - 1, x + 2);
          int top = MAX(0, y - 2);
          int bottom = MIN(height - 1, y + 2);
          int sum = 0;
          for (int i = left; i <= right; ++i)
            for (int j = top; j <= bottom; ++j)
              sum += input[j][i] * mask[j - (y - 2)][i - (x - 2)];
          output[y][x] = sum;

  24. 2D Convolution

          auto left = MAX(0, x - 2);
          auto right = MIN(width - 1, x + 2);
          int top = MAX(0, y - 2);
          int bottom = MIN(height - 1, y + 2);
          int sum = 0;
          for (auto i = left; i <= right; ++i)
            for (int j = top; j <= bottom; ++j)
              sum += input[j][i] * mask[j - (y - 2)][i - (x - 2)];
          output[y][x] = sum;

  25-27. Input Space Splitting [figure built up in three steps: an x/y grid, a second x/y grid, and the labels "vector" and "scalar"; not recoverable]
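One plausible reading of the split, sketched for the boundary-handling convolution: pixels whose 5x5 window lies fully inside the image need no MIN/MAX clamping, so they can run a simpler code version with fixed loop bounds, while border pixels keep the general version. The slides only show the figure, so the exact split is an assumption here, and every name below (`conv_clamped`, `conv_interior`, `in_interior`, `convolve`, `RADIUS`, `KSIZE`) is hypothetical. Arrays are flattened to row-major for self-containment.

```c
#include <assert.h>

#define RADIUS 2                    /* the slides use x - 2 .. x + 2 */
#define KSIZE (2 * RADIUS + 1)      /* mask is KSIZE x KSIZE, row-major */
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* General version: clamped window bounds, valid for every pixel. */
int conv_clamped(const int *input, const int *mask,
                 int width, int height, int x, int y) {
    int left = MAX(0, x - RADIUS), right = MIN(width - 1, x + RADIUS);
    int top = MAX(0, y - RADIUS), bottom = MIN(height - 1, y + RADIUS);
    int sum = 0;
    for (int i = left; i <= right; ++i)
        for (int j = top; j <= bottom; ++j)
            sum += input[j * width + i]
                 * mask[(j - (y - RADIUS)) * KSIZE + (i - (x - RADIUS))];
    return sum;
}

/* Specialized version: no clamping, fixed trip counts.
 * Only valid for pixels where in_interior() holds. */
int conv_interior(const int *input, const int *mask, int width, int x, int y) {
    int sum = 0;
    for (int i = x - RADIUS; i <= x + RADIUS; ++i)
        for (int j = y - RADIUS; j <= y + RADIUS; ++j)
            sum += input[j * width + i]
                 * mask[(j - (y - RADIUS)) * KSIZE + (i - (x - RADIUS))];
    return sum;
}

/* Predicate that splits the (x, y) input space into interior and border. */
int in_interior(int width, int height, int x, int y) {
    return x - RADIUS >= 0 && x + RADIUS <= width - 1
        && y - RADIUS >= 0 && y + RADIUS <= height - 1;
}

/* Dispatch every pixel to the code version that matches its region. */
void convolve(const int *input, const int *mask, int *output,
              int width, int height) {
    assert(input && mask && output && width > 0 && height > 0);
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            output[y * width + x] = in_interior(width, height, x, y)
                ? conv_interior(input, mask, width, x, y)
                : conv_clamped(input, mask, width, height, x, y);
}
```

In the interior version the loop bounds no longer depend on clamping against the image size, so neighbouring values of `x` execute the same iterations. In this sketch the dispatch is a per-pixel branch for brevity; a splitting approach would instead launch each version over its own region.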

  28. The Polyhedral Model

          S: A[i][j] = /* ... */ ;
             if (j <= i)
          P:   A[i][j] += A[j][i];

  29. The Polyhedral Model

          for (int i = 0; i <= N; i++)
            for (int j = 0; j <= N; j++) {
          S:  A[i][j] = /* ... */ ;
              if (j <= i)
          P:    A[i][j] += A[j][i];
            }

  30. The Polyhedral Model [figure: iteration space with axes i and j, each from 0 to N]

          for (int i = 0; i <= N; i++)
            for (int j = 0; j <= N; j++) {
          S:  A[i][j] = /* ... */ ;
              if (j <= i)
          P:    A[i][j] += A[j][i];
            }

      $I_S = \{ (S, (i, j)) \mid 0 \le i \le N \land 0 \le j \le N \}$

  31. The Polyhedral Model [figure: iteration space with axes i and j, each from 0 to N]

          for (int i = 0; i <= N; i++)
            for (int j = 0; j <= N; j++) {
          S:  A[i][j] = /* ... */ ;
              if (j <= i)
          P:    A[i][j] += A[j][i];
            }

      $I_S = \{ (S, (i, j)) \mid 0 \le i \le N \land 0 \le j \le N \}$
      $I_P = \{ (P, (i, j)) \mid 0 \le i \le N \land 0 \le j \le i \}$
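The iteration domains can be checked by walking the loop nest and counting how often each statement would execute. This is a minimal sketch with a hypothetical helper `count_domains`; it enumerates points for a concrete N, whereas the polyhedral model describes the sets symbolically.

```c
#include <assert.h>

/* Walk the loop nest from the slide and count the executions of
 * statements S and P, i.e. the number of points in I_S and I_P. */
void count_domains(int N, long *count_S, long *count_P) {
    assert(N >= 0 && count_S && count_P);
    *count_S = 0;
    *count_P = 0;
    for (int i = 0; i <= N; i++)
        for (int j = 0; j <= N; j++) {
            ++*count_S;          /* (S, (i, j)) with 0 <= i <= N, 0 <= j <= N */
            if (j <= i)
                ++*count_P;      /* (P, (i, j)) with 0 <= i <= N, 0 <= j <= i */
        }
}
```

For N = 3 this yields 16 points for S (the full square) and 10 points for P (the triangle on and below the diagonal), matching (N+1)^2 and (N+1)(N+2)/2.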

  32. The Polyhedral Model [figure: iteration space with axes i and j, each from 0 to N]

          for (int i = 0; i <= N; i++)
            for (int j = 0; j <= N; j++) {
          S:  A[i][j] = /* ... */ ;
              if (j <= i)
          P:    A[i][j] += A[j][i];
            }

      $F_S = \{ (S, (i, j)) \to (i, j) \}$
