
RFC: A new divergence analysis for LLVM



  1. RFC: A new divergence analysis for LLVM
     Simon Moll, Thorsten Klößner and Sebastian Hack
     http://compilers.cs.uni-saarland.de
     Compiler Design Lab, Saarland University, Saarland Informatics Campus

  2. Recap: VPlan+RV
     • VPlan: new vectorization infrastructure for LLVM.
       → Under development.
     • RV: The Region Vectorizer, github.com/uni-saarland/rv
       → Vectorizer for outer loops and whole functions.
       → Available today!
     • VPlan+RV: Bring RV’s analyses and transformations to VPlan.
       → Today: Divergence Analysis
       → Coming up: Partial Control-Flow Linearization (PLDI ’18).

  5. DivergenceAnalysis

         for (int i = 0; i < n; ++i) {
           for (int j = 0; j < m; ++j) {   // vectorized
             uni_var = f(i);
             varying_var = foo(i) + bar(j);
           }
         }

     [Figure: example lane values, 2 6 -1 1 (varying) and 7 7 7 7 (uniform)]

     • Integrated with LoopVectorizer (vplan-rv fork).
       → Not much to do: only single-block loops with LLVM’s LV.
       → Unit tests show what’s possible.
     • Won’t be required by VPlan before patch series #3.
       → Until then, let’s fix LLVM’s DivergenceAnalysis for GPUs.
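
To make the uniform/varying distinction in the loop example concrete, here is a minimal sketch that simulates the lanes of one vector iteration of the inner `j` loop and checks whether all lanes agree. The vector width of four and the bodies of `f`, `foo` and `bar` are assumptions for illustration; the slides do not define them. This only observes lane values at run time, whereas a divergence analysis derives the same classification statically.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

constexpr int kWidth = 4;  // assumed vector width, not stated on the slide

// Hypothetical stand-ins for f, foo and bar from the slide.
int f(int i) { return 7 * i; }
int foo(int i) { return i + 1; }
int bar(int j) { return j * j; }

using Lanes = std::array<int, kWidth>;

// A value is uniform if every lane holds the same value.
bool isUniform(const Lanes& lanes) {
  return std::all_of(lanes.begin(), lanes.end(),
                     [&](int v) { return v == lanes[0]; });
}

// Lane values of uni_var for the vector iteration covering j0 .. j0+kWidth-1.
// f(i) does not depend on j, so the start index is unused.
Lanes uniVarLanes(int i, int /*j0*/) {
  Lanes lanes{};
  for (int l = 0; l < kWidth; ++l) lanes[l] = f(i);
  return lanes;
}

// Lane values of varying_var: lane l evaluates bar at j0 + l.
Lanes varyingVarLanes(int i, int j0) {
  Lanes lanes{};
  for (int l = 0; l < kWidth; ++l) lanes[l] = foo(i) + bar(j0 + l);
  return lanes;
}

// Checks the classification used in the slide's variable names.
void checkSlideExample() {
  assert(isUniform(uniVarLanes(2, 0)));       // same f(i) in every lane
  assert(!isUniform(varyingVarLanes(2, 0)));  // bar(j) differs per lane
}
```

Because `uni_var` depends only on the outer index `i`, a vectorizer can keep it in a scalar register, while `varying_var` needs a vector register.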

  12. LLVM’s DivergenceAnalysis (NVPTX/AMDGPU)

      [Figure: CFG with a divergent branch, blocks A and B, and φ nodes labeled uniform, varying, and "?"]

      • LLVM’s DivergenceAnalysis invalid for unstructured CFGs.
      • Our analysis supports unstructured control.
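
The slide's point is that a φ node can become varying purely because of control flow: if threads split at a divergent branch and meet again at a block, the φ there selects different incoming values per thread. The following is a rough, deliberately conservative sketch of that idea on a toy CFG. It is not the algorithm proposed in the RFC or the one in LLVM. It flags every block reachable through at least two outgoing edges of the divergent branch, which over-approximates the true join points and ignores divergence caused by loop exits. The CFG encoding and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy CFG: cfg[b] lists the successor blocks of block b.
using CFG = std::vector<std::vector<int>>;

// All blocks reachable from `start`, including `start` itself.
std::set<int> reachableFrom(const CFG& cfg, int start) {
  std::set<int> seen;
  std::vector<int> work{start};
  while (!work.empty()) {
    int b = work.back();
    work.pop_back();
    if (!seen.insert(b).second) continue;  // already visited
    for (int s : cfg[b]) work.push_back(s);
  }
  return seen;
}

// Blocks whose phi nodes are conservatively treated as varying when the
// terminator of `branchBlock` is divergent: every block reachable through
// at least two outgoing edges of that branch.
std::set<int> blocksWithVaryingPhis(const CFG& cfg, int branchBlock) {
  assert(branchBlock >= 0 &&
         static_cast<std::size_t>(branchBlock) < cfg.size());
  std::vector<int> hits(cfg.size(), 0);
  for (int s : cfg[branchBlock])
    for (int b : reachableFrom(cfg, s)) ++hits[b];
  std::set<int> joins;
  for (std::size_t b = 0; b < cfg.size(); ++b)
    if (hits[b] >= 2) joins.insert(static_cast<int>(b));
  return joins;
}

// Unstructured example: block 0 branches to 1 (A) and 2 (B); A falls into
// block 3, while B can go to block 3 or skip it and jump to block 4.
CFG exampleCfg() {
  return CFG{{1, 2}, {3}, {3, 4}, {4}, {}};
}
```

With `exampleCfg()`, a divergent branch in block 0 flags blocks 3 and 4, since both can be reached through A and through B. A branch in block 1 has a single successor and therefore flags nothing.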

  19. DivergenceAnalysis
      • GPUDivergenceAnalysis: NVPTX/AMDGPU, StructurizeCFG (-use-rv-da)
      • LoopDivergenceAnalysis: LoopVectorizer (-vectorizer-use-da)
      Available at github.com/cdl-saarland/vplan-rv
