University of Copenhagen, Niels Bohr Institute
Faculty of Science

Automatic Code Generation for Library Method Inclusion in Domain Specific Languages
Communicating Process Architectures 2017, University of Malta
Mads Ohm Larsen
Niels Bohr Institute, University of Copenhagen, Denmark
21 August 2017

Why use libraries? Introduction

Somebody else has probably already written a faster method than you could write yourself. An example of such a method is the fast matrix multiplication provided by the BLAS library call *gemm. If possible, we always want to use this faster method.

cBLAS, Accelerate, clBLAS, LAPACK: Why can’t we?

So why not just use one of these specialized libraries all the time? There exist many different libraries, for many different purposes, architectures, and operating systems.

cBLAS, Accelerate, clBLAS, LAPACK: Use the best?

We have cBLAS, Accelerate, clBLAS, LAPACK, and many, many more. “Best” is hard to define. No single one of the above is “best”; each is “best” in its own way.

Coding BLAS: Code

cBLAS code:

    #include <cblas.h>

    ... // Set up m, n, k, A_data, B_data, and C_data

    // Calculates
    //   C := alpha * op(A) * op(B) + beta * C
    // where op(X) is either X or X^T
    cblas_sgemm(
        CblasRowMajor, // Memory layout
        CblasNoTrans,  // Transpose A?
        CblasNoTrans,  // Transpose B?
        m,             // Number of rows of op(A)
        n,             // Number of columns of op(B)
        k,             // Number of columns of op(A) and rows of op(B)
        1.0,           // Alpha argument
        A_data,        // Array of size m*k
        k,             // Leading dimension (row stride) of A
        B_data,        // Array of size k*n
        n,             // Leading dimension (row stride) of B
        0.0,           // Beta argument
        C_data,        // Array of size m*n
        n              // Leading dimension (row stride) of C
    );
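
To make the meaning of the arguments concrete, here is a minimal pure-Python sketch of what the call computes, C := alpha * op(A) * op(B) + beta * C, on flat row-major arrays. The function name gemm_row_major and its argument order are illustrative assumptions that mirror the cBLAS call; this is a slow reference, not how any BLAS library implements it.

```python
def gemm_row_major(trans_a, trans_b, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc):
    """Reference for C := alpha * op(A) * op(B) + beta * C on flat row-major lists.

    op(A) is m x k, op(B) is k x n, and c (m x n) is updated in place.
    lda, ldb, and ldc are the row strides of the stored arrays.
    """
    def at_a(i, p):
        # Element (i, p) of op(A); when transposed, read A[p][i] instead.
        return a[p * lda + i] if trans_a else a[i * lda + p]

    def at_b(p, j):
        # Element (p, j) of op(B); when transposed, read B[j][p] instead.
        return b[j * ldb + p] if trans_b else b[p * ldb + j]

    for i in range(m):
        for j in range(n):
            acc = sum(at_a(i, p) * at_b(p, j) for p in range(k))
            c[i * ldc + j] = alpha * acc + beta * c[i * ldc + j]
    return c


# A is 2x3 and B is 3x2, both stored as flat row-major lists.
A = [1.0, 2.0, 3.0,
     4.0, 5.0, 6.0]
B = [7.0, 8.0,
     9.0, 10.0,
     11.0, 12.0]
C = [0.0] * 4
gemm_row_major(False, False, 2, 2, 3, 1.0, A, 3, B, 2, 0.0, C, 2)
print(C)  # [58.0, 64.0, 139.0, 154.0]
```

With m = 2, n = 2, and k = 3, the strides are 3 for A and 2 for B and C, matching the k, n, n pattern in the cBLAS call.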

Coding BLAS: Code

Python code:

    import numpy as np
    ... # Set up a and b
    c = np.matmul(a, b)

Python/NumPy: NumPy

NumPy already uses BLAS for calls like matmul. Problem solved?

No. Python/NumPy is “slow” (single threaded) and cannot utilize GPGPUs or other accelerators out of the box.

Python/NumPy vs. Bohrium: Bohrium

Bohrium can use GPGPUs, but does not support BLAS. Let us make it support library methods such as BLAS.

Code-generation for Bohrium: Compile

When you compile/install Bohrium, CMake can look for libraries present on the system to link with. NumPy does the same when you compile or install it. If we find BLAS, we want to link with it. However, if we find clBLAS, we would also link with that.

Code-generation for Bohrium: Choose

With automatic code inclusion, we can choose which library we want to use, both at compile time and at run time! We want Bohrium to link with both BLAS libraries and choose the correct one later.
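
As a rough illustration of choosing a library at run time, the sketch below probes for candidate libraries and picks one by preference. The candidate list, the library names passed to the probe, and both function names are assumptions made for this example; it is not a description of how Bohrium selects a backend.

```python
import ctypes.util

# Hypothetical preference order, most preferred first.
# Each entry is (backend label, library name to probe for).
CANDIDATES = [
    ("clBLAS", "clBLAS"),
    ("cBLAS", "cblas"),
]


def available_backends(candidates=CANDIDATES, probe=ctypes.util.find_library):
    """Return the labels of backends whose shared library can be located."""
    return [label for label, lib in candidates if probe(lib) is not None]


def choose_backend(available, requested=None):
    """Use the requested backend if available, else the most preferred one, else None."""
    if requested is not None and requested in available:
        return requested
    return available[0] if available else None


# Stubbed probe, so the outcome does not depend on what this machine has installed.
def fake_probe(lib):
    return "/usr/lib/lib%s.so" % lib if lib == "cblas" else None


found = available_backends(probe=fake_probe)
print(choose_backend(found))            # cBLAS
print(choose_backend(found, "clBLAS"))  # cBLAS (requested backend missing, so fall back)
```

Passing the probe as a parameter keeps the selection logic testable; by default it uses ctypes.util.find_library from the standard library.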

Code-generation for Bohrium: Implementing

We could implement all the BLAS calls ourselves. Tedious. Let’s generate them instead!

JSON, template, generate!: JSON

All of the BLAS methods follow a similar pattern. Let’s use that to our advantage.

JSON, template, generate!: JSON

    {
      "methods": [
        {
          "name": "gemm",
          "types": [ "s", "d", "c", "z" ],
          "options": [
            "layout", "notransA", "notransB",
            "m", "n", "k",
            "A", "B", "C"
          ]
        },
        ...
      ]
    }
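
A description like the one above can drive a small generator. The sketch below reads only the gemm entry and expands a template once per type prefix. The template text and the wrap_ naming are hypothetical; the generated lines are placeholder C declarations, not the code Bohrium actually emits.

```python
import json
from string import Template

# Only the gemm entry from the slide; the elided entries are left out.
SPEC = json.loads("""
{
  "methods": [
    {
      "name": "gemm",
      "types": ["s", "d", "c", "z"],
      "options": ["layout", "notransA", "notransB",
                  "m", "n", "k", "A", "B", "C"]
    }
  ]
}
""")

# Hypothetical template: one placeholder declaration per generated method.
STUB = Template(
    "void wrap_${t}${name}(void); "
    "/* calls cblas_${t}${name}; options: ${options} */"
)


def generate(spec):
    """Expand the template for every (method, type prefix) pair in the spec."""
    lines = []
    for method in spec["methods"]:
        options = ", ".join(method["options"])
        for t in method["types"]:
            lines.append(STUB.substitute(t=t, name=method["name"], options=options))
    return lines


for line in generate(SPEC):
    print(line)
```

One JSON entry yields four stubs, for sgemm, dgemm, cgemm, and zgemm, which is the repetition the generator is meant to remove.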