Towards Interactive Visual Exploration of Massively Parallel Programs Using a Domain-Specific Language



  1. Towards Interactive Visual Exploration of Massively Parallel Programs Using a Domain-Specific Language. Tobias Klein

  2. Tool for Development and Analysis of OpenCL Kernels

  3. Why Visualize Parallel Programs? • Inspired by Algorithm Visualization [Mike Bostock]

  4. Why Visualize Parallel Programs? • General Understanding • Debugging • Performance Analysis • Rapid Prototyping

  5. Visual Encoding of Program Behavior

  6. Visual Encoding – Local Memory Accesses: var[id] = 1; var[id]; barrier(CLK_LOCAL_MEM_FENCE); ...

  7. Visual Encoding - Warp Divergence [Figure: the threads of a warp passing through nested if (condition) { instruction; } else { instruction; } branches. Each branch is labeled with its outcome, 1 (true) or 0 (false), and with the number of threads that take it.]
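The per-branch thread counts shown in this encoding could be gathered as in the following minimal C sketch. It is not the tool's implementation: the `BranchCount` type and both function names are hypothetical, and the warp is modeled as a plain array holding each thread's evaluated condition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: how many threads of one warp took each side of a branch. */
typedef struct {
    int taken;      /* threads whose condition evaluated to true (1)  */
    int not_taken;  /* threads whose condition evaluated to false (0) */
} BranchCount;

/* Count both outcomes of one branch over the threads of a single warp.
   cond[t] holds the condition value evaluated by thread t. */
BranchCount count_branch(const int *cond, size_t warp_size) {
    assert(cond != NULL);
    BranchCount c = {0, 0};
    for (size_t t = 0; t < warp_size; ++t) {
        if (cond[t]) {
            c.taken++;
        } else {
            c.not_taken++;
        }
    }
    return c;
}

/* A warp diverges at a branch when both sides are taken by at least one thread. */
int is_divergent(BranchCount c) {
    return c.taken > 0 && c.not_taken > 0;
}
```

Applying `count_branch` again to the threads on each side of a branch would give the counts for the nested branches in the figure.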

  8. Visual Encoding - Demo

  9. How Do We Create Program Visualizations

  10. Domain-Specific Language • Device DSL: executed in parallel; OpenCL C + annotations; just-in-time compiled • Host DSL: device configuration; domain objects (Data, Images, Visualizer); no compilation

  11. Device DSL - Annotations
      kernel reduction(global float *g_idata, global float *g_odata, int n) {
          . . .
          if (i < n) { sdata[tid] = g_idata[i]; } else { sdata[tid] = 0; }
          barrier(CLK_LOCAL_MEM_FENCE);
          @watch[start] sdata;
          for (int s = 1; s < get_local_size(0); s *= 2) {
              int index = 2 * s * tid;
              if (index < get_local_size(0)) {
                  sdata[index] += sdata[index + s];
              }
              @watch[end] sdata;
              . . .
          }
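To follow what the watched sdata array goes through, the loop of this kernel can be replayed on the host for a single work-group. The plain C sketch below is an illustration, not part of the DSL. The function name `reduce_workgroup` is hypothetical. It assumes a power-of-two work-group size and assumes the threads synchronize between steps, which the slide elides.

```c
#include <assert.h>
#include <stddef.h>

/* Replays the interleaved-addressing reduction for one work-group.
   sdata plays the role of the local-memory array; the inner loop stands in
   for the threads of the group. Within one step every thread writes a
   distinct multiple of 2*s and reads an odd multiple of s, so running the
   threads one after another gives the same result as running them in
   parallel with a barrier after each step (the barrier is an assumption). */
float reduce_workgroup(float *sdata, size_t local_size) {
    assert(sdata != NULL && local_size > 0);
    /* Assumption: power-of-two size, so index + s stays inside the array. */
    assert((local_size & (local_size - 1)) == 0);
    for (size_t s = 1; s < local_size; s *= 2) {
        for (size_t tid = 0; tid < local_size; ++tid) {
            size_t index = 2 * s * tid;
            if (index < local_size) {
                sdata[index] += sdata[index + s];
            }
        }
    }
    return sdata[0]; /* the work-group's sum ends up in element 0 */
}
```

For eight elements the active threads halve with each step (4, then 2, then 1), which is the access pattern a watch on sdata would show.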

  12. Host DSL
      Common language concepts:
          int local_size = 64;
          int global_size = 4096;
      GPU data structures:
          float[512] in;
          float[512] out;
      Domain objects:
          Arrays.fillRandom(in);
      Device configurations:
          Device.setLocalWorkSize(local_size, 1);
          Device.setGlobalWorkSize(global_size, 1);
          Device.measureExecutionTime(true);
          Device.setInstrumentation(true);
      Device kernel call:
          Device.reduction(in, out, 512);
      Visualize data:
          Visualizer.plot(out);

  13. Use Case – Image Processing: Bilateral Filter • Edge Preserving and Smoothing • Gaussian Weight • Range Weight (Color). Analysis of Intermediate Values • Represented as Images • Represented in a Plot

  14. Use Case – Image Processing
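For reference, the two weights named on this slide combine in the standard formulation of the bilateral filter as shown below. The symbols are the conventional ones and are not taken from the slides: I is the image, p the output pixel, q a pixel in the neighborhood S, and G a Gaussian with the given standard deviation.

```latex
\[
  BF[I]_p = \frac{1}{W_p} \sum_{q \in S}
            G_{\sigma_s}\bigl(\lVert p - q \rVert\bigr)\,
            G_{\sigma_r}\bigl(\lvert I_p - I_q \rvert\bigr)\, I_q ,
  \qquad
  W_p = \sum_{q \in S}
        G_{\sigma_s}\bigl(\lVert p - q \rVert\bigr)\,
        G_{\sigma_r}\bigl(\lvert I_p - I_q \rvert\bigr)
\]
```

The spatial term G_{\sigma_s} is the Gaussian weight and G_{\sigma_r} is the range weight on the color difference. A neighbor across an edge gets a small range weight, which is why the filter smooths without blurring edges. Both factors are the kind of intermediate value the slide proposes to inspect as an image or in a plot.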

  15. Summary • Visualizations can help you understand the execution behavior of your GPU program • They provide a simple way to reveal possible issues in your implementation • DSLs can capture common concepts of a domain without reducing flexibility

  16. Thank You. Contact: tobias.da.klein@gmail.com, peter.rautek@kaust.edu.sa
