Introduction to GPU Computing, Ilya Kuzovkin, 13 May 2014, Tartu

  1. Introduction to GPU Computing
     Ilya Kuzovkin, 13 May 2014, Tartu

  2. Part I: “Teapot”

  4. Simple OpenGL Program
     The idea of computing on the GPU emerged because GPUs became very good at parallel computations.
     Let us start by observing an example of parallelism in a simple OpenGL application.

  5. Simple OpenGL Program
     You will need Code::Blocks (Windows, Linux) or Xcode (Mac) to run this example.
     • Install Code::Blocks bundled with the MinGW compiler from http://www.codeblocks.org/downloads/26
     • Download the codebase from https://github.com/kuz/Introduction-to-GPU-Computing
     • Open the project from code/Cube
     • Compile and run it

  7. Shader Program
     A program that is executed on the GPU. It has to be written in a shading language. In OpenGL this language is GLSL, which is based on C.
     OpenGL has 5 main shader stages:
     • Vertex Shader
     • Tessellation (Control and Evaluation) Shaders
     • Geometry Shader
     • Fragment Shader
     • Compute Shader (since 4.3)
     http://www.opengl.org/wiki/Shader

  11. Lighting
      Is it a cube or not? We will find out as soon as we add lighting to the scene.
      Exercise: code that equation into the fragment shader of the Cube program.
      https://github.com/konstantint/ComputerGraphics2013/blob/master/Lectures/07%20-%20Color%20and%20Lighting/slides07_colorandlighting.pdf

  12. Lighting
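
The lighting equation itself was shown as an image and is not part of this transcript. As a rough guide for the exercise, here is a minimal C++ sketch of the standard Phong model (ambient + diffuse + specular) evaluated for one fragment. It is an assumption that the slide uses this exact form. The names `Vec3`, `dot3`, `unit`, `mirror` and `phongIntensity` are hypothetical helpers, not taken from the Cube project, and in the real exercise the same arithmetic would be written in GLSL.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<float, 3>;

float dot3(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Scale a vector to unit length.
Vec3 unit(const Vec3& v) {
    float len = std::sqrt(dot3(v, v));
    assert(len > 0.0f);  // a zero vector has no direction
    return {v[0] / len, v[1] / len, v[2] / len};
}

// Mirror the incident direction i about the unit normal n
// (same convention as the GLSL built-in reflect()).
Vec3 mirror(const Vec3& i, const Vec3& n) {
    float d = 2.0f * dot3(n, i);
    return {i[0] - d * n[0], i[1] - d * n[1], i[2] - d * n[2]};
}

// Phong intensity for a single fragment.
// normal, toLight and toViewer all point away from the surface
// and do not have to be unit length.
float phongIntensity(const Vec3& normal, const Vec3& toLight, const Vec3& toViewer,
                     float ka, float kd, float ks, float shininess) {
    Vec3 n = unit(normal);
    Vec3 l = unit(toLight);
    Vec3 v = unit(toViewer);

    float diffuse = std::max(dot3(n, l), 0.0f);
    float specular = 0.0f;
    if (diffuse > 0.0f) {  // no highlight on surfaces facing away from the light
        Vec3 r = mirror({-l[0], -l[1], -l[2]}, n);
        specular = std::pow(std::max(dot3(r, v), 0.0f), shininess);
    }
    return ka + kd * diffuse + ks * specular;
}
```

When the light and the viewer both sit straight along the normal, the result is simply `ka + kd + ks`; when the light is behind the surface, only the ambient term `ka` remains. Each fragment is computed independently of the others, which is what lets the GPU run them in parallel.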

  17. Compare FPS
      • Run the program with lighting enabled and look at the FPS values
      • In the idle() function in cube.cpp, uncomment the dummy code, which simulates approximately the same amount of computation as the Phong lighting model requires
      • Note that these computations are performed on the CPU
      • Observe how the FPS has changed
      Parallel computations are fast on the GPU. Let's use it to compute something useful.
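
The dummy code in cube.cpp is not reproduced in this transcript. To give a feel for the comparison, the following C++ sketch times a sequential loop that does a small, lighting-like amount of arithmetic for every pixel of one frame. The function names and the arithmetic inside `fakeFragmentWork` are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>

// Hypothetical stand-in for per-fragment lighting work:
// a few multiplications, two trigonometric calls and one pow().
float fakeFragmentWork(int x, int y) {
    float d = std::max(0.0f, std::cos(0.001f * x) * std::sin(0.001f * y));
    return 0.1f + 0.6f * d + 0.3f * std::pow(d, 16.0f);
}

// Process every pixel of one frame one after another on the CPU.
// Returns the elapsed time in seconds and stores a checksum of the results.
double secondsPerFrameOnCpu(int width, int height, float* checksum) {
    assert(width > 0 && height > 0);
    assert(checksum != nullptr);

    auto start = std::chrono::steady_clock::now();
    float sum = 0.0f;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            sum += fakeFragmentWork(x, y);
        }
    }
    auto end = std::chrono::steady_clock::now();

    *checksum = sum;  // using the result keeps the compiler from discarding the loop
    return std::chrono::duration<double>(end - start).count();
}
```

Calling `secondsPerFrameOnCpu(800, 600, &sum)` and taking the reciprocal gives a rough upper bound on the frame rate the CPU could sustain if it had to do this work alone; the actual numbers depend entirely on the machine and compiler settings.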

  18. Part II: “Old School”

  22. OpenGL Pipeline + GLSL
      • Take the input data from the CPU memory and put it as an image into the GPU memory
      • In the fragment shader, perform a computation on each of the pixels of that image
      • Store the resulting image to the Render Buffer inside the GPU memory
      • Read the output from the GPU memory back to the CPU memory
      http://www.opengl.org/wiki/Framebuffer

  24. OpenGL Pipeline + GLSL
      • Create a texture in which the input data will be stored
      • Create a FrameBuffer Object (FBO) to “render” to
      http://www.opengl.org/wiki/Framebuffer

  30. OpenGL Pipeline + GLSL
      • Run the OpenGL pipeline
      • Render GL_QUADS of the same size as the texture matrix
      • Use the fragment shader to perform per-fragment computations using data from the texture
      • OpenGL will store the result in the texture given to the Render Buffer (within the Framebuffer Object)
      • Read the data from the Render Buffer
      • Can we use that to properly debug GLSL?
      http://www.opengl.org/wiki/Framebuffer
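
The data flow in the steps above can be mimicked in plain C++ without any OpenGL calls. The sketch below is only an analogy and not the code from the code/FBO project: `Image` stands in for a texture or render target, `renderFullQuad` plays the role of drawing a quad that covers the whole texture, and the "fragment shader" is an ordinary function called once per pixel. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a single-channel float texture or render target.
struct Image {
    int width;
    int height;
    std::vector<float> texels;  // row-major, width * height values
};

// "Upload": copy the input data into a texture-like image.
Image makeTexture(const std::vector<float>& data, int width, int height) {
    assert(width > 0 && height > 0);
    assert(data.size() == static_cast<std::size_t>(width) * height);
    return Image{width, height, data};
}

// "Render" a quad of the same size as the texture. The shader is invoked
// once per fragment, may read any texel of the input, and produces exactly
// one output pixel. On a GPU these invocations run in parallel.
template <typename FragmentShader>
Image renderFullQuad(const Image& input, FragmentShader shader) {
    Image target{input.width, input.height,
                 std::vector<float>(input.texels.size())};
    for (int y = 0; y < input.height; ++y) {
        for (int x = 0; x < input.width; ++x) {
            std::size_t index = static_cast<std::size_t>(y) * input.width + x;
            target.texels[index] = shader(input, x, y);
        }
    }
    return target;
}

// "Read back": copy the render target into host memory.
std::vector<float> readBack(const Image& target) {
    return target.texels;
}

// Example per-fragment computation: square the texel under the fragment.
float squareShader(const Image& tex, int x, int y) {
    float v = tex.texels[static_cast<std::size_t>(y) * tex.width + x];
    return v * v;
}
```

For a 2x2 input `{1, 2, 3, 4}`, `readBack(renderFullQuad(makeTexture(data, 2, 2), squareShader))` yields `{1, 4, 9, 16}`. Reading arbitrary values back like this is also what makes the last bullet plausible: a shader can write an intermediate value into the output and the host can inspect it afterwards.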

  31. Demo
      Run the project from code/FBO

  32. Part III: “Modern Times”

  35. Compute Shader (will not talk about)
      • Since OpenGL 4.3
      • Used to compute things not directly related to rendering
      http://web.engr.oregonstate.edu/~mjb/cs557/Handouts/compute.shader.1pp.pdf

  36. http://wiki.tiker.net/CudaVsOpenCL

  40. OpenCL and CUDA compared (http://wiki.tiker.net/CudaVsOpenCL)
      • OpenCL: supported by nVidia, AMD, Intel, Qualcomm (http://www.khronos.org/conformance/adopters/conformant-products#opencl)
      • CUDA: supported only by nVidia hardware (https://developer.nvidia.com/cuda-gpus)
      • CUDA: implementations only by nVidia
      • ~same performance levels
      • Developer-friendly

  41. Part III, Chapter 1

  42. Kernel

  46. Write and Read Data on GPU
      … run computations here …

  48. The Computation

  53. Demo
      Open, study and run the project from code/OpenCL

  54. Part III, Chapter 2

  55. CUDA Programming Model
      • The CPU is called the “host”
      • The GPU is called the “device”
      • Allocate memory on the GPU: cudaMalloc
      • Move data between CPU and GPU memory: cudaMemcpy
      • Launch kernels on the GPU
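
Real CUDA code needs an nVidia GPU and the CUDA toolkit, so the following is only a plain C++ analogy of the host/device workflow described above. `DeviceBuffer`, `deviceAlloc`, `copyHostToDevice`, `copyDeviceToHost`, `scaleKernel` and `launchScale` are hypothetical names that imitate the roles of `cudaMalloc`, `cudaMemcpy` and a kernel launch; none of them are CUDA functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for memory that lives on the GPU ("device").
struct DeviceBuffer {
    std::vector<float> storage;
};

// Plays the role of cudaMalloc: reserve space on the device.
DeviceBuffer deviceAlloc(std::size_t count) {
    return DeviceBuffer{std::vector<float>(count)};
}

// Plays the role of cudaMemcpy in the host-to-device direction.
void copyHostToDevice(DeviceBuffer& dst, const std::vector<float>& src) {
    assert(dst.storage.size() == src.size());
    std::copy(src.begin(), src.end(), dst.storage.begin());
}

// Plays the role of cudaMemcpy in the device-to-host direction.
void copyDeviceToHost(std::vector<float>& dst, const DeviceBuffer& src) {
    assert(dst.size() == src.storage.size());
    std::copy(src.storage.begin(), src.storage.end(), dst.begin());
}

// "Kernel": the work a single GPU thread would do for element i.
void scaleKernel(std::size_t i, float factor, DeviceBuffer& data) {
    if (i < data.storage.size()) {  // bounds guard, as is usual in real kernels
        data.storage[i] *= factor;
    }
}

// "Launch": a GPU would run all thread indices in parallel;
// this analogy simply runs them one after another.
void launchScale(std::size_t threads, float factor, DeviceBuffer& data) {
    for (std::size_t i = 0; i < threads; ++i) {
        scaleKernel(i, factor, data);
    }
}

// The full host-side sequence: allocate, upload, compute, download.
std::vector<float> scaleOnDevice(const std::vector<float>& host, float factor) {
    DeviceBuffer dev = deviceAlloc(host.size());
    copyHostToDevice(dev, host);
    launchScale(host.size(), factor, dev);
    std::vector<float> result(host.size());
    copyDeviceToHost(result, dev);
    return result;
}
```

`scaleOnDevice({1, 2, 3}, 2)` returns `{2, 4, 6}`. The point of the sketch is the order of operations on the host side; in actual CUDA the two memories are physically separate, and the copies are often the expensive part.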
