MASSIVE ACCELERATION THROUGH THE MANY-CORE PROCESSOR THAT YOU CALL A GRAPHICS CARD


  1. MASSIVE ACCELERATION THROUGH THE MANY-CORE PROCESSOR THAT YOU CALL A GRAPHICS CARD. Jesper Mosegaard, Head of Computer Graphics Lab, Alexandra Institute

  2. Plan: • Historical review • Cases • Future, and when is it for you?

  3. GTS - Advanced Technology Group: • The Alexandra Institute is one of Denmark’s nine GTS Institutes – Approved by the Danish Ministry of Science, Technology and Innovation – Independent and not-for-profit companies – The core of technological infrastructure in Denmark – Develop technological services based on the latest research – Sell state-of-the-art technological services to private enterprises and public authorities

  4. Research-based, user-driven innovation: Research, Consult

  5. What do we do? • Cutting-edge knowledge and competencies • Research strategy, and active in research • Software development • Teaching and training • Partner in research projects • Independent partner in choice of technology, method etc. • Idea-generating

  6. Computer Graphics Lab: • Jesper Mosegaard, head of research, PhD in Computer Science • Thomas Kim Kjeldsen, PhD in Physics • Peter Trier Mikkelsen, Master's in Computer Science • Lee Lassen, Master's in Computer Science • Karsten Noe, PhD in Computer Science • Nikolaj Andersen, 3D graphics artist • Jens Rimestad, Master's in Computer Science • Brian Christensen, PhD in Computer Science • Jesper Børlum, Master's in Civil Engineering

  7. An overview: Fast 3D Photorealistic Visualization Materials GPGPU Big Data Medical calculation

  8. Computer Graphics in many areas

  9. CG cooperation: CAVI (10/2/12)

  10. Historical Review

  11. Software rasterization: • Creative freedom • Comanche (1992), Outcast (1999)

  12. Hardware-accelerated graphics: • S3 ViRGE (1995)

  13. Fixed-function pipeline: Ridge Racer, Battlefield 1942, Quake 2

  14. The GPU: • GeForce 256, "The world's first GPU" (1999) – Integrated T&L – Texture/Environment Mapping

  15. (untitled slide)

  16. First programmable cards: • NV_vertex_program (GeForce 3), 2000 • NV_fragment_program (GeForce FX), 2001 • In 2002 – ARB_fragment_program – ARB_vertex_program

  17. Programmable vertices and fragments: Vertices → Rasterization → Fragments

  18. ARB Vertex program 1.0:
      !!ARBvp1.0
      TEMP R0, R1;
      # Dot product of program.local[32] with the normal, used to scale the colour
      DP3 R0, program.local[32], vertex.normal;
      MUL result.color.primary.xyz, R0, program.local[35];
      # Clamp from below, then offset the vertex along its normal
      MAX R0, program.local[64].x, R0;
      MUL R0, R0, vertex.normal;
      MUL R0, R0, program.local[64].z;
      ADD R1, vertex.position, -R0;
      # Transform to clip space: one modelview-projection row per component
      DP4 result.position.x, state.matrix.mvp.row[0], R1;
      DP4 result.position.y, state.matrix.mvp.row[1], R1;
      DP4 result.position.z, state.matrix.mvp.row[2], R1;
      DP4 result.position.w, state.matrix.mvp.row[3], R1;
      END

  19. NVIDIA Dawn demo: • GeForce FX, 2002

  20. High-level shader languages: • NVIDIA Cg, 2002 • Microsoft HLSL, 2002 • OpenGL GLSL, 2004

  21. GLSL example:
      #version 140
      uniform Transformation {
          mat4 projection_matrix;
          mat4 modelview_matrix;
      };
      in vec3 vertex;
      void main() {
          gl_Position = projection_matrix * modelview_matrix * vec4(vertex, 1.0);
      }

  22. OpenGL 4.x pipeline: From http://www.khronos.org/developers/library/overview/opengl_overview.pdf

  23. Examples of programmable graphics: • LEGO Digital Designer • Subsurface scattering • Molecular visualization

  24. LEGO Digital Designer 3 → 4

  25. YES... Playing with LEGO at work: • Taj Mahal: 5,922 pieces • Death Star: 3,803 pieces

  26. Without SSDO (3.0). June 23, 2009

  27. With SSDO (4.0)

  28. SSDO

  29. Light Propagation Volumes: • Crytek’s real-time global illumination • Kaplanyan, A. and Dachsbacher, C. Cascaded light propagation volumes for real-time indirect illumination. In Proceedings of the 2010 ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games.

  30. Real-time subsurface scattering: SSLPV: subsurface light propagation volumes. In Proceedings of the ACM SIGGRAPH Symposium on High Performance Graphics (HPG '11).

  31. Molecular visualization

  32. Multicore crisis

  33. Computing power of the GPU

  34. (untitled slide)

  35. CMLLab: • Physically-Based Visual Simulation on Graphics Hardware. Mark J. Harris, Greg Coombe, Thorsten Scheuermann, and Anselmo Lastra. Proc. 2002 SIGGRAPH / Eurographics Workshop on Graphics Hardware, 2002 • 25x speedup! • Ignoring early work on the Ikonas (1978), the Pixel Machine (1989) and Pixel-Planes 5 (1992)

  36. My adventure in GPGPU land: • ... a PhD on surgical simulators for procedures on children with malformed hearts

  37. Physics systems

  38. (untitled slide)

  39. Mapping to 2D render-target: • 3D grid → 2D texture – Flat 3D texture: the d slices s_1, s_2, …, s_d of a w × h × d grid laid out as w × h tiles in one 2D texture • Per-vertex texture coordinates for neighbors
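
The flat 3D texture layout can be sketched as a plain index mapping. This is a minimal CPU-side illustration, not the layout used in the talk: it assumes the d slices are tiled row-major with `tiles_per_row` tiles per row, and the function names are hypothetical.

```python
def grid_to_texel(x, y, z, w, h, tiles_per_row):
    """Map a 3D grid cell (x, y, z) to a 2D texel (u, v).

    Each z-slice is stored as a w x h tile; tiles are laid out
    row-major, tiles_per_row per row (an assumed layout).
    """
    tile_col = z % tiles_per_row
    tile_row = z // tiles_per_row
    return tile_col * w + x, tile_row * h + y


def texel_to_grid(u, v, w, h, tiles_per_row):
    """Inverse mapping: recover (x, y, z) from a texel (u, v)."""
    tile_col, x = divmod(u, w)
    tile_row, y = divmod(v, h)
    return x, y, tile_row * tiles_per_row + tile_col


def neighbor_texels(x, y, z, w, h, d, tiles_per_row):
    """Texel coordinates of the up-to-six axis neighbors that lie inside the grid."""
    offsets = [(-1, 0, 0), (1, 0, 0), (0, -1, 0), (0, 1, 0), (0, 0, -1), (0, 0, 1)]
    result = []
    for dx, dy, dz in offsets:
        nx, ny, nz = x + dx, y + dy, z + dz
        if 0 <= nx < w and 0 <= ny < h and 0 <= nz < d:
            result.append(grid_to_texel(nx, ny, nz, w, h, tiles_per_row))
    return result
```

Neighbors in x and y are adjacent texels, but neighbors in z live in a different tile, which is one plausible reason for supplying neighbor coordinates explicitly (here via `neighbor_texels`) rather than computing fixed offsets in the shader.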

  40. Approximation of arbitrary shapes: • That is, some fragments are not valid particles – Exclude calculations with a depth-test-based cull as well as fragment-based conditional kill

  41. I don’t like graphics: • Graphics API is about graphics • Memory model limited to textures • Limited shader capabilities • Lack of integer and bit operations • Limited communication between pixels • No scatter operation
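
The "no scatter" limitation can be illustrated on the CPU. A fragment program can read from arbitrary input locations (gather) but writes only to its own output location, whereas a scatter writes to a computed address. The sketch below uses hypothetical function names and shows one common workaround under a strong assumption: when the write targets form a permutation, the scatter can be rewritten as a gather through the inverse mapping.

```python
def scatter(values, targets):
    """Scatter: element i writes its value to out[targets[i]]."""
    out = [None] * len(values)
    for i, value in enumerate(values):
        out[targets[i]] = value
    return out


def gather(values, sources):
    """Gather: output j reads its value from values[sources[j]]."""
    return [values[sources[j]] for j in range(len(sources))]


def invert_permutation(targets):
    """Turn scatter targets into gather sources (targets must be a permutation)."""
    sources = [0] * len(targets)
    for i, t in enumerate(targets):
        sources[t] = i
    return sources
```

For example, `gather(values, invert_permutation(targets))` produces the same list as `scatter(values, targets)` whenever `targets` is a permutation of the indices.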

  42. Away with the graphics: • Early academic work – BrookGPU (2004) • CTM (ATI), 2006 • CUDA (NVIDIA), 2007 • OpenCL, 2008

  43. CUDA: • Compute Unified Device Architecture – Compute-oriented language – Extension of C – A kernel is executed as a number of threads in parallel • Lightweight • 1000s of threads for full efficiency • SIMD (mostly) • Heterogeneous computing – Host and device

  44. Grids, blocks, threads: (diagram) • Host: Kernel 1, Kernel 2 • Device: Grid 1 with Blocks (0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1); Grid 2 • Block (1, 1) expanded into Threads (0, 0) to (4, 2), i.e. 5 × 3 threads
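
The grid/block/thread hierarchy can be sketched by emulating a launch serially. This is plain Python rather than CUDA, and `launch` and `scale_kernel` are hypothetical names; the index arithmetic mirrors the usual CUDA idiom `blockIdx.x * blockDim.x + threadIdx.x`, with the 3 × 2 grid of 5 × 3 blocks taken from the figure.

```python
def launch(grid_dim, block_dim, kernel, *args):
    """Serially emulate a 2D CUDA-style launch: call kernel once per thread."""
    for by in range(grid_dim[1]):
        for bx in range(grid_dim[0]):
            for ty in range(block_dim[1]):
                for tx in range(block_dim[0]):
                    kernel((bx, by), block_dim, (tx, ty), *args)


def scale_kernel(block_idx, block_dim, thread_idx, data, width, height, factor):
    """Each thread handles at most one element of a width x height array."""
    x = block_idx[0] * block_dim[0] + thread_idx[0]
    y = block_idx[1] * block_dim[1] + thread_idx[1]
    if x < width and y < height:  # guard: the grid may overshoot the data
        data[y * width + x] *= factor


# Grid of 3 x 2 blocks, each block 5 x 3 threads: 15 x 6 threads in total.
width, height = 14, 6
data = [1.0] * (width * height)
launch((3, 2), (5, 3), scale_kernel, data, width, height, 2.0)
```

The bounds check matters because the grid covers 15 columns while the data has only 14; on a GPU the surplus threads would simply do nothing.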

  45. CUDA memory space: (diagram) • Device grid with Block (0, 0) and Block (1, 0) • Per block: Shared Memory • Per thread (Thread (0, 0), Thread (1, 0)): Registers, Local Memory • Connected to the host: Global Memory, Constant Memory, Texture Memory

  46. OpenCL, Khronos group: • Much the same as CUDA
      CUDA term                             OpenCL term
      GPU                                   Device
      Multiprocessor                        Compute Unit
      Scalar core                           Processing element
      Global memory                         Global memory
      Shared (per-block) memory             Local memory
      Local memory (automatic, or local)    Private memory
      kernel                                program
      block                                 work-group
      thread                                work item

  47. GPGPU work at the Alexandra Institute: • LEGO, 3D services • Luxion, spatial acceleration structures • BrainReader, optical flow registration

  48. POPI 4D Thorax registration: • Horn & Schunck optical flow estimation • 48x speedup! • Acceleration and validation of optical flow based deformable registration for image-guided radiotherapy. K.Ø. Noe, B.D. de Senneville, U.V. Elstrøm, K. Tanderup, T.S. Sørensen. Acta Oncologica 2008; 47(7):1286-1293.

  49. Optical flow registration: • 3D grid of displacement vectors – From one dataset to another • Find the optimum of the following: [equation not captured in the transcript]

  50. • Euler-Lagrange – Integral to differential equation • Finite difference – Discretized – → Iterative local update scheme • Multiresolution – Global solution
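
The iterative local update scheme can be sketched with the classic 2D Horn & Schunck iteration. This is the textbook 2D form in pure Python, not the 3D GPU implementation from the talk; the function names, the clamped border handling, the four-neighbor average and the default `alpha` are all assumptions made for illustration.

```python
def neighbor_average(field, x, y):
    """Average of the four axis neighbors, clamping at the image border."""
    h, w = len(field), len(field[0])
    xs = (max(x - 1, 0), min(x + 1, w - 1))
    ys = (max(y - 1, 0), min(y + 1, h - 1))
    return (field[y][xs[0]] + field[y][xs[1]] + field[ys[0]][x] + field[ys[1]][x]) / 4.0


def horn_schunck_step(u, v, ix, iy, it, alpha):
    """One Jacobi-style update of the flow (u, v) from image derivatives ix, iy, it."""
    h, w = len(u), len(u[0])
    u_new = [[0.0] * w for _ in range(h)]
    v_new = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            u_bar = neighbor_average(u, x, y)
            v_bar = neighbor_average(v, x, y)
            # Residual of the brightness-constancy constraint at the averaged flow.
            residual = ix[y][x] * u_bar + iy[y][x] * v_bar + it[y][x]
            denom = alpha ** 2 + ix[y][x] ** 2 + iy[y][x] ** 2
            u_new[y][x] = u_bar - ix[y][x] * residual / denom
            v_new[y][x] = v_bar - iy[y][x] * residual / denom
    return u_new, v_new


def horn_schunck(ix, iy, it, alpha=1.0, iterations=50):
    """Run the update repeatedly, starting from a zero flow field."""
    h, w = len(ix), len(ix[0])
    u = [[0.0] * w for _ in range(h)]
    v = [[0.0] * w for _ in range(h)]
    for _ in range(iterations):
        u, v = horn_schunck_step(u, v, ix, iy, it, alpha)
    return u, v
```

Each pixel's new value depends only on the previous iterate in its immediate neighborhood, so all pixels can be updated independently within an iteration, which is what makes the scheme a natural fit for one GPU thread per grid cell.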

  51. BrainReader ApS: • Registration of the hippocampus

  52. Photorealistic... "Easy" enough

  53. Photorealistic interactive images: • Fast raytracing
