Real-Time Rendering (Echtzeitgraphik) Michael Wimmer wimmer@cg.tuwien.ac.at
Walking down the graphics pipeline
Application -> Geometry -> Rasterizer
What for?
- Understanding the rendering pipeline is the key to real-time rendering!
- Insights into how things work
  - Understanding algorithms
- Insights into how fast things work
  - Performance
Vienna University of Technology
Simple Graphics Pipeline
- Often found in textbooks
- Will take a more detailed look into OpenGL
Application -> Geometry -> Rasterizer -> Display
Graphics Pipeline (pre DX10, OpenGL 2)
- Nowadays, every part of the pipeline is hardware accelerated
- Fragment: “pixel”, but with additional info (alpha, depth, stencil, …)
Application (CPU): Application, Driver
Geometry: Command, Geometry
Rasterizer: Rasterization, Texture, Fragment
Display
Fixed Function Pipeline – Dataflow View
[Figure: dataflow diagram. Labels: CPU, system memory, video memory, on-chip cache memory, commands, geometry, pre-TnL cache, vertex shading (T&L), post-TnL cache, triangle setup, rasterization, textures, texture cache, fragment shading and raster operations, frame buffer]
DirectX 10 / OpenGL 3.2 Evolution
[Figure: the older pipeline next to the DirectX 10 pipeline. Older stages: Application (CPU), Driver, Command, Geometry, Rasterization, Texture, Fragment, Display. DirectX 10 stages: Input Assembler (Vertex Buffer, Index Buffer), Vertex Shader (Texture), Geometry Shader (Texture), Stream Out (Buffer), Setup/Rasterization, Pixel Shader (Texture), Output Merger (Depth, Color). Buffers and textures reside in memory.]
OpenGL 3.0
- OpenGL 2.x is not as capable as DirectX 10
  - But: new features are vendor-specific extensions (geometry shaders, streams, …)
  - GLSL is a little more restrictive than HLSL (SM 3.0)
- OpenGL 3.0 did not clean up this mess!
  - OpenGL 2.1 + extensions
  - Geometry shaders are only an extension
  - New: deprecation mechanism
- OpenGL 4.x
  - New extensions
  - OpenGL ES compatibility!
DirectX 11 / OpenGL 4.0 Evolution
[Figure: pipeline diagram with legend fixed / programmable / memory. Stages: Input Assembler, Vertex Shader, Control Point Shader, Tessellator, Geometry Shader, Stream out, Setup, Rasterizer, Pixel Shader, Output Merger. Shader resources: Constant, Sampler, Texture. Memory: Vertex Buffer, Index Buffer, Stream Buffer, Depth Stencil, Render Target. Annotation: Not the final place in the pipeline!!!]
DirectX 11
- Tessellation
  - At unexpected position!
- Compute shaders
- Multithreading
  - To reduce state change overhead
- Dynamic shader linking
- HDR texture compression
- Many other features...
DirectX 11 Pipeline
Application
- Generate database (scene description)
  - Usually only once
  - Load from disk
  - Build acceleration structures (hierarchy, …)
- Simulation (animation, AI, physics)
- Input event handlers
- Modify data structures
- Database traversal
- Shaders (vertex, geometry, fragment)
Driver
- Maintain graphics API state
- Command interpretation/translation
  - Host commands -> GPU commands
- Handle data transfer
- Memory management
- Emulation of missing hardware features
- Usually huge overhead!
  - Significantly reduced in DX10
Geometry Stage
Command -> Vertex Processing -> Tessellation -> Primitive Assembly -> Geometry Shading -> Clipping -> Perspective Division -> Culling
Command
- Command buffering (!)
- Command interpretation
- Unpack and perform format conversion (“Input Assembler”)

    glLoadIdentity();
    glMultMatrixf(T);   /* T: transformation matrix, 16 GLfloat values in column-major order */
    glBegin(GL_TRIANGLE_STRIP);
    /* each glColor3f sets the color of the vertex that follows it */
    glColor3f(0.0, 0.5, 0.0);  glVertex3f(0.0, 0.0, 0.0);
    glColor3f(0.5, 0.0, 0.0);  glVertex3f(1.0, 0.0, 0.0);
    glColor3f(0.0, 0.5, 0.0);  glVertex3f(0.0, 1.0, 0.0);
    glColor3f(0.5, 0.0, 0.0);  glVertex3f(1.0, 1.0, 0.0);
    glEnd();
Vertex Processing
- Transformation
object coordinates -> [Modelview Matrix] -> eye coordinates -> [Projection Matrix] -> clip coordinates -> [Perspective Division] -> normalized device coordinates -> [Viewport Transform] -> window coordinates
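The transformation chain on this slide can be sketched in plain C. This is only an illustration, assuming column-major 4x4 matrices (the OpenGL memory layout) and the default depth range [0, 1]; the names Vec4, Viewport, mat4_mul_vec4 and object_to_window are made up for this sketch and are not part of any API.

```c
/* 4-component homogeneous vector (hypothetical helper type). */
typedef struct { float x, y, z, w; } Vec4;

/* Hypothetical viewport description: lower-left corner and size in pixels. */
typedef struct { float x, y, width, height; } Viewport;

/* Multiply a column-major 4x4 matrix by a column vector. */
Vec4 mat4_mul_vec4(const float m[16], Vec4 v)
{
    Vec4 r;
    r.x = m[0] * v.x + m[4] * v.y + m[8]  * v.z + m[12] * v.w;
    r.y = m[1] * v.x + m[5] * v.y + m[9]  * v.z + m[13] * v.w;
    r.z = m[2] * v.x + m[6] * v.y + m[10] * v.z + m[14] * v.w;
    r.w = m[3] * v.x + m[7] * v.y + m[11] * v.z + m[15] * v.w;
    return r;
}

/* object -> eye -> clip -> normalized device -> window coordinates.
   Assumes clip.w != 0, i.e. the vertex has survived clipping. */
Vec4 object_to_window(const float modelview[16], const float projection[16],
                      Viewport vp, Vec4 object)
{
    Vec4 eye  = mat4_mul_vec4(modelview, object);    /* modelview matrix  */
    Vec4 clip = mat4_mul_vec4(projection, eye);      /* projection matrix */
    Vec4 ndc  = { clip.x / clip.w, clip.y / clip.w,  /* perspective division */
                  clip.z / clip.w, 1.0f };
    Vec4 win;                                        /* viewport transform */
    win.x = vp.x + (ndc.x + 1.0f) * 0.5f * vp.width;
    win.y = vp.y + (ndc.y + 1.0f) * 0.5f * vp.height;
    win.z = (ndc.z + 1.0f) * 0.5f;                   /* assumed depth range [0, 1] */
    win.w = 1.0f;
    return win;
}
```

With identity matrices and a 640x480 viewport at (0, 0), the object-space origin lands in the middle of the window at (320, 240).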
Vertex Processing
- Fixed function pipeline:
  - User has to provide matrices, the rest happens automatically
- Programmable pipeline:
  - User has to provide matrices/other data to shader
  - Shader code transforms vertex explicitly
  - We can do whatever we want with the vertex!
  - Usually a gl_ModelViewProjectionMatrix is provided
  - In GLSL shader: gl_Position = ftransform();
Vertex Processing
- Lighting
- Texture coordinate generation and/or transformation
- Vertex shading for special effects
Object-space triangles -> [T] -> Screen-space lit triangles
Tessellation
- If just triangles, nothing needs to be done, otherwise:
  - Evaluation of polynomials for curved surfaces
  - Create vertices (tessellation)
- DirectX 11 specifies this in hardware!
  - 3 new shader stages!!!
  - Still not trivial (special algorithms required)
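To make "evaluation of polynomials" and "create vertices" concrete, here is a small C sketch that tessellates a 2D cubic Bézier curve with de Casteljau's algorithm. It is an illustration on the CPU only and says nothing about how the DirectX 11 hardware stages work; Point2, bezier3_eval and bezier3_tessellate are hypothetical names.

```c
/* 2D point (hypothetical helper type). */
typedef struct { float x, y; } Point2;

/* Linear interpolation between two points. */
Point2 lerp2(Point2 a, Point2 b, float t)
{
    Point2 r = { a.x + t * (b.x - a.x), a.y + t * (b.y - a.y) };
    return r;
}

/* Evaluate a cubic Bezier curve at t in [0, 1] using de Casteljau's algorithm:
   repeated linear interpolation of the four control points. */
Point2 bezier3_eval(const Point2 cp[4], float t)
{
    Point2 a = lerp2(cp[0], cp[1], t);
    Point2 b = lerp2(cp[1], cp[2], t);
    Point2 c = lerp2(cp[2], cp[3], t);
    Point2 d = lerp2(a, b, t);
    Point2 e = lerp2(b, c, t);
    return lerp2(d, e, t);
}

/* Create segments + 1 vertices along the curve.
   Requires segments >= 1; out must have room for segments + 1 points. */
void bezier3_tessellate(const Point2 cp[4], int segments, Point2 *out)
{
    for (int i = 0; i <= segments; ++i)
        out[i] = bezier3_eval(cp, (float)i / (float)segments);
}
```

A uniform parameter step is the simplest choice; adaptive schemes would place more vertices where the curve bends more.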
DirectX 11 Tessellation
Tessellation Example
- Optimally tessellated!
Geometry Shader
- Calculations on a primitive (triangle)
- Access to neighbor triangles
- Limited output (1024 32-bit values)
  - No general tessellation!
- Applications:
  - Render to cubemap
  - Shadow volume generation
  - Triangle extension for ray tracing
  - Extrusion operations (fur rendering)
Rest of Geometry Stage
- Primitive assembly
- Geometry shader
- Clipping (in homogeneous coordinates)
- Perspective division, viewport transform
- Culling
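Two of these steps reduce to very small tests. The C sketch below assumes the OpenGL clip-volume convention (-w <= x, y, z <= w) and counter-clockwise front faces (the OpenGL default); ClipVertex and the function names are hypothetical. Real clipping also has to cut partially visible primitives, which is not shown here.

```c
/* Vertex in clip coordinates (hypothetical helper type). */
typedef struct { float x, y, z, w; } ClipVertex;

/* Clipping in homogeneous coordinates: a vertex is inside the view volume
   when -w <= x, y, z <= w (OpenGL convention, assumed here). */
int inside_clip_volume(ClipVertex v)
{
    return -v.w <= v.x && v.x <= v.w &&
           -v.w <= v.y && v.y <= v.w &&
           -v.w <= v.z && v.z <= v.w;
}

/* Twice the signed area of a window-space triangle.
   Positive means counter-clockwise vertex order. */
float signed_area2(float x0, float y0, float x1, float y1, float x2, float y2)
{
    return (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0);
}

/* Back-face culling after the viewport transform, assuming counter-clockwise
   triangles are front-facing. Zero-area triangles are discarded as well. */
int is_back_facing(float x0, float y0, float x1, float y1, float x2, float y2)
{
    return signed_area2(x0, y0, x1, y1, x2, y2) <= 0.0f;
}
```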
Rasterization Stage
Stages: Triangle Setup, Rasterization, Texture Processing, Fragment Processing, Raster Operations
Rasterization
- Setup (per-triangle)
- Sampling (triangle = {fragments})
- Interpolation (interpolate colors and coordinates)
Screen-space triangles -> Fragments
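The interpolation step can be illustrated with barycentric weights computed from edge functions. This C sketch interpolates linearly in screen space and leaves out perspective correction; Vec2, Color, edge_fn and interpolate_color are names invented for the example.

```c
typedef struct { float x, y; } Vec2;      /* screen-space position */
typedef struct { float r, g, b; } Color;  /* per-vertex color */

/* Edge function: twice the signed area of the triangle (a, b, p). */
float edge_fn(Vec2 a, Vec2 b, Vec2 p)
{
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

/* Interpolate the vertex colors c[0..2] of triangle v[0..2] at point p.
   Returns 0 for a degenerate (zero-area) triangle, 1 otherwise. */
int interpolate_color(const Vec2 v[3], const Color c[3], Vec2 p, Color *out)
{
    float area = edge_fn(v[0], v[1], v[2]);
    if (area == 0.0f)
        return 0;

    /* Barycentric weights: each is the sub-triangle opposite a vertex,
       divided by the full triangle. They sum to 1. */
    float w0 = edge_fn(v[1], v[2], p) / area;
    float w1 = edge_fn(v[2], v[0], p) / area;
    float w2 = edge_fn(v[0], v[1], p) / area;

    out->r = w0 * c[0].r + w1 * c[1].r + w2 * c[2].r;
    out->g = w0 * c[0].g + w1 * c[1].g + w2 * c[2].g;
    out->b = w0 * c[0].b + w1 * c[1].b + w2 * c[2].b;
    return 1;
}
```

The same weights can interpolate any per-vertex attribute, for example texture coordinates or depth.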
Rasterization
- Sampling
  - Inclusion determination
- In tile order: improves cache coherency
  - Tile sizes vendor/generation specific
  - Old graphics cards: 16x64
  - New: 4x4
  - Smaller tile size favors conditionals in shaders
  - All tile fragments calculated in parallel on modern hardware
Rasterization – Coordinates
- Fragments represent “future” pixels
- Pixel center at coordinate (2.5, 1.5)!
[Figure: grid of window coordinates, x and y axes from 0.0 to 3.0, origin at the lower left corner of the window, pixel (2,1) highlighted]
Rasterization – Rules
- Separate rule for each primitive
- Non-ambiguous!
- Polygons:
  - Pixel center contained in polygon
  - On-edge pixels: only one is rasterized
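A possible way to express the polygon rule in C: test the pixel center (x + 0.5, y + 0.5) against the three edges of a counter-clockwise triangle, and break ties for centers lying exactly on an edge. The tie-breaking rule below is a simple illustrative choice, not necessarily the one a particular API or GPU uses; all names are hypothetical.

```c
typedef struct { float x, y; } Vec2;

/* Does the directed edge a->b of a counter-clockwise triangle cover p?
   Points strictly on the inner side are covered. Points exactly on the edge
   are assigned by the sign of the edge coefficients, so that a shared edge
   (traversed in opposite directions by two neighbors) is claimed only once. */
int edge_covers(Vec2 a, Vec2 b, Vec2 p)
{
    float A = a.y - b.y;   /* coefficient of x in the edge function */
    float B = b.x - a.x;   /* coefficient of y in the edge function */
    float e = A * (p.x - a.x) + B * (p.y - a.y);
    if (e != 0.0f)
        return e > 0.0f;
    return A > 0.0f || (A == 0.0f && B > 0.0f);
}

/* Is pixel (px, py) rasterized for the counter-clockwise triangle v[0..2]?
   The sample position is the pixel center. */
int pixel_covered(const Vec2 v[3], int px, int py)
{
    Vec2 c = { (float)px + 0.5f, (float)py + 0.5f };
    return edge_covers(v[0], v[1], c) &&
           edge_covers(v[1], v[2], c) &&
           edge_covers(v[2], v[0], c);
}
```

For two triangles that split a 2x2 pixel square along its diagonal, the two pixel centers on the diagonal are rasterized by exactly one of the triangles, so there is neither a gap nor double coverage.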
Texture
- Texture “transformation” and projection
  - E.g., projective textures
- Texture address calculation (programmable in shader)
- Texture filtering
Fragments -> Texture Fragments
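Texture address calculation and filtering can be sketched for the simplest case, a bilinear lookup in a single-channel texture. The sketch assumes normalized coordinates (u, v) in [0, 1], texel centers at (i + 0.5) / size, and clamp-to-edge addressing; fetch_texel and sample_bilinear are made-up names.

```c
/* Fetch one texel with clamp-to-edge addressing (assumed wrap mode). */
float fetch_texel(const float *tex, int w, int h, int x, int y)
{
    if (x < 0) x = 0;
    if (x > w - 1) x = w - 1;
    if (y < 0) y = 0;
    if (y > h - 1) y = h - 1;
    return tex[y * w + x];
}

/* Largest integer <= f, written out to avoid depending on the math library. */
int floor_to_int(float f)
{
    int i = (int)f;
    if ((float)i > f)
        --i;
    return i;
}

/* Bilinear filtering: weighted average of the four texels around (u, v). */
float sample_bilinear(const float *tex, int w, int h, float u, float v)
{
    /* Address calculation: normalized coordinates to texel space. */
    float x = u * (float)w - 0.5f;
    float y = v * (float)h - 0.5f;
    int x0 = floor_to_int(x);
    int y0 = floor_to_int(y);
    float fx = x - (float)x0;   /* horizontal weight of the right texels */
    float fy = y - (float)y0;   /* vertical weight of the second row */

    float row0 = fetch_texel(tex, w, h, x0, y0)     * (1.0f - fx)
               + fetch_texel(tex, w, h, x0 + 1, y0) * fx;
    float row1 = fetch_texel(tex, w, h, x0, y0 + 1)     * (1.0f - fx)
               + fetch_texel(tex, w, h, x0 + 1, y0 + 1) * fx;
    return row0 * (1.0f - fy) + row1 * fy;
}
```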
Fragment
- Texture operations (combinations, modulations, animations etc.)
Texture Fragments -> Textured Fragments
Raster Tests
- Ownership
  - Is pixel obscured by other window?
- Scissor test
  - Only render to scissor rectangle
- Depth test
  - Test according to z-buffer
- Alpha test
  - Test according to alpha-value
- Stencil test
  - Test according to stencil buffer
Textured Fragments -> Framebuffer Pixels
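Two of these tests are easy to write down in C. The sketch assumes a "less than" depth comparison (the OpenGL default) and a z-buffer stored as a row-major float array; ScissorRect, scissor_test and depth_test_less are hypothetical names.

```c
/* Scissor rectangle in window coordinates (hypothetical helper type). */
typedef struct { int x, y, width, height; } ScissorRect;

/* Scissor test: the fragment passes only inside the scissor rectangle. */
int scissor_test(ScissorRect r, int px, int py)
{
    return px >= r.x && px < r.x + r.width &&
           py >= r.y && py < r.y + r.height;
}

/* Depth test with a "less than" comparison against the z-buffer.
   On success the stored depth is replaced by the fragment's depth. */
int depth_test_less(float *zbuffer, int width, int px, int py, float z)
{
    float *stored = &zbuffer[py * width + px];
    if (z < *stored) {
        *stored = z;
        return 1;
    }
    return 0;
}
```

A fragment that fails any test is discarded and never reaches the frame buffer.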
Raster Operations
- Blending or compositing
- Dithering
- Logical operations
Textured Fragments -> Framebuffer Pixels
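As an example of blending, the following C sketch combines an incoming fragment with the pixel already in the frame buffer, using source alpha as the weight. This corresponds to the blend factors GL_SRC_ALPHA and GL_ONE_MINUS_SRC_ALPHA in OpenGL; it is only one of many possible blend configurations, and Rgba and blend_src_alpha are names chosen for the sketch.

```c
/* Color with alpha, components in [0, 1] (hypothetical helper type). */
typedef struct { float r, g, b, a; } Rgba;

/* result = src * src.a + dst * (1 - src.a), applied to all four channels. */
Rgba blend_src_alpha(Rgba src, Rgba dst)
{
    float s = src.a;         /* weight of the incoming fragment */
    float d = 1.0f - src.a;  /* weight of the frame buffer pixel */
    Rgba out = {
        src.r * s + dst.r * d,
        src.g * s + dst.g * d,
        src.b * s + dst.b * d,
        src.a * s + dst.a * d
    };
    return out;
}
```

A half-transparent red fragment over an opaque blue pixel gives (0.5, 0, 0.5) in the color channels.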
Raster Operations
- After fragment color calculation (“Output Merger”)
Fragment and associated data -> Pixel Ownership Test -> Scissor Test -> Alpha Test -> Stencil Test (Stencil Buffer) -> Depth Test (Depth Buffer) -> Blending (RGBA only) -> Dithering -> Logicop -> Frame Buffer
Display
- Gamma correction
- Digital to analog conversion if necessary
Framebuffer Pixels -> Light
Display
- Frame buffer pixel format: RGBA vs. index (obsolete)
- Bits: 16, 32, 128 bit floating point, …
- Double buffered vs. single buffered
- Quad-buffered for stereo
- Overlays (extra bit planes) for GUI
- Auxiliary buffers: alpha, stencil
Functionality vs. Frequency
- Geometry processing = per-vertex
  - Transformation and Lighting (T&L)
  - Historically floating point, complex operations
  - Today: fully programmable flow control, texture lookup
  - 20-1500 million vertices per second
- Fragment processing = per-fragment
  - Blending and texture combination
  - Historically fixed point and limited operations
  - Up to 50 billion fragments (“Gigatexel”/sec)
  - Floating point, programmable complex operations