NVPRO-PIPELINE: A RESEARCH RENDERING PIPELINE
Markus Tavenrath (matavenrath@nvidia.com)
Senior Developer Technology Engineer, NVIDIA
NVPRO-PIPELINE
GPU performance has improved faster than CPU performance
  In the past, applications were GPU bound
  Today, applications tend to become CPU bound
nvpro-pipeline started as a research platform to address this issue
http://github.com/nvpro-pipeline
[Figure: Peak Double Precision FLOPS in GFLOPS (0 to 3500) per year, 2008 to 2014, NVIDIA GPU vs. x86 CPU]
CPU BOUNDEDNESS REASONS
Application (pipeline for application-like experiments)
  Scene traversal
  Culling
  Other, e.g. animation, simulation
  Pipeline for OpenGL techniques
Driver (pipeline for driver verification)
  Inefficient functionality like glBegin/glEnd
  Functionality which is not yet optimized
  CPU->GPU data transfer
NVPRO-PIPELINE MODULES
SceneGraph [dp::sg]
  Algorithms
  SceneTree (XBAR)
  Loaders/Savers
  Renderer for RiX::GL
RiX (Renderer) [dp::rix]
  GL Backend [dp::rix::gl]
  Vulkan backend (planned)
Effect System [dp::fx]
  XML Based for GLSL [dp::fx::xml]
Utilities [dp::util]
  Math library [dp::math]
  Culling [dp::culling]
  Windowing [dp::ui]
  Manipulators [dp::ui::manipulator]
RENDERING PIPELINE
SceneGraph: scene abstraction, algorithms, loaders, savers, ...
SceneTree (XBAR): scene traversal
Rendering Algorithm: developer's code with the rendering algorithm
EffectFramework: shader abstraction
RiX: OpenGL abstraction, hides VAB, UBO, bindless, ...
SCENEGRAPH
Traverse & Render
Simplified version of the SceniX SceneGraph
  GeoNodes, Groups, Transforms, Billboards, and Switches are still available
  Animated* objects have been removed to make development easier
  New property-based animation system is prepared, but not yet active (LinkManager)
[Figure: example scene graph with nodes G0; T0, T1; S0, G1; T2, T3; S1, S2]
SCENEGRAPH TRAVERSAL COST
Memory cost
  Objects are scattered in RAM: latency when accessing an object
  Objects are big: traversing one object might touch multiple cache lines
Instruction calling cost
    void processNode(Node *node) {            // function call
      switch (node->getType()) {              // branch misprediction
        case Group:
          handleGroup((Group*)node);          // virtual function call
          break;
        case Transform:
          handleTransform((Transform*)node);
          break;
        case GeoNode:
          handleGeoNode((GeoNode*)node);
          break;
      }
    }
Transformation cost
  Compute accumulated transformations during traversal
Hierarchy cost
  Deep hierarchy adds 'needless' traversal cost (only 5 of 14 nodes in the example are of interest)
[Figure: example scene graph with nodes G0; T0, T1; S0, G1; T2, T3; S1, S2]
SCENETREE REQUIREMENTS
Generate on the fly from SceneGraph
Incremental updates
  Minimal amount of work on changes
Caching mechanism per path
  No recomputation of 'unchanged' values
Flat list of GeoNodes
  Get rid of traversal
Memory efficient
  Don't copy data, keep references
[Figure: SceneTree with nodes G0; T0, T1; S0, G1, G1'; T2, T3, T2', T3'; S1, S2, S1', S2'. Flat list: S0, S1, S2, S1', S2']
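The flat-list requirement can be illustrated with a small sketch. It is not the dp::sg or XBAR implementation: `Node`, `FlatEntry`, `flatten`, `makeNode`, and `buildExampleGraph` are hypothetical names, a single `translateX` float stands in for a full transformation matrix, and the parent/child links of the example graph are an assumption read off the slide figure. Each flat-list entry keeps a pointer to the shared GeoNode (no copy) plus the transform accumulated along its path, so a GeoNode reachable through two paths yields two entries.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical, simplified node type; not the dp::sg classes.
struct Node {
    enum class Type { Group, Transform, GeoNode };
    Type type = Type::Group;
    float translateX = 0.0f;  // Transform only: stand-in for a full matrix
    int geometryId = -1;      // GeoNode only
    std::vector<std::shared_ptr<Node>> children;
};

struct FlatEntry {
    const Node* geoNode;      // reference to the shared GeoNode, no copy
    float worldTranslateX;    // transform accumulated along this path
};

// Depth-first expansion of the graph into one entry per GeoNode path.
inline void flatten(const Node& node, float accumulated, std::vector<FlatEntry>& out) {
    if (node.type == Node::Type::Transform) accumulated += node.translateX;
    if (node.type == Node::Type::GeoNode) {
        out.push_back(FlatEntry{&node, accumulated});
        return;
    }
    for (const auto& child : node.children) {
        assert(child != nullptr);
        flatten(*child, accumulated, out);
    }
}

inline std::shared_ptr<Node> makeNode(Node::Type type, float tx = 0.0f, int geometryId = -1) {
    auto node = std::make_shared<Node>();
    node->type = type;
    node->translateX = tx;
    node->geometryId = geometryId;
    return node;
}

// Assumed topology: G0 -> {T0, T1}, T0 -> {S0, G1}, T1 -> {G1},
// G1 -> {T2, T3}, T2 -> {S1}, T3 -> {S2}. G1 is shared by T0 and T1.
inline std::shared_ptr<Node> buildExampleGraph() {
    auto g0 = makeNode(Node::Type::Group);
    auto t0 = makeNode(Node::Type::Transform, 1.0f);
    auto t1 = makeNode(Node::Type::Transform, 10.0f);
    auto s0 = makeNode(Node::Type::GeoNode, 0.0f, 0);
    auto g1 = makeNode(Node::Type::Group);
    auto t2 = makeNode(Node::Type::Transform, 100.0f);
    auto t3 = makeNode(Node::Type::Transform, 200.0f);
    auto s1 = makeNode(Node::Type::GeoNode, 0.0f, 1);
    auto s2 = makeNode(Node::Type::GeoNode, 0.0f, 2);
    t2->children = {s1};
    t3->children = {s2};
    g1->children = {t2, t3};
    t0->children = {s0, g1};
    t1->children = {g1};
    g0->children = {t0, t1};
    return g0;
}
```

Calling `flatten(*buildExampleGraph(), 0.0f, flat)` produces five entries in the order S0, S1, S2, S1', S2'. S1 and S1' point at the same GeoNode but carry different accumulated transforms, which is the per-path caching the requirements ask for.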
SCENETREE CONSTRUCTION
Event based updates
[Figure sequence: SceneGraph (left), SceneTree (right), flat list (bottom)]
Overview
  SceneGraph: G0; T0, T1; S0, G1; T2, T3; S1, S2
  SceneTree: G0; T0, T1; S0, G1, G1'; T2, T3, T2', T3'; S1, S2, S1', S2'
  Flat list: S0, S1, S2, S1', S2'
Step 1: initial state
  SceneGraph: G0; T0; S0
  SceneTree: G0; T0; S0
  Flat list: S0
Step 2: Event: Node added, Event: GeoNode added
  SceneGraph: G0; T0; S0, G1; T2, T3; S1, S2
  SceneTree: G0; T0; S0, G1; T2, T3; S1, S2
  Flat list: S0, S1, S2
Step 3: Event: Node removed, Event: GeoNode removed
  SceneGraph: G0; T0; S0, G1; T2; S1
  SceneTree: G0; T0; S0, G1; T2, T3; S1, S2
  Flat list: S0, S1, S2
Step 4: Event: Node added, Event: GeoNode S1 added
  SceneGraph: G0; T0, T1; S0, G1; T2; S1
  SceneTree: G0; T0, T1; S0, G1, G1'; T2, T2'; S1, S1'
  Flat list: S0, S1, S1'
Step 5: Event: Property Matrix of a Transform changed, Event: Transform changed (2x)
  SceneGraph: G0; T0, T1; S0, G1; T2; S1
  SceneTree: G0; T0, T1; S0, G1, G1'; T2, T2'; S1, S1'
  Flat list: S0, S1, S1'
Construction: S3032 Advanced Scenegraph Rendering Pipeline (GTC 2013)
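The event sequence on these slides can be mimicked with a small observer-style sketch. This is an illustration under assumptions, not the XBAR code: `EventType`, `Event`, and `FlatList` are hypothetical names, tree instances are identified by a path string such as "G0/T0/G1/T2/S1", and a `dirty` flag stands in for "cached values of this path need recomputation".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical event types, loosely named after the slide captions.
enum class EventType { GeoNodeAdded, GeoNodeRemoved, TransformChanged };

struct Event {
    EventType type;
    std::string path;  // tree path of the affected instance
};

class FlatList {
public:
    // Applies one event; only the affected entries are touched.
    void onEvent(const Event& e) {
        switch (e.type) {
        case EventType::GeoNodeAdded:
            entries_.push_back(Entry{e.path, true});
            break;
        case EventType::GeoNodeRemoved:
            entries_.erase(std::remove_if(entries_.begin(), entries_.end(),
                               [&e](const Entry& x) { return x.path == e.path; }),
                           entries_.end());
            break;
        case EventType::TransformChanged: {
            // Mark only GeoNode instances below the changed transform.
            const std::string prefix = e.path + "/";
            for (Entry& x : entries_)
                if (x.path.compare(0, prefix.size(), prefix) == 0) x.dirty = true;
            break;
        }
        }
    }

    std::size_t size() const { return entries_.size(); }

    std::size_t dirtyCount() const {
        return static_cast<std::size_t>(std::count_if(
            entries_.begin(), entries_.end(), [](const Entry& x) { return x.dirty; }));
    }

    // Stand-in for "recompute cached values of dirty entries".
    void clearDirty() {
        for (Entry& x : entries_) x.dirty = false;
    }

private:
    struct Entry {
        std::string path;
        bool dirty;
    };
    std::vector<Entry> entries_;
};
```

Replaying the slides: adding S0, S1, and S2 gives three entries, removing S2 leaves two, and adding the second instance of S1 under T1 gives three again. A matrix change on T2 arrives as two TransformChanged events (one per tree instance of T2) and marks exactly the two S1 entries dirty while S0 stays untouched.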
SCENERENDERER
Observes the SceneTree to track GeoNodes in arrays
dp::sg::renderer::rix::gl is an 'example' renderer
Render Scene
  Update resources
  Compute near/far plane
  Frustum culling
  Depth pass
  Opaque pass
  Transparent pass
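Two of the listed steps, frustum culling and the near/far computation, work naturally on a flat array of bounds. The following is a generic sketch and not the dp::culling or renderer code: it assumes bounding spheres, frustum planes with inward-pointing normals, and a view direction along +z. `Vec3`, `Sphere`, `Plane`, `DepthRange`, `makeBoxFrustum`, `cull`, and `computeDepthRange` are hypothetical names, and the box-shaped frustum only keeps the example short.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
inline float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Sphere { Vec3 center; float radius; };
// Plane dot(normal, p) + d = 0, normal pointing into the frustum.
struct Plane { Vec3 normal; float d; };
struct DepthRange { float nearZ, farZ; };

// Box-shaped "frustum" (orthographic-style): |x| <= hw, |y| <= hh, nearZ <= z <= farZ.
inline std::array<Plane, 6> makeBoxFrustum(float hw, float hh, float nearZ, float farZ) {
    std::array<Plane, 6> f = {
        Plane{Vec3{1.0f, 0.0f, 0.0f}, hw},     Plane{Vec3{-1.0f, 0.0f, 0.0f}, hw},
        Plane{Vec3{0.0f, 1.0f, 0.0f}, hh},     Plane{Vec3{0.0f, -1.0f, 0.0f}, hh},
        Plane{Vec3{0.0f, 0.0f, 1.0f}, -nearZ}, Plane{Vec3{0.0f, 0.0f, -1.0f}, farZ}};
    return f;
}

// A sphere is rejected as soon as it lies completely behind one plane.
inline bool isVisible(const Sphere& s, const std::array<Plane, 6>& frustum) {
    for (const Plane& p : frustum)
        if (dot(p.normal, s.center) + p.d < -s.radius) return false;
    return true;
}

// Returns the indices of all visible objects in the flat array.
inline std::vector<std::size_t> cull(const std::vector<Sphere>& bounds,
                                     const std::array<Plane, 6>& frustum) {
    std::vector<std::size_t> visible;
    for (std::size_t i = 0; i < bounds.size(); ++i)
        if (isVisible(bounds[i], frustum)) visible.push_back(i);
    return visible;
}

// Tight near/far range along +z over the visible objects (must be non-empty).
inline DepthRange computeDepthRange(const std::vector<Sphere>& bounds,
                                    const std::vector<std::size_t>& visible) {
    assert(!visible.empty());
    DepthRange r{bounds[visible[0]].center.z - bounds[visible[0]].radius,
                 bounds[visible[0]].center.z + bounds[visible[0]].radius};
    for (std::size_t i : visible) {
        r.nearZ = std::min(r.nearZ, bounds[i].center.z - bounds[i].radius);
        r.farZ = std::max(r.farZ, bounds[i].center.z + bounds[i].radius);
    }
    return r;
}
```

The slide lists the near/far computation before culling; the sketch derives the range from the culled set purely to show both steps on the same array, and the actual renderer may order them differently.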
ANATOMY OF A SHADER
Shader part: Version Header. Pipeline module: Renderer
    // version header & extensions
    #version 330
    #extension GL_NV_shader_buffer_load : enable
Shader part: Uniforms. Pipeline module: Material description
    // Uniforms
    uniform struct Parameters {
      float parameter;
    };
Shader part: Attributes. Pipeline module: (Material description)
    // vertex attributes (vertex shader)
    layout(location = 0) in vec4 attrPosition;
Shader part: Shader Stage variables. Pipeline module: Hardcoded or generated
    in/out vec3 varPosition;
Shader part: Library functions. Pipeline module: User provided to generator
    Bsdf*(params);
    determineMaterialColor();
    determineNormal();
Shader part: User Implementation. Pipeline module: Material description or rendering system
    void main()
    {
      // some code
    }
PARAMETER GROUPING
EffectSpec: ParameterGroupSpecs and their binding frequency
  Shader independent globals, e.g. camera: constant
  Shader dependent globals, e.g. environment map
  Light, e.g. light sources and shadow maps: rare
  Material parameters without objects, e.g. float, int and bool: frequent
  Material parameters with objects, e.g. textures and buffers
  Object parameters, e.g. position/rotation/scaling: always
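One reason to split parameters by binding frequency is that a renderer can skip rebinding a group whose data did not change between consecutive draws. The sketch below counts binds per frequency level for a sequence of draw items. It is an illustration only: `BindingFrequency`, `DrawItem`, and `countBinds` are hypothetical names, the four levels reuse the labels from the slide, and non-negative integer ids stand in for the bound parameter data.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Frequency labels as they appear on the slide.
enum BindingFrequency : std::size_t {
    Constant = 0,   // e.g. camera
    Rare = 1,       // e.g. lights
    Frequent = 2,   // e.g. material parameters
    Always = 3,     // e.g. per-object parameters
    FrequencyCount = 4
};

// One draw call: the id of the parameter data used at each frequency level.
struct DrawItem {
    std::array<int, FrequencyCount> dataId;
};

// Counts how many binds each level needs when a group is only rebound
// if its data differs from what is currently bound.
inline std::array<int, FrequencyCount> countBinds(const std::vector<DrawItem>& items) {
    std::array<int, FrequencyCount> bound;
    bound.fill(-1);  // nothing bound yet; ids are assumed to be >= 0
    std::array<int, FrequencyCount> binds;
    binds.fill(0);
    for (const DrawItem& item : items) {
        for (std::size_t level = 0; level < FrequencyCount; ++level) {
            assert(item.dataId[level] >= 0);
            if (item.dataId[level] != bound[level]) {
                bound[level] = item.dataId[level];
                ++binds[level];
            }
        }
    }
    return binds;
}
```

For four objects that share one camera and one light setup, use two materials, and each have their own object parameters, the function reports one bind each for the constant and rare groups, two for the material group, and four for the object group.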
EFFECT FRAMEWORK GOALS
Unified shader interface with support for multiple rendering APIs
Code generation for different kinds of parameter techniques
[Figure: effects (Phong, Car paint, PBR) mapped onto parameter techniques (Uniforms, Uniform Buffer Objects, Shader Storage Buffer Objects, Shader Buffer Load, Other Graphics API)]
PARAMETER SHADER CODE GENERATION
ParameterGroup phong_fs
    vec3 ambient
    vec3 diffuse
    vec3 specular
    float specularExp
Generated code: Uniforms
    uniform vec3 ambient;
    uniform vec3 diffuse;
    uniform vec3 specular;
    uniform float specularExp;
Generated code: UBO
    layout(std140) uniform ubo_phong_fs {
      vec3 ambient;
      vec3 diffuse;
      vec3 specular;
      float specularExp;
    };
Generated code: shaderbufferload
    struct sbl_phong_fs {
      vec3 ambient;
      vec3 diffuse;
      vec3 specular;
      float specularExp;
    };
    uniform sbl_phong_fs *sys_phong_fs;
    #define ambient sys_phong_fs->ambient
    #define diffuse sys_phong_fs->diffuse
    #define specular sys_phong_fs->specular
    #define specularExp sys_phong_fs->specularExp
Details: S3032 Advanced Scenegraph Rendering Pipeline (GTC 2013)
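The three variants on this slide differ only in how the same parameter list is wrapped, which makes them easy to generate from one description. The following is a minimal string-building sketch, not the dp::fx generator: `Parameter`, `ParameterGroup`, `Technique`, and `generateParameterCode` are hypothetical names, and the `ubo_`, `sbl_`, and `sys_` prefixes are copied from the slide.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Parameter {
    std::string type;  // GLSL type, e.g. "vec3"
    std::string name;  // name used by the shader author, e.g. "ambient"
};

struct ParameterGroup {
    std::string name;  // e.g. "phong_fs"
    std::vector<Parameter> params;
};

enum class Technique { Uniforms, UniformBuffer, ShaderBufferLoad };

// Emits GLSL declarations so that shader code can refer to each parameter
// by its plain name regardless of the chosen technique.
inline std::string generateParameterCode(const ParameterGroup& group, Technique technique) {
    std::string members;  // member list shared by the block and struct variants
    for (const Parameter& p : group.params)
        members += "  " + p.type + " " + p.name + ";\n";

    switch (technique) {
    case Technique::Uniforms: {
        std::string code;
        for (const Parameter& p : group.params)
            code += "uniform " + p.type + " " + p.name + ";\n";
        return code;
    }
    case Technique::UniformBuffer:
        return "layout(std140) uniform ubo_" + group.name + " {\n" + members + "};\n";
    case Technique::ShaderBufferLoad: {
        std::string code = "struct sbl_" + group.name + " {\n" + members + "};\n";
        code += "uniform sbl_" + group.name + " *sys_" + group.name + ";\n";
        // Macros redirect the plain names to the pointer members.
        for (const Parameter& p : group.params)
            code += "#define " + p.name + " sys_" + group.name + "->" + p.name + "\n";
        return code;
    }
    }
    return std::string();
}
```

With the `phong_fs` group from the slide as input, the `Uniforms` technique yields four `uniform` declarations, while the `ShaderBufferLoad` technique yields the struct, the pointer uniform, and one `#define` per parameter, so the body of the shader stays identical across techniques.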
RIX
Rendering API abstraction with OpenGL backend in place
Hides implementation details which generate all kinds of (OpenGL) streams
Parameter Updates
  glUniform
  glBufferSubData
  glBufferAddressRangeNV
  glBindBufferRange
Buffer Upload
  glBufferSubData
  Batched
  Persistent Mapped
Vertex Attribute
  Generic Attributes (2.1)
  Vertex Array Objects (VAO, 3.0)
  Vertex Attrib Binding (VAB, 4.3)
  Bindless
RENDER PIPELINE USING RIX
Render Scene (same objects in each pass)
  Depth Pass: RenderGroup Depth
  Opaque Pass: RenderGroup Opaque
  Transparent Pass: RenderGroup Transparent
  Post-Processing
RenderGroup per render pass
  Rendering cache can be optimized for the pass
  Depth pass might require only positions, but not normals and texture coordinates -> smaller cache
  Fewer OpenGL calls than the opaque pass with an optimized cache
  Transparent pass might or might not require ordering
RENDER GROUP
RenderGroup: 'solid', 'textured', 'solid', 'solid', 'textured'
GeometryInstance
  ProgramPipeline
  ContainerData[]
  Geometry
A GeometryInstance can only be referenced by a single RenderGroup
RENDER GROUP
RenderGroup: 'solid', 'textured', 'solid', 'solid', 'textured'
Group by Program, sort by ContainerData
ProgramPipelineGroupCache
  textured GIs: 'textured', 'textured'
  solid GIs: 'solid', 'solid', 'solid'
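The grouping step shown here can be sketched with standard containers. This is not the RiX data structure: `GeometryInstance` is reduced to an id, a program name, and a single integer standing in for its ContainerData, and `buildGroupCache` is a hypothetical name. Instances are bucketed by program, and each bucket is sorted by container data so that instances sharing data end up adjacent.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for a RiX GeometryInstance.
struct GeometryInstance {
    int id;
    std::string program;  // e.g. "solid" or "textured"
    int containerData;    // stand-in for the ContainerData[] of the instance
};

// Groups the instances of one RenderGroup by program, then sorts every
// group by container data. stable_sort keeps insertion order for ties.
inline std::map<std::string, std::vector<GeometryInstance>>
buildGroupCache(const std::vector<GeometryInstance>& renderGroup) {
    std::map<std::string, std::vector<GeometryInstance>> cache;
    for (const GeometryInstance& gi : renderGroup)
        cache[gi.program].push_back(gi);
    for (auto& entry : cache)
        std::stable_sort(entry.second.begin(), entry.second.end(),
                         [](const GeometryInstance& a, const GeometryInstance& b) {
                             return a.containerData < b.containerData;
                         });
    return cache;
}
```

With this layout each program is bound once per group, and consecutive instances with equal container data need no parameter rebinding in between.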
PROGRAM PIPELINE GROUP CACHE
ProgramPipelineGroupCache<VertexCache, ParameterCache>
GeometryInstanceCacheEntry (one per GI: 'solid', 'solid', 'solid')
  AttributeCacheEntry
  ContainerCacheEntry
    offset
std::vector<unsigned char> uniforms;
dp::gl::Buffer bufferData; // UBO, SSBO
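The `offset` plus `std::vector<unsigned char> uniforms` pairing suggests that parameter data of many containers is packed into one byte array and addressed by offset. The sketch below shows that idea under assumptions: `ParameterCache` here is a hypothetical class, not the template argument of the same name in nvpro-pipeline, and the alignment parameter models the requirement that ranges bound with glBindBufferRange start at a multiple of GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Packs the parameter data of many containers into one CPU-side byte
// array; every container is identified by the offset of its block.
class ParameterCache {
public:
    explicit ParameterCache(std::size_t alignment) : alignment_(alignment) {
        assert(alignment_ > 0);
    }

    // Appends a block at the next aligned offset and returns that offset.
    std::size_t append(const void* data, std::size_t size) {
        const std::size_t offset =
            (uniforms_.size() + alignment_ - 1) / alignment_ * alignment_;
        uniforms_.resize(offset + size);
        if (size > 0) std::memcpy(uniforms_.data() + offset, data, size);
        return offset;
    }

    // Overwrites an existing block in place, e.g. after a material change.
    void update(std::size_t offset, const void* data, std::size_t size) {
        assert(offset + size <= uniforms_.size());
        if (size > 0) std::memcpy(uniforms_.data() + offset, data, size);
    }

    // The packed bytes, ready to be uploaded to a buffer object in one call.
    const std::vector<unsigned char>& bytes() const { return uniforms_; }

private:
    std::size_t alignment_;
    std::vector<unsigned char> uniforms_;
};
```

In a UBO-based path the whole array could be uploaded once and each draw would then bind its block by offset; with plain uniforms the same offsets would index the CPU-side copy instead.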
BENCHMARK
GLUTAnimation
  100x100 spheres
  Geometry duplication
  5 different materials
  Each sphere has its own 'color'