Real-Time Programmable Shading
(http://www.cs.unc.edu/~olano/papers/rtshading/)
Anselmo Lastra, Steve Molnar, Marc Olano, Yulan Wang
University of North Carolina at Chapel Hill
{lastra,molnar,olano,wangy}@cs.unc.edu

Introduction: Organization
◆ Requirements of programmable shading
◆ PixelFlow shading architecture
◆ An example

Programmable what?
◆ Simple functions
◆ Run at each pixel/sample
◆ Compute surface shading, lighting, displacement maps, atmospheric effects, ...
  – Whitted 82, Cook 84, Perlin 85, Hanrahan 90
  – Rhoades 92

Surface shader
◆ For one sample
◆ Inputs:
  – Intrinsic color, normal, texture coordinates, width, length, bumpiness, swirliness, ...
◆ Outputs:
  – Color, (opacity)
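The "Surface shader" slide describes a shader as a function of one sample's inputs that returns a color. A minimal sketch of that idea follows. The `Sample` type, the `diffuse_shader` name, and the fixed light direction are hypothetical illustrations, not part of PixelFlow or its shading language.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Shader inputs for one sample (hypothetical layout)."""
    color: tuple   # intrinsic color (r, g, b)
    normal: tuple  # unit surface normal
    uv: tuple      # texture coordinates (unused by this simple shader)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def diffuse_shader(sample, light_dir=(0.0, 0.0, 1.0)):
    """Shade one sample: inputs in, color out (simple Lambertian term)."""
    n_dot_l = max(0.0, dot(sample.normal, light_dir))
    return tuple(c * n_dot_l for c in sample.color)


facing = Sample(color=(1.0, 0.5, 0.0), normal=(0.0, 0.0, 1.0), uv=(0.0, 0.0))
away = Sample(color=(1.0, 0.5, 0.0), normal=(0.0, 0.0, -1.0), uv=(0.0, 0.0))
lit = diffuse_shader(facing)
dark = diffuse_shader(away)
```

The same function would be run independently at every pixel or sample, which is what makes it a good fit for per-pixel processors.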
Resource requirements
◆ Programmability
◆ Memory
◆ Computational power
  – Parallelism
  – Deferred shading
  – Uniform/varying
  – Fixed point/floating point

Programmability
◆ Programmable processors at sample level
◆ High level language (e.g. RenderMan)
  – Hanrahan 90, Upstill 90

Memory
◆ Table memory
◆ Local memory

Shader      Local Variables
carpet      24
marble      26
stippled    33
stone       23

Parallelism
◆ Pixel-Planes 5 (Fuchs 89)
  – 2–50 Graphics processors (i860)
  – 1–20 Renderers (custom)
  – 16k Pixel processors / renderer (custom)
◆ Reality Engine (Akeley 93)
  – 8–12 Geometry engines (i860XP)
  – 5–20 Fragment generators (custom)
  – 80–320 Image engines (custom)
Deferred shading
◆ Keep shading parameters at each pixel
◆ Shade after visibility is determined
◆ Pros:
  – Doesn't shade hidden pixels
  – Shading independent of geometric complexity!
  – Better utilization on SIMD
◆ Cons:
  – Can't affect visibility (No transparency, no displacement maps!)

Uniform/varying
◆ Uniform = constant across pixels/samples
  – Wood grain, marble vein frequency, ...
◆ Varying = different in each pixel/sample
  – Normal, texture coordinates, ...
◆ Don't compute uniform values at every pixel: compute once and broadcast

Fixed point/floating point
◆ Pixel processors
  – Many processors
  – Simple instruction set
  – Floating point acceleration is unlikely
◆ Fixed point
  – When required precision is known
  – More efficient in time and memory
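The deferred shading and uniform/varying slides above can be sketched together in a few lines. This is an illustrative sketch only: the function names, the fragment tuple layout, and the `stripes` shader are assumptions made for the example, and it runs serially rather than on a SIMD array.

```python
import math


def visibility_pass(fragments, width, height):
    """Keep the varying shading parameters of the nearest fragment per pixel."""
    depth = [[math.inf] * width for _ in range(height)]
    params = [[None] * width for _ in range(height)]
    for x, y, z, varying in fragments:
        if z < depth[y][x]:
            depth[y][x] = z
            params[y][x] = varying
    return params


def shading_pass(params, shader, uniform):
    """Run the shader once per visible pixel, after visibility is resolved."""
    shaded = 0
    image = []
    for row in params:
        out_row = []
        for varying in row:
            if varying is None:
                out_row.append(None)  # nothing visible at this pixel
            else:
                out_row.append(shader(uniform, varying))
                shaded += 1
        image.append(out_row)
    return image, shaded


def stripes(uniform, varying):
    """Toy shader: uniform stripe frequency, varying texture coordinate u."""
    return 1.0 if int(varying["u"] * uniform["frequency"]) % 2 == 0 else 0.0


# Three fragments land on a 2x1 image; the first is hidden by the second.
fragments = [
    (0, 0, 5.0, {"u": 0.10}),
    (0, 0, 2.0, {"u": 0.30}),
    (1, 0, 1.0, {"u": 0.60}),
]
uniform = {"frequency": 4}  # computed once, shared by every pixel
image, shaded = shading_pass(visibility_pass(fragments, 2, 1), stripes, uniform)
```

Three fragments are rasterized but the shader runs only twice, so shading cost follows the number of visible pixels rather than the amount of geometry. The uniform value is built once outside the per-pixel loop, mirroring "compute once and broadcast".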
PixelFlow
◆ Node description
◆ Memory
◆ Timings
◆ System

PixelFlow node
[Diagram labels: Geometry network; HP PA-RISC (x2); Optional video card; 128 x 64 SIMD array; Table memory; Composition network]

Memory
◆ 16MB table memory
◆ 256 bytes local memory
◆ 128 bytes local memory/communication

Timings (µs)
         float     fixed     fixed
         4 byte    4 byte    2 byte
+        3.94      0.13      0.07
*        2.53      2.00      0.50
/        7.04      6.40      1.60
sqrt     6.98      3.33      1.22

[Bar chart of the same timings: +, *, / and sqrt for 4-byte float, 4-byte fixed and 2-byte fixed; vertical axis 0 to 8 µs]
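To make the timings concrete, the sketch below estimates the arithmetic cost of normalizing a 3-vector in each number format. Several things here are assumptions: the operation counts (3 multiplies, 2 adds, 1 square root, 3 divides), the dictionary and function names, and the reading of 3.94 µs as the floating-point add time, which is inferred from the slide's layout. The estimate ignores any overhead beyond the listed operations.

```python
# Per-operation timings in microseconds, as read from the timings slide.
# The float "+" entry (3.94) is an inferred reading of the slide layout.
TIMINGS_US = {
    "float4": {"+": 3.94, "*": 2.53, "/": 7.04, "sqrt": 6.98},
    "fixed4": {"+": 0.13, "*": 2.00, "/": 6.40, "sqrt": 3.33},
    "fixed2": {"+": 0.07, "*": 0.50, "/": 1.60, "sqrt": 1.22},
}

# Assumed operation counts for normalizing (x, y, z):
# x*x, y*y, z*z; two adds; one sqrt; three divides.
NORMALIZE_OPS = {"*": 3, "+": 2, "sqrt": 1, "/": 3}


def cost_us(op_counts, number_format):
    """Sum the per-operation times for one number format."""
    table = TIMINGS_US[number_format]
    return sum(count * table[op] for op, count in op_counts.items())


costs = {fmt: cost_us(NORMALIZE_OPS, fmt) for fmt in TIMINGS_US}
for fmt, cost in costs.items():
    print(f"{fmt}: {cost:.2f} us")
```

Under these assumptions the 2-byte fixed-point version costs well under a fifth of the floating-point one, which is the point of the fixed point/floating point slide: use fixed point when the required precision is known.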
PixelFlow system
[Diagram labels: Rasterizer Nodes; Shader Nodes; Frame Buffer Node]

An example
◆ Video
◆ Shading functions
◆ Time

Show video

Shading functions
◆ Pins
  – Crown, label, scuffs, dirt, Phong
◆ Alley
  – Wood, reflection map
◆ Ball
  – Phong
◆ Light
  – Shadow map
An example
◆ Video
◆ Shading functions
◆ Time
  – Breakdown of 33 ms frame time
  – Breakdown of 150 µs to run all shaders (excluding table lookups)
  – Time for table lookups
  – Use of multiple processors

Time: 7 ms - shadow map (of 33 ms)
Time: 7 ms - reflection map (of 33 ms)
Time: 15.7 ms - final image (of 33 ms)
Shading: 2 µs - crown (of 150 µs)
Shading: 15 µs - label (of 150 µs)
Shading: 39 µs - scuffs & dirt (of 150 µs)
Shading: 15 µs - wood (of 150 µs)
Shading: 28 µs - light/shadows (of 150 µs)
Shading: 12 µs - Phong (pins) (of 150 µs)
Shading: 27 µs - reflection (of 150 µs)
Shading: 12 µs - Phong (ball) (of 150 µs)

Time for table lookups
◆ About 23 ns per pixel
◆ Worst case
  – Bowling pin (4 lookups) in all pixels
    » Label image
    » Scuff bump map
    » Dirt image
    » Shadow map
  – Total 760 µs per region
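The timing figures on the example slides can be cross-checked with a little arithmetic. The per-shader and per-pass times below are taken from the slides. Two things are assumptions: that a region is 128 x 64 pixels (the size of the SIMD array on the node slide), and that "about 23 ns per pixel" applies to each lookup. The variable names are illustrative.

```python
# Per-shader times in microseconds, from the "Shading:" slides.
shader_times_us = {
    "crown": 2,
    "label": 15,
    "scuffs & dirt": 39,
    "wood": 15,
    "light/shadows": 28,
    "Phong (pins)": 12,
    "reflection": 27,
    "Phong (ball)": 12,
}
total_shading_us = sum(shader_times_us.values())

# Per-pass times in milliseconds, from the "Time:" slides.
pass_times_ms = {"shadow map": 7.0, "reflection map": 7.0, "final image": 15.7}
total_passes_ms = sum(pass_times_ms.values())
frame_slack_ms = 33.0 - total_passes_ms

# Worst-case table lookups: assumed 128 x 64 region, 4 lookups, 23 ns each.
REGION_PIXELS = 128 * 64
LOOKUPS_PER_PIXEL = 4
NS_PER_LOOKUP = 23
worst_case_lookup_us = REGION_PIXELS * LOOKUPS_PER_PIXEL * NS_PER_LOOKUP / 1000

print(total_shading_us, round(total_passes_ms, 1), round(worst_case_lookup_us, 1))
```

The shader times sum to the 150 µs quoted on the slides, and the three passes account for about 29.7 ms of the 33 ms frame. Under the stated assumptions the lookup estimate comes to roughly 754 µs per region, close to the 760 µs on the slide; the small gap is consistent with the per-pixel figure being approximate.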