Realtime Hair Rendering Erik Sintorn - erik.sintorn@chalmers.se
State of the art (realtime)
• In recent games: a few hundred textured polygons
• In recent research: half a million individual line segments
What makes hair tricky?
• Convincing hair requires very many, very thin primitives (~0.5M lines, quads, or cylinders):
  • Lots of geometry for the GPU to process
  • Very prone to aliasing
• Hair is semitransparent (refractive):
  • Light is scattered in three strong directions
  • A simple transparency effect can be achieved by blending, but that requires the fragments to be sorted
Aliasing
The Awful Truth
Major challenges for realtime • Shading • Self shadowing • Blending for transparency and subsampling
Hair Shaders
• Kajiya-Kay model (James T. Kajiya and Timothy L. Kay, Rendering Fur With Three Dimensional Textures)
• Treats the hair strand as an infinitesimally thin specular cylinder
• Captures the most obvious specular highlight from hair (the two terms are sketched below):
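A minimal sketch of the two Kajiya-Kay terms, assuming a small hand-rolled Vec3 type and normalized inputs; t is the hair tangent, l the light direction, e the eye direction (the function names are mine, not from the model):

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };
static float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Diffuse term: sin of the angle between tangent t and light l.
float kajiyaKayDiffuse(Vec3 t, Vec3 l)
{
    float tl = dot(t, l);
    return std::sqrt(std::max(0.0f, 1.0f - tl * tl));
}

// Specular term: cos(t,l)cos(t,e) + sin(t,l)sin(t,e), raised to a
// shininess exponent; peaks when l and e mirror about the hair tangent.
float kajiyaKaySpecular(Vec3 t, Vec3 l, Vec3 e, float shininess)
{
    float tl = dot(t, l), te = dot(t, e);
    float sinTL = std::sqrt(std::max(0.0f, 1.0f - tl * tl));
    float sinTE = std::sqrt(std::max(0.0f, 1.0f - te * te));
    return std::pow(std::max(0.0f, tl * te + sinTL * sinTE), shininess);
}
```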
Hair Shaders • Marschner et al., Light Scattering from Human Hair Fibers
Hair Shaders • Proper measurements of light scattering show three distinctive components: R (surface reflection), TT (transmission through the fiber), and TRT (internal reflection)
Hair Shaders • (figure: measured light scattering from a hair fiber)
Self shadowing
Self Shadowing: traditional techniques
• Shadow Volumes: the main problem with the algorithm is overdraw, since each silhouette edge becomes a polygon; hair is ALL silhouette edges
• Shadow Mapping: would require huge shadow maps
• Neither supports semi-transparent geometry
Deep Shadow Maps
• Introduced in 2000 by Pixar to render “fuzzy” objects offline
• Instead of storing the closest fragment’s depth in the shadow map, store a function V(d): the visibility as a function of depth
• Requires an A-buffer type renderer
Deep Shadow Maps
• Store depth and opacity for each fragment along the ray (pixel)
• Compress into a piecewise-linear function V(d) for each pixel (evaluation sketched below)
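A sketch of how such a compressed function might be evaluated at shadow-lookup time, assuming the piecewise-linear nodes are stored as sorted (depth, visibility) pairs; the storage layout is my assumption, not the paper’s:

```cpp
#include <vector>

struct VisNode { float depth; float visibility; };

// Evaluate the piecewise-linear visibility function V(d) by linear
// interpolation between the two nodes that bracket depth d.
float evalVisibility(const std::vector<VisNode>& nodes, float d)
{
    if (nodes.empty() || d <= nodes.front().depth) return 1.0f; // fully lit
    if (d >= nodes.back().depth) return nodes.back().visibility;
    for (size_t i = 1; i < nodes.size(); ++i) {
        if (d < nodes[i].depth) {
            float t = (d - nodes[i-1].depth) /
                      (nodes[i].depth - nodes[i-1].depth);
            return nodes[i-1].visibility +
                   t * (nodes[i].visibility - nodes[i-1].visibility);
        }
    }
    return nodes.back().visibility;
}
```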
Opacity Maps
• Sample the opacity at regular intervals
• Render the hair once for each slice; each time, move the far plane one slice further away
• Save each slice into a 3D texture (see the sketch below)
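The brute-force construction could look like the loop below; setLightProjection, renderHairFromLight, copyFramebufferToTextureLayer and the parameters are hypothetical placeholders, not from the slides:

```cpp
// Hypothetical helpers -- stand-ins for the application's renderer.
void setLightProjection(float zNear, float zFar);
void renderHairFromLight();   // accumulates opacity additively
void copyFramebufferToTextureLayer(unsigned tex3D, int layer);

void buildOpacityMap(unsigned tex3D, float zNear, float zFar, int numSlices)
{
    float sliceDepth = (zFar - zNear) / numSlices;
    for (int s = 0; s < numSlices; ++s) {
        // Clip everything beyond this slice by pulling in the far plane,
        // then accumulate the slice's opacity and store it as layer s.
        setLightProjection(zNear, zNear + (s + 1) * sliceDepth);
        renderHairFromLight();
        copyFramebufferToTextureLayer(tex3D, s);
    }
}
```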
Opacity Maps
• Regular sampling requires very many slices (compare results at 16 slices vs. 256 slices)
• Not feasible to render the hair 256 times
Opacity Maps • NVIDIA’s Nalu demo uses opacity maps with 16 slices, which they can render in a single pass
Deep Opacity Maps
• In a first pass, find the closest fragment depth for each pixel, seen from the light
• In the fragment shader, find the fragment’s xy-position in this map and use the fragment’s depth, relative to that closest depth, to find the slice to write to (sketched below)
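A sketch of the slice selection, assuming uniformly spaced slices starting at the closest depth z0 (a simplification; the names and spacing scheme are mine):

```cpp
#include <algorithm>
#include <cmath>

// Map a fragment's light-space depth to a Deep Opacity Map slice index,
// given the closest depth z0 fetched from the first-pass depth map.
int deepOpacitySlice(float fragDepth, float z0,
                     float sliceSpacing, int numSlices)
{
    int s = static_cast<int>(std::floor((fragDepth - z0) / sliceSpacing));
    return std::clamp(s, 0, numSlices - 1); // deeper fragments land in last slice
}
```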
Alpha Blending • We need to alpha-blend the fragments to simulate antialiasing and transparency
Depth Peeling
• The only failsafe technique known when no A-buffer is available
• Sorting the primitives (triangles/lines) is not necessarily enough
• Depth peeling draws the actual fragments in back-to-front order
Depth Peeling (algorithm)
• Render the image with a z-buffer, drawing only the furthest fragments
• Use the z-buffer of the previous pass as a texture to discard all fragments with depth >= that texture’s
• Continue until an occlusion query reports that no fragments were drawn (loop sketched below)
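A sketch of the peeling loop, assuming an OpenGL 3.x context and loader headers; only the occlusion-query calls are real API, drawHair and swapDepthTextures are hypothetical helpers:

```cpp
// Hypothetical helpers around the real occlusion-query API.
void drawHair();          // shader discards fragments with depth >= the
                          // depth stored by the previous pass
void swapDepthTextures(); // previous z-buffer becomes the discard texture

void peel()
{
    GLuint query, samplesDrawn = 1;
    glGenQueries(1, &query);
    while (samplesDrawn > 0) {
        glBeginQuery(GL_SAMPLES_PASSED, query);
        drawHair();                       // peels the next-furthest layer
        glEndQuery(GL_SAMPLES_PASSED);
        swapDepthTextures();
        glGetQueryObjectuiv(query, GL_QUERY_RESULT, &samplesDrawn);
    }
}
```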
Depth Peeling (problems)
• Requires as many passes as the depth complexity of the image
• Images of hair can have a depth complexity of several hundred fragments in one pixel
• Front-to-back drawing can be used instead, stopping after a certain number of passes, but that still requires too many passes to work in realtime
Real-Time Approximate Sorting for Self Shadowing and Transparency in Hair Rendering Erik Sintorn and Ulf Assarsson Chalmers University Of Technology
What we do...
• We introduce an approximate quicksort algorithm for lines that runs entirely on the GPU
• We use this to create high-resolution opacity maps in real time
• ...and to solve the alpha blending problem
Transform Feedback
• Allows us to save the data output from the Geometry Shader (without rendering to screen)
• Introduced to simplify, for example, displacement mapping and cube map rendering
• Called “Stream Out” in DirectX 10 (minimal setup sketched below)
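A minimal OpenGL transform-feedback setup, assuming the program’s output varyings were already registered with glTransformFeedbackVaryings at link time; captureBuf and numVertices are placeholders:

```cpp
// Capture the geometry shader's output into captureBuf instead of
// rasterizing it.
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, captureBuf);
glEnable(GL_RASTERIZER_DISCARD);        // capture only, no rasterization
glBeginTransformFeedback(GL_LINES);     // primitive type the GS emits
glDrawArrays(GL_LINES, 0, numVertices);
glEndTransformFeedback();
glDisable(GL_RASTERIZER_DISCARD);
```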
Quicksorting points on GPU
• Submit all points twice: the first time we discard all points on one side of a split plane, the second time we discard the others
• The results are stored in two buffers using stream out
• Recursively repeat until we have N buffers, where all elements in a buffer lie within one of N slices
Quicksorting lines on GPU
• A line can lie on both sides of the plane
• Just clip the line in the Geometry Shader (splitting logic sketched below)
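The splitting logic a geometry shader would run, sketched on the CPU; the plane is given as dot(n, p) = d, and all names are mine:

```cpp
#include <vector>

struct Vec3 { float x, y, z; };
struct Line { Vec3 a, b; };

static float signedDist(const Vec3& p, const Vec3& n, float d)
{
    return p.x*n.x + p.y*n.y + p.z*n.z - d; // plane: dot(n, p) = d
}

// Split a line segment against the plane, emitting the piece(s) on
// each side -- the same per-primitive work a geometry shader would do.
void splitLine(const Line& l, const Vec3& n, float d,
               std::vector<Line>& front, std::vector<Line>& back)
{
    float da = signedDist(l.a, n, d);
    float db = signedDist(l.b, n, d);
    if (da >= 0 && db >= 0) { front.push_back(l); return; }
    if (da <  0 && db <  0) { back.push_back(l);  return; }
    float t = da / (da - db); // intersection parameter along the segment
    Vec3 m { l.a.x + t*(l.b.x - l.a.x),
             l.a.y + t*(l.b.y - l.a.y),
             l.a.z + t*(l.b.z - l.a.z) };
    Line first{l.a, m}, second{m, l.b};
    if (da >= 0) { front.push_back(first); back.push_back(second); }
    else         { back.push_back(first);  front.push_back(second); }
}
```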
Building the Opacity Map
• Use planes parallel with the light-plane to divide the geometry
• Now it is easy to build the opacity map texture:
  • Enable additive blending
  • Set up the camera from the light’s viewpoint
  • For each slice s:
    • Render sublist s into the final texture-slice
    • Copy the final texture-slice to texture-slice s
(see the sketch below)
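A sketch of that loop; setCameraToLightViewpoint, renderSublist and copyToTextureSlice are hypothetical helpers, only the blend-state calls are real GL API:

```cpp
// Hypothetical helpers -- stand-ins for the application's renderer.
void setCameraToLightViewpoint();
void renderSublist(int s);                      // pre-sorted slice s geometry
void copyToTextureSlice(unsigned tex3D, int s); // save running total

void buildOpacityMapFromSublists(unsigned tex3D, int numSlices)
{
    glEnable(GL_BLEND);
    glBlendFunc(GL_ONE, GL_ONE);  // additive: keep a running opacity total
    setCameraToLightViewpoint();
    for (int s = 0; s < numSlices; ++s) {
        renderSublist(s);              // add slice s on top of the total
        copyToTextureSlice(tex3D, s);  // total so far = opacity at slice s
    }
}
```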
Alpha Sorting
• With the GPU-based partial quicksort, alpha blending is easy
• Simply sort the geometry into sublists for each slice of the viewing frustum, from the camera’s viewpoint
• This time, sort back to front
• Render the generated VBO (as one large batch) with alpha blending enabled
Hair Self Shadowing and Transparency Depth Ordering Using Occupancy Maps Erik Sintorn and Ulf Assarsson Chalmers University Of Technology
Occupancy Maps
• One big problem with opacity maps is that they take up a great deal of space
• 256 slices of 512×512 maps take 512 × 512 × 256 texels × 4 bytes ≈ 256 MB if we use float opacity values
• That’s half the available memory on many cards
• In this paper we suggest a more compact representation with similar quality, which is also faster to generate
Occupancy Maps
• The occupancy map is like an opacity map, but for each texel and slice we store only a single bit that says whether the voxel is occupied or not
• In this way, we can keep information about 128 slices in four 32-bit words (one texture map with unsigned ints representing RGBA)
• We also use the technique suggested in Deep Opacity Maps to optimize the available precision in this map
Occupancy Maps
• Generating the occupancy map:
  • In a first pass, render the geometry from the light to find the minimum and maximum depth of each texel
  • Render the geometry again, to a UINT32 framebuffer, using an OR logical operation for blending
  • For each generated fragment, figure out which slice it belongs to and set the corresponding bit in the output (sketched below)
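A sketch of the blend state, with the per-fragment bit computation as comments; renderHairFromLight and computeSlice are placeholders:

```cpp
// Only the logic-op state is real GL API. Assumes a UINT32 RGBA color
// attachment: integer formats combine fragment outputs with logic ops,
// not blending.
glEnable(GL_COLOR_LOGIC_OP);
glLogicOp(GL_OR);
renderHairFromLight();
// In the fragment shader (GLSL-like pseudocode):
//   uint slice = computeSlice(depth, minDepth, maxDepth);   // 0..127
//   outColor = uvec4(0u);
//   outColor[slice / 32u] = 1u << (slice % 32u);            // set one bit
```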
Occupancy Maps
• When we shade the hair from the camera’s view, for each fragment we find the projected light-space coordinates (x, y) and the slice s, and fetch the occupancy words with a texture lookup
• We then count the number of set bits corresponding to slices < s (see below) and set the opacity to this number multiplied by some constant
• Since several fragments may have recorded themselves in the same slice (the same bit), they will only be counted once, which underestimates the opacity... bad.
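A sketch of the bit counting, assuming the four occupancy words cover slices 0..127 in order:

```cpp
#include <bit>
#include <cstdint>

// Count occupied voxels in front of slice s (0..127), given the four
// 32-bit occupancy words fetched for this texel.
int occupiedBefore(const uint32_t words[4], int s)
{
    int count = 0;
    int fullWords = s / 32;                    // words entirely in front
    for (int w = 0; w < fullWords; ++w)
        count += std::popcount(words[w]);
    uint32_t mask = (1u << (s % 32)) - 1u;     // bits for slices < s in word
    count += std::popcount(words[fullWords] & mask);
    return count;
}
```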
The Slab Map
• Enter the slab map
• The slab map is a simple RGBA texture map of the same size as the occupancy map
• Each color channel holds the number of fragments recorded into that slab (one slab covering the 32 slices of one occupancy word)
Putting it together
• The slab map and occupancy map together provide a very good estimate of the visibility. So, given the light MVP coordinates (x, y), the slice s, and the slab m:
  • Find out how many fragments are in the slabs preceding m; call this a
  • Find the average number of fragments per set bit in slab m: avg = (number of fragments in slab) / (total number of set bits in slab)
  • Approximate the number of occluding fragments as: a + avg × (number of bits set in slab m at slices < s)
(a sketch follows below)
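A sketch of the combined estimate; the array layouts and names are mine:

```cpp
#include <bit>
#include <cstdint>

// Approximate how many fragments occlude a fragment in slab m, slice s,
// by combining the slab map (fragment count per 32-slice slab) with the
// occupancy map (one bit per slice).
float occludingFragments(const uint32_t occ[4], const float slabCount[4],
                         int m, int s)
{
    float a = 0.0f;                          // fragments in slabs before m
    for (int w = 0; w < m; ++w) a += slabCount[w];

    int bitsInSlab = std::popcount(occ[m]);
    if (bitsInSlab == 0) return a;

    float avg = slabCount[m] / bitsInSlab;   // fragments per set bit
    uint32_t mask = (1u << (s % 32)) - 1u;   // slices < s within slab m
    return a + avg * std::popcount(occ[m] & mask);
}
```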
Resulting visibility function
Alpha "sorting"
• The blending function typically used for rendering transparent objects is:
  c_new = α·c_frag + (1 − α)·c_old   (blended back to front over the framebuffer)
Alpha "sorting"
• For three fragments blended in back-to-front order, this expands to:
  c = α₁c₁ + (1−α₁)·(α₂c₂ + (1−α₂)·(α₃c₃ + (1−α₃)·c_bg))
• Which can be rewritten as:
  c = α₁c₁ + (1−α₁)α₂c₂ + (1−α₁)(1−α₂)α₃c₃ + (1−α₁)(1−α₂)(1−α₃)c_bg
Alpha "sorting"
• All alphas are approximately the same value α, so the contribution of a fragment with i fragments in front of it becomes:
  α·cᵢ·(1−α)^i
• The blending order no longer matters; only each fragment’s depth order i does
Alpha "sorting"
• We don’t know the depth order of the fragments...
• But we can approximate how many fragments precede each one, which is really the same thing. So:
  • Calculate the shaded and shadowed color c of the fragment
  • Approximate the depth order i using the occupancy and slab maps
  • Write α·c·(1−α)^i to the framebuffer with additive blending (sketch below)
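A sketch of the resulting per-fragment weight; i may be fractional, since it comes from the slab/occupancy estimate:

```cpp
#include <cmath>

// Order-independent weight for a fragment whose approximate depth order
// is i (the estimated number of fragments in front of it), assuming all
// fragments share the same alpha.
float blendWeight(float alpha, float i)
{
    return alpha * std::pow(1.0f - alpha, i); // alpha * (1 - alpha)^i
}
// framebuffer += blendWeight(alpha, i) * shadedColor;  (additive blending)
```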
Results
Video