IIIT Hyderabad
Hybrid Ray Tracing and Path Tracing of Bezier Surfaces
using a mixed hierarchy
Rohit Nigam, P. J. Narayanan
CVIT, IIIT Hyderabad, Hyderabad, India
Representing a Scene
• Triangular Mesh
• Implicit Surface (figure labels: f>0, f<0, f=0)
• Parametric Surface
Parametric Surface: Motivation
• Provide compact and effective representation.
• Remain curved and smooth at arbitrary level of zooming.
• Memory efficient, in comparison with triangular mesh.
Bezier Surfaces
• Bezier Surfaces are the most basic form of parametric surfaces
• A Bezier Surface can be described as:
  $Q(u,v) = [U][M][P][M]^T[V]^T$
  where $[U] = [u^3 \; u^2 \; u \; 1]$ and $[V] = [v^3 \; v^2 \; v \; 1]$, $0 \le u,v \le 1$
  [M] is the Bezier Basis Matrix
  [P] is the set of 16 Control Points defining the patch
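The matrix form above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the names `M`, `bernstein_weights` and `eval_patch` are hypothetical, and the control points are assumed to be stored as a 4x4 grid `P[i][j]` of (x, y, z) tuples with `i` following u and `j` following v.

```python
# Bezier basis matrix M for the power basis [t^3, t^2, t, 1].
M = [
    [-1.0,  3.0, -3.0, 1.0],
    [ 3.0, -6.0,  3.0, 0.0],
    [-3.0,  3.0,  0.0, 0.0],
    [ 1.0,  0.0,  0.0, 0.0],
]

def bernstein_weights(t):
    """Return [t^3 t^2 t 1] * M, i.e. the four cubic Bernstein weights at t."""
    powers = [t ** 3, t ** 2, t, 1.0]
    return [sum(powers[k] * M[k][j] for k in range(4)) for j in range(4)]

def eval_patch(P, u, v):
    """Evaluate Q(u, v) = U M P M^T V^T for a 4x4 grid P of (x, y, z) points."""
    bu = bernstein_weights(u)  # U M
    bv = bernstein_weights(v)  # (M^T V^T) transposed
    point = [0.0, 0.0, 0.0]
    for i in range(4):
        for j in range(4):
            w = bu[i] * bv[j]
            for axis in range(3):
                point[axis] += w * P[i][j][axis]
    return tuple(point)
```

With the corner parameters (0, 0) and (1, 1) the function returns the corner control points `P[0][0]` and `P[3][3]`, which is a quick sanity check on the basis matrix.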
Rendering Bezier Surfaces
• Tessellation-based approaches
  – Eisenacher et al. (2009): View Dependent Adaptive Subdivision
• Direct Ray Tracing
  – Geimer et al. (2005): Newton Iteration
  – Pabst et al. (2006): Bezier Clipping + Newton Iteration
Ray Tracing Bezier Surface
• Constructing an Acceleration Structure: Bounding Volume Hierarchy (BVH)
Ray Tracing Bezier Surface
• Ray Traversal through BVH
[Figure: ray list with entries indexed 0 to 9]
• Outputs
  – Potential Ray-Patch intersections list
  – Initial parameter values
Ray Tracing Bezier Surface
• Newton Iteration
Picture Courtesy : http://steadyserverpages.com
Geimer, Abert Approach
• Based on the flatness criteria, each patch is divided into subpatches.
• BVH for original surfaces
  – Bounding boxes of subpatches at leaf nodes.
• For each potential intersection
  – Generate initial values for Newton Iteration
[Figure: BVH nodes 1, 2, 3 over patches P1, P2, P3, with subpatches sp1 and sp2 at each leaf; an original curve shown next to its subdivided linear curve]
Limitation of the Model for GPUs
• GPU access time:
  – High for global memory
  – Comparatively less for shared memory and registers
• When subdividing based on flatness criteria, we need to
  – Store the starting index of the subpatches
  – Store the total number of subpatches
  – Store the initial [u,v] pair for each potential intersection
• Thus more global memory operations result in lower throughput.
• Need to check every subpatch at leaf node
Our Approach
• Create a mixed hierarchy, consisting of two hierarchical structures.
  – The top level BVH tree is constructed from the bounding boxes of original patches.
  – Leaf nodes represent the original Bezier Surfaces.
  – Each patch is divided into fixed size subpatches, hierarchically, using the De Casteljau algorithm.
  – Make a subtree for each patch from the bounding boxes of the subdivided patches.
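The per-patch subtree described above could be built roughly as follows. This is a sketch under assumptions the slides do not state: patches are split at the parameter midpoint, the split direction alternates between u and v, and the subtree is stored as nested dictionaries. All function names are hypothetical.

```python
def lerp(a, b, t):
    """Linear interpolation between two points."""
    return tuple((1.0 - t) * x + t * y for x, y in zip(a, b))

def split_curve(c, t=0.5):
    """De Casteljau split of a cubic Bezier curve (4 points) into two halves."""
    p01, p12, p23 = lerp(c[0], c[1], t), lerp(c[1], c[2], t), lerp(c[2], c[3], t)
    p012, p123 = lerp(p01, p12, t), lerp(p12, p23, t)
    mid = lerp(p012, p123, t)
    return [c[0], p01, p012, mid], [mid, p123, p23, c[3]]

def split_patch_u(P):
    """Split a 4x4 patch at u = 0.5; P[i][j] is indexed by (u, v)."""
    lo = [[None] * 4 for _ in range(4)]
    hi = [[None] * 4 for _ in range(4)]
    for j in range(4):
        left, right = split_curve([P[i][j] for i in range(4)])
        for i in range(4):
            lo[i][j], hi[i][j] = left[i], right[i]
    return lo, hi

def transpose(P):
    return [[P[j][i] for j in range(4)] for i in range(4)]

def split_patch_v(P):
    """Split at v = 0.5 by transposing, splitting in u, and transposing back."""
    lo, hi = split_patch_u(transpose(P))
    return transpose(lo), transpose(hi)

def bbox(P):
    """Axis-aligned box of the control points; it bounds the patch (convex hull property)."""
    pts = [p for row in P for p in row]
    return (tuple(min(p[a] for p in pts) for a in range(3)),
            tuple(max(p[a] for p in pts) for a in range(3)))

def build_subtree(P, depth, split_u=True):
    """Fixed-depth tree of bounding boxes, alternating u and v splits (an assumption)."""
    node = {"bbox": bbox(P), "children": []}
    if depth > 0:
        halves = split_patch_u(P) if split_u else split_patch_v(P)
        node["children"] = [build_subtree(h, depth - 1, not split_u) for h in halves]
    return node
```

Only the bounding boxes are kept in the tree; the subdivided control points are discarded, which matches the idea of running Newton iteration on the original patch. A depth of 6 gives 64 leaf boxes per patch.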
[Figure: mixed hierarchy. Upper level, "BVH for Patches": BVH nodes over patches 1 to 4. Lower level, "Subpatch Hierarchy": subtree nodes over sub-patches.]
Mixed Hierarchy Structure
• Newton Iteration applied to original patches
  – No memory required to store subpatches
• Fixed depth subtree
  – Utilize constant degree of Bezier surfaces
  – Utilize shared memory
  – Apply early termination at subtree level
  – Leads to tighter bounds
  – A subdivision depth of 6 was found empirically sufficient.
Mixed Hierarchy Structure
• Newton Iteration applied on original patches.
  – No memory required to store subpatches.
• Fixed depth makes it possible to utilize shared memory.
• A subtree at lower level leads to early termination at this stage, reducing the (Ray, Bounding Box) intersections.
• Subdivision also leads to tighter bounds, which further reduces the potential (Ray,Patch) intersections.
• A subdivision depth of 6 was found empirically sufficient for our scenes.
GPU Traversal of Mixed Hierarchy Structure
• A ‘traverse’ kernel traverses the first level of the BVH.
  – Lists out potential (Ray,Patch) intersections.
  – We make use of atomic operations, to provide scalability.
• A ‘recheck’ kernel processes the generated (Ray,Patch) list in parallel.
  – This leads to further pruning of the list with tighter subpatch bounding boxes.
  – We make use of ‘t’ values computed here, to not traverse subpatch nodes with higher values.
  – This leads to reduced computation and, in cases of false positives, slightly less accurate initial values.
  – Lists out the reduced potential (Ray,Patch) intersections.
  – Generates the initial values for each intersection.
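The pruning by 't' values in the recheck step can be illustrated with a sequential Python sketch; it is not a CUDA kernel and not the authors' implementation. The node layout is hypothetical: each node is a dict with a "bbox", a list of "children", and, on leaves, a "uv" entry assumed to hold the centre of that subpatch's parameter range, which serves as the initial value for Newton iteration.

```python
def ray_box(origin, direction, lo, hi):
    """Slab test: return the entry distance t, or None if the ray misses the box."""
    t_near, t_far = 0.0, float("inf")
    for a in range(3):
        if direction[a] == 0.0:
            # Ray parallel to this slab: it must start inside it.
            if origin[a] < lo[a] or origin[a] > hi[a]:
                return None
            continue
        t0 = (lo[a] - origin[a]) / direction[a]
        t1 = (hi[a] - origin[a]) / direction[a]
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near

def recheck(origin, direction, node, best=None):
    """Descend a subpatch subtree, skipping boxes entered farther than the best leaf so far.

    Returns (t, (u, v)) for the nearest leaf box hit, or None if every box is missed.
    """
    t = ray_box(origin, direction, *node["bbox"])
    if t is None or (best is not None and t > best[0]):
        return best  # missed, or farther than a leaf already found: prune
    if not node["children"]:
        return (t, node["uv"])
    for child in node["children"]:
        best = recheck(origin, direction, child, best)
    return best
```

A `None` result removes the (Ray,Patch) pair from the list. Because only the nearest leaf box is kept, a false positive at that leaf can hide a farther true hit's better starting point, which is the accuracy trade-off noted in the bullets.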
Secondary Rays
Hybrid Ray Tracing
[Flowchart labels: Start; Preprocessing; Ray List; rayTraceGPU (GPU) and rayTraceCPU (CPU); Point and Normal; Generate Secondary Rays]
Hybrid Ray Tracing
Results
• Teapot Model: 64 fps
• Bigguy Model: 28.6 fps
• Killeroo Model: 19.2 fps
• 2 Killeroos: 10.6 fps
• 9 Bigguys: 5.2 fps
• System Specs: GTX 580 + i7 920
• Resolution: 1024x1024
Path Tracing
• We extend our ray tracing approach to Global Illumination effects.
• We use Cook’s approach of Monte Carlo based Stochastic Sampling, to sample the image at appropriate non-uniformly spaced points.
• Each pixel is sampled a user-defined number of times (samples per pixel, spp).
• We apply our data parallel approach to this massive ray list to generate the desired effects.
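One common way to obtain non-uniformly spaced sample points inside a pixel is jittering on a regular grid. The sketch below assumes that scheme for illustration; the slides do not say which sampling pattern was used, and `jittered_samples` is a hypothetical name.

```python
import random

def jittered_samples(px, py, spp, rng):
    """Return n*n sample positions inside pixel (px, py), with n = round(sqrt(spp)).

    The pixel is cut into an n x n grid and one random point is drawn per cell,
    so samples are irregular but still cover the pixel evenly.
    """
    n = max(1, int(round(spp ** 0.5)))
    samples = []
    for i in range(n):
        for j in range(n):
            x = px + (i + rng.random()) / n
            y = py + (j + rng.random()) / n
            samples.append((x, y))
    return samples
```

Each sample position would seed one camera ray, so a 512x512 image at 400 spp contributes 512 * 512 * 400 primary rays to the ray list.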
Path Tracing
Bigguy in a box: 400 spp, 512x512 resolution. Rendered in 28.5 minutes.
Path Tracing
Bigguy in a box: 1000 spp, 512x512 resolution. Rendered in 28.5 minutes.
Path Tracing
Bigguy in a box: 3000 spp, 512x512 resolution. Rendered in 28.5 minutes.
Path Tracing
Bigguy in a box: 5000 spp, 512x512 resolution. Rendered in 28.5 minutes.
Path Tracing
Bigguy in a box: 10000 spp, 512x512 resolution. Rendered in 28.5 minutes.
Conclusion
• A mixed hierarchy model is proposed to speed up the Ray Tracing process.
• GPU benefits greatly from the fixed depth subtree.
• A hybrid model is proposed, to fully utilize the compute power of CPU and GPU.
• We demonstrate the capability of our method by producing Global Illumination effects for Bezier patches.
THANK YOU
Hybrid Ray Tracing
• Divide the ray list between CPU and GPU
  – Ratio decided based on compute capabilities
• GPU algorithm comprises three kernels:
  – Traverse: Generate potential Ray-Patch intersections
  – Recheck: Further prune intersections and get initial values
  – Newton: Apply Newton iteration to get hit-point
• CPU stage comprises:
  1. Divide the CPU ray list among 2c threads, where c is the number of cores.
  2. Intersection with main BVH
  3. If intersects, further intersection with 2nd level subtree.
  4. Finally, apply Newton iteration and generate hit-point
• CPU benefits from early ray termination.
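The division of work above might look like the following sketch. It assumes the ratio is a simple proportion of per-device throughput estimates, which the slides do not specify; `split_rays`, `cpu_chunks`, `gpu_rate` and `cpu_rate` are hypothetical names.

```python
def split_rays(rays, gpu_rate, cpu_rate):
    """Split `rays` so each device gets a share proportional to its throughput estimate."""
    gpu_share = gpu_rate / (gpu_rate + cpu_rate)
    cut = int(round(len(rays) * gpu_share))
    return rays[:cut], rays[cut:]

def cpu_chunks(rays, cores):
    """Divide the CPU ray list into at most 2c contiguous chunks, one per thread."""
    threads = 2 * cores
    size = max(1, -(-len(rays) // threads))  # ceiling division, at least 1
    return [rays[k:k + size] for k in range(0, len(rays), size)]
```

For example, with a throughput ratio of 9:1 a list of 1000 rays is cut into 900 for the GPU and 100 for the CPU, and on 4 cores the CPU share is handed out in 8 chunks.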
Hybrid Ray Tracing
Newton Iteration
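• We represent a ray as the intersection of two planes, (n1,d1) and (n2,d2).
  The ray-patch intersection equation becomes
  $R(u,v) = \begin{pmatrix} n_1 \cdot Q(u,v) + d_1 \\ n_2 \cdot Q(u,v) + d_2 \end{pmatrix} = 0$
  where Q(u,v) represents the point on the patch.
• We use Newton Iteration to solve for (u,v):
  $\begin{pmatrix} u_{k+1} \\ v_{k+1} \end{pmatrix} = \begin{pmatrix} u_k \\ v_k \end{pmatrix} - J \, R(u_k, v_k)$
  Here J is the inverse Jacobian matrix of R.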
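A compact Python sketch of this iteration follows. It assumes the usual two-plane residual R(u,v) = (n1·Q(u,v) + d1, n2·Q(u,v) + d2), with each plane written as n·x + d = 0, and it takes the surface and its partial derivatives as callables so that it stays independent of any particular patch representation. The function name, iteration count and tolerances are illustrative choices, not values from the slides.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def newton_intersect(Q, Qu, Qv, n1, d1, n2, d2, u, v, iters=10, tol=1e-9):
    """Solve R(u, v) = (n1.Q + d1, n2.Q + d2) = 0 starting from the guess (u, v).

    Q, Qu, Qv are callables returning the surface point and its partial derivatives.
    Returns (u, v) if the residual drops below tol inside [0, 1]^2, else None.
    """
    for _ in range(iters):
        q = Q(u, v)
        r1, r2 = dot(n1, q) + d1, dot(n2, q) + d2
        if abs(r1) < tol and abs(r2) < tol:
            return (u, v) if 0.0 <= u <= 1.0 and 0.0 <= v <= 1.0 else None
        qu, qv = Qu(u, v), Qv(u, v)
        # Jacobian of R with respect to (u, v): [[a, b], [c, d]].
        a, b = dot(n1, qu), dot(n1, qv)
        c, d = dot(n2, qu), dot(n2, qv)
        det = a * d - b * c
        if abs(det) < 1e-12:
            return None  # singular Jacobian: give up on this candidate
        # (u, v) <- (u, v) - J^{-1} R, with the 2x2 inverse written out.
        u -= (d * r1 - b * r2) / det
        v -= (a * r2 - c * r1) / det
    return None
```

The quality of the starting (u, v) matters: a guess taken from a tight subpatch bounding box usually lands close enough to the root for the iteration to converge in a few steps.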
Results (Primary Rays, 1024x1024)
Total frame time in ms for CPU, GPU and Hybrid.
Model         Patches   Ray-Patch Intersections   CPU     GPU     Hybrid
Teapot        32        126589                    74      8.71    8.01
Bigguy        3570      142779                    110     14.59   13.18
Killeroo      11532     147116                    193     22.38   20.43
2 Killeroos   23064     317494                    356     42.29   38.58
9 Bigguys     32130     570136                    2092    77.05   75.9
Results (Primary +Secondary)
Total frame time in ms for CPU, GPU and Hybrid.
Model         Patches   CPU     GPU      Hybrid
Teapot        32        137     17.05    15.61
Bigguy        3570      232     39.45    34.92
Killeroo      11532     351     58.3     52.19
2 Killeroos   23064     726     106.03   94.55
9 Bigguys     32130     3107    196.79   191.81
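The frame rates quoted on the earlier Results slide appear to correspond to the Hybrid column of this primary + secondary table. The short check below converts those frame times to frames per second; the dictionary simply restates the table, and `fps` is a hypothetical helper.

```python
# Hybrid frame times (ms) for primary + secondary rays, copied from the table.
hybrid_ms = {
    "Teapot": 15.61,
    "Bigguy": 34.92,
    "Killeroo": 52.19,
    "2 Killeroos": 94.55,
    "9 Bigguys": 191.81,
}

def fps(frame_ms):
    """Convert a frame time in milliseconds to frames per second."""
    return 1000.0 / frame_ms

for model, ms in hybrid_ms.items():
    print(f"{model}: {fps(ms):.1f} fps")
```

This prints 64.1, 28.6, 19.2, 10.6 and 5.2 fps, in line with the reported 64, 28.6, 19.2, 10.6 and 5.2.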