Coherent Hierarchical Culling: Hardware Occlusion Queries Made Useful
Jiri Bittner (1), Michael Wimmer (1), Harald Piringer (2), Werner Purgathofer (1)
(1) Vienna University of Technology
(2) VRVis Vienna
Slide 2 (Michael Wimmer, Vienna University of Technology)
Coherent Hierarchical Culling
Motivation
Typical hardware occlusion culling scenario
Legend: R = render, Q = occlusion query, C = cull
[Figure: CPU and GPU timelines over time. CPU: R Q, R Q, C Q, R. GPU: R Q, R Q, Q, R. Gaps in the timelines are labeled "Waiting time".]
Slide 3
Occlusion Culling: Offline vs. Online
Offline
  Global information about visibility (from region)
  - Difficult to implement
  - Accuracy and maintenance problems
  + No runtime overhead
Online
  Local information about visibility (from point)
  + Easier to implement
  + Greater accuracy, easy maintenance
  - Runtime overhead
Slide 4
Online Occlusion Culling
Object space methods
  - Need complex geometric calculations (hard to handle detailed scenes)
  + Do not require rasterization
Image space methods
  + No geometric calculations (easier to handle detailed scenes)
  - Require rasterization
Slide 5
Hardware Occlusion Culling
Hardware is good at rasterization!
Hardware counts rasterized fragments
  But need not update frame buffer
NV/ARB_occlusion_query
  Asynchronous
  Allows multiple simultaneous occlusion queries
General algorithm idea:
  Render simple approximation first (bbox)
  invisible: cull object
  visible: render object
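The general idea above can be illustrated with a toy software model. This is only a sketch and not real OpenGL: the depth buffer is a plain 2D list, objects are axis-aligned screen rectangles at a constant depth, and the names `count_visible_fragments`, `render`, and `draw_if_visible` are hypothetical helpers invented for this example.

```python
# Toy software model of an occlusion query (assumption: no real GPU involved).
# The depth buffer is a 2D list; a smaller z value is closer to the viewer.

def count_visible_fragments(depth, rect, z):
    """Count pixels of a rectangle that pass the depth test.

    Like a query on a bounding box, this only reads the depth buffer
    and never updates it.
    """
    x0, y0, x1, y1 = rect  # half-open pixel bounds
    return sum(1 for y in range(y0, y1) for x in range(x0, x1) if z < depth[y][x])


def render(depth, rect, z):
    """Rasterize a rectangle at depth z and update the depth buffer."""
    x0, y0, x1, y1 = rect
    for y in range(y0, y1):
        for x in range(x0, x1):
            if z < depth[y][x]:
                depth[y][x] = z


def draw_if_visible(depth, bbox, z):
    """Test the simple approximation first, then cull or render."""
    if count_visible_fragments(depth, bbox, z) == 0:
        return False            # invisible: cull object
    render(depth, bbox, z)      # visible: render object (the bbox stands in for it)
    return True


FAR = float("inf")
depth = [[FAR] * 8 for _ in range(8)]
render(depth, (0, 0, 8, 8), 1.0)                     # a wall covering the screen
hidden = draw_if_visible(depth, (2, 2, 6, 6), 5.0)   # object behind the wall
shown = draw_if_visible(depth, (2, 2, 6, 6), 0.5)    # object in front of the wall
```

In this model the answer is available immediately; on real hardware the result arrives with a delay, which is the problem the rest of the talk addresses.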
Slide 6
Hardware Occlusion Culling
Advantages
  Pixel-exact
  No explicit occluder rendering
  Exploit rasterization power of GPU
  Easy to use (API calls)
Problems
  Delay in availability of the results
  Time to execute queries
  If fill-bound: only useful if several objects culled
Slide 7
Hierarchical Stop&Wait (S&W)
Front-to-back hierarchy traversal
1. Issue visibility query for node
2. Stop and wait for result
   Invisible: cull the subtree
   Visible: render or continue 1. recursively
Advantage: Hierarchy can cull huge subtrees
Problems:
  Waiting causes CPU stalls and GPU starvation
  Huge rasterization costs (especially for large interior nodes)
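The two steps above can be sketched as a recursive traversal. This is a simplified model under stated assumptions: the `Node` class, its `distance` field, and the `occluded` flag (a fixed ground truth that stands in for the hardware query result) are hypothetical, and the "wait" returns instantly here, whereas on real hardware it is where the CPU stalls.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    distance: float                 # distance to the viewer, used for ordering
    occluded: bool = False          # ground truth reported by the simulated query
    children: list = field(default_factory=list)


def stop_and_wait(root):
    """Return (rendered leaf names, culled node names, number of queries)."""
    rendered, culled = [], []
    queries = 0

    def visit(node):
        nonlocal queries
        queries += 1                        # 1. issue visibility query for node
        visible = not node.occluded         # 2. stop and wait for the result
        if not visible:
            culled.append(node.name)        # invisible: cull the whole subtree
        elif not node.children:
            rendered.append(node.name)      # visible leaf: render
        else:
            # visible interior node: continue with the children, nearest first
            for child in sorted(node.children, key=lambda c: c.distance):
                visit(child)

    visit(root)
    return rendered, culled, queries


scene = Node("root", 0.0, children=[
    Node("B", 5.0, occluded=True, children=[Node("b1", 5.0), Node("b2", 6.0)]),
    Node("A", 1.0, children=[Node("a1", 1.0), Node("a2", 2.0, occluded=True)]),
])
rendered, culled, queries = stop_and_wait(scene)
```

Culling `B` with one query skips both of its leaves, which is the advantage named on the slide; the cost is that every visited node, including the large interior ones, is queried and waited for.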
Slide 8
CPU Stalls and GPU Starvation
Legend: Rx = render object x, Qx = query object x, Cx = cull object x
[Figure: CPU and GPU timelines over time. CPU: R1 Q2, R2 Q3, C3 Q4, R4. GPU: R1 Q2, R2 Q3, Q4, R4. Gaps in the timelines are labeled "Waiting time".]
Slide 9
Solution: Coherent Hierarchical Culling
Scheduling based on temporal coherence
  Skipping certain visibility tests
  Immediate rendering of certain geometry
Clever interleaving of queries and rendering
  Maintaining a queue of running occlusion queries
Design goal: easy implementation
Slide 10
Coherent Hierarchical Culling (CHC)
Legend: Rx = render object x, Qx = query object x, Cx = cull object x
[Figure: CPU and GPU timelines over time. CPU: R1 Q2, R2 Q3, C3 Q4, R4. GPU: R1 Q2, R2 Q3, Q4, R4. Annotations: "visible in previous frame", "Assume independent occlusion".]
Slide 11
CHC Algorithm Outline
Front-to-back hierarchy traversal
1. Node handling
   Interior node
     Previously invisible: issue visibility query
     Previously visible: continue 1. recursively
   Leaf
     Issue visibility query
     Previously visible: render immediately
2. Check availability of query results
   Invisible: propagate visibility change
   Visible: render or continue 1. recursively
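The outline above could be sketched roughly as follows. This is a minimal simulation, not the authors' implementation: the `Node` class, the `occluded` ground-truth flag, and the `latency` parameter (the number of loop iterations before a query result becomes "available") are assumptions made for illustration, and a simple stack with nearest-first child ordering stands in for the front-to-back traversal.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    distance: float
    occluded: bool = False            # ground truth reported by the fake query
    children: list = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    visible: bool = False             # classification from the last visit
    last_visited: int = -1            # frame in which the node was last visited


def link(node, parent=None):
    """Set parent pointers so visibility can be propagated upward."""
    node.parent = parent
    for child in node.children:
        link(child, node)
    return node


def chc_frame(root, frame, latency=2):
    """Run one frame; return (rendered leaf names, queried node names)."""
    stack = [root]
    queue = deque()                   # (node, tick at which the result is ready)
    rendered, queried = [], []
    tick = 0

    def traverse(node):
        if node.children:
            # Push far children first so the nearest child is popped first.
            stack.extend(sorted(node.children, key=lambda c: -c.distance))
        elif node.name not in rendered:
            rendered.append(node.name)

    while stack or queue:
        tick += 1
        # 2. Check availability of query results; wait only when otherwise idle.
        while queue and (queue[0][1] <= tick or not stack):
            node, _ = queue.popleft()
            if not node.occluded:
                ancestor = node
                while ancestor is not None and not ancestor.visible:
                    ancestor.visible = True      # propagate visibility upward
                    ancestor = ancestor.parent
                traverse(node)                   # render or continue recursively
        # 1. Node handling.
        if stack:
            node = stack.pop()
            was_visible = node.visible and node.last_visited == frame - 1
            node.visible = False
            node.last_visited = frame
            if not was_visible or not node.children:
                queried.append(node.name)        # issue visibility query
                queue.append((node, tick + latency))
            if was_visible:
                traverse(node)                   # render immediately or descend
    return rendered, queried


scene = link(Node("root", 0.0, children=[
    Node("A", 1.0, children=[Node("a1", 1.0), Node("a2", 2.0, occluded=True)]),
    Node("B", 5.0, occluded=True, children=[Node("b1", 5.0), Node("b2", 6.0)]),
]))
first = chc_frame(scene, frame=0)
second = chc_frame(scene, frame=1)
```

In this toy scene the first frame has no history and queries five nodes, while the second frame skips the previously visible interior nodes `root` and `A` and issues only three queries.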
Slide 12
Why Interleaving Works…
Processing a node only depends on…
1. Front to back order
2. Results of queries for processed nodes where:
   Previous frame:
   processed node   current node                                          S&W   CHC
   visible          visible                                               yes   no
   visible          invisible                                             yes   no
   invisible        visible                                               yes   no
   invisible        invisible (different subtrees)                        yes   no
   invisible        invisible (parent → child, refinement of visibility)  yes   yes
Slide 13
CHC: Hierarchy Traversal
[Figure: hierarchy tree with numbered nodes (1 to 13) and the corresponding scene regions, traversed in front-to-back order. Legend: previously visible, previously invisible. Annotations: "no queries for previously visible interior nodes", "assume no query dependencies", "hidden regions: queries depend on parents".]
Slide 14
CHC Features
Reduction of CPU stalls and GPU starvation
  Interleaving queries with rendering previously visible geometry
Reduction of the number of queries
  Avoids expensive redundant queries for interior nodes
Size of tested regions adapts to visibility
  pull-up: occluded region growing
  pull-down: visible region growing
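One way to picture how the tested regions adapt is to compute, for a given visibility classification, which nodes would be queried in the next frame if nothing changed. The sketch below is an illustration only: the dictionary-based tree and the function names `pull_up` and `queries_next_frame` are hypothetical, and real CHC derives this set implicitly during traversal rather than in a separate pass.

```python
def pull_up(tree, root, visible_leaves):
    """Classify nodes bottom-up: an interior node is visible iff some leaf below it is."""
    visible = set()

    def visit(node):
        kids = tree.get(node, [])
        if kids:
            results = [visit(k) for k in kids]   # visit every child, no short-circuit
            is_visible = any(results)
        else:
            is_visible = node in visible_leaves
        if is_visible:
            visible.add(node)
        return is_visible

    visit(root)
    return visible


def queries_next_frame(tree, root, visible):
    """Nodes that would be queried next frame, assuming unchanged visibility."""
    out = []

    def visit(node):
        kids = tree.get(node, [])
        if node not in visible:
            out.append(node)      # topmost invisible node: one query covers the subtree
        elif not kids:
            out.append(node)      # visible leaf: still queried
        else:
            for k in kids:        # visible interior node: no query, descend
                visit(k)

    visit(root)
    return out


tree = {"root": ["A", "B"], "A": ["a1", "a2"], "B": ["b1", "b2"]}
visible = pull_up(tree, "root", {"a1"})
queries = queries_next_frame(tree, "root", visible)
```

When a whole subtree such as `B` is hidden, a single query on `B` replaces the queries on its leaves (the occluded region has grown); when a leaf under `B` becomes visible, the queries move down to the children again.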
Slide 15
Implementation Issues
Front-to-back traversal
  Priority queue: allows various hierarchical data structures
Checking query results
  glGetOcclusionQueryivNV
  GL_PIXEL_COUNT_AVAILABLE_NV
  Very cheap operation
Queries for previously visible nodes
  Use actual geometry as occludee (instead of bounding box)
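A priority-queue traversal of the kind mentioned above might look like the following sketch. It is independent of the hierarchy type because it only needs two callbacks; `children` and `distance` are assumed, caller-supplied functions, and a strict front-to-back order additionally assumes that a child is never closer to the viewer than its parent (for example, when `distance` measures the nearest point of a bounding volume).

```python
import heapq
import itertools


def front_to_back(root, children, distance):
    """Yield nodes in order of increasing distance using a priority queue.

    children(node) returns an iterable of child nodes and distance(node)
    returns a number; both are assumptions supplied by the caller.
    """
    counter = itertools.count()   # tie-breaker so nodes are never compared directly
    heap = [(distance(root), next(counter), root)]
    while heap:
        _, _, node = heapq.heappop(heap)
        yield node
        for child in children(node):
            heapq.heappush(heap, (distance(child), next(counter), child))


tree = {"root": ["far", "near"], "near": ["n1", "n2"], "far": ["f1"]}
dist = {"root": 0.0, "near": 1.0, "n1": 1.5, "n2": 3.0, "far": 2.0, "f1": 2.5}
order = list(front_to_back("root", lambda n: tree.get(n, []), dist.__getitem__))
```

Note that the order interleaves the two subtrees (`n1`, then `far` and `f1`, then `n2`), which a plain depth-first traversal would not do.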
Slide 16
Further Optimizations
Conservative visibility testing
  Assume visible node remains visible n frames
  + Saves additional occlusion queries
Approximate visibility
  #visible pixels < threshold → node invisible
  + Saves rendered geometry
  - Produces image errors
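Both optimizations reduce to small decision rules, sketched below. The `VisibilityState` class and the functions `needs_query` and `classify` are hypothetical names for this illustration; how a real implementation stores this per-node state is not specified on the slides.

```python
from dataclasses import dataclass


@dataclass
class VisibilityState:
    visible: bool = False
    last_tested: int = -10**9     # frame of the most recent occlusion query


def needs_query(state, frame, assumed_visible_frames):
    """Conservative testing: a node found visible is assumed to stay visible
    for the next `assumed_visible_frames` frames, so its query is skipped."""
    if not state.visible:
        return True
    return frame - state.last_tested > assumed_visible_frames


def classify(state, frame, visible_pixels, threshold=0):
    """Approximate visibility: fewer than `threshold` visible pixels counts
    as invisible. With threshold 0 the test is exact."""
    state.visible = visible_pixels > 0 and visible_pixels >= threshold
    state.last_tested = frame
    return state.visible


state = VisibilityState()
log = []
log.append(needs_query(state, 0, 2))               # never tested: query
log.append(classify(state, 0, 100, threshold=25))  # 100 pixels: visible
log.append(needs_query(state, 1, 2))               # assumed visible: skip
log.append(needs_query(state, 2, 2))               # assumed visible: skip
log.append(needs_query(state, 3, 2))               # assumption expired: query
log.append(classify(state, 3, 10, threshold=25))   # 10 pixels: treated as invisible
```

The last call shows where the image errors come from: the node does contribute 10 pixels, but it is culled because it falls below the threshold.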
Slide 17
Results – Test Scenes
Scene         Triangles   kD-tree nodes
Teapots       11.5M       21k
City          1M          33k
Power plant   12.7M       18.7k
Slide 18
Results – Speedup
[Figure: bar chart of speedup (axis from 0 to 7) for the Teapots, City, and Powerplant scenes, comparing VFC, S&W, CHC, and Ideal.]
Ideal: zero overhead – render only visible geometry
Slide 19
Results – Summary
Comparison to hierarchical S&W
  #queries reduced by a factor of almost 2
  Times for stalls reduced by 20-60x (to 0.18–1.31 ms)
Close to ideal algorithm!
  Only 2–9 ms slower
  Overhead due to query time
Slide 20
Results – Teapot
Slide 21
Results – City
Slide 22
Results – Powerplant
Slide 23
Optimization Results
Conservative culling, 2 frames assumed visible
  Good for deep hierarchies with simple leaf geometry
  Further speedup up to 21%
Approximate culling, 25 pixels threshold
  Good for scenes with complex visible geometry
  Further speedup up to 33%
Slide 24
Conclusion
Efficient scheduling of hardware occlusion queries
  Greatly reduces CPU stalls and GPU starvation
  Reduces number of required queries
Simple to implement
  Arbitrary hierarchical data structure
Speedup ~4 over VFC
Close to ideal solution for tested scenes
Watch out for GPU Gems II
Slide 25
Thanks for Your Attention
Slide 26
CHC: Example
[Figure: animated walkthrough of CHC on the numbered hierarchy, showing the query queue, the issued queries, and the GPU command stream (R4, Q5, Q6/R6, Q7, Q8, R7, Q10/R10, Q11). Step captions:]
  previously visible: continue 1. recursively
  previously visible: render
  previously visible: issue query + render
  previously invisible: query
  query result available: continue 1. recursively
  query result available: render
  query result available: cull
  query result available: mark visible
  pull-up invisibility
  final classification