
ITN, Norrköping, December 21, 2012

Volume-based Ambient Occlusion with Voxel Fragmentation

PROJECT IN TECHNOLOGY FOR ADVANCED COMPUTER GAMES

TSBK03

Christopher Birger, Erik Englesson, Anders Hedblom

[email protected]
[email protected]

[email protected]


Abstract

Ambient occlusion is an effect that approximates how difficult it is for light to reach tight areas in a 3D scene. Its purpose is to enhance realism in computer graphics without resorting to physically based methods, which are often very computationally heavy. Ambient occlusion also increases our ability to perceive 3D objects. The effect can be achieved with several different methods, but the basic idea is that the more nearby geometry can be seen from a point, the darker that point will be. For this project, as part of the course Technology for Advanced Video Games (TSBK03), we created a volume-based ambient occlusion pass renderer. We voxelize the scene and store it as volume data in a 3D texture, which is sent into the graphics pipeline. A Sparse Voxel Octree was also implemented on the CPU, which can be visualized during runtime.

1 Introduction

In real-time rendering applications there is not much room for accurate lighting calculations, and therefore many tricks must be used to achieve believable results. If computation time were not limited, the obvious choice would be to calculate a full global illumination model. Global illumination can be calculated using ray tracing methods, where rays are sent from the camera and followed through the scene to any of the light sources. At each surface hit, the irradiance, i.e., the incoming light, is calculated by integrating over all possible directions on the hemisphere with respect to the normal of the surface. In practice, this means that a set number of rays is spawned at each surface hit and ray traced further into the scene. Because of this recursive nature, ray traced global illumination cannot be used in real-time applications. Compared to a local illumination model, where each point is treated without regard to any other point, global illumination depends on the scene complexity.

When objects are close together and the hemisphere is partially occluded, less light will fall onto such points. This is exactly what ambient occlusion tries to approximate, i.e., how much of the hemisphere is occluded and how close other objects are. Ambient occlusion has been shown to increase the perception of scene geometry and to aid the apprehension of three-dimensional shape and depth. Ambient occlusion can easily be computed with ray tracing techniques, but in most real-time applications this is far too expensive. In the standard graphics pipeline of today there is no natural way of accessing all of the scene geometry. Therefore, there has been much research on computing ambient occlusion with much faster methods. Ambient occlusion was made popular in real-time applications by Screen Space Ambient Occlusion (SSAO) [1], a very fast method that can be calculated purely in a fragment shader.

Ambient occlusion is only an approximation of light behavior, since it manipulates ambient light, which is itself a fake component in lighting. Ambient light is a very crude approximation of how light bounces around in a scene with diffuse surfaces. It causes objects to appear lit somewhat from all directions, not just from the directions of the light sources. Ambient light ignores the shape of all geometry in the scene and therefore does not contribute any realism. Setting the ambient intensity in a scene gives all fragments a lower illumination threshold, to which all other light is added. As the name states, ambient occlusion reduces this threshold in tight areas by subtracting an occlusion value from the ambient intensity.

Even though SSAO is a very effective way of creating the AO effect in real time, it definitely has some quality flaws. There are other fast methods for creating AO, such as different volume-based approaches. By storing all geometry as volume data in a texture, it is possible to create ambient occlusion on the GPU while keeping it view-independent and converging towards a correct result. In this project, we have implemented a volume-based ambient occlusion pass renderer that uses this principle.


2 Ambient Occlusion

In global illumination, light is considered to reflect in directions over the hemisphere at each point in the scene. How much, and in which directions, the light reflects is modeled by a bidirectional reflectance distribution function (BRDF), f, which depends on the incoming light direction, the viewing direction and the surface normal. If the surface is perfectly diffuse, the BRDF is constant, since light is reflected uniformly over the hemisphere. Ambient occlusion stems from the fact that in narrow spaces, such as corners, the hemisphere over these points is partially covered. As a result, such areas do not receive as much light as "open" areas. Consider the rendering equation, which calculates the exitant radiance in a direction ω_o from a point x, as in equation 1.

L_o(x, \omega_o) = \int_{\Omega} L_i(x, \omega_i)\, f(x, \omega_i, \omega_o)\, V(x, \omega_i)\, (\omega_i \cdot n)\, d\omega_i \qquad (1)

where L_i(x, ω_i) is the incoming radiance from direction ω_i and V(x, ω_i) is the visibility function, which is either 0 or 1. The last term, (ω_i · n), models the fact that incoming light is projected onto a greater surface area as the angle between the surface normal and the incoming direction increases. If we now consider all surfaces to be perfectly diffuse and replace the incoming radiance, L_i, with a constant ambient light source, L_a, we get the following equation.

L_o(x, \omega_o) = L_a \int_{\Omega} V(x, \omega_i)\, (\omega_i \cdot n)\, d\omega_i \qquad (2)

The ambient light term is an approximation of the indirect lighting in the scene and is often used in local lighting models, where the position of the light source is not taken into account. It is now clear that the more the hemisphere is occluded, the less light the point receives, and vice versa. This is very close to the ambient occlusion equation, but we also need to add an attenuation function. Consider looking into an infinitely large room. Corners in the room should become darker, since parts of the hemisphere become occluded. However, in equation 2 there is nothing that models how close a surface is when evaluating the visibility function, so all points in this room would evaluate to the same ambient occlusion value. Using this intuition and the approximations made, we end up with the final ambient occlusion integral.

A_x = \frac{1}{\pi} \int_{\Omega} V(x, \omega_i)\, \tau(x, x_i)\, (\omega_i \cdot n)\, d\omega_i \qquad (3)

where τ(x, x_i) is the attenuation function and x_i is the intersection point, if there is any. The most accurate way to solve this integral is to use some sort of ray tracing method. In ray tracing, the integral is discretized into a sum over the hemisphere: the hemisphere is divided into a set number of regions, corresponding to multiple directions ω_i, and for each direction a ray is sent out into the scene. If a ray intersects the scene, the visibility evaluates to one and the ray is terminated. Figure 1 depicts two different surface points in the scene.

Figure 1: Visualization of ambient occlusion at two surface points A and B.

In figure 1, the AO value at point A is low, since only two rays hit another surface, far away and at a large angle between the normal and the ray. The ambient occlusion at point B is high, since many of the rays find close surfaces in directions close to the normal.
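The discretization just described can be sketched in C++. All names below are ours: the attenuation function τ is assumed to be a simple linear falloff with a cutoff radius, and the scene intersection test is left as a callback standing in for a full ray tracer.

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct Vec3 { double x, y, z; };

static Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x};
}
static Vec3 normalize(const Vec3& v) {
    double l = std::sqrt(v.x*v.x + v.y*v.y + v.z*v.z);
    return {v.x/l, v.y/l, v.z/l};
}

// Assumed attenuation: occlusion fades linearly and vanishes beyond rMax.
static double tau(double dist, double rMax) {
    return dist < rMax ? 1.0 - dist / rMax : 0.0;
}

// Monte Carlo estimate of the AO integral in equation 3. Cosine-weighted
// directions are drawn over the hemisphere of n; with this sampling the
// (wi . n)/pi weight cancels against the pdf, leaving a plain average.
// hitScene(origin, dir) returns the hit distance, or a negative value on a miss.
template <typename HitFn>
double ambientOcclusion(const Vec3& x, const Vec3& n, int numRays,
                        double rMax, HitFn hitScene) {
    const double kPi = 3.14159265358979323846;
    // Orthonormal frame (t, b, n) around the surface normal.
    Vec3 helper = std::fabs(n.x) > 0.9 ? Vec3{0, 1, 0} : Vec3{1, 0, 0};
    Vec3 t = normalize(cross(helper, n));
    Vec3 b = cross(n, t);
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> u(0.0, 1.0);
    double occ = 0.0;
    for (int i = 0; i < numRays; ++i) {
        double r1 = u(rng), r2 = u(rng);
        double phi = 2.0 * kPi * r1, s = std::sqrt(r2), c = std::sqrt(1.0 - r2);
        // Local cosine-weighted direction mapped into world space.
        Vec3 d{t.x*s*std::cos(phi) + b.x*s*std::sin(phi) + n.x*c,
               t.y*s*std::cos(phi) + b.y*s*std::sin(phi) + n.y*c,
               t.z*s*std::cos(phi) + b.z*s*std::sin(phi) + n.z*c};
        double dist = hitScene(x, d);
        if (dist >= 0.0) occ += tau(dist, rMax);  // V = 1: the ray hit geometry
    }
    return occ / numRays;  // 0 = fully open, 1 = fully occluded
}
```

A point under an open sky evaluates to zero occlusion, while a point below a nearby ceiling gets a value strictly between zero and one, matching the intuition of figure 1.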

Ray traced ambient occlusion converges to a correct result as more and more rays are used. The method is too slow for real-time applications, but it is commonly used as a ground-truth reference when comparing approximate real-time algorithms. Some of the most common real-time approaches are detailed in the following sections.


Figure 2: A comparison between different methods for creating AO.

2.1 Different approaches to AO

We can conclude that ray casting the scene is a correct way of achieving the ambient occlusion effect; however, as it is so computationally heavy, the method is not feasible in real-time applications. In static scenes it is possible to pre-render an ambient occlusion pass once and apply it during runtime as a texture. This can give nice and smooth shadows in most cases, but immediately becomes a problem when objects start to move around. In today's games, physics and object manipulation are often as important as the graphics, making it necessary to use other methods than ray casting if AO is to be part of the game.

Listed below are some well-used illumination methods for offline rendering that do not really require additional AO, as they already try to simulate the behaviour of light. Also listed are some ways of doing fast AO, mostly for real-time usage, where the effect has to be calculated every single frame.

Offline methods

• Ray-tracing

• Photon mapping

• Radiosity

Real-time methods

• SSAO - Screen Space Ambient Occlusion

• HBAO - Horizon Based Ambient Occlusion

• Volume-based AO

2.2 SSAO

Screen-Space Ambient Occlusion [1] is a very fast and effective method for achieving the AO effect in real time. It was developed at Crytek and made its debut in their game Crysis in 2007. The mind behind the technique is Vladimir Kajalin, who came up with the idea of using the built-in depth buffer of the rendering pipeline to calculate the occlusion values. The depth values of objects that are close to each other in the view plane are compared, and from their differences a final AO intensity is derived. Because SSAO uses the camera depth buffer, it has to be recalculated every frame; thanks to its efficiency, this is fully possible without affecting the frame rate too much. SSAO quickly became the standard for AO in games and is still being used today. However, it definitely suffers from a few problems. The method can only take visible geometry into account: objects that are outside the view plane or blocked by other objects are simply ignored. It also has difficulties managing object edges. When a foreground-object pixel is compared to a pixel belonging to the background, there can be huge differences in depth. This causes problems when blurring the effect to smooth out noise and aliasing, which can result in strange halos (glorias) around objects. Also, areas that are heavily occluded from the sides (with respect to the camera direction) may appear brighter than they should, due to the lack of variation in depth. Therefore, SSAO will never converge to a correct result, no matter how many samples are used.


2.2.1 How it works

SSAO has the advantage that it is implemented purely in a fragment shader, which takes a texture copied from the framebuffer as input. This avoids main-memory fetches and thus increases computational efficiency. The AO fragment shader reads the depth value of a pixel and compares it to neighboring pixels. How these pixels are sampled can vary depending on the user's quality settings. If an adjacent pixel has a smaller depth value, the occlusion value for the current pixel is incremented. An illustration of this can be seen in figure 3. Another advantage of SSAO is its independence of scene complexity: as the number of fragments on screen is constant, it takes the same time to calculate the AO for a simple plane as for a scene with millions of polygons.

Figure 3: Illustration of different depth values in 1D.
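The counting idea behind figure 3 can be sketched as a one-dimensional toy version. Real SSAO samples a 2D kernel and applies range checks and falloff; the function and parameter names here are our illustration, not Crytek's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy 1-D version of the per-fragment depth comparison in SSAO: sample
// depth-buffer entries within `radius` of index i and count how many lie
// in front of (are smaller than) the centre depth. The returned fraction
// plays the role of the occlusion value accumulated in the fragment shader.
double ssaoOcclusion1D(const std::vector<double>& depth, std::size_t i, int radius) {
    int occluders = 0, samples = 0;
    for (int off = -radius; off <= radius; ++off) {
        if (off == 0) continue;
        long j = static_cast<long>(i) + off;
        if (j < 0 || j >= static_cast<long>(depth.size())) continue;
        ++samples;
        if (depth[j] < depth[i]) ++occluders;  // neighbour is closer -> occludes
    }
    return samples ? static_cast<double>(occluders) / samples : 0.0;
}
```

A flat depth buffer yields zero occlusion, while a pixel at the bottom of a depth "pit" is fully occluded by its neighbours.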

2.3 HBAO

Horizon-Based Ambient Occlusion is an improvement of the SSAO method, presented by Louis Bavoil et al. [2] in 2008. HBAO uses the angle between the horizon and the vector from an AO point to a point on occluding geometry, marching over all nearby geometry to obtain the total AO value. This resolves some of the problems of SSAO, but the computational cost is also greater.

In many real-time applications, such as games, the difference can be hard to spot, and it might not be worth the extra computation time.

3 Volume-Based Ambient Occlusion

Volume-based ambient occlusion, presented by G. Papaioannou et al. [3], utilizes the same principles as stochastic ray casting of the hemisphere, as in equation 3. Here, however, the rays are marched within a volume representation of the scene. The scene geometry can be voxelized and stored inside a 3D texture, and then conveniently traversed. As all the data of the scene exists in the texture, the method is view-independent. It is nevertheless possible to derive the ambient occlusion only for the volume data that is visible on the screen.

In the implementation presented in this paper we apply a conservative voxelization step, which will be described in the following section.

For now, assume that we have a volume representation of a scene as a 3D texture, filled with ones where there is geometry and zeros everywhere else. When the geometry in the scene is rendered, each pixel fragment can be connected to a voxel fragment in the 3D texture. The geometry also provides the normal at that position. From here, rays can be marched out through the texture in directions within the hemisphere of the normal. As we march through the texture, we linearly interpolate the value at each sampling point. If a ray intersects geometry, the ray is terminated and the occlusion value for the current pixel is increased. This increment depends on how many rays are cast. The final pixel value is calculated as the difference between a set ambient value and the sum of all occlusion values.
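A minimal sketch of the marching loop for a single ray, assuming a binary occupancy grid and fixed steps of one voxel length; names are ours, and a real implementation would sample the 3D texture with hardware interpolation in a shader.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Binary voxel grid: 1 where geometry was rasterized, 0 elsewhere.
struct Grid {
    int n;                          // resolution per axis
    std::vector<unsigned char> v;   // occupancy, (k*n + j)*n + i
    explicit Grid(int n_) : n(n_), v(static_cast<size_t>(n_)*n_*n_, 0) {}
    bool at(int i, int j, int k) const {
        if (i < 0 || j < 0 || k < 0 || i >= n || j >= n || k >= n) return false;
        return v[(static_cast<size_t>(k)*n + j)*n + i] != 0;
    }
};

// March one ray from (ox,oy,oz) along (dx,dy,dz) in `steps` fixed
// increments, starting beyond the fragment's own voxel (see section 3.1).
// Returns 1 on a hit, 0 on a miss; the caller averages this over all cast
// rays and subtracts the sum from the ambient term.
int marchRay(const Grid& g, double ox, double oy, double oz,
             double dx, double dy, double dz, int steps) {
    for (int s = 1; s <= steps; ++s) {
        double t = s + 1.0;  // skip the first voxel so it cannot occlude itself
        int i = static_cast<int>(std::floor(ox + dx * t));
        int j = static_cast<int>(std::floor(oy + dy * t));
        int k = static_cast<int>(std::floor(oz + dz * t));
        if (g.at(i, j, k)) return 1;  // terminate on first occupied voxel
    }
    return 0;
}
```

With a solid wall of voxels above the sample point, a ray marched upwards reports a hit while a downward ray into empty space reports a miss.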

3.1 Pros and Cons

One of the advantages of volume-based AO is that, once the scene has been voxelized and stored in a 3D texture, the rest can be done in a fragment shader. This makes it possible to utilize the massively parallel computing power of the GPU, which allows the method to run in real time.


Furthermore, the technique is independent of scene complexity: AO is only calculated for voxel fragments that are visible on screen.

Unfortunately there are also some drawbacks. The texture memory on the GPU sets a fairly early limit on the size of the 3D texture. Less available memory means a lower resolution of the voxel grid, which inevitably leads to voxel-shaped artifacts in the AO shadow contours. The method also scales badly with the number of pixel fragments being displayed. Imagine that, for every fragment, nine rays are cast over its hemisphere, and each ray reads the 3D texture at 10 different positions. A ray will probably hit a voxel containing geometry before it has traversed all these steps, but in the worst case we have 90 texture fetches per fragment. Finally, the method has difficulties creating ambient occlusion in areas tighter than the length of a single voxel. If occluding geometry is located in the same voxel fragment as the pixel we are currently calculating AO for, this geometry will be ignored: a ray must completely exit its own voxel fragment in the first step, since otherwise every ray would report a geometry hit, which would make every fragment black.

3.2 Bilateral smoothing

The ambient occlusion pass is rendered to a 2D texture, which is sent into another shader in a separate pass. Here, the AO texture is blurred with a Gaussian filter kernel in order to remove some of the artifacts that arise from the finite grid resolution. The blur cannot be applied naively to the whole texture, though, as that would eliminate all the details in the scene; only the AO shadows should be smoothed. For this, a technique is used that is much like the depth comparison in SSAO. Bilateral filters weight the surrounding pixels using more information than just the distance, in order to preserve sharp edges; another radiometric value, such as depth, can be used to achieve this. The Z-buffer depth values are saved in another texture, so that they can be accessed for all pixels in the AO texture. When fetching the pixel values within the filter kernel from the AO texture, their depth values are also fetched and compared with the depth of the current pixel in the blurring fragment shader. The more a pixel's depth value differs from that of the center pixel, the less it is weighted in the final value. This technique is used, for instance, in the Surface Blur filter in Adobe Photoshop.

Figure 4: Example of an application of a bilateral filter.
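The weighting can be sketched as the product of a spatial Gaussian on the pixel distance and a range Gaussian on the depth difference; the kernel parameters below are hypothetical, as the report does not fix them.

```cpp
#include <cassert>
#include <cmath>

// Bilateral weight of one neighbour pixel: spatial falloff times depth
// falloff. A large depth difference drives the weight towards zero, so
// pixels across a depth edge contribute little -- the edge is preserved.
// sigmaS and sigmaR (spatial and range falloffs) are our names.
double bilateralWeight(double pixelDist, double depthDiff,
                       double sigmaS, double sigmaR) {
    double ws = std::exp(-(pixelDist * pixelDist) / (2.0 * sigmaS * sigmaS));
    double wr = std::exp(-(depthDiff * depthDiff) / (2.0 * sigmaR * sigmaR));
    return ws * wr;
}
```

A neighbour at the same depth is weighted higher than one across a depth discontinuity, which is exactly what keeps object silhouettes sharp while the AO shadows are smoothed.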

4 Volume Rasterization Using the Graphics Pipeline

Volume rasterization, or just voxelization, is the method of sampling some function into a three-dimensional discrete domain. The function can be any type of 3D discrete or continuous function, e.g., a level set, a density function, or in our case a surface or, more precisely, a mesh. Since the domain is discrete it has to be divided into a set of smaller regions, commonly called voxels. Voxels are the three-dimensional equivalent of pixels in an image, i.e., they have a position, a coverage (size) and of course a value. A fragment is a pixel intersected by a triangle; the 3D equivalent, a voxel intersected by a triangle, is called a voxel fragment. In this paper we always assume that the voxels are uniform in size, i.e., they are shaped as cubes, see figure 5.

It is also important to note that there are at least two different coordinate spaces when dealing with voxel grids. Voxel space is a local space defined over the number of voxels in each dimension, i.e., (i, j, k) ∈ [0, 1, 2, ..., N_*], where N_* denotes the number of voxels along each axis (* = i, j, k). However, the voxels are also placed in world space, i.e., voxel (i, j, k) actually represents some world-space coordinate (x, y, z). In order to perform this mapping, we must decide the world-space size of a voxel, ∆v. The mapping can then be performed as follows:


Figure 5: Voxelized version (left) of geometry (right).

(x, y, z)_{world} = \Delta v \, (i, j, k)_{voxel} \qquad (4)

(i, j, k)_{voxel} = \left\lfloor \frac{1}{\Delta v} (x, y, z)_{world} \right\rfloor \qquad (5)

where \lfloor \cdot \rfloor is the floor operation.
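The mapping in equations 4 and 5 translates directly into code, assuming, as in the text, that ∆v is the world-space edge length of a voxel and taking toWorld to return the lower corner of a voxel (the struct and method names are ours):

```cpp
#include <cassert>
#include <cmath>

// Mapping between voxel indices and world-space coordinates, per axis.
struct VoxelMap {
    double dv;  // Delta v: world-space edge length of one voxel

    // Equation 5: world coordinate -> voxel index.
    int toVoxel(double w) const { return static_cast<int>(std::floor(w / dv)); }

    // Equation 4: voxel index -> world coordinate of the voxel's lower corner.
    double toWorld(int i) const { return dv * i; }
};
```

The round trip toVoxel(toWorld(i)) recovers i, and dimensional analysis checks out: dividing a world length by ∆v yields a unitless index, and multiplying an index by ∆v yields a world length.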

4.1 Surface mesh voxelization

Compared to a continuously defined three-dimensional function, a surface mesh is quite difficult to rasterize, since the surface is not explicitly defined. In this paper we assume that a mesh is a set of triangles and that each triangle is defined by three vertices. The basic idea of rasterizing a mesh using the graphics pipeline is to let the hardware rasterizer generate our voxel positions. First, a cubical frustum is defined in world space, which specifies the domain of the volume representation. The trick is then to set the viewport size to the resolution of the voxel grid and orthographically project each triangle. The rasterizer of the graphics pipeline will then generate fragments that correspond to voxel fragments. That is, the screen coordinates of the fragments correspond directly to voxel-space coordinates. The depth of a fragment can be used to calculate the voxel-space depth by multiplying it with the depth dimension of the grid. The approach is very simple in theory, but in practice there are several issues that need to be resolved. We will begin by examining how a simple triangle is rasterized onto the screen.

Figure 6: Conservative rasterization (b), compared with normal rasterization (a). Source: [4]

Two-dimensional triangle rasterization can be done with a scanline algorithm, which will not be covered in this paper; the result can be seen in figure 6. The figure shows two different types of rasterization. In figure 6 (a), only pixels that have their center inside the triangle are rasterized, which is often called standard rasterization. This is not always good, since it underestimates the boundary of the surface. Taken to three dimensions, this could result in holes in the surface when the angle between two triangles is very steep. Conservative rasterization, which has been used in figure 6 (b), can be used to fix this problem. This method rasterizes every pixel that is intersected by the triangle, as well as every pixel contained in it. How this can be achieved with the graphics pipeline is covered later. As mentioned, if the triangle is orthographically projected onto an image plane with the same resolution as the grid, the screen coordinate of each fragment will be the same as the voxel-space coordinate. However, in three dimensions the depth is also needed. The depth, d, of any given fragment will be in the range [0, 1], so it has to be converted by multiplying it with the depth dimension of the grid. The problem is that when the triangle is projected and rasterized, its area is decreased. It is therefore possible that the triangle intersects multiple voxels in depth while the pipeline only generates a single fragment. In the next section we will see how to modify the standard rasterization pipeline to do conservative rasterization, which alleviates the issues just described.

4.2 Conservative Rasterization

As mentioned, conservative rasterization is a technique that rasterizes all fragments that are intersected by a triangle. Conservative rasterization is applied both in screen coordinates and in depth, but these are two separate methods and are applied in different stages of the graphics pipeline. Because the rasterization process itself is not open for modification, the only way to change it is to feed it different input. Therefore, to achieve the result depicted in figure 6 (b), the triangle must be modified before being sent to the raster stage. One way of achieving this is by dilating the triangle, as proposed in [5]. Their approach for convex polygons such as triangles is to place a pixel on each vertex and calculate the convex hull with these pixels included, see figure 7.

Figure 7: Adjusted, over-estimated triangle.

This can be done by enlarging the triangle until it is aligned with the convex hull, shown in blue, and creating a bounding box that bounds the convex hull, shown in orange in figure 8 below.

Figure 8: Illustration of the bounding box, whichcuts off parts of the over-estimated triangle.

The enlarging of the triangle and the creation of the bounding box are done in a geometry shader. After the standard rasterization, fragments in the convex hull are created as intended, but so are fragments of the enlarged triangle outside the bounding box. The first thing done in the subsequent fragment shader is therefore to discard these unwanted fragments, by checking whether the fragment is inside the bounding box or not.
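One way to realize this dilation, sketched here on the CPU instead of in a geometry shader, is the edge-shift variant common in the conservative rasterization literature: represent each edge as a homogeneous 2D line, push it outwards by the worst-case half-pixel semidiagonal, and intersect adjacent shifted edges to obtain the new vertices. The code assumes counter-clockwise triangles; all names are ours.

```cpp
#include <cassert>
#include <cmath>

struct V2 { double x, y; };
struct H3 { double x, y, z; };  // homogeneous 2-D point or line

static H3 cross(const H3& a, const H3& b) {
    return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x};
}

// Dilate a counter-clockwise triangle by the half-pixel extent hp.
// Each edge line is shifted outwards by its worst-case semidiagonal
// offset; new vertex i is the intersection of the two shifted edges
// that meet at the original vertex i.
void dilateTriangle(const V2 in[3], double hp, V2 out[3]) {
    H3 line[3];
    for (int i = 0; i < 3; ++i) {
        H3 p{in[i].x, in[i].y, 1.0};
        H3 q{in[(i + 1) % 3].x, in[(i + 1) % 3].y, 1.0};
        line[i] = cross(p, q);  // edge i runs from vertex i to vertex i+1
        // Shift outwards: for CCW winding the interior lies on the
        // positive side of cross(p, q), so increasing z moves the line away.
        line[i].z += hp * (std::fabs(line[i].x) + std::fabs(line[i].y));
    }
    for (int i = 0; i < 3; ++i) {
        H3 v = cross(line[(i + 2) % 3], line[i]);  // adjacent shifted edges
        out[i] = {v.x / v.z, v.y / v.z};
    }
}
```

For the unit right triangle and hp = 0.5, each dilated vertex is pushed half a pixel past the original along both axes of its incident edges, over-estimating the triangle exactly as figure 7 illustrates; the bounding-box clip then trims the corner overshoot.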

A conservative depth value also needs to be computed, though conservative in the sense of depth is a bit different than in screen space. As mentioned, the problem stems from the projection step. In figure 9, a triangle is projected and rasterized onto the image plane. The black points refer to fragments in the image plane, and the grid refers to both pixels and voxels. Also note that the triangle is seen from above the projection direction, i.e., the depth increases when moving away from the projection plane. In figure 9 (a) it is clear that some voxels are missed, due to the fact that the triangle intersects several voxels in depth within the same fragment. Our goal is to produce the result in figure 9 (c).

Figure 9: Going from (a) to (c) by using the z-min and z-max in (b).

In order to achieve this, we must calculate a minimum and a maximum depth value within the fragment, which can be seen in figure 9 (b). This requires the depth change, ∆z, within the fragment, which can be approximated from the partial derivatives of the depth, given by the following equation [6].

\Delta z = \frac{1}{2} \left( \left| \frac{\partial z}{\partial x} \right| + \left| \frac{\partial z}{\partial y} \right| \right) \qquad (6)

The minimum and maximum depth values are then given by z_min = z_c − ∆z and z_max = z_c + ∆z, where z_c is the depth at the center of the fragment. Partial derivatives are often built-in shader functions in several shading languages, e.g., the function fwidth(...) in GLSL. The final step in the rasterization process is to write to the 3D texture. In OpenGL this has been possible since the addition of image load/store in OpenGL 4.2.
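Equation 6 and the padding of the centre depth translate directly into code; in a shader the derivatives would come from dFdx/dFdy or fwidth, while here they are simply passed in (the struct and function names are ours).

```cpp
#include <cassert>
#include <cmath>

struct DepthRange { double zmin, zmax; };

// Conservative depth bounds for one fragment: approximate the in-fragment
// depth change per equation 6 and pad the centre depth zc both ways, so
// every voxel the triangle touches in depth is covered.
DepthRange conservativeDepth(double zc, double dzdx, double dzdy) {
    double dz = 0.5 * (std::fabs(dzdx) + std::fabs(dzdy));
    return {zc - dz, zc + dz};
}
```

For example, a centre depth of 0.5 with derivatives 0.2 and -0.1 yields ∆z = 0.15 and the range [0.35, 0.65]; every voxel slab this interval overlaps (after multiplying by the grid's depth dimension) is then written.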


5 Sparse Voxel Octree

When a 3D texture is used to represent the scene, most of the data will be unused, since the actual surface is just a fraction of the total volume. A sparse voxel octree (SVO) is a voxel data structure that only stores information in voxels intersected by the surface. It is a tree structure where each node can have 8 children. Each node represents a voxel in space: the root node represents the whole scene as one voxel, each of its children represents an eighth of its parent's volume, and so on. The sparse nature of the SVO lies in that a node does not have to have children. The subdivision rule can vary between different applications of the data structure; in our implementation it is based on the voxel fragments. See figure 10 for a visualization of our octree.

Figure 10: Octree visualization on leaf-node level.

5.1 Implementation

In our implementation, the actual building of the octree is only done once, at startup. This limits the current implementation to static scenes, since moving objects will not be updated in the octree. Dynamic objects should be rasterized separately and stored in a way that makes them easy to remove at the end of each frame. A simple but brute-force solution could be to voxelize the whole scene, static and dynamic parts alike, every frame.

The octree is built in separate passes in a breadth-first manner, i.e., each level of the octree is completed in order, as described in [7]. Pseudo code for the building of our octree is listed below:

for each level l
    for each voxel fragment vf
        // Mark node based on the voxel coordinate of vf
        markNode(getVoxelCoordinate(vf));
    end for
    for all nodes n in l
        if (n.isMarked())
            n.createChildren();
    end for
end for

This is further illustrated in figure 11. Our octree implementation is made in C++, but the algorithm is parallel in nature: each voxel fragment can mark nodes independently, and each node can be subdivided in parallel. By creating the levels one by one, all nodes of the same level are stored close together in memory, which is good for the cache, since accesses to neighboring nodes are common.
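A serial C++ sketch of this breadth-first build, under the assumption of a 2^levels grid with leaf nodes at the deepest level; the names and layout are ours, not the report's exact implementation. Storing each level contiguously in one vector reflects the cache-friendly memory layout described above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct Node { int firstChild = -1; bool marked = false; };

// Breadth-first SVO build over voxel-fragment coordinates at the finest
// level, mirroring the pseudo code: per level, mark the nodes that contain
// a fragment, then give every marked node eight children.
struct SVO {
    std::vector<Node> nodes;
    int levels;  // tree depth; the grid is 2^levels voxels per axis

    int childOf(int n, int ci) const { return nodes[n].firstChild + ci; }

    // Child slot (0..7) chosen by fragment f when descending from level d.
    static int childIndex(const std::array<int, 3>& f, int d, int levels) {
        int s = levels - 1 - d;
        return ((f[0] >> s) & 1) | (((f[1] >> s) & 1) << 1) | (((f[2] >> s) & 1) << 2);
    }

    SVO(const std::vector<std::array<int, 3>>& frags, int levels_) : levels(levels_) {
        nodes.push_back({});  // root: the whole scene as one voxel
        for (int l = 0; l < levels; ++l) {
            // Pass 1: mark every level-l node containing a fragment.
            for (const auto& f : frags) {
                int n = 0;
                for (int d = 0; d < l; ++d) n = childOf(n, childIndex(f, d, levels));
                nodes[n].marked = true;
            }
            // Pass 2: give each marked, childless node eight children.
            std::size_t end = nodes.size();
            for (std::size_t n = 0; n < end; ++n)
                if (nodes[n].marked && nodes[n].firstChild < 0) {
                    nodes[n].firstChild = static_cast<int>(nodes.size());
                    nodes.insert(nodes.end(), 8, Node{});
                }
        }
    }
};
```

For an 8x8x8 grid (levels = 3) with fragments at opposite corners, only the nodes along the two paths are subdivided: 1 root + 8 + 16 + 16 = 41 nodes, instead of the 585 a full octree would hold.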

5.2 Pros and cons

The SVO reduces the memory requirement significantly compared to a 3D texture. The 3D texture, on the other hand, can make use of fast hardware interpolation. Crassin et al. [8] solved this problem by having each node store its 3x3x3 voxel neighbourhood in a local 3D texture instead of just a single voxel. This has problems as well, the most obvious being the higher memory requirement.

Both the 3D texture and the SVO are spatial data structures, which makes it easy to find neighbouring scene information. The SVO is more efficient when there is a lot of empty space between surfaces, since it can skip large empty areas quickly by traversing the octree. Finding neighbours can also be optimized by storing pointers to neighbours in each node, at the cost of more memory.


Figure 11: The procedure of building the octree.

6 Visualization

6.1 Geometry

To clarify, the actual triangle scene is rendered every frame and is lit using the information in the voxels. This makes the surface quality independent of the quality of the lighting, i.e., the actual scene geometry will never become blocky from grid artifacts. If artifacts from the grid appear, they show up in the lighting.

6.2 Grid - geometry shader

The SVO is visualized by sending the octree level to be shown, and the positions of the voxels in the SVO, to the GPU. The geometry shader then creates wireframe boxes centered on these positions, scaled appropriately for the specified octree level.

6.3 Phong shading and AO

A scene rendered with ambient occlusion only can look very appealing. However, when nothing but ambient occlusion exists, it is impossible to judge its contribution to the final rendering compared to a scene rendered without AO. Therefore, a basic Phong shader is also implemented and mixed into the result images. The final shading subtracts the occlusion value from the Phong-shaded pixel intensity instead of from a simple ambient intensity. The Phong shading can also be turned off completely.

6.4 GUI

To make it easier to test different settings for the rendering, we use the AntTweakBar extension for GLFW. This lets the user connect different parameters to a GUI and modify them during runtime. Being able to turn ambient occlusion on and off, adjusting the number of cast rays, how far they reach, and how much the AO should be blurred are some of the parameters we can change.

7 Results

The result of this project is an ambient occlusion pass renderer based on volume data. To show the difference between having ambient occlusion turned on and off, a Phong shader is also implemented. Any model can be loaded into the program and rasterized according to the described method. The Sparse Voxel Octree is built on the CPU and remains there; thus it is not used for the AO render pass, but rather for visualizing the geometry-filled volume data. For this, wireframe cubes are created in a geometry shader and can be turned on and off, as well as scaled according to octree level, during runtime. We can also adjust the parameters of the AO shader, such as the number of rays and the amount of blur.

The figures below show the result in somedifferent aspects.


(a) Assassin's Creed - Altair: 377,000 triangles.

(b) The Sibenik Cathedral: 1,200,000 triangles.

Figure 12: Two different renderings with ambient occlusion only.


Figure 13: Some different renderings comparing Phong shading only (left) with Phong shading and ambient occlusion combined (right).


Figure 14: Illustration of some different SVO levels.

Figure 15: The SVO, visualized at its deepest level.


8 Conclusion

In this paper we have described a volume-based ambient occlusion model. This approach utilizes the standard graphics pipeline to rasterize the scene geometry into a three-dimensional texture, as described in [8], [4] and [6]. We have seen that this method produces visually pleasing results but suffers both from grid artifacts and slow computation. It is nevertheless interesting to see that the computing power of today's GPUs can produce ray-traced ambient occlusion in real time. We have also seen that the graphics pipeline can effectively be used to voxelize surface meshes. This has been made possible through OpenGL 4.2's image load/store functionality, which makes it possible to both write to and read from arbitrary buffers or textures inside a shader program. The major conclusion we draw is that volume-based ambient occlusion is a feasible approach, but much work remains to speed up the ray marching step, which is the main bottleneck of the method.

9 Future work

SVO on the GPU To speed up the ambient occlusion calculations and to save a lot of memory, the Sparse Voxel Octree could be implemented on the GPU. This would mean using the SVO for the occlusion lookups instead of ray marching a 3D texture, and would most easily be done in a compute shader, available since OpenGL 4.3.

Implement voxel cone tracing Voxel cone tracing [8] is a method that reduces the time needed to calculate an intensity for a pixel when rays are used to gather information. Instead of sending out many rays for each fragment, the idea is to send out a few rays that approximate the result as well as possible. These rays can be seen as cones with their apexes at the surface. The radius of a cone then determines at which voxel resolution the ray should sample, see figure 16.

Since the rays sample lower and lower resolutions as they travel farther from the surface, the method provides automatic level-of-detail rendering. The mipmaps are simply averages of lower resolutions, which can also be used for blur effects. Voxel cone tracing works very well with ambient occlusion. In our implementation, each fragment spawns a number of rays that sample a 3D texture. The conversion would be to spawn fewer rays per fragment and instead sample the appropriate mipmap level based on the radius of the cone.

Figure 16: Visualization of voxel cone tracing using mipmapped textures.
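One way the cone width could be mapped to a mipmap level is sketched below (Python for clarity; the parameter names are hypothetical and a real implementation would evaluate this per sample inside the shader). A cone with half-angle θ has diameter 2·d·tan(θ) at distance d from its apex, and we pick the level where one voxel covers roughly that diameter, with level 0 being the finest resolution:

```python
import math

def mip_level_for_cone(distance, half_angle, base_voxel_size, max_level):
    """Choose the mipmap level whose voxel footprint matches the cone width.

    distance        -- distance from the cone apex along the ray
    half_angle      -- cone half-angle in radians
    base_voxel_size -- voxel size at the finest level (level 0)
    max_level       -- coarsest available mipmap level
    """
    diameter = 2.0 * distance * math.tan(half_angle)
    if diameter <= base_voxel_size:
        return 0.0  # cone is still narrower than one fine voxel
    # Each mip level doubles the voxel size, hence the log2.
    level = math.log2(diameter / base_voxel_size)
    return min(level, float(max_level))
```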

The naive way of implementing this would be to use a high-resolution 3D texture and its mipmaps. The problem with this approach is of course the memory requirement. A more sophisticated way would be to use the sparse voxel octree data structure described in section 5. A modification to our SVO would be to store scene information in all nodes instead of only the leaf nodes, where each interior node gets its value by averaging the values of its children [8].
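Such an averaging pass over the octree could be sketched as follows (Python for clarity; the Node layout is hypothetical, and treating missing children as empty, unoccluded space when dividing by eight is our assumption):

```python
class Node:
    """One SVO node; interior nodes hold up to eight children."""
    def __init__(self, occlusion=0.0, children=None):
        self.occlusion = occlusion      # leaf: voxel value; interior: filled below
        self.children = children or []

def fill_interior_values(node):
    """Post-order pass storing averaged child values in interior nodes.

    Children are processed first, then averaged into the parent.
    Missing children are treated as empty space (occlusion 0),
    hence the division by 8.
    """
    if node.children:
        node.occlusion = sum(fill_interior_values(c)
                             for c in node.children) / 8.0
    return node.occlusion
```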

Shooting a ray from a voxel fragment out into the scene would then correspond to sampling octree nodes closer and closer to the root, which means that the accuracy of the check for occluding geometry becomes progressively lower. This also naturally helps to smooth out voxel artifacts in the AO.

References

[1] M. Mittring, "Finding next gen: CryEngine 2," in ACM SIGGRAPH 2007 courses, SIGGRAPH '07, (New York, NY, USA), pp. 97-121, ACM, 2007.

[2] L. Bavoil, M. Sainz, and R. Dimitrov, "Image-space horizon-based ambient occlusion," in ACM SIGGRAPH 2008 talks, SIGGRAPH '08, (New York, NY, USA), pp. 22:1-22:1, ACM, 2008.

[3] G. Papaioannou, M. Menexi, and C. Papadopoulos, "Real-time volume-based ambient occlusion," Visualization and Computer Graphics, IEEE Transactions on, vol. 16, pp. 752-762, Sept.-Oct. 2010.

[4] M. Pharr and R. Fernando, GPU Gems 2: Programming Techniques for High-Performance Graphics and General-Purpose Computation (GPU Gems). Addison-Wesley Professional, 2005.

[5] H. Nguyen, GPU Gems 3. Addison-Wesley Professional, first ed., 2007.

[6] L. Zhang, W. Chen, D. S. Ebert, and Q. Peng, "Conservative voxelization," Springer-Verlag, 2007.

[7] C. Crassin and S. Green, "Octree-based sparse voxelization using the GPU hardware rasterizer," in OpenGL Insights (P. Cozzi and C. Riccio, eds.), pp. 303-319, CRC Press, July 2012. http://www.openglinsights.com/.

[8] C. Crassin, F. Neyret, M. Sainz, S. Green, and E. Eisemann, "Interactive indirect illumination using voxel cone tracing," Computer Graphics Forum (Proceedings of Pacific Graphics 2011), vol. 30, Sep. 2011.
