
Fundamentals of Computer Graphics

Peter Shirley
School of Computing

University of Utah

March 10, 2002



Contents

1 Surface Shading
  1.1 Diffuse Shading
    1.1.1 Lambertian Shading Model
    1.1.2 Ambient Shading
    1.1.3 Vertex-based Diffuse Shading
  1.2 Phong Shading
    1.2.1 Phong Lighting Model
    1.2.2 Surface Normal Vector Interpolation
  1.3 Artistic Shading
    1.3.1 Line Drawing
    1.3.2 Cool-to-Warm Shading

2 Ray Tracing
  2.1 The Basic Ray Tracing Algorithm
  2.2 Computing Viewing Rays
  2.3 Ray-Object Intersection
    2.3.1 Ray–Sphere Intersection
    2.3.2 Ray–Triangle Intersection
    2.3.3 Ray–Polygon Intersection
  2.4 A Ray Tracing Program
    2.4.1 Object-oriented design for a ray tracing program
  2.5 Shadows
  2.6 Specular Reflection
  2.7 Refraction
  2.8 Instancing
  2.9 Sub-linear Ray–Object Intersection
    2.9.1 Bounding Boxes
    2.9.2 Hierarchical Bounding Boxes
    2.9.3 Uniform Spatial Subdivision
  2.10 Constructive Solid Geometry
  2.11 Distribution Ray Tracing
    2.11.1 Antialiasing


    2.11.2 Soft Shadows
    2.11.3 Depth of Field
    2.11.4 Glossy Reflection

3 Texture Mapping
  3.1 3D Texture Mapping
    3.1.1 3D Stripe Textures
    3.1.2 Texture Arrays
    3.1.3 Solid Noise
    3.1.4 Turbulence
  3.2 2D Texture Mapping
  3.3 Tessellated Models
  3.4 Texture Mapping for Rasterized Triangles
    3.4.1 Perspective Correct Textures
  3.5 Bump Textures
  3.6 Displacement Mapping

4 Data Structures for Graphics
  4.1 Triangle Meshes
    4.1.1 Winged-Edge Data Structure
  4.2 Scene Graphs
  4.3 Tiling Multidimensional Arrays
    4.3.1 One-level Tiling for 2D Arrays
    4.3.2 Example: Two-level Tiling for 3D Arrays


Chapter 1

Surface Shading

To make objects look more “volumetric” it can help to use shading, where the surface is “painted” with light. This chapter presents the most common heuristic shading methods. The first two, diffuse and Phong shading, were developed in the 1970s and are available in most graphics libraries. The last, artistic shading, uses artistic conventions to assign color to objects. This creates images reminiscent of technical drawings, which is desirable in many applications.

1.1 Diffuse Shading

Many objects in the world have a surface appearance loosely described as “matte”, indicating that the object is not at all shiny. Examples include paper, unfinished wood, and dry unpolished stones. To a large degree, such objects do not have their color change with viewpoint. For example, if you stare at a particular point on a piece of paper and move while keeping your gaze fixed on that point, the color at that point will stay relatively constant. Such matte objects can be idealized as behaving as Lambertian objects. This section discusses how to implement the shading of such objects.

1.1.1 Lambertian Shading Model

Figure 1.1: The geometry for Lambert’s Law. Both n and l are unit vectors.

A Lambertian object obeys Lambert’s Law, which states that the color c of a surface is proportional to the cosine of the angle between the surface normal and the direction to the light source:

c ∝ cos θ,

or in vector form:

c ∝ n · l,

where n and l are shown in Figure 1.1. This essentially says the color on the surface will vary according to the cosine of the angle between the surface normal


and the light direction. Note that the vector l is typically assumed not to depend on object location. That assumption is equivalent to assuming the light is “distant” relative to object sizes. Such a light is often called a directional light because its position is specified by a direction only.

Figure 1.2: When a surface points away from the light it should receive no light. This case can be detected by checking whether the dot product of l and n is negative.

A surface can be made lighter or darker by changing the intensity of the light source or the reflectance of the surface. The reflectance cr is the fraction of light reflected by the surface. This fraction will be different for different color components. For example, a red surface will reflect a higher fraction of red incident light than blue incident light. This means that cr should be an RGB color. If we assume surface color is proportional to the light reflected from a surface, then the diffuse reflectance cr should be multiplied in:

c ∝ cr n · l. (1.1)

The right-hand side of Equation 1.1 is an RGB color with all RGB components in the range [0, 1]. We would like to add the effects of light color while keeping the RGB components in the range [0, 1]. This suggests adding an RGB light color cl which itself has components in the range [0, 1]:

c = cr cl n · l. (1.2)

This is a very convenient form, but it can produce RGB components for c that are outside the range [0, 1] because the dot product can be negative. The dot product is negative when the surface is pointing away from the light, as shown in Figure 1.2.

The “max” function can be added to Equation 1.2 to test for that case:

c = cl cr max(0, n · l). (1.3)

Another way to deal with the negative light is to use an absolute value:

c = cl cr |n · l|. (1.4)

While Equation 1.4 may seem physically implausible, it actually corresponds to Equation 1.3 with two lights in opposite directions. For this reason it is often called two-sided lighting (Figure 1.3).

Figure 1.3: Using Equation 1.4, the two-sided lighting formula, is equivalent to assuming two opposing light sources of the same color.

1.1.2 Ambient Shading

One problem with the diffuse shading of Equation 1.3 is that any point whose normal faces away from the light will be black. In real life, light is reflected all over, and some light is incident from every direction. In addition, there is often skylight giving “ambient” lighting. One way to handle this is to use many light sources. A common trick is to always put a dim source at the eye so that all visible points will receive some light. Another is to use the two-sided lighting of Equation 1.4. A more common approach is to add an ambient term. This is just a constant color term added to Equation 1.3:

c = cr (ca + cl max(0, n · l)).
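The diffuse-plus-ambient rule above can be sketched in a few lines. This is an illustrative Python fragment, not code from the text; the function name, argument order, and tuple-based RGB representation are my own assumptions.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lambertian_shade(cr, ca, cl, n, l):
    """Diffuse shading with an ambient term; n and l must be unit vectors."""
    d = max(0.0, dot(n, l))  # clamp negative n·l: surface faces away from light
    # per channel: c = cr * (ca + cl * max(0, n·l)), then clamp to [0, 1]
    return tuple(min(1.0, r * (a + c * d)) for r, a, c in zip(cr, ca, cl))

# Example: a reddish surface lit head-on by white light with dim gray ambient
color = lambertian_shade(cr=(0.9, 0.3, 0.3), ca=(0.1, 0.1, 0.1),
                         cl=(0.8, 0.8, 0.8), n=(0.0, 0.0, 1.0), l=(0.0, 0.0, 1.0))
```

When the light is behind the surface, the max() zeroes the diffuse term and only the ambient contribution cr·ca remains.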


Figure 1.4: A circle (left) is approximated by an octagon (right). Vertex normals record the surface normal of the original curve.

Intuitively, you can think of the ambient color ca as the average color in the scene, not counting light sources. If you want to ensure that the computed RGB color stays in the range [0, 1]^3, then ca + cl ≤ (1, 1, 1). Otherwise your code should “clamp” RGB values above one to one.

1.1.3 Vertex-based Diffuse Shading

If we apply Equation 1.1 to an object made of triangles, we will typically get a faceted appearance. Often the triangles are an approximation to a smooth surface. To avoid the faceted appearance, we can place surface normal vectors at the vertices of the triangles and apply Equation 1.3 at each vertex using its normal vector. This gives a color at each triangle vertex, and this color can be interpolated using the barycentric interpolation described in Section ??.

One problem with shading at triangle vertices is that we need to get the normals from somewhere. Many models come with normals supplied. If you tessellate your own smooth model, you can create normals when you create the triangles. If you are presented with a polygonal model that does not have normals at its vertices and you want to shade it smoothly, you can compute normals by a variety of heuristic methods. The simplest is to just average the normals of the triangles that share each vertex and use this average normal at the vertex. This average normal will not automatically be unit length, so you should convert it to a unit vector before using it for shading.
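The averaging heuristic can be sketched as follows. This is an assumed mesh layout (a vertex list plus index triples), not one prescribed by the text; the accumulated face normals are area-weighted because the cross product is left unnormalized, which is one common variant of the heuristic.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

def vertex_normals(verts, tris):
    """Average the face normals of the triangles sharing each vertex,
    then convert each average to a unit vector."""
    acc = [(0.0, 0.0, 0.0)] * len(verts)
    for ia, ib, ic in tris:
        fn = cross(sub(verts[ib], verts[ia]), sub(verts[ic], verts[ia]))
        for i in (ia, ib, ic):
            acc[i] = tuple(a + f for a, f in zip(acc[i], fn))
    return [normalize(a) for a in acc]
```

For a flat patch of triangles, every vertex normal comes out equal to the shared face normal, as expected.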


1.2 Phong Shading

Some surfaces are essentially like matte surfaces, but have highlights. Examples of such surfaces include polished tile floors, gloss paint, and whiteboards. Highlights move across a surface as the viewpoint moves, which means we must add a unit vector e toward the eye into our equations. If you look carefully at highlights, you will see that they are really reflections of the light; sometimes these reflections are blurred. The color of these highlights is the color of the light; the surface color seems to have little effect. This is because the reflection occurs at the object surface, while the light that penetrates the surface and picks up object color is scattered diffusely.

1.2.1 Phong Lighting Model

Figure 1.5: The geometry for the Phong illumination model. The eye should see a highlight if σ is small.

We would like to add a fuzzy “dot” of the color of the light source in the right place. The center of the dot should be drawn where the direction e to the eye “lines up” with the natural direction of reflection r, as shown in Figure 1.5. Here “lines up” is mathematically “where σ is zero”. We would like the highlight to have some non-zero area, so the eye should see some highlight wherever σ is small. Given r, we’d like a heuristic function that is bright when e = r and falls off gradually as e moves away from r. An obvious candidate is the cosine of the angle between them:

c = cl (e · r).

There are two problems with this equation. The first is that the dot product can become negative. This can be solved with an if statement that sets the color to zero when the dot product is negative. The more serious problem is that this highlight is much wider than those seen in real life. The maximum is in the right place and is the right color, but it is just too big. We can narrow it without reducing its maximum color by taking it to a power:

c = cl max(0, e · r)^p. (1.5)

Here p is the Phong exponent, a positive real number. To implement Equation 1.5, we first need to compute the unit vector r. Given unit vectors l and n, r is the vector l reflected about n. Figure 1.6 shows that this vector is:

r = −l + 2(l · n)n, (1.6)

where the dot product is used to compute cos θ.

Figure 1.6: The geometry for calculating the vector r.

An alternative heuristic model based on Equation 1.5 eliminates the need to check for negative values of the number used as a base for exponentiation. Instead of r, we compute h, the unit vector halfway between l and e (Figure 1.7):

h = (e + l) / ‖e + l‖.


The highlight should occur when h is near n, i.e., when cos ω = h · n is near one. This suggests the rule:

c = cl (h · n)^p. (1.7)

The exponent p here will have analogous control behavior to the exponent in Equation 1.5, but the angle between h and n is half the size of the angle between e and r, so the details will be slightly different. The advantage of using the cosine between n and h is that it is always positive for eye and light above the plane. The disadvantage is that a square root and a divide are needed to compute h.

Figure 1.7: The unit vector h is halfway between l and e.

In practice, we want most materials to have a diffuse appearance in addition to a highlight. We can combine Equations 1.3 and 1.7 to get:

c = cr (ca + cl max(0, n · l)) + cl (h · n)^p. (1.8)

If we want to allow the user to dim the highlight, we can add a control term cp:

c = cr (ca + cl max(0, n · l)) + cl cp (h · n)^p. (1.9)

The term cp is an RGB color, which allows us to change highlight colors. This is useful for metals, where cp = cr, because highlights on metal take on a metallic color.
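Equation 1.9 can be sketched directly. This Python fragment is illustrative, not from the text; the function name is my own, and I clamp h · n to zero defensively even though the text notes it is positive whenever both eye and light are above the surface plane.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def phong_shade(cr, ca, cl, cp, n, l, e, p):
    """Equation 1.9: ambient + diffuse terms plus a highlight scaled by cp.
    n, l, e are unit vectors; h is the half-vector between l and e."""
    h = normalize(tuple(li + ei for li, ei in zip(l, e)))
    diff = max(0.0, dot(n, l))
    spec = max(0.0, dot(h, n)) ** p       # Phong exponent p narrows the highlight
    return tuple(r * (a + c * diff) + c * k * spec
                 for r, a, c, k in zip(cr, ca, cl, cp))
```

With the light and eye both along the normal, the diffuse term reaches its maximum and the highlight sits exactly under the viewer.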

1.2.2 Surface Normal Vector Interpolation

Smooth surfaces with highlights tend to have their color change quickly compared to Lambertian surfaces with the same geometry. Thus shading only at the vertices can generate disturbing artifacts.

These problems can be reduced by interpolating the normal vectors across the polygon and then applying Phong shading at each pixel. This allows you to get good images without making the triangles extremely small. Recall from Chapter ?? that when rasterizing a triangle, we compute barycentric coordinates (α, β, γ) to interpolate the vertex colors c0, c1, c2:

c = αc0 + βc1 + γc2. (1.10)

We can use the same equation to interpolate surface normals n0, n1, and n2:

n = αn0 + βn1 + γn2. (1.11)

Equation 1.9 can then be evaluated for the n computed at each pixel. Note that the n resulting from Equation 1.11 is usually not a unit vector. Better visual results will be achieved if it is converted to a unit vector before it is used in shading computations. This type of normal interpolation is often called Phong normal interpolation.
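Equations 1.11 and the renormalization step can be sketched together. This is an illustrative helper, with an assumed name; it would be called once per pixel with that pixel's barycentric coordinates.

```python
import math

def interp_normal(alpha, beta, gamma, n0, n1, n2):
    """Equation 1.11 followed by renormalization: barycentric interpolation
    of unit normals generally does not yield a unit-length result."""
    n = tuple(alpha * a + beta * b + gamma * c for a, b, c in zip(n0, n1, n2))
    length = math.sqrt(sum(x * x for x in n))
    return tuple(x / length for x in n)
```

Interpolating halfway between two perpendicular unit normals, for example, gives a vector of length 1/√2 before normalization, which is why the division is needed.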


1.3 Artistic Shading

The Lambertian and Phong shading methods are based on heuristics designed to imitate the appearance of objects in the real world. Artistic shading instead mimics drawings made by human artists. Such shading seems to have advantages in many applications. For example, auto manufacturers hire artists to draw diagrams for car owners’ manuals. This is more expensive than using much more “realistic” photographs, so there is probably some intrinsic advantage to the techniques of artists when certain types of communication are needed. In this section we show how to make subtly shaded line drawings reminiscent of human-drawn images. Creating such images is often called non-photorealistic rendering, but we will avoid that term because many non-photorealistic techniques are used for efficiency and are not related to any artistic practice.

1.3.1 Line Drawing

The most obvious thing we see in human drawings that we don’t see in real life is silhouettes: lines drawn between the foreground and the background of an image. When we have a set of triangles with shared edges, we should draw an edge as a silhouette when one of the two triangles sharing the edge faces toward the viewer and the other faces away from the viewer. This condition can be tested for two normals n0 and n1 by:

draw silhouette if (e · n0)(e · n1) ≤ 0.

Here e is a vector from the edge to the eye; it can originate at any point on the edge or on either of the triangles. Alternatively, if fi(p) = 0 are the implicit plane equations for the two triangles, the test can be written:

draw silhouette if f0(e)f1(e) ≤ 0.

We would also like to draw the visible edges of a polygonal model. To do this, we can use either of the hidden surface methods of Chapter ??, drawing in the background color, and then draw the outlines of each triangle in black. This will in fact also capture the silhouettes. Unfortunately, if the polygons represent a smooth surface, we really don’t want to draw most of these edges. However, we might want to draw all creases, where there really is a corner in the geometry. We can test for creases by using a heuristic threshold:

draw crease if (n0 · n1) ≤ threshold.

This, combined with the silhouette test, will give nice-looking line drawings.
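The two tests above can be sketched as predicates. These are illustrative Python helpers, not code from the text; the default crease threshold here is an arbitrary choice that would be tuned per model.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def is_silhouette(e, n0, n1):
    """Edge is a silhouette when the two adjacent triangles face opposite
    ways relative to the eye vector e (taken from any point on the edge)."""
    return dot(e, n0) * dot(e, n1) <= 0

def is_crease(n0, n1, threshold=0.7):
    """Edge is a crease when the adjacent face normals differ enough.
    The threshold (a cosine) is a heuristic, tuned per model."""
    return dot(n0, n1) <= threshold
```

A renderer would walk every shared edge of the mesh and draw it in black if either predicate holds.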

1.3.2 Cool-to-Warm Shading

When artists shade line drawings, they often use low-intensity shading to give some impression of curvature and to give colors to objects. Surfaces


facing in one direction are shaded with a cool color, such as a blue, and surfaces facing in the opposite direction are shaded with a warm color, such as orange. Typically these colors are not very saturated and are also not dark, so the black silhouettes show up nicely. Overall this gives a cartoon-like effect. It can be achieved by setting up a direction to a “warm” light l and using the cosine to modulate color, where the warmth constant kw is defined on [0, 1]:

kw = (1 + n · l) / 2.

The color c is then just a linear blend of the cool color cc and the warm color cw:

c = kw cw + (1 − kw) cc.

There are many possible cw and cc that will produce reasonable-looking results. A good starting place for a guess is:

cc = (0.4, 0.4, 0.7),
cw = (0.8, 0.6, 0.6).

Frequently Asked Questions

All of the shading in this chapter seems like enormous hacks. Is that true?

Yes. However, they are carefully designed hacks that have proven useful in practice. In the long run we will probably have better-motivated algorithms that include physics, psychology, and tone-mapping. However, the improvements in image quality will probably be incremental.

I hate calling pow(). Is there a way to avoid it?

A simple way is to only use exponents that are themselves a power of two, i.e., 2, 4, 8, 16, ..., which can be computed by repeated squaring. In practice this is not a problematic restriction for most applications. A lookup table is also possible, but will often not give a large speedup.

Notes

Diffuse shading was introduced to graphics in Continuous Shading of Curved Surfaces (Gouraud, Communications of the ACM, June 1971). Phong illumination and normal interpolation were introduced in Illumination for Computer-Generated Images (Phong, Communications of the ACM, June 1975). The alternate half-angle form is from Lighting Controls for Synthetic Images (Warn, SIGGRAPH 1983). The book Non-Photorealistic Rendering (Gooch and Gooch, AK Peters, 2001) discusses the work called “artistic shading” in this chapter, as well as other art-inspired methods of rendering.


Exercises

1. The Moon is poorly approximated by diffuse or Phong shading. What observations tell you that this is true?

2. Velvet is poorly approximated by diffuse or Phong shading. What observations tell you that this is true?

3. Why do most highlights on plastic objects look white, while those on goldmetal look gold?


Chapter 2

Ray Tracing

Ray tracing is a way to find which surfaces are visible through each pixel in an image. Unlike the z-buffer and BSP tree, ray tracing operates pixel-by-pixel rather than primitive-by-primitive. This tends to make ray tracing relatively slow for scenes with large objects in screen space. However, it has a variety of nice features which often make it the right choice for batch rendering and even some interactive applications. Ray tracing’s primary benefit is that it is relatively straightforward to compute shadows and reflections. In addition, ray tracing is well-suited to “walkthroughs” of extremely large models due to its low asymptotic time complexity, which makes up for the required preprocessing of the model.

In an interactive 3D program implemented in a conventional z-buffer environment, it is often useful to be able to select an object using a mouse. The mouse is clicked at pixel (i, j), and the “picked” object is whatever object is “seen” through that pixel. If the rasterization process includes an object identification buffer, this is just a matter of looking up the value at pixel (i, j) of that buffer. However, if that buffer is not available, we can determine the visible object by brute-force geometric computation using a “ray intersection test”. In this way ray tracing is useful even to programmers who use only standard graphics APIs.

This chapter also discusses distribution ray tracing, where multiple random rays are sent through each pixel in an image to simultaneously solve the antialiasing, soft shadow, fuzzy reflection, and depth-of-field problems.

2.1 The Basic Ray Tracing Algorithm

The simplest use of ray tracing is to produce images similar to those produced by the z-buffer and BSP-tree algorithms. Fundamentally, those methods make sure the appropriate object is “seen” through each pixel, and that the pixel color is shaded based on that object’s material properties, the surface normal seen through that pixel, and the light geometry.


Figure 2.1: The 3D window we look through is the same as in Chapter ??. The borders of the window have simple coordinates in the uvw coordinate system with respect to origin e.

Figure 2.2: The sample points on the screen are mapped to a similar array on the 3D window. A viewing ray is sent to each of these locations.


Figure 2.3: The ray is “traced” into the scene, and the first object hit is the one seen through the pixel. In this case the triangle T2 is returned.

Figure 2.1 shows the basic viewing geometry for ray tracing, which is the same as we saw earlier in Chapter ??. The geometry is aligned to a uvw coordinate system with the origin at the eye location e. The key idea in ray tracing is to associate locations on the view plane at w = n that correspond to pixel centers, as shown in Figure 2.2. A “ray”, really just a directed 3D line, is then sent from e through each such point. We then “gaze” in the direction of the ray to find the first object seen in that direction. This is shown in Figure 2.3, where the ray intersects two triangles, but only the first triangle hit, T2, is returned. The structure of the basic ray tracing program is:

compute u, v, w basis vectors
for each pixel do
    compute viewing ray
    find first object hit by ray and its surface normal n
    set pixel color to value based on material, light, and n

The pixel color can be computed using the shading equations of the last chapter.
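The loop above can be sketched in Python. Everything scene-specific here is a stand-in: the "scene" is a single unit sphere, rays are orthographic along −w to keep the sketch self-contained, and the "shading" is just white-on-hit. A real tracer would substitute the perspective rays of Section 2.2, the intersection code of Section 2.3, and the shading equations of Chapter 1.

```python
def trace_image(nx, ny):
    """Render an nx-by-ny grayscale image of a unit sphere at the origin,
    using orthographic rays through a [-2, 2]^2 window."""
    image = []
    for j in range(ny):
        row = []
        for i in range(nx):
            # viewing ray: origin at the pixel center on the window,
            # direction (0, 0, -1) for every pixel (orthographic)
            ox = -2.0 + 4.0 * (i + 0.5) / nx
            oy = -2.0 + 4.0 * (j + 0.5) / ny
            # hit test: such a ray hits the unit sphere iff ox^2 + oy^2 <= 1
            hit = ox * ox + oy * oy <= 1.0
            row.append(1.0 if hit else 0.0)   # "shade": white if hit
        image.append(row)
    return image
```

Pixels near the center of the window see the sphere; corner pixels miss it.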

2.2 Computing Viewing Rays

Figure 2.4: The ray from the eye to a point on the screen.

First we must choose a mathematical representation for a ray. A ray is just an origin point and a propagation direction; a 3D parametric line is ideal for this. As discussed in Section ??, the 3D parametric line from the eye e to a point s on the screen (Figure 2.4) is given by:

p(t) = e + t(s− e).

This should be read, “we advance from e along the vector (s − e) a fractional distance t to find the point p.” So given t, we have a point p. Note that p(0) = e,


and p(1) = s. Also note that for positive t, if t1 < t2, then p(t1) is closer to the eye than p(t2), and if t < 0, then p(t) is “behind” the eye. These facts will be useful when we search for the closest object hit by the ray that is not behind the eye. Note that we are overloading the variable t here, which is also used for the top of the screen’s v coordinate.

To compute a viewing ray, we need to know e, which is given, and s. Finding s may look somewhat difficult. In fact, it is relatively straightforward using the same transform machinery we used for viewing in the context of projecting lines and triangles. First, we find the coordinates of s in the uvw coordinate system with origin e. For all points on the screen, ws = n, as shown in Figure 2.2. The uv coordinates are found by the windowing transform that takes [−0.5, nx − 0.5] × [−0.5, ny − 0.5] to [l, r] × [b, t]:

us = l + (r − l)(i + 0.5)/nx,
vs = b + (t − b)(j + 0.5)/ny,

where (i, j) are the pixel indices. This gives us s in uvw coordinates. By definition, we can convert to canonical coordinates:

s = e + us u + vs v + ws w. (2.1)

Alternatively, we could use the matrix form (Equation ??):

[xs]   [1 0 0 xe] [xu xv xw 0] [us]
[ys] = [0 1 0 ye] [yu yv yw 0] [vs]
[zs]   [0 0 1 ze] [zu zv zw 0] [ws]    (2.2)
[1 ]   [0 0 0 1 ] [0  0  0  1] [1 ]

which is just the matrix form of Equation 2.1. We can compose this with the windowing transform in matrix form if we wish, but this is probably not worth doing unless you like the matrix form of equations better.
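The windowing transform and Equation 2.1 together can be sketched as a single function. This is illustrative Python, not code from the text; the camera frame (e, u, v, w) and the window bounds l, r, b, t, n are assumed given, as in the text.

```python
def viewing_ray(i, j, nx, ny, e, u, v, w, l, r, b, t, n):
    """Map pixel (i, j) of an nx-by-ny image to a point s on the w = n plane,
    and return the ray as (origin, direction) = (e, s - e)."""
    us = l + (r - l) * (i + 0.5) / nx     # windowing transform for u
    vs = b + (t - b) * (j + 0.5) / ny     # windowing transform for v
    ws = n
    # s = e + us*u + vs*v + ws*w  (Equation 2.1)
    s = tuple(ei + us * ui + vs * vi + ws * wi
              for ei, ui, vi, wi in zip(e, u, v, w))
    d = tuple(si - ei for si, ei in zip(s, e))
    return e, d
```

With the canonical frame u = x, v = y, w = z and the eye at the origin, the ray direction is simply (us, vs, n).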

2.3 Ray-Object Intersection

If we are given a ray e + td, we want to find the first intersection with any object where t > 0. It will later prove useful to solve the slightly more general problem of finding the first intersection in an interval [t0, t1], using [0, ∞] for viewing rays. We solve this for both spheres and triangles in this section. In the next section, multiple objects are discussed.

2.3.1 Ray–Sphere Intersection

Given a ray p(t) = e + td and an implicit surface f(p) = 0, we’d like to know where they intersect. The intersection points occur when points on the ray satisfy


the implicit equation f(p(t)) = 0.

This is just f(e + td) = 0.

A sphere with center c = (cx, cy, cz) and radius R can be represented by the implicit equation

(x − cx)^2 + (y − cy)^2 + (z − cz)^2 − R^2 = 0.

We can write this same equation in vector form:

(p − c) · (p − c) − R^2 = 0.

Any point p that satisfies this equation is on the sphere. If we plug points on the ray p(t) = e + td into this equation, we can solve for the values of t on the ray that yield points on the sphere:

(e + td − c) · (e + td − c) − R^2 = 0.

Moving terms around yields

(d · d)t^2 + 2d · (e − c)t + (e − c) · (e − c) − R^2 = 0.

Here everything is known except the parameter t, so this is a classic quadratic equation in t, meaning it has the form

At^2 + Bt + C = 0.

The solution to this equation is discussed in Section ??. The term under the square root sign in the quadratic solution, B^2 − 4AC, is called the discriminant and tells us how many real solutions there are. If the discriminant is negative, its square root is imaginary and there are no intersections between the sphere and the line. If the discriminant is positive, there are two solutions: one where the ray enters the sphere and one where it leaves. If the discriminant is zero, the ray grazes the sphere, touching it at exactly one point. Plugging in the actual terms for the sphere and eliminating the common factors of two, we get:

t = (−d · (e − c) ± √((d · (e − c))^2 − (d · d)((e − c) · (e − c) − R^2))) / (d · d).

In an actual implementation, you should first check the value of the discriminant before computing other terms. If the sphere is used only as a bounding object for more complex objects, then we need only determine whether we hit it, and checking the discriminant suffices.

As discussed in Section ??, the normal vector at point p is given by the gradient n = 2(p − c). The unit normal is (p − c)/R.
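The solved quadratic above translates directly into code. This is an illustrative Python sketch; B here is the half-coefficient d · (e − c), matching the form with the common factors of two eliminated, and the discriminant is checked first as the text suggests.

```python
import math

def ray_sphere(e, d, c, R, t0=0.0, t1=float("inf")):
    """Solve (e + t d - c)·(e + t d - c) = R^2 for the smallest t in [t0, t1].
    Returns t, or None on a miss."""
    ec = tuple(ei - ci for ei, ci in zip(e, c))          # e - c
    A = sum(di * di for di in d)                         # d · d
    B = sum(di * eci for di, eci in zip(d, ec))          # d · (e - c)
    C = sum(eci * eci for eci in ec) - R * R             # (e-c)·(e-c) - R^2
    disc = B * B - A * C                                 # discriminant (2s removed)
    if disc < 0:
        return None                                      # ray misses the sphere
    root = math.sqrt(disc)
    for t in ((-B - root) / A, (-B + root) / A):         # entering hit tried first
        if t0 <= t <= t1:
            return t
    return None
```

Trying the smaller root first returns the point where the ray enters the sphere; the larger root is only used when the entering hit falls outside [t0, t1] (for example, when the ray starts inside the sphere).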


2.3.2 Ray–Triangle Intersection

There are many algorithms for computing ray–triangle intersections. We will usethe form that uses barycentric coordinates for the parametric plane containing thetriangle because it requires no long-term storage other than the vertices of thetriangle.

To intersect a ray with a parametric surface, we set up a system of equations where the Cartesian coordinates all match:

xe + t xd = f(u, v),
ye + t yd = g(u, v),
ze + t zd = h(u, v).

Here we have three equations and three unknowns (t, u, and v), so we can numerically solve for the unknowns. If we are lucky, we can solve for them analytically.

Figure 2.5: The ray hits the plane containing the triangle at point p.

In the case where the parametric surface is a parametric plane, the parametric equation can be written in vector form, as discussed in Section ??. If the vertices of the triangle are a, b, and c, then the intersection will occur when:

e + td = a + β(b− a) + γ(c− a). (2.3)

The hitpoint p will be at e + td, as shown in Figure 2.5. Again from Section ??, we know the hitpoint is in the triangle if and only if β > 0, γ > 0, and β + γ < 1. Otherwise the ray hits the plane outside the triangle. If there are no solutions, either the triangle is degenerate or the ray is parallel to the plane containing the triangle.

To solve for t, β, and γ in Equation 2.3, we expand it from its vector form into the three equations for the three coordinates:

xe + t xd = xa + β(xb − xa) + γ(xc − xa),
ye + t yd = ya + β(yb − ya) + γ(yc − ya),
ze + t zd = za + β(zb − za) + γ(zc − za).

This can be rewritten as a standard linear system:

[xa − xb   xa − xc   xd] [β]   [xa − xe]
[ya − yb   ya − yc   yd] [γ] = [ya − ye]
[za − zb   za − zc   zd] [t]   [za − ze]

The fastest classic method to solve this 3×3 linear system is Cramer’s Rule. This gives us the solutions:

β =
| xa − xe   xa − xc   xd |
| ya − ye   ya − yc   yd |  /  |A| ,
| za − ze   za − zc   zd |


γ =
| xa − xb   xa − xe   xd |
| ya − yb   ya − ye   yd |  /  |A| ,
| za − zb   za − ze   zd |

t =
| xa − xb   xa − xc   xa − xe |
| ya − yb   ya − yc   ya − ye |  /  |A| ,
| za − zb   za − zc   za − ze |

where the matrix A is

    [ xa − xb   xa − xc   xd ]
A = [ ya − yb   ya − yc   yd ]
    [ za − zb   za − zc   zd ]

and |A| denotes the determinant of A. The 3×3 determinants have common subterms that can be exploited. Looking at the linear system with dummy variables

[a   d   g] [β]   [j]
[b   e   h] [γ] = [k]
[c   f   i] [t]   [l]

Cramer’s rule gives us

β = (j(ei − hf) + k(gf − di) + l(dh − eg)) / M,

γ = (i(ak − jb) + h(jc − al) + g(bl − kc)) / M,

t = −(f(ak − jb) + e(jc − al) + d(bl − kc)) / M,

where M = a(ei − hf) + b(gf − di) + c(dh − eg).

We can cut our operations by reusing numbers such as “ei-minus-hf”. The algorithm that this linear solve goes into can have some early termination. The function should look something like:

boolean function raytri(ray r, vector3 a, vector3 b, vector3 c, interval [t0, t1])
    compute t
    if (t < t0) or (t > t1) then
        return false
    compute γ
    if (γ < 0) or (γ > 1) then


return falsecomputeβif (β < 0) or (β > 1− γ) then

return falsereturn true
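A direct transcription of this routine in Python might look like the following sketch; the common determinant subterms (such as ei − hf) are computed once, and the early exits come in the same order as in the pseudocode. The function and variable names are ours, not the book's.

```python
def raytri(e, d, a, b, c, t0, t1):
    # e, d are the ray origin and direction; a, b, c are the triangle
    # vertices (all 3-tuples).  Returns the hit parameter t, or None.
    # Dummy variables follow the text: columns (A,B,C), (D,E,F), (G,H,I),
    # right-hand side (J,K,L).
    A, B, C = a[0] - b[0], a[1] - b[1], a[2] - b[2]
    D, E, F = a[0] - c[0], a[1] - c[1], a[2] - c[2]
    G, H, I = d
    J, K, L = a[0] - e[0], a[1] - e[1], a[2] - e[2]
    # Reused subterms.
    EIHF = E * I - H * F
    GFDI = G * F - D * I
    DHEG = D * H - E * G
    M = A * EIHF + B * GFDI + C * DHEG
    if M == 0.0:
        return None                      # degenerate triangle or parallel ray
    AKJB = A * K - J * B
    JCAL = J * C - A * L
    BLKC = B * L - K * C
    t = -(F * AKJB + E * JCAL + D * BLKC) / M
    if t < t0 or t > t1:
        return None
    gamma = (I * AKJB + H * JCAL + G * BLKC) / M
    if gamma < 0.0 or gamma > 1.0:
        return None
    beta = (J * EIHF + K * GFDI + L * DHEG) / M
    if beta < 0.0 or beta > 1.0 - gamma:
        return None
    return t
```

The early exits matter in practice: most rays fail the t or γ test, so β is often never computed.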

2.3.3 Ray–Polygon Intersection

Given a polygon with m vertices p1 through pm and surface normal n, we first compute the intersection point between the ray e + td and the plane containing the polygon, which has implicit equation

\[ (p - p_1) \cdot n = 0. \]

We do this by setting p = e + td and solving for t:

\[ t = \frac{(p_1 - e) \cdot n}{d \cdot n}. \]

This allows us to compute p. If p is inside the polygon, then the ray hits it; otherwise it does not.

We can answer the question of whether p is inside the polygon by projecting the point and the polygon vertices onto the xy plane and answering it there. The easiest way to do this is to send any 2D ray out from p and count the number of intersections between that ray and the boundary of the polygon. If the number of intersections is odd, the point is inside the polygon; otherwise it is not. This is because a ray that goes in must come out, creating a pair of intersections; only a ray that starts inside does not create such a pair. To keep the computation simple, the 2D ray may as well propagate along the x axis:

\[
\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x_p \\ y_p \end{pmatrix} + s \begin{pmatrix} 1 \\ 0 \end{pmatrix}.
\]

It is straightforward to compute the intersection of that ray with an edge such as (x1, y1, x2, y2) for s ∈ (0, ∞).

A problem arises for polygons whose projection onto the xy plane is a line. To get around this we can choose among the xy, yz, and zx planes, picking whichever is best. If we implement our points to allow an indexing operation, e.g., p(0) = x_p, then this can be accomplished as follows:

if (abs(z_n) > abs(x_n)) and (abs(z_n) > abs(y_n)) then
    index0 = 0
    index1 = 1
else if (abs(y_n) > abs(x_n)) then
    index0 = 0
    index1 = 2
else
    index0 = 1
    index1 = 2

Now all computations can use p(index0) rather than x_p, and so on.

2.4 A Ray Tracing Program

Now we know how to generate a viewing ray for a given pixel, and we know how to find the intersection with one object. This can be easily extended to a program that produces images similar to the z-buffer or BSP-tree codes of earlier chapters:

for each pixel do
    compute viewing ray
    if ray hits an object with t ∈ [0, ∞) then
        compute n
        evaluate lighting equation and set pixel to that color
    else
        set pixel color to background color

Here the statement "if ray hits an object..." can be implemented as a function that tests for hits in the interval t ∈ [t0, t1]:

hit = false
for each object o do
    if object is hit at ray parameter t and t ∈ [t0, t1] then
        hit = true
        hitobject = o
        t1 = t
return hit

In an actual implementation, you will need to somehow return either a reference to the object that is hit, or at least its normal vector and material properties. This is often done by passing a record/structure with such information. In an object-oriented implementation, it is a good idea to have a class called something like surface, with derived classes triangle, sphere, surface-list, et cetera. Anything that can be intersected with a ray would be under that class. The ray tracing program would then have one reference to a "surface" for the whole model, and new types of objects and efficiency structures can be added transparently.

2.4.1 Object-oriented design for a ray tracing program

As mentioned earlier, the key class hierarchy in a ray tracer is the set of geometric objects that make up the model. These should be subclasses of some geometric object class, and they should support a hit function. To avoid confusion from use of the word "object", surface is the class name often used. With such a class, you can create a ray tracer that has a general interface that assumes little about


modeling primitives, and debug it using only spheres. An important point is thatanything that can be “hit” by a ray should be part of this class hierarchy, e.g., evena collection of surfaces should be considered a subclass of the surface class. Thisincludes efficiency structures such as bounding volume hierarchies; they can behit by a ray, so they are in the class.

For example, the “abstract” or “base” class would specify the hit function, aswell as a bounding box function which will prove useful later:

class surface
    virtual bool hit(ray e + td, real t0, real t1, hit-record rec)
    virtual box bounding-box()

Here (t0, t1) is the interval on the ray where hits will be returned, and rec is a record, passed by reference, that is filled with data such as the t at the intersection when hit returns true. The type box is a 3D bounding box: two points that define an axis-aligned box enclosing the surface. For example, for a sphere that function would be implemented as:

box sphere::bounding-box()
    vector3 min = center − vector3(radius, radius, radius)
    vector3 max = center + vector3(radius, radius, radius)
    return box(min, max)

Another class that is useful is material. This allows you to abstract the material behavior and later add materials transparently. A simple way to link objects and materials is to add a pointer to a material in the surface class, although more programmable behavior might be desirable. A big question is what to do with textures: are they part of the material class, or do they live outside of it? This will be discussed more in Chapter 3.
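As a rough illustration of such a hierarchy (a Python sketch, not the book's code), here is a surface base class, a sphere, and a surface list; the hit interface mirrors the pseudocode's (t0, t1) interval and hit record, and all names are ours.

```python
import math

class HitRecord:
    def __init__(self):
        self.t = None        # ray parameter at the hit
        self.normal = None   # surface normal at the hit

class Surface:
    # Base class: anything a ray can hit, including collections and
    # efficiency structures.
    def hit(self, e, d, t0, t1, rec):
        raise NotImplementedError
    def bounding_box(self):
        raise NotImplementedError

class Sphere(Surface):
    def __init__(self, center, radius):
        self.center, self.radius = center, radius
    def hit(self, e, d, t0, t1, rec):
        oc = [e[i] - self.center[i] for i in range(3)]
        a = sum(di * di for di in d)
        b = 2.0 * sum(oc[i] * d[i] for i in range(3))
        c = sum(x * x for x in oc) - self.radius ** 2
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            return False
        t = (-b - math.sqrt(disc)) / (2.0 * a)   # try the nearer root first
        if not (t0 <= t <= t1):
            t = (-b + math.sqrt(disc)) / (2.0 * a)
            if not (t0 <= t <= t1):
                return False
        rec.t = t
        rec.normal = [(e[i] + t * d[i] - self.center[i]) / self.radius
                      for i in range(3)]
        return True
    def bounding_box(self):
        r = self.radius
        return ([c - r for c in self.center], [c + r for c in self.center])

class SurfaceList(Surface):
    # A collection of surfaces is itself a surface.
    def __init__(self, surfaces):
        self.surfaces = surfaces
    def hit(self, e, d, t0, t1, rec):
        hit_any = False
        for s in self.surfaces:
            if s.hit(e, d, t0, t1, rec):
                hit_any = True
                t1 = rec.t               # shrink the interval, as in the text
        return hit_any
```

Note how `SurfaceList` shrinks t1 after each hit, so the record ends up holding the closest intersection.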

2.5 Shadows

Figure 2.6: The point p is not in shadow, while the point q is in shadow.

Once you have a basic ray tracing program, shadows can be added very easily. Recall from Chapter 1 that light comes from some direction l. If we imagine ourselves at a point p on a surface being shaded, the point is in shadow if we "look" in direction l and see an object. If there are no objects, then the light is not blocked.

This is shown in Figure 2.6, where the ray p + tl does not hit any objects, so p is not in shadow. The point q is in shadow because the ray q + tl does hit an object. The vector l is the same for both points because the light is "far" away; this assumption will later be relaxed. The rays that determine whether a point is in or out of shadow are called shadow rays, to distinguish them from viewing rays.

The algorithm for shading just has an if statement added to determine whether the point is in shadow. In a naive implementation, the shadow ray checks for t ∈ [0, ∞), but because of numerical imprecision this can result in an intersection with the surface that p lies on. The usual adjustment to avoid that problem is to test for t ∈ [ε, ∞), where ε is some small positive constant (Figure 2.7).

Figure 2.7: By testing in the interval starting at ε, we avoid numerical imprecision causing the ray to hit the surface p is on.

If we implement shadow rays for Phong lighting with Equation 1.9, then we have:

function raycolor(ray e + td, real t0, real t1)
    hit-record rec, srec
    if (scene→hit(e + td, t0, t1, rec)) then
        p = e + rec.t d
        color c = rec.cr rec.ca
        if (not scene→hit(p + sl, ε, ∞, srec)) then
            vector3 h = normalized(normalized(l) + normalized(−d))
            c = c + rec.cr cl max(0, rec.n · l) + cl rec.cp (h · rec.n)^rec.p
        return c
    else
        return background-color

Note that the ambient color is added regardless. If there are multiple light sources, we send a shadow ray and evaluate the diffuse/Phong terms for each light. The code above assumes that d and l are not necessarily unit vectors; this is crucial for d in particular if we wish to cleanly add instancing later.

2.6 Specular Reflection

Figure 2.8: When looking into a perfect mirror, the viewer looking in direction d will see whatever the viewer "below" the surface would see in direction r.

It is straightforward to add specular reflection to a ray tracing program. The key observation is shown in Figure 2.8, where a viewer looking along direction d sees what is in direction r as seen from the surface. The vector r is found using a variant of the Phong reflection in Equation 1.6. There are sign changes because the vector d points toward the surface in this case, so:

\[ r = d - 2(d \cdot n)\, n. \tag{2.4} \]

In the real world, some energy is lost when light reflects from a surface, and this loss can differ between colors. For example, gold reflects yellow more efficiently than blue, so it shifts the colors of the objects it reflects. This can be implemented by adding a recursive call in raycolor:

color c = c + cs raycolor(p + sr, ε, ∞)

where cs is the specular RGB color. We need to make sure we test for s ∈ [ε, ∞) for the same reason as with shadow rays: we don't want the reflection ray to hit the object that generates it.

The problem with the recursive call above is that it may never terminate; for example, a ray that starts inside a room will bounce forever. This can be fixed by adding a maximum recursion depth. The code will be more efficient if a reflection ray is generated only if cs is not zero (black).
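A minimal Python sketch of these ideas follows. The `hit` callable and the record fields (`local` for the shaded color, `cs` for the specular RGB) are our own placeholders for the machinery described in the text, not the book's interface.

```python
EPSILON = 1e-4
MAX_DEPTH = 5
BACKGROUND = (0.0, 0.0, 0.0)

def reflect(d, n):
    # r = d - 2 (d . n) n, with d pointing toward the surface (Eq. 2.4).
    dn = sum(d[i] * n[i] for i in range(3))
    return tuple(d[i] - 2.0 * dn * n[i] for i in range(3))

def raycolor(hit, e, d, depth=0):
    # hit(e, d, t0, t1) is an assumed callable: None on a miss, or a record
    # with fields t, normal, local (shaded color), cs (specular RGB).
    if depth >= MAX_DEPTH:
        return BACKGROUND                 # stop rays that would bounce forever
    rec = hit(e, d, EPSILON, float('inf'))
    if rec is None:
        return BACKGROUND
    c = rec.local
    if any(rec.cs):                       # spawn a ray only if cs is not black
        p = tuple(e[i] + rec.t * d[i] for i in range(3))
        r = reflect(d, rec.normal)
        bounce = raycolor(hit, p, r, depth + 1)
        c = tuple(c[i] + rec.cs[i] * bounce[i] for i in range(3))
    return c
```

The depth counter and the black-specular test implement the two optimizations the paragraph above describes.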


Figure 2.9: Snell's Law describes how the angle φ depends on the angle θ and the refractive indices of the object and the surrounding medium.

2.7 Refraction

Another type of specular object is a dielectric. A dielectric is a transparent material that refracts light. Diamonds, glass, water, and air are dielectrics. Dielectrics also filter light; some glass filters out more red and blue than green light, so the glass takes on a green tint. When a ray travels from a medium with refractive index n into one with refractive index n_t, some of the light is transmitted, and it bends. This is shown for n_t > n in Figure 2.9. Snell's law tells us that

\[ n \sin\theta = n_t \sin\phi. \]

Computing the sine of the angle between two vectors is usually not as convenient as the cosine, which is a simple dot product for unit vectors such as we have here. From this and \(\sin^2\theta + \cos^2\theta = 1\) we can derive a relationship for cosines:

\[ \cos^2\phi = 1 - \frac{n^2\left(1 - \cos^2\theta\right)}{n_t^2}. \]

Note that if n and n_t are reversed, then so are θ and φ, as shown on the right of Figure 2.9.

Figure 2.10: The vectors n and b form a 2D orthonormal basis that is parallel to the transmission vector t.

To convert \(\sin\phi\) and \(\cos\phi\) into a 3D vector, we can set up a 2D orthonormal basis in the plane of n and d.

From Figure 2.10 we can see that n and b form an orthonormal basis for the plane of refraction. By definition we can describe t in terms of this basis:

\[ t = \sin\phi\, b - \cos\phi\, n. \]

Since we can describe d in the same basis, and d is known, we can solve for b:

\[ d = \sin\theta\, b - \cos\theta\, n, \qquad b = \frac{d + n\cos\theta}{\sin\theta}. \]


This means we can solve for t with known variables:

\[
t = \frac{n\left(d + n\cos\theta\right)}{n_t} - n\cos\phi
  = \frac{n\left(d - n(d \cdot n)\right)}{n_t} - n\sqrt{1 - \frac{n^2\left(1 - (d \cdot n)^2\right)}{n_t^2}}.
\]

Note that this equation works regardless of which of n and n_t is bigger. An immediate question is what you should do if the number under the square root is negative. In this case there is no refracted ray and all of the energy is reflected. This is known as total internal reflection, and it is responsible for much of the rich appearance of glass objects.
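A hedged Python sketch of this formula follows, returning None on total internal reflection; here `ratio` stands for n/n_t, both d and n are assumed to be unit vectors, and the function name is ours.

```python
import math

def refract(d, n, ratio):
    # Transmitted direction t = ratio*(d - n(d.n)) - n*sqrt(disc), folded as
    # t = ratio*d - (ratio*(d.n) + sqrt(disc))*n.  Returns None on total
    # internal reflection (negative discriminant).
    dn = sum(d[i] * n[i] for i in range(3))
    disc = 1.0 - ratio * ratio * (1.0 - dn * dn)
    if disc < 0.0:
        return None                       # total internal reflection
    coeff = ratio * dn + math.sqrt(disc)
    return tuple(ratio * d[i] - coeff * n[i] for i in range(3))
```

At normal incidence with matched indices (ratio = 1) the ray passes straight through, which is a useful sanity check.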

The reflectivity of a dielectric varies with incident angle according to the Fresnel equations. A nice way to implement something close to the Fresnel equations is to use the Schlick approximation:

\[ R(\theta) = R_0 + (1 - R_0)\left(1 - \cos\theta\right)^5, \]

where R_0 is the reflectance at normal incidence:

\[ R_0 = \left( \frac{n_t - 1}{n_t + 1} \right)^2. \]

Note that the cos θ terms above are always for the angle in air (the larger of the internal and external angles relative to the normal).
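The Schlick approximation is essentially a one-liner; a Python sketch (the `n` parameter for the outside medium is our generalization, defaulting to air):

```python
def schlick(cos_theta, nt, n=1.0):
    # Schlick's approximation to the Fresnel equations.  cos_theta is the
    # cosine of the angle in the less dense medium (index n, air by default).
    r0 = ((nt - n) / (nt + n)) ** 2
    return r0 + (1.0 - r0) * (1.0 - cos_theta) ** 5
```

For glass (n_t = 1.5) this gives about 4% reflectance at normal incidence, rising to 100% at grazing angles.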

For homogeneous impurities, such as are found in typical glass, a light-carrying ray's intensity will be attenuated according to Beer's Law. As the ray travels through the medium it loses intensity according to dI = −C I dx, where dx is distance; in other words, dI/dx = −C I. This is solved by the exponential I = k exp(−Cx). The strength of attenuation is described by the RGB attenuation constant a, which is the attenuation after one unit of distance. Putting in boundary conditions, we know that I(0) = I0 and I(1) = a I(0). The first implies I(x) = I0 exp(−Cx). The second implies I0 a = I0 exp(−C), so −C = ln(a). Thus the final formula is

\[ I(s) = I(0)\, e^{\ln(a)\, s}, \]

where I(s) is the intensity of the beam at distance s from the interface. In practice we reverse-engineer a by eye, because such data is rarely easy to find.
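This attenuation amounts to raising each channel of a to the power s; a small Python sketch (function name ours):

```python
import math

def beer_attenuation(a_rgb, distance):
    # a_rgb holds the per-channel attenuation after one unit of distance,
    # so I(s) = I(0) * a**s = I(0) * exp(ln(a) * s) for each channel.
    return tuple(math.exp(math.log(a) * distance) for a in a_rgb)
```

Folding ln(a) into stored constants, as the code segment below does, avoids recomputing the logarithm per ray.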

To add transparent materials to our code, we need a way to determine when a ray is going "into" an object. The simplest way to do this is to assume that all objects are embedded in air with refractive index very close to 1.0, and that surface normals point "out" (toward the air). The code segment for rays and dielectrics with these assumptions is:

if (p is on a dielectric) then
    r = reflect(d, n)
    if (d · n < 0) then
        refract(d, n, n, t)
        c = −d · n
        kr = kg = kb = 1
    else
        kr = exp(−ar t)
        kg = exp(−ag t)
        kb = exp(−ab t)
        if refract(d, −n, 1/n, t) then
            c = t · n
        else
            return k color(p + tr)
    R0 = (n − 1)² / (n + 1)²
    R = R0 + (1 − R0)(1 − c)⁵
    return k (R color(p + tr) + (1 − R) color(p + tt))

The code above assumes that the natural log has been folded into the constants (ar, ag, ab). The refract function returns false if there is total internal reflection; otherwise it fills in the last argument of the argument list.

2.8 Instancing

Figure 2.11: An instance of a circle with a series of three transforms is an ellipse.

An elegant property of ray tracing is that it allows very natural instancing. The basic idea of instancing is that all points on an object are distorted by a transformation matrix before the object is displayed. For example, if in 2D the unit circle is scaled by a factor of (2, 1) in x and y respectively, then rotated by 45°, then moved one unit in the x direction, the result is an ellipse with an eccentricity of 2 and a long axis along the x = −y direction, centered at (1, 0) (Figure 2.11). The key thing that makes this entity an instance is that we store the circle and the composite transform matrix; the explicit construction of the ellipse is left as a procedural operation at render time.

The advantage of instancing in ray tracing is that we can choose which space to do intersection in. If the base object is composed of a set of points, one of which is p, then the transformed object is composed of that set of points transformed by matrix M, where an example point is transformed to Mp. If we have a ray a + tb that we want to intersect with the transformed object, we can instead intersect an inverse-transformed ray with the untransformed object (Figure 2.12). There are two potential advantages of computing in the untransformed space (i.e., the right-hand side of Figure 2.12):

1. the untransformed object may have a simpler intersection routine, e.g., a sphere versus an ellipsoid;

2. many transformed objects can share the same untransformed object, thus reducing storage, e.g., a traffic jam of cars where the individual cars are just transforms of a few base (untransformed) models.

Figure 2.12: The ray intersection problems in the two spaces are just simple transforms of each other. The object is specified as a sphere plus matrix M. The ray is specified in the transformed (world) space by location a and direction b.

As discussed in Section ??, surface normal vectors transform differently. Combining this issue with the concepts in Figure 2.12 yields the intersection routine for a ray and an object transformed by matrix M. If we create an instance class of type surface, we need to create a hit function:

instance::hit(ray a + tb, real t0, real t1, hit-record rec)
    ray r′ = M⁻¹a + t M⁻¹b
    if (base-object→hit(r′, t0, t1, rec)) then
        rec.n = (M⁻¹)ᵀ rec.n
        return true
    else
        return false

An elegant thing about this function is that the parameter rec.t does not need to be changed, because it is the same in either space. This brings up a very important point: the ray direction b must not be restricted to unit length, or none of the infrastructure above works. For this reason, it is good not to restrict ray directions to unit vectors.

To implement the bounding-box function of the instance class, we can take the eight corners of the bounding box of the base object, transform all of them by M, and then take the bounding box of those eight points. That will not necessarily yield the tightest bounding box, but it is general and straightforward to implement.
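For an affine transform stored as a 3×4 matrix (a representation we choose for this sketch; the names are ours), the corner-transforming bounding-box computation might look like:

```python
def transformed_bbox(M, bbox_min, bbox_max):
    # M is a 3x4 affine matrix (three rows of four: rotation/scale plus a
    # translation column).  Transform all eight corners of the base box,
    # then re-box them.  Conservative, but not necessarily tight.
    corners = [(x, y, z) for x in (bbox_min[0], bbox_max[0])
                         for y in (bbox_min[1], bbox_max[1])
                         for z in (bbox_min[2], bbox_max[2])]
    pts = [[sum(M[r][c] * p[c] for c in range(3)) + M[r][3] for r in range(3)]
           for p in corners]
    new_min = [min(p[i] for p in pts) for i in range(3)]
    new_max = [max(p[i] for p in pts) for i in range(3)]
    return new_min, new_max
```

Transforming the two extreme corners alone would be wrong for rotations, which is why all eight corners are used.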

2.9 Sub-linear Ray–Object Intersection

In the earlier ray–object intersection pseudocode, all objects are looped over to check for intersections. For N objects this is an O(N) linear search, and is thus slow for large N. Like most search problems, the ray–object intersection can be computed in sub-linear time using divide-and-conquer techniques, provided we can create an ordered data structure as a preprocess. There are a plethora of techniques to do this. This section discusses two of them in detail: the bounding volume hierarchy and uniform spatial subdivision. References for other popular strategies are given in the notes at the end of the chapter.

2.9.1 Bounding Boxes

A key operation in most intersection acceleration schemes is computing the inter-section of a ray with a bounding box (Figure 2.13). This differs from conventionalintersection tests in that we do not need to know where the ray hits the box; weonly need to know whether it hits the box.

Figure 2.13: A 2D ray e + td is tested against a 2D bounding box.

Figure 2.15: The ray is only tested for intersection with the surfaces if it hits the bounding box.

To build up an algorithm for ray–box intersection, we begin by considering a 2D ray whose direction vector has positive x and y components. We can generalize this to arbitrary 3D rays later. The 2D bounding box is defined by two horizontal and two vertical lines:

\[ x = x_{min}, \quad x = x_{max}, \quad y = y_{min}, \quad y = y_{max}. \]

The points bounded by these lines can be described in interval notation:

\[ (x, y) \in [x_{min}, x_{max}] \times [y_{min}, y_{max}]. \]

As shown in Figure 2.14, the intersection test can be phrased in terms of these intervals. First we compute the ray parameter where the ray hits the line x = x_min:

\[ t_{xmin} = \frac{x_{min} - x_e}{x_d}. \]

We then make similar computations for t_xmax, t_ymin, and t_ymax. The ray hits the box if and only if the intervals [t_xmin, t_xmax] and [t_ymin, t_ymax] overlap, i.e., their intersection is non-empty. In pseudocode this algorithm is:

txmin = (xmin − xe)/xd
txmax = (xmax − xe)/xd
tymin = (ymin − ye)/yd
tymax = (ymax − ye)/yd
if (txmin > tymax) or (tymin > txmax) then
    return false
else
    return true

Figure 2.14: The ray will be inside the interval x ∈ [xmin, xmax] for some interval in its parameter space t ∈ [txmin, txmax]. A similar interval exists for the y interval. The ray intersects the box if it is in both the x interval and y interval at the same time, i.e., the intersection of the two one-dimensional intervals is not empty.

The if statement may seem non-obvious. To see the logic of it, note that there is no overlap if one interval is either entirely to the right or entirely to the left of the other.

The first thing we must address is accounting for when x_d or y_d is negative. If x_d is negative then the ray will hit x_max before it hits x_min, so the code for computing t_xmin and t_xmax expands to:

if (xd ≥ 0) then
    txmin = (xmin − xe)/xd
    txmax = (xmax − xe)/xd
else
    txmin = (xmax − xe)/xd
    txmax = (xmin − xe)/xd

A similar expansion must be made for the y cases.

A major concern is that horizontal and vertical rays have a zero value for y_d and x_d, respectively. This will cause divides by zero, which may cause trouble. However, before addressing this directly, we should check whether IEEE floating point computation handles these cases gracefully for us. Recall from Section ?? the rules for divide by zero: for any positive real number a,

\[ +a/0 = +\infty, \qquad -a/0 = -\infty. \]

Consider the case of a vertical ray where x_d = 0 and y_d > 0. This gives us:

\[ t_{xmin} = \frac{x_{min} - x_e}{0}, \qquad t_{xmax} = \frac{x_{max} - x_e}{0}. \]

There are three possibilities of interest:

1. x_e ≤ x_min (no hit)

2. x_min < x_e < x_max (hit)

3. x_max ≤ x_e (no hit)


For the first case we have

\[ t_{xmin} = \frac{\text{positive number}}{0} = +\infty, \qquad t_{xmax} = \frac{\text{positive number}}{0} = +\infty. \]

This yields the interval (t_xmin, t_xmax) = (∞, ∞), which will not overlap with any interval, so there will be no hit, as desired. For the second case we have

\[ t_{xmin} = \frac{\text{negative number}}{0} = -\infty, \qquad t_{xmax} = \frac{\text{positive number}}{0} = +\infty. \]

This yields the interval (t_xmin, t_xmax) = (−∞, ∞), which will overlap with all intervals and thus will yield a hit, as desired. The third case results in the interval (−∞, −∞), which yields no hit, as desired. Because these cases work as desired, we need no special checks for them. As is often the case, the IEEE floating point conventions are our ally.
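Python floats raise an exception on division by zero rather than following the IEEE convention, so a direct sketch of the 2D overlap test needs a small helper to restore the ±∞ behavior the text relies on. We also sort each slab pair to handle negative direction components, a variation on the text's explicit branches; all names are ours.

```python
def safe_div(num, den):
    # Mimic the IEEE rule +-a/0 = +-inf for nonzero a.  (The degenerate
    # 0/0 case, a ray starting exactly on a slab line, is not handled.)
    if den != 0.0:
        return num / den
    return float('inf') if num > 0 else float('-inf')

def hits_box_2d(e, d, xmin, xmax, ymin, ymax):
    # Slab test: the ray hits the box iff the x and y parameter intervals
    # overlap.  sorted() puts each pair in (tmin, tmax) order regardless of
    # the sign of the direction component.
    txmin, txmax = sorted((safe_div(xmin - e[0], d[0]),
                           safe_div(xmax - e[0], d[0])))
    tymin, tymax = sorted((safe_div(ymin - e[1], d[1]),
                           safe_div(ymax - e[1], d[1])))
    return not (txmin > tymax or tymin > txmax)
```

A vertical ray inside the x slab produces the interval (−∞, ∞) and hits; one outside produces (−∞, −∞) or (∞, ∞) and misses, matching the three cases above.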

In the full 3D case, the procedure for determining whether the ray e + td hits a 3D box within the interval (t0, t1) expands to:

bool box::hitbox(ray e + td, real t0, real t1)
    if (xd ≥ 0) then
        tmin = (xmin − xe)/xd
        tmax = (xmax − xe)/xd
    else
        tmin = (xmax − xe)/xd
        tmax = (xmin − xe)/xd
    if (yd ≥ 0) then
        tymin = (ymin − ye)/yd
        tymax = (ymax − ye)/yd
    else
        tymin = (ymax − ye)/yd
        tymax = (ymin − ye)/yd
    if (tmin > tymax) or (tymin > tmax) then
        return false
    if (tymin > tmin) then
        tmin = tymin
    if (tymax < tmax) then
        tmax = tymax
    if (zd ≥ 0) then
        tzmin = (zmin − ze)/zd
        tzmax = (zmax − ze)/zd
    else
        tzmin = (zmax − ze)/zd
        tzmax = (zmin − ze)/zd
    if (tmin > tzmax) or (tzmin > tmax) then
        return false
    if (tzmin > tmin) then
        tmin = tzmin
    if (tzmax < tmax) then
        tmax = tzmax
    return tmin < t1 and tmax > t0

This code often ends up being the bottleneck, so it is important to write it very tightly, as above. Note that hitbox is a member of box. The box class is not a subclass of surface: its only job is to bound objects and return intersections, and it has no material properties, so it is never rendered directly. However, you may find it useful to also implement a box-surface as a distinct class that is renderable and implements surface::hit.

2.9.2 Hierarchical Bounding Boxes

The basic idea of hierarchical bounding boxes can be seen in the common tactic of placing an axis-aligned 3D bounding box around all the objects, as shown in Figure 2.15. Rays that hit the bounding box will actually be more expensive than in a brute-force search, because testing for intersection with the box is not free. However, rays that miss the box are cheaper than in the brute-force search. Such bounding boxes can be made hierarchical by partitioning the set of objects in a box and placing a box around each partition, as shown in Figure 2.16.

Figure 2.16: The bounding boxes can be nested by creating boxes around subsets of the model.

The data structure for the hierarchy shown in Figure 2.17 might be a treewith the large bounding box at the root and the two smaller bounding boxesas left and right subtrees. These would in turn each point to a list of threetriangles. The intersection of a ray with this particular hard-coded tree wouldbe:

if (ray hits root box) then
    if (ray hits left subtree box) then
        check three triangles for intersection
    if (ray hits right subtree box) then
        check other three triangles for intersection
    if (an intersection is returned from each subtree) then
        return the closest of the two hits
    else if (an intersection is returned from exactly one subtree) then
        return that intersection
    else
        return false
else
    return false

Some observations related to this algorithm: there is no geometric ordering between the two subtrees, and there is no reason a ray might not hit both subtrees. Indeed, there is no reason the two subtrees might not overlap. A key point of such data hierarchies is that a box is guaranteed to bound all objects below it in the hierarchy, but the boxes are not guaranteed to contain all objects that overlap them spatially, as shown in Figure 2.17. This makes this geometric search somewhat more complicated than a traditional binary search on strictly ordered 1D data. The reader may note that several possible optimizations present themselves; we defer optimizations until we have a full hierarchical algorithm.

Figure 2.17: The grey box is a tree node that points to the three grey spheres, and the thick black box points to the three black spheres. Note that not all spheres enclosed by a box are guaranteed to be pointed to by the corresponding tree node.

If we restrict the tree to be binary and require each node in the tree to have a bounding box, then this traversal code extends naturally. Further assume that all nodes are either leaves that contain a primitive, or internal nodes that contain one or two subtrees. The bvh-node class should be of type surface, so it should implement surface::hit. The data it contains should be simple:

class bvh-node subclass of surface
    virtual bool hit(ray e + td, real t0, real t1, hit-record rec)
    virtual box bounding-box()
    surface-pointer left
    surface-pointer right
    box bbox

The traversal code can then be called recursively in an object-oriented style:

bool bvh-node::hit(ray a + tb, real t0, real t1, hit-record rec)
    if (bbox.hitbox(a + tb, t0, t1)) then
        hit-record lrec, rrec
        left-hit = (left ≠ NULL) and (left→hit(a + tb, t0, t1, lrec))
        right-hit = (right ≠ NULL) and (right→hit(a + tb, t0, t1, rrec))
        if (left-hit and right-hit) then
            if (lrec.t < rrec.t) then
                rec = lrec
            else
                rec = rrec
            return true
        else if (left-hit) then
            rec = lrec
            return true
        else if (right-hit) then
            rec = rrec
            return true
        else
            return false
    else
        return false

Note that because left and right point to surfaces rather than to bvh-nodes specifically, we can let the virtual functions take care of distinguishing between internal and leaf nodes; the appropriate hit function will be called for whatever is pointed to.

There are many ways to build a tree for a bounding volume hierarchy. It is convenient to make the tree binary, roughly balanced, and to have the boxes of sibling subtrees not overlap too much. A heuristic to accomplish this is to sort the surfaces along an axis before dividing them into two sublists. If the axes are defined by an integer with x = 0, y = 1, and z = 2, we have:

bvh-node::bvh-node(object-array A, int AXIS)
    N = A.length
    if (N = 1) then
        left = A[0]
        right = NULL
        bbox = bounding-box(A[0])
    else if (N = 2) then
        left = A[0]
        right = A[1]
        bbox = combine(bounding-box(A[0]), bounding-box(A[1]))
    else
        sort A by the object center along AXIS
        left = new bvh-node(A[0..N/2−1], (AXIS + 1) mod 3)
        right = new bvh-node(A[N/2..N−1], (AXIS + 1) mod 3)
        bbox = combine(left→bbox, right→bbox)

The quality of the tree can be improved by carefully choosing AXIS each time. One way to do this is to choose the axis such that the sum of the volumes of the bounding boxes of the two subtrees is minimized. This change over rotating through the axes will make little difference for scenes composed of isotropically distributed small objects, but may help significantly in less well-behaved scenes. This code can also be made more efficient by doing just a partition rather than a full sort.
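The median-split build above (sort on one axis, split in half, cycle axes) might be sketched in Python as follows; leaf objects are assumed to expose a `bounding_box()` method, and the class and helper names are ours, not the book's.

```python
class BVHNode:
    # Median-split BVH build: sort on one axis, split the list in half,
    # and cycle to the next axis for the children.
    def __init__(self, objects, axis=0):
        n = len(objects)
        if n == 1:
            self.left, self.right = objects[0], None
            self.bbox = objects[0].bounding_box()
        elif n == 2:
            self.left, self.right = objects[0], objects[1]
            self.bbox = combine(objects[0].bounding_box(),
                                objects[1].bounding_box())
        else:
            objects = sorted(objects, key=lambda o: center(o)[axis])
            mid = n // 2
            self.left = BVHNode(objects[:mid], (axis + 1) % 3)
            self.right = BVHNode(objects[mid:], (axis + 1) % 3)
            self.bbox = combine(self.left.bbox, self.right.bbox)

    def bounding_box(self):
        return self.bbox                 # lets nodes nest inside other nodes

def combine(b1, b2):
    # Smallest axis-aligned box containing both boxes (min/max corners).
    return ([min(b1[0][i], b2[0][i]) for i in range(3)],
            [max(b1[1][i], b2[1][i]) for i in range(3)])

def center(obj):
    lo, hi = obj.bounding_box()
    return [(lo[i] + hi[i]) / 2.0 for i in range(3)]
```

As the text notes, a partial partition (e.g. a quickselect around the median) would avoid the full sort at each level.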

Another, and probably better, way to build the tree is to have the subtrees contain about the same amount of space rather than the same number of objects. To do this we partition the list based on space:

bvh-node::bvh-node(object-array A, int AXIS)
    N = A.length
    if (N = 1) then
        left = A[0]
        right = NULL
        bbox = bounding-box(A[0])
    else if (N = 2) then
        left = A[0]
        right = A[1]
        bbox = combine(bounding-box(A[0]), bounding-box(A[1]))
    else
        find the midpoint m of the bounding box of A along AXIS
        partition A into lists with lengths k and (N−k) surrounding m
        left = new bvh-node(A[0..k], (AXIS + 1) mod 3)
        right = new bvh-node(A[k+1..N−1], (AXIS + 1) mod 3)
        bbox = combine(left→bbox, right→bbox)

Although this results in an unbalanced tree, it allows for easy traversal of emptyspace and is cheaper to build because the partitioning is cheaper than sorting.

2.9.3 Uniform Spatial Subdivision

Another strategy to reduce intersection tests is to divide space. This is fundamen-tally different from dividing objects as done with hierarchical bounding volumes:

• In hierarchical bounding volumes, each object belongs to one of two sibling nodes, whereas a point in space may be inside both sibling nodes.

• In spatial subdivision, each point in space belongs to exactly one node, whereas objects may belong to many nodes.

The scene is partitioned into axis-aligned boxes. These boxes are all the samesize, although they are not necessarily cubes. The ray traverses these boxes asshown in Figure 2.18. When an object is hit, the traversal ends.

The grid itself should be a subclass of surface, and should be implemented asa 3D array of pointers to surface. For empty cells these pointers are NULL. Forcells with one object the pointer points to that object. For cells with more thanone object the pointer can point to a list, another grid, or another data structuresuch as a bounding volume hierarchy.

This traversal is done in an incremental fashion. The regularity comes from the regular way that a ray hits each set of parallel planes, as shown in Figure 2.19. To see how this traversal works, first consider the 2D case where the ray direction has positive x and y components and the ray starts outside the grid. Assume the grid is bounded by points (xmin, ymin) and (xmax, ymax). The grid has nx by ny cells.

Our first order of business is to find the index (i, j) of the first cell hit by the ray e + td. Then we need to traverse the cells in an appropriate order. This can be accomplished as:

txmin = (xmin − xe)/xd
txmax = (xmax − xe)/xd
tymin = (ymin − ye)/yd
tymax = (ymax − ye)/yd
if (txmin > tymax) or (tymin > txmax) then
    return false
dtx = (txmax − txmin)/nx
dty = (tymax − tymin)/ny
if (txmin > tymin) then
    i = 0
    y = ye + txmin yd
    j = floor(ny(y − ymin)/(ymax − ymin))
    txnext = txmin + dtx
    tynext = tymin + (j + 1) dty
    tlast = txmin
else
    j = 0
    x = xe + tymin xd
    i = floor(nx(x − xmin)/(xmax − xmin))
    tynext = tymin + dty
    txnext = txmin + (i + 1) dtx
    tlast = tymin
while (true) do
    if (txnext < tynext) then
        if (cell(i, j) ≠ NULL) and (cell(i, j)→hit(e, d, tlast, txnext, rec)) then
            return true
        tlast = txnext
        txnext = txnext + dtx
        i = i + 1
        if (i == nx) then
            return false
    else
        if (cell(i, j) ≠ NULL) and (cell(i, j)→hit(e, d, tlast, tynext, rec)) then
            return true
        tlast = tynext
        tynext = tynext + dty
        j = j + 1
        if (j == ny) then
            return false

Figure 2.18: In uniform spatial subdivision the ray is tracked forward through cells until an object in one of those cells is hit. In this example, only objects in the shaded cells are checked.

Figure 2.19: Although the pattern of cell hits seems irregular (left), the hits on sets of parallel planes are very even.

The key parts of this algorithm are finding the initial cell (i, j) and deciding whether to increment i or j (Figure 2.20). Note that when we check for an intersection with objects in a cell, we restrict the range of t to be within the cell (Figure 2.21).
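A runnable sketch of the 2D traversal may help; this Python version assumes, as the text does, positive direction components and a ray origin outside the grid, and `hit_cell` is a hypothetical callback standing in for cell(i, j)→hit:

```python
import math

def traverse_grid_2d(xe, ye, xd, yd, xmin, ymin, xmax, ymax, nx, ny, hit_cell):
    """Walk the cells pierced by the ray e + t*d (xd, yd > 0, origin outside).

    hit_cell(i, j, t_enter, t_exit) returns True on an intersection; we
    return True as soon as any cell reports a hit, restricting t to the cell.
    """
    txmin = (xmin - xe) / xd
    txmax = (xmax - xe) / xd
    tymin = (ymin - ye) / yd
    tymax = (ymax - ye) / yd
    if txmin > tymax or tymin > txmax:
        return False                       # ray misses the grid entirely
    dtx = (txmax - txmin) / nx             # t spacing between x planes
    dty = (tymax - tymin) / ny
    if txmin > tymin:                      # entered through an x plane
        i = 0
        y = ye + txmin * yd
        j = math.floor(ny * (y - ymin) / (ymax - ymin))
        txnext, tynext, tlast = txmin + dtx, tymin + (j + 1) * dty, txmin
    else:                                  # entered through a y plane
        j = 0
        x = xe + tymin * xd
        i = math.floor(nx * (x - xmin) / (xmax - xmin))
        tynext, txnext, tlast = tymin + dty, txmin + (i + 1) * dtx, tymin
    while True:
        if txnext < tynext:                # next crossing is a vertical plane
            if hit_cell(i, j, tlast, txnext):
                return True
            tlast, txnext, i = txnext, txnext + dtx, i + 1
            if i == nx:
                return False
        else:                              # next crossing is a horizontal plane
            if hit_cell(i, j, tlast, tynext):
                return True
            tlast, tynext, j = tynext, tynext + dty, j + 1
            if j == ny:
                return False
```

Passing a callback that records its arguments makes it easy to see exactly which cells a ray visits, and in what t ranges.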

Figure 2.20: To decide whether we advance right or up we keep track of the intersections with the next vertical and horizontal boundary of the cell.

Figure 2.21: Only hits within the cell should be reported. Otherwise the case above would cause us to report hitting object b rather than object a.

There are several extensions we need to make to the code above for the general case:

• adding thez axis,

• allowing for negative components ind,

• guarding against floating-point error that could push the array indices out of range,


• allowing the ray to start inside the grid,

• making sure hits with rec.t > t1 are not returned.

Addressing these yields:

bool grid::hit(ray e + td, real t0, real t1, hit-record rec)
    if (xd > 0) then
        txmin = (xmin − xe)/xd
        txmax = (xmax − xe)/xd
        iinc = +1
        istop = nx
    else
        txmin = (xmax − xe)/xd
        txmax = (xmin − xe)/xd
        iinc = −1
        istop = −1
    if (yd > 0) then
        tymin = (ymin − ye)/yd
        tymax = (ymax − ye)/yd
        jinc = +1
        jstop = ny
    else
        tymin = (ymax − ye)/yd
        tymax = (ymin − ye)/yd
        jinc = −1
        jstop = −1
    if (txmin > tymax) or (tymin > txmax) then
        return false
    if (zd > 0) then
        tzmin = (zmin − ze)/zd
        tzmax = (zmax − ze)/zd
        kinc = +1
        kstop = nz
    else
        tzmin = (zmax − ze)/zd
        tzmax = (zmin − ze)/zd
        kinc = −1
        kstop = −1
    if (tzmin > txmax) or (txmin > tzmax) then
        return false
    if (tzmin > tymax) or (tymin > tzmax) then
        return false
    dtx = (txmax − txmin)/nx
    dty = (tymax − tymin)/ny
    dtz = (tzmax − tzmin)/nz


    if ((p = e + t0 d) is inside grid) then
        x = xp
        i = floor(nx(x − xmin)/(xmax − xmin))
        if (i < 0) then i = 0
        if (i > nx − 1) then i = nx − 1
        y = yp
        j = floor(ny(y − ymin)/(ymax − ymin))
        if (j < 0) then j = 0
        if (j > ny − 1) then j = ny − 1
        z = zp
        k = floor(nz(z − zmin)/(zmax − zmin))
        if (k < 0) then k = 0
        if (k > nz − 1) then k = nz − 1
        txnext = txmin + (i + (iinc + 1)/2)dtx
        tynext = tymin + (j + (jinc + 1)/2)dty
        tznext = tzmin + (k + (kinc + 1)/2)dtz
        tlast = t0
    else if (txmin > tymin) and (txmin > tzmin) then
        i = istop − iinc
        y = ye + txmin yd
        j = floor(ny(y − ymin)/(ymax − ymin))
        if (j < 0) then j = 0
        if (j > ny − 1) then j = ny − 1
        z = ze + txmin zd
        k = floor(nz(z − zmin)/(zmax − zmin))
        if (k < 0) then k = 0
        if (k > nz − 1) then k = nz − 1
        txnext = txmin + dtx
        tynext = tymin + (j + (jinc + 1)/2)dty
        tznext = tzmin + (k + (kinc + 1)/2)dtz
        tlast = txmin
    else if (tymin > tzmin) then
        j = jstop − jinc
        x = xe + tymin xd


        i = floor(nx(x − xmin)/(xmax − xmin))
        if (i < 0) then i = 0
        if (i > nx − 1) then i = nx − 1
        z = ze + tymin zd
        k = floor(nz(z − zmin)/(zmax − zmin))
        if (k < 0) then k = 0
        if (k > nz − 1) then k = nz − 1
        tynext = tymin + dty
        txnext = txmin + (i + (iinc + 1)/2)dtx
        tznext = tzmin + (k + (kinc + 1)/2)dtz
        tlast = tymin
    else
        k = kstop − kinc
        x = xe + tzmin xd
        i = floor(nx(x − xmin)/(xmax − xmin))
        if (i < 0) then i = 0
        if (i > nx − 1) then i = nx − 1
        y = ye + tzmin yd
        j = floor(ny(y − ymin)/(ymax − ymin))
        if (j < 0) then j = 0
        if (j > ny − 1) then j = ny − 1
        tznext = tzmin + dtz
        txnext = txmin + (i + (iinc + 1)/2)dtx
        tynext = tymin + (j + (jinc + 1)/2)dty
        tlast = tzmin
    if (tlast > t1) then
        return false
    while (true) do
        if (txnext < tynext) and (txnext < tznext) then
            if (cell(i, j, k) ≠ NULL) and (cell(i, j, k)→hit(e, d, tlast, txnext, rec)) and (rec.t ≤ t1) then
                return true
            tlast = txnext
            txnext = txnext + dtx
            i = i + iinc
            if (i == istop) then
                return false


        else if (tynext < tznext) then
            if (cell(i, j, k) ≠ NULL) and (cell(i, j, k)→hit(e, d, tlast, tynext, rec)) and (rec.t ≤ t1) then
                return true
            tlast = tynext
            tynext = tynext + dty
            j = j + jinc
            if (j == jstop) then
                return false
        else
            if (cell(i, j, k) ≠ NULL) and (cell(i, j, k)→hit(e, d, tlast, tznext, rec)) and (rec.t ≤ t1) then
                return true
            tlast = tznext
            tznext = tznext + dtz
            k = k + kinc
            if (k == kstop) then
                return false

Most implementations make the 3D array of type "pointer to surface". To improve the locality of the traversal, the array can be tiled as discussed in Section 4.3.

2.10 Constructive Solid Geometry

One nice thing about ray tracing is that any geometric primitive whose intersection with a 3D line can be computed can be seamlessly added to a ray tracer. It turns out to also be straightforward to add constructive solid geometry (CSG) to a ray tracer. The basic idea of CSG is to use set operations to combine solid shapes. The basic operations are shown in Figure 2.22. These operations can be viewed as set operations. For example, we can consider C the set of all points in the circle and S the set of all points in the square. The intersection operation C ∩ S is the set of all points that are members of both C and S. The other operations are analogous.

Figure 2.22: The basic CSG operations on a 2D circle C and square S: intersection C ∩ S, union C ∪ S, and differences C − S and S − C.

Although one can do CSG directly on the model, if all that is desired is an image, we do not need to explicitly change the model. Instead, we perform the set operations directly on the rays as they interact with a model. To make this natural, we find all the intersections of a ray with a model rather than just the closest. For example, a ray a + tb might hit a sphere at t = 1 and t = 2. In the context of CSG we think of this as the ray being inside the sphere for t ∈ [1, 2]. We can compute these "inside intervals" for all of the surfaces and do set operations on those intervals (recall Section ??). This is illustrated in Figure 2.23, where the hit intervals are processed to indicate that there are two intervals inside the difference object. The first hit for t > 0 is what the ray actually intersects.

Figure 2.23: Intervals are processed to indicate how the ray hits the composite object.

In practice the CSG intersection routine must maintain a list of intervals. When the first hitpoint is determined, the material property and surface normal are whatever is associated with that hitpoint. In addition, you must pay attention to precision issues, because there is nothing to prevent the user from taking two objects that abut and taking an intersection. This can be made robust by eliminating any interval whose thickness is below a tolerance.
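The interval set operations can be sketched directly; the function names are ours, and the interval lists are assumed sorted and pairwise disjoint:

```python
def interval_union(a, b):
    """Merge two sorted, disjoint interval lists into their union."""
    merged = []
    for lo, hi in sorted(a + b):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged

def interval_intersection(a, b):
    """Intervals where the ray is inside both objects."""
    out = []
    for a0, a1 in a:
        for b0, b1 in b:
            lo, hi = max(a0, b0), min(a1, b1)
            if lo < hi:
                out.append((lo, hi))
    return sorted(out)

def interval_difference(a, b):
    """Intervals inside a but not inside b (the CSG object a - b)."""
    out = list(a)
    for b0, b1 in b:
        next_out = []
        for a0, a1 in out:
            if a0 < b0:                    # piece to the left of the cut
                next_out.append((a0, min(a1, b0)))
            if a1 > b1:                    # piece to the right of the cut
                next_out.append((max(a0, b1), a1))
        out = next_out
    return out
```

For the example ray that is inside C for t ∈ [1, 2] and inside S for t ∈ [1.5, 3], interval_difference gives [(1, 1.5)], so the first hit with C − S is at t = 1.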

2.11 Distribution Ray Tracing

For some applications, ray-traced images are just too "clean". This effect can be lessened using distribution ray tracing. Conventionally ray-traced images look clean because everything is crisp; the shadows are perfectly sharp, the reflections have no fuzziness, and everything is in perfect focus. Sometimes we would like to have the shadows be soft (as they are in real life), the reflections be fuzzy as with brushed metal, and the image have variable degrees of focus as in a photograph with a large aperture. While accomplishing these things from first principles is somewhat involved, as developed in Chapter ??, we can get most of the visual impact with some fairly simple changes to the basic ray tracing algorithm. In addition, the framework gives us a relatively simple way to antialias (recall Section ??) the image.

2.11.1 Antialiasing

Figure 2.24: Sixteen regular samples for a single pixel.

Figure 2.25: Sixteen random samples for a single pixel.

Figure 2.26: Sixteen stratified (jittered) samples for a single pixel shown with and without the bins highlighted. There is exactly one random sample taken within each bin.

Recall that a simple way to antialias an image is to compute the average color for the area of the pixel rather than the color at the center point (Chapter ?? will have a more sophisticated discussion of this issue). In ray tracing, our computational primitive is to compute the color at a point on the screen. If we average many of these points across the pixel, we are approximating the true average. If the screen coordinates bounding the pixel are [i, i + 1] × [j, j + 1], then this can be done by replacing the loop:

for each pixel (i, j) do
    cij = ray-color(i + 0.5, j + 0.5)

with code that samples on a regular n × n grid of samples within each pixel:

for each pixel (i, j) do
    c = 0
    for p = 0 to n − 1 do
        for q = 0 to n − 1 do
            c = c + ray-color(i + (p + 0.5)/n, j + (q + 0.5)/n)
    cij = c/n²

This is usually called regular sampling. The 16 sample locations in a pixel for n = 4 are shown in Figure 2.24. Note that this produces the same answer as rendering a traditional ray-traced image with one sample per pixel at nx·n by ny·n resolution and then averaging blocks of n by n pixels to get an nx by ny image.

One potential problem with taking samples in a regular pattern within a pixel is that regular artifacts such as Moiré patterns can arise. These artifacts can be turned into noise by taking samples in a random pattern within each pixel, as shown in Figure 2.25. This is usually called random sampling and involves just a small change to the code:

for each pixel (i, j) do
    c = 0
    for p = 1 to n² do
        c = c + ray-color(i + ξ, j + ξ)
    cij = c/n²

Here ξ is a call that returns a uniform random number in the range [0, 1). Unfortunately, the noise can be quite objectionable unless many samples are taken. A compromise is a hybrid strategy that randomly perturbs a regular grid:

for each pixel (i, j) do
    c = 0
    for p = 0 to n − 1 do
        for q = 0 to n − 1 do
            c = c + ray-color(i + (p + ξ)/n, j + (q + ξ)/n)
    cij = c/n²

That method is usually called jittering or stratified sampling (Figure 2.26).
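The three pixel-sampling strategies can be written as generators of sample positions over the unit pixel; the function names are ours, and each returned (u, v) would be passed to ray-color as (i + u, j + v):

```python
import random

def regular_samples(n):
    """n*n regular (bin-centered) sample positions in the unit pixel."""
    return [((p + 0.5) / n, (q + 0.5) / n)
            for p in range(n) for q in range(n)]

def random_samples(n):
    """n*n uniform random sample positions in the unit pixel."""
    return [(random.random(), random.random()) for _ in range(n * n)]

def jittered_samples(n):
    """One random sample in each of the n*n bins (stratified sampling)."""
    return [((p + random.random()) / n, (q + random.random()) / n)
            for p in range(n) for q in range(n)]
```

Jittering keeps the even coverage of the regular grid while trading the Moiré artifacts of regular sampling for noise, which is usually less objectionable.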

2.11.2 Soft Shadows

Figure 2.27: A soft shadow has a gradual transition from the unshadowed to shadowed region. The transition zone is the "penumbra", denoted by p in the figure.

The reason shadows are hard in standard ray tracing is that lights are infinitesimal points or directions and are thus either entirely visible or entirely invisible. In real life lights have non-zero area and can thus be partially visible. This idea is shown in 2D in Figure 2.27. The region where the light is entirely blocked is called the umbra. The partially visible region is called the penumbra. There is no commonly used term for the region not in shadow, but it is sometimes called the anti-umbra.

Figure 2.28: An area light can be approximated by some number of point lights. In the figure, four of the nine points are visible to p, so it is in the penumbra.

The key to implementing soft shadows is to somehow account for the light being an area rather than a point. An easy way to do this is to approximate the light with a distributed set of N point lights, each with one Nth of the intensity of the base light. This concept is illustrated in Figure 2.28, where nine lights are used. You can do this in a standard ray tracer, and it is a common trick to get soft shadows in an off-the-shelf renderer. There are two potential problems with this technique. The first is that typically dozens of point lights are needed to achieve visually smooth results, which slows down the program a great deal. The second is that the shadows have sharp transitions inside the penumbra.

Distribution ray tracing introduces a small change in the shadowing code. Instead of representing the area light as a discrete number of point sources, we represent it as an infinite number and choose one at random for each viewing ray. This amounts to choosing a random point on the light for any surface point being lit, as is shown in Figure 2.29.

Figure 2.29: The shadow ray is sent from p to a random point on the light.

If the light is a parallelogram specified by a corner point c and two edge vectors a and b (Figure 2.30), then choosing a random point r is straightforward:

r = c + ξ1a + ξ2b,

where ξ1 and ξ2 are uniform random numbers in the range [0, 1). We then send a shadow ray to this point as shown in Figure 2.29. Note that the direction of this ray is not unit length, which may require some modification to your basic ray tracer depending upon its assumptions.
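A minimal sketch of this sampling step, representing vectors as plain 3-tuples (an assumption of this sketch, not the book's vector class):

```python
import random

def sample_parallelogram_light(c, a, b):
    """Return r = c + xi1*a + xi2*b, a uniform random point on the light."""
    xi1, xi2 = random.random(), random.random()
    return tuple(c[k] + xi1 * a[k] + xi2 * b[k] for k in range(3))

def shadow_ray_direction(p, r):
    """Direction from surface point p to light point r (not unit length)."""
    return tuple(r[k] - p[k] for k in range(3))
```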

We would really like to jitter points on the light. However, it can be dangerous to implement this without some thought. We would not want to always have the ray in the upper-left corner of the pixel generate a shadow ray to the upper-left corner of the light. Instead we would like to scramble the samples such that the pixel samples and the light samples are each themselves jittered, but so that there is no correlation between pixel samples and light samples. A good way to accomplish this is to generate two distinct sets of n² jittered samples and pass samples into the light source routine:

for each pixel (i, j) do
    c = 0
    generate N = n² jittered 2D points and store in array r[]
    generate N = n² jittered 2D points and store in array s[]
    shuffle the points in array s[]
    for p = 0 to N − 1 do
        c = c + ray-color(i + r[p].x(), j + r[p].y(), s[p])
    cij = c/N

This shuffle routine eliminates any coherence between arrays r and s. The shadow routine will just use the 2D random point stored in s[p] rather than calling the random number generator. A shuffle routine for an array indexed from 0 to N − 1 is:

for i = N − 1 downto 1 do
    choose random integer j between 0 and i inclusive
    swap array elements i and j
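This is the classic Fisher–Yates shuffle; a direct Python transcription:

```python
import random

def shuffle_in_place(arr):
    """Fisher-Yates shuffle, matching the pseudocode above."""
    for i in range(len(arr) - 1, 0, -1):
        j = random.randint(0, i)       # random integer in [0, i], inclusive
        arr[i], arr[j] = arr[j], arr[i]
```

Every permutation is equally likely, and the array is rearranged in place without allocating a second array.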

2.11.3 Depth of Field

Figure 2.30: The geometry of a parallelogram light specified by a corner point and two edge vectors.

Figure 2.31: The lens averages over a cone of directions that hit the pixel location being sampled.

The soft focus effects seen in most photos can be simulated by collecting light at a non-zero-size "lens" rather than at a point. This is called depth of field. The lens collects light from a cone of directions that has its apex at a distance where everything is in focus (Figure 2.31). We can place the "window" we are sampling on the plane where everything is in focus (rather than at the z = n plane as we did previously), and the lens at the eye. The plane where everything is in focus is called the focus plane, and the distance to it is set by the user, just as the distance to the focus plane in a real camera is set by the user or range finder.


To be most faithful to a real camera, we should make the lens a disk. However, we will get very similar effects with a square lens (Figure 2.32). So we choose the side length of the lens and take random samples on it. The origins of the view rays will be these perturbed positions rather than the eye position. Again, a shuffling routine is used to prevent correlation with the pixel sample positions.
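A sketch of generating one depth-of-field viewing ray; the lens lying in a plane of constant z and the function name are assumptions of this sketch:

```python
import random

def dof_ray(eye, focus_point, lens_side):
    """Return (origin, direction) for one depth-of-field viewing ray.

    The origin is jittered over a square lens centered at the eye; the ray
    still passes through the same point on the focus plane, so geometry on
    that plane stays sharp while everything nearer or farther blurs.
    """
    ox = eye[0] + (random.random() - 0.5) * lens_side
    oy = eye[1] + (random.random() - 0.5) * lens_side
    origin = (ox, oy, eye[2])          # lens assumed parallel to the xy plane
    direction = tuple(f - o for f, o in zip(focus_point, origin))
    return origin, direction
```

Averaging many such rays per pixel (with shuffled lens samples) produces the soft-focus effect.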

2.11.4 Glossy Reflection

Some surfaces, such as brushed metal, are somewhere between an ideal mirror and a diffuse surface. Some discernible image is visible in the reflection, but it is blurred. We can simulate this by randomly perturbing ideal specular reflection rays, as shown in Figure 2.33.

Only two details need to be worked out: how to choose the vector r′, and what to do when the resulting perturbed ray is below the surface the ray is reflected from. The latter detail is usually settled by returning a zero color when the ray is below the surface.

To choose r′ we again sample a random square. This square is perpendicular to r and has width a, which controls the degree of blur. We can set up the square's orientation by creating an orthonormal basis with w = r using the techniques in Section ??. Then we create a random point in the 2D square of side length a centered at the origin. If we have a 2D sample point (ξ, ξ′) ∈ [0, 1]², then the analogous point on the desired square is

u = −a/2 + ξa,
v = −a/2 + ξ′a.

Because the square we will perturb over is parallel to both the u and v vectors, the ray r′ is just:

r′ = r + uu + vv.

Note that r′ is not necessarily a unit vector and should be normalized if your code requires that for ray directions.
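A sketch of the whole perturbation; `orthonormal_basis_from` is a hypothetical stand-in for the Section ?? construction, and r is normalized first so that w = r:

```python
import random

def normalize(v):
    n = sum(x * x for x in v) ** 0.5
    return tuple(x / n for x in v)

def orthonormal_basis_from(w):
    """Build unit vectors u, v perpendicular to w (one common construction)."""
    w = normalize(w)
    t = (1.0, 0.0, 0.0) if abs(w[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = normalize((w[1]*t[2] - w[2]*t[1],
                   w[2]*t[0] - w[0]*t[2],
                   w[0]*t[1] - w[1]*t[0]))          # u = w x t
    v = (w[1]*u[2] - w[2]*u[1],
         w[2]*u[0] - w[0]*u[2],
         w[0]*u[1] - w[1]*u[0])                      # v = w x u
    return u, v, w

def glossy_reflect(r, a):
    """Perturb reflection direction r over an a-by-a square perpendicular to it."""
    u_vec, v_vec, w = orthonormal_basis_from(r)
    uo = -a / 2 + random.random() * a   # u = -a/2 + xi * a from the text
    vo = -a / 2 + random.random() * a   # v = -a/2 + xi' * a
    # r' = r + u*u_vec + v*v_vec (here built around the normalized r = w)
    return tuple(w[k] + uo * u_vec[k] + vo * v_vec[k] for k in range(3))
```

As the text notes, the returned r′ is not unit length in general and should be normalized if the ray tracer requires that.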

Figure 2.33: The reflection ray is perturbed to a random vector r′.

Frequently Asked Questions

Why is there no perspective matrix in ray tracing?

The perspective matrix in a z-buffer is there so we can turn the perspective projection into a parallel projection. This is not needed in ray tracing because it is easy to do the perspective projection implicitly by fanning the rays out from the eye.


What is the best ray intersection efficiency structure?

The most popular structures are binary space partitioning trees (BSP trees), uniform subdivision grids, and bounding volume hierarchies. There is no clear-cut answer for which is best, but all are much, much better than brute-force search in practice.

Why do people use bounding boxes rather than spheres or ellipsoids?

Sometimes spheres or ellipsoids are better. However, many models have polygonal elements that are tightly bounded by boxes but would be difficult to tightly bound with an ellipsoid.

Can ray tracing be made interactive?

For sufficiently small models and images, any modern PC is sufficiently powerful for ray tracing to be interactive. In practice, multiple CPUs with a shared framebuffer are required for a full-screen implementation, although computer power is increasing much faster than screen resolution, so it is just a matter of time before conventional PCs can ray trace complex scenes at screen resolution.

Is ray tracing useful in a hardware graphics program?

Ray tracing is frequently used for picking. When the user clicks the mouse on a pixel in a 3D graphics program, the program needs to determine which object is visible within that pixel. Ray tracing is an ideal way to determine that.

Notes

Modern ray tracing was introduced in "An Improved Illumination Model for Shaded Display" (Whitted, Communications of the ACM, June 1980). CSG via ray tracing was introduced in "Ray Casting for Modelling Solids" (Computer Graphics and Image Processing, February 1982). A description of picking using ray tracing can be found in Real-Time Rendering (Möller and Haines, AK Peters, 1999). A survey of efficiency structures for ray tracing is in the classic "A Survey of Ray Tracing Acceleration Techniques" (Arvo and Kirk, in An Introduction to Ray Tracing, Academic Press, 1989). The details of a uniform grid efficiency structure are given in Realistic Ray Tracing (Shirley, AK Peters, 2000). A general discussion of efficiency issues for ray tracing, especially involving hierarchical bounding volumes, is in "Efficiency Issues for Ray Tracing" (Smits, Journal of Graphics Tools, 3(2), 1998).


Exercises

1. What are the ray parameters of the intersection points between ray (1, 1, 1) + t(−1,−1,−1) and the sphere centered at the origin with radius 1? Note: this is a good debugging case.

2. What are the barycentric coordinates and ray parameter where the ray (1, 1, 1) + t(−1,−1,−1) hits the triangle with vertices (1, 0, 0), (0, 1, 0), and (0, 0, 1)? Note: this is a good debugging case.

3. Do a back of the envelope computation of the approximate time complexityof ray tracing on “nice” (non-adversarial) models. Split your analysis intothe cases of preprocessing and computing the image, so that you can predictthe behavior of ray tracing multiple frames for a static model.


Chapter 3

Texture Mapping

The shading models presented in Chapter 1 assume that a diffuse surface has uniform reflectance cr. This is fine for surfaces such as blank paper or painted walls, but is inefficient for objects such as a printed sheet of paper. Such objects have an appearance whose complexity arises from variation in reflectance properties. While we could use triangles small enough that such variation is captured by varying the reflectance properties of the triangles, this would be inefficient.

The common technique to handle variations of reflectance is to store the reflectance as a function or a pixel-based image and "map" it onto a surface. The function or image is called a texture map, and the process of controlling reflectance properties is called texture mapping. This is not hard to implement once you understand the coordinate systems involved. Texture mapping can be broken down by several properties:

1. The dimensionality of the texture function,

2. The correspondences defined between points on the surface and points inthe texture function,

3. Whether the texture function is primarily procedural or primarily a table lookup.

These items are usually closely related, so we will somewhat arbitrarily divide textures by their dimension. We first cover 3D textures, often called solid textures or volume textures. We will then cover 2D textures, sometimes called image textures. When graphics programmers talk about textures without specifying dimension, they usually mean 2D textures. However, we begin with 3D textures because in many ways they are easier to understand and implement. At the end of the chapter we discuss bump mapping and displacement mapping, which use textures to change surface normals and position, respectively. Although those methods modify properties other than reflectance, the images/functions they use are still called textures. This is consistent with common usage, where any image used to modify object appearance is called a texture.

3.1 3D Texture Mapping

In previous chapters we used cr as the diffuse reflectance at a point on an object. For an object that does not have a solid color, we can replace this with a function cr(p) which maps 3D points to RGB colors. This might just return the reflectance of the object that contains p. But for objects with texture we should expect cr(p) to vary as p moves across a surface. One way to do this is to create a 3D texture that defines an RGB color at every point in 3D space. We will only call it for points p on the surface, but it is usually easier to define it for all 3D points than for a potentially strange 2D subset of points that are on an arbitrary surface. Such a strategy is clearly suitable for surfaces that are "carved" from a solid medium, such as a marble sculpture.

Note that in a ray tracing program we have immediate access to the point p seen through a pixel. However, for a z-buffer or BSP-tree program we only know the point after projection into device coordinates. We will show how to resolve this problem in Section 3.4.1.

3.1.1 3D Stripe Textures

Figure 3.1: Various stripe textures result from drawing a regular array of xy points while keeping z constant.

There are a surprising number of ways to make a striped texture. Let's assume we have two colors c0 and c1 that we want to make the stripe colors. We need some oscillating function to switch between the two colors. An easy one is a sine:

RGB stripe( point p )
    if (sin(xp) > 0) then
        return c0
    else
        return c1

We can also make the stripe's width w controllable:

RGB stripe( point p, real w )
    if (sin(πxp/w) > 0) then
        return c0
    else
        return c1

If we want to interpolate smoothly between the stripe colors, we can use a parameter t to linearly vary the color:

RGB stripe( point p, real w )
    t = (1 + sin(πxp/w))/2
    return (1 − t)c0 + tc1


These three possibilities are shown in Figure 3.1.
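The three stripe functions translate directly to Python; the colors c0 and c1 are arbitrary example values:

```python
import math

C0, C1 = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)  # example stripe colors

def stripe(p):
    """Hard-edged stripes from the sign of a sine."""
    return C0 if math.sin(p[0]) > 0 else C1

def stripe_w(p, w):
    """Hard-edged stripes of width w."""
    return C0 if math.sin(math.pi * p[0] / w) > 0 else C1

def smooth_stripe(p, w):
    """Smoothly blended stripes: t varies between 0 and 1."""
    t = (1 + math.sin(math.pi * p[0] / w)) / 2
    return tuple((1 - t) * a + t * b for a, b in zip(C0, C1))
```

Because these are functions of the 3D point p, they can be evaluated directly at any surface point a ray hits.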

3.1.2 Texture Arrays

Another way we can specify texture in space is to store a 3D array of color values and to associate a spatial position with each of these values. We first discuss this for 2D arrays in 2D space. Such textures can be applied in 3D by using two of the dimensions, e.g. x and y, to determine what texture values are used. We then extend those 2D results to 3D.

We will assume the two dimensions to be mapped are called u and v. We also assume we have an nx by ny image that we use as the texture. Every (u, v) needs an associated color found from the image. A fairly standard way to make texturing work for (u, v) is to first remove the integer portion of (u, v) so that it lies in the unit square. This has the effect of "tiling" the entire uv plane with copies of the now-square texture (Figure 3.2). We then use one of three interpolation strategies to compute the image color for that coordinate. The simplest strategy is to treat each image pixel as a constant-colored rectangular tile. This is shown on the left of Figure 3.3. To compute this, apply c(u, v) = cij, where c(u, v) is the texture color at (u, v) and cij is the pixel color for pixel indices:

i = ⌊unx⌋,
j = ⌊vny⌋,    (3.1)

where ⌊x⌋ is the floor of x, (nx, ny) is the size of the image being textured, and the indices start at (i, j) = (0, 0). This method for a simple image is shown on the left of Figure 3.3.

For a smoother texture, bilinear interpolation can be used, as shown in the middle of Figure 3.3. Here we use the formula:

c(u, v) = (1 − u′)(1 − v′)cij + u′(1 − v′)c(i+1)j + (1 − u′)v′ci(j+1) + u′v′c(i+1)(j+1)

where

u′ = nu − ⌊nu⌋,
v′ = nv − ⌊nv⌋.

Figure 3.2: The tiling of an image onto the (u,v) plane. Note that the input image is rectangular, and that this rectangle is mapped to a unit square on the (u,v) plane.

Figure 3.3: The image on the left has nine pixels that are all black or white. From left to right, the three interpolation strategies are nearest-neighbor, bilinear, and Hermite.

The discontinuities in the derivative of intensity can cause visible Mach bands, so Hermite smoothing can be used:

c(u, v) = (1 − u″)(1 − v″)cij + u″(1 − v″)c(i+1)j + (1 − u″)v″ci(j+1) + u″v″c(i+1)(j+1),

where

u″ = 3(u′)² − 2(u′)³,
v″ = 3(v′)² − 2(v′)³,

which results in the rightmost image of Figure 3.3.

In 3D we have a 3D array of values. All of the ideas from 2D extend naturally.

As an example, let's assume that we will do trilinear interpolation between values. First we compute the texture coordinates (u′, v′, w′) and the lower indices (i, j, k) of the array element being interpolated:

c(u, v, w) = (1 − u′)(1 − v′)(1 − w′)cijk
           + u′(1 − v′)(1 − w′)c(i+1)jk
           + (1 − u′)v′(1 − w′)ci(j+1)k
           + (1 − u′)(1 − v′)w′cij(k+1)
           + u′v′(1 − w′)c(i+1)(j+1)k
           + u′(1 − v′)w′c(i+1)j(k+1)
           + (1 − u′)v′w′ci(j+1)(k+1)
           + u′v′w′c(i+1)(j+1)(k+1)    (3.2)

where

u′ = nu − ⌊nu⌋,
v′ = nv − ⌊nv⌋,
w′ = nw − ⌊nw⌋.    (3.3)

3.1.3 Solid Noise

Although regular textures such as stripes are often useful, more often we would like to be able to make "mottled" textures such as we see on birds' eggs. This is usually done using a sort of solid noise, usually called Perlin noise after its inventor, who received a technical Academy Award for its impact in the film industry.

Figure 3.4: Absolute value of solid noise, and noise for scaled x and y values.

Getting a noisy appearance by calling a random number for every point wouldn't be appropriate because it would just be like the "white noise" of TV static. We would like to make it smoother without losing the random quality. One possibility would be to blur white noise, but there is no practical implementation of this. Another possibility would be to make a big lattice with a random number at every lattice point, and interpolate these random points for points between lattice nodes; this is just a 3D texture array as described in the last section with random numbers in the array. This would make the lattice too obvious. Perlin used a variety of tricks to improve this basic lattice technique so the lattice is not so obvious. This results in a rather baroque-looking set of steps, but essentially there are three changes to just linearly interpolating a 3D array of random values. The first change is to use Hermite interpolation to avoid Mach bands, just as can be done with regular textures. The second change is to use random vectors rather than values, with a dot product to derive a random number; this makes the underlying grid structure less visually obvious by moving the local minima and maxima off the grid vertices. The third change is to use a 1D array and hashing to create a virtual 3D array of random vectors. This adds computation to lower memory use. Here is his basic method:

n(x, y, z) = Σ(i = ⌊x⌋ to ⌊x⌋+1) Σ(j = ⌊y⌋ to ⌊y⌋+1) Σ(k = ⌊z⌋ to ⌊z⌋+1) Ωijk(x − i, y − j, z − k),

where (x, y, z) are the Cartesian coordinates of x, and

Ωijk(u, v, w) = ω(u)ω(v)ω(w) (Γijk · (u, v, w)) ,

and ω(t) is the cubic weighting function:

ω(t) = 2|t|³ − 3|t|² + 1   if |t| < 1,
ω(t) = 0                   otherwise.

The final piece is that Γijk is a random unit vector for the lattice point (x, y, z) = (i, j, k). Since we want any potential ijk, we use a pseudorandom table:

Γijk = G (φ(i + φ(j + φ(k)))) ,

where G is a precomputed array of n random unit vectors, and φ(i) = P[i mod n], where P is an array of length n containing a permutation of the integers 0 through n − 1. In practice, Perlin reports n = 256 works well. To choose a random unit vector (vx, vy, vz) uniformly, first set:

vx = 2ξ − 1,

vy = 2ξ′ − 1,

vz = 2ξ′′ − 1,

where ξ, ξ′, ξ″ are canonical random numbers (uniform in the interval [0, 1)). Then if (vx² + vy² + vz²) < 1, make the vector a unit vector. Otherwise keep setting it randomly until its length is below one, and then make it a unit vector. This is an example of a rejection method, which will be discussed more in Chapter ??. Essentially the less-than test gets a random point in the unit sphere, and the vector from the origin to that point is uniformly random. That would not be true of random points in the cube, so we "get rid" of the corners with the test.

Figure 3.6: Turbulence function with one through eight terms in the summation.

Figure 3.5: Using 0.5(noise + 1) (top) and 0.8(noise + 1) (bottom) for intensity.

Because solid noise can be positive or negative, it must be transformed before being converted to a color. The absolute value of noise over a ten by ten square is shown in Figure 3.4, along with stretched versions. These versions are stretched by scaling the points input to the noise function.

The dark curves are where the original noise function went from positive to negative. Since noise varies from −1 to 1, a smoother image can be gotten by using (noise + 1)/2 for color. However, since noise values close to 1 or −1 are rare, this will be a fairly low-contrast image. Larger scaling can increase the contrast (Figure 3.5).
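The lattice idea can be illustrated with a simplified 2D value noise: random values at lattice points, a permutation-table hash, and Hermite-weighted interpolation. This is not full Perlin noise (which uses gradient vectors and a dot product, as described above), just the lattice-plus-smooth-interpolation skeleton:

```python
import math
import random

N = 256
random.seed(0)
P = list(range(N))
random.shuffle(P)                                  # permutation table
VALS = [random.uniform(-1.0, 1.0) for _ in range(N)]

def lattice_value(i, j):
    """Hash the lattice point to a repeatable random value in [-1, 1]."""
    return VALS[P[(i + P[j % N]) % N]]

def hermite(t):
    return 3 * t * t - 2 * t * t * t               # 3t^2 - 2t^3 smoothing

def value_noise(x, y):
    """Smoothly interpolate the four surrounding lattice values."""
    i, j = math.floor(x), math.floor(y)
    u, v = hermite(x - i), hermite(y - j)
    return ((1 - u) * (1 - v) * lattice_value(i, j)
            + u * (1 - v) * lattice_value(i + 1, j)
            + (1 - u) * v * lattice_value(i, j + 1)
            + u * v * lattice_value(i + 1, j + 1))
```

At lattice points the function returns the stored random value exactly; in between, the Hermite weights keep the first derivative continuous, avoiding the Mach bands of plain linear interpolation.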

3.1.4 Turbulence

Many natural textures contain a variety of feature sizes in the same texture. Perlinuses a pseudofractal “turbulence” function:

nt(x) = Σi |n(2^i x)| / 2^i

This effectively repeatedly adds scaled copies of the noise function on top of itself, as shown in Figure 3.6.

The turbulence can be used to distort the stripe function:

RGB turbstripe( point p, double w )
    double t = (1 + sin(k1 zp + turbulence(k2 p))/w)/2
    return t ∗ s0 + (1 − t) ∗ s1


Figure 3.7: Various turbulent stripe textures with different k1, k2. The top row has only the first term of the turbulence series.

Figure 3.8: A Mercator projection world map and its placement on the sphere. The distortions in the texture map (i.e., Greenland being so large) exactly correspond to the shrinking that occurs when the map is applied to the sphere.

Various values for k1 and k2 were used to generate Figure 3.7.

3.2 2D Texture Mapping

Here we use a 2D coordinate, often called uv, which is used to create a reflectance R(u, v).

The key is to take an image and associate a (u, v) coordinate system with it so that it can be associated with points on a 3D surface. For example, if the latitudes and longitudes on the world map are associated with a polar coordinate system on the sphere, we get a globe (Figure 3.8).

As should be clear from Figure 3.8, it is crucial that the coordinates on the image and the object match in "just the right way". As a convention, the coordinate system on the image is set to be the unit square (u, v) ∈ [0, 1]². For (u, v) outside of this square, only the fractional parts of the coordinates are used, resulting in a tiling of the plane (Figure 3.2). Note that the image has a different number of pixels horizontally and vertically, so the image pixels have a non-uniform aspect ratio in (u, v) space.

To map this (u, v) ∈ [0, 1]² image onto a sphere, we first compute the polar coordinates. Recall the spherical coordinate system described by Equation ??. For a sphere of radius R and with center (xc, yc, zc), the parametric equation of the sphere is:

x = xc + R cos φ sin θ,

y = yc + R sin φ sin θ,

z = zc + R cos θ.

The (θ, φ) can be found using:

θ = arccos((z − zc)/R),

φ = arctan2(y − yc, x − xc),

where arctan2(a, b) is the atan2 of most math libraries, which returns the arctangent of a/b. Because (θ, φ) ∈ [0, π] × [−π, π], we convert to (u, v) as follows, after first adding 2π to φ if it is negative:

u = φ/(2π),

v = (π − θ)/π.

This mapping is shown in Figure 3.8. There is a similar, although likely more complicated, way to generate coordinates for most 3D shapes.
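The (θ, φ) → (u, v) conversion above can be sketched directly; `sphere_uv` is a hypothetical helper name, and the clamp on z/R only guards against floating-point values slightly outside [−1, 1]:

```python
import math

def sphere_uv(p, center, R):
    """Map a point p on a sphere of radius R centered at `center` to
    (u, v) in [0,1]^2 using theta = arccos((z - zc)/R) and
    phi = atan2(y - yc, x - xc), adding 2*pi to phi when negative."""
    x, y, z = (p[i] - center[i] for i in range(3))
    theta = math.acos(max(-1.0, min(1.0, z / R)))
    phi = math.atan2(y, x)
    if phi < 0.0:
        phi += 2.0 * math.pi
    u = phi / (2.0 * math.pi)
    v = (math.pi - theta) / math.pi
    return u, v
```

For example, a point on the equator facing the +x axis maps to (u, v) = (0, 0.5), and the north pole maps to v = 1.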

3.3 Tessellated Models

Most real-world models are composed of complexes of triangles with shared vertices. In many fields these are known as triangular meshes or triangulated irregular networks (TINs). Most graphics programs need a way to keep these models from using too much storage, and to deal with texture maps.

A simple triangular mesh is shown in Figure 3.9. You could store these three triangles as independent entities, and thus store point p1 three times and the other vertices twice each, for a total of nine stored points (three vertices for each of three triangles). Or you could try to somehow share the common vertices and store only four. So instead of

class triangle
    material m
    vector3 p0, p1, p2

You would have two classes:


Figure 3.9: A three-triangle mesh with four vertices. Each meshtriangle stores a parent pointer to the mesh; the index lists for the three triangles are (0, 1, 2), (3, 2, 1), and (1, 0, 3).

class mesh
    material m
    array of vector3 vertices

and

class meshtriangle
    pointer to mesh meshptr
    int i0, i1, i2

where i0, i1, and i2 are indices into the vertices array. Either the triangle class or the mesh class will work. Is there a space advantage for the mesh class? Typically, each vertex in a large mesh is shared by about six triangles, although there can be any number in extreme cases. This means about two triangles for each shared vertex. If you have n triangles, then there are about n/2 vertices in the shared case and 3n in the un-shared case. But when you share you get an extra 3n integers and n pointers. Since you don't have to store the material in each mesh triangle, that saves n pointers, which cancels out the storage for meshptr. If we assume that the data for floats, pointers, and ints all take the same storage (a dubious assumption), the triangles will take 10n storage units and the mesh will take 5.5n storage units. So sharing saves around a factor of two, which seems to hold for most implementations. Is this factor of two worth the complication? I think the answer is yes as soon as you start adding "properties" to the vertices.
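A minimal sketch of the two shared-vertex classes in Python (the class and field names mirror the text's mesh and meshtriangle; the index lists are those assumed for Figure 3.9):

```python
class Mesh:
    """Shared-vertex triangle mesh: one vertex array for the whole model."""
    def __init__(self, vertices, material=None):
        self.material = material
        self.vertices = vertices          # list of (x, y, z) tuples

class MeshTriangle:
    """Triangle storing three indices into its parent mesh's vertex array."""
    def __init__(self, mesh, i0, i1, i2):
        self.mesh = mesh                  # the 'meshptr' of the text
        self.i = (i0, i1, i2)
    def points(self):
        return [self.mesh.vertices[k] for k in self.i]

# The three-triangle, four-vertex mesh of Figure 3.9.
m = Mesh([(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)])
tris = [MeshTriangle(m, 0, 1, 2), MeshTriangle(m, 3, 2, 1), MeshTriangle(m, 1, 0, 3)]
```

Note that point p1 is stored once but referenced by all three triangles, which is exactly where the storage savings come from.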

Each vertex can have material parameters, texture coordinates, irradiances, and essentially any parameter that a renderer might use. In practice these parameters are bilinearly interpolated across the triangle. So if a triangle is intersected at barycentric coordinates (β, γ), you interpolate the (u, v) coordinates the same way you interpolate points. Recall that the point at barycentric coordinate (β, γ) is:

p(β, γ) = a + β(b− a) + γ(c− a).


Figure 3.10: Various mesh textures obtained by changing the (u, v) coordinates stored at vertices.

A similar equation applies for(u, v):

u(β, γ) = ua + β(ub − ua) + γ(uc − ua),
v(β, γ) = va + β(vb − va) + γ(vc − va).

Several ways a texture could be applied by changing the (u, v) at triangle vertices are shown in Figure 3.10. This sort of calibration texture map makes it easier to understand the texture coordinates of your objects during debugging (Figure 3.11).
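The barycentric interpolation above applies the same formula to points and to (u, v) pairs; a small sketch (`bary_interp` is a hypothetical helper that handles scalars and tuples componentwise):

```python
def bary_interp(a, b, c, beta, gamma):
    """Interpolate per-vertex values at barycentric coordinates (beta, gamma):
    value = a + beta*(b - a) + gamma*(c - a). Tuples are handled componentwise."""
    if isinstance(a, tuple):
        return tuple(bary_interp(ai, bi, ci, beta, gamma)
                     for ai, bi, ci in zip(a, b, c))
    return a + beta * (b - a) + gamma * (c - a)

# (u, v) halfway along the edge from vertex a to vertex b:
uv = bary_interp((0.0, 0.0), (1.0, 0.0), (0.0, 1.0), 0.5, 0.0)
```

At (β, γ) = (0, 0), (1, 0), and (0, 1) this reproduces the vertex values exactly, as the defining equation requires.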

Figure 3.11: Top: a calibration texture map. Bottom: the sphere viewed along the y axis.

3.4 Texture Mapping for Rasterized Triangles

We would like to get the same texture images whether we use a ray tracing program or a rasterization method such as a z-buffer. There turn out to be some subtleties in achieving this with correct-looking perspective, but we can address them at the rasterization stage. The reason things are not straightforward is that just interpolating texture coordinates in screen space results in incorrect images, as shown for the grid texture in Figure 3.12. Because things in perspective get smaller with distance from the viewer, the lines that are evenly spaced in 3D should compress in 2D image space. More careful interpolation of texture coordinates is needed to accomplish this.


3.4.1 Perspective Correct Textures

We could implement texture mapping on triangles by interpolating the (u, v) coordinates, modifying the rasterization method of Section ??, but this results in the problem shown on the right of Figure 3.12. A similar problem occurs for triangles if screen-space barycentric coordinates are used, as in the rasterization code:

for all x do
    for all y do
        compute (α, β, γ) for (x, y)
        if α ∈ (0, 1) and β ∈ (0, 1) and γ ∈ (0, 1) then
            t = α t0 + β t1 + γ t2
            drawpixel (x, y) with color texture(t) for a solid texture
                or with texture(β, γ) for a 2D texture

This code will generate images, but there is a problem. To unravel the basic problem, let's consider the progression from the world-space point q to the homogeneous point r to the homogenized point s:

(xq, yq, zq, 1) −−transform−→ (xr, yr, zr, hr) −−homogenize−→ (xr/hr, yr/hr, zr/hr, 1) ≡ (xs, ys, zs, 1)

The space we are interpolating in if we use screen space is that of s. However, we would like to be interpolating in space q or r, where the homogeneous division has not yet non-linearly distorted the barycentric coordinates of the triangle.

The key observation is that 1/hr is interpolated with no distortion. Likewise, so are u/hr and v/hr. In fact, so is k/hr, where k is any quantity that varies linearly across the triangle. Recall from Section ?? that if we transform all points along the line segment between points q and Q and homogenize, we have:

s + (hR t / (hr + t(hR − hr))) (S − s),

but if we linearly interpolate in the homogenized space we have:

s + a(S − s).

Although those lines sweep out the same points, typically a ≠ t for the same points on the line segment. However, if we interpolate 1/h we do get the same answer regardless of which space we interpolate in. To see that this is true, confirm that (Exercise 2):

1/hr + (hR t / (hr + t(hR − hr))) (1/hR − 1/hr) = 1/hr + t (1/hR − 1/hr).    (3.4)


This ability to interpolate 1/h linearly with no error in the transformed space allows us to correctly texture triangles. Perhaps the least confusing way to deal with this distortion is to compute the world-space barycentric coordinates of the triangle (βw, γw) in terms of the screen-space coordinates (β, γ). We note that βw/h and γw/h can be interpolated linearly in screen space. For example, at the screen-space position associated with screen-space barycentric coordinates (β, γ) we can interpolate βw/h without distortion. Because βw = 0 at vertex 0 and vertex 2, and βw = 1 at vertex 1, we have:

βw/h = 0/h0 + β (1/h1 − 0/h0) + γ (0/h2 − 0/h0).    (3.5)

Because of all the zero terms, Equation 3.5 is fairly simple. However, to get βw from it we must know h. Because we know 1/h is linear in screen space, we have:

1/h = 1/h0 + β (1/h1 − 1/h0) + γ (1/h2 − 1/h0).    (3.6)

Dividing Equation 3.5 by Equation 3.6 gives:

βw = (β/h1) / (1/h0 + β (1/h1 − 1/h0) + γ (1/h2 − 1/h0)).

Multiplying numerator and denominator by h0 h1 h2 and doing a similar set of manipulations for the analogous equations in γw gives:

βw = h0 h2 β / (h1 h2 + h2 β (h0 − h1) + h1 γ (h0 − h2)),
γw = h0 h1 γ / (h1 h2 + h2 β (h0 − h1) + h1 γ (h0 − h2)).    (3.7)
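Equation 3.7 translates directly into code; this sketch assumes the three vertices' homogeneous coordinates h0, h1, h2 are available:

```python
def world_barycentric(beta, gamma, h0, h1, h2):
    """Equation 3.7: recover world-space (beta_w, gamma_w) from the
    screen-space (beta, gamma) and the vertices' homogeneous h values.
    The two fractions share one denominator, so compute it once."""
    d = h1 * h2 + h2 * beta * (h0 - h1) + h1 * gamma * (h0 - h2)
    return h0 * h2 * beta / d, h0 * h1 * gamma / d
```

Two sanity checks: when h0 = h1 = h2 (no perspective distortion), the world and screen coordinates agree; and at a vertex, e.g. (β, γ) = (1, 0), the world coordinates are (1, 0) as well.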

Note that the two denominators are the same.

For triangles that use the perspective matrix from Chapter ??, recall that h = z/n, where z is the distance from the viewer perpendicular to the screen. Thus for that matrix 1/z also varies linearly. We can use this to modify our scan-conversion code for three points ti = (xi, yi, zi, hi) that have been passed through the viewing matrices but have not been homogenized:

compute bounds for x = xi/hi and y = yi/hi
for all x do
    for all y do
        compute (α, β, γ) for (x, y)
        if (α ∈ [0, 1] and β ∈ [0, 1] and γ ∈ [0, 1]) then
            d = h1 h2 + h2 β (h0 − h1) + h1 γ (h0 − h2)
            βw = h0 h2 β / d
            γw = h0 h1 γ / d
            αw = 1 − βw − γw
            u = αw u0 + βw u1 + γw u2
            v = αw v0 + βw v1 + γw v2
            drawpixel (x, y) with color texture(u, v)

For solid textures just recall that by the definition of barycentric coordinates:

p = (1− βw − γw)p0 + βwp1 + γwp2,

where the pi are the world-space vertices. Then just call a solid-texture routine for point p.

3.5 Bump Textures

Although so far we have only discussed changing reflectance using texture, you can also change the surface normal to give an illusion of fine-scale geometry on the surface. We can do this by applying a bump map, which perturbs the surface normal. One way to do this is:

vector3 n = surfaceNormal(x)
n = n + k1 ∗ vectorTurbulence(k2 ∗ x)

This is shown in Figure 3.13.

To implement vectorTurbulence we first need vectorNoise, which produces a simple spatially varying 3D vector:

nv(x, y, z) = Σ (i = ⌊x⌋ to ⌊x⌋+1) Σ (j = ⌊y⌋ to ⌊y⌋+1) Σ (k = ⌊z⌋ to ⌊z⌋+1) Γijk ω(x − i) ω(y − j) ω(z − k)

With this, vectorTurbulence is a direct analog of turbulence: sum a series of scaled versions of vectorNoise.

Figure 3.13: Vector turbulence on a sphere of radius 1.6, lit directly from above. Top: k1 = 0. Middle: k1 = 0.08, k2 = 8. Bottom: k1 = 0.24, k2 = 8.

3.6 Displacement Mapping

One problem with Figure 3.13 is that the bumps neither cast shadows nor affect the silhouette of the object. These limitations arise because we are not really changing any geometry. If we want more realism we can apply a displacement map. A displacement map actually changes the geometry using a texture. A common simplification is that the displacement will be in the direction of the surface normal. If we take all points p on a surface, with associated surface normal vectors n, then we can make a new surface using a 3D texture f(p):

p′ = p + f(p)n.

This concept is shown in Figure 3.14.

Figure 3.14: The points p on the surface are each displaced along the normal direction n to p′.


Displacement mapping is straightforward to implement in a z-buffer code bystoring the surface to be displaced as a fine mesh of many triangles. Each vertexin the mesh can then be displaced along the normal vector direction. This resultsin large models, but it is quite robust.
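Displacing the vertices of such a fine mesh can be sketched as follows; the height function f used in the example is an arbitrary illustrative choice, and the normals are assumed to be unit length:

```python
def displace(vertices, normals, f):
    """Move each vertex p along its unit normal n by f(p): p' = p + f(p) n."""
    out = []
    for p, n in zip(vertices, normals):
        d = f(p)
        out.append(tuple(pc + d * nc for pc, nc in zip(p, n)))
    return out

# Lift a flat patch by a hypothetical height texture f(p) = 0.1 * p[0].
moved = displace([(0, 0, 0), (1, 0, 0)],
                 [(0, 0, 1), (0, 0, 1)],
                 lambda p: 0.1 * p[0])
```

Because every vertex is moved independently, the displaced mesh casts shadows and changes the silhouette, unlike a bump map.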

Frequently Asked Questions

How do I implement displacement-mapping in ray tracing?

There is no ideal way to do it. Generating all the triangles is probably best, andcaching the geometry when necessary will prevent memory overload.

Why do my textured images not look more real?

Humans are good at seeing small imperfections in surfaces. Geometric imperfections are typically absent in computer-generated images that use texture maps for details, so they look "too smooth".

My textured animations look bad when there are many texels visible inside a pixel. What should I do?

The problem is that the texture resolution is too high for that image. We would like a smaller down-sampled version of the texture. However, if we move closer, such a down-sampled texture would look too blurry. What we really need is to be able to dynamically choose the texture resolution based on viewing conditions so that about one texel is visible through each pixel. A common way to do that is to use MIP-mapping. That technique establishes a multi-resolution set of textures and chooses one of the textures for each polygon or pixel. Typically the resolutions vary by a factor of two, e.g., 512², 256², 128², etc.

Notes

Solid texture with noise was introduced in separate papers by Perlin and Lewis in SIGGRAPH 85. The paper Hypertexture (Perlin, SIGGRAPH, 1989) was used for the noise algorithm in this chapter. Perlin won a technical Academy Award for his work on solid procedural textures. Texture mapping was introduced in Texture and Reflection in Computer Generated Images (Blinn and Newell, Communications of the ACM, October, 1976). The discussion of perspective-correct textures is based on Fast Shadows and Lighting Effects Using Texture Mapping (Segal, Korobkin, and van Widenfelt, SIGGRAPH, 1992) and on 3D Game Engine Design (Eberly, Morgan-Kaufmann, 2000).


Exercises

1. Find several ways to implement an infinite 2D checkerboard using surface and solid techniques. Which is best?

2. Verify that Equation 3.4 is a valid equality using brute-force algebra.

3. How could you implement solid texturing by using the z-buffer depth anda matrix transform?


Chapter 4

Data Structures for Graphics

There are a variety of data structures that seem to pop up repeatedly across graphics applications. This chapter discusses three basic and unrelated data structures that are among the most common and useful. There are many variants of these data structures, but the basic ideas behind them can be conveyed using one example of each, as done here. First the winged-edge data structure for storing tessellated geometric models is discussed. That type of data structure is useful for managing models where the tessellation changes, such as in subdivision or simplification routines. Next the scene-graph data structure is presented. Scene graphs are rapidly becoming well-supported features of new graphics APIs because they are so useful in managing objects and transformations. Finally, the tiled multidimensional array is presented. Originally developed to improve paging performance, such structures are now crucial for memory locality on machines regardless of whether the array fits in main memory.

4.1 Triangle Meshes

One of the most common model representations is a polygonal mesh, as discussed in Section 3.3. When such meshes are unchanging in the program, the simple structure described in that section is usually sufficient. However, when the meshes are to be modified, more complicated data representations are needed to efficiently answer queries such as:

• given a triangle, what are the three adjacent triangles?

• given an edge, what two triangles share it?

• given a vertex, what faces share it?

• given a vertex, what edges share it?


Figure 4.1: An edge in a winged-edge data structure. With each edge is stored its two vertices, the face (polygon) to the left of the edge, the face to the right of the edge, and the predecessor and successor edges in the traversal of each of those faces.

There are many such data structures for triangle meshes, polygonal meshes, and polygonal meshes with holes (see the chapter notes for references). In many applications the meshes are very large, so an efficient representation can be crucial. However, it is best to hide the implementation behind a class interface and begin with a bloated but straightforward implementation, as discussed in the next section. After that a more efficient implementation, the winged-edge data structure, is discussed.

4.1.1 Winged-Edge Data Structure

The most straightforward implementation is to have three types: vertex, edge, and triangle. There are a variety of ways to divide the data among these types. While one might be tempted to just store all the relationships, this makes for variable-length data structures that really are not needed. For example, a vertex can have an arbitrarily high number of edges incident to it. Instead we can use the classic winged-edge data structure, which makes edges the first-class citizens of the data structure. This data structure is shown in Figures 4.1 and 4.2.

Note that the winged-edge data structure supports the desired queries in constant time per element reported. For example, a face can access one of its edges and follow the traversal pointers to find all of its edges. Those edges store the adjoining faces.
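The face traversal can be sketched in a few lines. The field names below (`succ_left`, `succ_right`, `face_left`, `face_right`) are assumptions mirroring Figure 4.1, and only the fields needed for the walk are filled in:

```python
class WEdge:
    """Winged-edge record: two vertices, two faces, and per-face successors."""
    def __init__(self, v1, v2):
        self.v1, self.v2 = v1, v2
        self.face_left = self.face_right = None
        self.succ_left = self.succ_right = None

def face_edges(face, start):
    """Walk successor pointers around `face`, starting from one of its edges."""
    edges, e = [], start
    while True:
        edges.append(e)
        # Follow the successor on whichever side of e borders this face.
        e = e.succ_left if e.face_left is face else e.succ_right
        if e is start:
            return edges

# One triangular face F bounded by edges a, b, c (all on their left side).
a, b, c = WEdge("A", "B"), WEdge("B", "C"), WEdge("C", "A")
F = "F"
for e, s in ((a, b), (b, c), (c, a)):
    e.face_left, e.succ_left = F, s
```

Each step of the loop does constant work, so enumerating the k edges of a face costs O(k).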

As with any data structure, the winged-edge data structure makes a variety of time/space trade-offs. For example, we could eliminate the pred references. When we need to know the previous edge, we could follow the successor edges in a circle until we get back to the original edge. This would save space but make the computation of the previous edge take longer. This type of issue has led to a proliferation of mesh data structures (see the chapter notes for more information on those structures).


Figure 4.2: A tetrahedron and the associated elements for a winged-edge data structure. The two small tables (vertex-to-edge and face-to-edge) are not unique; each vertex and face stores any one of the edges it is associated with.


Figure 4.3: A hinged pendulum. On the left are the two pieces in their "local" coordinate systems. The hinge of the top piece is at point b and the attachment for the bottom piece is at its local origin. The degrees of freedom for the assembled object are the angles (θ, φ) and the location p of the top hinge.

4.2 Scene Graphs

To motivate the scene graph, consider the hinged pendulum shown in Figure 4.3. Consider how we would draw the top part of the pendulum:

M1 = rotate(θ)
M2 = translate(p)
M3 = M2 M1
apply M3 to all points in upper pendulum

Figure 4.4: The scene graph for the hinged pendulum of Figure 4.3.

The bottom is more complicated, but we can take advantage of the fact that it is attached to the bottom of the upper pendulum at point b in the local coordinate system. First we rotate the lower pendulum so that it is at an angle φ relative to its initial position. Then we move it so that its top hinge is at point b. Now it is in the appropriate position in the local coordinates of the upper pendulum, and it can then be moved along with that coordinate system. The composite transform for the lower pendulum is:

Ma = rotate(φ)
Mb = translate(b)
Mc = Mb Ma
Md = M3 Mc
apply Md to all points in lower pendulum

Thus we see that the lower pendulum not only lives in its own local coordinate system, but that coordinate system is itself moved along with that of the upper pendulum.

We can encode the pendulum in a data structure that makes management of these coordinate-system issues easier, as shown in Figure 4.4. The appropriate matrix to apply to an object is just the product of all the matrices in the chain from the object to the root of the data structure. For example, consider the model of a ferry that has a car that can move freely on the deck of the ferry, and wheels that each move relative to the car, as shown in Figure 4.5.

Figure 4.5: A ferry carrying a car which has wheels attached (only two shown), stored as a scene graph.

As with the pendulum, each object should be transformed by the product ofthe matrices in the path from the root to the object:

ferry transform using M0
car body transform using M0 M1
left wheel transform using M0 M1 M2
right wheel transform using M0 M1 M3

An efficient implementation can be achieved using a matrix stack, a data structure supported by many APIs. A matrix stack is manipulated using push and pop operations that add and delete matrices from the right-hand side of a matrix product. For example, calling:

push(M0)
push(M1)
push(M2)

creates the active matrix M = M0 M1 M2. A subsequent call to pop() strips the last matrix added, so that the active matrix becomes M = M0 M1. Combining the matrix stack with a recursive traversal of a scene graph gives us:

function traverse(node)
    push(Mlocal)
    draw object using composite matrix from stack
    traverse(left child)
    traverse(right child)
    pop()

There are many variations on scene graphs but all follow the basic idea above.
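The matrix stack plus recursive traversal above can be sketched as follows, using 3×3 homogeneous 2D matrices to keep the example small; all names here are illustrative, not a real API:

```python
def matmul(A, B):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

class MatrixStack:
    """push(M) multiplies M onto the right of the product; pop() undoes it."""
    def __init__(self):
        self.stack = [IDENTITY]
    def push(self, M):
        self.stack.append(matmul(self.stack[-1], M))
    def pop(self):
        self.stack.pop()
    def top(self):
        return self.stack[-1]

def translate2d(px, py):
    return [[1, 0, px], [0, 1, py], [0, 0, 1]]

def traverse(node, ms, out):
    """Recursive scene-graph walk: push the local matrix, 'draw', recurse, pop."""
    ms.push(node["local"])
    out.append((node["name"], ms.top()))   # stand-in for drawing the object
    for child in node.get("children", []):
        traverse(child, ms, out)
    ms.pop()

# A car that carries a wheel, in the spirit of the ferry example.
scene = {"name": "car", "local": translate2d(1, 0),
         "children": [{"name": "wheel", "local": translate2d(0, 2)}]}
out = []
traverse(scene, MatrixStack(), out)
```

The wheel is drawn with the composite translate(1, 0) · translate(0, 2), i.e., a translation by (1, 2), exactly the product of the matrices on the path from the root.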

4.3 Tiling Multidimensional Arrays

Figure 4.6: The memory layout for an untiled 2D array with Nx = 4 and Ny = 3. The 1D indices run 0–3 across the first row, 4–7 across the second, and 8–11 across the third.

Effectively utilizing the cache hierarchy is a crucial task in designing algorithms for modern architectures. Making sure that multidimensional arrays have data in a "nice" arrangement is accomplished by tiling, sometimes also called bricking. A traditional 2D array is stored as a 1D array plus an indexing mechanism; for example, an Nx by Ny array is stored in a 1D array of length NxNy, and the 2D index (x, y) (which runs from (0, 0) to (Nx − 1, Ny − 1)) maps to the 1D index (running from 0 to NxNy − 1) using the formula:

index = x + Nx y.

An example of how that memory is laid out is shown in Figure 4.6. A problem with that layout is that although two adjacent array elements that are in the same row


Figure 4.8: A tiled 2D array composed of Bx by By tiles, each of size n by n (Bx = Nx/n, By = Ny/n).

are next to each other in memory, two adjacent elements in the same column will be separated by Nx elements in memory. This can cause poor memory locality for large Nx. The standard solution is to use tiles to make rows and columns behave more equally. An example is shown in Figure 4.7, where two by two tiles are used. The details of indexing such an array are discussed in the next section. A more complicated example, with two levels of tiling on a 3D array, is covered after that.

Figure 4.7: The memory layout for a tiled 2D array with Nx = 4 and Ny = 3 and two by two tiles. Note that padding on the top of the array is needed because Ny is not a multiple of the tile size two.

A key question is what size to make the tiles. In practice they should be similar to the memory-unit size on the machine. For example, on a machine with 128-byte cache lines, and using 16-bit data values, n is exactly 4. However, using float (32-bit) datasets, n is closer to 3. Because there are also coarser-sized memory units such as pages, hierarchical tiling with similar logic can be useful.

4.3.1 One-level Tiling for 2D Arrays

If we assume an Nx by Ny array decomposed into square n by n tiles (Figure 4.8), then the number of tiles is:

Bx = Nx/n,

By = Ny/n.

Here we assume that n divides Nx and Ny exactly. When this is not true, the array should be padded. For example, if Nx = 15 and n = 4, then Nx should be changed to 16. To work out a formula for indexing such an array, we first find the tile indices (bx, by), which give the row/column for the tiles (the tiles themselves form a 2D array):

bx = x ÷ n,
by = y ÷ n,

where ÷ is integer division, e.g., 12 ÷ 5 = 2. If we order the tiles along rows as shown in Figure 4.7, then the index of the first element of the tile (bx, by) is:

index = n²(Bx by + bx).

The memory in that tile is arranged like a traditional 2D array, as shown in Figure 4.6. The partial offsets (x′, y′) inside the tile are:

x′ = x mod n,

y′ = y mod n,

where mod is the remainder operator, e.g., 12 mod 5 = 2. Therefore the offset inside the tile is

offset = y′n + x′.

Thus the full formula for finding the 1D index of element (x, y) in an Nx by Ny array with n by n tiles is:

index = n²(Bx by + bx) + y′n + x′
      = n²((Nx ÷ n)(y ÷ n) + x ÷ n) + (y mod n)n + (x mod n).

This expression contains many integer multiplication, division, and modulus operations. On modern processors these operations are extremely costly. For n that are powers of two, these operations can be converted to bit shifts and bitwise logical operations. However, as noted above, the ideal size is not always a power of two. Some of the multiplications can be converted to shift/add operations, but the divide and modulus operations are more problematic. The indices could be computed incrementally, but this would require tracking counters, with numerous comparisons and poor branch-prediction performance.

However, there is a simple solution; note that the index expression can bewritten as:

index= Fx(x) + Fy(y)

where

Fx(x) = n²(x ÷ n) + (x mod n),
Fy(y) = n²(Nx ÷ n)(y ÷ n) + (y mod n)n.

We tabulate Fx and Fy, and use x and y to look up entries in the two tables and sum them to compute the index into the data array. These tables will consist of Nx and Ny elements, respectively. The total size of the tables will fit in the primary data cache of the processor even for very large data sets.
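A sketch of the table-based indexing, assuming Nx and Ny are multiples of n; `tiled_index` recomputes the direct formula so the two can be compared:

```python
def make_tables(Nx, Ny, n):
    """Tabulate Fx and Fy so that index(x, y) = Fx[x] + Fy[y]
    for an Nx-by-Ny array stored in n-by-n tiles."""
    Bx = Nx // n
    Fx = [n * n * (x // n) + (x % n) for x in range((Nx))]
    Fy = [n * n * Bx * (y // n) + (y % n) * n for y in range(Ny)]
    return Fx, Fy

def tiled_index(x, y, Nx, n):
    """The direct formula, for comparison with the tables."""
    return (n * n * ((Nx // n) * (y // n) + x // n)
            + (y % n) * n + (x % n))

Fx, Fy = make_tables(4, 4, 2)
```

Two table lookups and an add replace all of the divides and mods in the inner loop, which is the entire point of the transformation.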


4.3.2 Example: Two-level Tiling for 3D Arrays

Effective TLB utilization is also becoming a crucial factor in algorithm performance. The same technique can be used to improve TLB hit rates in a 3D array by creating m×m×m bricks of n×n×n cells. For example, a 40×20×19 volume could be decomposed into 4×2×2 macrobricks of 2×2×2 bricks of 5×5×5 cells. This corresponds to m = 2 and n = 5. Because 19 cannot be factored by mn = 10, one level of padding is needed. Empirically useful sizes are m = 5 for 16-bit datasets and m = 6 for float datasets.

The resulting index into the data array can be computed for any (x, y, z) triple with the expression:

index = ((x ÷ n) ÷ m) n³m³ ((Nz ÷ n) ÷ m)((Ny ÷ n) ÷ m)
      + ((y ÷ n) ÷ m) n³m³ ((Nz ÷ n) ÷ m)
      + ((z ÷ n) ÷ m) n³m³
      + ((x ÷ n) mod m) n³m²
      + ((y ÷ n) mod m) n³m
      + ((z ÷ n) mod m) n³
      + (x mod n) n²
      + (y mod n) n
      + (z mod n),

where Nx, Ny, and Nz are the respective sizes of the dataset.

Note that, as in the simpler 2D one-level case, this expression can be written as:

index = Fx(x) + Fy(y) + Fz(z),

where

Fx(x) = ((x ÷ n) ÷ m) n³m³ ((Nz ÷ n) ÷ m)((Ny ÷ n) ÷ m) + ((x ÷ n) mod m) n³m² + (x mod n) n²,
Fy(y) = ((y ÷ n) ÷ m) n³m³ ((Nz ÷ n) ÷ m) + ((y ÷ n) mod m) n³m + (y mod n) n,
Fz(z) = ((z ÷ n) ÷ m) n³m³ + ((z ÷ n) mod m) n³ + (z mod n).
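The three lookup functions translate directly into code; this sketch assumes the volume has been padded so each dimension is a multiple of nm (the function and variable names are illustrative):

```python
def make_3d_tables(Ny, Nz, n, m):
    """Build the three per-axis functions for two-level tiling:
    m^3-brick macrobricks of n^3-cell bricks. Nx is not needed because
    only the y and z extents appear in the macrobrick multipliers."""
    Bz = (Nz // n) // m          # macrobricks along z
    By = (Ny // n) // m          # macrobricks along y
    n3, m3 = n ** 3, m ** 3
    def fx(x):
        return (((x // n) // m) * n3 * m3 * Bz * By
                + ((x // n) % m) * n3 * m * m
                + (x % n) * n * n)
    def fy(y):
        return (((y // n) // m) * n3 * m3 * Bz
                + ((y // n) % m) * n3 * m
                + (y % n) * n)
    def fz(z):
        return (((z // n) // m) * n3 * m3
                + ((z // n) % m) * n3
                + (z % n))
    return fx, fy, fz

fx, fy, fz = make_3d_tables(8, 8, 2, 2)   # an 8x8x8 volume, n = m = 2
```

In practice fx, fy, and fz would be tabulated into three arrays, exactly as in the 2D case, so the inner loop again needs only three lookups and two adds.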


Frequently Asked Questions

Does tiling really make that much difference in performance?

On some volume-rendering applications, a two-level tiling strategy made as much as a factor-of-ten performance difference. When the array does not fit in main memory, tiling can also effectively prevent thrashing in applications such as image editing.

How do I store the lists in a winged-edge structure?

For most applications it is feasible to use arrays and indices for the references.However, if many delete operations are to be performed, then it is wise to uselinked lists and pointers.

Notes

The discussion of the winged-edge data structure is based on the course notes of Ching-Kuang Shene. There are smaller mesh data structures than winged-edge. The trade-offs in using such structures are discussed in Directed Edges – A Scalable Representation for Triangle Meshes (Campagna, Kobbelt, and Seidel, Journal of Graphics Tools, 3(4), 1998). The tiled-array discussion is based on A Ray Tracing Method for Isosurface Rendering (Parker et al., IEEE Visualization Conference, 1998).

Exercises

1. What is the memory difference between a simple tetrahedron stored as independent triangles and one stored in a winged-edge data structure?

2. Diagram a scene graph for a bicycle.

3. How many lookup tables are needed for a single-level tiling of an n-dimensional array?