CS179: GPU Programming Lecture 9: Lab 4 Recitation


Page 1: CS179: GPU Programming

CS179: GPU Programming. Lecture 9: Lab 4 Recitation

Page 2: CS179: GPU Programming

Today: 3D textures, PBOs, fractals, raytracing, lighting/Phong shading, memory coalescing.

Page 3: CS179: GPU Programming

3D Textures. Recall the advantages of textures: not global memory, so faster accesses; still available to all threads/blocks; larger in size; better caching; filtering, clamping, sampling, etc.

Page 4: CS179: GPU Programming

3D Textures. 3D textures store volume data; they could be used for rendering smoke, particles, fractals, etc.
Allocate a 3D cudaArray to make a 3D texture: cudaMalloc3DArray gives the correct pitch.
Declare the texture on the device: texture<type, 3, mode> tex.
Access it on the device with texture sampling: tex3D(tex, x, y, z).
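A minimal sketch of that setup, using the texture-reference API of that era. VOL_DIM, the float payload, and the function names here are illustrative assumptions, not lab 4's actual code:

    #include <cuda_runtime.h>

    #define VOL_DIM 128   // assumed cubic volume size

    // file-scope texture reference, sampled with tex3D on the device
    texture<float, 3, cudaReadModeElementType> tex;

    void createVolumeTexture(const float *h_data) {
        cudaExtent extent = make_cudaExtent(VOL_DIM, VOL_DIM, VOL_DIM);
        cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();

        cudaArray *d_array;
        cudaMalloc3DArray(&d_array, &desc, extent);   // handles pitch for us

        // copy the host volume into the cudaArray
        cudaMemcpy3DParms p = {0};
        p.srcPtr = make_cudaPitchedPtr((void *)h_data,
                                       VOL_DIM * sizeof(float), VOL_DIM, VOL_DIM);
        p.dstArray = d_array;
        p.extent = extent;
        p.kind = cudaMemcpyHostToDevice;
        cudaMemcpy3D(&p);

        cudaBindTextureToArray(tex, d_array, desc);
    }

    __device__ float sampleVolume(float x, float y, float z) {
        return tex3D(tex, x, y, z);   // hardware-filtered 3D fetch
    }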

Page 5: CS179: GPU Programming

3D Textures. Some texture properties you can set: tex.normalized (integer coordinates or normalized floats), tex.filterMode (linear or point filtering), and tex.addressMode[x] (wrap or clamp, set per dimension).
Bind the texture to the array with cudaBindTextureToArray. Unbinding afterward is typical, but probably not necessary. All of this is done for you in lab 4!
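For reference, continuing the sketch above, one reasonable configuration looks like this (the lab's actual settings may differ):

    tex.normalized = true;                      // sample with coords in [0, 1]
    tex.filterMode = cudaFilterModeLinear;      // trilinear interpolation
    tex.addressMode[0] = cudaAddressModeClamp;  // clamp in x
    tex.addressMode[1] = cudaAddressModeClamp;  // clamp in y
    tex.addressMode[2] = cudaAddressModeClamp;  // clamp in z
    cudaBindTextureToArray(tex, d_array, desc);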

Page 6: CS179: GPU Programming

PBOs. Pixel Buffer Objects (PBOs) store pixel data (here, the rendered image) and are used to easily render in OpenGL.
Recall the VBOs from lab 3: they stored vertex data, the data remained on the GPU (minimal transfer to/from the CPU), and it was rendered via OpenGL on the GPU. Same story here: pixels instead of verts, but the same idea.

Page 7: CS179: GPU Programming

PBOs. Initialize: glGenBuffersARB(n, &pbo), where n is the number of buffer names to generate.
Bind to OpenGL: glBindBufferARB(target, pbo), where target is the target buffer type.
Assign data: glBufferDataARB(target, size, data, usage), where data is a pointer to the data and usage tells OpenGL how often we read/write.
Map to CUDA: cudaGLMapBufferObject / cudaGLUnmapBufferObject.
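A hedged sketch of the whole PBO lifecycle; GL_PIXEL_UNPACK_BUFFER_ARB as the target and the uchar4 output format are assumptions for this lab:

    #include <GL/glew.h>
    #include <cuda_runtime.h>
    #include <cuda_gl_interop.h>

    GLuint pbo;

    void initPBO(int width, int height) {
        glGenBuffersARB(1, &pbo);                          // create one buffer name
        glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, pbo);  // bind as a pixel buffer
        glBufferDataARB(GL_PIXEL_UNPACK_BUFFER_ARB,
                        width * height * sizeof(uchar4),
                        0, GL_STREAM_DRAW_ARB);            // allocate, no initial data
        glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, 0);
        cudaGLRegisterBufferObject(pbo);                   // let CUDA see the buffer
    }

    void renderFrame() {
        uchar4 *d_out;
        cudaGLMapBufferObject((void **)&d_out, pbo);       // get a device pointer
        // ... launch the render kernel, one pixel per thread, writing d_out ...
        cudaGLUnmapBufferObject(pbo);                      // hand back to OpenGL
    }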

Page 8: CS179: GPU Programming

Fractals. Fractals: infinite complexity given by simple instructions; “self-similar, recursive”. Difficult for us to process (but nice for a computer!). There are many different kinds (we’ll look at the Julia set).
How to render on CUDA: calculate the fractal volume or area, copy it into a texture, and volume render via a PBO.
What is a Julia set?

Page 9: CS179: GPU Programming

Mandelbrot Set. The “father” of the Julia set.

Page 10: CS179: GPU Programming

Mandelbrot Set

Page 11: CS179: GPU Programming

Mandelbrot Set. Simpler than it looks. Recursively defined by z_{n+1} = z_n^2 + c, where c is a complex constant and z_0 = 0.
Three possible results based on c: converges to 0 (black space), stays in a finite orbit (boundary), or escapes to infinity (blue area).

Page 12: CS179: GPU Programming

Mandelbrot Set. Computed by iteratively computing z_n. Assume that after some point it escapes and we can stop checking; ||z_n|| > 2, for example. Coloring is oftentimes based on the rate of escape. You don't need more than a few dozen iterations to see the behavior. Demo?
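As a concrete example, here is a minimal escape-time test for one sample point; MAX_ITER and the helper name are assumptions, not lab code:

    #define MAX_ITER 64   // a few dozen iterations is enough

    __device__ int escapeIterations(float2 c) {
        float2 z = make_float2(0.0f, 0.0f);           // z_0 = 0
        int n;
        for (n = 0; n < MAX_ITER; n++) {
            // z_{n+1} = z_n^2 + c, with (x + iy)^2 = (x^2 - y^2) + (2xy)i
            z = make_float2(z.x * z.x - z.y * z.y + c.x,
                            2.0f * z.x * z.y + c.y);
            if (z.x * z.x + z.y * z.y > 4.0f) break;  // ||z_n|| > 2: escaped
        }
        return n;  // rate of escape; use it to pick a color
    }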

Page 13: CS179: GPU Programming

Julia Set. Each pixel of the Mandelbrot set has a corresponding Julia set.

Page 14: CS179: GPU Programming

Julia Set. Idea: instead of starting with z_0 = 0, let z_0 = c_0. Changing c_0 will change the Julia set dramatically!

Pages 15-17: CS179: GPU Programming (Julia set renderings; images only)

Julia Sets. Why are they useful? Nothing really practical yet, but they look cool! They can teach us about chaos, model weather, coastlines, etc. And they're a parallelizable problem, so good for us!

Page 18: CS179: GPU Programming

Julia Sets. Lab 4 is even more exciting than Julia sets… 4D Julia sets!

Page 19: CS179: GPU Programming

Julia Sets. 4D: using quaternions instead of complex numbers. Quaternions extend the complex numbers with three imaginary units:
i^2 = j^2 = k^2 = ijk = -1; ij = k = -ji, jk = i = -kj, ki = j = -ik.
They have many uses in graphics: rotations, kinematics, visualizations, etc.
We give you some nice quaternion functions (sqr_quat, mul_quat, etc.); a sketch of what such functions compute is below.
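To make the arithmetic concrete, here is a sketch of what functions like mul_quat and sqr_quat compute, storing the vector part in .x/.y/.z and the real part in .w (the lab's actual layout and names may differ):

    __device__ float4 mulQuat(float4 a, float4 b) {
        // (w1 + v1)(w2 + v2) = w1 w2 - v1.v2 + w1 v2 + w2 v1 + v1 x v2
        return make_float4(
            a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,   // i
            a.w * b.y + a.y * b.w + a.z * b.x - a.x * b.z,   // j
            a.w * b.z + a.z * b.w + a.x * b.y - a.y * b.x,   // k
            a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z);  // real
    }

    __device__ float4 sqrQuat(float4 q) {
        // special case of the product: q^2 = w^2 - |v|^2 + 2wv
        return make_float4(2.0f * q.w * q.x, 2.0f * q.w * q.y, 2.0f * q.w * q.z,
                           q.w * q.w - q.x * q.x - q.y * q.y - q.z * q.z);
    }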

Page 20: CS179: GPU Programming

Julia Sets. How do we render a 4D object? Projection: taking nD slices of an (n+1)D object. Ex.: an MRI scan takes 2D images of a 3D volume. For the 4D Julia set, we render volume slices of the 4D object. Think of it as an object evolving in time; a slice is one frame in time.
Now we have three parameters: z_0, the starting point for the Julia set; c, the constant from the Mandelbrot set; and zp, the slicing plane for the projection.

Page 21: CS179: GPU Programming

Julia Sets. How to render: transform each coordinate in the volume texture to a quaternion, q = (pos.x, pos.y, pos.z, dot((pos, 1), plane)); this is implemented for you as pos_to_quat. Store the escape speed or convergence in the volume texture, then volume render by raytracing. A sketch of this fill step follows.
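A hedged sketch of that step; the pos_to_quat formula comes from the slide, but the kernel shape, the flat indexing, and the d_juliaDist signature are all assumed for illustration:

    __device__ float d_juliaDist(float4 q, float4 c);  // provided in lab 4

    __global__ void setFractalSketch(float *d_volume, float4 c, float4 plane,
                                     int dim) {
        // one thread per voxel; 1D block, 2D grid (an assumption)
        int x = threadIdx.x, y = blockIdx.x, z = blockIdx.y;
        if (x >= dim || y >= dim || z >= dim) return;

        // voxel index -> point in the [-2, 2]^3 region
        float3 pos = make_float3(-2.0f + 4.0f * x / (dim - 1),
                                 -2.0f + 4.0f * y / (dim - 1),
                                 -2.0f + 4.0f * z / (dim - 1));

        // pos_to_quat: q = (pos.x, pos.y, pos.z, dot((pos, 1), plane))
        float4 q = make_float4(pos.x, pos.y, pos.z,
                               pos.x * plane.x + pos.y * plane.y +
                               pos.z * plane.z + plane.w);

        // store the escape speed / distance estimate for this voxel
        d_volume[x + dim * (y + dim * z)] = d_juliaDist(q, c);
    }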

Page 22: CS179: GPU Programming

Raytracing. Kind of what it sounds like: tracing rays. Start at some origin ray.o and step in direction ray.d; if we collide with something, render it! To check shadows, raytrace back toward the light: if an object is hit, the point is in shadow. Raytracing is used for super high-def images, and can also be used to calculate lighting, volumes, etc.

Page 23: CS179: GPU Programming

Raytracing

Page 24: CS179: GPU Programming

Raytracing

Page 25: CS179: GPU Programming

Raytracing. This might not work great for fractals: fractals are infinitely thin, so we might skip over many details. Instead, use a distance estimator function, which gives a lower bound on the distance to the set from any point in space. Let z'_n also be iteratively computed as z'_{n+1} = 2 z_n z'_n, with z'_0 = (1, 0, 0, 0).

Page 26: CS179: GPU Programming

Raytracing. Rendering this distance-function isosurface is okay. Usage: iterate z_n and z'_n until we escape or reach the maximum number of iterations, return the distance from the previous slide, and render all pixels "close enough" to the set in the volume. A hedged sketch:
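This reuses mulQuat/sqrQuat from the quaternion sketch above. The closing formula is the standard lower bound d = |z| ln|z| / (2 |z'|); the slide's exact formula (shown as an image) may differ in detail, and the iteration cap is an assumption:

    #define DIST_MAX_ITER 20

    __device__ float juliaDistSketch(float4 z, float4 c) {
        // z'_0 = 1: the slide writes (1, 0, 0, 0) with the real part first,
        // while this layout keeps the real part in .w
        float4 zp = make_float4(0.0f, 0.0f, 0.0f, 1.0f);
        float mz2  = z.x*z.x + z.y*z.y + z.z*z.z + z.w*z.w;
        float mzp2 = 1.0f;

        for (int i = 0; i < DIST_MAX_ITER && mz2 <= 4.0f; i++) {
            // z'_{n+1} = 2 z_n z'_n (update before z changes)
            zp = mulQuat(z, zp);
            zp = make_float4(2.0f*zp.x, 2.0f*zp.y, 2.0f*zp.z, 2.0f*zp.w);

            // z_{n+1} = z_n^2 + c
            float4 z2 = sqrQuat(z);
            z = make_float4(z2.x + c.x, z2.y + c.y, z2.z + c.z, z2.w + c.w);

            mz2  = z.x*z.x + z.y*z.y + z.z*z.z + z.w*z.w;
            mzp2 = zp.x*zp.x + zp.y*zp.y + zp.z*zp.z + zp.w*zp.w;
        }

        float mz = sqrtf(mz2);
        // lower bound on the distance from the starting point to the set
        return fmaxf(0.5f * mz * logf(mz) / sqrtf(mzp2), 0.0f);
    }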

Page 27: CS179: GPU Programming

Raytracing. Better idea: use a bit of raytracing. Load the volume data with distances to the set, store it in a volume texture, and raytrace along a ray through the texture. Stop once we see the distance is very low, under some epsilon. Each ray is handled by one thread, so it's pretty quick.

Page 28: CS179: GPU Programming

Raytracing. Better raytracing: the current model steps along the ray by step * ray.d, where step is some small constant, e.g. 0.005. But what if we are 0.5 units away? We don't need to step by 0.005. Use adaptive sampling: step = factor * dist, where factor = 0.01-0.5 works well. No need to worry about thread divergence. A sketch of the loop:
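A minimal sketch of adaptive stepping through the distance texture; the [-2, 2] to [0, 1] remap, EPSILON, factor, and tMax are assumptions, not the lab's exact values:

    texture<float, 3, cudaReadModeElementType> distTex;  // the distance volume

    __device__ float marchRay(float3 o, float3 d, float tMax) {
        const float EPSILON = 0.001f;   // "close enough" threshold
        const float factor  = 0.1f;     // 0.01-0.5 works well
        float t = 0.0f;
        while (t < tMax) {
            float3 p = make_float3(o.x + t * d.x, o.y + t * d.y, o.z + t * d.z);
            // sample the stored distance, remapping [-2, 2]^3 to [0, 1]^3
            float dist = tex3D(distTex, (p.x + 2.0f) * 0.25f,
                                        (p.y + 2.0f) * 0.25f,
                                        (p.z + 2.0f) * 0.25f);
            if (dist < EPSILON) return t;  // hit the isosurface
            t += factor * dist;            // big steps far away, small steps near
        }
        return -1.0f;  // no hit along this ray
    }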

Page 29: CS179: GPU Programming

Raytracing. Calculating the ray: the inverse view matrix is needed to calculate where we are looking. invViewMatrix is calculated and given to you; pass it into constant memory as c_invViewMatrix on the GPU. Then ray.o = invViewMat * (0, 0, 0, 1) and ray.d = invViewMat * (u, v, -2.0), where u and v are screen coordinates you calculate from the 2D thread index. For example:
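A sketch of building the eye ray. Storing the top three rows of the 4x4 inverse view matrix row-major in constant memory is an assumption (it mirrors the CUDA SDK's volumeRender sample):

    __constant__ float c_invViewMatrix[12];

    __device__ float3 mulMat3x4(const float *m, float4 v) {
        return make_float3(m[0]*v.x + m[1]*v.y + m[2]*v.z  + m[3]*v.w,
                           m[4]*v.x + m[5]*v.y + m[6]*v.z  + m[7]*v.w,
                           m[8]*v.x + m[9]*v.y + m[10]*v.z + m[11]*v.w);
    }

    __device__ void eyeRay(int px, int py, int imageW, int imageH,
                           float3 *o, float3 *d) {
        // screen coordinates in [-1, 1] from the pixel index
        float u = (px / (float)imageW) * 2.0f - 1.0f;
        float v = (py / (float)imageH) * 2.0f - 1.0f;

        // ray.o = invViewMat * (0, 0, 0, 1): the camera position
        *o = mulMat3x4(c_invViewMatrix, make_float4(0.f, 0.f, 0.f, 1.f));
        // ray.d = invViewMat * (u, v, -2.0): rotate the view-space direction
        // (w = 0 ignores translation); normalize before marching
        *d = mulMat3x4(c_invViewMatrix, make_float4(u, v, -2.0f, 0.0f));
    }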

Page 30: CS179: GPU Programming

Lighting. Once we hit the fractal, render it! What color? That depends on lighting, the shading model, material properties… You may color it however you like, but something with some complexity would be good. We suggest Phong shading.

Page 31: CS179: GPU Programming

Phong Shading. Three components: ambient, diffuse, specular.

Page 32: CS179: GPU Programming

Phong Shading. Ambient: just a flat color. amb = amb_color;

Page 33: CS179: GPU Programming

Phong Shading. Diffuse: adds soft shadows and highlights based on the normal. diff = diff_color * cos(a), where a is the angle between the light and the surface normal. Remember to use normalized vectors! (Figure: light vector L, surface normal N, and the angle a between them.)

Page 34: CS179: GPU Programming

Phong Shading. Specular: adds in bright highlights. spec = spec_color * dot(R, eye)^S, where R is L reflected across N, eye is the vector to the eye, and S is the shininess exponent (weird: higher S gives a smaller, tighter highlight). (Figure: vectors L, N, R, and eye.)

Page 35: CS179: GPU Programming

Phong Shading. The final output color is just the sum of the components: out = amb + diff + spec. The main info we need: the light direction (your choice, just hardcode it), the normal (must be computed), and the eye vector (just -ray.d). Putting the pieces together:
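A minimal sketch combining the three terms, assuming the SDK's cutil_math.h float3 operators and that all input vectors are already normalized:

    #include "cutil_math.h"   // dot(), normalize(), float3 arithmetic

    __device__ float3 phong(float3 N, float3 L, float3 eye,
                            float3 amb, float3 diffC, float3 specC, float S) {
        // diffuse: cos(a) = dot(L, N) for normalized vectors
        float diff = fmaxf(dot(L, N), 0.0f);
        // R = L reflected across N: R = 2 (L . N) N - L
        float3 R = 2.0f * dot(L, N) * N - L;
        float spec = powf(fmaxf(dot(R, eye), 0.0f), S);
        // out = amb + diff + spec
        return amb + diff * diffC + spec * specC;
    }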

Page 36: CS179: GPU Programming

Phong Shading. Calculating the normal via the gradient: sample the volume texture. For each component (x, y, z), sample the texture at the component plus some offset (x + 0.01) and at the component minus the offset (x - 0.01), then take the difference; the resulting differences are the normal's components!
We can also directly sample d_juliaDist instead; this can be pretty slow, but the normals will be smoother. Up to you, if you'd like.
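Central differences on the volume texture, following the slide's 0.01 offset; normalize() again assumes cutil_math.h, and the texture name is illustrative:

    texture<float, 3, cudaReadModeElementType> volTex;

    __device__ float3 normalSketch(float3 p) {
        const float h = 0.01f;
        float3 grad = make_float3(
            tex3D(volTex, p.x + h, p.y, p.z) - tex3D(volTex, p.x - h, p.y, p.z),
            tex3D(volTex, p.x, p.y + h, p.z) - tex3D(volTex, p.x, p.y - h, p.z),
            tex3D(volTex, p.x, p.y, p.z + h) - tex3D(volTex, p.x, p.y, p.z - h));
        return normalize(grad);  // gradient of the distance field ~ surface normal
    }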

Page 37: CS179: GPU Programming

Coalesced Memory. Recap: coalesced memory gets better access rates; accesses must be in order, aligned, and together.
This comes into play with thread indexing. Compare:
index = threadIdx.x + blockDim.x * (blockIdx.x + gridDim.x * blockIdx.y);
index = threadIdx.x + blockDim.y * (blockIdx.y + gridDim.y * blockIdx.x);
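With a 1D block, the first formula gives consecutive threads consecutive indices, so a warp's loads and stores coalesce into a few wide transactions; the kernel below is a made-up example of that pattern, not lab code:

    __global__ void copyCoalesced(float *out, const float *in) {
        int index = threadIdx.x +
                    blockDim.x * (blockIdx.x + gridDim.x * blockIdx.y);
        out[index] = in[index];  // adjacent threads touch adjacent floats
    }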

Page 38: CS179: GPU Programming

Your Task. Some prelab questions will be available. All TODO code is in frac_kernel.cu.
Host code: copy the necessary memory onto the GPU; map/unmap buffers; run the kernels (two this time: one to compute the fractal, one to render); use timer events to record the time of recalculation.
Device code: d_setfractal loads the volume with data from d_juliaDist; d_juliaDist returns the estimated distance to the Julia set at a given point; d_juliaNormal samples the texture to calculate the normal; d_render raytraces to render the isosurface and uses a shading model to color the fractal.

Page 39: CS179: GPU Programming

Your Task. GPU architecture: indexing is made easiest with a 1D block and a 2D grid, defined for you (see the globally defined consts and dim3s).
Space is bounded by the region [-2, 2]^3; you'll need to convert back and forth between this space and texture array indices.
Feel free to play with any architecture/setup. In general, feel free to play with anything! Coloring can be really cool… Try other functions (z^3 + c, for example).

Page 40: CS179: GPU Programming

Your Task. Extra credit:
Raytracing: use raytracing to render shadows (10 pts). Once we hit the surface, trace back toward the light source; if we hit the surface again, the original surface pixel was in shadow, so make it slightly darker.
Adaptive detailing: higher detail when we're zoomed in (5 pts). This lets us see the infiniteness of the fractal. Essentially, just adjust epsilon (how close we must be to the fractal to be considered a "hit") based on the distance to the camera.