GPU Programming
Yanci Zhang
Game Programming Practice

Outline
- Parallel computing
- GPU overview
- OpenGL shading language overview
- Vertex / Geometry / Fragment shader
- Using GLSL in OpenGL
- Application: Per-pixel shading

Why Parallel Computing?
- CPU performance increased about 50% per year from 1986 to 2002
  - One could simply wait for the next generation of CPUs to obtain increased performance
- Single-processor performance improvement has slowed to about 20% per year since 2002
- The road to rapidly increasing performance lies in the direction of parallelism
- Put multiple processors on a single integrated circuit rather than developing ever-faster monolithic processors

What is a GPU?
- GPU: Graphics Processing Unit
- Developed rapidly from primitive drawing devices into major computing resources
  - Extremely powerful and flexible processor
  - Tremendous memory bandwidth and computational power
  - High-level languages have emerged
  - Capable of general-purpose computation beyond graphics applications

Motivation
- In many respects the GPU is more powerful than the CPU
  - Computational power: FLOPS (floating-point operations per second)
  - Parallelism
  - Bandwidth
  - Performance growth rate

Floating Point Calculation
- FLOPS: a common benchmark measurement for rating the speed of the floating-point unit (FPU)
- CPU
  - Intel Core i7 980 XE (six-core): 107.55 GFLOPS
- GPU
  - NVIDIA GeForce GTX 480: about 1.35 TFLOPS peak in single precision (480 cores x 1.401 GHz x 2 operations per clock)
  - Modern GPUs support high-precision 32-bit floating point throughout the pipeline
  - Earlier GPUs had no double-precision format; Fermi-class GPUs such as the GTX 480 support it at reduced throughput

Parallelism
- Parallelism: allows many operations to execute at the same time
- CPU
  - Does not adequately exploit parallelism
  - Only a few cores: dual-core, quad-core
- GPU
  - GeForce GTX 480: 480 CUDA cores

Bandwidth
- Peak performance of computer systems is often far in excess of actual application performance
- The bandwidth between key components ultimately dictates system performance
- CPU
  - DDR3-2133: about 17 GB/s per 64-bit channel (about 34 GB/s in dual-channel mode)
- GPU
  - GeForce GTX 480: 384-bit bus, 177.4 GB/s
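The bandwidth figures on this slide follow from bus width multiplied by transfer rate. The sketch below redoes that arithmetic. The function name `peakBandwidthGBs` is hypothetical, and the GTX 480 transfer rate of 3696 MT/s is an assumption that is not stated on the slide; the 2133 MT/s rate follows from the DDR3-2133 name.

```cpp
#include <cassert>

// Peak theoretical bandwidth in GB/s (1 GB = 1e9 bytes).
// busWidthBits:        width of the memory interface in bits
// megaTransfersPerSec: effective data rate in millions of transfers per second
double peakBandwidthGBs(int busWidthBits, double megaTransfersPerSec) {
    assert(busWidthBits > 0 && busWidthBits % 8 == 0);
    const double bytesPerTransfer = busWidthBits / 8.0;
    return bytesPerTransfer * megaTransfersPerSec * 1e6 / 1e9;
}
```

Under these assumptions, `peakBandwidthGBs(64, 2133.0)` gives roughly 17.1 GB/s for one 64-bit DDR3-2133 channel, and `peakBandwidthGBs(384, 3696.0)` gives roughly 177.4 GB/s for the 384-bit GPU bus.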

Getting Faster and Faster
- CPU
  - Annual growth ~1.5x -> decade growth ~60x
  - Moore's law
- GPU
  - Annual growth ~2.0x -> decade growth > 1000x
  - Faster than Moore's law
  - The multi-billion dollar video game market is a pressure cooker that drives innovation

Keys to High-Perf. Computing
- Efficient computation
  - Maximize the hardware devoted to computation
  - Allow parallelism
    - Task parallelism
    - Data parallelism
    - Instruction parallelism
  - Ensure each computation unit operates at maximum efficiency

Keys to High-Perf. Computing
- Efficient communication
  - Simply providing large amounts of computation is not sufficient
  - Processing elements (PEs) often spend most of their time waiting for data
  - Minimize off-chip communication

Stream Programming Model
- A programming model allowing high efficiency in computation and communication
- Two basic components
  - Streams
    - All data is represented as streams
    - A stream is an ordered set of data of the same data type
  - Kernels: operations on streams
- Applications are constructed by chaining multiple kernels together

Kernel
- Operates on entire streams of elements and produces new streams
- Within a kernel, computations on one stream element never depend on computations on another element
  - Input elements and intermediate computed data are stored locally
  - Fits perfectly onto data-parallel hardware
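The stream model can be illustrated on the CPU. The sketch below is only an analogy, not how a GPU driver implements it: `Stream`, `Kernel`, `runKernel`, `scaleKernel`, `biasKernel`, and `runPipeline` are hypothetical names. Each kernel sees one element at a time, and an application is built by chaining kernels.

```cpp
#include <cassert>
#include <vector>

// A stream: an ordered set of elements of the same data type.
using Stream = std::vector<float>;

// A kernel maps one stream element to one output element and never
// looks at any other element, so all elements could run in parallel.
using Kernel = float (*)(float);

// Apply a kernel to an entire input stream and produce a new stream.
Stream runKernel(const Stream& input, Kernel kernel) {
    assert(kernel != nullptr);
    Stream output;
    output.reserve(input.size());
    for (float element : input) {
        output.push_back(kernel(element));
    }
    return output;
}

float scaleKernel(float x) { return 2.0f * x; }
float biasKernel(float x) { return x + 1.0f; }

// An application: two kernels chained together.
Stream runPipeline(const Stream& input) {
    return runKernel(runKernel(input, scaleKernel), biasKernel);
}
```

Because `runKernel` never lets one element's result depend on another's, the loop body could be distributed across many processing elements without changing the output.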

Efficient Computation (1)
- The use of transistors can be divided into three categories:
  - Control: direct the computation
  - Datapath: perform computation
  - Storage: store data

Efficient Computation (2)
- Only simple control flow in kernel execution
  - Devote most of the transistors to datapath hardware rather than control hardware
- Streams expose parallelism in the application
- This allows a hardware implementation to specialize its hardware

Efficient Communication
- Off-chip communication is efficient
- Intermediate results between kernels are kept on-chip to minimize off-chip communication
- High degree of latency tolerance

Instruction-Stream-Based (CPU)
- Each instruction prescribes both the operation to be executed and the required data
- Only a limited prefetch of the input data can occur
  - Jumps are expected in the instruction stream
- The L2 cache consumes a large share of the CPU's transistors

Data-Stream-Based (GPU)
- Separates two tasks:
  - Configuring the processing elements (PEs)
  - Controlling data flow to and from the PEs
- Data elements can be assembled from memory before processing
- Uses only small caches and devotes the majority of transistors to computation

Mapping Pipeline to Stream Model
- The stream formulation of the graphics pipeline
  - All data as streams
  - All computation as kernels
- Both user-programmable and nonprogrammable stages can be expressed as kernels

Fixed vs. Programmable
- Fixed
  - Very fast
  - The pipeline cannot be modified; functions can only be turned on or off
  - Hard to implement advanced techniques on the GPU
- Programmable
  - Allows programmers to write shaders that change the pipeline

Basic Programmable Graphics Hardware
- Three programmable kernels in the pipeline
  - Vertex shader
  - Geometry shader
  - Pixel shader
- Shaders are loaded through the graphics API
- Fixed-function pipeline stages are replaced by shaders

OpenGL 4.3 Pipelines
- Graphics rendering pipeline
- GPGPU programming pipeline

Vertex Processor
- MIMD: Multiple Instruction stream, Multiple Data stream
  - A number of processors that function asynchronously and independently

Vertex Shader: Basic Function
- Operates on a single input vertex and produces a single output vertex
- Replaces the fixed-function transformation and lighting unit
- Now you have to do everything yourself
  - Transformation
  - Lighting
  - Texture coordinate generation
- At a minimum, a vertex shader must output the vertex position in homogeneous clip space

Vertex Shader: Advanced Function
- What else can we do?
  - Displacement mapping
  - Object deformation
  - Vertex blending

Vertex Shader: Limitations
- A vertex shader cannot
  - Add or delete vertices
  - Change the primitive type
  - Change the order in which vertices form primitives
- It has no knowledge of the primitive type or of neighboring vertices

Fragment Processor
- SIMD: Single Instruction, Multiple Data
  - Achieves data-level parallelism
  - "get this pixel, get the next one" -> "get lots of pixels"

Fragment Shader: Basic Function
- Invoked once for each fragment covered by the primitive
- Computes the final pixel color and depth
- Can output up to eight 32-bit, 4-component values for the current pixel location

Fragment Shader: Advanced Function
- Enables rich shading techniques
  - Per-pixel lighting, bump mapping, normal mapping
  - Fluid simulation
  - …

Fragment Shader: Limitations
- Dynamic branching is less efficient than in the vertex processor
- Cannot change the screen coordinates of a fragment
- No arbitrary memory writes

Geometry Shader
- New for 2007
- Executed after the vertex shader
- Input: a whole primitive, possibly with adjacency information
  - Invoked once for every primitive
- Output: multiple vertices forming a single selected topology (triangle strip, line strip, point list)
- Output may be fed to the rasterizer and/or to a vertex buffer in memory

Geometry Shader: Applications
- Point sprite expansion
- Single-pass render-to-cubemap
- Dynamic particle systems
- Fur/fin generation
- Shadow volume generation

Programmable GPUs: Applications
- Graphics applications
  - Per-pixel lighting
  - Ray tracing
  - Deformation
- GPGPU
  - Computer vision
  - Physically-based simulation
  - Image processing
  - Database queries

GPGPU
- General-purpose computation on GPUs
  - GPUs are capable of performing more than the specific graphics computations
  - Goal: make the inexpensive power of the GPU available to developers as a sort of computational coprocessor
  - Example applications range from in-game physics simulation to conventional computational science

Shading Language
- Production rendering
  - Geared towards maximum image quality
  - Example: RenderMan
- Real-time rendering
  - GLSL: OpenGL Shading Language
  - HLSL: DirectX High-Level Shading Language
  - Cg: C for Graphics (NVIDIA)

OpenGL Shading Language
- High-level shading language based on C
- Not a hardware-specific language
- Cross-platform compatibility on multiple operating systems
- Each hardware vendor includes a GLSL compiler in its driver

Before Using GLSL
- Check whether your GPU supports GLSL
  - GLSL is part of OpenGL 2.0
  - If OpenGL 2.0 is not available, use the OpenGL extensions

Extensions Required
- GL_ARB_shader_objects
  - Adds the API calls that are necessary to manage shader objects and program objects
- GL_ARB_fragment_shader
  - Adds functionality to define fragment shader objects
- GL_ARB_vertex_shader
  - Adds functionality to define vertex shader objects

GLEW 1/2
- GLEW: The OpenGL Extension Wrangler Library (http://glew.sourceforge.net/)
- Initialize GLEW

    #include <stdio.h>
    #include <GL/glew.h>
    #include <GL/glut.h>
    ...
    glutInit(&argc, argv);
    glutCreateWindow("GLEW Test");
    /* glewInit() needs a current OpenGL context, so call it after the window is created. */
    GLenum err = glewInit();
    if (GLEW_OK != err)
    {
        /* Problem: glewInit failed, something is seriously wrong. */
        fprintf(stderr, "Error: %s\n", glewGetErrorString(err));
        ...
    }

GLEW 2/2
- Check extensions

    if (GLEW_ARB_vertex_shader)
    {
        /* It is safe to use the GL_ARB_vertex_shader extension here. */
    }

- Check core OpenGL functionality

    if (GLEW_VERSION_2_0)
    {
        /* Yay! OpenGL 2.0 is supported! */
    }

Data Types
- Scalar
  - bool, int, float
- Vector
  - 2D, 3D, and 4D vectors: vec{2,3,4}, ivec{2,3,4}, bvec{2,3,4}
- Matrix
  - Square matrices: mat2, mat3, mat4
  - Non-square matrices: mat2x3, mat2x4, mat3x2, mat3x4, mat4x2, mat4x3
- Texture
  - sampler1D, sampler2D, sampler3D
  - samplerCube
  - sampler1DShadow, sampler2DShadow

Variables 1/3
- Pretty much the same as in C

    float a, b;                    // two float variables (comments work as in C)
    int c = 2;                     // initialize a variable when declaring it
    vec3 g = vec3(1.0, 2.0, 3.0);  // declare and initialize a vector

- Flexible when initializing variables using other variables

    vec2 a = vec2(1.0, 2.0);
    vec2 b = vec2(3.0, 4.0);
    vec4 c = vec4(a, b);           // c = vec4(1.0, 2.0, 3.0, 4.0)

Variables 2/3
- Flexible when accessing a vector
  - {x, y, z, w}: accessing vectors that represent points or normals
  - {r, g, b, a}: accessing vectors that represent colors
  - {s, t, p, q}: accessing vectors that represent texture coordinates

Variables 3/3

    vec4 a = vec4(1.0, 2.0, 3.0, 4.0);
    float posX = a.x;    // posX = 1.0
    float posY = a[1];   // posY = 2.0
    float depth = a.w;   // depth = 4.0
    vec3 b = a.xxy;      // b = vec3(1.0, 1.0, 2.0)
    vec3 c = a.bra;      // c = vec3(3.0, 1.0, 4.0)

- Accessing components beyond those declared for the vector type is an error

    vec2 t = vec2(1.0, 2.0);
    float tt = t.z;      // incorrect!

Vector and Matrix Operations
- Operations are component-wise, except multiplications that involve a matrix, which follow the rules of linear algebra

    vec3 u, v, w;
    float f;
    mat3 a1, a2, a3;

    u = v + f;      // u.x = v.x + f;   u.y = v.y + f;   u.z = v.z + f;
    u = v + w;      // u.x = v.x + w.x; u.y = v.y + w.y; u.z = v.z + w.z;
    u = v * a1;     // row vector times matrix:
                    // u.x = dot(v, a1[0]); u.y = dot(v, a1[1]); u.z = dot(v, a1[2]);
    a1 = a2 * a3;   // linear-algebraic matrix product
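The difference between `v * m` and `m * v` is easy to get wrong. The following CPU-side sketch mimics the GLSL convention that `m[i]` is column i. The types `Vec3` and `Mat3` and the functions `mulVecMat`, `mulMatVec`, and `demoVectorMatrix` are hypothetical stand-ins for illustration, not GLSL or OpenGL API.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };
struct Mat3 { Vec3 col[3]; };  // col[i] plays the role of GLSL's m[i] (column i)

float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// u = v * m: v is treated as a row vector, so component i is dot(v, column i).
Vec3 mulVecMat(const Vec3& v, const Mat3& m) {
    return Vec3{dot(v, m.col[0]), dot(v, m.col[1]), dot(v, m.col[2])};
}

// u = m * v: v is treated as a column vector, so u is a weighted sum of columns.
Vec3 mulMatVec(const Mat3& m, const Vec3& v) {
    return Vec3{
        m.col[0].x * v.x + m.col[1].x * v.y + m.col[2].x * v.z,
        m.col[0].y * v.x + m.col[1].y * v.y + m.col[2].y * v.z,
        m.col[0].z * v.x + m.col[1].z * v.y + m.col[2].z * v.z};
}

// Small self-check showing that the two products differ in general.
void demoVectorMatrix() {
    const Mat3 m = {{{1.0f, 2.0f, 3.0f}, {4.0f, 5.0f, 6.0f}, {7.0f, 8.0f, 9.0f}}};
    const Vec3 v = {1.0f, 0.0f, 0.0f};
    const Vec3 rowResult = mulVecMat(v, m);  // (1, 4, 7)
    const Vec3 colResult = mulMatVec(m, v);  // (1, 2, 3)
    assert(rowResult.y == 4.0f && colResult.y == 2.0f);
}
```

In other words, `v * m` gives the same result as multiplying the transpose of `m` by `v`, which is why the order of operands matters when transforming vertices.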

Control Flow Statements
- Selection (if-else)
- Iteration (for, while, and do-while)
- Jumps (discard, return, break, and continue)
  - discard is only allowed within fragment shaders
  - discard causes the fragment to be discarded, and no updates to any buffers will occur

    if (depth > 0.5) discard;

Function Definition
- The function main() is used as the entry point to a shader executable

    returnType functionName(type0 arg0, type1 arg1, ..., typen argn)
    {
        // do some computation
        return returnValue;
    }

Important Built-in Variables 1/2
- gl_Position (vec4)
  - Output of the vertex shader
  - Homogeneous vertex position
  - The vertex shader must write a value into this variable
- gl_FragCoord (vec4)
  - Holds the window-relative coordinates x, y, z, and 1/w for the fragment
  - Read-only variable in the fragment shader

Important Built-in Variables 2/2
- gl_FragColor (vec4)
  - Output of the fragment shader
  - Writing to gl_FragColor specifies the fragment color
- gl_FragDepth (float)
  - Output of the fragment shader
  - Default value: gl_FragCoord.z
  - If a shader writes to gl_FragDepth anywhere, it is responsible for writing it on every execution path

Built-in Functions
- Angle and trigonometry functions
  - sin, cos, asin, acos …
- Exponential functions
  - pow, exp, sqrt …
- Common functions
  - abs, clamp, smoothstep …
- Geometric functions
  - length, dot, cross …
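To make two of the common functions concrete, here is a CPU-side sketch of what clamp and smoothstep compute for scalar arguments. The names `clampf` and `smoothstepf` are hypothetical; in a shader you would call the GLSL built-ins directly.

```cpp
#include <cassert>

// clamp(x, lo, hi): restrict x to the range [lo, hi].
float clampf(float x, float lo, float hi) {
    assert(lo <= hi);
    return x < lo ? lo : (x > hi ? hi : x);
}

// smoothstep(edge0, edge1, x): 0 below edge0, 1 above edge1,
// and a smooth Hermite interpolation t*t*(3 - 2t) in between.
float smoothstepf(float edge0, float edge1, float x) {
    assert(edge0 < edge1);
    const float t = clampf((x - edge0) / (edge1 - edge0), 0.0f, 1.0f);
    return t * t * (3.0f - 2.0f * t);
}
```

smoothstep is often used to soften a hard threshold, for example to fade a color in over a short range instead of switching it on abruptly.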

Built-in Functions
- Matrix functions
  - outerProduct, transpose …
- Vector relational functions
  - lessThan, equal …
- Texture lookup functions
  - texture2D, texture2DLod …
- Fragment processing functions
- Noise functions

Important Built-in Functions
- ftransform()
  - For vertex shaders only
  - Produces exactly the same result as would be produced by OpenGL's fixed-functionality transform

    gl_Position = ftransform();

- reflect(vec3 I, vec3 N)
  - Computes the reflection vector from the incident vector I and the normal vector N
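The formula behind reflect() is R = I - 2 * dot(N, I) * N, where N is expected to be normalized. The sketch below reproduces it on the CPU; `Vec3` and `reflectVec` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// R = I - 2 * dot(N, I) * N, with N assumed to be unit length.
Vec3 reflectVec(const Vec3& incident, const Vec3& normal) {
    assert(std::fabs(dot(normal, normal) - 1.0) < 1e-6);  // N must be normalized
    const double k = 2.0 * dot(normal, incident);
    return Vec3{incident.x - k * normal.x,
                incident.y - k * normal.y,
                incident.z - k * normal.z};
}
```

For example, a ray travelling along (1, -1, 0) that hits a floor with normal (0, 1, 0) leaves along (1, 1, 0): the component along the normal is flipped and the tangential component is kept.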

First Example
- Vertex shader

    void main()
    {
        gl_Position = ftransform();
    }

- Fragment shader

    void main()
    {
        gl_FragColor = vec4(1.0, 1.0, 1.0, 1.0);
    }

Having Fun with the Fragment Shader

    void main()
    {
        vec4 t = vec4(1.0, 0.6, 0.3, 0.0);
        gl_FragColor = t.xxxx; // flexible vector accessing
    }

    void main()
    {
        gl_FragColor = vec4(gl_FragCoord.zzz, 1.0); // let's view the depth map
    }

    void main()
    {
        if (gl_FragCoord.x > 320.0) discard; // try discard (float literal: GLSL 1.10 has no implicit int-to-float conversion)
        gl_FragColor = vec4(1.0, 1.0, 1.0, 1.0);
    }

More Built-in Variables
- Vertex shader built-in attributes
  - gl_Vertex, gl_Normal, gl_Color, gl_MultiTexCoord0 to gl_MultiTexCoord7 …
- Vertex shader built-in output variables
  - gl_FrontColor, gl_TexCoord[] …
- Fragment shader built-in input variables
  - gl_Color, gl_TexCoord[] …
- Built-in uniform state
  - gl_ModelViewMatrix, gl_ProjectionMatrix …

Example: Using Built-in Matrices
- The following three vertex shaders compute the same clip-space position

    void main()
    {
        gl_Position = ftransform();
    }

    void main()
    {
        gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
    }

    void main()
    {
        gl_Position = gl_ModelViewMatrix * gl_Vertex;
        gl_Position = gl_ProjectionMatrix * gl_Position;
    }

Example: Using Colors
- Vertex shader

    void main()
    {
        gl_Position = ftransform();
        gl_FrontColor = gl_Color;
    }

- Fragment shader

    void main()
    {
        gl_FragColor = gl_Color;
    }

Example: Using Texture Coordinates
- Vertex shader

    void main()
    {
        gl_Position = ftransform();
        gl_TexCoord[0] = vec4(gl_MultiTexCoord0.xy, 1.0, 0.0);
    }

- Fragment shader

    void main()
    {
        gl_FragColor = gl_TexCoord[0];
    }

gl_NormalMatrix
- Important for per-vertex and per-pixel lighting
- Transpose of the inverse of the upper-left 3x3 submatrix of gl_ModelViewMatrix
- Converts normal vectors from object space to eye space
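Why the inverse transpose rather than the model-view matrix itself? Under a non-uniform scale, transforming a normal like an ordinary vector leaves it no longer perpendicular to the surface. The sketch below computes the inverse transpose of a 3x3 matrix on the CPU, using the fact that it equals the cofactor matrix divided by the determinant. `Mat3`, `Vec3`, `normalMatrix`, and `mul` are hypothetical names, and the matrix is stored row-major here for readability, unlike GLSL.

```cpp
#include <cassert>

struct Vec3 { double x, y, z; };
struct Mat3 { double m[3][3]; };  // row-major: m[row][column]

// Inverse transpose of a 3x3 matrix = cofactor matrix / determinant.
Mat3 normalMatrix(const Mat3& a) {
    Mat3 c{};
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            // Cyclic index trick: yields the signed cofactor of a.m[i][j].
            c.m[i][j] = a.m[(i + 1) % 3][(j + 1) % 3] * a.m[(i + 2) % 3][(j + 2) % 3]
                      - a.m[(i + 1) % 3][(j + 2) % 3] * a.m[(i + 2) % 3][(j + 1) % 3];
        }
    }
    const double det = a.m[0][0] * c.m[0][0] + a.m[0][1] * c.m[0][1] + a.m[0][2] * c.m[0][2];
    assert(det != 0.0);  // the matrix must be invertible
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            c.m[i][j] /= det;
    return c;
}

// Matrix times column vector.
Vec3 mul(const Mat3& a, const Vec3& v) {
    return Vec3{a.m[0][0] * v.x + a.m[0][1] * v.y + a.m[0][2] * v.z,
                a.m[1][0] * v.x + a.m[1][1] * v.y + a.m[1][2] * v.z,
                a.m[2][0] * v.x + a.m[2][1] * v.y + a.m[2][2] * v.z};
}
```

As a check, take a scale of 2 along x, a surface tangent (1, -1, 0), and its normal (1, 1, 0). The tangent becomes (2, -1, 0). Transforming the normal with the scale matrix gives (2, 1, 0), which is no longer perpendicular to the tangent, while the matrix from `normalMatrix` gives (0.5, 1, 0), which is.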

View Normal Vectors
- Vertex shader (object-space normal as color)

    void main()
    {
        gl_Position = ftransform();
        gl_FrontColor = vec4(gl_Normal, 1.0);
    }

- Vertex shader (eye-space normal as color)

    void main()
    {
        gl_Position = ftransform();
        gl_FrontColor = vec4(gl_NormalMatrix * gl_Normal, 1.0);
    }

- Fragment shader

    void main()
    {
        gl_FragColor = gl_Color;
    }

Communications
- Communication between OpenGL and shaders
  - One-way communication, from the application to the shader
  - Use the uniform qualifier when declaring variables
- Communication between the vertex and fragment shaders
  - Use the varying qualifier when declaring variables

Uniform
- Used to declare global variables
- Variable values are the same across the entire primitive being processed
- Read-only
- Initialized externally, either at link time or through the API

    uniform vec4 lightPosition;
    uniform vec3 color = vec3(0.7, 0.7, 0.2); // value assigned at link time

OpenGL Setup

Creating Shader Object

    _ShaderID = glCreateShader(GL_VERTEX_SHADER);
    if (_ShaderID == 0) // glCreateShader() returns 0 if it fails to create a shader object
    {
        printf("Failed to create shader object!\n");
        exit(-1);
    }

    // load the shader source file to a string _pShaderSource

    glShaderSource(_ShaderID, 1, (const GLchar **)&_pShaderSource, &fileLen);
    CheckGLError(__FILE__, __LINE__);
    glCompileShader(_ShaderID);
    glGetShaderiv(_ShaderID, GL_COMPILE_STATUS, &ShaderStatus);
    if (ShaderStatus == GL_FALSE)
    {
        printf("Failed to compile the shader: %s\n", vFileName);
        exit(-1);
    }
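The slide leaves the "load the shader source file to a string" step as a comment. One possible way to do it with the C++ standard library is sketched below. The helper name `readTextFile` is hypothetical and is not part of OpenGL or GLEW, and the original course code may load the file differently.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: reads a whole text file into `out`.
// Returns false if the file cannot be opened.
bool readTextFile(const std::string& path, std::string& out) {
    assert(!path.empty());
    std::ifstream file(path, std::ios::in | std::ios::binary);
    if (!file) {
        return false;
    }
    std::ostringstream buffer;
    buffer << file.rdbuf();  // copy the entire file contents
    out = buffer.str();
    return true;
}
```

The resulting string can then be handed to `glShaderSource` by taking `text.c_str()` as the source pointer and `text.size()` converted to `GLint` as the length.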

Creating Program Object

    _ProgramID = glCreateProgram();
    if (_ProgramID == 0)
    {
        printf("Failed to create shader program object!\n");
        exit(-1);
    }

    glAttachShader(_ProgramID, VertexShaderID); // attach vertex shader
    CheckGLError(__FILE__, __LINE__);
    glAttachShader(_ProgramID, FragShaderID);   // attach fragment shader
    CheckGLError(__FILE__, __LINE__);
    glLinkProgram(_ProgramID);
    glGetProgramiv(_ProgramID, GL_LINK_STATUS, &ProgramStatus);
    if (ProgramStatus == GL_FALSE)
    {
        printf("Failed to link the program!\n");
        exit(-1);
    }
    glUseProgram(_ProgramID);

Initialize Uniform Variables
- Suppose a uniform variable is declared in the shader:

    uniform vec3 u_Color;

- Initialize the uniform variable from OpenGL:

    loc = glGetUniformLocation(_ProgramID, "u_Color");
    if (loc == -1)
    {
        cout << "Error: can't find uniform variable! \n";
    }
    glUniform3f(loc, v0, v1, v2); // applies to the program currently in use (glUseProgram)

Application: Per-Pixel Shading
- Three types of light in OpenGL
  - Ambient light
  - Diffuse light
  - Specular light
- The fixed pipeline performs vertex-based shading
  - Fast but poor quality
- Per-pixel shading is possible by using the programmability of modern GPUs
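The quality difference between vertex-based and per-pixel shading can be seen with the diffuse term alone. The CPU-side sketch below compares interpolating the lit intensity computed at two vertices (per-vertex) with interpolating the normal and lighting it at the pixel (per-pixel). All names (`Vec3`, `mix`, `diffuse`, `perVertexDiffuse`, `perPixelDiffuse`) are hypothetical, and only the diffuse term max(dot(N, L), 0) is modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 normalize(const Vec3& v) {
    const double len = std::sqrt(dot(v, v));
    assert(len > 0.0);
    return Vec3{v.x / len, v.y / len, v.z / len};
}

// Linear interpolation between a and b, as the rasterizer does for varyings.
Vec3 mix(const Vec3& a, const Vec3& b, double t) {
    return Vec3{a.x + t * (b.x - a.x), a.y + t * (b.y - a.y), a.z + t * (b.z - a.z)};
}

// Diffuse (Lambert) term: max(dot(N, L), 0) with unit-length N and L.
double diffuse(const Vec3& normal, const Vec3& toLight) {
    return std::max(0.0, dot(normalize(normal), normalize(toLight)));
}

// Per-vertex: light each vertex, then interpolate the intensities.
double perVertexDiffuse(const Vec3& n0, const Vec3& n1, const Vec3& toLight, double t) {
    return (1.0 - t) * diffuse(n0, toLight) + t * diffuse(n1, toLight);
}

// Per-pixel: interpolate the normal, renormalize, then light the pixel.
double perPixelDiffuse(const Vec3& n0, const Vec3& n1, const Vec3& toLight, double t) {
    return diffuse(mix(n0, n1, t), toLight);
}
```

With vertex normals (1, 0, 0) and (0, 1, 0) and the light direction (1, 1, 0), the pixel halfway between the vertices faces the light directly. Per-pixel shading gives it full intensity (about 1.0), while per-vertex shading gives about 0.707, so the bright spot in the middle of the primitive is lost.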

Assignment
- Add specular light