Quick Overview (The Old Way)

- Graphics cards process triangles
  - Quads or other polygons are broken down into triangles
- Each triangle is processed in two steps:
  - Vertex operations
  - Pixel operations

Transformation and Lighting

- Each vertex is handled separately
- First the vertex is transformed into screen coordinates
- Next, lighting for each vertex is calculated
  - Only ambient, diffuse, and specular properties of the vertex are calculated
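
As a rough summary of the lighting step above, the per-vertex color can be written as a sum of ambient, diffuse, and specular terms. The symbols are chosen here for illustration and mirror the vertex shader shown later in these slides (k_d and k_s for materialDiffuse and materialSpecular, L_a and L_c for lightAmbient and lightColor, s for shininess); actual fixed-function hardware uses a somewhat richer material model.

```latex
C_{\text{vertex}} = k_d L_a
                  + k_d L_c \max(0,\, N \cdot L)
                  + k_s L_c \max(0,\, N \cdot H)^{s},
\qquad H = \frac{E + L}{\lVert E + L \rVert}
```

Here N is the vertex normal, L the direction toward the light, and E the direction toward the eye.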

Pixel Rasterization

- Each pixel in the triangle is compared to the depth buffer
- If the depth test passes, the texture for the pixel is looked up
- The texture value and the color of the pixel are blended together
  - Gouraud shading is used for the pixel color
  - Possibly blended with the previous color of the pixel
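
The three steps above can be pictured as one function per pixel. This is only a conceptual sketch: on real fixed-function hardware the depth test and the blend are configured through render states and are not written as shader code. The function name, its parameters, the "closer wins" depth comparison, and the alpha blend are all assumptions made for illustration.

```hlsl
// Conceptual sketch only; all names are illustrative.
sampler2D TextureSampler;

// Returns the new framebuffer color for one pixel of a triangle.
float4 FixedFunctionPixel(float  pixelDepth,    // depth of the incoming pixel
                          float  storedDepth,   // value already in the depth buffer
                          float4 gouraudColor,  // vertex colors interpolated across the triangle
                          float2 texCoord,      // interpolated texture coordinate
                          float4 previousColor) // color already in the framebuffer
{
    // Depth test (assumed "less than"): a pixel behind what is already
    // drawn leaves the framebuffer unchanged.
    if (pixelDepth >= storedDepth)
        return previousColor;

    // Texture lookup, then modulate by the interpolated (Gouraud) color.
    float4 texel  = tex2D(TextureSampler, texCoord);
    float4 shaded = texel * gouraudColor;

    // Optional alpha blend with the previous color of the pixel.
    return lerp(previousColor, shaded, shaded.a);
}
```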

Gouraud Shading isn't that good

[Figure: Gouraud-shaded highlight. It should be a nice circular pattern.]

We want better (most of the time)

- Possible solution: put all lighting calculations in the pixel step
  - Expensive to compute
  - Not always important
- Better solution: programmable shaders
  - Only perform expensive calculations for objects that really need it
  - Allows programmers to come up with their own effects

Pipeline

[Diagram: Tri -> Fixed T&L or Vertex Shader (position, lighting, texturing) -> Pixel Shader (more lighting, blending, more texturing) -> Result Image]

Take control of the GPU

- Two new types of processors to control
  - Vertex shaders
  - Pixel (or fragment) shaders
- Huge amount of power given to the programmer
  - Vertices can be manipulated before they are transformed to screen coordinates
  - Lighting can now be done on a per-pixel basis
  - We can even do pure number crunching on the processor; no graphics needed
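
To make the per-pixel lighting point concrete, here is a minimal sketch, assuming a single directional light. The vertex shader only passes the normal and world position along; the pixel shader then evaluates diffuse and specular lighting for every pixel. The uniform names follow the ones used in the shader code later in these slides; the struct and function names (VSIn, VSOut, VS_PassThrough, PS_PerPixel) are made up for this example.

```hlsl
// Uniforms assumed to be set by the application.
float4x4 worldViewProj;
float4x4 world;
float4x4 worldInverseTranspose;
float4x4 viewInverse;
float4   lightDir;
float4   lightColor;
float4   materialDiffuse;
float4   materialSpecular;
float    shininess;

struct VSIn
{
    float4 position : POSITION;
    float3 normal   : NORMAL;
};

struct VSOut
{
    float4 hPosition   : POSITION;
    float3 worldNormal : TEXCOORD0; // interpolated across the triangle
    float3 worldPos    : TEXCOORD1;
};

// Vertex stage: transform only, no lighting.
VSOut VS_PassThrough(VSIn IN)
{
    VSOut OUT;
    float4 pos = float4(IN.position.xyz, 1.0);
    OUT.hPosition   = mul(pos, worldViewProj);
    OUT.worldNormal = mul(IN.normal, (float3x3)worldInverseTranspose);
    OUT.worldPos    = mul(pos, world).xyz;
    return OUT;
}

// Pixel stage: the lighting that used to run once per vertex now runs per pixel.
float4 PS_PerPixel(VSOut IN) : COLOR
{
    float3 N = normalize(IN.worldNormal); // re-normalize after interpolation
    float3 E = normalize(viewInverse[3].xyz - IN.worldPos);
    float3 L = normalize(-lightDir.xyz);
    float3 H = normalize(E + L);

    float diff = max(0, dot(N, L));
    float spec = (diff > 0) ? pow(max(0, dot(N, H)), shininess) : 0;

    return materialDiffuse * lightColor * diff
         + materialSpecular * lightColor * spec;
}
```

Because the highlight is computed at every pixel instead of being interpolated from the triangle corners, it keeps its shape even on coarse meshes.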

Why not do this on the CPU?

- Graphics cards have more floating point power than any CPU on the market
- Specialized hardware allows for highly optimized calculations
- Did I mention parallel processing? (An NVIDIA 7800 GTX has 24 pixel shaders)
- CPUs aren't increasing in speed like GPUs are

So how do we do it?

- Learn a new language or new tool
  - Assembly
  - GLSL (OpenGL)
  - HLSL (DirectX)
  - Cg
  - ATI RenderMonkey
  - NVIDIA FX Composer
  - …

It's not that bad, really…

- HLSL (High Level Shading Language) and GLSL (GL Shading Language) are very similar
  - Very similar to C++
  - Created to replace the need to learn assembly for each graphics card on the market
  - Simple to use and learn (assuming you get a good book)
  - Most commands you will use are mul, add, dot, sub, and texture lookup

vertexOutput VS_TransformAndTexture(vertexInput IN)
{
    vertexOutput OUT;
    OUT.hPosition = mul(float4(IN.position.xyz, 1.0), worldViewProj);
    OUT.texCoordDiffuse = IN.texCoordDiffuse;

    // calculate our vectors N, E, L, and H
    float3 worldEyePos = viewInverse[3].xyz;
    float3 worldVertPos = mul(IN.position, world).xyz;
    // take .xyz and normalize so N matches the float3 unit vectors below
    float3 N = normalize(mul(IN.normal, worldInverseTranspose).xyz); // normal vector
    float3 E = normalize(worldEyePos - worldVertPos);                // eye vector
    float3 L = normalize(-lightDir.xyz);                             // light vector
    float3 H = normalize(E + L);                                     // half angle vector

    // calculate the diffuse and specular contributions
    float diff = max(0, dot(N, L));
    float spec = pow(max(0, dot(N, H)), shininess);
    if (diff <= 0)
    {
        spec = 0; // no highlight on surfaces facing away from the light
    }

    // output diffuse
    float4 ambColor = materialDiffuse * lightAmbient;
    float4 diffColor = materialDiffuse * diff * lightColor;
    OUT.diffAmbColor = diffColor + ambColor;

    // output specular
    float4 specColor = materialSpecular * lightColor * spec;
    OUT.specCol = specColor;

    return OUT;
}

Difference…

Textured, using the lighting computed per vertex:

float4 PS_Textured(vertexOutput IN) : COLOR
{
    float4 diffuseTexture = tex2D(TextureSampler, IN.texCoord0Diffuse);
    float4 diffuse2Texture = tex2D(TextureSampler2, IN.texCoord1Diffuse); // sampled but not used in this version
    return IN.diffAmbColor * diffuseTexture + IN.specCol;
}

Textured, with diffuse lighting computed per pixel from a normal map:

float4 PS_Textured(vertexOutput IN) : COLOR
{
    float4 diffuseTexture = tex2D(TextureSampler, IN.texCoord0Diffuse);
    // unpack the stored normal from [0,1] back to [-1,1]
    float3 normTexture = (tex2D(TextureSampler2, IN.texCoord1Diffuse).xyz - 0.5) * 2.0;
    // transform the sampled normal to world space with the 3x3 part of the matrix
    float3 N = normalize(mul(normTexture, (float3x3)worldInverseTranspose));
    float3 L = normalize(-lightDir.xyz); // light vector
    float diff = max(0, dot(N, L));
    return diffuseTexture * diff;
}

General Requirements for writing shaders

- Hardware is optimized for graphics
  - This means you can't create your own datatypes
  - Focused on vectors and matrices
- Vertex and pixel shaders have limited inputs and outputs
- Shaders have no knowledge of what pixel or vertex they are processing
  - Tricks must be used
    - E.g. encode additional position information in color channels
    - Set texture coordinates to give information about which pixel is being processed
- Don't use if statements unless you have a really new card…
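
One common way to follow the "no if statements" advice is to turn the condition into a 0-or-1 factor with the built-in step and lerp functions. The sketch below shows a branch-free version of a test like `if (diff <= 0) spec = 0;`. The helper names MaskSpecular and SelectColor are hypothetical, and whether this is actually faster depends on the card and the compiler.

```hlsl
// Branch-free replacement for:  if (diff <= 0) { spec = 0; }
float MaskSpecular(float diff, float spec)
{
    // step(a, x) is 1 when x >= a and 0 otherwise,
    // so step(diff, 0.0) is 1 exactly when diff <= 0.
    float lit = 1.0 - step(diff, 0.0);
    return spec * lit; // spec when diff > 0, otherwise 0
}

// Branch-free choice between two colors.
float4 SelectColor(float4 a, float4 b, float t, float threshold)
{
    // Returns b when t >= threshold, otherwise a.
    return lerp(a, b, step(threshold, t));
}
```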

Cont…

- Shaders can only be so many instructions long (at least until DirectX 10)
  - Most newer graphics cards have limits around 32,000 instructions
- There are different shader versions with different features
  - For instance, if statements don't exist in Shader Model 1.0
  - Allows the programmer to write effects for many types of graphics cards (FX files)
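
As a sketch of how an FX file can target several kinds of cards, the effect below compiles the same pair of shaders for two shader models and exposes each as its own technique. The shader and technique names are placeholders; the application (for example through the D3DX effect framework) is assumed to pick a technique that the installed card can run.

```hlsl
float4x4 worldViewProj;

struct VSOut
{
    float4 hPosition : POSITION;
    float4 color     : COLOR0;
};

// Deliberately simple so that it also fits within Shader Model 1 limits.
VSOut VS_Simple(float4 position : POSITION, float4 color : COLOR0)
{
    VSOut OUT;
    OUT.hPosition = mul(position, worldViewProj);
    OUT.color     = color;
    return OUT;
}

float4 PS_Simple(VSOut IN) : COLOR
{
    return IN.color;
}

// Preferred path for newer cards.
technique ShaderModel2
{
    pass P0
    {
        VertexShader = compile vs_2_0 VS_Simple();
        PixelShader  = compile ps_2_0 PS_Simple();
    }
}

// Fallback path for older cards.
technique ShaderModel1
{
    pass P0
    {
        VertexShader = compile vs_1_1 VS_Simple();
        PixelShader  = compile ps_1_1 PS_Simple();
    }
}
```

In a real effect the two techniques would usually call different shaders, with the fallback dropping the features the older shader model lacks.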

Vertex Shaders

- Input
  - Position
  - Normal
  - Color (ambient, diffuse, specular)
  - Texture coordinates
- Output
  - Position
  - Color
  - Texture coordinates
- New features that aren't available with the fixed pipeline
  - Move vertices (displacement mapping, hair, …)
  - Texture lookup (get neighbor information, …)
  - If statements (not the best idea here, but can be OK)
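
A minimal sketch of the "move vertices" feature: the shader below pushes each vertex along its normal by a sine wave before the usual transform to screen coordinates. The `time` and `amplitude` uniforms and the name VS_Wobble are assumptions for this example and would have to be supplied by the application.

```hlsl
float4x4 worldViewProj;
float    time;      // assumed: seconds, updated by the application each frame
float    amplitude; // assumed: displacement size in object-space units

struct VSIn
{
    float4 position : POSITION;
    float3 normal   : NORMAL;
    float2 texCoord : TEXCOORD0;
};

struct VSOut
{
    float4 hPosition : POSITION;
    float2 texCoord  : TEXCOORD0;
};

VSOut VS_Wobble(VSIn IN)
{
    VSOut OUT;

    // Offset that varies over time and with the vertex height.
    float wave = sin(time + IN.position.y);

    // Move the vertex along its normal BEFORE projecting it.
    float3 moved = IN.position.xyz + IN.normal * (wave * amplitude);

    OUT.hPosition = mul(float4(moved, 1.0), worldViewProj);
    OUT.texCoord  = IN.texCoord;
    return OUT;
}
```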

Pixel Shaders

- Input
  - Color information
  - Texture coordinates
  - Position information
- Output
  - Final color
  - Depth value
  - This is it!
- Changes from the fixed pipeline
  - Dependent texture lookup (use a texture to look up into another texture)
  - If statements (really bad idea!)
  - Ability to do lighting per pixel
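
A dependent texture lookup can be sketched as follows: the result of the first tex2D call decides where the second one reads. The sampler names, the `offsetScale` uniform, and the convention of storing a 2D offset in the red and green channels are assumptions chosen for illustration.

```hlsl
sampler2D OffsetSampler; // assumed: stores a 2D offset in its red/green channels
sampler2D ColorSampler;  // the texture that is finally displayed
float     offsetScale;   // assumed: how far the offset may shift the lookup

float4 PS_DependentLookup(float2 texCoord : TEXCOORD0) : COLOR
{
    // First lookup: fetch an offset and remap it from [0,1] to [-1,1].
    float2 offset = tex2D(OffsetSampler, texCoord).rg * 2.0 - 1.0;

    // Second lookup: its coordinate depends on the result of the first.
    return tex2D(ColorSampler, texCoord + offset * offsetScale);
}
```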

What if you don't want to make an Image?
(General Purpose GPU programming)

- Encode all your data in a texture map
- Write your program in a pixel shader
- Do a texture lookup to get data and "render" the result to the image buffer
- Instead of displaying the image buffer, read it back out and you're done
  - Or if you need more processing, use the results as a new texture and process again
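
The recipe above can be sketched with a pixel shader that treats two textures as arrays of numbers and computes `scale * a + b` for every element. The sampler and uniform names are hypothetical. The sketch assumes the application draws a full-screen quad so the shader runs once per texel, and renders into a target with enough precision for the data.

```hlsl
sampler2D DataA; // assumed: first input array, four values per texel
sampler2D DataB; // assumed: second input array, same size as DataA
float     scale; // scalar multiplier set by the application

// Runs once per output texel; the "image" written is really the result array.
float4 PS_ScaleAndAdd(float2 texCoord : TEXCOORD0) : COLOR
{
    float4 a = tex2D(DataA, texCoord); // texture lookup to get the data
    float4 b = tex2D(DataB, texCoord);
    return scale * a + b;              // "render" the result
}
```

To process again, the application would bind the output as DataA or DataB for the next pass instead of reading it back.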

General Tips

- Texture maps are the key!
  - You can store 4 different values per texel; who says they have to be an image?
  - Be careful: texture maps are generally only 8 bits per channel, and values only range from 0-255
    - You can make texture maps with up to 32 bits per channel
    - Values are always clamped to 0..1, so make sure you scale your values
- Use built-in functions wherever possible
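
The scaling tip can be sketched as a pair of helpers: one maps a value from a known range into 0..1 before it is written to a texture, the other maps it back after a lookup. The function names and the idea of passing the range as parameters are assumptions for illustration. The normal-map shader earlier in these slides uses the same idea with a fixed range of -1..1.

```hlsl
// Map a value from [minValue, maxValue] into [0, 1] before writing it out.
float EncodeToUnit(float value, float minValue, float maxValue)
{
    // saturate clamps to [0, 1], matching what the texture would do anyway.
    return saturate((value - minValue) / (maxValue - minValue));
}

// Map a stored [0, 1] value back into [minValue, maxValue] after a lookup.
float DecodeFromUnit(float stored, float minValue, float maxValue)
{
    return lerp(minValue, maxValue, stored);
}
```

With only 8 bits per channel, the decoded value has at most 256 distinct levels across the chosen range, so a tighter range gives finer steps.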

Resources and Cool Things

- developer.nvidia.com
  - FX Composer
  - NVIDIA SDK (lots of code demos)
- ATI SDK and RenderMonkey
- DirectX9 SDK (an absolute must for programming GPUs)
- GPU Gems books
- OpenGL.org
- OpenGL Orange Book
- Introduction to 3D Game Programming with DirectX 9.0 by Frank Luna

Thank you and Questions?