The State of Skinning
… Or How To Maintain Your Physique
Welcome! Tervetuloa!
Rulon Raymond, Sr. Engine Programmer
Introduction
1) Review
2) Evolution of techniques on console HW
3) The new hotness (hint: it’s a Clifford Algebra)
4) Extensions
DISCLAIMER: All screenshots and techniques presented are not associated with any specific title, project, or organization, unless otherwise stated.
Outline
What is Skinning?
I Was Skinning Long Before 3D Animated Models Were All The Rage
Step 1: Generate a cool animated pose.
Step 2: ???
Step 3: Use fancy lighting and shaders to draw an animated model on-screen (i.e. profit)
What is Skinning?
Step 2: Skinning!
What is Skinning?
[Diagram: Model Vertices + Bone Weights + Bone Transforms → Skinned Model, ready for drawing]
What is Skinning?
What is Skinning?
$v' = \sum_{i=1}^{n} w_i M_{j_i} v$
• $v$: The initial vertex transform
• $w$: Array of bone weighting values
• $M$: Array of bone transforms
• $v'$: The final vertex transform
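A minimal C sketch of the weighted sum described above, for a single vertex. The `BoneMatrix` type, the function name, and the 3x4 row-major layout are assumptions made for illustration; they are not taken from the talk.

```c
#include <stddef.h>

/* Hypothetical 3x4 bone matrix: rotation in columns 0-2, translation in column 3. */
typedef struct { float m[3][4]; } BoneMatrix;

/* Linear blend skinning for one vertex: vOut = sum_i w[i] * (bones[j[i]] * v).
   The weights are assumed to sum to 1. */
void SkinVertexLBS(const float v[3], const float w[], const int j[],
                   size_t numInfluences, const BoneMatrix bones[], float vOut[3])
{
    vOut[0] = vOut[1] = vOut[2] = 0.0f;
    for (size_t i = 0; i < numInfluences; ++i) {
        const BoneMatrix *bone = &bones[j[i]];
        for (int r = 0; r < 3; ++r) {
            /* Transform v by bone j[i] (implicit homogeneous coordinate of 1). */
            float p = bone->m[r][0] * v[0] + bone->m[r][1] * v[1]
                    + bone->m[r][2] * v[2] + bone->m[r][3];
            vOut[r] += w[i] * p;
        }
    }
}
```

With two bones weighted 0.5 each, one the identity and one translating by 2 along X, the vertex (1, 1, 1) lands halfway between its two transformed positions.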
Skinning on Consoles
• Sony PlayStation (1995)
• Geometry Transform Engine (GTE)
Skinning on Consoles
• Sony PlayStation 2 (2000)
• Vector Unit 0 (VU0)
Skinning on Consoles
• Microsoft Xbox (2001)
• NVIDIA GPU (DirectX 8.x)
Skinning on Consoles
• Microsoft Xbox 360 (2005)
• PowerPC CPU
Skinning Implementation
• Sony PS3 (2006)
• Synergistic Processing Units (SPUs)
Skinning on Consoles
Why not use the GPU for skinning on Xbox 360 and PS3?
The CPUs/SPUs are actually quite fast.
Skinning Implementation
[email protected](with many restrictions…)
Why not use the GPU for skinning on Xbox 360 and PS3?
Split Vertex Streams
Skinning on Consoles
Stream 0: Vertex Position, Tangent Space
• Skinned
Stream 1: Colors, UVs, etc.
• Constant, sent straight to GPU
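One way to picture the split is as two separate vertex structs, one per stream. The layouts below are hypothetical; the 48-byte skinned vertex is chosen only so that it corresponds to three 16-byte vectors per vertex, and the real formats would depend on the title.

```c
#include <stddef.h>
#include <stdint.h>

/* Stream 0 (hypothetical layout): rewritten by the CPU/SPU skinning pass each frame.
   Three 16-byte vectors per vertex. */
typedef struct {
    float position[4];  /* xyz, w unused */
    float normal[4];    /* xyz, w unused */
    float tangent[4];   /* xyz, w = handedness sign */
} SkinnedStreamVertex;

/* Stream 1 (hypothetical layout): written once at load time, read directly by the GPU. */
typedef struct {
    uint8_t color[4];   /* RGBA8 */
    float   uv[2];
} ConstantStreamVertex;

/* Bytes the skinning pass has to write per frame for one model. */
size_t SkinnedBytesPerFrame(size_t numVertices)
{
    return numVertices * sizeof(SkinnedStreamVertex);
}
```

Only stream 0 is touched per frame, so the constant attributes never pass through the skinning code.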
Why not use the GPU for skinning on Xbox 360 and PS3?
Unified Memory Architecture
Skinning on Consoles
// Just skinned a vertex. Now write it out as
// three 16-byte vectors
__stvx( skinnedVertexData0, vertsOutBuffer, 0 );
__stvx( skinnedVertexData1, vertsOutBuffer, 16 );
__stvx( skinnedVertexData2, vertsOutBuffer, 32 );
// Gah - why’d that take so long?

// ~20% faster!
// (F*&^% write-combine memory)
__stvx( skinnedVertexData0, vertsOutBuffer, 0 );
_WriteBarrier();
__stvx( skinnedVertexData1, vertsOutBuffer, 16 );
_WriteBarrier();
__stvx( skinnedVertexData2, vertsOutBuffer, 32 );
Why not use the GPU for skinning on Xbox 360 and PS3?
So you can use the GPU for other things.
Skinning on Consoles
• Microsoft Xbox One (2013)
• Sony PS4 (2013)
• AMD GCN GPU
Skinning on Consoles
[Figure: GPU frame timeline. Labels: Draw Calls, IDLE, Draw Calls, Post FX, IDLE. Rows: GCN Compute Unit, GCN Compute Unit.]
Async Compute Skinning
Skinning on Consoles
[Figure: GPU frame timeline with async compute. Labels: Draw Calls, Skinning, Draw Calls, Post FX, Skinning. Rows: GPU Compute Unit, GPU Compute Unit.]
• Generate Draw List (frame N): Visible Models
• Async Compute Dispatch Thread: Model Skinning Workloads
• GPU rendering (frame N-1)
Result: Skinned Model (frame N)
Skinning on Consoles
Async Compute Skinning
Skinning on Consoles
MATH WARNING!
The standard approach to real-time skinning, used in almost every modern 3D game.
Linear Matrix Blend Skinning
Suffers from some well-documented problems...
The “candy wrapper” effect
Linear Matrix Blend Skinning
Mesh Volume Preservation
Example: “flat ass syndrome”
Linear Matrix Blend Skinning
Q: Why do these problems exist?
A: Let’s take a closer look at the underlying math…
Linear Matrix Blend Skinning
Linear Matrix Blend Skinning
Apply the property of distributivity: $v' = \sum_i w_i (M_{j_i} v) = \left(\sum_i w_i M_{j_i}\right) v$
Linear Matrix Blend Skinning
To keep it simple: Let each $M_j$ represent a rigid transform.
• No scale, shear, …
• Most common scenario for skinning in games.
A linear combination of rigid transforms DOES NOT yield a rigid transform!
• Orthonormal matrices aren’t closed under addition.
• Scaling values can creep into the final vertex transforms.
• Extreme cases can result in rank-deficient matrices.
Linear Matrix Blend Skinning
[Figure: vertex $v$, its two bone-transformed positions $M_{j_1} v$ and $M_{j_2} v$, and the blended result $v'$.]
Example: The “candy wrapper” artifact
The most common workaround to these issues is the addition of new bones.
• Hand-animated or procedural.
• Split the rotation of a joint, relative to its parent, into even increments, for a single axis only.
Example: Arm Twist Bone
• Parented to the shoulder and consistently represents exactly half its twist (roll) motion.
Linear Matrix Blend Skinning
Adding these bones is not free!
Memory and processing overhead.
Exact amount depends on actual implementation.
Linear Matrix Blend Skinning
Dual Quaternions to the rescue!
But what exactly are they?
Let’s start with a quick review of the vanilla variety of quaternions…
Linear Matrix Blend Skinning
Quaternions
Hamilton - 1843
A 4D extension of complex numbers
For our purposes all we care about is unit quaternions.
• Conveniently represent rotations.
• Conjugate: $q^* = w - xi - yj - zk$ for $q = w + xi + yj + zk$
Quaternions
$q^* = q^{-1}$, $\|q\| = 1$
One important quaternion equation to note: $v' = q v q^*$
Applies a rotation to a 3D point $v$ (written as a pure quaternion).
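Expanded for a unit quaternion, the sandwich product reduces to two cross products. The sketch below assumes the (x, y, z, w) storage order, with the scalar part last, which is also what the C listings later in the deck use. `QuatRotate` is a name chosen for this example.

```c
/* Rotate point v by unit quaternion q = (x, y, z, w):
   out = v + 2 * cross(q.xyz, cross(q.xyz, v) + w * v). */
void QuatRotate(const float q[4], const float v[3], float out[3])
{
    float t[3];

    /* t = cross(q.xyz, v) + w * v */
    t[0] = q[1] * v[2] - q[2] * v[1] + q[3] * v[0];
    t[1] = q[2] * v[0] - q[0] * v[2] + q[3] * v[1];
    t[2] = q[0] * v[1] - q[1] * v[0] + q[3] * v[2];

    /* out = v + 2 * cross(q.xyz, t) */
    out[0] = v[0] + 2.0f * (q[1] * t[2] - q[2] * t[1]);
    out[1] = v[1] + 2.0f * (q[2] * t[0] - q[0] * t[2]);
    out[2] = v[2] + 2.0f * (q[0] * t[1] - q[1] * t[0]);
}
```

A 90 degree rotation about Z, q = (0, 0, sin 45°, cos 45°), takes (1, 0, 0) to (0, 1, 0).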
Quaternions
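Similar in form to complex numbers: $\hat{a} = a_0 + \varepsilon a_\varepsilon$, with $\varepsilon^2 = 0$, $\varepsilon \neq 0$
Stored as: $(a_0, a_\varepsilon)$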
Dual Numbers
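Conjugate: $\overline{\hat{a}} = a_0 - \varepsilon a_\varepsilon$
Multiplication: $\hat{a}\hat{b} = a_0 b_0 + \varepsilon (a_0 b_\varepsilon + a_\varepsilon b_0)$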
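Dual number arithmetic is short enough to write out directly. In this sketch the `DualNumber` struct and function names are illustrative only. The key point is that the product of the two dual parts vanishes because the dual unit squares to zero.

```c
/* A dual number a0 + eps * ae, where eps * eps = 0. */
typedef struct { float real; float dual; } DualNumber;

/* Conjugate: negate the dual part. */
DualNumber DualConj(DualNumber a)
{
    DualNumber r = { a.real, -a.dual };
    return r;
}

/* Product: (a0 + eps*ae)(b0 + eps*be) = a0*b0 + eps*(a0*be + ae*b0).
   The ae*be term is multiplied by eps*eps and therefore drops out. */
DualNumber DualMul(DualNumber a, DualNumber b)
{
    DualNumber r = { a.real * b.real, a.real * b.dual + a.dual * b.real };
    return r;
}
```

For example, (2 + 3ε)(4 + 5ε) = 8 + 22ε, and ε · ε = 0.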
Dual Numbers
Basically a quaternion whose elements are dual numbers.
Quaternion form: $\hat{q} = (\hat{a}, \hat{d})$
• $\hat{a}$ is the scalar part (dual number)
• $\hat{d}$ is the vector part (dual vector)
Dual number form: $\hat{q} = q_a + \varepsilon q_b$
• $q_a$: “non-dual part”
• $q_b$: “dual part”
• Most useful for skinning.
Dual Quaternions
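Multiplication: $\hat{p}\hat{q} = p_a q_a + \varepsilon (p_a q_b + p_b q_a)$
Quaternion Conjugate: $\hat{q}^* = q_a^* + \varepsilon q_b^*$
Dual Conjugate: $\overline{\hat{q}} = q_a - \varepsilon q_b$
Quaternion & Dual Conjugate: $\overline{\hat{q}^*} = q_a^* - \varepsilon q_b^*$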
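These operations map directly onto ordinary quaternion arithmetic applied to the non-dual and dual parts. The sketch below uses hypothetical `Quat` and `DualQuat` structs and function names. It follows the standard dual quaternion product, in which the dual-times-dual term vanishes.

```c
typedef struct { float x, y, z, w; } Quat;       /* w is the scalar part */
typedef struct { Quat real; Quat dual; } DualQuat; /* real + eps * dual */

Quat QuatMul(Quat a, Quat b)
{
    Quat r;
    r.x = a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y;
    r.y = a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x;
    r.z = a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w;
    r.w = a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z;
    return r;
}

Quat QuatAdd(Quat a, Quat b)
{
    Quat r = { a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w };
    return r;
}

Quat QuatConj(Quat q)
{
    Quat r = { -q.x, -q.y, -q.z, q.w };
    return r;
}

/* (pa + eps*pb)(qa + eps*qb) = pa*qa + eps*(pa*qb + pb*qa) */
DualQuat DQMul(DualQuat p, DualQuat q)
{
    DualQuat r;
    r.real = QuatMul(p.real, q.real);
    r.dual = QuatAdd(QuatMul(p.real, q.dual), QuatMul(p.dual, q.real));
    return r;
}

/* Quaternion conjugate: conjugate both parts. */
DualQuat DQQuatConj(DualQuat q)
{
    DualQuat r = { QuatConj(q.real), QuatConj(q.dual) };
    return r;
}

/* Dual conjugate: negate the dual part. */
DualQuat DQDualConj(DualQuat q)
{
    DualQuat r = q;
    r.dual.x = -q.dual.x; r.dual.y = -q.dual.y;
    r.dual.z = -q.dual.z; r.dual.w = -q.dual.w;
    return r;
}
```

As a sanity check, multiplying two pure translations (non-dual part 1, dual part half the translation vector) yields the sum of the translations, and a unit dual quaternion times its quaternion conjugate yields the identity.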
Dual Quaternions
$\mathrm{Norm}(\hat{q}) = \|q_a\| + \varepsilon \frac{\langle q_a, q_b \rangle}{\|q_a\|}$
Dual Quaternions
$\hat{q}^* = \hat{q}^{-1}$, $\|\hat{q}\| = 1$
Rigid Transforms:
Dual Quaternions
Transforming a 3D point
Dual Quaternions
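Geometric Interpretation
Recall: $\hat{q} = q_0 + \frac{\varepsilon}{2} t q_0$
Dual Quaternions
• $q_0$: dual quaternion representing only a rotation
• $t$: translation vector, in quaternion form
$\hat{q} = \cos\frac{\hat{\theta}}{2} + \hat{s} \sin\frac{\hat{\theta}}{2}$, with $\hat{\theta} = \theta_0 + \varepsilon \theta_\varepsilon$
• $\theta_0$: angle of rotation
• $\theta_\varepsilon$: translation along $s_0$
$\hat{s} = s_0 + \varepsilon s_\varepsilon$: unit dual quaternion with a 0 scalar part
• $s_0$: direction of axis of rotation
• $s_\varepsilon$: moment of rotation axis
Screw Transform!
• Rotation about an axis followed by translation along that axis.
• All rigid transforms can be described this way.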
Dual Quaternions
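Simple Case: $q_{DQB} = \frac{w_0 q_0 + w_1 q_1}{\|w_0 q_0 + w_1 q_1\|}$
Dual Quaternion Blend Skinning
[Figure: input dual quaternions $q_0$ and $q_1$ and the blended result $q_{DQB}$.]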
Dual Quaternion Blend Skinning
Unlike with matrix blending, the result is always a rigid transform!
Very accurate, but not perfect.
• Can introduce accelerations when input dual quaternions differ greatly.
• 8.15 degrees: Maximum rotational deviation
• 15.1%: Maximum translational deviation
Modified SLERP can be used if absolute accuracy is required.
• Efficiency tradeoff usually not worth it.
Dual Quaternion Blend Skinning
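Must handle antipodality!
Polarity rule: $\hat{q}$ and $-\hat{q}$ represent the same rigid transform.
We want: InnerProduct( dq[i], dq[parent[i]] ) >= 0 for every bone i.
Fix up all dual quaternions prior to skinning.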
Dual Quaternion Blend Skinning
[Figure: $\hat{q}$ and $-\hat{q}$]
for ( all bones’ unit dual quaternions, dq[i] )
    if ( InnerProduct( dq[i], dq[parent[i]] ) < 0.0 )
        Negate( dq[i] );
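The pseudocode above could be written out in C roughly as follows. Two assumptions are made here that the slide does not state: the inner product is taken over the non-dual (rotation) parts only, and bones are ordered so that every parent precedes its children, with `parent[i] < 0` marking a root. `FixAntipodality` is a hypothetical name.

```c
/* dq[i][0] is the non-dual part and dq[i][1] the dual part of bone i,
   each stored as (x, y, z, w). */
void FixAntipodality(float dq[][2][4], const int parent[], int numBones)
{
    for (int i = 0; i < numBones; ++i) {
        if (parent[i] < 0)
            continue;  /* root bone: nothing to compare against */

        const float *a = dq[i][0];
        const float *b = dq[parent[i]][0];
        float dot = a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];

        /* Opposite hemisphere from the parent: flip to the equivalent -dq. */
        if (dot < 0.0f) {
            for (int part = 0; part < 2; ++part)
                for (int c = 0; c < 4; ++c)
                    dq[i][part][c] = -dq[i][part][c];
        }
    }
}
```

Because the parent has already been fixed up when a child is visited, consistency propagates down the hierarchy in a single pass.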
Dual Quaternion Blend Skinning
// Input: unit quaternion 'q0', translation vector 't'
// Output: unit dual quaternion 'dq'
static void QuatTrans2UDQ( const float q0[4], const float t[3], float dq[2][4] )
{
    // Non-Dual Part: dq[0] = q0
    for ( int i=0; i<4; i++ )
        dq[0][i] = q0[i];

    // Dual Part: dq[1] = ((0,t[0],t[1],t[2])/2)*q0
    dq[1][3] = -0.5f*(t[0]*q0[0] + t[1]*q0[1] + t[2]*q0[2]); // Scalar Component
    dq[1][0] = 0.5f*( t[0]*q0[3] + t[1]*q0[2] - t[2]*q0[1]); // Vector Component 0
    dq[1][1] = 0.5f*(-t[0]*q0[2] + t[1]*q0[3] + t[2]*q0[0]); // Vector Component 1
    dq[1][2] = 0.5f*( t[0]*q0[1] - t[1]*q0[0] + t[2]*q0[3]); // Vector Component 2
}
Generating a Dual Quaternion
Dual Quaternion Blending
Dual Quaternion Blend Skinning
// Input: array of dual quaternions 'dqIn'
// Input: array of weights 'w', totaling 1.0
// Input: size of the above two arrays (> 1)
// Output: the blended dual quaternion 'dqOut'
static void DQB( const float dqIn[][2][4], float w[], int numDQ, float dqOut[2][4] )
{
    // dqOut = w[0]*dqIn[0]
    Vec4Scale( dqIn[0][0], w[0], dqOut[0] );
    Vec4Scale( dqIn[0][1], w[0], dqOut[1] );
    for( int i = 1; i < numDQ; ++i )
    {
        // dqOut += w[i]*dqIn[i]
        Vec4Mad( dqOut[0], w[i], dqIn[i][0], dqOut[0] );
        Vec4Mad( dqOut[1], w[i], dqIn[i][1], dqOut[1] );
    }
}
Transformation Using a Dual Quaternion
Dual Quaternion Blend Skinning
// Input: unit dual quaternion 'dq'
// Input: input position 'vecIn'
// Output: rigidly transformed position 'vecOut'
static void DQTransform( const float dq[2][4], const vec3_t vecIn, vec3_t vecOut )
{
    vec4_t q0, q1;
    float a0, ae, recipDeLen;
    vec3_t d0, de, temp1, temp2, temp3, temp4, temp5;
    vec3_t temp6, temp7, temp8, temp9, temp10, temp11;

    recipDeLen = 1.0f / I_sqrt( dq[0][3]*dq[0][3] + dq[0][0]*dq[0][0] +
                                dq[0][1]*dq[0][1] + dq[0][2]*dq[0][2] );

    // Normalize both parts of the dual quaternion, based
    // on the length of the non-dual part.
    Vec4Scale( dq[0], recipDeLen, q0 );
    Vec4Scale( dq[1], recipDeLen, q1 );

    // Isolate the scalar and vector parts of both
    // quaternions. This is just for code clarity and can
    // be omitted for SIMD optimization.
    a0 = q0[3];
    ae = q1[3];
    memcpy( d0, &q0[0], sizeof( d0 ));
    memcpy( de, &q1[0], sizeof( de ));

    // Transform 'vecIn' by the dual quaternion
    // to produce 'vecOut'. vecOut = dq*v*dq^-1
    Vec3Cross( d0, vecIn, temp1 );
    Vec3Mad( temp1, a0, vecIn, temp2 );
    Vec3Scale( de, a0, temp3 );
    Vec3Scale( d0, ae, temp4 );
    Vec3Cross( d0, de, temp5 );
    Vec3Sub( temp3, temp4, temp6 );
    Vec3Add( temp6, temp5, temp7 );
    Vec3Scale( temp7, 2.0f, temp8 );
    Vec3Scale( d0, 2.0f, temp9 );
    Vec3Cross( temp9, temp2, temp10 );
    Vec3Add( vecIn, temp10, temp11 );
    Vec3Add( temp11, temp8, vecOut );
}
Dual Quaternion Blend Skinning
[Figure: bar chart, “Instruction Counts (XB360 VMX)”. Categories: Blending (2), Blending (3), Blending (4), Transform Pos, Transform Vec. Series: Matrix Skinning (column-major), DQB Skinning. Vertical axis: 0 to 35.]
Dual Quaternion Blend Skinning
[Figure: bar chart, “Instruction Counts (XB360 GPU)”. Series: Matrix Skinning (row-major), DQB Skinning. Vertical axis: 0 to 35.]
Dual Quaternion Blend Skinning
On GCN GPU
[Figure: DQ Skinning vs. Matrix Skinning, compared on Aggregate $ Efficiency, VGPR Count, Memory Stalls, and DRAM Footprint.]
DQ vs. Matrix Skinning
DQ Skinning is ~24% faster***
Dual Quaternion Blend Skinning
***: Depends heavily on vertex layout, tangent space quality, number of bones, and weighting distributions.
Optional Optimizations:
• Compress quaternions
  • 10:10:10:2 format for non-dual component
• Tune max waves/SIMD
• Generate skinning transforms on the GPU
Dual Quaternion Blend Skinning
Procedural Motions
Dual Quaternion Blend Skinning
Spore © EA (2008)
IK
Dual Quaternion Blend Skinning
Especially when animations are played on characters with different or custom proportions.
Ragdolls: Can you spot all the artifacts DQB would resolve?
Dual Quaternion Blend Skinning
Pros
• GPU/SIMD friendly
• No asset changes required
• Cheaper transform blending
• More cache friendly
• Requires less memory/constants
• Conducive to procedural motions
• (Mostly) replaces the need for the rotational split bones mentioned earlier.
• Can be enabled selectively (per-LOD, per-submesh, high end machines only)
Dual Quaternion Blend Skinning
Cons
• Less intuitive than matrices
• Local scaling must be handled separately
• Actual vertex transform is more ALU
• Still not 100% accurate
• Potential bulge artifacts
• Not widely adopted in games (yet)
No more flat asses!
Skinning
Blend Shapes
Skinning
Geometry Caching
Skinning
“Bulging-free dual quaternion skinning” (Kim, 2014)
Skinning
Skinning
1. Solve for: Bone weights on to minimize for all t.
2. Re-weight artist-selected vertices in Maya/Max.
Skinning
The optimal model skinning approach can vary per platform.
Give dual quaternion skinning a look.
Don’t assume skinning is a “solved problem”. (Unless you’re Leatherface)
Conclusion