Multi-core/Cell Game Engine Design
Henry Yu, President & CEO
Kalloc Studios

Credentials
- Worked in the video game industry for nearly 20 years
- Lead Programmer for Sierra On-Line
- Director of Technology for Activision
- Technical Director for Electronic Arts/Westwood
- Software Director for Angel Studios/Rockstar
- Software Engineer Director for THQ
- Founded Kalloc Studios in 2006

Topics of discussion
- Game industry trends
- Hardware capabilities comparison
- System architecture
- Kalloc Studios' mission

Game Industry Trends
- Consumers demand more realistic visuals, physics interactions, and A.I. behaviors
- More game content to give a full, immersive experience
- Computer hardware has been evolving toward multi-core designs
- Faster iteration time to promote rapid game development

Hardware comparison of the PlayStation 3 to the Xbox 360
- Multi-core vs. Cell-based architecture, with different synchronization models
- Hard to utilize the SPUs due to their small amount of local memory
- DMA transfers are difficult to structure
- Slower RSX graphics performance
- Memory limitations due to its non-unified memory architecture
- Slower Blu-ray ROM data throughput

Fundamental system design architecture differences between Xbox 360 and PlayStation 3
[Diagram: Xbox 360 system architecture. Core 1, Core 2, Core 3 and the GPU share 512 MB of unified memory (UMA).]
[Diagram: PlayStation 3 system architecture. One PPU and eight SPU blocks connected through the system bus (DMA); 256 MB XDR main system memory; 256 MB DDR3 video local memory attached to the GPU.]

Current Kalloc Engine Capabilities
- Cross-platform for Xbox 360, PlayStation 3 and PC
- 720p and 1080p high-definition support
- 400 or more characters fully skinned with ~5000 polygons and 91 bones (4 weight influences)
- 50 or more vehicles with ~3000 polygons
- Normal-mapped characters
- All characters with facial animations
- Overall polygon throughput ~24 million polygons per second
- Full collision detection with dynamic objects such as characters and vehicles
- NPCs driving and responding to collisions
- NPCs have reactive behaviors toward the player's actions

System Architecture
- Local Store and Data Streaming Model
- Multi-Threaded Architecture
- Graphics Subsystem
- Animation System
- Physics Components
- Asset Pipeline via Live Update System

Local Store and Data Streaming Model
- The architecture works like a set of arrays: individual game objects, physics objects, render objects, etc. are each allocated in a contiguous chunk of memory reserved for that type of object.
- The contiguous chunk of memory can then be transferred by DMA to a PS3 SPU, or even kept cached in the SPU's local memory.
- Having objects in contiguous memory is an optimization for the PS3 that also yields performance increases on the Xbox 360, because cache misses are reduced.
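
The contiguous per-type allocation described above can be sketched roughly as follows. This is a minimal illustration, not Kalloc's implementation: `TypedPool`, `PhysicsObject`, its fields, and `integrate` are hypothetical names chosen for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical component type; the field layout is an assumption.
struct PhysicsObject {
    float position[3];
    float velocity[3];
};

// One pool per object type: all live objects sit back to back in one array,
// so any range of them is a single block suitable for one bulk transfer.
template <typename T>
class TypedPool {
public:
    explicit TypedPool(std::size_t capacity) { items_.reserve(capacity); }

    // Returns the index of the new object.
    std::size_t add(const T& value) {
        items_.push_back(value);
        return items_.size() - 1;
    }

    // Swap-and-pop keeps the storage dense (no holes) at the cost of ordering.
    void remove(std::size_t index) {
        items_[index] = items_.back();
        items_.pop_back();
    }

    T* data() { return items_.data(); }
    std::size_t size() const { return items_.size(); }
    std::size_t sizeInBytes() const { return items_.size() * sizeof(T); }

private:
    std::vector<T> items_;
};

// Linear walk over the pool: consecutive objects share cache lines.
inline void integrate(TypedPool<PhysicsObject>& pool, float dt) {
    PhysicsObject* objects = pool.data();
    for (std::size_t i = 0; i < pool.size(); ++i)
        for (int axis = 0; axis < 3; ++axis)
            objects[i].position[axis] += objects[i].velocity[axis] * dt;
}
```

Because `data()` and `sizeInBytes()` describe one flat block, the same pool could be handed to a bulk copy on any platform; on a cache-based CPU the linear walk in `integrate` is what reduces cache misses.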

Data Streaming Model to process tasks
[Diagram: Data Streaming Process Model. An incoming data stream arrives over the DMA data bus into the SPU's 256 KB local store, is processed by the SPU processor, and leaves as an outgoing data stream over the DMA data bus.]
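
The stream-in, process, stream-out pattern in the diagram can be approximated on any CPU as below. This is a sketch under stated assumptions: `std::memcpy` stands in for the DMA transfers a real SPU program would issue, and `kChunkBytes` and `streamScale` are hypothetical names, with the chunk size picked only so that it fits comfortably inside a 256 KB local store.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Assumed budget for one in-flight chunk; a real SPU job must also fit its
// code, stack, and any second buffer inside the 256 KB local store.
constexpr std::size_t kChunkBytes = 16 * 1024;
constexpr std::size_t kChunkFloats = kChunkBytes / sizeof(float);

// Streams `input` through a fixed-size scratch buffer: copy in, process, copy out.
inline std::vector<float> streamScale(const std::vector<float>& input, float factor) {
    std::vector<float> output(input.size());
    float localStore[kChunkFloats];  // stand-in for SPU local store

    for (std::size_t offset = 0; offset < input.size(); offset += kChunkFloats) {
        const std::size_t count = std::min(kChunkFloats, input.size() - offset);

        // "DMA in": main memory to local store.
        std::memcpy(localStore, input.data() + offset, count * sizeof(float));
        // Process entirely out of the local buffer.
        for (std::size_t i = 0; i < count; ++i) localStore[i] *= factor;
        // "DMA out": local store back to main memory.
        std::memcpy(output.data() + offset, localStore, count * sizeof(float));
    }
    return output;
}
```

A production version would typically use two buffers so that the next chunk is transferred while the current one is processed; that refinement is omitted here to keep the flow readable.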

Multi-threaded Architecture
- Thread-based model and SPU Thread Server implementation for the task-based architecture
- Multi-threaded scheduler manages both blocking and non-blocking processes
- Multi-stage implementation for data synchronization
- N + 1 frame pipelining: the GPU runs concurrently with the CPU cores and SPUs

Functional-Based Multi-threaded Architecture
- Functional-based architecture associates one thread with each subsystem. All subsystems are processed simultaneously.
- Advantages: Very easy to implement, since it does not require tasks to be divided or dependencies to be resolved. Suitable for middleware solutions.
- Disadvantages: Uneven distribution of processing load, since one slow subsystem can hold up the rest of the processors and leave them idle. Mutexes or some other synchronization protection must be used to resolve data dependencies.
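
As a rough illustration of the functional-based model, the sketch below gives each subsystem its own thread for one frame and guards shared state with a mutex. `SharedWorld`, `runFrameFunctional`, and `recordEvents` are hypothetical names invented for this example.

```cpp
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// State touched by more than one subsystem must be guarded.
struct SharedWorld {
    std::mutex lock;
    int eventsProcessed = 0;
};

// Runs every subsystem on its own thread for one frame, then waits for all.
// The frame takes as long as the slowest subsystem.
inline void runFrameFunctional(
    const std::vector<std::function<void(SharedWorld&)>>& subsystems,
    SharedWorld& world) {
    std::vector<std::thread> threads;
    threads.reserve(subsystems.size());
    for (const auto& subsystem : subsystems)
        threads.emplace_back([&subsystem, &world] { subsystem(world); });
    for (auto& t : threads) t.join();
}

// Hypothetical subsystem body: records its work under the mutex.
inline void recordEvents(SharedWorld& world, int count) {
    std::lock_guard<std::mutex> guard(world.lock);
    world.eventsProcessed += count;
}
```

The `join` loop is where the uneven-load disadvantage shows up: threads that finish early simply wait there for the slowest one.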

Task-Based Multi-threaded Architecture
- Task-based architecture uses all threads to process one subsystem at a time. Subsystems are processed in a given order.
- Large tasks must be divided into smaller tasks so that they can be distributed across all processors.
- Advantages: Extremely even balance of processor load. Virtually eliminates the problem of waiting for the slowest task. Because subsystems are processed in a fixed order, many dependencies are removed, allowing data access without mutex locking.
- Disadvantages: Difficult to implement, since all tasks must be divided and their dependencies resolved. Hard to use with middleware solutions, since this architecture is relatively new.
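
One common way to realize the task-based model is a shared task counter that every worker thread pulls from. The sketch below is an assumption about how such a scheduler might look, not the engine's actual code; `runSubsystemTasks` and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Splits [0, itemCount) into tasks of `itemsPerTask` items (must be > 0).
// Workers claim the next task from a shared atomic counter until none remain.
inline void runSubsystemTasks(
    std::size_t itemCount, std::size_t itemsPerTask, unsigned workerCount,
    const std::function<void(std::size_t, std::size_t)>& processRange) {
    const std::size_t taskCount = (itemCount + itemsPerTask - 1) / itemsPerTask;
    std::atomic<std::size_t> nextTask{0};

    auto worker = [&] {
        for (;;) {
            const std::size_t task = nextTask.fetch_add(1);
            if (task >= taskCount) return;
            const std::size_t begin = task * itemsPerTask;
            const std::size_t end = std::min(begin + itemsPerTask, itemCount);
            processRange(begin, end);  // ranges are disjoint, so no lock is needed
        }
    };

    std::vector<std::thread> workers;
    for (unsigned i = 0; i < workerCount; ++i) workers.emplace_back(worker);
    // Barrier: the next subsystem starts only after every task has finished.
    for (auto& t : workers) t.join();
}
```

Calling this once per subsystem, in a fixed order, gives the behavior described above: all threads work on the same subsystem, and the join between calls removes the need for locks across subsystems. A real engine would keep a persistent worker pool rather than spawning threads per call.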

Task Distribution Model
[Diagram: an incoming task set (TASK 1 through TASK n+1) is handed to SPU_SERV, which distributes the tasks across the available SPUs (SPU 0 through SPU 5).]
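
The slide does not state which policy SPU_SERV uses to pick an SPU. As one plausible illustration only, the sketch below hands each incoming task to whichever worker becomes free first; `Assignment`, `distribute`, and the per-task cost estimates are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Assignment {
    std::vector<std::vector<int>> tasksPerWorker;  // task ids given to each worker
    std::vector<int> busyUntil;                    // accumulated cost per worker
};

// Hands each incoming task to whichever worker becomes free first.
// `taskCosts[i]` is an assumed cost estimate for task i; workerCount must be > 0.
inline Assignment distribute(const std::vector<int>& taskCosts, std::size_t workerCount) {
    Assignment result;
    result.tasksPerWorker.resize(workerCount);
    result.busyUntil.assign(workerCount, 0);

    for (std::size_t task = 0; task < taskCosts.size(); ++task) {
        // min_element returns the first minimum, so ties go to the lowest index.
        const auto freeWorker =
            std::min_element(result.busyUntil.begin(), result.busyUntil.end());
        const std::size_t index =
            static_cast<std::size_t>(freeWorker - result.busyUntil.begin());
        result.tasksPerWorker[index].push_back(static_cast<int>(task));
        *freeWorker += taskCosts[task];
    }
    return result;
}
```

With costs `{4, 1, 1, 1, 1}` and two workers, the first worker takes the single expensive task while the second absorbs the four cheap ones, so both finish at the same time.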

Solutions to Data Synchronization
- Mutex locks using critical sections
- Data separation using multiple stages (e.g. read and write stages)
- Local Store Model using ring buffers
- Component object level organization to separate data dependencies
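
Of the options listed, the ring buffer is the easiest to show in isolation. The following is a generic single-producer, single-consumer ring buffer, offered as a sketch of the technique rather than the engine's own container; the class name and interface are hypothetical.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Single-producer, single-consumer ring buffer. One slot is always left
// empty so that head == tail unambiguously means "empty".
template <typename T, std::size_t Capacity>
class RingBuffer {
public:
    // Called by the producer thread only. Returns false when full.
    bool push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % Capacity;
        if (next == tail_.load(std::memory_order_acquire)) return false;
        slots_[head] = value;
        head_.store(next, std::memory_order_release);  // publish the slot
        return true;
    }

    // Called by the consumer thread only. Returns false when empty.
    bool pop(T& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) return false;
        out = slots_[tail];
        tail_.store((tail + 1) % Capacity, std::memory_order_release);
        return true;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};  // next slot to write
    std::atomic<std::size_t> tail_{0};  // next slot to read
};
```

Because the producer writes only `head_` and the consumer writes only `tail_`, the two sides never contend for a lock, which is the property that makes this structure attractive for feeding a worker with a small local memory.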

Current Graphics System
- 720p and 1080p native support
- Interleaved vertex format with 16-bit normals and UV data to maximize data throughput
- Multi-level shadow map to enhance resolution quality
- Use of instancing to increase rendering performance
- Depth of field effect
- High dynamic range lighting with tone mapping
- Particle effects
- Hardware instancing for rendering props
- Scene graph techniques such as octree and occlusion systems to further optimize large-scale rendering
- Supports unlimited number of bones for animation
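
To make the interleaved 16-bit vertex format concrete, here is one possible layout with the matching quantization helpers. The slide only states that normals and UVs are 16-bit; the exact fields, the padding, and the names `PackedVertex`, `packSnorm16`, and `packUnorm16` are assumptions for illustration.

```cpp
#include <cmath>
#include <cstdint>

// Interleaved layout: all attributes of one vertex are adjacent in memory.
struct PackedVertex {
    float         position[3];  // 12 bytes, full precision
    std::int16_t  normal[3];    // 6 bytes, signed normalized
    std::uint16_t uv[2];        // 4 bytes, unsigned normalized
    std::uint16_t padding;      // 2 bytes, keeps the stride at 24
};
static_assert(sizeof(PackedVertex) == 24, "unexpected vertex stride");

// Maps [-1, 1] to a signed 16-bit integer (values outside are clamped).
inline std::int16_t packSnorm16(float v) {
    v = std::fmax(-1.0f, std::fmin(1.0f, v));
    return static_cast<std::int16_t>(std::lround(v * 32767.0f));
}
inline float unpackSnorm16(std::int16_t v) {
    return static_cast<float>(v) / 32767.0f;
}

// Maps [0, 1] to an unsigned 16-bit integer (values outside are clamped).
inline std::uint16_t packUnorm16(float v) {
    v = std::fmax(0.0f, std::fmin(1.0f, v));
    return static_cast<std::uint16_t>(std::lround(v * 65535.0f));
}
inline float unpackUnorm16(std::uint16_t v) {
    return static_cast<float>(v) / 65535.0f;
}
```

With full-precision floats the same attributes would take 32 bytes per vertex (12 + 12 + 8); this layout takes 24, which is where the throughput gain comes from.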

Animation System
- Supports unlimited bones per character
- Key frame compression
- Quaternion-based interpolation
- Support for up to 9 channels of animation: rotation, translation and scale
- Support for overlaid animations
- Procedural animation to minimize the number of animations in game
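
Quaternion-based interpolation between two key frames is usually done with spherical linear interpolation (slerp). The slide does not say which variant the engine uses, so the following is a textbook sketch; `Quat`, `dot`, `normalize`, and `slerp` are names chosen for the example.

```cpp
#include <cmath>

struct Quat { float w, x, y, z; };

inline float dot(const Quat& a, const Quat& b) {
    return a.w * b.w + a.x * b.x + a.y * b.y + a.z * b.z;
}

inline Quat normalize(const Quat& q) {
    const float len = std::sqrt(dot(q, q));
    return {q.w / len, q.x / len, q.y / len, q.z / len};
}

// Spherical linear interpolation between unit quaternions a and b, t in [0, 1].
inline Quat slerp(const Quat& a, Quat b, float t) {
    float cosTheta = dot(a, b);
    if (cosTheta < 0.0f) {  // q and -q are the same rotation: take the shorter arc
        b = {-b.w, -b.x, -b.y, -b.z};
        cosTheta = -cosTheta;
    }
    float wa = 1.0f - t;  // linear weights, used when the inputs are nearly parallel
    float wb = t;
    if (cosTheta < 0.9995f) {
        const float theta = std::acos(cosTheta);
        const float sinTheta = std::sin(theta);
        wa = std::sin((1.0f - t) * theta) / sinTheta;
        wb = std::sin(t * theta) / sinTheta;
    }
    return normalize({wa * a.w + wb * b.w, wa * a.x + wb * b.x,
                      wa * a.y + wb * b.y, wa * a.z + wb * b.z});
}
```

The linear fallback avoids dividing by a near-zero `sinTheta` when two key frames are almost identical, which is common in compressed animation data.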

Physics Component
- Use of a component system to accommodate different physics middleware and a custom physics engine: Havok, Bullet and Ageia PhysX
- Simple custom physics system
- Sphere to sphere, box to box, box to sphere, etc. collisions
- 2D grid partition optimizations
- Per-cell collision detection
- Simple vehicle simulation
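
The combination of a 2D grid partition with per-cell sphere tests might look like the sketch below. It is a simplified illustration with hypothetical names (`Sphere`, `spheresOverlap`, `findCollisions`), and it assumes the grid lies in the XZ plane, which the slide does not specify.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

struct Sphere { float x, y, z, radius; };

// Compares squared distances, so no square root is needed.
inline bool spheresOverlap(const Sphere& a, const Sphere& b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    const float reach = a.radius + b.radius;
    return dx * dx + dy * dy + dz * dz <= reach * reach;
}

// Buckets spheres by the XZ grid cell containing their center, then tests
// only pairs that share a cell. Simplification: a sphere straddling a cell
// border is not tested against the neighboring cell; a fuller version would
// insert it into every cell it touches.
inline std::vector<std::pair<std::size_t, std::size_t>>
findCollisions(const std::vector<Sphere>& spheres, float cellSize) {
    std::map<std::pair<int, int>, std::vector<std::size_t>> grid;
    for (std::size_t i = 0; i < spheres.size(); ++i) {
        const int cx = static_cast<int>(std::floor(spheres[i].x / cellSize));
        const int cz = static_cast<int>(std::floor(spheres[i].z / cellSize));
        grid[{cx, cz}].push_back(i);
    }

    std::vector<std::pair<std::size_t, std::size_t>> hits;
    for (const auto& cell : grid) {
        const std::vector<std::size_t>& members = cell.second;
        for (std::size_t a = 0; a < members.size(); ++a)
            for (std::size_t b = a + 1; b < members.size(); ++b)
                if (spheresOverlap(spheres[members[a]], spheres[members[b]]))
                    hits.emplace_back(members[a], members[b]);
    }
    return hits;
}
```

The grid turns an all-pairs test into many small per-cell tests, and since each cell's member list is independent, cells are also a natural unit of work to hand out as tasks.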

Instant Asset Update System for Asset Pipeline
- Instant refreshing of assets without restarting the engine/game
- No intermediate file formats = quick export process
- Instant feedback for artists and designers to check for data validity and quality
- No overnight build/baking process
- Asset sharing between designers, artists or programmers within the network
- Built-in support for art outsourcing
- Easy DVD/Blu-ray burns for archiving and build delivery
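
The slides do not describe how the live update system detects a changed asset. One simple way a running engine could decide whether an incoming export needs to be refreshed is to compare content hashes, sketched below; `AssetRegistry` and its methods are hypothetical and not part of any Kalloc API.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Tracks a content hash per asset and reports when an incoming export
// differs from what the running game already holds.
class AssetRegistry {
public:
    // Returns true when the asset is new or changed, meaning the caller
    // should rebuild its runtime data in place without restarting.
    bool update(const std::string& name, const std::string& bytes) {
        const std::size_t hash = std::hash<std::string>{}(bytes);
        const auto it = hashes_.find(name);
        if (it != hashes_.end() && it->second == hash) return false;  // unchanged
        hashes_[name] = hash;
        ++reloadCount_;
        return true;
    }

    int reloadCount() const { return reloadCount_; }

private:
    std::unordered_map<std::string, std::size_t> hashes_;
    int reloadCount_ = 0;
};
```

An exporter plugin could push the raw bytes of a mesh or texture to the running game over the network and have `update` decide whether anything needs to be rebuilt, which matches the "no intermediate file formats" goal above.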

Mission of Kalloc Studios
- Create a truly next-gen multi-platform game engine that maximizes cutting-edge hardware such as multi-core and Cell architectures and the latest graphics rendering capabilities
- Create innovative, quality game titles
- Train highly motivated talent to become industry specialists

Questions?