Upload
bernard-carson
View
213
Download
0
Tags:
Embed Size (px)
Citation preview
Parallel Rendering
1
2
Introduction
• In many situations, standard rendering pipeline not sufficient
Need higher resolution display
More primitives than one pipeline can handle
•Want to use commodity components to build system that can render in parallel
•Use standard network to connect
3
Power Walls
• Where display large data sets? Need resolution comparable to data set to see detail
• Medical: CT / MRI
• Ocean / atmospheric
• Solutions? Multi LCD / Plasma panels
Multiple projectors• Commodity
• High-end
4
Tiled Display
5
CS Power Wall
•6 dual processor Intellestations•G Force 3 Graphics cards•6 commodity projectors (1024 x 768)•Gigabit ethernet•Back projected screen•Shared facility with scalable system group
Investigate OS and network issues
6
CS Power Wall
7
CS Power Wall
8
Power Wall
• Inexpensive but some problems Color matching
Vignetting
Alignment • Overlap areas
Synching
Dark field
9
Graphics Architectures
•Pipeline Architecture SGI Geometry Engine
Geometry passes through pipeline
Hardware for• clipping • transformations• texture mapping
Project/SortClipTransform Rasterize Screen
10
Building Blocks
•Graphics processors consist of geometric blocks and rasterizers
•Geometric units: transformations, clipping, lighting
•Rasterization: scan conversion, shading
•Parallelize by using mutiple blocks
•Where to do depth check?
R
G G G
R R
11
Sorting Paradigm
•Can categorize different ways of interconnecting blocks using sorting paradigm:
•each projector responsible for one area of screen
must sort primitives and assign to proper projector
•Algorithms categorized by where sorting occurs
12
Three Rendering Methods
Sort-First Rendering Sort-Middle Rendering Sort-Last Rendering
R
G G G
R R
Sort G G G
R R R
Sort R
G G G
R R
Composite
13
Sort First
•Each R assigned to area of screen•Each G coupled to own R•Must sort primitives first•Can use commodity cards
R
G G G
R R
Sort
14
Sort-First Rendering: Random Triangles Application
15
Sort Middle
• Gs and Rs decoupled
• Each G can be assigned any group of objects
• Each R assigned to area of screen
• Must sort between stages
G G G
R R R
Sort
16
Sort Last
•Couple Rs and Gs•Assign objects to Gs to load balance or via application
•Composite results at end
R
G G G
R R
Composite
17
Tree Compositing
•Composite in pairs•Send color and depth buffers•Each time half processors become idle
18
Binary Swap Compositing
•Each processor responsible for one part of display
•Pass data to right n times
19
Sort-Last Rendering: Random Triangles Application
20
Comparison
•Sort first Appealing but hard to implement
•Sort middle Used in hardware pipelines
More difficult to implement with add-on commodity cards
•Sort last Easy to implement with compositing stage
High network traffic
21
Mapping to Clusters
•Different architectures Shared vs distributed memory
Communication overhead
Parallel vs distributed algorithms
•Easy to do sort last•Must evaluate communication cost•Standard visualization strategies incorrect if transparency used
22
Vista Azul
•Experimental architecture from IBM donated to AHPCC
•Half Intel nodes, half AIX nodes•Only one (PCI) graphics card per four processors
•Contained Scalable Graphics Engine (SGE):
•high-speed high-resolution color buffer accessible by all processors
23
Vista Azul
24
Comparison Between Sort-First and Sort-Last
Sort Last Rendering vs. Sort First Rendering
0
5
10
15
20
25
30
35
0 2 4 6 8 10 12
Number of Processors
CP
U T
ime
(sec
on
ds) Sort Last Rendering
Sort First Rendering
25
Performance on PC Cluster
•Following experiments done by Ye Cong on CS cluster
6 Intellestations
Gigabit Ethernet
GForce 3 graphics
•Show effect of network
26
Sort-First vs Sort LastRandom Triangles
27
Sort First vs Sort LastTeapot
28
Azul vs Intellistations
29
Software for Parallel Rendering
•Write your own sort-first sort-last•WireGL/Chromium (Stanford)•Embed inside package (VTK)
30
WireGL: A Distributed Graphics System
• SW-based parallel rendering system unifies rendering power of collection of cluster
nodes
• Scalability achieved by integrating parallel applications into sort-first parallel Rs
• Each node in cluster: either rendering client or rendering server
• Clients submit OpenGL commands concurrently to servers
• Servers render final physical image
31
Chromium
•Successor to WireGL•Allows both sort first and sort last rendering
• Implemented on CS cluster•Most of gain in performance because Chromium and WireGL can group state-changing commands separately from rendering commands
32
Chromium vs Sort First
MRI rotation