
HistoPyramid stream compaction and expansion

Christopher Dyken¹* and Gernot Ziegler²

Advanced Computer Graphics / Vision Seminar, TU Graz, 23/10-2007

¹ University of Oslo
² Max-Planck-Institut für Informatik, Saarbrücken


GPUs are highly parallel and well suited to data-local operations.

However, re-arranging or selectively deleting data is difficult on GPUs.

Stream compaction and expansion are such operations:

Stream compaction
For each element in the input stream, let a predicate determine whether the element should be discarded.

Produce a compact output stream of the remaining elements.

Generalization: Stream compaction and expansion
For each element in the input stream, let a predicate determine the element's multiplicity, i.e. how many times the element should be present in the output stream (0 ⇒ discard).

Produce a compact output stream of all elements with a multiplicity greater than 0.
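As a reference for the semantics only (this is a sequential sketch, not a GPU implementation), the generalized operation could be written in Python as below. The names `compact_expand` and `example_multiplicity` are illustrative assumptions and do not come from the talk.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def compact_expand(stream: Iterable[T], multiplicity: Callable[[T], int]) -> List[T]:
    """Sequential reference: repeat each element multiplicity(element) times.

    A multiplicity of 0 discards the element, 1 keeps it, >1 repeats it.
    """
    output: List[T] = []
    for element in stream:
        output.extend([element] * multiplicity(element))
    return output


def example_multiplicity(value: int) -> int:
    """Hypothetical predicate: drop values below 5, duplicate values of 10 or more."""
    if value < 5:
        return 0
    if value >= 10:
        return 2
    return 1


result = compact_expand([3, 8, 5, 10], example_multiplicity)
```

With a predicate that only returns 0 or 1, the same function performs plain stream compaction.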


Why bother with stream compaction and expansion?

• Feature extraction:
  - get a compact list of all locations satisfying a criterion.
• Point cloud generation:
  - generate a set of points on an implicitly defined surface.
• Compaction of intermediate results:
  - save GPU → CPU bandwidth.
• Emulate/offload the geometry shader (GS):
  - our HP-based Marching Cubes implementation does not need a GS, and currently even outperforms GS-based approaches.
• Sparse matrix extraction.
• ...

Data re-organization is an active field of research (see GPU Gems II and III, etc.).


The HistoPyramid algorithm

• The input stream is laid out over an N × N grid (a 2D texture, tex2D).
• Each input element is subjected to a predicate function:
  - count = 0 ⇒ discard from the output stream.
  - count = 1 ⇒ keep in the output stream.
  - count > 1 ⇒ repeat in the output stream.
• The output of the predicate function forms the HP base layer.
  - The HP is a mipmap pyramid of partial sums.
  - The HP is built in log2(N) passes.
  - Top element of the HP: number of elements in the output.
• Then, iterate over the output elements:
  - extraction of an element is done in log2(N) texture lookups;
  - each input element can have multiple copies in the output.
• No data transfer from GPU to CPU.


Overview

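[Figure: pipeline. Input image → Predicate → Bucket count → HP-builder → HistoPyramid → Extractor → Point list]

Bucket count (4 × 4):

0 2 2 0
2 0 0 2
0 1 1 0
0 1 1 0

HistoPyramid reduction (2 × 2, then 1 × 1):

4 4
2 2

12

Point list (12 entries):

(6,5) (4,6) (5,6)
(2,6) (3,6) (7,4)
(5,2) (0,4) (1,5)
(2,1) (2,2) (5,1)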


Predicate function

• For each input element, determine its output stream count.
  - The count is often binary (1/0).
• For different predicates, several base layers can be built:
  - one HP is built per base layer, in parallel;
  - predicates may overlap if needed;
  - NV40: 4 × RGBA = 16 predicates, G80: 8 × RGBA = 32 predicates.
• Example: extract a list of edge pixels (Laplace + threshold):

[Figure: input data → Laplace + threshold (the predicate) → HP base level]
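The edge-pixel predicate could look roughly like the following CPU sketch in Python. The talk does not give the filter kernel or the border handling, so the 4-neighbour Laplacian, the clamped borders, and the name `laplace_threshold_base` are all assumptions made for illustration.

```python
from typing import List


def laplace_threshold_base(image: List[List[float]], threshold: float) -> List[List[int]]:
    """Return a 0/1 base layer: 1 where |4-neighbour Laplacian| exceeds threshold.

    Assumption: pixels outside the image are clamped to the nearest border pixel.
    """
    height, width = len(image), len(image[0])

    def pixel(y: int, x: int) -> float:
        # Clamp coordinates to the image (assumed border policy).
        return image[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    base = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            laplacian = (pixel(y - 1, x) + pixel(y + 1, x)
                         + pixel(y, x - 1) + pixel(y, x + 1)
                         - 4 * pixel(y, x))
            base[y][x] = 1 if abs(laplacian) > threshold else 0
    return base


# A vertical step edge between columns 1 and 2.
step_image = [[0, 0, 10, 10] for _ in range(4)]
edge_base = laplace_threshold_base(step_image, threshold=5)
```

Each cell of `edge_base` is the multiplicity of the corresponding pixel, so it can serve directly as an HP base layer.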


HistoPyramid builder

• Mipmap generation, but with a sum instead of an average:
  - in effect, each cell counts the elements in its sub-pyramid:

Base level, 4 × 4:

1 1 0 1
1 0 1 0
0 2 0 1
1 0 0 0

Level 1, 2 × 2:

3 2
3 1

Level 2, 1 × 1:

9

• Top element: total number of elements in the base layer ⇒ output size.
• Example: HP of the Lena edge pixels (red = nonzero count):

[Figure: HistoPyramid of the Lena edge pixels]
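The reduction can be sketched on the CPU in a few lines of Python. This is a minimal illustration assuming a square base layer whose side is a power of two; the name `build_histopyramid` is hypothetical, and on the GPU each loop iteration would correspond to one reduction pass. The base layer is the 4 × 4 example from this slide.

```python
from typing import List

Level = List[List[int]]


def build_histopyramid(base: Level) -> List[Level]:
    """Return [base, level 1, ..., top], where each level sums 2x2 blocks of the previous."""
    n = len(base)
    if n == 0 or n & (n - 1) != 0 or any(len(row) != n for row in base):
        raise ValueError("base must be square with a power-of-two side")
    levels = [[list(row) for row in base]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        half = len(prev) // 2
        levels.append([
            [prev[2 * y][2 * x] + prev[2 * y][2 * x + 1]
             + prev[2 * y + 1][2 * x] + prev[2 * y + 1][2 * x + 1]
             for x in range(half)]
            for y in range(half)
        ])
    return levels


slide_base = [
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 2, 0, 1],
    [1, 0, 0, 0],
]
pyramid = build_histopyramid(slide_base)
output_size = pyramid[-1][0][0]  # top element = number of output elements
```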


Pointlist extractor

Input: the output index is used as the key index.

Key indices (input):

0 1 2
3 4 5
6 7 8

Texcoords and clone index (output):

[0,0],0  [1,0],0  [0,1],0
[3,0],0  [2,1],0  [1,2],0
[1,2],1  [0,3],0  [3,2],0

Notice: multiplicity from the base layer:
= 1 ⇒ copy once.
> 1 ⇒ copy multiple times.

HistoPyramid traversed in the example:

L2 (1 × 1):

9

L1 (2 × 2), key intervals [0,3) [3,5) [5,8) [8,9):

3 2
3 1

L0 (4 × 4):

1 1 0 1
1 0 1 0
0 2 0 1
1 0 0 0

Key intervals within the top-right 2 × 2 block of L0: ∅ [0,1) [1,2) ∅
Key intervals within the bottom-left 2 × 2 block of L0: ∅ [0,2) [2,3) ∅
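The top-down traversal illustrated on this slide can be sketched in Python as follows. The function name `extract` and the list-of-levels representation are assumptions for illustration; the child visiting order (top-left, top-right, bottom-left, bottom-right) is inferred from the intervals shown in the example. The pyramid literal repeats the numbers from the slide.

```python
from typing import List, Tuple

# Pyramid from the slide: LEVELS[0] is the base (L0), LEVELS[-1] is the top (L2).
LEVELS = [
    [[1, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 2, 0, 1],
     [1, 0, 0, 0]],
    [[3, 2],
     [3, 1]],
    [[9]],
]


def extract(levels: List[List[List[int]]], key: int) -> Tuple[Tuple[int, int], int]:
    """Map an output index (key) to ((x, y), clone_index) in the base layer."""
    total = levels[-1][0][0]
    if not 0 <= key < total:
        raise IndexError("key index out of range")
    x = y = 0
    # Walk from the level just below the top down to the base level.
    for level in reversed(levels[:-1]):
        x, y = 2 * x, 2 * y
        for dx, dy in ((0, 0), (1, 0), (0, 1), (1, 1)):
            count = level[y + dy][x + dx]
            if key < count:
                # The key falls inside this child's interval: descend into it.
                x, y = x + dx, y + dy
                break
            key -= count  # skip this child's interval
    # What is left of the key is the clone index within the base cell.
    return (x, y), key


point_list = [extract(LEVELS, k) for k in range(9)]
```

Every output element is resolved independently of the others, which is what allows the extraction to run in parallel over all key indices.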


The Marching Cubes Algorithm

The input to the algorithm is an M³ grid of scalar values.

Examine groups of 2 × 2 × 2 voxels (an MC cell).

Check whether each of the MC cell's corners is inside or outside the iso-level.

8 corners, each inside or outside ⇒ 2⁸ = 256 classes.

Each MC class: a combination of edges that pierce the iso-surface.

Use a table with geometry for the MC classes, with all possible triangulations of the edge intersections (figure).

Determine the exact edge-surface intersections and emit the corresponding triangles.

Notice: Effectively a stream compaction/expansion process!
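The classification and intersection steps could be sketched in Python as below. The corner ordering (bit i corresponds to corner i), the convention that a corner counts as "inside" when its value is at or above the iso-level, and the use of linear interpolation along an edge are assumptions for illustration; the slides do not specify them. The names `mc_class` and `edge_intersection` are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def mc_class(corner_values: Sequence[float], iso: float) -> int:
    """Pack the inside/outside state of the 8 corners into a class index in 0..255."""
    if len(corner_values) != 8:
        raise ValueError("an MC cell has exactly 8 corners")
    index = 0
    for bit, value in enumerate(corner_values):
        if value >= iso:  # assumed convention for "inside"
            index |= 1 << bit
    return index


def edge_intersection(p0: Point, p1: Point, v0: float, v1: float, iso: float) -> Point:
    """Linearly interpolate the point on edge p0-p1 where the field equals iso.

    Only meaningful for an edge that pierces the surface, so v0 != v1.
    """
    t = (iso - v0) / (v1 - v0)
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))


all_outside = mc_class([0.0] * 8, iso=0.5)
all_inside = mc_class([1.0] * 8, iso=0.5)
two_corners = mc_class([1, 0, 0, 0, 0, 0, 0, 1], iso=0.5)
crossing = edge_intersection((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 0.0, 1.0, iso=0.25)
```

The class index would then be used to look up the cell's vertex count and triangulation in the table mentioned above.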


HistoPyramid Marching Cubes

[Figure: per-frame pipeline. Steps: start new frame → update scalar field → build HP base → HP reduce (for each level) → vertex count readback → render geometry. Data: iso-level, scalar field texture, vertex count texture, HistoPyramid texture, triangulation table texture, enumeration VBO.]

Input: A stream of (M − 1)³ MC cells (2 × 2 × 2 voxels grouped).

Predicate: Samples the corners of the MC cell, determines the MC class from their inside/outside state, then writes the number of vertices required for the MC geometry to the base layer.

HistoPyramid: The top element gives the total number of vertices in the iso-surface (3 × the number of triangles).

Extraction: Use the output index to traverse the HP and determine the corresponding input element (i.e. which MC cell). The remainder tells which edge intersection this vertex corresponds to. Determine the edge intersection and emit the position.


Datasets used in the performance analysis

[Figure: renderings of the six datasets: Bunny, CThead, MRbrain, Bonsai, Aneurism, Cayley]


Performance of HistoPyramid Marching Cubes

Model     MC cells (grid)  MC cells (count)  Density  6600GT HP-VS  7800GT HP-VS  8800GTX HP-VS   8800GTX NV-SDK10
Bunny     255x255x255      16581375          3.2%     –             –             538.6 (32.5)    –
          127x127x127      2048383           5.6%     5.4 (2.6)     11.8 (5.8)    309.5 (151.1)   –
          63x63x63         250047            9.1%     4.0 (16.1)    8.5 (34.1)    163.4 (653.5)   28.3 (113.2)
          31x31x31         29791             13.6%    2.5 (82.8)    5.0 (167.9)   25.5 (857.0)    21.9 (734.0)
CThead    255x255x128      8323200           3.7%     –             16.3 (2.0)    437.6 (53.0)    –
          127x127x63       1016127           6.3%     5.4 (5.3)     11.6 (11.5)   288.1 (283.6)   –
          63x63x31         123039            9.6%     3.7 (29.9)    7.7 (62.2)    97.3 (791.0)    25.3 (205.9)
          31x31x15         14415             14.5%    2.3 (161.3)   4.5 (311.5)   12.9 (896.4)    17.1 (1187.0)
MRbrain   255x255x128      8323200           5.8%     –             10.5 (1.3)    309.0 (37.4)    –
          127x127x63       1016127           7.4%     4.6 (4.5)     9.9 (9.7)     257.7 (263.6)   –
          63x63x31         123039            10.0%    3.5 (28.6)    7.4 (60.0)    96.8 (786.5)    26.4 (214.9)
          31x31x15         14415             14.9%    2.2 (155.0)   4.3 (300.9)   12.7 (879.7)    18.2 (1257.4)
Bonsai    255x255x255      16581375          3.0%     –             –             560.8 (33.8)    –
          127x127x127      2048383           5.1%     5.9 (2.9)     13.0 (6.3)    329.8 (161.0)   –
          63x63x63         250047            6.7%     5.4 (21.5)    11.4 (45.5)   186.5 (745.9)   28.9 (115.6)
          31x31x31         29791             8.2%     4.1 (136.8)   8.0 (268.8)   25.1 (843.0)    24.0 (804.6)
Aneurism  255x255x255      16581375          1.6%     –             –             892.5 (53.8)    –
          127x127x127      2048383           2.1%     12.6 (6.1)    29.1 (14.2)   557.6 (272.2)   –
          63x63x63         250047            3.7%     9.1 (36.2)    19.2 (76.7)   190.5 (761.9)   32.9 (131.5)
          31x31x31         29791             6.8%     4.5 (149.7)   8.6 (289.1)   25.0 (839.3)    25.5 (856.6)
Cayley    255x255x255      16581375          0.9%     –             –             1112.3 (67.1)   –
          127x127x127      2048383           1.9%     13.5 (6.6)    31.2 (15.2)   581.3 (283.8)   –
          63x63x63         250047            3.9%     8.5 (33.9)    17.9 (71.6)   198.0 (791.9)   32.1 (128.5)
          31x31x31         29791             8.1%     3.7 (123.8)   7.3 (245.8)   25.8 (866.2)    24.7 (827.9)

Numbers are in million voxels processed per second (in parentheses: MC runs per second, i.e. frame rate).
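The two figures in each table entry appear to be related through the MC cell count: dividing the throughput by the number of MC cells gives approximately the value in parentheses. A small Python check of that arithmetic, using two Bunny rows (the function name `runs_per_second` is hypothetical):

```python
def runs_per_second(million_per_second: float, mc_cells: int) -> float:
    """Convert throughput (millions processed per second) into full MC runs per second."""
    return million_per_second * 1e6 / mc_cells


# Bunny on the 8800GTX (HP-VS): the table lists 538.6 (32.5) and 309.5 (151.1).
bunny_255 = runs_per_second(538.6, 255 ** 3)  # about 32.5 runs per second
bunny_127 = runs_per_second(309.5, 127 ** 3)  # about 151.1 runs per second
```

Small deviations from the tabulated frame rates are to be expected, since the throughput values in the table are rounded to one decimal.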


References:

• C. Dyken, G. Ziegler, C. Theobalt, H.-P. Seidel, "GPU Marching Cubes on Shader Model 3.0 and 4.0", MPI-I-2007-4-006, Max-Planck-Institut für Informatik, 2007.
• C. Dyken, J. Seland, and M. Reimers, "Real-Time GPU Silhouette Refinement using adaptively blended Bézier patches", to appear in Graphics Forum, 2007.
• I. Ihrke, G. Ziegler, A. Tevs, C. Theobalt, M. Magnor, H.-P. Seidel, "Eikonal Rendering: Efficient Light Transport in Refractive Objects", to appear in ACM Trans. on Graphics (Siggraph'07), 2007.
• G. Ziegler, A. Tevs, C. Theobalt, H.-P. Seidel, "GPU Point List Generation through Histogram Pyramids", MPI-I-2006-4-002, Max-Planck-Institut für Informatik, 2006.
