Interactive Geometric Computations using Graphics Processors Naga K. Govindaraju UNC Chapel Hill


2

Outline

• Overview

• Interactive Collision Detection

• Conclusions and Future Work

3

Geometric Computations

• Well studied
  • Computer graphics, computational geometry, etc.

• Widely used in games, simulations, virtual reality applications

4

Graphics Processing Units (GPUs)

• Well-designed for visibility computations
  • Rasterization – image-space visibility

• Massively parallel
  • Render millions of polygons per second

• Well suited for image-based algorithms

• High growth rate

5

Recent growth rate of Graphics Processing Units

Card               Million triangles/sec
Radeon 9700 Pro    325
GeForce FX 5800    350
Radeon 9800 XT     412
GeForce FX 5950    356
GeForce FX 6800    600

6

Graphics Processing Units (GPUs)

• Well-designed for visibility computations
  • Rasterization – image-space visibility

• Massively parallel
  • Render millions of polygons per second

• Well suited for image-based algorithms

• High growth rate

7

GPUs: Geometric Computations

• Used for geometric applications
  • Minkowski sums [Kim et al. 02]
  • CSG rendering [Goldfeather et al. 89, Rossignac et al. 90]
  • Voronoi computation [Hoff et al. 01, 02, Sud et al. 04]
  • Isosurface computation [Pascucci 04]
  • Map simplification [Mustafa et al. 01]

8

GPUs for Geometric Computations: Issues

• Precision

• Frame-buffer readbacks

9

[Figure: GPU pipeline for drawing a stream of triangles and computing the visibility of triangles. Labels: CPU; GPU; setup commands and state; stream of vertices; Vertex Processing Engines; stream of transformed vertices; Setup Engine; Pixel Processing Engines; alpha test; stencil test; depth test; stream of visible pixels; count of visible pixels.]

10

[Figure: the same GPU pipeline (CPU; GPU; setup commands and state; stream of vertices; Vertex Processing Engines; stream of transformed vertices; Setup Engine; Pixel Processing Engines; alpha test; stencil test; depth test; stream of visible pixels; count of visible pixels), with annotations: IEEE Floating Point (32-bit), shown twice; Limited Resolution!]

11

Frame-Buffer Precision

Pixel Processing Engines

Stream of visible pixels

Limited Resolution!

Resolution along X, Y, Z

X – 12 bits fixed precision

Y – 12 bits fixed precision

Z – 24 bits fixed precision

On CPU – 32-bit or 64-bit floating-point precision
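The effect of the limited resolution can be illustrated with a small sketch. It assumes depth values normalized to [0, 1] and a simple truncating quantizer; the function name `quantize` and the sample values are hypothetical and are not taken from any GPU specification.

```python
def quantize(value, bits):
    """Map a value in [0, 1] to a fixed-point level with the given bit count."""
    levels = (1 << bits) - 1
    return int(value * levels)

Z_BITS = 24   # depth resolution quoted on the slide
XY_BITS = 12  # resolution along X or Y quoted on the slide

a = 0.5
b = 0.5 + 1e-9   # differs from a by far less than one 24-bit step (about 6e-8)
c = 0.5 + 1e-6   # differs from a by several 24-bit steps

same_depth = quantize(a, Z_BITS) == quantize(b, Z_BITS)        # indistinguishable
different_depth = quantize(a, Z_BITS) != quantize(c, Z_BITS)   # distinguishable
xy_levels = 1 << XY_BITS   # number of distinct positions along X or Y
```

Two depths that a 64-bit float on the CPU keeps apart can land on the same 24-bit level, which is the kind of error the precision issue refers to.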

12

Frame-Buffer Readback

• Involve stalls
  • Affect throughput

• Slow!

13

Frame-Buffer Readback Performance

Data courtesy: www.techreport.com, June 2004

Readback of a 1K×1K frame-buffer takes 18 ms over PCI-Express

Graphics driver – 61.45

14

Frame-Buffer Readback Performance

Data courtesy: www.techreport.com, June 2004

Readback of a 1K×1K frame-buffer takes 18 ms over PCI-Express

Graphics driver – 61.45
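To put the 18 ms figure in perspective, here is a back-of-the-envelope calculation. It assumes 4 bytes per pixel (for example 8-bit RGBA), which the slide does not state; only the frame-buffer size and the readback time come from the slide.

```python
width = height = 1024          # 1K x 1K frame-buffer (from the slide)
bytes_per_pixel = 4            # assumption: 8-bit RGBA
readback_seconds = 0.018       # 18 ms (from the slide)

frame_bytes = width * height * bytes_per_pixel
effective_mb_per_s = frame_bytes / readback_seconds / 1e6
max_readbacks_per_s = 1 / readback_seconds

print(f"{frame_bytes} bytes per readback")
print(f"about {effective_mb_per_s:.0f} MB/s effective readback rate")
print(f"at most {max_readbacks_per_s:.1f} full readbacks per second")
```

Under that assumption a single full readback per frame already caps the frame rate at roughly 55 frames per second before any other work is done.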

15

[Figure: GPU growth rate, CPU growth rate, and AGP bandwidth growth rate. Courtesy: Anselmo Lastra]

16

Outline

• Overview

• Interactive Collision Detection

• Conclusions and Future Work

17

Features

Interactive collision detection between complex objects

• Large number of objects

• High primitive count

• Non-convex objects

• Open and closed objects

18

Non-rigid Motion

• Deformable objects

• Changing topology

• Self-collisions

19

Related Work

• Object-space techniques

• Image-space techniques

20

Object-Space Techniques

• Broad phase – Compute object pairs in close proximity
  • Spatial partitioning
  • Sweep-and-prune

• Narrow phase – Check each pair for exact collision detection
  • Convex objects
  • Spatial partitioning
  • Bounding volume hierarchies

Surveys in [Klosowski 1998, Redon et al. 2002, Lin and Manocha 2003]
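As an illustration of the broad phase, the following is a minimal one-axis sweep-and-prune sketch. It is a generic textbook version, not the implementation used in any of the cited systems; the function name and the sample intervals are hypothetical.

```python
def sweep_and_prune(intervals):
    """Return sorted index pairs whose 1-D extents [lo, hi] overlap.

    intervals: list of (lo, hi) extents of object bounding boxes on one axis.
    """
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][0])
    active = []   # objects whose extent may still overlap later ones
    pairs = []
    for i in order:
        lo, _ = intervals[i]
        # Drop objects that end before the current one begins.
        active = [j for j in active if intervals[j][1] >= lo]
        pairs.extend((min(i, j), max(i, j)) for j in active)
        active.append(i)
    return sorted(pairs)

boxes = [(0.0, 2.0), (1.0, 3.0), (4.0, 5.0), (2.5, 4.5)]
candidate_pairs = sweep_and_prune(boxes)
print(candidate_pairs)
```

Only the candidate pairs reported here would be passed on to a narrow-phase test.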

21

Limitations of Object-Space Techniques

• Considerable pre-processing

• Hard to achieve real-time performance on complex deformable models

22

Collision Detection using Graphics Hardware

• Primitive rasterization – sorting in screen-space
  • Interference tests

23

Image-Space Techniques

Use of graphics hardware

• CSG rendering [Goldfeather et al. 1989, Rossignac et al. 1990]

• Interferences and cross-sections [Shinya and Forgue 1991, Rossignac et al. 1992, Myszkowski 1995, Baciu et al. 1998]

• Minkowski sums [Kim et al. 2002]

• Cloth animation [Vassilev et al. 2001]

• Virtual Surgery [Lombardo et al. 1999]

• Proximity computation [Hoff et al. 2001, 2002]

24

Limitations of Image-Space Techniques

• Pairs of objects

• Stencil-based; limited to closed models

• Image precision

• Frame buffer readbacks

25

Collision Detection: Outline

• Overview

• Collision Detection: CULLIDE

• Inter- and Intra-Object Collision Detection: Quick-CULLIDE

• Reliable Collision Detection: FAR

• Analysis

26

Overview

• Potentially Colliding Set (PCS) computation

• Exact collision tests on the PCS

27

Algorithm

[Figure: Object-Level Pruning → Sub-object-Level Pruning → Exact Tests. Labels: GPU-based PCS computation; using CPU.]

28

Potentially Colliding Set (PCS)

PCS

29

Potentially Colliding Set (PCS)

PCS

30

Algorithm

[Figure: Object-Level Pruning → Sub-object-Level Pruning → Exact Tests]

31

Visibility Computations

Lemma 1: An object O does not collide with a set of objects S if O is fully visible with respect to S

32

Visibility of Objects

• An object is fully visible if it is completely in front of the remaining objects

[Figure: objects O, O1, and O2 seen from the view direction]

33

Visibility for Collisions: Geometric Interpretation

Full visibility is a sufficient but not a necessary condition for the existence of a separating surface with unit depth complexity

[Figure: objects O, O1, and O2 seen from the view direction]

34

PCS Pruning

Lemma 2: Given n objects $O_1, O_2, \dots, O_n$, an object $O_i$ does not belong to PCS if it does not collide with $O_1, \dots, O_{i-1}, O_{i+1}, \dots, O_n$

• Prune objects that do not collide

35

PCS Pruning

$O_1\ O_2\ \dots\ O_{i-1}\ O_i\ O_{i+1}\ \dots\ O_{n-1}\ O_n$

36

PCS Computation

• Each object tested against all objects but itself

• Naive algorithm is $O(n^2)$

• Linear time algorithm
  • Uses two-pass rendering approach
  • Conservative solution

37

PCS Computation: First Pass

$O_1\ O_2\ \dots\ O_{i-1}\ O_i\ O_{i+1}\ \dots\ O_{n-1}\ O_n$

Render

38

PCS Computation: First Pass

$O_1\ O_2\ \dots\ O_{i-1}\ O_i$

Render

Fully Visible?

Yes. Does not collide with $O_1, O_2, \dots, O_{i-1}$

39

PCS Computation: First Pass

$O_1\ O_2\ \dots\ O_{i-1}\ O_i\ O_{i+1}\ \dots\ O_{n-1}\ O_n$

Render

Fully Visible?

40

PCS Computation: Second Pass

$O_1\ O_2\ \dots\ O_{i-1}\ O_i\ O_{i+1}\ \dots\ O_{n-1}\ O_n$

Render

41

PCS Computation: Second Pass

$O_i\ O_{i+1}\ \dots\ O_{n-1}\ O_n$

Render

Fully Visible?

Yes. Does not collide with $O_{i+1}, \dots, O_{n-1}, O_n$

42

PCS Computation: Second Pass

$O_1\ O_2\ \dots\ O_{i-1}\ O_i\ O_{i+1}\ \dots\ O_{n-1}\ O_n$

Render

Fully Visible?

43

PCS Computation

$O_1\ O_2\ \dots\ O_{i-1}\ O_i\ O_{i+1}\ \dots\ O_{n-1}\ O_n$

Fully Visible

44

PCS Computation

$O_1\ O_2\ O_3\ \dots\ O_{i-1}\ O_i\ O_{i+1}\ \dots\ O_{n-2}\ O_{n-1}\ O_n$

$O_1\ O_3\ \dots\ O_{i-1}\ O_{i+1}\ \dots\ O_{n-1}$
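The two-pass pruning walked through on these slides can be sketched in software. This is a simplified model, not the GPU implementation: each object is assumed to be a dictionary from pixel to a (near, far) depth interval, the depth buffer is a plain dictionary, and all names are hypothetical.

```python
import math

def fully_visible(obj, depth_buffer):
    """True if every fragment of obj lies strictly in front of the buffer."""
    return all(far < depth_buffer.get(pixel, math.inf)
               for pixel, (_near, far) in obj.items())

def render(obj, depth_buffer):
    """Keep the nearest depth seen so far at each covered pixel."""
    for pixel, (near, _far) in obj.items():
        depth_buffer[pixel] = min(near, depth_buffer.get(pixel, math.inf))

def one_pass(objects, order):
    """Ids of objects fully visible w.r.t. the objects rendered before them."""
    depth_buffer = {}
    visible = set()
    for i in order:
        if fully_visible(objects[i], depth_buffer):
            visible.add(i)
        render(objects[i], depth_buffer)
    return visible

def potentially_colliding_set(objects):
    """Prune objects that are fully visible in both passes (2n queries)."""
    n = len(objects)
    first = one_pass(objects, range(n))             # O_i vs O_1 .. O_{i-1}
    second = one_pass(objects, reversed(range(n)))  # O_i vs O_{i+1} .. O_n
    return [i for i in range(n) if i not in (first & second)]

objects = [
    {(0, 0): (1.0, 2.0)},   # overlaps object 1 in depth at pixel (0, 0)
    {(0, 0): (1.5, 3.0)},
    {(5, 5): (1.0, 2.0)},   # isolated object
]
pcs = potentially_colliding_set(objects)
print(pcs)
```

Each object is tested once and rendered once per pass, so the number of visibility queries grows linearly with the number of objects. The result is conservative: an object that stays in the set may still turn out not to collide.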

45

Algorithm

[Figure: Object-Level Pruning → Sub-object-Level Pruning → Exact Tests]

46

CULLIDE Algorithm

[Figure: Object-Level Pruning → Sub-object-Level Pruning → Exact Tests]

Exact overlap tests using CPU

47

Full Visibility Queries on GPUs

• We require a query
  • Tests if a primitive is fully visible or not

• Current hardware supports occlusion queries
  • Test if only part of a primitive is visible or not

• Our solution
  • Change the sign of the depth function

48

Full Visibility Queries on GPUs

Depth function

GEQUAL                   LESS
All fragments pass       All fragments fail
All fragments fail       All fragments pass
Some pass, some fail     Some pass, some fail

Query not supported

Occlusion query

Examples – HP_occlusion_test, NV_occlusion_query
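The trick of flipping the depth function can be mimicked in a few lines. The sketch below is a software stand-in for a hardware occlusion query, which reports whether any fragment passes the depth test; the data layout and function names are assumptions made for illustration.

```python
import operator

DEPTH_FUNCS = {"LESS": operator.lt, "GEQUAL": operator.ge}

def occlusion_query(fragments, depth_buffer, depth_func):
    """Count fragments that pass the depth test against the buffer."""
    passes = DEPTH_FUNCS[depth_func]
    return sum(1 for pixel, z in fragments if passes(z, depth_buffer[pixel]))

def is_fully_visible(fragments, depth_buffer):
    """No fragment passes GEQUAL, so every fragment would pass LESS."""
    return occlusion_query(fragments, depth_buffer, "GEQUAL") == 0

depth_buffer = {0: 0.5, 1: 0.5, 2: 0.5}
in_front = [(0, 0.2), (1, 0.3)]   # every fragment is nearer than the buffer
partial = [(0, 0.2), (1, 0.7)]    # one fragment is hidden

front_result = is_fully_visible(in_front, depth_buffer)
partial_result = is_fully_visible(partial, depth_buffer)
partial_less_count = occlusion_query(partial, depth_buffer, "LESS")
```

With LESS, the query on `partial` reports a non-zero count, which only says that part of the primitive is visible. With GEQUAL, a zero count is exactly the "fully visible" answer.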

49

Bandwidth Analysis

• Read back only integer identifiers
  • Computation at high screen resolutions

50

Live Demo: CULLIDE

• Laptop
  • 1.6 GHz Pentium IV CPU
  • NVIDIA GeForce FX 700 GoGL
  • AGP 4X

51

Live Demo: CULLIDE

• Environment
  • Dragon – 250K polygons
  • Bunny – 35K polygons

• Average frame rate – 15 frames per second!

52

Interactive Collision Detection: Outline

• Overview

• Collision Detection: CULLIDE

• Inter- and Intra-Object Collision Detection: Quick-CULLIDE

• Reliable Collision Detection: FAR

• Analysis

53

Quick-CULLIDE

• Improved two-pass algorithm

• Utilize visibility relationships among objects across different views

54

Quick-CULLIDE: Visibility Sets

• Decompose PCS into four disjoint sets
  • FFV (First pass Fully Visible)
  • SFV (Second pass Fully Visible)
  • NFV (Not Fully Visible in either pass)
  • BFV (Both passes Fully Visible)

• Visibility sets have five interesting properties!
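A short sketch of the decomposition follows. It assumes that FFV and SFV contain objects fully visible in exactly one of the two passes, which is what makes the four sets disjoint; the function name and inputs are hypothetical.

```python
def visibility_sets(object_ids, first_pass_visible, second_pass_visible):
    """Split objects into the four disjoint visibility sets."""
    sets = {"FFV": [], "SFV": [], "NFV": [], "BFV": []}
    for i in object_ids:
        in_first = i in first_pass_visible
        in_second = i in second_pass_visible
        if in_first and in_second:
            sets["BFV"].append(i)   # fully visible in both passes
        elif in_first:
            sets["FFV"].append(i)   # fully visible in the first pass only
        elif in_second:
            sets["SFV"].append(i)   # fully visible in the second pass only
        else:
            sets["NFV"].append(i)   # not fully visible in either pass
    return sets

sets = visibility_sets(range(5), first_pass_visible={0, 2}, second_pass_visible={1, 2})
print(sets)
```

Objects in BFV are the ones the two-pass test can prune, since they are fully visible with respect to all other objects.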

55

Visibility Sets: Properties

Lemma 1: FFV and SFV are collision-free sets

56

PCS Computation: First Pass

$O_1\ O_2\ \dots\ O_{i-1}\ O_i\ \dots\ O_j\ \dots\ O_{n-1}\ O_n$

Render

57

PCS Computation: First Pass

$O_1\ O_2\ \dots\ O_{i-1}\ O_i\ \dots\ O_j\ \dots\ O_{n-1}\ O_n$

Render

Fully Visible

58

Visibility Sets: Properties

Lemma 2: It is sufficient to test visibility of objects in FFV in second pass only

59

PCS Computation: First Pass

$O_1\ O_2\ \dots\ O_{i-1}\ O_i\ O_{i+1}\ \dots\ O_{n-1}\ O_n$

60

PCS Computation: First Pass

$O_1\ O_2\ \dots\ O_{i-1}\ O_i$

Render

61

PCS Computation: First Pass

$O_1\ O_2\ \dots\ O_{i-1}\ O_i\ O_{i+1}\ \dots\ O_{n-1}\ O_n$

Not Colliding

Collision tested in Second pass

62

Visibility Sets: Properties

Lemma 3: It is sufficient to render objects in FFV in first pass only!

63

PCS Computation: First Pass

$O_1\ O_2\ \dots\ O_{i-1}\ O_i\ O_{i+1}\ \dots\ O_{n-1}\ O_n$

64

PCS Computation: First Pass

$O_1\ O_2\ \dots\ O_{i-1}\ O_i$

Render

65

PCS Computation

$O_1\ O_2\ \dots\ O_{i-1}\ O_i\ O_{i+1}\ \dots\ O_{n-1}\ O_n$

Not Colliding

Render

66

Visibility Sets: Properties

Lemma 4: It is sufficient to test the visibility of objects in SFV in first pass only!

67

Visibility Sets: Properties

Lemma 5: It is sufficient to render objects in SFV in second pass only!

68

Quick-CULLIDE: Advantages

• Better culling efficiency
  • Lower depth complexity than CULLIDE
  • Always better than CULLIDE

• Faster computational performance
  • Lower number of visibility queries and rendering operations

• Can handle self-collisions

69

Self-Collisions: Definition

• Pairs of overlapping triangles in an object that are not neighboring

70

Self-Collisions: Definition

• Pairs of overlapping triangles in an object that are not neighboring

71

Self-Collisions

• Occur in most deformable simulations

Image Courtesy: Baraff and Witkin, SIGGRAPH 2003

Artifacts

72

Our Solution

• Classification of contacts between triangles in an object
  • Touching contacts
  • Penetrating contacts

73

Contacts: Classification

(a) (b) (c)

Touching Contacts Penetrating Contact

74

Solution

• Ignore touching contacts
  • Consider only penetrating contacts

• Redefine fully visible
  • We pass a fragment when a touching contact occurs
  • Pass all fragments with depth ≤ corresponding depths in frame-buffer
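The redefinition amounts to relaxing a strict depth comparison to a non-strict one. Below is a minimal software sketch of that idea; the fragment layout and the `ignore_touching` flag are hypothetical names chosen for illustration.

```python
def fully_visible(fragments, depth_buffer, ignore_touching=True):
    """fragments: iterable of (pixel, depth); depth_buffer: dict pixel -> depth."""
    if ignore_touching:
        # Fragments at exactly the stored depth (touching contacts) still pass.
        return all(z <= depth_buffer[p] for p, z in fragments)
    return all(z < depth_buffer[p] for p, z in fragments)

depth_buffer = {0: 0.5, 1: 0.5}
touching = [(0, 0.5), (1, 0.4)]      # shares a depth with the buffer at pixel 0
penetrating = [(0, 0.6), (1, 0.4)]   # lies behind the buffer at pixel 0

touching_relaxed = fully_visible(touching, depth_buffer)
touching_strict = fully_visible(touching, depth_buffer, ignore_touching=False)
penetrating_relaxed = fully_visible(penetrating, depth_buffer)
```

With the relaxed test, neighboring triangles that merely share an edge or vertex no longer keep each other in the potentially colliding set, while penetrating contacts are still reported.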

75

Live Demo: Quick-CULLIDE

• Laptop
  • 1.6 GHz Pentium IV CPU
  • NVIDIA GeForce FX 700 GoGL
  • AGP 4X

76

Live Demo: Cloth Simulation

• Cloth – 20K triangles

• Average frame rate – 13 frames per second!

77

Interactive Collision Detection: Outline

• Overview

• Collision Detection: CULLIDE

• Self-Collision Detection: S-CULLIDE

• Reliable Collision Detection: FAR

• Analysis

78

Inaccuracies in GPU-Based Algorithms

• Image sampling

• Depth buffer precision

79

Image Sampling

• Occurs when a primitive is nearly parallel to view direction

80

Image Sampling

• Primitives are rasterized but no intersecting points are sampled by hardware

Viewport

C = pixel center

Intersecting point

81

Depth Buffer Precision

• Intersecting points are sampled but precision is not sufficient

Viewport

C = pixel center

Intersecting point

T1

82

Our Solution

• Sufficiently fatten the triangles

• Use Minkowski sums

Minkowski sum $A^B = A \oplus B = \{a + b : a \in A,\ b \in B\}$
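For finite point sets the definition can be evaluated directly. The sketch below works on 2-D integer points purely for illustration; the talk applies the operation to triangles and a sphere, which this toy example does not attempt to model.

```python
def minkowski_sum(a_points, b_points):
    """Return {a + b : a in A, b in B} for finite sets of 2-D points."""
    return {(ax + bx, ay + by) for ax, ay in a_points for bx, by in b_points}

A = {(0, 0), (1, 0)}   # endpoints of a horizontal segment
B = {(0, 0), (0, 1)}   # endpoints of a vertical segment
result = minkowski_sum(A, B)
print(sorted(result))
```

The two segments' endpoints sum to the four corners of a unit square, which shows how the operation "fattens" one shape by the extent of the other.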

83

Minkowski Sum: Example

[Figure: shapes P and L and their Minkowski sum]

84

Reliability

Lemma 1: Under orthographic transformation O, the rasterization of Minkowski sum $Q^S = Q \oplus S$, where Q is a point in 3-D space that projects inside a pixel X and S is a sphere bounding a pixel centered at the origin, generates two samples for X that bound the depth value of Q.

85

Reliability

Under orthographic transformation O, the rasterization of Minkowski sum $Q^S = Q \oplus S$, where Q is a point in 3-D space that projects inside a pixel X and S is a sphere centered at origin bounding a pixel, samples X with at least two fragments bounding the depth value of Q.

[Figure: axes x and z]

86

Reliability

Under orthographic transformation O, the rasterization of Minkowski sum $Q^S = Q \oplus S$, where Q is a point in 3-D space that projects inside a pixel X and S is a sphere centered at origin bounding a pixel, samples X with at least two fragments bounding the depth value of Q.

[Figure: axes x and z; point Q; Pixel X]

87

Reliability

Under orthographic transformation O, the rasterization of Minkowski sum $Q^S = Q \oplus S$, where Q is a point in 3-D space that projects inside a pixel X and S is a sphere centered at origin bounding a pixel, samples X with at least two fragments bounding the depth value of Q.

[Figure: axes x and z; point Q; Pixel X; sphere S]

88

Reliability

Under orthographic transformation O, the rasterization of Minkowski sum $Q^S = Q \oplus S$, where Q is a point in 3-D space that projects inside a pixel X and S is a sphere centered at origin bounding a pixel, samples X with at least two fragments bounding the depth value of Q.

[Figure: axes x and z; point Q; sphere S]

89

Reliability

Under orthographic transformation O, the rasterization of Minkowski sum $Q^S = Q \oplus S$, where Q is a point in 3-D space that projects inside a pixel X and S is a sphere centered at origin bounding a pixel, samples X with at least two fragments bounding the depth value of Q.

[Figure: axes x and z; point Q; sphere S]

90

Reliability

Under orthographic transformation O, the rasterization of Minkowski sum $Q^S = Q \oplus S$, where Q is a point in 3-D space that projects inside a pixel X and S is a sphere centered at origin bounding a pixel, samples X with at least two fragments bounding the depth value of Q.

[Figure: axes x and z; point Q; sphere S; sample depths]

91

Reliability

Under orthographic transformation O, the rasterization of Minkowski sum $Q^S = Q \oplus S$, where Q is a point in 3-D space that projects inside a pixel X and S is a sphere centered at origin bounding a pixel, samples X with at least two fragments bounding the depth value of Q.

[Figure: axes x and z; point Q; sphere S; sample depths]

92

Reliability

Lemma 2: Given a primitive P and its Minkowski sum $P^S = P \oplus S$. Let X be a pixel partly or fully covered by the orthographic projection of P.

$P_x = \{p \in P : p \text{ projects inside } X\}$,

Min-Depth(P, X) = minimum depth value in $P_x$,

Max-Depth(P, X) = maximum depth value in $P_x$.

The rasterization of $P_x^S$ generates at least two fragments whose depth values bound both Min-Depth(P, X) and Max-Depth(P, X) for each pixel X.

93

Reliability

Given a primitive P

[Figure: axes x and z; primitive P]

94

Reliability

$P_x$ is the portion of P projecting inside pixel X

[Figure: axes x and z; Pixel X; $P_x$]

95

Reliability

S is a sphere centered at origin bounding pixel X

[Figure: axes x and z; Pixel X; $P_x$; sphere S]

96

Reliability

If we compute Minkowski sum $P_x^S = P_x \oplus S$,

[Figure: axes x and z; Pixel X; $P_x$; $P_x^S$]

97

Reliability

then the rasterization of the Minkowski sum $P_x^S$ generates two fragments

[Figure: axes x and z; Pixel X; $P_x$; $P_x^S$; sample depths]

98

Reliability

and the fragments bound depth values in $P_x$

[Figure: axes x and z; Pixel X; $P_x$; $P_x^S$; sample depths]

99

Reliability

Theorem 1: Given the Minkowski sum of two primitives with S, $P_1^S$ and $P_2^S$. If $P_1$ and $P_2$ overlap, then a rasterization of their Minkowski sums under orthographic projection overlaps in the viewport.

100

Reliability

Given two primitives $P_1$ and $P_2$

[Figure: axes x and z; Pixel X; primitives P1 and P2]

101

Reliability

If $P_1$ and $P_2$ intersect in 3-D,

[Figure: axes x and d; Pixel X; primitives P1 and P2, which intersect in 3-D]

102

Reliability

and we compute their Minkowski sums with a pixel-sized sphere centered at origin

[Figure: axes x and d; Pixel X; primitives P1 and P2]

103

Reliability

the rasterizations of the Minkowski sums overlap in image-space

[Figure: axes x and z; Pixel X; primitives P1 and P2]

Pixel X

P2

104

Reliability

Corollary 1: Given the Minkowski sum of two primitives with S, $P_1^S$ and $P_2^S$. If a rasterization of $P_1^S$ and $P_2^S$ under orthographic projection does not overlap in the viewport, then $P_1$ and $P_2$ do not overlap in 3-D.

Useful in Collision Culling: apply fattened primitives $P_1^S$ in CULLIDE
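A one-dimensional analogue may help show why fattening removes the sampling error. In this sketch a "primitive" is an interval on the x axis, pixels have unit width with centers at k + 0.5, and the fattening radius of half a pixel is an assumption chosen for the 1-D case; it is not the radius used in FAR.

```python
import math

def sampled_pixels(lo, hi):
    """Indices k of unit pixels whose center k + 0.5 lies inside [lo, hi]."""
    first = math.ceil(lo - 0.5)
    last = math.floor(hi - 0.5)
    return set(range(first, last + 1))

def fatten(lo, hi, radius=0.5):
    """1-D Minkowski sum of the interval [lo, hi] with [-radius, radius]."""
    return lo - radius, hi + radius

p1 = (1.6, 1.9)   # overlaps p2, but covers no pixel center
p2 = (1.7, 1.8)

raw_overlap = sampled_pixels(*p1) & sampled_pixels(*p2)
fat_overlap = sampled_pixels(*fatten(*p1)) & sampled_pixels(*fatten(*p2))
print(raw_overlap, fat_overlap)
```

The unfattened primitives overlap in space yet share no sampled pixel, so an image-space test would miss them. After fattening, both cover pixel 1, and the overlap is no longer missed.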

105

[Figure: axes x and z; Pixel X; primitives P1 and P2 and their fattened versions $P_1^S$ and $P_2^S$]

106

Bounding Offsets of a Triangle

• Exact Offsets
  • Three edge-aligned cylinders, three spheres, two triangles
  • Can be rendered using fragment programs
  • Expensive!

• Oriented Bounding Box (OBB)

107

OBB Construction

[Figure: OBB construction, panels (a) to (d)]

108

Union of OBBs

109

Live Demo: FAR

• Laptop
  • 1.6 GHz Pentium IV CPU
  • NVIDIA GeForce FX 700 GoGL
  • AGP 4X

110

Live Demo: FAR

• Environment
  • Tree – 4000 triangles
  • Leaf – 200 triangles, 200 leaves
  • Scene – 44K triangles

• Average frame rate – 15 frames per second!

111

Interactive Collision Detection: Outline

• Overview

• Collision Detection: CULLIDE

• Self-Collision Detection: S-CULLIDE

• Reliable Collision Detection: FAR

• Analysis
  • Performance
  • Pruning efficiency
  • Precision

112

Analysis: Performance

• Based on pruning algorithm in CULLIDE

• Factors
  • Output size
  • Rasterization optimizations
  • Number of objects
  • Number of triangles per object
  • Image resolution

113

Analysis: Performance

[Figure: collision time (in ms) vs. number of objects. NV30 GPU, Pentium IV 2 GHz CPU]

114

Analysis: Performance

[Figure: collision time (in ms) vs. number of polygons. NV30 GPU, Pentium IV 2 GHz CPU]

115

Analysis: Performance

[Figure: collision time (in ms) vs. screen resolution. NV30 GPU, Pentium IV 2 GHz CPU]

116

Analysis: Pruning Efficiency

• Input complexity

• Relative object configurations

• Pruning efficiency in
  • Object-Level Culling
  • Subobject-Level Culling

117

Comparison: FAR and I-COLLIDE

118

Analysis: Accuracy

• CULLIDE and S-CULLIDE: Image resolution

• FAR: IEEE 32-bit floating-point precision

• Comparison:
  • FAR vs. CULLIDE

119

Accuracy: FAR vs. CULLIDE

120

Outline

• Overview

• Interactive Collision Detection

• Conclusions and Future Work

121

Conclusions

• Designed efficient geometric algorithms on GPUs
  • Interactive collision detection

• Applied them to complex 3-D environments

• Compared to prior state-of-the-art algorithms
  • Significant speedups in some cases

122

Advantages

• Generality

• Accuracy
  • Image-precision for shadow generation algorithms
  • IEEE 32-bit floating-point precision for collision computations

• Low Bandwidth
  • No readbacks

123

Advantages

• Significant Culling

• Practicality
  • Designed on commodity hardware
  • Assumes availability of occlusion queries

124

Limitations

• Precision
  • Shadow and self-collision algorithms are limited by image-precision
  • Accuracy can be improved

• Pair computation
  • Algorithms compute potential sets

125

Thank You

• Questions or Comments?

naga@cs.unc.edu

http://gamma.cs.unc.edu/CULLIDE

http://gamma.cs.unc.edu/RCULLIDE

http://gamma.cs.unc.edu/QCULLIDE
