31
Optimization of Mesh Optimization of Mesh Locality for Locality for Transparent Vertex Transparent Vertex Caching Caching Hugues Hoppe Hugues Hoppe Microsoft Research Microsoft Research SIGGRAPH 99 SIGGRAPH 99

Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

  • View
    214

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

Optimization of Mesh Locality forOptimization of Mesh Locality forTransparent Vertex CachingTransparent Vertex Caching

Optimization of Mesh Locality forOptimization of Mesh Locality forTransparent Vertex CachingTransparent Vertex Caching

Hugues HoppeHugues Hoppe

Microsoft ResearchMicrosoft Research

SIGGRAPH 99SIGGRAPH 99

Hugues HoppeHugues Hoppe

Microsoft ResearchMicrosoft Research

SIGGRAPH 99SIGGRAPH 99

Page 2: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

Triangle meshesTriangle meshesTriangle meshesTriangle meshes

Page 3: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

geometricgeometricprocessingprocessinggeometricgeometricprocessingprocessing

rasterizationrasterizationrasterizationrasterization

graphics processorgraphics processorgraphics processorgraphics processor

frame bufferframe bufferframe bufferframe buffer

busbusbusbus(e.g.(e.g. AGP) AGP)(e.g.(e.g. AGP) AGP)

texturetextureimageimagetexturetextureimageimage

............texturetextureimageimagetexturetextureimageimage

geometrygeometrygeometrygeometry

verticesverticesverticesvertices facesfacesfacesfaces

mesh in memorymesh in memorymesh in memorymesh in memory

System architectureSystem architectureSystem architectureSystem architecture

texturetexturecachecachetexturetexturecachecache

busbusbusbus

CPUCPUCPUCPU

L2 cacheL2 cacheL2 cacheL2 cache

L1 cacheL1 cacheL1 cacheL1 cache

bottleneckbottleneckbottleneckbottleneck

verticesverticesverticesvertices

????

Page 4: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

Previous workPrevious workPrevious workPrevious work

16-entry FIFO buffer16-entry FIFO buffer [Deering95, Chow97][Deering95, Chow97]

stack bufferstack buffer [BarYehuda-Gotsman96][BarYehuda-Gotsman96]

mesh compressionmesh compression [Taubin-Rossignac98][Taubin-Rossignac98][Gumhold-Strasser98][Gumhold-Strasser98]……

16-entry FIFO buffer16-entry FIFO buffer [Deering95, Chow97][Deering95, Chow97]

stack bufferstack buffer [BarYehuda-Gotsman96][BarYehuda-Gotsman96]

mesh compressionmesh compression [Taubin-Rossignac98][Taubin-Rossignac98][Gumhold-Strasser98][Gumhold-Strasser98]……

compressedcompressedgeometry streamgeometry stream

compressedcompressedgeometry streamgeometry stream

geometricgeometricprocessingprocessinggeometricgeometricprocessingprocessing

graphics processorgraphics processorgraphics processorgraphics processor

busbusbusbus

meshmeshbufferbuffermeshmeshbufferbuffer

parsingparsinglogiclogic

parsingparsinglogiclogicv1v1 cc v2-v1v2-v1 cc

v3-v2v3-v2 cc v4-v3v4-v3 ccv5-v4v5-v4 cc cc

cc

nθ nθ

Page 5: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

Previous workPrevious workPrevious workPrevious work

Drawbacks:Drawbacks: only static geometryonly static geometry new APInew API not backward compatiblenot backward compatible

Drawbacks:Drawbacks: only static geometryonly static geometry new APInew API not backward compatiblenot backward compatible

compressedcompressedgeometry streamgeometry stream

compressedcompressedgeometry streamgeometry stream

geometricgeometricprocessingprocessinggeometricgeometricprocessingprocessing

graphics processorgraphics processorgraphics processorgraphics processor

busbusbusbus

meshmeshbufferbuffermeshmeshbufferbuffer

parsingparsinglogiclogic

parsingparsinglogiclogicv1v1 cc v2-v1v2-v1 cc

v3-v2v3-v2 cc v4-v3v4-v3 ccv5-v4v5-v4 cc cc

cc

Page 6: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

Our approachOur approachOur approachOur approach

texturetextureimageimagetexturetextureimageimage

......texturetexturecachecachetexturetexturecachecachetexturetexture

imageimagetexturetextureimageimage

geometricgeometricprocessingprocessinggeometricgeometricprocessingprocessing

rasterizationrasterizationrasterizationrasterization

graphics processorgraphics processorgraphics processorgraphics processor

busbusbusbus

vertexvertexcachecachevertexvertexcachecache

traditionaltraditionalmesh APImesh APItraditionaltraditionalmesh APImesh API

vertexvertexarrayarrayvertexvertexarrayarray

indexedindexedstripsstrips

indexedindexedstripsstrips

Optimize ordering!Optimize ordering!Optimize ordering!Optimize ordering!

No explicit cache managementNo explicit cache managementNo explicit cache managementNo explicit cache management

Page 7: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

applicationapplicationapplicationapplication

Transparent vertex cachingTransparent vertex cachingTransparent vertex cachingTransparent vertex caching

Pros:Pros: animated geometryanimated geometry application program unchangedapplication program unchanged backward compatible on legacy hardwarebackward compatible on legacy hardware

Cons:Cons: less compression (but still a factor ~2)less compression (but still a factor ~2)

Pros:Pros: animated geometryanimated geometry application program unchangedapplication program unchanged backward compatible on legacy hardwarebackward compatible on legacy hardware

Cons:Cons: less compression (but still a factor ~2)less compression (but still a factor ~2)

traditional mesh APItraditional mesh APItraditional mesh APItraditional mesh API

vertexvertexarrayarrayvertexvertexarrayarray

indexedindexedstripsstrips

indexedindexedstripsstrips

graphicsgraphicssystemsystem

graphicsgraphicssystemsystem

Page 8: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

1111

2222

3333

44 5555

6666

7777

Indexed triangle stripsIndexed triangle stripsIndexed triangle stripsIndexed triangle strips

v1v1v1v1v2v2v2v2v3v3v3v3v4v4v4v4v5v5v5v5v6v6v6v6v7v7v7v7

1111 2222 33334444 3333 55556666

2222 7777 44445555

positionpositionx y zx y z

positionpositionx y zx y z

normalnormalnx ny nznx ny nznormalnormal

nx ny nznx ny nzcolorcolorrgbargbacolorcolorrgbargba

texture1 texture1 u vu v

texture1 texture1 u vu v

texture2 texture2 u vu v

texture2 texture2 u vu v

~ 32 bytes~ 32 bytes~ 32 bytes~ 32 bytes

~ 2 bytes~ 2 bytes~ 2 bytes~ 2 bytes

Page 9: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

Cache parametersCache parametersCache parametersCache parameters

size size ??size size ??

replacementreplacementpolicy policy ??

replacementreplacementpolicy policy ??

vertex cachevertex cachevertex cachevertex cache

16 entries16 entries16 entries16 entries

FIFOFIFOFIFOFIFO

Page 10: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

Vertex data accessVertex data accessVertex data accessVertex data access

= cache hit= cache hit= cache hit= cache hit = cache miss= cache miss= cache miss= cache miss

traditional stripstraditional stripstraditional stripstraditional strips with cachingwith cachingwith cachingwith caching

transfer ~transfer ~0.5 0.5 vertex/trivertex/tritransfer ~transfer ~0.5 0.5 vertex/trivertex/triassume in cacheassume in cacheassume in cacheassume in cache

transfer ~transfer ~1.0 1.0 vertex/trivertex/tritransfer ~transfer ~1.0 1.0 vertex/trivertex/tri

Page 11: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

# misses# misses# misses# misses 0000 1111 2222 3333

Vertex data accessVertex data accessVertex data accessVertex data access

traditional stripstraditional stripstraditional stripstraditional strips with cachingwith cachingwith cachingwith caching

transfer ~transfer ~0.5 0.5 vertex/trivertex/tritransfer ~transfer ~0.5 0.5 vertex/trivertex/tritransfer ~transfer ~1.0 1.0 vertex/trivertex/tritransfer ~transfer ~1.0 1.0 vertex/trivertex/tri

Page 12: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

ExampleExampleExampleExamplebefore optimizationbefore optimizationbefore optimizationbefore optimization

# misses# misses# misses# misses 0000 1111 2222 3333

after optimizationafter optimizationafter optimizationafter optimization

47% bandwidth reduction47% bandwidth reduction47% bandwidth reduction47% bandwidth reduction

Page 13: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

Optimization problemOptimization problemOptimization problemOptimization problem

Given mesh,Given mesh,find strips minimizing bus bandwidthfind strips minimizing bus bandwidth

( strips correspond to ordering of faces ( strips correspond to ordering of faces F F ))

Given mesh,Given mesh,find strips minimizing bus bandwidthfind strips minimizing bus bandwidth

( strips correspond to ordering of faces ( strips correspond to ordering of faces F F ))

232

FbFrFPF )ˆ(

min 232

FbFrFPF )ˆ(

min

cache miss ratecache miss ratecache miss ratecache miss rate # vertex indices# vertex indices# vertex indices# vertex indices

Page 14: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

Two reordering techniquesTwo reordering techniquesTwo reordering techniquesTwo reordering techniques

Greedy strip-growingGreedy strip-growing fast: 40,000 faces/secfast: 40,000 faces/sec

Local optimizationLocal optimization improve initial greedy solutionimprove initial greedy solution

very slowvery slow

Greedy strip-growingGreedy strip-growing fast: 40,000 faces/secfast: 40,000 faces/sec

Local optimizationLocal optimization improve initial greedy solutionimprove initial greedy solution

very slowvery slow

Page 15: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

Greedy strip-growingGreedy strip-growingGreedy strip-growingGreedy strip-growing

Inspired by [Chow97]Inspired by [Chow97] Inspired by [Chow97]Inspired by [Chow97]

To decide when to restart strip,To decide when to restart strip, perform lookahead cache simulations perform lookahead cache simulations

To decide when to restart strip,To decide when to restart strip, perform lookahead cache simulations perform lookahead cache simulations

11

2233

44

Page 16: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

When to restart strip?When to restart strip?When to restart strip?When to restart strip?

good strip lengthgood strip lengthgood strip lengthgood strip length

22 1133

33 2244

11

44 33

22 11

44

33 22 11

44 33 22

11

44 33

22 11

44

33 22 11

(cache size 4)(cache size 4)(cache size 4)(cache size 4)

Page 17: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

When to restart strip?When to restart strip?When to restart strip?When to restart strip?

good strip lengthgood strip lengthgood strip lengthgood strip length strip too longstrip too longstrip too longstrip too long

44

33 22 11

22 1133

33 2244

11

44 33

22 11

44

33 22 11

44 33 22

11

44 33

22

11

33 11

22

44

44 22

33 11

33 11

44 22

44 22

33 11

jump in miss rate!jump in miss rate! jump in miss rate!jump in miss rate!

(cache size 4)(cache size 4)(cache size 4)(cache size 4)

Page 18: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

Lookahead simulationsLookahead simulationsLookahead simulationsLookahead simulations

Perform Perform ss simulations simulations Perform Perform ss simulations simulations

44

33 22 11

(a)(a) restart immediately, after 0 faces restart immediately, after 0 faces(a)(a) restart immediately, after 0 faces restart immediately, after 0 faces(b)(b) restart after restart after 0 < i < s0 < i < s faces faces(b)(b) restart after restart after 0 < i < s0 < i < s faces faces

If If (a)(a) is best, restart strip is best, restart strip If If (a)(a) is best, restart strip is best, restart strip

Page 19: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

ResultResultResultResult

traditional long stripstraditional long stripstraditional long stripstraditional long strips

face orderface orderface orderface order

within stripwithin stripwithin stripwithin strip

strip restartstrip restartstrip restartstrip restart

Page 20: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

ResultResultResultResult

traditional long stripstraditional long stripstraditional long stripstraditional long strips greedy strip-growinggreedy strip-growinggreedy strip-growinggreedy strip-growing

Page 21: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

ResultResultResultResultbeforebeforebeforebefore afterafterafterafter

45.8 bytes/triangle45.8 bytes/triangle45.8 bytes/triangle45.8 bytes/triangle 25.5 bytes/triangle25.5 bytes/triangle25.5 bytes/triangle25.5 bytes/triangle

Page 22: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

Local optimizationLocal optimizationLocal optimizationLocal optimization

Apply perturbations to face ordering if cost is lowered:Apply perturbations to face ordering if cost is lowered:Apply perturbations to face ordering if cost is lowered:Apply perturbations to face ordering if cost is lowered:

Initial order FInitial order FInitial order FInitial order F

FF1..x-11..x-1FF1..x-11..x-1 FFy+1..my+1..mFFy+1..my+1..mF’=ReflectF’=Reflectx,yx,y(F)(F)F’=ReflectF’=Reflectx,yx,y(F)(F)

FF1..x-11..x-1FF1..x-11..x-1 FFy+1..my+1..mFFy+1..my+1..mF’=Insert1F’=Insert1x,yx,y(F)(F)F’=Insert1F’=Insert1x,yx,y(F)(F) FFyyFFyy

FF1..x-11..x-1FF1..x-11..x-1 FFy+1..my+1..mFFy+1..my+1..mF’=Insert2F’=Insert2x,yx,y(F)(F)F’=Insert2F’=Insert2x,yx,y(F)(F) FFy-1..yy-1..yFFy-1..yy-1..y

FFyyFFyy

xxxx yyyy

FFxxFFxx

FFx..y-1x..y-1FFx..y-1x..y-1

FFy..xy..xFFy..xy..xFFy..xy..xFFy..xy..x

FFyyFFyy

FFy-1..yy-1..yFFy-1..yy-1..y FFx..y-2x..y-2FFx..y-2x..y-2

FFx..y-1x..y-1FFx..y-1x..y-1

Page 23: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

ResultResultResultResult

greedy strip-growinggreedy strip-growinggreedy strip-growinggreedy strip-growing local optimizationlocal optimizationlocal optimizationlocal optimization

25.5 bytes/triangle25.5 bytes/triangle25.5 bytes/triangle25.5 bytes/triangle 24.2 bytes/triangle24.2 bytes/triangle24.2 bytes/triangle24.2 bytes/triangle

Page 24: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

Bandwidth ResultsBandwidth ResultsBandwidth ResultsBandwidth Results

0

10

20

30

40

50

Byt

es /

tri

ang

le

fandisk gameguy bunny2k bunny4k bunny buddha schooner

Original Greedy strips Local optimization

Improvement by factor of 1.6 – 1.9Improvement by factor of 1.6 – 1.9Improvement by factor of 1.6 – 1.9Improvement by factor of 1.6 – 1.9

Page 25: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

Choice of cache sizeChoice of cache sizeChoice of cache sizeChoice of cache size

0

0.5

1

1.5

2

2.5

3

0 10 20 30 40 50 60 70Cache size

Cac

he

mis

s r

ate

r

0

0.5

1

1.5

2

2.5

3

0 10 20 30 40 50 60 70Cache size

Cac

he

mis

s r

ate

r

size 16 sufficient for most gainsize 16 sufficient for most gainsize 16 sufficient for most gainsize 16 sufficient for most gain

Page 26: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

Cache replacement policyCache replacement policyCache replacement policyCache replacement policy

22 1133

33 2244

11

44 33

22 11

44

33 22 11

44 33 22

11

all is OKall is OK all is OKall is OK

FIFOFIFOFIFOFIFO LRULRULRULRU(cache size 4)(cache size 4)(cache size 4)(cache size 4)

Page 27: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

Cache replacement policyCache replacement policyCache replacement policyCache replacement policy

22 1133

11 4433

22

33 44

22 11

33 11

44 22

44 33

22 11

33 11 44

22

FIFOFIFOFIFOFIFO LRULRULRULRU

22 1133

33 2244

11

44 33

22 11

44

33 22 11

44 33 22

11

strips twice as longstrips twice as long strips twice as longstrips twice as long

(cache size 4)(cache size 4)(cache size 4)(cache size 4)

Page 28: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

ComparisonComparisonComparisonComparison

FIFOFIFOFIFOFIFO LRULRULRULRU

Page 29: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

FIFOFIFOFIFOFIFO LRULRULRULRU

ComparisonComparisonComparisonComparison

Page 30: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

SummarySummarySummarySummary

Vertex caching reduces geometryVertex caching reduces geometrybandwidth by factor of bandwidth by factor of 1.6 to 1.9 1.6 to 1.9

Transparent to application:Transparent to application: simply pre-process the models (fast)simply pre-process the models (fast)

Still efficient on legacy hardwareStill efficient on legacy hardware

Supports dynamic geometrySupports dynamic geometry

Vertex caching reduces geometryVertex caching reduces geometrybandwidth by factor of bandwidth by factor of 1.6 to 1.9 1.6 to 1.9

Transparent to application:Transparent to application: simply pre-process the models (fast)simply pre-process the models (fast)

Still efficient on legacy hardwareStill efficient on legacy hardware

Supports dynamic geometrySupports dynamic geometry

Page 31: Optimization of Mesh Locality for Transparent Vertex Caching Hugues Hoppe Microsoft Research SIGGRAPH 99 Hugues Hoppe Microsoft Research SIGGRAPH 99

Future workFuture workFuture workFuture work

Issue of cache sizeIssue of cache size Find face ordering good for all sizes?Find face ordering good for all sizes? Standardize on size 16?Standardize on size 16? Reprocess mesh at load timeReprocess mesh at load time

Interaction with texture cachingInteraction with texture caching

Cache efficiency during runtime LODCache efficiency during runtime LOD

Issue of cache sizeIssue of cache size Find face ordering good for all sizes?Find face ordering good for all sizes? Standardize on size 16?Standardize on size 16? Reprocess mesh at load timeReprocess mesh at load time

Interaction with texture cachingInteraction with texture caching

Cache efficiency during runtime LODCache efficiency during runtime LOD