Glift: Generic, Efficient Random-Access GPU Data Structures
Aaron Lefohn
University of California, Davis

Glift: Generic, Efficient Random-Access GPU Data Structures · 2015. 7. 29. · Aaron Lefohn University of California, Davis Problem Statement •Goal • Simplify creation and use


Page 1

Glift: Generic, Efficient Random-Access GPU Data Structures
Aaron Lefohn, University of California, Davis

Page 2

Problem Statement

• Goal
  • Simplify creation and use of random-access GPU data structures for graphics and GPGPU programming

• Contributions
  • Abstraction for GPU data structures
  • Glift template library
  • Iterator computation model for GPUs

Page 3

Collaborators

• Joe Kniss, University of Utah
• Robert Strzodka, Stanford University
• Shubhabrata Sengupta, University of California, Davis
• John Owens, University of California, Davis

Page 4

Many Interesting GPU Data Structures

• Photon map (Purcell)
• Sparse matrix (Bolz, Krueger)
• Sparse simulation grid (Lefohn)
• Polycube (3D grid, cubeMap, …) (Tarini)
• N-tree (Lefebvre)

• But…
  • No way to distribute/reuse implementations
  • Complexity stifles innovation

Motivation

Page 5

CPU Software Development

• Benefits
  • Algorithms and data structures expressed in problem domain
  • Decouple algorithms and data structures
  • Code reuse

[Diagram: Application; Algorithm Library and Data Structure Library; CPU Memory]

Motivation

Page 6

GPU Software Development

• Problems
  • Code is a tangled mess of algorithm and data structure access
  • Algorithms expressed in GPU memory domain
  • No code reuse

[Diagram: Application (data structure and algorithm); GPU Memory]

Motivation

Page 7

GPU Data Structures

• What's Missing?
  • Standalone abstraction for GPU data structures for graphics or GPGPU programming

[Diagram: C++, Cg, OpenGL; STL, ???; Sh, Scout, Brook]

Motivation

Page 8

Simple Example

• CPU (C++)

    float srcData[10][10][10];
    float dstData[10][10][10];

    … initialize data …

    for (size_t z = 1; z < 10; ++z) {
      for (size_t y = 1; y < 10; ++y) {
        for (size_t x = 1; x < 10; ++x) {
          dstData[z][y][x] = log( 1 + srcData[z][y][x] );
        }
      }
    }

Motivation

Page 9

We Want To Transform This…

• GPU (Cg)

    float3 getAddr3D( float2 winPos, float2 winSize, float3 sizeConst3D ) {
      float3 addr3D;
      float2 winPosInt = floor(winPos);
      float addr1D = winPosInt.y * winSize.x + winPosInt.x;

      addr3D.z = floor( addr1D / sizeConst3D.z );
      addr1D -= addr3D.z * sizeConst3D.z;
      addr3D.y = floor( addr1D / sizeConst3D.y );
      addr3D.x = addr1D - addr3D.y * sizeConst3D.y;

      return addr3D;
    }

    float3 logAlg( uniform samplerRECT data,
                   uniform float2 winSize,
                   uniform float3 sizeConst3D,
                   float2 winPos : WPOS ) : COLOR
    {
      float3 addr3D = getAddr3D( winPos, winSize, sizeConst3D );
      float value = texRECT( data, addr3D );
      return log( 1 + value );
    }

Motivation

Page 10

We Want To Transform This…

• GPU (Cg and C++)

The Cg shader (getAddr3D and logAlg) is the same as on page 9. The C++/OpenGL side adds the setup for the source and destination textures:

    GLuint srcDataId = 1;
    glBindTexture(GL_TEXTURE_RECTANGLE_ARB, srcDataId);
    glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_WRAP_S, GL_CLAMP);
    glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_WRAP_T, GL_CLAMP);
    glTexImage2D(GL_TEXTURE_RECTANGLE_ARB, 0, GL_LUMINANCE32F_ARB,
                 40, 40, 0, GL_LUMINANCE, GL_FLOAT, NULL);

    GLuint dstDataId = 2;
    glBindTexture(GL_TEXTURE_RECTANGLE_ARB, dstDataId);
    glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_WRAP_S, GL_CLAMP);
    glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_WRAP_T, GL_CLAMP);
    glTexImage2D(GL_TEXTURE_RECTANGLE_ARB, 0, GL_LUMINANCE32F_ARB,
                 40, 40, 0, GL_LUMINANCE, GL_FLOAT, NULL);

    … Initialize data …

Motivation

Page 11

Into This.

• GPU (C++ and Cg with Glift)

    typedef glift::ArrayGpu<vec3i,vec1f> ArrayType;
    ArrayType src( vec3i(10,10,10) );
    ArrayType dst( vec3i(10,10,10) );

    … initialize data …

    float logAlg( ElementIter srcData ) : COLOR
    {
      return log( 1 + srcData.value() );
    }

Motivation

Page 12

Overview

• Motivation and Previous Work
• Abstraction
• Implementation
• Examples
• Conclusions

Page 13

Abstraction Design Goals

• GPU data structure abstraction that
  • Enables easy creation of new structures
  • Is a minimal abstraction of the GPU memory model
  • Separates data structures and algorithms
  • Encourages efficiency

Abstraction

Page 14

Building the Abstraction

• Approach
  • Bottom-up, working towards STL-like syntax
  • Identify common patterns in GPU papers and code
  • Inspired by
    • STL, Boost, Brook, STAPL, Stepanov

Abstraction

Page 15

What is the GPU Memory Model?

• CPU interface
  • glTexImage              malloc
  • glDeleteTextures        free
  • glTexSubImage           memcpy CPU -> GPU
  • glGetTexSubImage*       memcpy GPU -> CPU
  • glCopyTexSubImage       memcpy GPU -> GPU
  • glBindTexture           read-only parameter bind
  • glFramebufferTexture    write-only parameter bind

* Does not exist. Emulate with glReadPixels

Abstraction

Page 16

What is the GPU Memory Model?

• GPU Interface (shown in Cg)
  • uniform samplerND          data structure param declaration
  • texND(tex, addr)           random-access read
  • varying floatN stream      stream parameter declaration
  • stream                     stream read

Abstraction

Page 17

GPU Data Structure Abstraction

• Factor GPU data structures into
  • Physical memory
  • Virtual memory
  • Address translator
  • Iterators

Abstraction
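As a rough CPU-side illustration of this factoring, the sketch below composes a physical 2D store and a 3D-to-2D address translator into a virtual 3D memory. All class names, signatures, and the row-major layout are assumptions made for illustration, not Glift's actual classes; iterators are left out to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU-side stand-ins; these are not Glift's real classes.
struct Addr2 { std::size_t x, y; };
struct Addr3 { std::size_t x, y, z; };

// Physical memory: a flat 2D array standing in for a native 2D texture.
class PhysMem2D {
public:
    PhysMem2D(std::size_t w, std::size_t h) : w_(w), h_(h), data_(w * h, 0.0f) {}
    float& at(Addr2 pa) {
        assert(pa.x < w_ && pa.y < h_);
        return data_[pa.y * w_ + pa.x];
    }
private:
    std::size_t w_, h_;
    std::vector<float> data_;
};

// Address translator: maps a virtual 3D address to a physical 2D address
// by linearizing the 3D address and wrapping it into rows (assumed layout).
class AddrTrans3Dto2D {
public:
    AddrTrans3Dto2D(Addr3 virtSize, std::size_t physWidth)
        : size_(virtSize), physWidth_(physWidth) {}
    Addr2 translate(Addr3 va) const {
        assert(va.x < size_.x && va.y < size_.y && va.z < size_.z);
        std::size_t linear = (va.z * size_.y + va.y) * size_.x + va.x;
        return Addr2{linear % physWidth_, linear / physWidth_};
    }
private:
    Addr3 size_;
    std::size_t physWidth_;
};

// Virtual memory: physical memory plus a translator, addressed in 3D.
template <class Phys, class Trans>
class VirtMem3D {
public:
    VirtMem3D(Phys phys, Trans trans) : phys_(phys), trans_(trans) {}
    float& operator()(std::size_t x, std::size_t y, std::size_t z) {
        return phys_.at(trans_.translate(Addr3{x, y, z}));
    }
private:
    Phys phys_;
    Trans trans_;
};
```

Because the translator is a template parameter, swapping in a different layout changes only that one type, while code that reads and writes through `VirtMem3D` stays the same.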

Page 18

Physical Memory

• Native GPU textures
  • 1D, 2D, 3D, Cube, Mip
  • Choose based on algorithm efficiency requirements
    • Dimensionality
    • Read-only vs. read-write
    • Point-sample vs. filtering
    • Maximum size

Abstraction

Page 19

Virtual Memory

• Virtual N-D address space
  • Choose based on problem space of algorithm
  • Defined by physical memory and address translator

[Diagram: a virtual 3D grid is translated to 3D native memory, to 2D slices, or to a flat 3D texture]

Abstraction

Page 20

Address Translator

• Mapping between physical and virtual addresses
• Core of data structure
• Small amount of code defines all required CPU and GPU memory interfaces

[Diagram: Physical Address, Virtual Address]

Abstraction

Page 21

Address Translator

• Core of data structure
  • Extension point for creating new structures
  • Must define
      translate(…)
      translate_range(…)

Implementation
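To make the two required methods concrete, here is a minimal CPU sketch of a 1D-to-2D translator. Only the method names `translate` and `translate_range` come from the slide; the signatures, the `PhysRun` type, and the reading of `translate_range` as "split a contiguous virtual range into contiguous physical runs" are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct PhysAddr { std::size_t x, y; };

// One contiguous run of texels within a single physical row (hypothetical type).
struct PhysRun { PhysAddr start; std::size_t length; };

// Hypothetical 1D-to-2D translator; signatures are assumed for illustration.
class AddrTrans1Dto2D {
public:
    explicit AddrTrans1Dto2D(std::size_t physWidth) : width_(physWidth) {
        assert(physWidth > 0);
    }

    // Map one virtual address to one physical address.
    PhysAddr translate(std::size_t va) const {
        return PhysAddr{va % width_, va / width_};
    }

    // Map the virtual range [first, first + count) to physical runs,
    // splitting wherever the range crosses the end of a physical row.
    std::vector<PhysRun> translate_range(std::size_t first, std::size_t count) const {
        std::vector<PhysRun> runs;
        while (count > 0) {
            PhysAddr pa = translate(first);
            std::size_t len = std::min(count, width_ - pa.x);
            runs.push_back(PhysRun{pa, len});
            first += len;
            count -= len;
        }
        return runs;
    }

private:
    std::size_t width_;
};
```

With a physical width of 4, the virtual range starting at 2 with 7 elements comes back as three runs: the tail of row 0, all of row 1, and the first texel of row 2.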

Page 22

Address Translator Classifications

• Representation
  • Analytic / Discrete
• Memory Complexity
  • O(1), O(log N), O(N), …
• Compute Complexity
  • O(1), O(log N), O(N), …
• Compute Consistency
  • Uniform vs. non-uniform
• Total / Partial
  • Complete vs. sparse
• One-to-one / Many-to-one
  • Uniform vs. adaptive

Abstraction

Page 23

Data Structure Examples

• Brook streams (Buck et al. 2004)

[Diagram: 1D Virtual, 2D Physical]

Abstraction

Page 24

Data Structure Examples

• Brook streams (Buck et al. 2004)
  • Physical address: 2D
  • Virtual address: N-D
  • Address translator: ND-to-2D
    • Analytic
    • O(1) memory
    • O(1) compute
    • Uniform consistency
    • Total, uniform mapping

Abstraction

Page 25

Data Structure Examples

• Dynamic sparse 3D grid (Lefohn et al. 2003)

[Diagram: Virtual Domain, Page Table, Physical Memory]

Application

Page 26

Data Structure Examples

• Dynamic sparse 3D grid (Lefohn et al. 2003)
  • Physical address: 2D
  • Virtual address: 3D
  • Address translator: 3D page table
    • Discrete
    • O(N) memory
    • O(1) compute
    • Uniform consistency
    • Partial, uniform mapping

Abstraction
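A page-table translator with these properties might look roughly like the CPU sketch below. The page shape (a square 2D tile of one z-slice), the class name, and the method signatures are all assumptions for illustration; the slide only states the classification.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec2 { std::size_t x, y; };
struct Vec3 { std::size_t x, y, z; };

// Hypothetical page-table translator: each virtual page is a pageSize x
// pageSize tile of one z-slice and maps to one 2D physical page (assumed).
class PageTableTrans {
public:
    PageTableTrans(Vec3 virtSize, std::size_t pageSize)
        : page_(pageSize),
          pagesX_((virtSize.x + pageSize - 1) / pageSize),
          pagesY_((virtSize.y + pageSize - 1) / pageSize),
          table_(pagesX_ * pagesY_ * virtSize.z) {}

    // Map the virtual page containing va to the physical page at physOrigin.
    void map_page(Vec3 va, Vec2 physOrigin) {
        table_[index(va)] = Entry{true, physOrigin};
    }

    // O(1) lookup. Returns false for unmapped pages (partial mapping);
    // otherwise stores the physical address in pa.
    bool translate(Vec3 va, Vec2& pa) const {
        const Entry& e = table_[index(va)];
        if (!e.mapped) return false;
        pa = Vec2{e.origin.x + va.x % page_, e.origin.y + va.y % page_};
        return true;
    }

private:
    struct Entry { bool mapped; Vec2 origin; };

    // Index of the page-table entry for the page containing va.
    std::size_t index(Vec3 va) const {
        std::size_t i = (va.z * pagesY_ + va.y / page_) * pagesX_ + va.x / page_;
        assert(i < table_.size());
        return i;
    }

    std::size_t page_, pagesX_, pagesY_;
    std::vector<Entry> table_;  // O(N) memory: one entry per virtual page
};
```

The table is the "discrete" part of the classification: the mapping is stored rather than computed, so pages can be mapped or left out at run time without changing the lookup cost.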

Page 27

Data Structure Examples

• Photon Map (kNN-grid) (Purcell et al. 2003)

Image from “Implementing Efficient Parallel Data Structures on GPUs,” Lefohn et al., GPU Gems II, ch. 33, 2005

Abstraction

Page 28

Data Structure Examples

• Photon Map (kNN-grid) (Purcell et al. 2003)
  • Physical address: 2D
  • Virtual address: 3D
  • Address translator: 3D page table
    - Variable sized phys pages
    - “Grid of lists”
    • Discrete
    • O(N) memory
    • O(L) compute
    • Non-uniform consistency
    • Partial, adaptive mapping

Abstraction
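The "grid of lists" idea can be sketched on the CPU as a 3D grid whose cells each point at a variable-length run inside one packed array. The class below is a hypothetical illustration of that layout (names, packing scheme, and the `sum_cell` helper are assumptions), not the photon-map implementation itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Where one cell's list lives in the packed storage (hypothetical type).
struct CellEntry { std::size_t offset; std::size_t count; };

// Hypothetical grid of lists: a 3D grid of (offset, count) entries over
// one packed array that stands in for variable-sized physical pages.
class GridOfLists {
public:
    GridOfLists(std::size_t nx, std::size_t ny, std::size_t nz)
        : nx_(nx), ny_(ny), nz_(nz), cells_(nx * ny * nz) {}

    // Store a cell's list by appending it to the packed array.
    // In this sketch each cell is expected to be set at most once.
    void set_cell(std::size_t x, std::size_t y, std::size_t z,
                  const std::vector<float>& items) {
        CellEntry& c = cells_[index(x, y, z)];
        c.offset = packed_.size();
        c.count = items.size();
        packed_.insert(packed_.end(), items.begin(), items.end());
    }

    std::size_t cell_size(std::size_t x, std::size_t y, std::size_t z) const {
        return cells_[index(x, y, z)].count;
    }

    // Visit every element of a cell: O(L) for a list of length L, so the
    // cost differs from cell to cell (non-uniform consistency).
    float sum_cell(std::size_t x, std::size_t y, std::size_t z) const {
        const CellEntry& c = cells_[index(x, y, z)];
        float sum = 0.0f;
        for (std::size_t i = 0; i < c.count; ++i) sum += packed_[c.offset + i];
        return sum;
    }

private:
    std::size_t index(std::size_t x, std::size_t y, std::size_t z) const {
        assert(x < nx_ && y < ny_ && z < nz_);
        return (z * ny_ + y) * nx_ + x;
    }

    std::size_t nx_, ny_, nz_;
    std::vector<CellEntry> cells_;  // empty cells keep count == 0
    std::vector<float> packed_;
};
```

Cells that were never set keep a count of zero, which is the "partial" part of the mapping; cells with long lists cost more to read than cells with short ones.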

Page 29

Glift Iterators

• We've so far only discussed data access
• What about data structure traversal?

Page 30

Iterators

• Separate algorithms and data structures
  • Minimal interface between data and algorithm
  • Required for GPGPU use of data structure
  • Encapsulate GPGPU optimizations

Abstraction

Page 31

Iterators

• Abstract data access and traversal

DataStructureType::iterator it;

for (it = data.begin(); it != data.end(); ++it)

{

*it = -(*it);

}

Abstraction
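The loop above can be written once as a generic algorithm and reused across containers. The sketch below uses only the C++ standard library; `negate_range` is a name chosen here for illustration.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Generic algorithm written only against iterators, so it runs unchanged
// on any container that provides writable forward iteration.
template <class Iterator>
void negate_range(Iterator first, Iterator last) {
    for (Iterator it = first; it != last; ++it) {
        *it = -(*it);
    }
}
```

Calling `negate_range(v.begin(), v.end())` on a `std::vector<float>` and on a `std::list<int>` uses the same code, even though the two containers store and traverse their elements differently.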

Page 32

Glift Iterators

• Address iterators
  • Iterator value is N-D address
  • GPU interpolants

• Element iterators
  • Iterator value is data structure element
  • C/C++ pointer, STL iterator, streams

Abstraction
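The difference between the two iterator kinds can be sketched on the CPU as follows. `AddrIter3D` borrows its name from the Cg example later in the deck, but both classes, their members, and the row-major storage are assumptions for illustration rather than Glift's interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Addr3 { std::size_t x, y, z; };

// Hypothetical address iterator: its value is an N-D address, not data.
class AddrIter3D {
public:
    AddrIter3D(Addr3 size, std::size_t linear) : size_(size), linear_(linear) {}
    Addr3 value() const {
        return Addr3{linear_ % size_.x,
                     (linear_ / size_.x) % size_.y,
                     linear_ / (size_.x * size_.y)};
    }
    Addr3 size() const { return size_; }
    AddrIter3D& operator++() { ++linear_; return *this; }
    bool operator!=(const AddrIter3D& other) const { return linear_ != other.linear_; }
private:
    Addr3 size_;
    std::size_t linear_;
};

// Hypothetical element iterator: its value is the element stored at the
// current address, much like dereferencing a C/C++ pointer.
class ElemIter3D {
public:
    ElemIter3D(std::vector<float>& data, AddrIter3D addr) : data_(&data), addr_(addr) {}
    float& value() const {
        Addr3 a = addr_.value();
        Addr3 s = addr_.size();
        std::size_t i = (a.z * s.y + a.y) * s.x + a.x;  // row-major storage assumed
        assert(i < data_->size());
        return (*data_)[i];
    }
    ElemIter3D& operator++() { ++addr_; return *this; }
private:
    std::vector<float>* data_;
    AddrIter3D addr_;
};
```

An algorithm that needs to know where it is (for example, to read a neighbor) takes the address iterator; one that only transforms values in place can take the element iterator and never see an address.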

Page 33

Element Iterator Concepts

• Permission
  • Read-only, write-only, read-write
• Access region
  • Single, neighborhood, random
• Traversal
  • Forward, backward, parallel range

Abstraction

Page 34

Which Element Iterators?

• Read-only, single access, range iterator
  • GPU stream input
• Read-only, random-access, range iterator
  • GPU texture input
• Write-only, single access, range iterator
  • GPU render target

Abstraction

Page 35

Example 1: “Before” and “After” Glift

• Transform GPU code with Glift

Page 36

Simple Example

• 3D Array with 2D physical memory

CPU (C++)

    float srcData[10][10][10];
    float dstData[10][10][10];

    … initialize data …

    for (size_t z = 1; z < 10; ++z) {
      for (size_t y = 1; y < 10; ++y) {
        for (size_t x = 1; x < 10; ++x) {
          dstData[z][y][x] = srcData[z-1][y-1][x-1];
        }
      }
    }

Abstraction

Page 37

Example 1: Shader w/out Glift

    // Physical-to-virtual address translation
    float3 physToVirt( float2 pa, float2 physSize, float3 virtSizes ) {
      float3 va;
      float addr1D = pa.y * physSize.x + pa.x;
      va.z = floor( addr1D / virtSizes.z );
      addr1D -= va.z * virtSizes.z;
      va.y = floor( addr1D / virtSizes.y );
      va.x = addr1D - va.y * virtSizes.y;
      return va;
    }

    // Virtual-to-physical address translation
    float2 virtToPhys( float3 va, float2 physSize, float3 virtSizes ) {
      float addr1D = dot( va, virtSizes );
      float normAddr1D = addr1D / physSize.x;
      float2 pa = float2( frac(normAddr1D) * physSize.x, normAddr1D );
      return pa;
    }

    float3 main( uniform samplerRECT physMem,
                 uniform float2 physSize,
                 uniform float3 virtSizes,
                 float2 pa : WPOS ) : COLOR
    {
      float3 va = physToVirt( floor(pa), physSize, virtSizes );
      float3 neighborAddr = va - float3(1, 1, 1);
      // Physical memory read
      return texRECT( physMem, virtToPhys(neighborAddr, physSize, virtSizes) );
    }

Abstraction
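A CPU sketch of the two translations, using integer arithmetic, makes it easy to check that they invert each other. It assumes that `virtSizes` holds strides (1, sizeX, sizeX * sizeY), which is how the shader's `dot(va, virtSizes)` reads; the function and type names here are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct Phys2 { std::size_t x, y; };
struct Virt3 { std::size_t x, y, z; };

// strides = (1, sizeX, sizeX * sizeY): the role virtSizes appears to play
// in the shader (an assumption based on dot(va, virtSizes)).
Virt3 phys_to_virt(Phys2 pa, std::size_t physWidth, Virt3 strides) {
    assert(strides.y > 0 && strides.z > 0);
    std::size_t addr1D = pa.y * physWidth + pa.x;  // flatten the 2D address
    Virt3 va;
    va.z = addr1D / strides.z;
    addr1D -= va.z * strides.z;
    va.y = addr1D / strides.y;
    va.x = addr1D - va.y * strides.y;
    return va;
}

Phys2 virt_to_phys(Virt3 va, std::size_t physWidth, Virt3 strides) {
    assert(physWidth > 0);
    // Same role as dot(va, virtSizes) in the shader.
    std::size_t addr1D = va.x * strides.x + va.y * strides.y + va.z * strides.z;
    return Phys2{addr1D % physWidth, addr1D / physWidth};
}
```

For a 10x10x10 array stored in a 40-texel-wide texture, the strides are (1, 10, 100), and physical address (23, 13) corresponds to virtual address (3, 4, 5).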

Page 38

Example 1: Glift Components

The same shader as on page 37, with each part labeled by the Glift component that takes it over:

• Physical-to-virtual address translation: Address Iterator
• Virtual-to-physical address translation: VirtMem
• Physical memory read: VirtMem

Abstraction

Page 39

Example 1: GPU Shader with Glift

Cg Usage

    float3 main( uniform VMem3D srcData,
                 AddrIter3D iter ) : COLOR
    {
      float3 va = iter.value();
      return srcData.vTex3D( va - float3(1,1,1) );
    }

Abstraction

Page 40

Example 1: Glift Data Structures

C++ Usage

    vec3i origin(0,0,0);
    vec3i size(10,10,10);

    typedef ArrayGpu<vec3i,vec1f> ArrayType;
    ArrayType srcData( size );
    ArrayType dstData( size );

    … initialize dataPtr …
    srcData.write( origin, size, dataPtr );

    typedef ArrayType::addr_trans AddrTransType;
    AddrTransType::gpu_range it =
        dstData.addr_trans().gpu_range(origin, size);

    it.bind_for_read( iterCgParam );
    srcData.bind_for_read( srcCgParam );
    dstData.bind_for_write( COLOR0, myFrameBufferObject );

    exec_gpu_iterators( it );

Abstraction
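For readers without a GPU at hand, the data flow of this example can be imitated on the CPU: visit every address in a range, run a kernel on it, and write the result to the destination. `Array3D` and `exec_range` below are hypothetical stand-ins for `ArrayGpu` and `exec_gpu_iterators`, and clamping out-of-range reads is an assumption made to keep the sketch self-contained.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Addr3 { int x, y, z; };

// Hypothetical CPU stand-in for a 3D array; this is not Glift's ArrayGpu.
class Array3D {
public:
    explicit Array3D(Addr3 size)
        : size_(size),
          data_(static_cast<std::size_t>(size.x) * size.y * size.z, 0.0f) {}

    // Reads clamp out-of-range addresses to the nearest valid element.
    float read(Addr3 a) const {
        a.x = std::min(std::max(a.x, 0), size_.x - 1);
        a.y = std::min(std::max(a.y, 0), size_.y - 1);
        a.z = std::min(std::max(a.z, 0), size_.z - 1);
        return data_[index(a)];
    }

    void write(Addr3 a, float value) { data_[index(a)] = value; }

private:
    std::size_t index(Addr3 a) const {
        assert(a.x >= 0 && a.x < size_.x && a.y >= 0 && a.y < size_.y &&
               a.z >= 0 && a.z < size_.z);
        return (static_cast<std::size_t>(a.z) * size_.y + a.y) * size_.x + a.x;
    }

    Addr3 size_;
    std::vector<float> data_;
};

// Run `kernel` once per address in [origin, origin + size) and store the
// result at that address in dst, imitating one GPU pass over the range.
template <class Kernel>
void exec_range(Array3D& dst, Addr3 origin, Addr3 size, Kernel kernel) {
    for (int z = origin.z; z < origin.z + size.z; ++z)
        for (int y = origin.y; y < origin.y + size.y; ++y)
            for (int x = origin.x; x < origin.x + size.x; ++x) {
                Addr3 va{x, y, z};
                dst.write(va, kernel(va));
            }
}
```

The kernel for Example 1 would be a lambda that returns `src.read(Addr3{va.x - 1, va.y - 1, va.z - 1})`, mirroring the one-line Cg shader.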

Page 41

Overview

• Motivation
• Abstraction
• Implementation
• Examples
• Conclusions

Page 42: Glift: Generic, Efficient Random-Access GPU Data Structures · 2015. 7. 29. · Aaron Lefohn University of California, Davis Problem Statement •Goal • Simplify creation and use

Aaron LefohnUniversity of California, Davis

Glift Components

[Diagram of the Glift component stack; labels: Application, Container Adaptors, VirtMem, PhysMem, AddrTrans, C++ / Cg / OpenGL]


Glift Design Goals

• Efficiency
• Easy, incremental adoption
• Easily extensible
• CPU/GPU interoperability


Glift Design Goals

• Efficiency
  • Static polymorphism (C++ and Cg)
  • Cg program specialization
  • Cg compiler optimizations
• Easy, incremental adoption
• Easily extensible
• CPU/GPU interoperability


Glift Design Goals

• Efficiency
• Easy, incremental adoption
  • Integrate with Cg/OpenGL/C++
  • STL-like and texture-like interfaces
  • Use components alone or composited
• Easily extensible
• CPU/GPU interoperability


Glift Design Goals

• Efficiency
• Easy, incremental adoption
• Easily extensible
  • Create a new structure by:
    • Changing the behavior of an existing address translator
    • Writing a new address translator
    • Writing a new container adaptor
• CPU/GPU interoperability


Glift Design Goals

• Efficiency
• Easy, incremental adoption
• Easily extensible
• CPU/GPU interoperability
  • Unified C++/Cg code base
  • Map memory to CPU or GPU
  • CPU and GPU iterators


C++/Cg Integration

• Each component defines C++ and Cg code
  • C++ objects have Cg struct representation
  • Stringified Cg parameterized by C++ templates
• Cg "template" instantiation
  • Insert generated Glift source code into shader
      glift::cgGetTemplateType<MyDataStructType>();
      glift::cgInstantiateParameter(…);
• All other compilation/loading/binding identical to standard shader
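As a rough illustration of "stringified Cg parameterized by C++ templates", the sketch below shows one way a C++ template could emit Cg struct source as text. The names `CgAddr` and `makeCgStructSource`, and the layout of the generated struct, are hypothetical and chosen for illustration; this is not Glift's actual generator.

```cpp
#include <cassert>
#include <string>

// Hypothetical trait: maps an address dimension to a Cg vector type name.
template <int Dim> struct CgAddr;
template <> struct CgAddr<1> { static std::string name() { return "float";  } };
template <> struct CgAddr<2> { static std::string name() { return "float2"; } };
template <> struct CgAddr<3> { static std::string name() { return "float3"; } };

// Hypothetical generator: builds Cg source for a struct whose read method
// takes an address of the requested dimension. The method body is a
// placeholder; a real generator would emit the address translation code.
template <int Dim>
std::string makeCgStructSource(const std::string& structName) {
    assert(!structName.empty());
    const std::string addr = CgAddr<Dim>::name();
    return "struct " + structName + " {\n"
           "    float4 read(" + addr + " va) { return float4(0,0,0,0); }\n"
           "};\n";
}
```

Calling `makeCgStructSource<3>("VMem3D")` yields source text for a struct named `VMem3D` with a `float3` address parameter, which could then be inserted ahead of the shader body before compilation.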


Cg Compilation Example

• Cg code

float4 main( uniform VMem3D octree,
             float3 coord ) : COLOR
{
    return octree.vMem3D(coord);
}

• C++ code

typedef OctreeGPU<vec4ub> octree_type;
GliftType type = cgGetTemplateType<octree_type>();
CGprogram prog = cgCreateProgram(…);
prog = cgInstantiateParameter(prog, "octree", type);
cgCompileProgram(prog);


Overview

• Motivation and previous work
• Abstraction
• Case Study
  • Adaptive shadow maps and octree 3D paint
• Conclusions


Example 2: Adaptive Shadow Maps

• Show Glift usage with
  • Complex application
  • Complex data structure


Example 2: Adaptive Shadow Maps

• Fernando et al., ACM SIGGRAPH 2001
• Elegant solution to shadow map aliasing
  • Quadtree of small shadow maps
  • Shadow maps need resolution only on shadow boundary
  • Required resolution determined by projected area of screen space pixel into light space


Adaptive Shadow Maps

• Why Adaptive Shadow Maps with Glift?
  • Many recent (2004) shadow papers cite ASMs as high quality solution but not possible on graphics hardware
  • Algorithm is simple. Data structure is hard.


Adaptive Shadow Map Algorithm

• Iterative refinement algorithm
  • Identify shadow pixels w/ resolution mismatch
  • Create small shadow map "pages" at requested resolution
• Shadow lookup
  • Compute shadow map coordinate and resolution
  • Lookup in ASM (tree of small shadow map pages)
• ASM depends on both camera and light position!


ASM Data Structure Requirements

• Adaptive
• Multiresolution
• Fast, parallel random-access read
  • 2x2 native Percentage Closer Filtering (PCF)
  • Trilinear interpolated mipmapped PCF
• Fast, parallel write
• Fast, parallel insert and erase


ASM Data Structure

• Start with page table address translator
  • Coarse, uniform discretization of virtual domain
  • O(N) memory
  • O(1) computation
  • O(1) insert
  • O(1) erase
  • Uniform consistency
  • Partial mapping (sparse)


ASM Data Structure

• Page table example

[Diagram labels: Virtual Domain, Page Table, Physical Memory]

vpn = va / pageSize
ppa = pageTable(vpn)
off = va % pageSize
pa  = ppa + off
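The four-line page-table translation above can be sketched on the CPU as follows. This is a minimal 1D illustration under stated assumptions, not Glift code: the type `PageTable1D`, its members, and the use of `std::optional` to model unmapped (sparse) pages are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// 1D page-table address translator (CPU sketch, hypothetical names).
// Each entry holds the physical base address of a page, or no value
// when the virtual page is unmapped (partial, sparse mapping).
struct PageTable1D {
    std::size_t pageSize;
    std::vector<std::optional<std::size_t>> entries;

    // Translate a virtual address to a physical address, or return
    // std::nullopt if the address falls in an unmapped page.
    std::optional<std::size_t> translate(std::size_t va) const {
        assert(pageSize > 0);
        const std::size_t vpn = va / pageSize;   // virtual page number
        if (vpn >= entries.size() || !entries[vpn]) return std::nullopt;
        const std::size_t ppa = *entries[vpn];   // physical page address
        const std::size_t off = va % pageSize;   // offset within the page
        return ppa + off;                        // physical address
    }
};
```

With `pageSize = 4` and entries `{8, unmapped, 0}`, virtual address 2 translates to physical address 10, address 5 is unmapped, and address 9 translates to 1. Lookup cost is constant regardless of how many pages are mapped, which matches the O(1) computation noted earlier.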



ASM Data Structure

• Adaptive Page Table
  • Map multiple virtual pages to single physical page

[Diagram labels: Virtual Domain, Page Table, Physical Memory]

vpn = va / pageSize
ppa = pageTable(vpn).ppa()
s   = pageTable(vpn).s()
off = (va * s) % pageSize
pa  = ppa + off
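The adaptive lookup above adds a per-entry scale factor `s`. The following 1D CPU sketch shows one possible reading of it, assuming `s = 0.5` means that two adjacent, aligned virtual pages share a single physical page. The types `AdaptiveEntry` and `AdaptivePageTable1D` are hypothetical, and floating-point addresses stand in for texture coordinates.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Adaptive page-table entry (hypothetical layout): physical page address
// plus a scale factor from virtual units to texels within that page.
struct AdaptiveEntry {
    double ppa;  // physical page address
    double s;    // scale; 0.5 is assumed to mean two virtual pages per physical page
};

struct AdaptivePageTable1D {
    double pageSize;
    std::vector<AdaptiveEntry> entries;

    double translate(double va) const {
        assert(pageSize > 0.0 && va >= 0.0);
        const std::size_t vpn = static_cast<std::size_t>(va / pageSize);
        assert(vpn < entries.size());
        const AdaptiveEntry& e = entries[vpn];
        const double off = std::fmod(va * e.s, pageSize);  // scaled offset
        return e.ppa + off;
    }
};
```

With `pageSize = 4`, entries 0 and 1 both set to `{8, 0.5}`, and entry 2 set to `{0, 1}`, virtual addresses 1 and 5 land in the same physical page (at 8.5 and 10.5), while address 9 translates to 1 at full resolution.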



ASM Data Structure

• Multiresolution Page Table

[Diagram labels: Virtual Domain, Mipmap Page Table, Physical Memory]



ASM Data Structure Requirements

• How to support bilinear filtering?
  • Duplicate 1 column and 1 row of texels in each page
• Mipmapped trilinear?
  • "By-hand" interpolation between mipmap levels
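The "by-hand" interpolation between mipmap levels can be illustrated with a 1D CPU sketch: filter within the two nearest levels, then blend by the fractional level of detail. The function names and the `Level` alias are hypothetical, and the coordinate scaling is simplified (it ignores half-texel centers), so treat this as a sketch of the idea rather than the shader Glift generates.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Level = std::vector<float>;  // one mipmap level (1D for brevity)

// Linearly filtered lookup within one level; u is in texel units of that level.
float sampleLinear(const Level& level, float u) {
    assert(!level.empty());
    const float maxIndex = static_cast<float>(level.size() - 1);
    u = std::clamp(u, 0.0f, maxIndex);
    const std::size_t i0 = static_cast<std::size_t>(std::floor(u));
    const std::size_t i1 = std::min(i0 + 1, level.size() - 1);
    const float t = u - static_cast<float>(i0);
    return (1.0f - t) * level[i0] + t * level[i1];
}

// Blend between two adjacent mipmap levels by hand.
// levels[0] is the finest; each coarser level is assumed to halve the resolution.
// u is in texel units of level 0; lod is a fractional level of detail.
float sampleMipmapped(const std::vector<Level>& levels, float u, float lod) {
    assert(!levels.empty());
    lod = std::clamp(lod, 0.0f, static_cast<float>(levels.size() - 1));
    const std::size_t lo = static_cast<std::size_t>(std::floor(lod));
    const std::size_t hi = std::min(lo + 1, levels.size() - 1);
    const float t = lod - static_cast<float>(lo);
    const float a = sampleLinear(levels[lo], u / std::exp2(static_cast<float>(lo)));
    const float b = sampleLinear(levels[hi], u / std::exp2(static_cast<float>(hi)));
    return (1.0f - t) * a + t * b;
}
```

In the ASM setting each per-level lookup would itself go through the page table for that mipmap level, which is why the blend has to be written explicitly instead of relying on hardware trilinear filtering.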



How to Define the ASM Structure in Glift?

• Start with generic page table AddrTrans
  • Use mipmapped PhysMem for page table
  • Change template parameter to add adaptivity
• Write page allocator
  • alloc_pages, free_pages
• Finally…

typedef PageTableAddrTrans<…>                  PageTable;
typedef PhysMemGPU<vec2f, vec1s>               PMem2D;
typedef VirtMemGPU<PageTable, PMem2D>          VPageTable;
typedef AdaptiveMem<VPageTable, PageAllocator> ASM;
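The typedef chain above composes a data structure from an address translator and a physical memory at compile time. The CPU-only sketch below imitates that pattern with static polymorphism. The classes `PhysMem`, `RowMajor2D`, and `VirtMem` here are simplified, hypothetical stand-ins and do not reproduce Glift's real class interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Physical memory: a flat array (stand-in for a texture).
template <typename T>
class PhysMem {
public:
    explicit PhysMem(std::size_t n) : data_(n) {}
    T& at(std::size_t pa) { assert(pa < data_.size()); return data_[pa]; }
private:
    std::vector<T> data_;
};

// Address translator policy: maps (row, col) to a 1D physical address.
struct RowMajor2D {
    std::size_t width;
    std::size_t translate(std::size_t row, std::size_t col) const {
        assert(col < width);
        return row * width + col;
    }
};

// Virtual memory = address translator + physical memory, bound at compile
// time so the compiler can inline the translation (no virtual calls).
template <typename AddrTrans, typename Mem>
class VirtMem {
public:
    VirtMem(AddrTrans trans, Mem mem) : trans_(trans), mem_(std::move(mem)) {}
    auto& at(std::size_t row, std::size_t col) {
        return mem_.at(trans_.translate(row, col));
    }
private:
    AddrTrans trans_;
    Mem mem_;
};

typedef PhysMem<float>              PMem1D;
typedef VirtMem<RowMajor2D, PMem1D> Array2D;
```

Swapping `RowMajor2D` for a different translator type changes the layout without touching `VirtMem` or the code that uses `Array2D`, which is the kind of extensibility the design goals describe.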


ASM Data Structure Usage

float4 main( uniform VMem2D asm,
             float3 shadowCoord,
             float4 litColor ) : COLOR
{
    float isInLight = asm.vTex2Ds( shadowCoord );
    return lerp( black, litColor, isInLight );
}

asm.bind_for_read( … );
asm.bind_for_write( … );
asm.alloc_pages( … );
asm.free_page( … );
…


Adaptive Shadow Map Algorithm

• Faithful to Fernando et al. 2001
• Refinement algorithm
  • Identify shadow pixels w/ resolution mismatch (GPU)
  • Compact pixels into small stream (GPU)
  • CPU reads back compacted stream (GPU → CPU)
  • Allocate pages
  • Draw new PTEs into mipmap page tables (CPU → GPU)
  • Draw depth into ASM for each new page (GPU)
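Two of the refinement steps above, compaction and page allocation, can be sketched on the CPU. In the talk the compaction runs on the GPU; the version below is only a CPU stand-in for the idea. The function names, the `kNoRequest` marker, and the free-list allocator are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <vector>

constexpr int kNoRequest = -1;  // marks pixels whose shadow resolution is adequate

// Compact the per-pixel request buffer into a short, duplicate-free
// list of requested virtual page numbers.
std::vector<int> compactRequests(const std::vector<int>& perPixel) {
    std::vector<int> out;
    std::copy_if(perPixel.begin(), perPixel.end(), std::back_inserter(out),
                 [](int vpn) { return vpn != kNoRequest; });
    std::sort(out.begin(), out.end());
    out.erase(std::unique(out.begin(), out.end()), out.end());
    return out;
}

// Hand out physical pages from a free list. Returns the mapping from
// virtual page number to physical page for the requests that fit.
std::map<int, std::size_t> allocPages(const std::vector<int>& requests,
                                      std::vector<std::size_t>& freeList) {
    std::map<int, std::size_t> mapping;
    for (int vpn : requests) {
        assert(vpn != kNoRequest);
        if (freeList.empty()) break;  // out of physical memory
        mapping[vpn] = freeList.back();
        freeList.pop_back();
    }
    return mapping;
}
```

The resulting mapping corresponds to the new page-table entries that would then be drawn into the page table before depth is rendered into each new page.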


[Thanks to Yong Kil for the tree model]

ASM: Effective resolution 131,072² (37 MB); SM: 2048²


"Octree" 3D Paint

• Interactive painting on unparameterized 3D surfaces
• 3D version of ASM data structure
• Differs from previous work:
  • Quadrilinear filtering
  • O(1), uniform access
• Interactive with effective resolutions between 64³ and 2048³


Demo


ASM Results

• Effective shadow map resolution up to 131,072²
    16² - 64² page size
    512² - 2048² page table
    2048² - 4096² physical memory
    20 - 80 MB
• Performance (45k polygon model)
  • 15 fps while moving camera (including refinement)
  • 5-10 fps while moving light
• Lookup time compared to 2048² shadow map:
  • Bilinear filtered: 90% performance of traditional
  • Trilinear filtered mipmapped: 73%
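The maximum effective resolution quoted above is consistent with multiplying the largest page-table resolution by the largest page size along each axis. The slide does not state this formula, so it is an assumption about how "effective resolution" is defined:

```latex
\[
  2048 \times 64 \;=\; 2^{11} \times 2^{6} \;=\; 2^{17} \;=\; 131{,}072
\]
```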


Glift Results

• Static instruction results
  • With Cg program specialization

                   Glift   By-Hand   Brook
  1D → 2D            4        3        4
  3D page table      5        5
  ASM                9        9
  Octree            10        9
  ASM + offset      10        9

• Conclusion: Glift structures within 1 instr of hand-coded Cg

Measured with NVShaderPerf, NVIDIA driver 75.22, Cg 1.4a


Overview

• Motivation and previous work
• Abstraction
• Implementation
• Examples
• Conclusions


Summary

• GPU programming needs data structure abstraction
  • Separate data structures and algorithms
  • More complex data structures and algorithms
• Why programmable address translation?
  • Common pattern in GPU data structures
  • Small amount of code virtualizes GPU memory model


Summary

• Glift template library
  • Generic C++/Cg implementation of abstraction
  • Nearly as efficient as hand coding
  • Integrates with OpenGL/Cg
• Iterator computation model
  • Generalize GPU computation model
  • Can future rasterizer increment iterators?


Acknowledgements

• Craig Kolb, Nick Triantos, Cass Everitt (NVIDIA)
• Fabio Pellacini (Dartmouth)
• Adam Moerschell, Yong Kil, Serban Porumbescu, Chris Co, … (UC Davis)
• Ross Whitaker, Chuck Hansen, Milan Ikits (U. of Utah)
• National Science Foundation Graduate Fellowship
• Department of Energy


More Information

• Upcoming paper in ACM Transactions on Graphics
  • "Glift: Generic, Efficient, Random-Access GPU Data Structures"
• ACM SIGGRAPH 2005 Sketches
  • "Dynamic Adaptive Shadow Maps on Graphics Hardware"
  • "Octree Texture on Graphics Hardware"
• Google "Glift"
  • http://graphics.cs.ucdavis.edu/~lefohn/