Tristan Lorach, March 17th 2016. Vulkan: the essentials

Vulkan: the essentials - Nvidia


Page 1: Vulkan: the essentials - Nvidia

Tristan Lorach, March 17th 2016

Vulkan: the essentials

Page 2: Vulkan: the essentials - Nvidia

Analogy on Graphics APIs

Page 3: Vulkan: the essentials - Nvidia

Analogy

Fixed-function OpenGL

Pre-assembled toy car: fun out of the box, not much room for customization

Page 4: Vulkan: the essentials - Nvidia

Analogy

Modern AZDO OpenGL with Programmable Shaders

LEGO kit: you build it yourself, comes with plenty of useful, pre-shaped pieces

AZDO = Approaching Zero Driver Overhead

Page 5: Vulkan: the essentials - Nvidia

Analogy

Vulkan

Pinewood Derby kit: you build it yourself to race, from raw materials; power tools used to assemble, adult supervision highly recommended

Page 6: Vulkan: the essentials - Nvidia

Analogy

Different Valid Approaches

• Fixed-function OpenGL
• Modern AZDO OpenGL with Programmable Shaders
• Vulkan

Page 7: Vulkan: the essentials - Nvidia

Beneficial Vulkan Scenarios

[Flowchart] start: Is your graphics work CPU bound? If yes: Can your graphics creation be parallelized? If yes: Vulkan friendly.

Other conditions, each answered "yes" on the way to "Vulkan friendly":

• Your graphics platform is fixed
• You’ll do whatever it takes to squeeze out max perf.
• You put a premium on avoiding hitches
• You can manage your graphics resource allocations

Page 8: Vulkan: the essentials - Nvidia

Beneficial Vulkan Scenarios

[Flowchart] Same flowchart as on the previous slide, with a few extra questions:

• Tired of OpenGL (state-machine) or even D3D? Kinda… (it’s a Yes)
• Want to learn new stuff? Spend lots of time coding? No sleep? Alright… (Yes)

Page 9: Vulkan: the essentials - Nvidia

Scenarios to Reconsider Coding to Vulkan

Unlikely to Benefit (OpenGL / D3D)

1. Need for compatibility with pre-Vulkan platforms
2. Heavily GPU-bound application
3. Heavily CPU-bound application due to non-graphics work
4. Single-threaded application, unlikely to change
5. App can target a middleware engine, avoiding 3D graphics API dependencies
  • Consider using an engine targeting Vulkan, instead of dealing with Vulkan yourself

Page 10: Vulkan: the essentials - Nvidia

Big Picture – Typical OpenGL Case

[Diagram] Application, OpenGL Driver, GPU and GPU memory.

Application: OpenGL Commands, OpenGL resources

OpenGL Driver: Resources, Graphics pipeline States, Dependencies, Heap, Cmd bundles

Driver to GPU: cmds through the Push-Buffer (FIFO) into the Front-End (decoder)

GPU stages: Vertex Puller (IA), Vertex Shader, TCS (Tessellation), Tessellator, TES (Tessellation), Geometry Shader, Transform Feedback, Rasterization, Fragment Shader, Per-Fragment Ops, Framebuffer

GPU memory: Element buffer (EBO), Draw Indirect Buffer, Vertex Buffer (VBO), Uniform Block, Texture Fetch, Image Load/Store, Atomic Counter, Shader Storage, Tr. Feedback buffer, FBO resources (Textures / RB)

Page 11: Vulkan: the essentials - Nvidia

Big Picture – Vulkan

[Diagram] Same GPU stages and GPU memory resources as in the OpenGL case; the responsibilities move from the driver to the application.

Application: Resources, Pipeline States, Cmd-buffers / queues, Dependencies, Heap, Render Passes, Descriptor Sets

Driver: minimal memory management; fewer translations, validation checks and internal management; Cmd bundles

Driver to GPU: cmds through the Push-Buffer (FIFO) into the Front-End (decoder)

Page 12: Vulkan: the essentials - Nvidia

Vulkan Components

[Diagram] Device; Queue; Command-buffer and 2ndary Command-buffer (from a Cmd.Buffer Pool); Graphics pipeline; Descriptor-Set (from a DescriptorSet Pool); Framebuffer and Render-Pass; Image and Image View; Buffer; Sampler; Memory (Heap)

Commands shown inside the Command-buffer:

• Begin Render-Pass
• Bind Graphics-pipeline
• Bind Vertex/Idx Buffer(s)
• Bind Descriptor-Set(s)
• Draw…
• End Render-Pass
• Execute Commands
• Update Buffer
• Set misc. dynamic states
• Barrier synchronization

Page 13: Vulkan: the essentials - Nvidia

Vulkan Components: Device

Page 14: Vulkan: the essentials - Nvidia

Vulkan Objects: Device

Device: can have many …

VkPhysicalDevice

• Capabilities
• Memory Management
• Queues
• Objects
  • Buffers
  • Images
  • Sync Primitives

Page 15: Vulkan: the essentials - Nvidia

NVIDIA’s Vulkan Capabilities

• Properties listed from Physical Device
• NVIDIA is almost fully featured
• Top to bottom: from GeForce and Quadro down to Tegra
• Check http://vulkan.gpuinfo.org/listreports.php

Page 16: Vulkan: the essentials - Nvidia

NVIDIA’s Vulkan Capabilities

[Figure] Capability listings for GeForce GTX 980 and Tegra X1 & K1

Page 17: Vulkan: the essentials - Nvidia

Vulkan Components: Queue

Page 18: Vulkan: the essentials - Nvidia

Queues

• Command queue was hidden in the OpenGL Context… now explicitly declared
• Multiple threads can submit work to a queue (or queues)!
• Queues accept GPU work via CommandBuffer submissions
  • Few operations available: “submit work” and “wait for idle”
• Queue submissions can include sync primitives for the queue to:
  • Wait upon before processing the submitted work
  • Signal when the work in this submission is completed
• Queue “families” can accept different types of work, e.g.
  • NVIDIA exposes 16 Queues
  • Only one type of queue for all the types of work
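A minimal sketch of how an application might pick a queue family by the kind of work it accepts. The `QueueFamily` struct and the `QUEUE_*` constants are local stand-ins (assumptions made so the sketch compiles without the Vulkan headers); they mirror `VkQueueFamilyProperties` and the `VkQueueFlagBits` values, and `findQueueFamily` is a hypothetical helper, not part of the API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Local stand-ins mirroring the VkQueueFlagBits values.
constexpr std::uint32_t QUEUE_GRAPHICS = 0x1;
constexpr std::uint32_t QUEUE_COMPUTE  = 0x2;
constexpr std::uint32_t QUEUE_TRANSFER = 0x4;

// Simplified stand-in for VkQueueFamilyProperties.
struct QueueFamily {
    std::uint32_t flags;       // kinds of work the family accepts
    std::uint32_t queueCount;  // number of queues exposed by the family
};

// Returns the index of the first family that accepts all the required
// kinds of work, or -1 if none does.
inline int findQueueFamily(const std::vector<QueueFamily>& families,
                           std::uint32_t required) {
    assert(required != 0);
    for (std::size_t i = 0; i < families.size(); ++i) {
        const bool hasQueues = families[i].queueCount > 0;
        const bool accepts = (families[i].flags & required) == required;
        if (hasQueues && accepts) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

With the configuration described on the slide (a single family of 16 queues accepting every type of work), any request resolves to family 0. On a device that exposes a separate transfer-only family, the same helper would return different indices for graphics and transfer work.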

Page 19: Vulkan: the essentials - Nvidia

Vulkan Components: Command-buffer

Page 20: Vulkan: the essentials - Nvidia

Command-Buffers

• Vulkan Rendering Command-Buffers
  • Almost what the GPU will get at the Front-End (FIFO)
  • Minor translation & optimization from the Driver prior to sending to the GPU
• Each can be created either for one shot or for multiple frames/submissions
• Graphics work cannot be created from the GPU (command-lists can): commands are recorded through API calls to vkCmd…() between Begin & End
• Multi-threading friendly!
• A Primary Cmd-Buffer can call many 2ndary Cmd-Buffers

[Diagram] A Primary Cmd-buffer executing several 2ndary Cmd-buffers, all allocated from a Cmd.Buffer Pool

Page 21: Vulkan: the essentials - Nvidia

Command-Buffers: Update/Push Constants

• 2 more ways to update constants/uniforms for Shaders from the Command-Buffer
• Update-Buffer: prior to the Render-Pass; can target any Buffer bound by Descriptor Sets
• Push-Constants: targets a dedicated section in GLSL/SPIR-V
  • New values are appended “in-band”: in the Command-Buffer
  • Efficient, but only good for a small amount of values

[Diagram] Primary Cmd-buffer: vkCmdUpdateBuffer(), Begin Render-Pass, vkCmdPushConstants, Draw…

layout(push_constant) uniform objectBuffer {
    mat4 matrixObject;
    vec4 diffuse;
} object;

layout(set=0, binding=2) uniform MyBuffer {
    mat4 mW;
    …
};
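A minimal sketch of the CPU side of the push-constant block above, to show why it only suits a small amount of values. `ObjectPushConstants` and `isValidPushConstantRange` are hypothetical names. The 128-byte figure is the minimum the Vulkan specification guarantees for `maxPushConstantsSize` (a real device may report more), and the multiple-of-4 checks follow the valid-usage rules of `vkCmdPushConstants`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical CPU-side mirror of the push_constant block from the slide.
struct ObjectPushConstants {
    float matrixObject[16];  // mat4: 16 floats, 64 bytes
    float diffuse[4];        // vec4:  4 floats, 16 bytes
};

// Minimum value the specification guarantees for
// VkPhysicalDeviceLimits::maxPushConstantsSize.
constexpr std::size_t kMinGuaranteedPushConstantBytes = 128;

static_assert(sizeof(ObjectPushConstants) == 80,
              "mat4 + vec4 should occupy 80 bytes");
static_assert(sizeof(ObjectPushConstants) <= kMinGuaranteedPushConstantBytes,
              "block must fit in the guaranteed push-constant space");

// Checks a (offset, size) range against the rules vkCmdPushConstants
// imposes: non-empty, both multiples of 4, and within the device limit.
inline bool isValidPushConstantRange(std::size_t offset, std::size_t size,
                                     std::size_t maxSize) {
    return size > 0 && offset % 4 == 0 && size % 4 == 0 &&
           offset <= maxSize && size <= maxSize - offset;
}
```

The block from the slide uses 80 of the 128 guaranteed bytes, so a second `mat4` would no longer fit on a device that only offers the minimum; larger data belongs in a uniform buffer updated through `vkCmdUpdateBuffer()` or a mapped buffer.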

Page 22: Vulkan: the essentials - Nvidia

Synchronization

• Semaphores
  • Used to synchronize work across queues or across coarse-grained submissions to a single queue
• Events and barriers
  • Used to synchronize work within a command buffer or a sequence of command buffers submitted to a single queue
• Fences
  • Used to synchronize work between the device and the host

[Diagram] Device with several Queues, each holding Cmd-buffers: barriers and events act inside a Queue, Semaphores act between Queues, Fences act between the Device and the Host

Page 23: Vulkan: the essentials - Nvidia

Command-Buffers and Multi-Threading

[Diagram] One main thread and four worker threads, all busy.

Main thread: Game Work, Thread Coordination, Swapping. It has its own cmd. Buffer Pool, creates the 1ary Cmd Buffer, collects the 2dary ones (the 1ary Cmd Buffer calls the 2dary ones) and submits to the Queue.

Threads 1 to 4: Update Work, Feed Cmd Buffers. Each has its own cmd. Buffer Pool, creates 2dary Cmd Buffers and gives them out.

! Command Buffer Pool local to the thread !

Page 24: Vulkan: the essentials - Nvidia

Command Buffer Thread Safety

• Must not recycle a CommandBuffer for rewriting until it is no longer in flight (in flight == GPU still consuming it on its side)
• But can’t flush the queue each frame: it would break parallelism!
• VkFences can be provided with a queue submission to test when a command buffer is ready to be recycled

[Diagram] App submissions to the Queue: CommandBuffers followed by Fence A, then more CommandBuffers followed by Fence B. The GPU consumes the Queue; once Fence A is signaled to the App, the command buffers submitted before it can be rewritten.

Page 25: Vulkan: the essentials - Nvidia

Threads And Command Pools

• Threads can have more than 1 Command Pool
• Ring-buffer: One Command-Pool per Frame
• Faster to simply reset a pool when that thread/frame is no longer in flight (Using Fences)

[Diagram] Thread 1 and Thread 2 each own three CommandPools, one per frame (Frame N, Frame N-1, Frame N-2); each CommandPool holds its own Command Buffers
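The ring-buffer idea can be sketched as follows, under loud assumptions: `FakeFence` and `FakePool` are stand-ins for `VkFence` and `VkCommandPool` so the sketch runs without a GPU, and `PerThreadPoolRing` is a hypothetical class. In a real application the marked lines would call `vkResetCommandPool`, `vkResetFences` and a queue submission that carries the fence.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-ins for VkFence and VkCommandPool (assumptions for illustration).
struct FakeFence { bool signaled = true; };  // true: GPU is done with the slot
struct FakePool  { int recordedBuffers = 0; int resets = 0; };

constexpr std::size_t kFramesInFlight = 3;   // Frame N, N-1, N-2 on the slide

struct FrameSlot {
    FakePool pool;
    FakeFence fence;
};

// One ring per recording thread: the pools are never shared across threads.
class PerThreadPoolRing {
public:
    // Returns the pool to record into for this frame, or nullptr when the GPU
    // still consumes the slot (a real app would wait on the fence instead).
    FakePool* beginFrame(std::uint64_t frameIndex) {
        FrameSlot& slot = slots_[frameIndex % kFramesInFlight];
        if (!slot.fence.signaled) {
            return nullptr;
        }
        slot.pool.recordedBuffers = 0;  // stands in for vkResetCommandPool
        ++slot.pool.resets;
        slot.fence.signaled = false;    // stands in for vkResetFences + submit
        return &slot.pool;
    }

    // Called once the GPU has finished consuming the frame's submissions.
    void gpuFinished(std::uint64_t frameIndex) {
        slots_[frameIndex % kFramesInFlight].fence.signaled = true;
    }

private:
    std::array<FrameSlot, kFramesInFlight> slots_{};
};
```

Resetting the whole pool recycles every command buffer allocated from it in one call, which is why the slide prefers it over resetting buffers one by one. The fence check is what prevents rewriting a buffer that is still in flight.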

Page 26: Vulkan: the essentials - Nvidia

Vulkan Components: Graphics pipeline

Page 27: Vulkan: the essentials - Nvidia

Graphics Pipeline

• Snapshot of all States
  • Including Shaders
• Pre-compiled & Immutable
  • Ideally: done at Initialization time
  • Ok at render-time *if* using the Pipeline-Cache
• Prevents validation overhead during the rendering loop
• Some Render-states can be excluded from it: they become “Dynamic” States

[Diagram] A Graphics pipeline, optionally built through a Pipeline cache, gathers: Pipeline-layout (Descriptor-set Layouts), Shader Stages (each referencing a Shader Module), Vertex Input, Tess. State, Viewport State, Rasterization State, Multi-Sample State, Depth & Stencil State, Color Blend State

Optional Dynamic States: Viewport, Scissor, Blend const, Stencil Ref, Depth Bounds, Depth Bias

Rasterization State

• depthClipEnable
• rasterizerDiscardEnable
• fillMode
• cullMode
• frontFace
• depthBiasEnable
• depthBias
• depthBiasClamp
• slopeScaledDepthBias
• lineWidth

Page 28: Vulkan: the essentials - Nvidia

Graphics Pipeline

• Graphics Pipeline must be consistent with shaders
• No “introspection”, so everything is known & prepared in advance
• Vertex Input:
  • Tells how Attributes (Locations) are attached to which Vertex Buffer at which offset
• Pipeline Layout:
  • Tells how to map Sets and Bindings for the shaders at each stage (Vtx, Fragment, Geom…)

[Diagram] Graphics pipeline: Pipeline-layout (Descriptor-set Layouts, for the Descriptor-Sets), Vertex Input, Shader Stage with its Shader Module (GLSL code compiled to SPIR-V)

GLSL Code

layout(std140, set=0, binding=0) uniform A { ... };
layout(std140, set=0, binding=1) uniform B { … };
layout(std140, set=1, binding=2) uniform C { … };
layout(location=0) in vec3 pos;
layout(location=1) in vec3 N;
…
void main() { …
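A minimal sketch of what the Vertex Input state has to describe for the two GLSL inputs above (`pos` at location 0, `N` at location 1), assuming both attributes are interleaved in a single vertex buffer at binding 0. `Vertex`, `AttributeDesc`, `VertexInputDesc` and `describeVertexInput` are hypothetical; `AttributeDesc` is a simplified stand-in for `VkVertexInputAttributeDescription`, which additionally carries a format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical interleaved vertex matching the shader inputs:
// layout(location=0) in vec3 pos;  layout(location=1) in vec3 N;
struct Vertex {
    float pos[3];
    float N[3];
};

// Simplified stand-in for VkVertexInputAttributeDescription.
struct AttributeDesc {
    std::uint32_t location;  // matches layout(location=...) in the shader
    std::uint32_t binding;   // which bound vertex buffer feeds the attribute
    std::uint32_t offset;    // byte offset of the attribute inside one vertex
};

struct VertexInputDesc {
    std::uint32_t stride;    // bytes between two consecutive vertices
    AttributeDesc attributes[2];
};

// Since there is no introspection, the application states explicitly where
// each shader location lives in the vertex buffer.
inline VertexInputDesc describeVertexInput() {
    VertexInputDesc desc{};
    desc.stride = static_cast<std::uint32_t>(sizeof(Vertex));
    desc.attributes[0] = {0u, 0u, static_cast<std::uint32_t>(offsetof(Vertex, pos))};
    desc.attributes[1] = {1u, 0u, static_cast<std::uint32_t>(offsetof(Vertex, N))};
    return desc;
}
```

Deriving the stride and offsets from the struct with `sizeof` and `offsetof` keeps the pipeline description in step with the vertex layout if a field is added later.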

Page 29: Vulkan: the essentials - Nvidia

Vulkan Components: Buffer

Page 30: Vulkan: the essentials - Nvidia

Buffers

• Highly heterogeneous. Most often used for:
  • Index/Vertex Buffers
  • Uniform Buffers (Matrices, material parameters…)
• Vulkan Object: must be bound to some Device Memory
  • Can be CPU accessible memory (mappable)
  • Can be CPU cached
  • Can be GPU accessible only: needs a “Staging Buffer” to write into it
    • But most efficient

(More on Device Memory later…)

Page 31: Vulkan: the essentials - Nvidia

Vulkan Components: Image and Image View

Page 32: Vulkan: the essentials - Nvidia

Images And ImageView

• Images represent all kinds of ‘pixel-like’ arrays
  • Textures: Color or Depth-Stencil
  • Render targets: Color and Depth-Stencil
  • Even Compute data
  • Shader Load/Store (imgLoadStore)
• An ImageView is required to expose Images properly when a specific format is required
  • For Shaders
  • For Framebuffers

Page 33: Vulkan: the essentials - Nvidia

Vulkan Components: Descriptor-Set

Page 34: Vulkan: the essentials - Nvidia

Descriptor-Set

• Each DescriptorSet holds references to some resources
• The Descriptor-Set-Layout defines how resources must be put together in a DescriptorSet
• Command buffers can then efficiently bind any of them
• They must match what the shaders of each stage expect!

[Diagram] A Descriptor-Set, allocated from a DescriptorSet Pool and shaped by a Descriptor-set Layout, references Image Views, Buffers and Samplers; the Images and Buffers are backed by Memory (Heap)

GLSL Code

layout(std140, set=0, binding=0) uniform A { ... };
layout(std140, set=0, binding=1) uniform B { … };
layout(std140, set=1, binding=2) uniform C { … };
layout(set=0, binding=3) uniform sampler2D tex;
…
void main() { …

Page 35: Vulkan: the essentials - Nvidia

Descriptor-Set

[Diagram] Layouts, the Descriptor-Sets created from them, and the resources they reference.

Descriptor-set Layout A: Uniform Buffer for Vtx; Storage Buffer for frag.; Image View for frag.
Descriptor-set Layout B: Image View for frag.; Sampler for frag. shd
Descriptor-set Layout C: Uniform Buffer for Vtx

Dset A1: Buffer K, Buffer L, Image View M
Dset A2: Buffer N, Buffer O, Image View P
Dset B1: Image View Q, Sampler R
Dset C1: Buffer S
Dset C2: Buffer T
Dset C3: Buffer S

Memory (Heap): Image M, Image P, Image Q, Buffer K, Buffer L, Buffer N, Buffer O, Buffer S, Buffer T

Command-buffer: Bind Graphics-pipeline (defined for Layout A+B+C), then Bind Descriptor-Set(s); the “?” slots can take any Descriptor-Set created from the matching layout

Page 36: Vulkan: the essentials - Nvidia

Vulkan Components: Framebuffer and Render-Pass

Page 37: Vulkan: the essentials - Nvidia

Framebuffer and Render-Pass

• Framebuffer
  • Simpler than OpenGL
  • “Bag” or “Repository” of resource views
  • No role defined for the resources
• Render-Pass
  • Really defines the role of Framebuffer resources
  • Can have more than 1 Sub-Pass
  • Each Sub-Pass defines which Framebuffer resources to use
  • Invented for Tiler architectures

[Diagram] Framebuffer: ImageView #0 and further Image Views, one per Image (Image #0, …). The Framebuffer must match the Render-Pass Attachment Desc.

Attachment Desc.

• Image #0: fmt…
• Image #1: fmt…
• Image #2: fmt…
• Image #3: fmt…

Sub-Pass #0 Desc.

• Color: Image #0
• DST: Image #1
• MSAA resolve: Image #2

Sub-Pass #1 Desc.

• Color0: Image #3
• Color1: Image #4
• DST: Image #1
• MSAA resolve: Image #5

Sub-Pass #N Desc.

Default Sub-Pass

Can use many if compatible

Page 38: Vulkan: the essentials - Nvidia

Vulkan Components: Memory

[Diagram] Memory (Vid): Heap 2; Memory (Sys): Heap 1

Page 39: Vulkan: the essentials - Nvidia

Memory

• Vulkan Objects referring to buffer(s) of data need binding to memory
  • Vertex/Index Buffers; Uniform Buffers; Images/Textures…
• The Vulkan Device exposes various Memory Heaps. Example:
  • heap 0: size: 12,288 MB (Video Memory of my K6000)
  • heap 1: size: 17,911 MB (System Memory of my PC)
• And various Memory Types from these Heaps. Example:

Mem.Type   Heap          Flags
0          1 (sys.mem)   -
1          0 (Video)     DEVICE_LOCAL
2          1 (sys.mem)   HOST_VISIBLE | HOST_COHERENT
3          1 (sys.mem)   HOST_VISIBLE | HOST_COHERENT | HOST_CACHED

Tegra: Adds one more: HOST_VISIBLE “NON-Coherent”
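A minimal sketch of how an application might choose a memory type from a table like the one above. The constants and `MemoryType` are local stand-ins (assumptions so the code compiles without the Vulkan headers) mirroring `VkMemoryPropertyFlagBits` and `VkMemoryType`; `slideMemoryTypes` and `findMemoryType` are hypothetical helpers. The `typeBits` mask plays the role of `VkMemoryRequirements::memoryTypeBits`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Local stand-ins mirroring the VkMemoryPropertyFlagBits values.
constexpr std::uint32_t DEVICE_LOCAL  = 0x1;
constexpr std::uint32_t HOST_VISIBLE  = 0x2;
constexpr std::uint32_t HOST_COHERENT = 0x4;
constexpr std::uint32_t HOST_CACHED   = 0x8;

// Simplified stand-in for VkMemoryType.
struct MemoryType {
    std::uint32_t heapIndex;
    std::uint32_t flags;
};

// The example table from the slide (heap 0: video, heap 1: system memory).
inline std::vector<MemoryType> slideMemoryTypes() {
    return {
        {1u, 0u},
        {0u, DEVICE_LOCAL},
        {1u, HOST_VISIBLE | HOST_COHERENT},
        {1u, HOST_VISIBLE | HOST_COHERENT | HOST_CACHED},
    };
}

// typeBits: bit i set means the resource may be bound to memory type i.
// Returns the first allowed type that has all the wanted flags, or -1.
inline int findMemoryType(const std::vector<MemoryType>& types,
                          std::uint32_t typeBits, std::uint32_t wanted) {
    for (std::size_t i = 0; i < types.size() && i < 32; ++i) {
        const bool allowed = ((typeBits >> i) & 1u) != 0;
        const bool hasFlags = (types[i].flags & wanted) == wanted;
        if (allowed && hasFlags) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

With the slide's table, asking for `DEVICE_LOCAL` yields type 1 (video memory), while asking for `HOST_VISIBLE | HOST_COHERENT` yields type 2, the kind of memory a staging buffer would typically use.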

Page 40: Vulkan: the essentials - Nvidia

Memory

[Diagram] Vulkan objects bound to the two heaps, Memory (Vid) Heap 2 and Memory (Sys) Heap 1:

• Buffer (Vertices): Vtx Buf. holding V1.xyzw, V2.xyzw…
• Buffer (Uniforms): matrices
• Image (Tex), with its Image View: rgba / rgb data
• Image (RT), with its Image View: rgba / rgb data

Page 41: Vulkan: the essentials - Nvidia

Resource management

Allocation and Sub allocation

[Diagram] HEAP supporting A,B and HEAP supporting B; Allocation Type A and Allocation Type B; resources placed inside the allocations: Image, Cube Image, Buffer, ...; BufferViews on a Buffer

• Allocate memory type from heap
• Query Vulkan Object about size, alignment & type requirements
• Assign memory subregion to a resource (allows aliasing)
• Create resource views on subranges of a buffer or image (array slices...)

Page 42: Vulkan: the essentials - Nvidia

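Resource Management

[Diagram] Three ways to lay out Index, Vertex and Uniform data:

• Not. So. Good.: separate Buffers (Index, Vertex, Uniform), each with its own memory
• Better...: one Memory Allocation shared by several Buffers (Index, Vertex, Uniform)
• #HappyGPU: one Memory Allocation holding one Buffer with Uniform, Vertex and Index data

Bind same buffer with Offsets > 0

1 buffer can have many types of data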
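A minimal sketch of sub-allocation inside one large allocation, assuming a simple linear ("bump") strategy, which is only one of several possible policies. `alignUp`, `LinearSubAllocator` and `kSubAllocFailed` are hypothetical names. The alignment is assumed to be a power of two, as Vulkan reports in `VkMemoryRequirements::alignment`.

```cpp
#include <cassert>
#include <cstdint>

// Returned by allocate() when the request does not fit.
constexpr std::uint64_t kSubAllocFailed = UINT64_MAX;

// Rounds offset up to the next multiple of alignment (a power of two).
inline std::uint64_t alignUp(std::uint64_t offset, std::uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Hypothetical bump allocator handing out offsets inside one memory
// allocation (or inside one big buffer bound with offsets > 0).
class LinearSubAllocator {
public:
    explicit LinearSubAllocator(std::uint64_t capacity) : capacity_(capacity) {}

    // Returns the aligned offset of the new sub-range, or kSubAllocFailed.
    std::uint64_t allocate(std::uint64_t size, std::uint64_t alignment) {
        const std::uint64_t start = alignUp(head_, alignment);
        if (start > capacity_ || size > capacity_ - start) {
            return kSubAllocFailed;
        }
        head_ = start + size;
        return start;
    }

    // Frees everything at once, e.g. when a frame is no longer in flight.
    void reset() { head_ = 0; }

private:
    std::uint64_t capacity_;
    std::uint64_t head_ = 0;
};
```

The returned offsets are what would be passed when binding a resource to a memory sub-region, or when binding the same buffer at an offset greater than 0 for vertex, index or uniform data.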

Page 43: Vulkan: the essentials - Nvidia

Shaders

• Vulkan uses SPIR-V, passed directly to the driver
• Can be compiled from GLSL via glslang or LunarG’s glslangValidator; Google ShaderC
  • Theoretically other languages could be compiled to SPIR-V…
• Libraries are available to compile GLSL to SPIR-V from the application
• NVIDIA allows passing GLSL directly

[Diagram] GLSL (through glslang) or some other language is compiled to SPIR-V. NVIDIA VK_NV_glsl_shader: Vulkan reads GLSL directly

Page 44: Vulkan: the essentials - Nvidia

Shaders

• Multiple entry points can be defined in a single SPIR-V shader-module
• Prevents redundant code: a shader module can be used by many Graphics-Pipelines
• Allows sharing snippets of code
• Easier to share common shader code

[Diagram] Module: vtx_main(), frag_plastic(), frag_metal(), frag_wood(), lighting(), rotate(…)

Graphics pipeline A: vtx_main(), frag_plastic()
Graphics pipeline B: vtx_main(), frag_metal()
Graphics pipeline C: vtx_main(), frag_wood()

Page 45: Vulkan: the essentials - Nvidia

Vulkan Window System Integration (WSI)

• WSI manages the ownership of images via a swap chain
• One image is presented while the other is rendered to
• WSI is a Vulkan Extension

[Diagram] Application, Swap Chain (images), WSI Display

• Acquire image from WSI: the application owns the image
• Submit image to WSI: the display owns the image

Page 46: Vulkan: the essentials - Nvidia

NVIDIA OpenGL Vulkan Interop

• Alternative to WSI: GL_NV_draw_vulkan_image
• Create an OpenGL Context and all the usual things
• Create Vulkan Device
• Rendering Loop involves both OpenGL and Vulkan
  • Blit the Vulkan image to the OpenGL backbuffer: glDrawVkImageNV
  • Extra care on synchronization (Semaphores)
• Bonus: Mix OpenGL rendering (UI overlay…) with Vulkan
• Allows smooth transition in projects

Page 47: Vulkan: the essentials - Nvidia

Pre-requisites to work with Vulkan

• LunarG (http://lunarg.com/)
  • Vulkan Loader (+Source code)
  • Tools: SPIR-V compiler for GLSL code and other libraries
  • Layers: intermediate code invoked by Vulkan API functions to help debug
  • Vulkan Includes
• Drivers:
  • GeForce Experience (latest is 364.51 for a fix)
  • https://developer.nvidia.com/vulkan-driver
• NVIDIA resources: https://developer.nvidia.com/Vulkan

Page 48: Vulkan: the essentials - Nvidia

Recap’ On NVIDIA-Specific Features

• Compatible GPUs for Vulkan: Kepler and Higher; Shield Tablet; Shield Android TV
• GLSL can be directly sent to Vulkan
• GL_NV_draw_vulkan_image can replace WSI
• 16 Queues. All available for any kind of use
• 2 frames in flight with WSI
• All Host memories are “Coherent” (except one for Tegra)
• Layout transitions don’t exist in our HW (VK_IMAGE_LAYOUT_GENERAL)
• Linear-Tiling only for 2D non-mipmapped textures
• Shaders never need re-compilation due to states in Graphics-pipeline

Page 49: Vulkan: the essentials - Nvidia


NSight for Vulkan

Page 50: Vulkan: the essentials - Nvidia

Recap’ on Vulkan Philosophy

• Validate as much as possible up-front (DescriptorSets; Pipelines…)
• The driver doesn’t waste time on figuring out how to set things up
• Reuse existing patterns of Graphics-Pipelines: cached pipelines
• Know your application: tailor the Vulkan design according to it
• Know your memory usage: you are in charge of optimal sub-allocations
• Explicit multi-threading for graphics: Application’s responsibility
• Explicit Resource updates: either through [non]Coherent buffers, or Queue-Based DMA transfers