SER3052BU

How VMware vSphere and NVIDIA GPUs Accelerate Your Organization

Raj Rao, NVIDIA GRID Product Management
Ziv Kalmanovich, vSphere ESXi Product Management

#VMworld #SER3052BU

VMworld 2017 Content: Not for publication or distribution


Disclaimer

• This presentation may contain product features that are currently under development.
• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
• Technical feasibility and market demand will affect final delivery.
• Pricing and packaging for any new technologies or features discussed or presented have not been determined.


Aligning To Your Strategic Priorities

Your Strategic IT Priorities
• Empower Digital Workspaces
• Transform Security
• Modernize Data Centers
• Integrate Public Clouds

VMware’s Vision
• Any Device: VMware Workspace ONE™ (Desktop, Mobile, Identity)
• Any Application: Traditional Apps, Cloud-Native Apps, SaaS Apps
• Any Cloud: Private Cloud, Hybrid Cloud, Public Cloud
• Software-Defined Data Center
• VMware Cross-Cloud Architecture™
• VMware Cloud Foundation™
• VMware vRealize® Cloud Management
• VMware Cross-Cloud Services™
• VMware Cloud Provider Partners


Universal App Platform

• Test / Dev / Tier 2/3
• Business-Critical Apps
• Desktop Virtualization
• 3D Graphics
• Big Data
• Cloud-Native Applications
• Deep Learning with GPU
• vSphere Integrated Containers
• SAP HANA


Virtualizing HPC, Big Data and ML Workloads with GPUs

Big Data & Analytics, AI and Deep Learning, High Performance Computing
• Agility
• Resiliency
• Security
• Near Native Performance
• Application Compatibility
• Efficiency

Traditional Enterprise Applications
• Agility
• Resiliency
• Security
• Efficiency


GPU Compute on vSphere with DirectPath IO – Benefits

[Diagram: vSphere hosts with GPUs assigned to VMs through DirectPath IO]

• Workload Isolation
• Reproducibility
• VM level QoS
• HW Isolation
• Near Bare Metal Performance


GPU Compute on vSphere with DirectPath IO – Machine Learning

[Diagram: vSphere hosts with GPUs assigned to VMs, labelled 1:1 and X:1; the VMs are used by CUDA developers and data scientists]


GPU Compute on vSphere with DirectPath IO – HPC and Big Data

[Diagram: researchers submit work to a master node; worker VMs run on vSphere hosts with GPUs assigned through DirectPath IO]


GPU Compute on vSphere with DirectPath IO – Benefits

[Diagram: vSphere hosts with GPUs assigned to VMs through DirectPath IO]

• Workload Isolation
• Reproducibility
• VM level QoS
• HW Isolation
• Near Bare Metal Performance

What’s Missing?

• GPU sharing
• GPU-accelerated VM resiliency
• GPU QoS
• GPU Resource Scheduling


AGENDA

• Why compute workloads are important to your data center
• What key requirements compute workloads bring
• How NVIDIA GPU virtualization enables you to host these workloads


WHY IS SUPPORT FOR COMPUTE WORKLOADS IMPORTANT


GPU ACCELERATED APPLICATIONS
BROAD RANGE OF INDUSTRIES TRANSFORMED

• Visual Computing
• Computational Fluid Dynamics
• Computational Finance
• Machine Learning
• Defense
• Computational Chemistry
• Electronic Design Automation
• Numerical Analytics
• Data Science
• Medical Imaging
• Computational Structural Mechanics
• Weather and Climate

For a complete list go to: http://www.nvidia.com/object/gpu-applications.html


THE EVOLUTION OF MODERN WORKFLOWS

VISUAL WORKSPACE

VISUAL COMPUTING SPECTRUM: from Information Workers/Students to Designers/Scientists

• COLLABORATION
• LARGE DATA
• INTERACTIVE
• HPC
• VR
• PHOTOREALISM
• AI
• MOBILITY


WHAT ARE THE KEY REQUIREMENTS TO CONSIDER


VIRTUALIZING MIXED WORKLOADS

[Diagram: a server with CPUs and an NVIDIA GPU runs a hypervisor with the NVIDIA vGPU manager; three virtual machines, each with Apps, a Guest OS and the NVIDIA Driver, are each backed by a vGPU]

Requirements

• Performance
• Guaranteed QoS
• Insight
• Fully Accelerate every Application


GRID AUGUST 2017 RELEASE


NVIDIA VIRTUAL GPU SW - AUGUST 2017 RELEASE

• PASCAL HW SUPPORT
• END TO END MANAGEMENT
• GPU SCHEDULER ADVANCEMENTS
• COMPUTE IN ALL vDWS PROFILES


UNDERSTANDING HARDWARE SPECS
Planning for Performance

Graphics Performance

Compute Performance

Video Memory

Peak TFLOPS – DP/SP

Memory Bandwidth

3DMark 11 - DX

SPECviewperf 12 - OGL

Decoding/Encoding

Frame Buffer Size

Scalability & Flexibility


GRID PERFORMANCE OPTIMIZED
for Virtual Data Center Workstations

                        TESLA M60                    TESLA P40
GPUs                    Dual GM204                   Single GP102
CUDA Cores              4,096 (2,048 per GPU)        3,840
Memory Size             16 GB GDDR5 (8 GB per GPU)   24 GB GDDR5
Form Factor             PCIe 3.0 Dual Slot           PCIe 3.0 Dual Slot
Thermal                 passive / active             passive
Power                   300W / 240W                  250W
Max Concurrent Users    32 (0.5GB FB)                24 (1GB FB)
Profile Options         0Q, 1Q, 2Q, 4Q, 8Q           1Q, 2Q, 3Q, 4Q, 6Q, 8Q, 12Q, 24Q
                        0B, 1B                       1B
                        0A, 1A, 2A, 4A, 8A           1A, 2A, 3A, 4A, 6A, 8A, 12A, 24A
H.264 1080p30 Streams   36                           24*
3DMark 11               13,732                       25,000*
SPECviewperf 12         62                           110*
SGEMM TFLOPS            2x 3.8                       10.6
Memory Bandwidth        2x 160 GB/s                  347 GB/s

* estimate

Callout: ~2x
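
The Max Concurrent Users figures above match a simple division of each board's total frame buffer by the smallest frame buffer (FB) size quoted next to it. A minimal sketch of that arithmetic, assuming frame buffer is the only limit on user density; the function name `max_concurrent_users` is hypothetical and not part of any NVIDIA tool:

```python
def max_concurrent_users(total_fb_gb: float, profile_fb_gb: float) -> int:
    """Upper bound on vGPU users per board if frame buffer is the only limit."""
    if profile_fb_gb <= 0:
        raise ValueError("profile frame buffer must be positive")
    # Each user holds a fixed slice of frame buffer, so only whole slices count.
    return int(total_fb_gb // profile_fb_gb)


# (total frame buffer in GB, smallest FB size in GB), as quoted on the slide
boards = {
    "Tesla M60": (16, 0.5),
    "Tesla P40": (24, 1),
}

for name, (total_fb, smallest_fb) in boards.items():
    print(name, max_concurrent_users(total_fb, smallest_fb))
```

The same function gives the density for larger profiles, for example three users per P40 with an 8 GB profile.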


STANDARD SCHEDULER
BEST EFFORT SCHEDULING

[Diagram: numbered tasks from VM 1, VM 2 and VM 3 pass through a round robin scheduler to the GPU engine; a chart shows the share of GPU cycles for VM1, VM2 and VM3]

• Timesliced Round Robin Scheduler
• Tasks generally execute within a timeslice
• Best Effort Scheduling


SCHEDULING LONG RUNNING TASKS
ROOT-CAUSE FOR QoS ISSUES

[Diagram: numbered tasks from VM 1, VM 2 and VM 3 pass through a round robin scheduler to the GPU engine; a long-running task from VM1 dominates the share of GPU cycles]

• Compute Tasks can be long running
• Round Robin Scheduler fails when a single task does not complete within a reasonable time
• Starves other VMs
• Injects the “noisy neighbor” symptom
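
The starvation effect described on this slide can be reproduced with a toy model. The sketch below is not NVIDIA's scheduler; it assumes a round robin in which each VM runs one whole task per turn and cannot be interrupted, and the task lengths are invented for illustration:

```python
from itertools import cycle


def best_effort_shares(task_ms, horizon_ms):
    """Share of GPU time per VM under run-to-completion round robin.

    task_ms maps a VM name to the length (ms) of the tasks it keeps
    submitting; every VM is assumed to always have work queued.
    """
    if horizon_ms <= 0:
        raise ValueError("horizon must be positive")
    if not task_ms or any(ms <= 0 for ms in task_ms.values()):
        raise ValueError("every VM needs a positive task length")
    busy = {vm: 0.0 for vm in task_ms}
    clock = 0.0
    for vm in cycle(task_ms):
        if clock >= horizon_ms:
            break
        # The task cannot be interrupted, so the VM keeps the GPU for the
        # task's full length (clipped only by the end of the window).
        run = min(task_ms[vm], horizon_ms - clock)
        busy[vm] += run
        clock += run
    return {vm: t / horizon_ms for vm, t in busy.items()}


# Hypothetical workload: VM1 submits 50 ms compute tasks, the others 2 ms tasks.
shares = best_effort_shares({"VM1": 50, "VM2": 2, "VM3": 2}, horizon_ms=540)
print({vm: round(s, 3) for vm, s in shares.items()})
```

With these numbers VM1 holds the GPU for roughly 93% of the window even though the scheduler visits every VM in turn, which is the "noisy neighbor" symptom.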


INTRODUCING: EQUAL SHARE SCHEDULER
GUARANTEE DETERMINISTIC QoS

[Diagram: numbered tasks from VM 1, VM 2 and VM 3 pass through the scheduler (Equal Share, Round Robin) to the GPU engine; a chart shows the share of GPU for VM1, VM2 and VM3]

• New Advanced Scheduling mode: Equal Share Scheduler (available on Pascal HW)
• Long-running tasks are preempted and their context is saved so that they resume when rescheduled
• Deterministic share of GPU cycles per VM
• All running vGPU-enabled VMs get an equal share of GPU cycles
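
The preempt-and-resume behaviour listed on this slide can be sketched as a toy simulation. This is an illustration under stated assumptions, not the actual Equal Share Scheduler: the timeslice length, the task lengths and the function name `equal_share_schedule` are all hypothetical, and the saved context is reduced to a task's remaining run time:

```python
from collections import deque


def equal_share_schedule(tasks_ms, timeslice_ms):
    """Preemptive round robin with a fixed timeslice per VM.

    tasks_ms maps a VM name to the run time (ms) of its current task.
    A task that outlives its timeslice is preempted; its remaining time
    stands in for the saved context and it resumes on the VM's next turn.
    Returns the timeline as (vm, start_ms, end_ms) tuples.
    """
    if timeslice_ms <= 0:
        raise ValueError("timeslice must be positive")
    ready = deque((vm, float(ms)) for vm, ms in tasks_ms.items() if ms > 0)
    clock = 0.0
    timeline = []
    while ready:
        vm, remaining = ready.popleft()
        run = min(remaining, timeslice_ms)
        timeline.append((vm, clock, clock + run))
        clock += run
        if remaining > run:
            # Preempted: requeue the unfinished part behind the other VMs.
            ready.append((vm, remaining - run))
    return timeline


# Hypothetical workload: one 50 ms task next to two 4 ms tasks, 2 ms timeslice.
timeline = equal_share_schedule({"VM1": 50, "VM2": 4, "VM3": 4}, timeslice_ms=2)
finish = {vm: end for vm, _, end in timeline}
print(finish)
```

While all three VMs have work queued, each receives one timeslice per round, so the short tasks on VM2 and VM3 finish after 10 ms and 12 ms instead of waiting behind the whole 50 ms task.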


END-TO-END MANAGEMENT
Taking GPU visibility to a new level with application monitoring

Guest monitoring

Host monitoring

New

End user experience monitoring

Accurate sizing

Performance troubleshooting

Session monitoring

App monitoring


GPU ACCELERATED APPLICATIONS

- CUDA 9.0

- OCL 2.0

- Quadro Value-Add

- Vulkan 1.0

- Shader Model 5.0

- OGL 4.5

- DX 9, 10, 11, 12

CUDA enabled app

GRID Virtual PC

Quadro Virtual Data Center Workstation


NVIDIA GRID GPU VIRTUALIZATION PLATFORM
Industry standard virtualization platform

Platform layers
• NVIDIA Tesla GPU: M60, M6, M10 (graphics/sharing only); P40, P6, P100, P4
• NVIDIA Virtualization Software
• Hypervisor

Workloads
• vPC
• Quadro virtual Data Center Workstation
• Rendering
• HPC
• AI

• Data Center and/or Cloud Accessible
• vGPU Monitoring, Insight and Management
• CUDA Compute Support (Workstation Apps, Rendering, HPC, DL, AI)


WRAP UP

- GPU Accelerated Apps are transforming a Broad Range of Industries
- Virtualizing Mixed Workloads in a multi-tenant environment requires:
  - Performance
  - Deterministic QoS
  - Insight
  - Full Acceleration for every Application
- NVIDIA GPU Virtualization Platform delivers key requirements to host mixed workloads


Vision – Extend all vSphere Benefits to NVIDIA GRID™ vGPU

[Diagram: vSphere presents vGPUs, backed by NVIDIA GRID™ GPUs, as shared resources alongside CPU and memory]

• Suspend & Resume
• Snapshots
• vMotion
• DRS

Roadmap

See @booths: vSphere Cloud Platform - New Workloads, EUC 3D Experience, NVIDIA GRID

The information in this presentation is intended to outline our general product direction and should not be relied on in making a purchasing decision. It is for informational purposes only and may not be incorporated into any contract.


Considered Milestones for VMware vSphere with NVIDIA GRID

[Timeline: Suspend & Resume, Snapshots, vSphere vMotion, vSphere DRS; one milestone is marked Tech Preview and three are marked Roadmap]

Use cases: Virtual PC, Virtual Workstation, High Performance Computing, Machine Learning

See @booths: VMW Cloud Platform - New Workloads, VMW EUC 3D Experience, NVIDIA GRID


vSphere with NVIDIA GRID – Simplified Maintenance

[Diagram: two VMware vSphere hosts with NVIDIA GRID GPUs; VMs with vGPUs, used by CUDA developers and data scientists, and a Remediate step on one host]

Roadmap

See @booths: VMW Cloud Platform - New Workloads, VMW EUC 3D Experience, NVIDIA GRID


vSphere with NVIDIA GRID – 24h Utilization

[Diagram: a VMware vSphere host with NVIDIA GRID GPUs; VMs with vGPUs used by CUDA developers and data scientists]

• Develop/VDI/Inference by day
• ML Training by night
• Same Infrastructure

Tech Preview

See @Booths: VMW Cloud Platform - New Workloads, VMW EUC, NVIDIA


Vision - Extend vSphere Benefits to NVIDIA GPUs with DirectPath IO (passthrough) GPU workloads

[Diagram: vSphere presents GPUs as shared resources alongside CPU and memory; labels: Suspend & Resume, Snapshots, vMotion, DRS, DRS Placement, vSphere HA]

Roadmap


Overview - Considered Milestones for all NVIDIA GPU Enablement

• Suspend & Resume for NVIDIA GRID
• Snapshots for NVIDIA GRID
• vSphere vMotion for NVIDIA GRID
• vSphere DRS for NVIDIA GRID (Roadmap)
• vSphere DRS for DirectPath IO GPU (Roadmap)
• vSphere HA for DirectPath IO GPU (Roadmap)

[Timeline: of the first three milestones, one is marked Tech Preview and two are marked Roadmap]

Use cases: Virtual PC (VDI), Virtual Workstation, High Performance Computing, Machine Learning

See @booths: VMW Cloud Platform - New Workloads, VMW EUC 3D Experience, NVIDIA GRID


Introducing vSphere Scale-Out for Big Data and HPC Workloads

New package that provides all the core features required for scale-out workloads at an attractive price point

• Features: Hypervisor, vMotion, vShield Endpoint, Storage vMotion, Storage APIs, Distributed Switch, I/O Controls & SR-IOV, Host Profiles / Auto Deploy and more
• Packaging: Sold in packs of 8 CPUs at a cost-effective price point
• Licensing: EULA enforced for use with Big Data/HPC/ML workloads only
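
Because the package is sold in packs of 8 CPUs, sizing a cluster means rounding up to whole packs. A minimal sketch of that calculation, assuming that "CPU" means a physical processor socket and that partial packs cannot be bought; neither point is stated on the slide, and `packs_needed` is a hypothetical helper:

```python
PACK_SIZE = 8  # CPUs per pack, taken from the Packaging bullet


def packs_needed(cpu_count: int) -> int:
    """Smallest number of whole packs that covers cpu_count CPUs."""
    if cpu_count < 0:
        raise ValueError("cpu_count must not be negative")
    return -(-cpu_count // PACK_SIZE)  # ceiling division on integers


# Hypothetical cluster: 20 dual-socket hosts.
print(packs_needed(20 * 2))
```

A cluster one socket larger than a multiple of eight needs a full extra pack, which is worth checking before adding hosts.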


Value of vSphere Scale-Out for Big Data and HPC

34

Flexibility & Agility

• Infrastructure on demand

• Iterate faster

• Scale out more rapidly

• Multi-tenancy enables multiple different distros on the same set of servers

Operational Efficiency

• CapEx and OpEx Savings

• Cluster Consolidation

• Increase Server Utilization

Reduced Complexity

• Simple operations using tools that IT is familiar with

• Live workload mobility for Master nodes

• Reference architecture and best practices

Data Governance and Control of Sensitive Data

• Host and VM security for your customer data

• Security isolation

• Hypervisor Guests have low privileges by default

Faster time to results and insights at a lower cost


Key Takeaways

• NVIDIA GRID and VMware vSphere provide the operational benefits of virtualization with near native performance (95%) for GPU accelerated HPC, Big Data and Machine Learning

• VMware's vision is seamless integration of NVIDIA GPU technologies as native resources of VM infrastructure

• New vSphere Scale-Out SKU; new package with attractive price point for Big Data/HPC/ML dedicated infrastructure virtualization. http://blogs.vmware.com/vsphere/2017/09/vsphere-scale-now-available.html


Contact us! We’d like to learn about your use cases and challenges.

Raj Rao, NVIDIA GRID Product Management – [email protected]
Ziv Kalmanovich, vSphere Product Management – [email protected]


Recommended Additional Resources at VMworld

Expo Booths
• VMware Cloud Platform New Workloads
• VMware End User Computing 3D Experience
• NVIDIA

GPU Sessions @VMworld
• GPU Enabled Linux VDI [VMTN6636U]
• Machine Learning and Deep Learning on VMware vSphere: GPUs Are Invading the Software-Defined Data Center [VIRT1997BU]
• Empowering the digital workspace: balancing tomorrow’s trends with today’s needs [UEM3332PUS]
• Wringing Maximum Performance from vSphere for Extremely Demanding Workloads and Customers [FUT2020BU]
