Raj Rao, NVIDIA GRID Product Management
Ziv Kalmanovich, vSphere ESXi Product Management
SER3052BU
#VMworld #SER3052BU
How VMware vSphere and NVIDIA GPUs Accelerate Your Organization
VMworld 2017 Content: Not for publication or distribution
Disclaimer
• This presentation may contain product features that are currently under development.
• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
• Technical feasibility and market demand will affect final delivery.
• Pricing and packaging for any new technologies or features discussed or presented have not been determined.
Aligning To Your Strategic Priorities
Your Strategic IT Priorities
• Empower Digital Workspaces
• Transform Security
• Modernize Data Centers
• Integrate Public Clouds
VMware’s Vision
• Any Device: VMware Workspace ONE™ (Desktop, Mobile, Identity)
• Any Application: Traditional Apps, Cloud-Native Apps, SaaS Apps
• Any Cloud: Private Cloud, Hybrid Cloud, Public Cloud
• Software-Defined Data Center
• VMware Cross-Cloud Architecture™
• VMware Cloud Foundation™
• VMware vRealize® Cloud Management
• VMware Cloud Provider Partners
• VMware Cross-Cloud Services™
Your Strategic IT Priorities: Integrate Public Clouds, Modernize Data Centers, Transform Security, Empower Digital Workspaces
Universal App Platform
• Test / Dev / Tier 2/3
• Business-Critical Apps
• Desktop Virtualization
• 3D Graphics
• Big Data
• Cloud-Native Applications
• Deep Learning with GPU
• vSphere Integrated Containers
• SAP HANA
Virtualizing HPC, Big Data and ML Workloads with GPUs
Big Data & Analytics, AI and Deep Learning, High Performance Computing
• Agility
• Resiliency
• Security
• Near Native Performance
• Application Compatibility
• Efficiency
Traditional Enterprise Applications
• Agility
• Resiliency
• Security
• Efficiency
GPU Compute on vSphere with DirectPath IO – Benefits
[Diagram: VMs on vSphere hosts, each host with multiple GPUs attached through DirectPath IO]
• Workload Isolation
• Reproducibility
• VM level QoS
• HW Isolation
• Near Bare Metal Performance
GPU Compute on vSphere with DirectPath IO – Machine Learning
[Diagram: VMs on vSphere hosts with GPUs; labels: 1:1, X:1; users: CUDA Developers, Data Scientists]
GPU Compute on vSphere with DirectPath IO – HPC and Big Data
[Diagram: Researchers using a Master node VM and multiple Worker VMs on vSphere hosts with GPUs]
GPU Compute on vSphere with DirectPath IO – Benefits
• Workload Isolation
• Reproducibility
• VM level QoS
• HW Isolation
• Near Bare Metal Performance
What’s Missing?
• GPU sharing
• GPU-accelerated VM Resiliency
• GPU QoS
• GPU Resource Scheduling
AGENDA
• Why are Compute Workloads important to your Data Center?
• What are the key requirements that Compute Workloads bring?
• How does NVIDIA GPU Virtualization enable you to host these Workloads?
WHY IS SUPPORT FOR COMPUTE WORKLOADS IMPORTANT
GPU ACCELERATED APPLICATIONS
BROAD RANGE OF INDUSTRIES TRANSFORMED
• Visual Computing
• Computational Fluid Dynamics
• Computational Finance
• Machine Learning
• Defense
• Computational Chemistry
• Electronic Design Automation
• Numerical Analytics
• Data Science
• Medical Imaging
• Computational Structural Mechanics
• Weather and Climate
For a complete list go to: http://www.nvidia.com/object/gpu-applications.html
THE EVOLUTION OF MODERN WORKFLOWS
VISUAL COMPUTING SPECTRUM: Information Workers/Students to Designers/Scientists
VISUAL WORKSPACE
• COLLABORATION
• LARGE DATA
• INTERACTIVE
• HPC
• VR
• PHOTOREALISM
• AI
• MOBILITY
WHAT ARE THE KEY REQUIREMENTS TO CONSIDER
VIRTUALIZING MIXED WORKLOADS
[Diagram: a Server with CPUs and an NVIDIA GPU; a Hypervisor running the NVIDIA vGPU manager; three Virtual Machines, each with a Guest OS, an NVIDIA Driver, a vGPU, and Apps]
Requirements
• Performance
• Guaranteed QoS
• Insight
• Fully Accelerate every Application
GRID AUGUST 2017 RELEASE
NVIDIA VIRTUAL GPU SW - AUGUST 2017 RELEASE
• PASCAL HW SUPPORT
• END TO END MANAGEMENT
• GPU SCHEDULER ADVANCEMENTS
• COMPUTE IN ALL vDWS PROFILES
UNDERSTANDING HARDWARE SPECS
Planning for Performance
• Graphics Performance: 3DMark 11 - DX, SPECviewperf 12 - OGL, Decoding/Encoding
• Compute Performance: Peak TFLOPS – DP/SP, Memory Bandwidth
• Video Memory: Frame Buffer Size, Scalability & Flexibility
GRID PERFORMANCE OPTIMIZED for Virtual Data Center Workstations

Spec                  | TESLA M60                   | TESLA P40
GPUs                  | Dual GM204                  | Single GP102
CUDA Cores            | 4,096 (2,048 per GPU)       | 3,840
Memory Size           | 16 GB GDDR5 (8 GB per GPU)  | 24 GB GDDR5
Form Factor           | PCIe 3.0 Dual Slot          | PCIe 3.0 Dual Slot
Thermal               | passive / active            | passive
Power                 | 300W / 240W                 | 250W
Max Concurrent Users  | 32 (0.5GB FB)               | 24 (1GB FB)
Profile Options (Q)   | 0Q, 1Q, 2Q, 4Q, 8Q          | 1Q, 2Q, 3Q, 4Q, 6Q, 8Q, 12Q, 24Q
Profile Options (B)   | 0B, 1B                      | 1B
Profile Options (A)   | 0A, 1A, 2A, 4A, 8A          | 1A, 2A, 3A, 4A, 6A, 8A, 12A, 24A
H.264 1080p30 Streams | 36                          | 24*
3DMark 11             | 13,732                      | 25,000*
SPECviewperf 12       | 62                          | 110*
SGEMM TFLOPS          | 2x 3.8                      | 10.6
Memory Bandwidth      | 2x 160 GB/s                 | 347 GB/s

~2x
* estimate
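The Max Concurrent Users figures are consistent with dividing each physical GPU's frame buffer by the frame buffer of the smallest profile. A minimal Python sketch of that arithmetic follows. It assumes frame buffer is the only limit and that a vGPU cannot span physical GPUs; the function name and parameters are illustrative and not part of any NVIDIA tool.

```python
def max_vgpus_per_board(gpus_per_board, fb_per_gpu_gb, profile_fb_gb):
    """Upper bound on vGPU instances per board for one profile size.

    Assumption: each vGPU lives entirely on one physical GPU, so the
    division is done per GPU and then multiplied by the GPU count.
    """
    per_gpu = int(fb_per_gpu_gb // profile_fb_gb)
    return gpus_per_board * per_gpu


# Tesla M60: two GPUs with 8 GB each, smallest profile 0.5 GB (0Q/0B/0A)
m60_users = max_vgpus_per_board(gpus_per_board=2, fb_per_gpu_gb=8, profile_fb_gb=0.5)

# Tesla P40: one GPU with 24 GB, smallest profile 1 GB (1Q/1B/1A)
p40_users = max_vgpus_per_board(gpus_per_board=1, fb_per_gpu_gb=24, profile_fb_gb=1)

# Larger profiles reduce density, e.g. the 8Q profile on a P40
p40_8q_users = max_vgpus_per_board(gpus_per_board=1, fb_per_gpu_gb=24, profile_fb_gb=8)

print(m60_users, p40_users, p40_8q_users)
```

The first two results match the Max Concurrent Users row in the table. Real deployments may hit other limits first, such as CPU, host memory, or licensing, so treat the result as an upper bound for sizing.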
STANDARD SCHEDULER
BEST EFFORT SCHEDULING
[Diagram: VM 1, VM 2 and VM 3 feed a Round Robin Scheduler that dispatches tasks to the GPU Engine; chart: SHARE OF GPU CYCLES for VM1, VM2, VM3]
• Timesliced Round Robin Scheduler
• Tasks generally execute within a timeslice
• Best Effort Scheduling
SCHEDULING LONG RUNNING TASKS
ROOT-CAUSE FOR QoS ISSUES
[Diagram: VM 1, VM 2 and VM 3 feed a Round Robin Scheduler that dispatches tasks to the GPU Engine; chart: SHARE OF GPU CYCLES, VM1]
• Compute Tasks can be long running
• Round Robin Scheduler fails when a single task does not complete within a reasonable time
• Starves other VMs
• Injects the “noisy neighbor” symptom
INTRODUCING: EQUAL SHARE SCHEDULER
GUARANTEE DETERMINISTIC QoS
• New Advanced Scheduling mode: Equal Share Scheduler (available on Pascal HW)
• Long-running tasks are preempted and their context is saved, to be resumed when rescheduled
• Deterministic share of GPU cycles per VM
• All running vGPU-enabled VMs get an equal share of GPU cycles
[Diagram: VM 1, VM 2 and VM 3 feed an Equal Share Round Robin Scheduler that dispatches tasks to the GPU Engine; chart: SHARE OF GPU for VM1, VM2, VM3]
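The difference between best-effort round robin and equal-share scheduling can be illustrated with a toy simulation. The Python sketch below is only a model of the behavior described on these slides, not NVIDIA's scheduler: the timeslice length, task lengths, and the `simulate` function are all hypothetical. With `preempt=False` a task that overruns its timeslice runs to completion; with `preempt=True` it is cut off at the end of the timeslice and resumed on the VM's next turn.

```python
def simulate(task_ms, timeslice_ms, horizon_ms, preempt):
    """Return each VM's fraction of GPU time over the horizon.

    task_ms maps a VM name to the length (in ms) of the tasks it
    submits back to back. All times are integers in milliseconds.
    """
    used = {vm: 0 for vm in task_ms}
    remaining = dict(task_ms)  # work left in each VM's current task
    clock = 0
    while clock < horizon_ms:
        for vm in task_ms:  # visit VMs in round-robin order
            budget = timeslice_ms
            while budget > 0 and clock < horizon_ms:
                run = remaining[vm]
                if preempt:
                    run = min(run, budget)  # cut the task at the slice end
                run = min(run, horizon_ms - clock)
                used[vm] += run
                clock += run
                budget -= run
                remaining[vm] -= run
                if remaining[vm] <= 0:
                    remaining[vm] = task_ms[vm]  # VM submits its next task
    return {vm: used[vm] / horizon_ms for vm in used}


# VM1 submits long compute tasks; VM2 and VM3 submit short ones.
tasks = {"VM1": 40, "VM2": 1, "VM3": 1}
best_effort = simulate(tasks, timeslice_ms=2, horizon_ms=132, preempt=False)
equal_share = simulate(tasks, timeslice_ms=2, horizon_ms=132, preempt=True)

print("best effort:", best_effort)
print("equal share:", equal_share)
```

In this model the long-running VM1 takes most of the GPU time under best-effort scheduling, which is the noisy-neighbor effect, while preemption gives all three VMs the same share.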
END-TO-END MANAGEMENT
Taking GPU visibility to a new level with application monitoring
Guest monitoring
Host monitoring
New
End user experience monitoring
Accurate sizing
Performance troubleshooting
Session monitoring
App monitoring
GPU ACCELERATED APPLICATIONS
- CUDA 9.0
- OCL 2.0
- Quadro Value-Add
- Vulkan 1.0
- Shader Model 5.0
- OGL 4.5
- DX 9, 10, 11, 12
CUDA enabled app
GRID Virtual PC
Quadro Virtual Data Center Workstation
NVIDIA GRID GPU VIRTUALIZATION PLATFORM
Industry standard virtualization platform
• Workloads: vPC, Quadro virtual Data Center Workstation, Rendering, HPC, AI
• NVIDIA Virtualization Software: vGPU Monitoring, Insight and Management; CUDA Compute Support (Workstation Apps, Rendering, HPC, DL, AI)
• Hypervisor
• NVIDIA Tesla GPU: M60, M6, M10 (graphics/sharing only); P40, P6, P100, P4
• Data Center and/or Cloud Accessible
WRAP UP
- GPU Accelerated Apps are transforming a Broad Range of Industries
- Virtualizing Mixed Workloads in a multi-tenant environment requires:
  - Performance
  - Deterministic QoS
  - Insight
  - Full Acceleration for every Application
- The NVIDIA GPU Virtualization Platform delivers the key requirements to host mixed workloads
Vision – Extend all vSphere Benefits to NVIDIA GRID™ vGPU
Roadmap
• vSphere benefits: DRS, vMotion, Snapshots, Suspend & Resume
• Shared Resources: CPU, Mem, vGPU (NVIDIA GRID™ on GPU)
See @booths: vSphere Cloud Platform - New Workloads, EUC 3D Experience, NVIDIA GRID
The information in this presentation is intended to outline our general product direction and should not be relied on in making a purchasing decision. It
is for informational purposes only and may not be incorporated into any contract.
Considered Milestones for VMware vSphere with NVIDIA GRID
• Features: Suspend & Resume, vSphere vMotion, vSphere DRS, Snapshots
• Status labels: Roadmap, Roadmap, Roadmap, Tech Preview
• Use cases: Virtual PC, Virtual Workstation, High Performance Computing, Machine Learning
See @booths: VMW Cloud Platform - New Workloads, VMW EUC 3D Experience, NVIDIA GRID
vSphere with NVIDIA GRID – Simplified Maintenance
[Diagram: two VMware vSphere hosts with NVIDIA GRID GPUs; vGPU-backed VMs for CUDA Developers and Data Scientists; labels: Remediate, Roadmap]
See @booths: VMW Cloud Platform - New Workloads, VMW EUC 3D Experience, NVIDIA GRID
vSphere with NVIDIA GRID – 24h Utilization
[Diagram: a VMware vSphere host with NVIDIA GRID GPUs; vGPU-backed VMs for CUDA Developers and Data Scientists]
• Develop/VDI/Inference by day
• ML Training by night
• Same Infrastructure
Tech Preview
See @booths: VMW Cloud Platform - New Workloads, VMW EUC, NVIDIA
Vision - Extend vSphere Benefits to NVIDIA GPUs with DirectPathIO (passthrough) GPU workloads
Roadmap
• vSphere benefits: DRS, vMotion, Snapshots, Suspend & Resume
• Shared Resources: CPU, Mem, GPU
• DRS Placement
• vSphere HA
Overview - Considered Milestones for all NVIDIA GPU Enablement
• Suspend & Resume for NVIDIA GRID, vSphere vMotion for NVIDIA GRID, Snapshots for NVIDIA GRID (status labels: Tech Preview, Roadmap, Roadmap)
• vSphere DRS for NVIDIA GRID: Roadmap
• vSphere DRS for DirectPath IO GPU: Roadmap
• vSphere HA for DirectPath IO GPU: Roadmap
• Use cases: Virtual PC (VDI), Virtual Workstation, High Performance Computing, Machine Learning
See @booths: VMW Cloud Platform - New Workloads, VMW EUC 3D Experience, NVIDIA GRID
Introducing vSphere Scale-Out for Big Data and HPC Workloads
New package that provides all the core features required for scale-out workloads at an attractive price point
• Features: Hypervisor, vMotion, vShield Endpoint, Storage vMotion, Storage APIs, Distributed Switch, I/O Controls & SR-IOV, Host Profiles / Auto Deploy and more
• Packaging: Sold in Packs of 8 CPU at a cost-effective price point
• Licensing: EULA enforced for use w/ Big Data/HPC/ML workloads only
Value of vSphere Scale-Out for Big Data and HPC
Flexibility & Agility
• Infrastructure on demand
• Iterate faster
• Scale out more rapidly
• Multi-tenancy enables multiple different distros on the same set of servers
Operational Efficiency
• CapEx and OpEx Savings
• Cluster Consolidation
• Increase Server Utilization
Reduced Complexity
• Simple operations using tools that IT is familiar with
• Live workload mobility for Master nodes
• Reference architecture and best practices
Data Governance and Control of Sensitive Data
• Host and VM security for your customer data
• Security isolation
• Hypervisor Guests have low privileges by default
Faster time to results and insights at a lower cost
Key Takeaways
• NVIDIA GRID and VMware vSphere provide the operational benefits of virtualization with near native performance (95%) for GPU accelerated HPC, Big Data and Machine Learning
• VMware's vision is seamless integration of NVIDIA GPU technologies as native resources of VM infrastructure
• New vSphere Scale-Out SKU: a new package with an attractive price point for dedicated Big Data/HPC/ML infrastructure virtualization. http://blogs.vmware.com/vsphere/2017/09/vsphere-scale-now-available.html
Contact us! We’d like to learn about your use cases and challenges.
Raj Rao, NVIDIA GRID Product Management – [email protected]
Ziv Kalmanovich, vSphere Product Management – [email protected]
Recommended Additional Resources at VMworld
Expo Booths
• VMware Cloud Platform New Workloads
• VMware End User Computing 3D Experience
• NVIDIA
GPU Sessions @VMworld
• GPU Enabled Linux VDI [VMTN6636U]
• Machine Learning and Deep Learning on VMware vSphere: GPUs Are Invading the Software-Defined Data Center [VIRT1997BU]
• Empowering the digital workspace: balancing tomorrow’s trends with today’s needs [UEM3332PUS]
• Wringing Maximum Performance from vSphere for Extremely Demanding Workloads and Customers [FUT2020BU]