VMware vSphere Fault Tolerance for Multiprocessor Virtual Machines— Technical Preview

Jim Chow, VMware, Inc.

Shrinand Javadekar, VMware, Inc.

INF-BCO2655

#vmworldinf

Disclaimer

This session may contain product features that are currently under development.

This session/overview of the new technology represents no commitment from VMware to deliver these features in any generally available product.

Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

Technical feasibility and market demand will affect final delivery.

Pricing and packaging for any new technologies or features discussed or presented have not been determined.

Agenda

• vSphere Availability Portfolio
• Why Fault Tolerance
• Multiprocessor Fault Tolerance details
• Live Demo
• Performance Numbers
• Questions

43% of companies experiencing disasters never re-open, and 29% close within two years.

(McGladrey and Pullen)

93% of businesses that lost their data center for 10 days went bankrupt within one year.

(National Archives & Records Administration)

Top executives say 10 hours to recovery; IT managers say up to 30 hours.

(Harris Interactive)

Disasters Happen. Do You Need Protection?

Do you need protection?

Server failures happen
• Google released some data about their server failures
• 2% to 4% of servers fail; 1% to 5% of disk drives crash
• 20 rack failures: 40-80 machines instantly disappeared
• 1-6 hours to get back

Sources

• http://content.dell.com/us/en/gen/d/large-business/google-data-center

vSphere Offers Protection at Every Level

• Component: NIC Teaming, Storage Multipathing
• Server: High Availability, Fault Tolerance, vMotion, DRS
• Storage: Storage vMotion
• Data: Backup Solutions
• Site: Site Recovery Manager

• Protection against hardware failures
• Planned maintenance with zero downtime
• Protection against unplanned downtime and disasters

vSphere Availability Portfolio

[Figure: availability features plotted by Coverage (Hardware, Guest OS, Application) against Downtime (none to minutes). Features shown: Fault Tolerance, App Monitoring APIs, Guest Monitoring, VM Infrastructure HA.]

Why Fault Tolerance?

Continuous Availability
• Zero downtime
• Zero data loss
• No loss of TCP connections
• Completely transparent to guest software
• No dependency on Guest OS, applications
• No application-specific management and learning

Background

2009: vSphere Fault Tolerance in vSphere 4.0
2010: Updates to vSphere Fault Tolerance in vSphere 4.1
2011: Updates to vSphere Fault Tolerance in vSphere 5.0
Details: http://www.vmware.com/products/fault-tolerance/

Problem:
• FT only for uni-processor VMs
• Is FT for multi-processor VMs possible?
• An impressively hard problem
• Concerted effort to find an approach

Reached a key milestone
• We’d like to share it

A Starting Point: vSphere FT

[Figure: two vSphere ESX hosts (Primary and Secondary) connected by FT logging, with shared VMDKs and vLockstep.]

A Clean Slate

[Figure, shown in two builds: the Primary and Secondary vSphere ESX hosts, first with FT logging and shared VMDKs but no vLockstep, then with FT logging only.]

Next: FT in practice

Turning on Multiprocessor FT

Creating two VMs

A new VM, but identical configuration
• vRAM, # vCPUs, vNICs, etc.

Each VM owns a complete set of VM files
• Separate vmdks completely owned by each VM

[Figure: Primary VM and Secondary VM, each with its own Config, Disk 1, and Disk 2.]
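
The slide's two points (identical configuration, separate files) can be sketched in a few lines of Python. This is only an illustration: `VMSpec`, `make_secondary`, the field names, and the paths are hypothetical and are not vSphere API types or calls.

```python
from dataclasses import dataclass, replace
from pathlib import PurePosixPath


@dataclass(frozen=True)
class VMSpec:
    """Hypothetical description of a VM (not a vSphere API type)."""
    name: str
    vram_mb: int
    num_vcpus: int
    num_vnics: int
    config_path: PurePosixPath
    disk_paths: tuple


def make_secondary(primary: VMSpec, secondary_dir: PurePosixPath) -> VMSpec:
    """Copy the virtual hardware settings, but point every file at a new directory."""
    return replace(
        primary,
        name=primary.name + "-secondary",
        config_path=secondary_dir / primary.config_path.name,
        disk_paths=tuple(secondary_dir / p.name for p in primary.disk_paths),
    )


def same_hardware(a: VMSpec, b: VMSpec) -> bool:
    """True when vRAM, vCPU count, and vNIC count all match."""
    return (a.vram_mb, a.num_vcpus, a.num_vnics) == (b.vram_mb, b.num_vcpus, b.num_vnics)


def shares_files(a: VMSpec, b: VMSpec) -> bool:
    """True when the two VMs have any config or disk file in common."""
    files_a = {a.config_path, *a.disk_paths}
    files_b = {b.config_path, *b.disk_paths}
    return bool(files_a & files_b)


# Made-up example values.
primary = VMSpec(
    name="db01",
    vram_mb=8192,
    num_vcpus=4,
    num_vnics=2,
    config_path=PurePosixPath("/ds1/db01/db01.vmx"),
    disk_paths=(
        PurePosixPath("/ds1/db01/disk1.vmdk"),
        PurePosixPath("/ds1/db01/disk2.vmdk"),
    ),
)
secondary = make_secondary(primary, PurePosixPath("/ds2/db01-secondary"))
```

Here `same_hardware(primary, secondary)` holds while `shares_files(primary, secondary)` does not, which is the property the slide describes.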

Datastores

One datastore must be common
• Ensures only one running copy of the VM at any time

[Figure: Primary VM and Secondary VM, each with its own Config, Disk 1, and Disk 2, plus a shared Tie Break Datastore.]
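
The slides do not say how the tie break works. As a rough sketch of the principle only, a location both hosts can reach can act as an arbiter: whichever host atomically creates a claim file first is the one allowed to run the VM. The function names and the claim-file name below are hypothetical, and the sketch assumes the shared storage supports atomic exclusive file creation.

```python
import os
import tempfile


def try_claim(tiebreak_dir: str, host: str) -> bool:
    """Try to claim the right to run the VM; only the first caller succeeds."""
    claim = os.path.join(tiebreak_dir, "running.claim")  # hypothetical file name
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so two hosts
        # cannot both create it.
        fd = os.open(claim, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(host)
    return True


def current_owner(tiebreak_dir: str) -> str:
    """Return the name of the host that holds the claim."""
    with open(os.path.join(tiebreak_dir, "running.claim")) as f:
        return f.read()


# A temporary directory stands in for the common datastore.
shared = tempfile.mkdtemp()
first = try_claim(shared, "host-a")
second = try_claim(shared, "host-b")
```

If the two hosts lose contact with each other, at most one of them can win the claim, so at most one copy of the VM keeps running.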

Initial placement of secondary: not tied to the host

• Intel vs AMD
• vMotion compatible

Co-located on single Datastore by default

All done!

Live Demo

VMware vCenter Server

• Central management server
• Continuous availability difficult
• Multiprocessor FT makes it simple
• Natural fit

Live Failover

vSphere Web Client

Continuous availability through server failure

vCenter Server

Backing up FT VMs

Support for vStorage APIs for Data Protection (VADP)
• API for non-disruptive snapshots

Many VADP solutions on the market

Live Demo Summary

FT in action
• Principles to keep in mind
• Doing backups of FT VMs

Ensure continuous availability of multiprocessor workloads
• Presented a good solution
• Client oblivious to FT operation
• Zero downtime, zero data loss
• Taste for performance / bandwidth

But that’s not all

Performance Numbers

[Figure: bar chart of % Throughput (FT/non FT), higher is better, on a scale of 0 to 100, for four workloads: Microsoft SQL Server 2-vCPU, Microsoft SQL Server 4-vCPU, Oracle Swingbench 2-vCPU, and Oracle Swingbench 4-vCPU.]

Similar configuration to vSphere 4 FT Performance Whitepaper
• Models real-world workloads: 60% CPU utilization
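
The chart's metric, % Throughput (FT/non FT), is a simple ratio. A minimal sketch of the arithmetic follows; the function name is made up and the numbers are placeholders, not results from the session.

```python
def ft_throughput_percent(ft_throughput: float, non_ft_throughput: float) -> float:
    """Throughput with FT enabled as a percentage of throughput without FT."""
    if non_ft_throughput <= 0:
        raise ValueError("non-FT throughput must be positive")
    return 100.0 * ft_throughput / non_ft_throughput


# Placeholder values: 450 transactions/s with FT, 500 without.
example = ft_throughput_percent(450, 500)
print(example)
```

A value close to 100 means that enabling FT costs the workload little throughput.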

vSphere FT Summary

Why Fault Tolerance
• Continuous availability

Fault Tolerance for multi-processor VMs
• Good solution to impressively hard problem
• A new design
• Demonstrated similar experience to existing vSphere FT
• But more vCPUs

Thank you!

Questions?
