Hyper-V High-Availability Best Practices with Failover Clustering
Symon Perriman, @SymonPerriman, Symon@5nine.com
Thomas Maurer, @ThomasMaurer, www.ThomasMaurer.ch


Agenda

• Hyper-V Cluster Planning
• Hyper-V Cluster Optimization
• Windows Server 2012 R2 Clustering and Hyper-V
• System Center 2012 R2 Clustering and Hyper-V


Hyper-V Cluster Planning


High Availability throughout the Datacenter

Hardware High Availability
• Servers, storage, networking, etc.

Workload High Availability
• Host Failover Clustering for VMs & Roles

Guest Application High Availability
• Guest Failover Clustering for apps within VMs

VM Storage High Availability
• Scale Out File Server Failover Clustering

Management High Availability
• System Center

Site High Availability
• Hyper-V Replica & Multi-Site Clustering

[Diagram labels: Guest Failover Cluster; Host Failover Cluster; File Server Failover Cluster; Hardware; System Center & Clustering; Multiple Datacenters]


Hardware High Availability

Redundancy everywhere

Server
• Redundant server roles (AD, DHCP, DNS, etc.)
• System: Hot swapping, BMC sensors, power protection
• Processor: Instruction error detection, instruction retry, lock-step processors, machine check architecture, extended precision
• Memory: Windows Hardware Error Architecture (WHEA), parity bits, error correcting code, memory scrubbing, bad page offloading

Storage
• Multi-Path IO (MPIO), RAID, checksums, background scrubbing, resilient file systems

Networking
• Multiple networks, NIC Teaming, Load Balancing (NLB), Multi-Channel SMB


Workload High Availability

Physical servers create a failover cluster

Survive Host Crashes
• VMs restarted on another node

Restart VM Crashes
• VM OS restarted on same node

Recover VM Hangs
• VM OS restarted on same node

Zero Downtime Maintenance & Patching
• Live migrate VMs to other hosts

Mobility & Load Distribution
• Live migrate VMs to different servers to load balance host usage


Guest Application High Availability

VMs create a (virtualized) failover cluster

Guest Application Health Monitoring
• Application restarts or fails over
• Detect blue screens & user mode hangs
• VM network availability

Application Mobility
• Guest OS needs patching or VM needs maintenance, application moved to the other node


VM Storage High Availability

Keeps the file path to \\SOFS\MyVHD.vhd highly-available

[Diagram labels: SMB Client; \\SOFS; Node 1; Node 2; Share1; Share2]


Management High Availability

Server
• Redundant server deployments
• Run server inside a clustered VM
• Backup using DPM or Replicate using Hyper-V Replica
• Monitor with a SCOM Management Pack

Database
• SQL Server 2012 SP1 AlwaysOn Clustering
• Replication / Mirroring / Backup to a secondary site
• Run SQL inside a clustered VM
• Backup using DPM or Replicate using Hyper-V Replica
• Monitor with a SCOM Management Pack

• App Controller, Orchestrator Web Console, Service Manager Service Catalog
  • Load-balance network traffic
• Operations Manager Server
  • Highly-Available Management Group
• Orchestrator Runbook Server
  • Primary and redundant runbooks server failover
• VMM Library Server
  • Run a file server on a failover cluster
• VMM Management Server
  • Run directly on a failover cluster


Site High Availability

Nodes or clusters in different physical locations
Survive the loss of an entire datacenter
Stretch sites over a large distance
Storage at both sites with replication
Automatic (recommended) or manual recovery
Synchronizes cluster, role & VM changes

Technologies
• Multi-Site Clustering
• Hyper-V Replica
• Azure Site Recovery (ASR)


Combining Host & Guest Clustering

Best of both worlds for flexibility and protection
• VM high-availability & mobility between physical nodes
• Application & service high-availability & mobility between VMs

Cluster-on-a-cluster does increase complexity

Mixing physical and virtual nodes is supported
• Must pass Validate

[Diagram labels: CLUSTER; CLUSTER; Guest Cluster; Shared Storage]

Also in Windows Server 2008 R2


Microsoft Hyper-V Server 2012 R2

Similar to Windows Server 2012 R2 Core
Free Server SKU: aka.ms/HyperVserver (RTM)
Enterprise-class Microsoft hypervisor
CLI, remote GUI management or 3rd party add-ons
Does not include guest OS licenses

Contains all Hyper-V & Clustering features
• 8,000 VMs/cluster
• Cluster Shared Volumes (CSV) 2
• All types of live migration


Choosing a Host OS SKU

Microsoft Hyper-V Server 2012 R2
• Host OS is free
• No guest OS licenses
• CLI & Remote Management only

Windows Server 2012 R2 Standard
• Licensed per Proc
• 2 guest OS licenses
• Full installation or Server Core

Windows Server 2012 R2 Datacenter
• Licensed per Proc
• Unlimited guest licenses
• Full installation or Server Core


Hyper-V Clusters are Different

Cluster Validation tests (pre and post deployment)
Supports shared VHDX
Heartbeat settings reconfigured

Cluster database update requirements
• Only a majority of nodes must acknowledge GUM updates

Cluster Property        Default   Hyper-V Default
SameSubnetThreshold     5         10
CrossSubnetThreshold    5         20
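
The thresholds on this slide count consecutive missed heartbeats. As a rough illustration of what the higher Hyper-V defaults mean in practice, the sketch below multiplies a threshold by the heartbeat interval. The 1,000 ms interval is an assumption (the slide does not give the delay settings), and the function name is hypothetical.

```python
def tolerated_outage_seconds(threshold: int, delay_ms: int = 1000) -> float:
    """Approximate time a node can stay silent before it is considered down.

    threshold: number of consecutive missed heartbeats tolerated
    delay_ms:  interval between heartbeats (assumed to be 1000 ms here)
    """
    return threshold * delay_ms / 1000.0


# Threshold values as listed on the slide.
THRESHOLDS = {
    "SameSubnetThreshold": {"default": 5, "hyper_v": 10},
    "CrossSubnetThreshold": {"default": 5, "hyper_v": 20},
}

for name, values in THRESHOLDS.items():
    print(name,
          tolerated_outage_seconds(values["default"]),
          tolerated_outage_seconds(values["hyper_v"]))
```

Under that assumption, a longer tolerance gives VMs more room to ride out brief network interruptions before a failover is triggered.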


Demo: Deploy a Clustered Virtual Machine


Hyper-V Cluster Optimization


Enable VM Health Monitoring

Enable VM heartbeat setting
• Requires Integration Components (ICs) installed in VM

Health check for VM OS from host
• User-Mode Hangs
• System Crashes

[Diagram labels: CLUSTER; Shared Storage]

Also in Windows Server 2008 R2


Disable Starting Low Priority VMs

‘Auto Start’ setting configures whether a VM should be automatically started on failover
• Group property
• Disabling marks groups as lower priority
• Enabled by default

Disabled VMs need a manual restart to recover after a crash

Also in Windows Server 2008 R2


Keep VMs on Preferred Hosts

1st choice: ‘Preferred Owners’
• VMs will start on preferred host

2nd choice: ‘Possible Owners’
• VMs will start on a possible owner, only if a preferred owner is not available

If neither a preferred nor a possible owner is available, the VM will move to an active node, but not start

Also in Windows Server 2008 R2
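
The three placement rules on this slide can be sketched as a small function. This is only a model of the rules as stated here, not the cluster service's actual logic; the function name, argument names and node names are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def place_vm(preferred: Iterable[str],
             possible: Iterable[str],
             active_nodes: Iterable[str]) -> Tuple[Optional[str], bool]:
    """Return (node, will_start) for a VM that needs a new host."""
    active = list(active_nodes)
    # 1st choice: the first preferred owner that is currently up.
    for node in preferred:
        if node in active:
            return node, True
    # 2nd choice: a possible owner, only when no preferred owner is up.
    for node in possible:
        if node in active:
            return node, True
    # Neither list matches: the VM moves to an active node but is not started.
    if active:
        return active[0], False
    return None, False


print(place_vm(["N1", "N2"], ["N3"], ["N2", "N3"]))
print(place_vm(["N1"], ["N3"], ["N4"]))
```

The second call shows the last rule: the VM lands on an active node, but the returned flag says it stays off.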


Start VMs on Preferred Hosts

‘Persistent Mode’ will attempt to place VMs back on the last node they were hosted on during start
• Only takes effect when the complete cluster is started up
• Prevents overloading the first nodes that start up with large numbers of VMs

Better VM distribution after cold start

Enabled by default for VM groups
• Option is hidden from GUI in 2012+

Also in Windows Server 2008 R2


Keep VMs off the Same Host

AntiAffinityClassNames
• Groups with same AACN try to avoid moving to same node

Configured by PowerShell directly on the cluster
System Center VMM has a GUI “Availability Groups”
Enables VM distribution across host nodes
Better utilization of host OS resources

Scenarios
• Separate similar VMs
  • Guest cluster nodes
  • DCs or infrastructure servers
• Separate tenants

For affinity, use preferred owners

Also in Windows Server 2008 R2
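
Anti-affinity is a soft rule: groups with the same class name try to avoid each other, but a VM is still placed if every node already hosts one. A minimal sketch of that idea follows. The function, the data layout and the tie-breaking rule are assumptions for illustration, not the cluster's real placement code.

```python
from typing import Dict, List


def pick_node(vm_class: str,
              nodes: List[str],
              placement: Dict[str, List[str]]) -> str:
    """Pick the node hosting the fewest groups with the same class name.

    placement maps node name -> class names of groups already on that node.
    Ties are broken by the order of `nodes`.
    """
    return min(nodes, key=lambda n: placement.get(n, []).count(vm_class))


placement = {"N1": ["DC"], "N2": [], "N3": ["SQL"]}
print(pick_node("DC", ["N1", "N2", "N3"], placement))
```

Here a second domain controller avoids N1, which already hosts a group tagged "DC". With only one node left, the same call would still return that node instead of failing.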


Demo: Optimize a Clustered Virtual Machine


Windows Server 2012 R2 Clustering and Hyper-V


Cluster Scale

Increased scale out and scale up
• 4x scale over Windows Server 2008 R2
• 64 nodes in a cluster
• 8,000 VMs in a cluster
• 1,024 VMs per node

[Diagram axes: Scale up; Scale out]


Cluster Validation

Faster storage validation
Select a specific LUN
Replicated storage for multi-site clusters

New Hyper-V Tests
• Run when Hyper-V role is installed
• Integration Components
• Memory Compatibility
• Virtual Switch Compatibility
• Hyper-V Role Enabled
• Network Configuration
• Storage Configuration


Node Maintenance Mode

Drain all VMs off a node
Supports all cluster roles

Role-specific features
• Live migration or quick migration for VMs
• Uses VM Priority


Cluster-Aware Updating

Automated cluster updating

Coordinator serially updates all nodes
• Windows Update Agent (WUA)
• Windows Server Update Services (WSUS)
• Windows Update

Workflow
1. Scan nodes to find which patches are needed
2. Identify node with fewest workloads
3. Move workloads or live migrate VMs
4. Call to WUA to patch
5. Verify patch is successful
6. Repeat steps 2 – 5 on next node
7. Repeat on remaining nodes

[Diagram labels: Windows Update; Update Coordinator]
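
The workflow above is a loop over nodes, which a short simulation makes concrete. This is a sketch of the steps as listed on the slide, not the Cluster-Aware Updating implementation. The callbacks `needs_patch` and `patch` are hypothetical stand-ins for the scan and the WUA call, and moving workloads to the least-loaded other node is an assumption, since the slide does not say where they go.

```python
from typing import Callable, Dict, List


def cluster_aware_update(workloads: Dict[str, List[str]],
                         needs_patch: Callable[[str], bool],
                         patch: Callable[[str], bool]) -> List[str]:
    """Patch nodes one at a time and return the order in which they were patched.

    workloads maps node name -> workloads (VMs or roles) currently on it.
    The mapping is updated in place as workloads are moved.
    """
    # Step 1: scan nodes to find which ones need patches.
    pending = [n for n in workloads if needs_patch(n)]
    patched: List[str] = []
    while pending:
        # Step 2: identify the pending node with the fewest workloads.
        node = min(pending, key=lambda n: len(workloads[n]))
        # Step 3: move its workloads to the least-loaded other node.
        others = [n for n in workloads if n != node]
        while workloads[node] and others:
            target = min(others, key=lambda n: len(workloads[n]))
            workloads[target].append(workloads[node].pop())
        # Steps 4-5: patch the node and verify the result.
        if not patch(node):
            raise RuntimeError(f"patch failed on {node}")
        patched.append(node)
        # Steps 6-7: repeat for the remaining nodes.
        pending.remove(node)
    return patched


cluster = {"N1": ["vm1", "vm2"], "N2": ["vm3"], "N3": []}
print(cluster_aware_update(cluster, lambda n: True, lambda n: True))
```

Only one node is drained at any moment, so the rest of the cluster keeps serving workloads throughout the run.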


VM Drain on Shutdown

VMs live migrated to another node during shutdown
VMs moved to “Best Available Node” (most free memory)
Honors VM prioritization
Ensures reboot / shutdown does not incur downtime to VMs for unknowing admin
Enabled/Disabled via the DrainOnShutdown cluster common property
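
One way to picture the drain is as a planning step: take VMs in priority order and send each to the node with the most free memory. The sketch below assumes that "honors VM prioritization" means higher-priority VMs are placed first, and that memory is the only placement criterion. All names and numbers are hypothetical.

```python
from typing import Dict, List, Tuple

PRIORITY_ORDER = {"High": 0, "Medium": 1, "Low": 2}


def drain_node(vms: List[Tuple[str, str, int]],
               free_memory_mb: Dict[str, int]) -> Dict[str, str]:
    """Plan live migrations off a node that is shutting down.

    vms: (name, priority, memory_mb) for each VM on the draining node.
    free_memory_mb: free memory on each remaining node; updated in place.
    Returns a mapping of VM name -> destination node.
    """
    plan: Dict[str, str] = {}
    if not free_memory_mb:
        return plan  # no other node is available
    # Assumed meaning of prioritization: higher-priority VMs are placed first.
    for name, _priority, memory_mb in sorted(vms, key=lambda v: PRIORITY_ORDER[v[1]]):
        # "Best Available Node": the node with the most free memory.
        target = max(free_memory_mb, key=free_memory_mb.get)
        if free_memory_mb[target] < memory_mb:
            continue  # no node can take this VM; it is left out of the plan
        free_memory_mb[target] -= memory_mb
        plan[name] = target
    return plan


free = {"N2": 10000, "N3": 9000}
print(drain_node([("web", "Low", 2048), ("sql", "High", 8192), ("app", "Medium", 4096)], free))
```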


Cluster Upgrades

VMs can live migrate from 2012 to 2012 R2
• Need to upgrade ICs in VMs
• May want to upgrade other clusters in the stack, such as Scale Out File Server

Other roles & VMs running on 2008 R2 use the Copy Cluster Roles Wizard
• Migrate to CSV disks
• Storage mapping
• Virtual network mapping
• Use the same storage or different storage


Cluster Shared Volumes (CSV) 2

Distributed access file system

New roles
• File Server - Scale out File Server
• Hyper-V over SMB

Improved backup, performance and resiliency

Direct I/O for more scenarios
• Better VM creation and copy performance

Multi-subnet support for live migration


Shared VHDX for Guest Clusters

Abstract the storage infrastructure from tenants

VM sees a shared Virtual SAS disk
• VMs could be on the same or different nodes

Shared VHDX can be stored on:
• Cluster Shared Volumes (CSV) on block storage
• Separate Scale-Out File Server Cluster

Guest Cluster
• Fibre Channel
• Shared VHDX
• SMB
• iSCSI
• FCoE

Host Cluster
• Fibre Channel
• SAS
• SMB
• iSCSI
• FCoE

Guest cluster storage   WS 2012   WS 2012 R2
Fibre Channel           Yes       Yes
iSCSI                   Yes       Yes
SMB                     Yes       Yes
Shared VHDX             No        Yes


Virtual Machine Priority

Start Order
Node Maintenance
Running Priority
• Pre-emption shuts down lower priority VMs

No Auto Start
• Must be restarted manually

Priority levels: High, Medium, Low, No Auto Start
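
The start-order part of VM priority amounts to a sort with one exclusion. The sketch below shows only that part (pre-emption is not modeled). The function name and VM names are hypothetical.

```python
from typing import List, Tuple

START_RANK = {"High": 0, "Medium": 1, "Low": 2}


def start_order(vms: List[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Split (name, priority) pairs into (auto-start order, manual-start VMs)."""
    auto = [vm for vm in vms if vm[1] in START_RANK]
    manual = [name for name, priority in vms if priority == "No Auto Start"]
    # sorted() is stable, so VMs with equal priority keep their input order.
    ordered = [name for name, _ in sorted(auto, key=lambda vm: START_RANK[vm[1]])]
    return ordered, manual


print(start_order([
    ("print", "Low"),
    ("dc", "High"),
    ("test", "No Auto Start"),
    ("sql", "High"),
    ("web", "Medium"),
]))
```

VMs marked No Auto Start never enter the automatic queue, which matches the slide's note that they must be restarted manually.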


Improved Live Migration

Concurrent Live Migrations
Live Migration Queuing (waiting)
Live Migration Compression
Live Migration over RDMA

Best Available Node
• Moves to node with most free memory


New Live Migrations

Storage Live Migration
Live Migration over SMB
“Shared Nothing” Live Migration
Hyper-V Replica

[Diagram labels: Branch Office; VHD; SAN; Network]


Hyper-V Replica Enhancements

Configurable Replication Frequencies
• 30s, 5m, or 15m replication frequencies
• Variable Recovery Point Objective (RPO) for metro vs. geographically dispersed sites

Multiple Replicas
• Near site and offsite replication
• The second hop's replication interval can be equal to or longer than the first hop's

Azure Site Recovery
• Coordinate orchestrated replica failover across sites via System Center VMM, or from a site to Microsoft Azure

[Diagram labels: Site A; Site B; Site C]
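
The two frequency rules on this slide can be expressed as a small validation check. The sketch reads "equal to or greater" as "the second hop's interval is equal to or longer than the first hop's", which is an interpretation of the slide, and it takes the three listed values as the only valid intervals for both hops. The function name is hypothetical.

```python
from typing import Optional

ALLOWED_FREQUENCIES_S = (30, 300, 900)  # 30s, 5m, 15m as listed on the slide


def validate_replication(first_hop_s: int, second_hop_s: Optional[int] = None) -> None:
    """Raise ValueError if the replication intervals break the slide's rules."""
    for label, value in (("first hop", first_hop_s), ("second hop", second_hop_s)):
        if value is not None and value not in ALLOWED_FREQUENCIES_S:
            raise ValueError(f"{label}: {value}s is not one of {ALLOWED_FREQUENCIES_S}")
    # The second hop may replicate as often as the first hop, or less often.
    if second_hop_s is not None and second_hop_s < first_hop_s:
        raise ValueError("second hop must not replicate more often than the first hop")


validate_replication(30, 300)   # metro replica every 30s, offsite every 5m
validate_replication(300)       # single replica every 5m
```

The worst-case data loss at each replica is bounded by its interval, which is why a nearby site can use 30s while a distant one uses 5m or 15m.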


Hyper-V Replica on a Cluster

Hyper-V Replica Broker
Replication agent is highly-available

Replication
• Standalone to standalone
• Cluster to cluster
• Standalone to cluster
• Cluster to standalone
• There is no replication within a single cluster

Multi-site clusters now use a cluster at each site


Demo: Configure Hyper-V Replica Broker


System Center 2012 R2 Clustering & Hyper-V


Managing Clusters

Scale management for hosts and VMs
Bare metal cluster provisioning
NVGRE Gateway failover support
Deploy & patch Scale Out File Server clusters (SOFS)
Windows Azure Hyper-V Recovery Manager
Manage VMware and Citrix clusters
SQL Clustering for databases (all System Center components)


Intelligent Placement

Automates placement logic on hosts
Capacity planning improves resource utilization
Spreads VMs across nodes
‘Star-Rated’ results for easy decision making
Customizable algorithm


Automated Update Management

Automated cluster updating
Uses Intelligent Placement & live migration
Windows PowerShell Support

Most hosts can be patched
• Hosts, Host Groups, Host Clusters
• VMM Server, Library Server, PXE Server, Update Server

Does not patch VMs or VHDs
• Virtual Machine Servicing Tool (VMST)

[Diagram steps: Enable Feature; Manage Baselines; Scan Servers; Remediate Servers; Manage Exemptions]


Dynamic Optimization

No SCOM dependency
Rebalances VMs across hosts

Live migration
• Keeps cluster balanced
• Avoids VM downtime
• Supports heterogeneous clusters

Managed resources
• Considers CPU, memory, disk IO, network IO
• Optimize when above resource threshold
• Considers entire cluster

Options
• Manual or automatic
• User controlled frequency
• Configurable aggressiveness
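
Threshold-based rebalancing can be illustrated with a greedy loop. This is a simplified sketch, not VMM's algorithm: it collapses CPU, memory, disk IO and network IO into one abstract load number per VM, and the choice to move the smallest VM to the least-loaded host is an assumption made for brevity.

```python
from typing import Dict, List, Tuple


def rebalance(hosts: Dict[str, Dict[str, int]],
              threshold: int) -> List[Tuple[str, str, str]]:
    """Plan live migrations that bring overloaded hosts back under `threshold`.

    hosts maps host name -> {vm name: load}; it is updated in place.
    Returns a list of (vm, source, destination) moves.
    """
    def load(host: str) -> int:
        return sum(hosts[host].values())

    moves: List[Tuple[str, str, str]] = []
    if len(hosts) < 2:
        return moves  # nowhere to migrate to
    # Visit the most loaded hosts first.
    for source in sorted(hosts, key=load, reverse=True):
        while load(source) > threshold and hosts[source]:
            # Move the smallest VM to the least-loaded other host.
            vm = min(hosts[source], key=hosts[source].get)
            target = min((h for h in hosts if h != source), key=load)
            if load(target) + hosts[source][vm] > threshold:
                break  # moving would only shift the overload elsewhere
            hosts[target][vm] = hosts[source].pop(vm)
            moves.append((vm, source, target))
    return moves


cluster = {"H1": {"a": 40, "b": 30, "c": 20}, "H2": {"d": 10}, "H3": {}}
print(rebalance(cluster, threshold=60))
```

A lower threshold plays the role of a more aggressive setting: more hosts count as overloaded, so more migrations are planned.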


VMM Server High Availability

Highly available VMM server
• Cluster-aware VMM server
• Protects against OS and VMM failures
• Admin console with reconnection logic

Hyper-V cluster creation & validation
Create non-HA VMs on clustered hosts
Add/remove Hyper-V clusters in untrusted domains


Performance & Resource Optimization (PRO)


System Center 2012 R2 Integration

App Controller (being deprecated)
• Deploy VMs to a cluster

Configuration Manager
• Make a cluster & VMs secure and compliant

Data Protection Manager
• Backup / restore VMs on a CSV disk
• Backup during live migration

Endpoint Protection
• Protect the cluster & roles from viruses & malware


System Center 2012 R2 Integration

Operations Manager
• Monitor clusters & VMs

Orchestrator
• Automate actions with clusters & VMs (via VMM)

Service Manager
• Report cluster & VM problems

Azure Operational Insights (“Advisor”)
• Analyze clusters and VMs for best practices


Demo: Manage a Cluster with System Center Virtual Machine Manager


Conclusion


High Availability throughout the Datacenter

• Hardware High Availability
• Workload High Availability
• Guest Application High Availability
• VM Storage High Availability
• Management High Availability
• Site High Availability


Start with the Cloud OS today!

• Microsoft Virtual Academy: aka.ms/MVA
• Evaluation Edition: aka.ms/EvalCenter
• Microsoft Certifications: aka.ms/MSCerts
• TechNet Virtual Labs: aka.ms/V-Labs
• Customer Evidence: aka.ms/Evidence
• Free eBooks: aka.ms/MVAebook
