Jase McCarty & GS Khalsa
STO1118BU
#VMworld #STO1118
Successful vSAN Stretched Clusters
VMworld 2017 Content: Not for publication or distribution
Disclaimer
• This presentation may contain product features that are currently under development.
• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
• Technical feasibility and market demand will affect final delivery.
• Pricing and packaging for any new technologies or features discussed or presented have not been determined.
#STO1118BU CONFIDENTIAL
Session Agenda
1 Why Stretched Clusters?
2 vSAN Stretched Cluster Architecture
3 vSAN Stretched Cluster Failure Scenarios
Why Stretched Clusters?
What Are Stretched Clusters?
Stretched Cluster = infrastructure spanning multiple physical locations
[Diagram: compute and storage shared across sites]
Benefits
• Continuous active-active availability
• App mobility and load-balancing
• Disaster avoidance
Typical Challenges
• Complex configuration and management
• Expensive to deploy and operate
• Creates silos of specialized hardware
VMware HCI Addresses Typical Challenges of Stretched Clusters
Typical Stretched Cluster (vSphere with separate storage and replication management; compute and storage as separate tiers)
• Complex configuration and management
• Expensive to deploy and operate
• Creates silos of specialized hardware
vSAN Stretched Cluster (vSphere with vSAN; compute + server-attached storage)
• Simplifies management: policy-based, app-level
• Lowers TCO: server economics
• Eliminates silos: use servers of your choice
vSAN Stretched Cluster Architecture
vSAN Fault Domains
• Create fault domains to increase availability
• Protect against rack failure, etc.
• Example: four defined fault domains
– FD1 = esxi-01, esxi-02
– FD2 = esxi-03, esxi-04
– FD3 = esxi-05, esxi-06
– FD4 = esxi-07, esxi-08
• Cluster can tolerate single rack failure in illustrated scenario
[Diagram: four racks, one per fault domain (FD1 to FD4), two hosts each; an FTT=1, RAID-1 object keeps its two vmdk replicas and its witness component in separate fault domains]
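The single-rack-failure claim in the example can be checked with a small model. The sketch below is a simplification, not vSAN's actual placement or voting logic. It assumes one vote per component, a hypothetical placement of the two replicas and the witness component on esxi-01, esxi-03, and esxi-05, and it treats the object as available when more than half of the components survive and at least one data replica is among them.

```python
from typing import Dict, Set

# Fault domains exactly as listed in the slide's example.
FAULT_DOMAINS: Dict[str, Set[str]] = {
    "FD1": {"esxi-01", "esxi-02"},
    "FD2": {"esxi-03", "esxi-04"},
    "FD3": {"esxi-05", "esxi-06"},
    "FD4": {"esxi-07", "esxi-08"},
}

# Hypothetical placement of an FTT=1, RAID-1 object (an assumption for
# illustration): each component lives in a different fault domain.
PLACEMENT: Dict[str, str] = {
    "replica-a": "esxi-01",
    "replica-b": "esxi-03",
    "witness": "esxi-05",
}


def fault_domain_of(host: str) -> str:
    """Return the name of the fault domain that contains the host."""
    for name, hosts in FAULT_DOMAINS.items():
        if host in hosts:
            return name
    raise KeyError(f"{host} is not in any fault domain")


def object_available(placement: Dict[str, str], failed_fds: Set[str]) -> bool:
    """Simplified quorum rule: majority of components plus one data replica."""
    surviving = [
        component
        for component, host in placement.items()
        if fault_domain_of(host) not in failed_fds
    ]
    has_quorum = len(surviving) * 2 > len(placement)
    has_data = any(c.startswith("replica") for c in surviving)
    return has_quorum and has_data


if __name__ == "__main__":
    for fd in FAULT_DOMAINS:
        print(fd, "fails ->", object_available(PLACEMENT, {fd}))
```

Under these assumptions, losing any one rack leaves the object available, while losing two of the racks that hold components does not.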
vSAN Stretched Cluster – Built on Fault Domains
• Single vSAN cluster across 2 sites
• Each site is a Fault Domain (FD)
• Automated failover
• Does not require any specialized hardware
• Setup via Wizard
• Requires vSAN Enterprise license
• Witness stores only metadata, no customer data
• 1 Witness per cluster
[Diagram: one vSphere/vSAN cluster stretched across two sites (5ms RTT, 10GbE), RAID-1 across the sites over RAID-0 at each site, with a 3rd site for the witness]
vSAN Stretched Cluster – Hosts
• vSAN 6.1 or higher
• Layer 2 inter-host connectivity for vSAN (Layer 3 supported)
• vSAN traffic on dedicated VMkernel interface
• Up to 5ms RTT latency between sites
• 10Gbps recommended between sites
– Dependent on number of writes
– Reads are not sized in calculation
• vSAN Enterprise license
vSAN Stretched Cluster – Witness
• Can be physical ESXi host or vSAN Witness Appliance (Nested ESXi)
– vSAN Witness Appliance cannot host VMs
• Latency requirements
– Up to 10 Hosts/site: Up to 200ms RTT
– 11 to 15 Hosts/site: Up to 100ms RTT
• Routing requirements
– L3 independently routable to each site
– Static routes to hosts at each site
• Port requirements
– Normal vSAN/vSphere ports
– Also UDP 12321
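The witness latency limits above reduce to a simple lookup. The following is a minimal sketch; the function names are hypothetical, and the 1 to 15 hosts-per-site range is simply the range the slide gives limits for.

```python
def max_witness_rtt_ms(hosts_per_site: int) -> int:
    """Maximum round-trip time to the witness, per the limits on the slide."""
    if not 1 <= hosts_per_site <= 15:
        raise ValueError("the slide only gives limits for 1 to 15 hosts per site")
    # Up to 10 hosts per site: 200 ms RTT; 11 to 15 hosts per site: 100 ms RTT.
    return 200 if hosts_per_site <= 10 else 100


def witness_latency_ok(hosts_per_site: int, measured_rtt_ms: float) -> bool:
    """True if a measured witness RTT is within the limit for this site size."""
    return measured_rtt_ms <= max_witness_rtt_ms(hosts_per_site)


if __name__ == "__main__":
    print(max_witness_rtt_ms(8))          # limit for a small site
    print(witness_latency_ok(12, 150.0))  # larger site, slower witness link
```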
vSAN Witness Appliance
• Different sizes in OVA
– All include: Boot vmdk (12GB), Cache vmdk (10GB), Capacity vmdk(s)
– Capacity: Tiny (15GB), Normal (350GB), and Large (3x350GB)
• Capacity tier requires 16MB per witness component
• Amount of witness storage related to component #s
• Does not require SSDs: the OVA has its cache vmdk tagged as SSD
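The 16MB-per-component figure gives a rough way to relate component count to witness capacity. The sketch below is illustrative only: it considers capacity-tier space alone, assumes 1 GB = 1024 MB, and uses hypothetical function names. Any per-profile component limits published by VMware should take precedence over this estimate.

```python
from typing import Optional

MB_PER_WITNESS_COMPONENT = 16  # from the slide

# Capacity-tier sizes from the slide, in GB, smallest first.
PROFILE_CAPACITY_GB = {"Tiny": 15, "Normal": 350, "Large": 3 * 350}


def witness_capacity_gb(components: int) -> float:
    """Capacity-tier space needed for the given number of witness components."""
    return components * MB_PER_WITNESS_COMPONENT / 1024


def smallest_profile(components: int) -> Optional[str]:
    """Smallest profile whose capacity tier fits, or None if none fits."""
    needed = witness_capacity_gb(components)
    for name, capacity in PROFILE_CAPACITY_GB.items():
        if needed <= capacity:
            return name
    return None


if __name__ == "__main__":
    for count in (500, 1000, 30000):
        print(count, witness_capacity_gb(count), smallest_profile(count))
```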
vSAN Stretched Clusters – Witness Network Traffic
Cost | Payload | Operation | Unit | When
*** | ~2K bytes | Create/Delete VMDK | | Creation/Deletion time
*** | | Create/Delete VM | Per object | Creation/Deletion time
*** | | Create Snapshot | Per Snapshot | At time of snap creation
*** | | Power on VM (creates .vswp) | | At Power on
** | ~1.5K bytes | Master Heartbeat | Cluster Size Dependent | Every Second
* | ~300 bytes | Disk Failures | All objects on disk | Time of failure
* | | Disk/Node Absent | All objects on disk/node | Time of detecting disk is absent
* | | Rebuild | Per object | Once rebuild is complete
* | | Node decommissioned | Per object on Node | When task starts
* | | Site Failure | Per object on site | When failure is detected
* | | Owner Migration/Failback | Per object | When ownership changes
* | | Rebalance | Per object | Start of rebalance process
None | | Read/Write | | Steady State
vSAN Stretched Clusters – Writes
• Writes are synchronous to both sites
• Latency cannot exceed 5ms RTT
• Sized based on write IO
• Minimum 10Gbps recommended
• Some minimal changes in vSAN 6.6
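Because writes are synchronous to both sites, the inter-site round trip adds directly to write latency, which is one way to see why the RTT is capped at 5ms. The sketch below is a deliberately simple model, not a description of vSAN internals: it assumes a write is acknowledged only when both the local and the remote copy have acknowledged, and that the remote copy costs one full round trip plus its own device latency. The function and parameter names are hypothetical.

```python
def sync_write_latency_ms(local_ms: float, remote_ms: float,
                          intersite_rtt_ms: float) -> float:
    """Acknowledgement latency of a write mirrored synchronously to two sites."""
    # The slower of the two copies determines when the write completes.
    return max(local_ms, remote_ms + intersite_rtt_ms)


if __name__ == "__main__":
    # Same device latency at both sites, at the 5 ms RTT limit.
    print(sync_write_latency_ms(1.0, 1.0, 5.0))
```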
vSAN Stretched Clusters – Bandwidth Requirements
Host Bandwidth: dependent on number of writes, not reads
$\text{Bandwidth} = \text{write bandwidth} \times \text{data multiplier} \times \text{resync multiplier}$
$B = W_b \times m_d \times m_r$
$B_{10{,}000\ \text{writes},\ 4\text{K}} = 320\ \text{Mbps} \times 1.4 \times 1.25 = 560\ \text{Mbps}$
$B_{30{,}000\ \text{writes},\ 4\text{K}} = 960\ \text{Mbps} \times 1.4 \times 1.25 \approx 1.7\ \text{Gbps}$
Data multiplier ($m_d$) = overhead for vSAN metadata traffic, … (40%)
Resync multiplier ($m_r$) = overhead for resynchronizations (25%)
Witness Bandwidth: dependent on number of vSAN components
$B = \dfrac{1{,}138\ \text{Bytes} \times \text{Components}}{5\ \text{s}}$
Rule of thumb: 2 Mbps per 1000 components
$B_{1000} = \dfrac{1{,}138\ \text{B} \times 8\ \text{bits/B} \times 1{,}000}{5\ \text{s}} = 1.82\ \text{Mbps}$, plus a 10% safety margin $\approx 2\ \text{Mbps}$
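The two bandwidth formulas on this slide are easy to turn into a small calculator. The sketch below uses hypothetical function names and the multipliers from the slide as defaults. It assumes a 4K write means 4,000 bytes, which is what the slide's own arithmetic implies (10,000 writes giving 320 Mbps).

```python
def host_bandwidth_mbps(write_iops: float,
                        io_size_bytes: int = 4000,
                        data_multiplier: float = 1.4,
                        resync_multiplier: float = 1.25) -> float:
    """Inter-site bandwidth: write bandwidth * data multiplier * resync multiplier."""
    write_mbps = write_iops * io_size_bytes * 8 / 1_000_000
    return write_mbps * data_multiplier * resync_multiplier


def witness_bandwidth_mbps(components: int,
                           bytes_per_component: int = 1138,
                           interval_s: float = 5.0,
                           safety_margin: float = 0.10) -> float:
    """Witness bandwidth: 1,138 bytes per component every 5 s, plus a margin."""
    raw_mbps = components * bytes_per_component * 8 / interval_s / 1_000_000
    return raw_mbps * (1 + safety_margin)


if __name__ == "__main__":
    print(round(host_bandwidth_mbps(10_000)))        # slide example: 560 Mbps
    print(round(host_bandwidth_mbps(30_000)))        # slide example: about 1.7 Gbps
    print(round(witness_bandwidth_mbps(1_000), 2))   # slide example: about 2 Mbps
```

Passing `safety_margin=0` reproduces the raw 1.82 Mbps figure before the margin is added.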
vSAN Stretched Clusters – Read Locality
• Enabled by default
• Reads occur only on the site where the VM resides
• VMs typically do not move between sites
– Consider cache rewarm in vSAN Hybrid architectures
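The effect of read locality on inter-site traffic can be illustrated with a toy model. This is a sketch under stated assumptions rather than vSAN's read path: site names are hypothetical labels, and when read locality is off, reads are assumed to be spread evenly across all replica sites.

```python
from typing import List, Set


def read_sites(vm_site: str, replica_sites: Set[str],
               read_locality: bool = True) -> List[str]:
    """Sites that serve reads for a VM running at vm_site."""
    if read_locality and vm_site in replica_sites:
        return [vm_site]          # reads stay on the VM's own site
    return sorted(replica_sites)  # assumed: spread evenly over all replicas


def cross_site_read_fraction(vm_site: str, replica_sites: Set[str],
                             read_locality: bool = True) -> float:
    """Fraction of reads that must cross the inter-site link."""
    sites = read_sites(vm_site, replica_sites, read_locality)
    remote = [s for s in sites if s != vm_site]
    return len(remote) / len(sites)


if __name__ == "__main__":
    replicas = {"preferred", "secondary"}
    print(cross_site_read_fraction("preferred", replicas))
    print(cross_site_read_fraction("preferred", replicas, read_locality=False))
```

With the default enabled, no reads cross the link in this model, which is consistent with reads being left out of the inter-site bandwidth sizing.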
Local & Remote Protection for Stretched Clusters
• Redundancy locally and across sites
• With site failure, vSAN maintains availability with local redundancy in surviving site
• No change in stretched cluster configuration steps
• Optimized site locality logic to minimize I/O traffic across sites
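Combining local and cross-site protection multiplies the raw capacity an object consumes. The sketch below estimates that multiplier under assumptions that are not stated on the slide: the local overheads are the usual vSAN layouts (RAID-1 with one local failure tolerated at 2x, RAID-5 as 3+1 at about 1.33x, RAID-6 as 4+2 at 1.5x), cross-site mirroring doubles the total, and witness components are ignored. The names are hypothetical.

```python
# Assumed raw-capacity overhead of each local protection scheme.
LOCAL_OVERHEAD = {
    "none": 1.0,
    "RAID-1": 2.0,      # one extra full copy within the site
    "RAID-5": 4 / 3,    # 3 data + 1 parity
    "RAID-6": 1.5,      # 4 data + 2 parity
}


def raw_capacity_gb(vmdk_gb: float, local_protection: str,
                    mirrored_across_sites: bool = True) -> float:
    """Raw capacity consumed across the whole stretched cluster."""
    site_copies = 2 if mirrored_across_sites else 1
    return vmdk_gb * LOCAL_OVERHEAD[local_protection] * site_copies


if __name__ == "__main__":
    for scheme in ("RAID-1", "RAID-5", "RAID-6"):
        print(scheme, round(raw_capacity_gb(100, scheme), 1))
```

In this model a 100 GB VMDK with RAID-6 at each site and RAID-1 across sites consumes 300 GB of raw capacity.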
[Diagram: RAID-1 across the two sites over RAID-6 within each site, with a 3rd site for the witness]
Site Affinity for Stretched Clusters
• User can specify single site location of VM’s components if site level protection is unnecessary
• Policy driven using SPBM
• Reduces network and storage requirements
• Ideal for solutions that already use application redundancy (Exchange DAGs, SQL Server Availability Groups, etc.)
[Diagram: objects kept at a single site (RAID-0 and RAID-6 shown), with a 3rd site for the witness]
Change Witness Host for Stretched Cluster
• Easy replacement of witness host in stretched cluster environments
• Reduces potential time stretched cluster configuration may be without a witness
• UI driven. Eliminates need for manual steps or scripting
vSAN Stretched Cluster Failure Scenarios
Stretched Cluster Local Failure Protection – RAID-1 (New in vSAN 6.6)
[Diagram: a VM and its VMDK protected across the Preferred Site and Secondary Site, with the Witness at the Tertiary Site]
Stretched Cluster Local Failure Protection – RAID-5 (New in vSAN 6.6)
[Diagram: a VM and its VMDK protected across the Preferred Site and Secondary Site, with the Witness at the Tertiary Site]
Normal
[Diagram: VMs running at both the Preferred Site and the Secondary Site; Witness at the Tertiary Site]
Network Partition or Site Failure
[Diagram, step 1: one data site is partitioned or failed (marked X)]
[Diagram, step 2: HA Restart of the affected VMs at the other data site]
Normal
[Diagram: VMs running at both the Preferred Site and the Secondary Site; Witness at the Tertiary Site]
Inter-site Network Disconnected
[Diagram, step 1: the link between the Preferred Site and the Secondary Site is down (marked X); HA Power Off]
[Diagram, step 2: HA Restart of the powered-off VMs]
Normal
[Diagram: VMs running at both the Preferred Site and the Secondary Site; Witness at the Tertiary Site]
Witness Network Disconnected
[Diagram: the Witness at the Tertiary Site is disconnected (marked X) and leaves the cluster; VMs keep running at the Preferred Site and the Secondary Site]
Normal
[Diagram: VMs running at both the Preferred Site and the Secondary Site; Witness at the Tertiary Site]
Witness Host Offline
[Diagram: the Witness host is offline (marked X); VMs keep running at the Preferred Site and the Secondary Site]
Witness Host Online
[Diagram: the Witness is back online; VMs running at both data sites]
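The failure scenarios in this section can be summarized as a quorum decision across three fault domains. The sketch below is a simplified model, not vSAN's or vSphere HA's implementation. It assumes one vote each for the preferred site, the secondary site, and the witness, and it assumes that when the inter-site link is down the witness sides with the preferred site. The function and parameter names are hypothetical.

```python
from typing import FrozenSet, Set


def sites_running_vms(failed: FrozenSet[str] = frozenset(),
                      intersite_link_up: bool = True,
                      witness_link_up: bool = True) -> Set[str]:
    """Data sites that still hold quorum and can therefore run VMs."""
    preferred_ok = "preferred" not in failed
    secondary_ok = "secondary" not in failed
    witness_ok = "witness" not in failed and witness_link_up

    if preferred_ok and secondary_ok:
        if intersite_link_up:
            # Two of three votes are present even without the witness.
            return {"preferred", "secondary"}
        # Sites are partitioned: assumed that the witness joins the preferred
        # site, so VMs at the secondary site are powered off and restarted.
        return {"preferred"} if witness_ok else set()
    if preferred_ok:
        return {"preferred"} if witness_ok else set()
    if secondary_ok:
        return {"secondary"} if witness_ok else set()
    return set()


if __name__ == "__main__":
    print(sites_running_vms())                                # normal
    print(sites_running_vms(failed=frozenset({"secondary"}))) # site failure
    print(sites_running_vms(intersite_link_up=False))         # inter-site link down
    print(sites_running_vms(witness_link_up=False))           # witness disconnected
    print(sites_running_vms(failed=frozenset({"witness"})))   # witness host offline
```

Any case that returns an empty set corresponds to a double failure, which is beyond what the single-failure scenarios above cover.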
Questions?