
Page 1: XenServer Design Workshop

FOSSETCON 2015

Success with XenServer by Design
XenServer Design Workshop

Page 2: XenServer Design Workshop

#whoami

Name: Tim Mackey

Current roles: XenServer Community Manager and Evangelist; occasional coder

Cool things I’ve done• Designed laser communication systems• Early designer of retail self-checkout machines• Embedded special relativity algorithms into industrial control system

Find me• Twitter: @XenServerArmy• SlideShare: slideshare.net/TimMackey• LinkedIn: www.linkedin.com/in/mackeytim• Github: github.com/xenserverarmy

Page 3: XenServer Design Workshop

We’re following “MasterClass Format”

Admins matter• No sales pitch• No cost• Just the facts man

Interactive• Ask questions; the harder the better• Get what you need to be successful

Page 4: XenServer Design Workshop

What is XenServer?

Page 5: XenServer Design Workshop

What is a “XenServer”?

Packaged Linux distribution for virtualization• All software required in a single ISO

Designed to behave as an appliance• Managed via SDK, CLI, UI

Not intended to be a toolkit• Customization requires special attention

Open Source• Open source roots• Acquired by Citrix in 2007• Made open source in 2013 (xenserver.org)

Page 6: XenServer Design Workshop

XenServer market dynamic

Millions of Downloads

Over 1 million servers deployed

Optimized for XenDesktop

Powering NetScaler SDX

Supporting Hyper-Dense Clouds

Page 7: XenServer Design Workshop

Why XenServer?

Broad provisioning support• Apache CloudStack• Citrix CloudPlatform and XenDesktop• OpenStack• Microsoft System Center• VMware vCloud

Full type-1 hypervisor• Strong VM isolation• Supporting Intel TXT for secure boot

Designed for scale• 1000 VMs per host• Over 140 Gbps throughput in NetScaler SDX• Up to 96 shared hardware GPU instances per host

Page 8: XenServer Design Workshop

Understanding the architecture

Page 9: XenServer Design Workshop

Strong technical foundation with Xen Project

Advisory Board Members

Page 10: XenServer Design Workshop

Simplified XenServer Architecture Diagram (compute, networking, storage)

The Xen Project hypervisor runs beneath everything. dom0, a standard Linux distribution, hosts xapi, qemu, the hardware drivers and the paravirtualized driver back-ends; each guest VM runs the corresponding driver front-ends.

Page 11: XenServer Design Workshop

dom0 in detail (XenServer 6.5): 3.10+ kernel.org kernel with a CentOS 5.10 distribution

Kernel space: hardware drivers, netback, blkback, blktap3

User space: XenAPI (xapi), SM, xha, xenopsd, squeezed, alertd, multipathd, perfmon, stunnel, xenstored, ovs-vswitchd, qemu-dm, Likewise, networkd

Page 12: XenServer Design Workshop

Features impacting functional design

Page 13: XenServer Design Workshop

Resource pools

Advantages• Reduce points of failure• Simplify management at scale• Reduce downtime during maintenance

Requirements• Shared storage• Network redundancy• Provisioning management

Core concepts• Pool master vs. member server roles
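For reference, adding a member server to an existing pool from the CLI is a one-liner (a sketch; the master address and credentials are placeholders):

# run on the joining host
xe pool-join master-address=<pool-master-ip> master-username=root master-password=<password>
# optionally give the pool a friendly name from any member
xe pool-param-set uuid=<pool-uuid> name-label="Production Pool"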

Page 14: XenServer Design Workshop

XenMotion Live VM Migration

Diagram: several XenServer hosts in a pool attached to shared storage; a running VM moves between hosts.

Page 15: XenServer Design Workshop

XenServer Pool

Live Storage XenMotion

Migrates VM disks from any storage type to any other storage type• Local, DAS, iSCSI, FC

Supports cross pool migration• Requires compatible CPUs

Encrypted Migration model

Specify management interface for optimal performance


More about Storage XenMotion
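As a rough CLI sketch of a cross-pool Storage XenMotion (parameter names follow the 6.x xe reference but should be checked against your release; every value below is a placeholder):

# live-migrate a VM to a host in another pool, placing its disk on a destination SR
xe vm-migrate uuid=<vm-uuid> live=true \
  remote-master=<destination-pool-master-ip> remote-username=root remote-password=<password> \
  host-uuid=<destination-host-uuid> vdi:<source-vdi-uuid>=<destination-sr-uuid>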

Page 16: XenServer Design Workshop

Migration vs. Storage Migration

"Normal" XenMotion:
1. Start VM migration
2. Copy the VM's RAM
3. Copy the VM's RAM delta; repeat until no delta is left
4. End VM migration

Storage XenMotion:
1. Start storage VM migration
2. Snapshot the VM's first/next disk
3. Mirror all write activity after the snapshot to the destination host
4. Transfer the snapshot disks; when the transfer is finished, repeat until no disk is left to copy
5. With all disks mirroring to the destination host, perform a "normal" XenMotion
6. Use the VM's hard disks from the destination host
7. End storage VM migration

Page 17: XenServer Design Workshop

Heterogeneous Resource Pools

Mixed processor pools allow safe live migrations: a virtual machine sees only the CPU features common to every host in the pool, so it can move between XenServer 1 (older CPU, features 1-4) and XenServer 2 (newer CPU, features 1-5, with feature 5 masked).

Page 18: XenServer Design Workshop

High Availability in XenServer

Automatically monitors hosts and VMs

Easily configured within XenCenter

Relies on Shared Storage• iSCSI, NFS, HBA

Reports failure capacity for DR planning purposes

More about HA

Page 19: XenServer Design Workshop

Taking advantage of GPUs

NVIDIA• vGPU with NVIDIA GRID providing 96 GPU instances• GPU pass-through• CUDA support on Linux• Uses NVIDIA drivers for capability

Intel• GVT-d support with Haswell and newer• No extra hardware!!• Uses standard Intel drivers

AMD• GPU pass-through

More about GPU

Page 20: XenServer Design Workshop

Distributed Virtual Network Switching

Virtual Switch• Open source: www.openvswitch.org• Provides a rich layer 2 feature set• Cross host private networks• Rich traffic monitoring options• ovs 1.4 compliant

DVS Controller• Virtual appliance• Web-based GUI• Can manage multiple pools• Can exist within pool it manages• Note: Controller is deprecated, but supported


Page 21: XenServer Design Workshop

Deployment Design

Page 22: XenServer Design Workshop

Host requirements

Validate Hardware Compatibility List (HCL)• http://hcl.xenserver.org• Component’s firmware version could be important

BIOS configuration• VT extensions enabled• EFI profiles disabled

Limits• Up to 1TB RAM• Up to 160 pCPUs• Up to 16 physical NICs• Up to 16 hosts per resource pool

Page 23: XenServer Design Workshop

Network topologies

Management networks• Handle pool configuration and storage traffic• Require default VLAN configuration• IPv4 only

VM networks• Handle guest traffic• IPv4 and IPv6• Can assign VLAN and QoS• Can define ACL and mirroring policy• Should be separated from mgmt networks

All networks in a pool must match. More about network design
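Moving the management interface onto a specific PIF can be sketched with xe as follows (UUIDs and addresses are placeholders; run it from the host console, since connectivity drops during the change):

# give the chosen PIF a static IPv4 configuration
xe pif-reconfigure-ip uuid=<pif-uuid> mode=static IP=192.168.10.11 netmask=255.255.255.0 gateway=192.168.10.1 DNS=192.168.10.2
# point the management interface at that PIF
xe host-management-reconfigure pif-uuid=<pif-uuid>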

Page 24: XenServer Design Workshop

Storage topologies

Local storage• Yes: SAS, SATA, RAID, DAS• No: USB, Flash, SW RAID• Supports live migration

Shared Storage• iSCSI Unipath/Multipath, NFSv3• HBA – Check HCL• Supports live migration

Cloud Storage• Only if presented as iSCSI/NFS

ISO storage• CIFS/NFSv3

More about storage design

Page 25: XenServer Design Workshop

Installation

Page 26: XenServer Design Workshop

Installation options

Boot from DVD/USB• Intended for low volume• ISO media on device• Install from local/NFS/HTTP/FTP

Boot from PXE• For scale deployments• Install from NFS/HTTP/FTP• Post installation script capabilities

Boot from SAN/iSCSI• Diskless option

Page 27: XenServer Design Workshop

Driver disks

Shipped as supplemental packs• Often updated when kernel is patched• Option to specify during manual install

Network drivers• Slipstream into XenServer installer• Modify XS-REPOSITORY-LIST

Storage drivers• Add to unattend.xml• <driver-source type="url">ftp://192.168.1.1/ftp/xs62/driver.qlcnic</driver-source>

Page 28: XenServer Design Workshop
Page 29: XenServer Design Workshop

Types of updates

New version• Delivered as ISO installer• Requires host reboot

Feature Pack• Typically delivered as ISO installer• Typically requires host reboot

Hotfix• Delivered as .xsupdate file• Applied via CLI/XenCenter• May require host reboot• Subscribe to KB updates
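For a .xsupdate hotfix, the CLI flow is roughly the following (the file name and UUID are placeholders; XenCenter automates the same steps):

# upload the hotfix to the pool; the command prints the patch UUID
xe patch-upload file-name=XS65E001.xsupdate
# apply it to every host in the pool
xe patch-pool-apply uuid=<patch-uuid>
# check whether a reboot or toolstack restart is required
xe patch-list uuid=<patch-uuid> params=after-apply-guidance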

Page 30: XenServer Design Workshop

Backup more than just your VMs

Local storage• Always use RAID controller with battery backup to reduce risk of corruption

dom0 (post install or reconfiguration)• xe host-backup file-name=<filename> -h <hostname> -u root -pw <password>

Pool metadata (weekly – or when pool structure changes)• xe pool-dump-database file-name=<NFS backup>

VM to infrastructure relationships (daily or as VMs created/destroyed)• xe-backup-metadata -c -i -u <SR UUID for backup>

LVM metadata (weekly)• /etc/lvm/backup
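Putting those commands together, a scheduled backup job might look like the sketch below (the mount point, hostname and SR UUID are placeholders):

#!/bin/bash
# weekly dom0 and pool-level backups onto an NFS mount at /mnt/backup
xe host-backup file-name=/mnt/backup/host-$(hostname).xbk -h <hostname> -u root -pw <password>
xe pool-dump-database file-name=/mnt/backup/pool-db-$(date +%F).dump
# daily VM-to-infrastructure metadata backup onto a dedicated SR
xe-backup-metadata -c -i -u <SR-UUID-for-backup>
# keep a copy of the LVM metadata as well
cp -a /etc/lvm/backup /mnt/backup/lvm-$(date +%F)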

Page 31: XenServer Design Workshop

XenServer host upgrade

Disk partitions during an upgrade (example: 6.2 to 6.5):

1. Initial installation: the 1st partition (4GB) holds XenServer 6.2, the 2nd partition (4GB) is the empty backup partition, and the remaining space is local storage.

2. Backup existing installation, then upgrade: the installer copies the 6.2 installation into the backup partition and installs XenServer 6.5 from the install media into the 1st partition.

3. XenServer upgraded: the 1st partition (4GB) holds XenServer 6.5, the 2nd partition (4GB) keeps XenServer 6.2 as a backup, and local storage is untouched.

Page 32: XenServer Design Workshop

Pool upgrade process

For each host in the pool:

1. Evacuate virtual machines from the host
2. Place the host into maintenance mode
3. Upgrade the host to the new version
4. Place the host back into normal operation

Proceed with the next host
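The per-host portion maps onto a few xe commands (UUIDs are placeholders; XenCenter's Rolling Pool Upgrade wizard drives the same sequence, pool master first):

# move the host's VMs elsewhere and take it out of service
xe host-disable uuid=<host-uuid>
xe host-evacuate uuid=<host-uuid>
# reboot into the new version's installer (CD/PXE); after the upgrade completes:
xe host-enable uuid=<host-uuid>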

Page 33: XenServer Design Workshop

3rd party components

dom0 is tuned for XenServer usage• yum is intentionally disabled• Avoid installing new packages into dom0• Performance/scalability/stability uncertain

Updates preserve XenServer config only!• Unknown drivers will not be preserved• Unknown packages will be removed• Manual configuration changes may be lost

Citrix Ready Marketplace has validated components

Page 34: XenServer Design Workshop

Exchange SSL certificate on XenServer

• By default, XenServer uses a self-signed certificate created during installation to encrypt XAPI communication over SSL/HTTPS.

• To trust this certificate, verify that its fingerprint matches the one shown on the host's physical console (xsconsole / status display).

• The certificate can also be exchanged for a certificate issued from a trusted corporate certificate authority.

Workflow: request a certificate & key from the company certificate authority; the CA issues the certificate & key; convert them to PEM format; upload them to /etc/xensource on the XenServer host; replace xapi-ssl.pem.
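On the host itself the swap is roughly as follows (file names are placeholders; keep a copy of the original file, and check the existing xapi-ssl.pem for the exact PEM layout your release expects):

# back up the existing self-signed certificate
cp /etc/xensource/xapi-ssl.pem /etc/xensource/xapi-ssl.pem.orig
# xapi-ssl.pem is a PEM bundle: private key followed by the issued certificate
cat myhost.example.com.key myhost.example.com.crt > /etc/xensource/xapi-ssl.pem
# restart the toolstack so xapi picks up the new certificate
xe-toolstack-restart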

Page 35: XenServer Design Workshop

Performance planning

Page 36: XenServer Design Workshop

Configuration Maximums: XenServer 6.2 vs 6.5 (more is better)

Per-VM scalability limits:
• vCPUs per VM: XS 6.2: 16 (Windows); XS 6.5: 16 (Windows), 32 (Linux)
• RAM per VM: XS 6.2: 128GB; XS 6.5: 192GB

Per-host scalability limits:
• pCPUs per host: XS 6.2: 160; XS 6.5: 160
• RAM per host: XS 6.2: 1TB; XS 6.5: 1TB
• Running VMs per host: XS 6.2: 500; XS 6.5: 1,000
• VBDs per host: XS 6.2: 512; XS 6.5: 2,048
• Multipathed LUNs per host: XS 6.2: 150; XS 6.5: 256

Page 37: XenServer Design Workshop

Highlights of XenServer 6.5 Performance Improvements

• Bootstorm data transferred (lower is better): XS 6.2: 18.0 GB; XS 6.5: 0.7 GB (-96%)
• Bootstorm duration (lower is better): XS 6.2: 470 s; XS 6.5: 140 s (-70%)
• Aggregate network throughput (higher is better): XS 6.2: 3 Gb/s; XS 6.5: 25 Gb/s (+700%)
• Aggregate storage read throughput (higher is better): XS 6.2: 2.2 GB/s; XS 6.5: 9.9 GB/s (+350%)
• Aggregate storage write throughput (higher is better): XS 6.2: 2.8 GB/s; XS 6.5: 7.8 GB/s (+175%)
• vGPU scalability (higher is better): XS 6.2: 64; XS 6.5: 96 (+50%)

Booting a large number of VMs is significantly quicker in XS 6.5 due to the read-caching feature, which also significantly reduces the IOPS hitting the storage array when VMs share a common base image.

XS 6.5 brings many improvements relating to network throughput. For example, the capacity for a large number of VMs to send or receive data at a high throughput has been significantly improved.

The new, optimized storage datapath in XS 6.5 enables aggregate throughput to scale much better with a large number of VMs. This allows a large number of VMs to sustain I/O at a significantly higher rate, for both reads and writes.

The number of VMs that can share a GPU has increased in XS 6.5, reducing TCO for deployments using vGPU-enabled VMs.

Measurements were taken on various hardware in representative configurations. Measurements made on other hardware or in other configurations may differ.

Page 38: XenServer Design Workshop

64 bit control domain improves overall scalability

In XenServer 6.2:• dom0 was 32-bit so had 1GB of ‘low memory’• Each running VM ate about 1 MB of dom0’s low memory• Depending on what devices you had in the host, you would exhaust dom0’s low memory with a few hundred VMs

In Creedence:• dom0 is 64-bit so has a practically unlimited supply of low memory • There is no longer any chance of running out of low memory• Performance will not degrade with larger dom0 memory allocations

Page 39: XenServer Design Workshop

Limits on number of VMs per host
Scenario 1: HVM guests, each having 1 vCPU, 1 VBD, 1 VIF, and having PV drivers

Limitation            | XenServer 6.1 | XenServer 6.2 | Creedence
dom0 event channels   | 225           | 800           | no limit
tapdisk minor numbers | 1024          | 2048          | 2048
aio requests          | 1105          | 2608          | 2608
dom0 grant references | 372           | no limit      | no limit
xenstored connections | 333           | 500           | 1000
consoled connections  | no limit      | no limit      | no limit
dom0 low memory       | 650           | 650           | no limit
Overall limit         | 225           | 500           | 1000

Page 40: XenServer Design Workshop

Limits on number of VMs per host
Scenario 2: HVM guests, each having 1 vCPU, 3 VBDs, 1 VIF, and having PV drivers

Limitation            | XenServer 6.1 | XenServer 6.2 | Creedence
dom0 event channels   | 150           | 570           | no limit
tapdisk minor numbers | 341           | 682           | 682
aio requests          | 368           | 869           | 869
dom0 grant references | 372           | no limit      | no limit
xenstored connections | 333           | 500           | 1000
consoled connections  | no limit      | no limit      | no limit
dom0 low memory       | 650           | 650           | no limit
Overall limit         | 150           | 500           | 682

Page 41: XenServer Design Workshop

Limits on number of VMs per host
Scenario 3: PV guests, each having 1 vCPU, 1 VBD, 1 VIF

Limitation            | XenServer 6.1 | XenServer 6.2 | Creedence
dom0 event channels   | 225           | 1000          | no limit
tapdisk minor numbers | 1024          | 2048          | 2048
aio requests          | 1105          | 2608          | 2608
dom0 grant references | no limit      | no limit      | no limit
xenstored connections | no limit      | no limit      | no limit
consoled connections  | 341           | no limit      | no limit
dom0 low memory       | 650           | 650           | no limit
Overall limit         | 225           | 650           | 2048

Page 42: XenServer Design Workshop

Netback thread-per-VIF model improves fairness

Improves fairness and reduces interference from other VMs

Diagram: in XenServer 6.2 a fixed set of netback threads in the host is shared by all VMs' VIFs; in Creedence each VIF gets its own netback thread, so one busy VM cannot starve the others.

Page 43: XenServer Design Workshop

OVS 2.1 support for ‘megaflows’ helps when you have many flows

The OVS kernel module can only cache a certain number of flow rules

If a flow isn’t found in the kernel cache then the ovs-vswitchd userspace process is consulted• This adds latency and can lead to a severe CPU contention bottleneck when there are many flows on a host

OVS 2.1 has support for ‘megaflows’• This allows the kernel to cache substantially more flow rules
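To see how busy the kernel flow cache is on a host, the stock Open vSwitch tools in dom0 can be queried (a diagnostic sketch; output fields vary with the OVS version):

# datapath summary: hit/missed lookup counters and the number of cached flows
ovs-dpctl show
# dump the flows currently cached in the kernel
ovs-dpctl dump-flows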

Page 44: XenServer Design Workshop

More than just virtualization

Page 45: XenServer Design Workshop

Visibility into Docker Containers

Containers• Great for application packaging• Extensive tools for deployment

Virtualization• Total process isolation• Complete control

Docker and XenServer• View container details• Manage container life span• Integrated in XenCenter

Page 46: XenServer Design Workshop
Page 47: XenServer Design Workshop

WORK BETTER. LIVE BETTER.

Page 48: XenServer Design Workshop

GPU enablement

Page 49: XenServer Design Workshop

Deployment scenarios

• 1:1 GPU pass-through (Nvidia, AMD & Intel): XenDesktop 3D VMs on XenServer, each bound to its own physical GPU.
• 1:n RDS sessions (AMD & Nvidia): XenApp server VMs (hypervisor optional) share a GPU across many user sessions (Session 1, Session 2, ... Session n).
• 1:n hardware virtualization (Nvidia only): XenDesktop Windows 3D VMs on XenServer, each given a vGPU carved from shared physical GPUs.

Page 50: XenServer Design Workshop

Remote Workstations: 1:1 GPU pass-through

Each user/VM gets a dedicated NVIDIA/AMD/Intel GPU, with direct GPU access from the guest VM (native GPU driver in the guest, virtual desktop and apps delivered over a remote protocol).

• Responsiveness: the VM has direct access to the GPU and includes NVIDIA fast remoting technology
• App performance: full API support including the latest OpenGL, DirectX and CUDA; includes application certifications
• VM portability: the VM cannot be migrated to another node
• Density: limited by the number of GPUs in the server

Page 51: XenServer Design Workshop


pgpu, vgpus and gpu-group objects

XenServer automatically creates gpu-group, pgpu, vgpu-type objects for the physical GPUs it discovers on startup

Example: a host with GRID K1 and GRID K2 cards. The four K1 GPUs (PCI 5:0.0, 6:0.0, 7:0.0, 8:0.0) become pgpu objects in a "GRID K1" gpu-group (allocation: depth-first); the four K2 GPUs (11:0.0, 12:0.0, 85:0.0, 86:0.0) become pgpus in a "GRID K2" gpu-group (allocation: depth-first). Each group carries the vgpu-type objects it supports: GRID K100, K120Q, K140Q, K160Q and K180Q for K1, and GRID K200, K220Q, K240Q, K260Q and K280Q for K2.

User creates vgpu objects: - owned by a specific VM - associated with a gpu-group - with a specific vgpu-type

At VM boot, XenServer picks an available pgpu in the group to host the vgpu
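From the CLI, creating such a vgpu object looks roughly like this (UUIDs are placeholders; the VM must be halted):

# list the available vGPU profiles and GPU groups
xe vgpu-type-list
xe gpu-group-list
# attach a vGPU of the chosen type to the VM
xe vgpu-create vm-uuid=<vm-uuid> gpu-group-uuid=<gpu-group-uuid> vgpu-type-uuid=<vgpu-type-uuid>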

In the example, one VM has a vgpu of type GRID K100 hosted on the GRID K1 pgpu at 8:0.0, and another VM has a vgpu of type GRID K260Q hosted on the GRID K2 pgpu at 86:0.0.

Page 52: XenServer Design Workshop

How GPU Pass-through Works

Identical GPUs in a host auto-create a GPU group

The GPU Group can be assigned to set of VMs – each VM will attach to a GPU at VM boottime

When all GPUs in a group are in use, additional VMs requiring GPUs will not start

GPU and non-GPU VMs can (and should) be mixed on a host

GPU groups are recognized within a pool• If Server 1, 2, 3 each have GPU type 1, then VMs requiring GPU type 1 can be started on any of those servers

Page 53: XenServer Design Workshop

Limitations of GPU Pass-through

GPU Pass-through binds the VM to the host for the duration of the session• Restricts XenMotion

Multiple GPU types can exist in a single server• E.g. high performance and mid performance GPUs

VNC will be disabled, so RDP is required

Fully supported for XenDesktop, best effort for other Windows workloads

HCL is very important

Page 54: XenServer Design Workshop

NVIDIA GRID Architecture: hardware-virtualized GPU

A GRID K1/K2 GPU is carved into multiple virtual GPUs by the NVIDIA GRID Virtual GPU Manager running in XenServer, which handles management and state of the physical GPU. Each virtual machine runs a guest OS with the NVIDIA driver; graphics commands go directly from the VM to its virtual GPU, and the virtual desktop and apps are delivered over a remote protocol.

• Responsiveness: the VM has direct access to the GPU and includes NVIDIA fast remoting technology
• App performance: full API support including the latest OpenGL & DirectX; includes application certifications
• VM portability: the VM cannot be migrated to another node
• Density: limited by the number of virtual GPUs in the system

Page 55: XenServer Design Workshop

Overview of vGPU on XenServer

GRID vGPU enables multiple VMs to share a single physical GPU

VMs run an NVIDIA driver stack and get direct access to the GPU• Supports the same graphics APIs as physical GPUs (DX9/10/11, OGL 4.x)

NVIDIA GRID Virtual GPU Manager for XenServer runs in dom0

Diagram: the GRID K1 or K2 card sits below the Xen hypervisor. The GRID Virtual GPU Manager and NVIDIA kernel driver in XenServer dom0 use the hypervisor control interface to manage host channel registers, framebuffer regions, display, etc. Each virtual machine (Citrix XenDesktop, guest OS, NVIDIA driver, apps) has a graphics fast path with direct GPU access, plus an NVIDIA paravirtualized interface to dom0 for the management path; every VM gets dedicated channels and framebuffer, with shared access to the GPU engines.

Page 56: XenServer Design Workshop

Nvidia vGPU Resource Sharing

Diagram: on a GRID GPU-enabled server, the GRID Virtual GPU Manager in Citrix XenServer dom0 performs timeshared scheduling of the GPU engines (3D, CE, NVENC, NVDEC) across the virtual machines; each VM (guest OS, NVIDIA driver, apps, Citrix VDA) maps its own region of the GPU BAR and its own slice of the framebuffer.

Framebuffer• Allocated at VM startup

Channels• Used to post work to the GPU• A VM accesses its channels via the GPU Base Address Register, isolated by the CPU‘s Memory Management Unit (MMU)

GPU Engines• Timeshared among the VMs, much like contexts on a single OS

Page 57: XenServer Design Workshop


GPUs in XenCenter

Page 58: XenServer Design Workshop

vGPU Settings in XenCenter

Page 59: XenServer Design Workshop

GPU profile

Page 60: XenServer Design Workshop

High Availability Details

Page 61: XenServer Design Workshop

Protecting Workloads

Not just for mission critical applications anymore

Helps manage VM density issues

"Virtual" definition of HA a little different than physical

Low cost / complexity option to restart machines in case of failure

Page 62: XenServer Design Workshop

High Availability Operation

Pool-wide settings

Failure capacity – the number of host failures the HA plan can tolerate

Uses network and storage heartbeat to verify servers
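Enabling HA from the CLI is a two-step sketch (UUIDs are placeholders; the heartbeat SR must be shared iSCSI, NFS or HBA storage):

# enable HA, using a shared SR for the storage heartbeat
xe pool-ha-enable heartbeat-sr-uuids=<shared-sr-uuid>
# declare how many host failures the HA plan must tolerate
xe pool-param-set uuid=<pool-uuid> ha-host-failures-to-tolerate=1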

Page 63: XenServer Design Workshop

VM Protection Options

Restart Priority• Do not restart• Restart if possible• Restart

Start Order• Defines a sequence and delay to ensure applications run correctly
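Per-VM protection can be set from the CLI as well (a sketch using the 6.x parameter values; UUIDs are placeholders):

# always restart this VM, second in the start order, 60 s after the previous group
xe vm-param-set uuid=<vm-uuid> ha-restart-priority=restart order=2 start-delay=60
# "best-effort" restarts a VM only if capacity allows
xe vm-param-set uuid=<other-vm-uuid> ha-restart-priority=best-effort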

Page 64: XenServer Design Workshop

HA Design – Hot Spares

Simple Design• Similar to a hot spare in a disk array• Guaranteed availability• Inefficient: idle resources

Failure Planning• If surviving hosts are fully loaded – VMs will be forced to start on spare• Could lead to restart delays due to resource plugs• Could lead to performance issues if spare is pool master

Page 65: XenServer Design Workshop

HA Design – Distributed Capacity

Efficient Design• All hosts utilized

Failure Planning• Impacted VMs automatically placed for best fit• Running VMs undisturbed• Provides efficient guaranteed availability

Page 66: XenServer Design Workshop

HA Design – Impact of Dynamic Memory

Enhances Failure Planning• Define reduced memory which meets SLA• On restart, some VMs may “squeeze” their memory• Increases host efficiency

Page 67: XenServer Design Workshop

High Availability – No Excuses

Shared storage is the hardest part of setup• Simple wizard can have HA defined in minutes• Minimally invasive technology

Protects your important workloads• Reduce on-call support incidents• Addresses VM density risks• No performance, workload, configuration penalties

Compatible with resilient application designs

Fault tolerant options exist through ecosystem

Page 68: XenServer Design Workshop

Storage XenMotion

Page 69: XenServer Design Workshop

VHD Benefits

Many SRs implement VDIs as VHD trees

VHDs are a copy-on-write format for storing virtual disks

VDIs are the leaves of VHD trees

Interesting VDI operation: snapshot (implemented as VHD “cloning”)

Example: snapshotting the original VDI A produces a read-only parent holding its contents, with two children: A (read-write, the active VDI) and B (read-only, the snapshot VDI).

Page 70: XenServer Design Workshop

Storage XenMotion

“A” represents the VHD of a VM

The VHD structure (not contents) of “A” is duplicated on the Destination; the duplicate starts out empty

Page 71: XenServer Design Workshop

Storage XenMotion

A snapshot is taken on the Source: “A” becomes the parent of a new active child “B”

The new child object is duplicated on the Destination

Page 72: XenServer Design Workshop

Storage XenMotion

VM writes are now synchronous to both the Source and Destination active child VHDs (“B”)

The parent VHD “A” (now read-only) is background copied to the Destination

Page 73: XenServer Design Workshop

Storage XenMotion

Once the Parent VHD is copied, the VM is moved using XenMotion

The synchronous writes continue until the XenMotion is complete

Page 74: XenServer Design Workshop

Storage XenMotion

The VHDs not required are removed

The VM and VDI move is complete

Page 75: XenServer Design Workshop

Benefits of VDI Mirroring

Optimization: start with most similar VDI• Another VDI with the least number of different blocks• Only transfer blocks that are different

New VDI field: Content ID for each VDI• Easy way to confirm that different VDIs have identical content• Preserved across VDI copy, refreshed after VDI attached RW

Worst case is a full copy (common in server virtualization)

Best case occurs when you use VM “gold images” (i.e. CloudStack)

Page 76: XenServer Design Workshop

Network topologies

Page 77: XenServer Design Workshop

XenServer Network Terminology

A physical network card appears in dom0 as a PIF (eth0). Internal switches back the XenServer networks: Network 0 (xenbr0) is attached to the PIF, while a private network (xapi1) has no PIF. VIFs connect each virtual machine to these networks.

Page 78: XenServer Design Workshop

XenServer Network Terminology

With two network cards there are two PIFs (eth0, eth1), each backing its own internal switch: Network 0 (xenbr0) and Network 1 (xenbr1). The virtual machines' VIFs attach to whichever network they need.

Page 79: XenServer Design Workshop

XenServer Network Terminology

Bonding: the PIFs of two physical network cards (eth0, eth1) are combined into a bond PIF (bond0) backing a single bonded network, Bond 0+1 (xapi2), to which the virtual machines' VIFs attach.

Page 80: XenServer Design Workshop

XenServer Networking Configurations

The physical network card is driven by Linux NIC drivers; the vSwitch configuration and the XenServer pool DB are kept in sync by XAPI, which is driven from the command line, XenCenter or xsconsole.

Page 81: XenServer Design Workshop

Bonding Type (Balance SLB)

The virtual machines' traffic is spread across the bonded network cards (attached to stacked switches) by source MAC, and the balance is re-evaluated periodically; the diagram shows the per-NIC throughput shares shifting at 0:00, 0:10, 0:20 and 0:30 seconds.

Page 82: XenServer Design Workshop

Bonding Type (LACP)

The bonded network cards connect to stacked switches using 802.3ad LACP; traffic is hashed across the links, so flows from the same VM MAC but different IPs/ports (e.g. 10.1.2.1:80, 10.1.2.1:443, 10.1.2.2:80, 10.1.2.3:80 on MAC1) can use different NICs.

Page 83: XenServer Design Workshop

Distributed Virtual Switch – Flow View

The DVS Controller programs each host's OVS: flow tables are pushed over OpenFlow to vswitchd, and configuration goes over JSON-RPC to ovsdb-server. Each vSwitch (Network A, Network B) keeps a flow table cache and connects the VIFs to the PIFs.

Page 84: XenServer Design Workshop

Storage Networks

Independent management network• Supports iSCSI multipath• Bonded for redundancy; multipath as best practice• Best practice to enable Jumbo frames• Must be consistent across pool members• 802.3ad LACP provides limited benefit (hashing)
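A sketch of building a bonded, jumbo-frame storage network (UUIDs are placeholders; the MTU must match the switches and the array):

# create a network with a 9000-byte MTU for storage traffic
xe network-create name-label="Storage" MTU=9000
# bond two NICs onto it (balance-slb shown; active-backup and lacp are the other modes)
xe bond-create network-uuid=<storage-network-uuid> pif-uuids=<eth2-pif-uuid>,<eth3-pif-uuid> mode=balance-slb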

Page 85: XenServer Design Workshop

Guest VM Networks

Single server private network• No off host access• Can be used by multiple VMs

External network• Off host network with 802.1q tagged traffic• Multiple VLANs can share physical NIC• Physical switch port must be trunked

Cross host private network• Off host network with GRE tunnel• Requires DVSC or Apache CloudStack controller
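Creating an external VLAN network for guests can be sketched as follows (the VLAN ID and UUIDs are placeholders; the physical switch port must be trunked):

# create the guest network, then bind it to VLAN 101 on a physical PIF
xe network-create name-label="Guest VLAN 101"
xe vlan-create network-uuid=<network-uuid> pif-uuid=<trunked-nic-pif-uuid> vlan=101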

Page 86: XenServer Design Workshop

Storage Topologies

Page 87: XenServer Design Workshop

XenServer storage concepts

A Storage Repository holds VDIs (virtual disk images). Each XenServer host attaches to the Storage Repository through a PBD (physical block device), and each virtual machine attaches to a VDI through a VBD (virtual block device).

Page 88: XenServer Design Workshop

Thick provisioning

With thick provisioning, disk space is allocated statically.

As virtual machines are created, their virtual disks utilize the entire available disk size on the physical storage.

This can result in a large amount of unused allocated disk space.

A virtual machine created using a 75 GB virtual disk would consume the entire 75 GB of physical storage disk space, even if it only requires a quarter of that.

Thick Provisioning example: a 75 GB virtual disk consumes the full 75 GB on the physical storage, with 25 GB actually used and 50 GB allocated but unused.

Page 89: XenServer Design Workshop

Thin Provisioning

With thin provisioning, disk space is allocated on an “as-needed” basis.

As virtual machines are created, their virtual disks will be created using only the specific amount of storage required at that time.

Additional disk space is automatically allocated for a virtual machine once it requires it. The unused storage space remains available for use by other virtual machines.

A virtual machine created using a 75 GB virtual disk, but that only uses 25 GB, would consume only 25 GB of space on the physical storage.

Thin Provisioning example: a 75 GB virtual disk with 25 GB actually used consumes only 25 GB on the physical storage, leaving 50 GB free for allocation.

Page 90: XenServer Design Workshop

Sparse Allocation

Sparse allocation is used with thin provisioning.

As virtual machines are created, their virtual disks will be created using only the specific amount of storage required at that time.

Additional disk space is automatically allocated for a virtual machine once it requires it. If the OS allocates the blocks at the end of the disk, intermediate blocks will become allocated

A virtual machine created using a 75 GB virtual disk, but that uses 35 GB in two blocks, could consume between 35 GB and 75GB of space on the physical storage.

Sparse Allocation example: a 75 GB virtual disk with 25 GB used at the start and 10 GB used near the end can consume up to the full 75 GB on the physical storage, with roughly 40 GB allocated but unused in between.

Page 91: XenServer Design Workshop

XenServer Disk Layouts (Local)

Default layout: the local disk holds the dom0 partition (4GB), a backup partition, and an LVM volume group used as the Storage Repository; virtual machine disks are LVHD logical volumes (thick provisioned).

EXT-based layout: the local disk holds the dom0 partition (4GB), a backup partition, and an EXT file system used as the Storage Repository; virtual machine disks are thin-provisioned VHD files (xxx.vhd, yyy.vhd, ...), each containing a VHD header plus the guest's OS partition & file system.

Page 92: XenServer Design Workshop

XenServer Disk Layouts (Shared)

Native iSCSI & Fibre Channel: a "raw" SAN LUN holds an LVM volume group used as the Storage Repository; virtual machine disks are LVHD logical volumes (thick provisioned).

NFS-based storage: a NAS volume exported as an NFS share holds the Storage Repository; virtual machine disks are VHD files (xxx.vhd, yyy.vhd, ...).

Page 93: XenServer Design Workshop

Fibre Channel LUN Zoning

Since Enterprise SANs consolidate data from multiple servers and operating systems, many types of traffic and data are sent through the interface, whether it is fabric or the network.

With Fibre Channel, to ensure security and dedicated resources, an administrator creates zones and zone sets to restrict access to specified areas. A zone divides the fabric into groups of devices.

Zone sets are groups of zones. Each zone set represents different configurations that optimize the fabric for certain functions.

WWN - Each HBA has a unique World Wide Name (similar to an Ethernet MAC)

node WWN (WWNN) - can be shared by some or all ports of a device
port WWN (WWPN) - necessarily unique to each port

Page 94: XenServer Design Workshop

Fibre Channel LUN Zoning: FC switch example

Hosts Xen1 and Xen2 (Pool1) and host Xen3 (Pool2) reach the storage through an FC switch. Zone1 contains the Xen1 and Xen2 WWNs plus the storage WWN; Zone2 contains the Xen3 WWN plus the storage WWN. On the array, the initiator group for Xen1/Xen2 is presented LUN0 and LUN1, while the initiator group for Xen3 is presented LUN2.

Page 95: XenServer Design Workshop

iSCSI Isolation

With iSCSI type storage, a similar concept of isolation as Fibre Channel zoning can be achieved by using IP subnets and, if required, VLANs.

IQN – Each storage interface (NIC or iSCSI HBA) has a unique iSCSI Qualified Name configured

Target IQN – Typically associated with the storage provider interface
Initiator IQN – Configured on the client side

IQN format is standardized:iqn.yyyy-mm.{reversed domain name} (e.g. iqn.2001-04.com.acme:storage.tape.sys1.xyz)
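The host's initiator IQN can be inspected and overridden from the CLI (a sketch; the IQN shown is a placeholder and should follow the format above):

# show the IQN XenServer generated at install time
xe host-param-get uuid=<host-uuid> param-name=other-config param-key=iscsi_iqn
# replace it with a site-standard IQN before attaching iSCSI SRs
xe host-param-set uuid=<host-uuid> other-config:iscsi_iqn=iqn.2015-09.com.example:xen1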

Page 96: XenServer Design Workshop

iSCSI Isolation example

Hosts Xen1 and Xen2 (Pool1) use their initiator IQNs on VLAN1/Subnet1 to reach the target IQN of storage controller interface 1, which presents LUN0 and LUN1. Host Xen3 (Pool2) uses its initiator IQN on VLAN2/Subnet2 to reach the target IQN of controller interface 2, which presents LUN2. All paths run through the same network switch; the VLANs and subnets provide the isolation.

Page 97: XenServer Design Workshop

Storage multipathing

• Routes storage traffic over multiple physical paths

• Used for redundancy and increased throughput

• Unique logical networks are required

• Available for Fibre Channel and iSCSI

• Uses Round-Robin Load Balancing (Active- Active)

Diagram: a XenServer host with interfaces on two networks (192.168.1.200 and 192.168.2.200) reaches the storage array over both; storage controller 1 answers on 192.168.1.201 and 192.168.2.201, controller 2 on 192.168.1.202 and 192.168.2.202, giving multiple independent paths.
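Multipathing is switched on per host; a sketch using the 6.x other-config keys (put the host into maintenance mode first so its PBDs are unplugged):

# enable dm-multipath handling for this host's storage
xe host-param-set uuid=<host-uuid> other-config:multipathing=true
xe host-param-set uuid=<host-uuid> other-config:multipathhandle=dmp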

Page 98: XenServer Design Workshop

Understanding dom0 storage

dom0 isn’t general purpose Linux• Don’t manage storage locally• Don’t use software RAID• Don’t mount extra volumes• Don’t use dom0 storage as “scratch”

Local storage is automatically an SR

Adding additional local storage• xe sr-create host-id=<host> content-type=user name-label=”Disk2” device-config:device=”/dev/sdb” type=ext

Spanning multiple local storage drives• xe sr-create host-id=<host> content-type=user name-label=”Group1” device-config:device=”/dev/sdb,/dev/sdc” type=ext

Page 99: XenServer Design Workshop

Snapshots

Page 100: XenServer Design Workshop

Snapshot Behavior Varies By

The type of SR in use• LVM-based SRs use “volume-based” VHD• NFS and ext SRs use “file-based” VHDs• Native SRs use capabilities of array

Provisioning type• Volume-based VHDs are always thick provisioned• File-based VHDs are always thin provisioned

For LVM-based SR types• If SR/VM/VDI created in previous XS version, VDIs (volumes) will be RAW
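For reference, taking and later reverting a snapshot from the CLI (names and UUIDs are placeholders):

# snapshot a VM's disks (use vm-checkpoint to include memory)
xe vm-snapshot uuid=<vm-uuid> new-name-label=before-patching
# roll the VM back to that snapshot
xe snapshot-revert snapshot-uuid=<snapshot-uuid>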

Page 101: XenServer Design Workshop

Snapshot (NFS and EXT Local Storage)

Resulting VDI tree and disk utilization• VHD files are thin provisioned• VDI A (the parent) contains the writes made up to the point of snapshot• VDI B and VDI C (the snapshot and the new active clone) are empty*• Totals: VDI A: 20, VDI B: 0*, VDI C: 0*• The snapshot requires no space*

In the diagram each VDI is 40GB in size; A has 20GB of data written, B and C have 0. (* plus VHD headers)

Page 102: XenServer Design Workshop

Snapshot (Local LVHD, iSCSI or FC SR)

Resulting VDI tree and disk utilization• Volumes are thick provisioned, deflated where possible• Totals: VDI A: 20, VDI B: 40*, VDI C: 0*• The snapshot requires 40 + 20GB

In the diagram each VDI is 40GB in size; the deflated parent A holds 20GB of written data, B stays inflated at 40GB, and C is deflated to 0. (* plus VHD headers)

Page 103: XenServer Design Workshop

Automated Coalescing Example

1) A VM with two snapshots: parent A has children C (a snapshot) and B; B in turn is the parent of E (a snapshot) and D (the active VDI).

2) When snapshot C is deleted, A is left with the single child B.

3) Parent B is no longer required and will be coalesced into A; the combined A+B then has children D and E.

http://support.citrix.com/article/CTX122978

Page 104: XenServer Design Workshop

Suspend VM / Checkpoints

Suspend and snapshot checkpoints store VM memory content on storage

The storage selection process• The SR specified in the pool parameter suspend-image-sr is used• By default suspend-image-sr is the pool's default storage repository• If no suspend-image-sr is set at the pool level (e.g. there is no default SR), XenServer falls back to the local SR of the host running the VM

Size of suspend image is ~ 1.5 * memory size

Best practice: configure an SR as the suspend images store • xe pool-param-set uuid=<pool uuid> suspend-image-SR=<shared sr uuid>
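The related CLI operations, for completeness (UUIDs are placeholders):

# suspend a VM: its memory image lands on the suspend-image SR chosen above
xe vm-suspend uuid=<vm-uuid>
# a checkpoint captures disks plus memory as a snapshot
xe vm-checkpoint uuid=<vm-uuid> new-name-label=checkpoint-1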


Page 105: XenServer Design Workshop

Snapshot storage utilization

LVM-based VHD: for a 60GB VDI that is 50% allocated (30GB of data used), the read-only parent image (VHD) and the read-write child image (VHD) each occupy the full 60GB.

File-based VHD: for the same 60GB VDI (50% allocated, 30GB of data used), the read-only parent image (VHD) occupies the 30GB written before cloning, and each read-write child image (VHD) only grows with the data written to disk since cloning (thin).

Page 106: XenServer Design Workshop

Integrated Site Recovery Details

Page 107: XenServer Design Workshop

Integrated Site Recovery

Supports LVM SRs only

Replication/mirroring setup outside scope of solution• Follow vendor instructions• Breaking of replication/mirror also manual

Works with all iSCSI and FC arrays on the HCL

Supports active-active DR

Page 108: XenServer Design Workshop

Feature Set

Integrated in XenServer and XenCenter

Support failover and failback

Supports grouping and startup order through vApp functionality

Failover pre-checks• Powerstate of source VM• Duplicate VMs on target pool• SR connectivity

Ability to start VMs paused (e.g. for dry-run)

Page 109: XenServer Design Workshop

How it Works

Depends on “Portable SR” technology• Different from Metadata backup/restore functionality

Creates a logical volume on SR during setup

Logical Volume contains• SR metadata information• VDI metadata information for all VDIs stored on SR

Metadata information is read via sr-probe during failover

Page 110: XenServer Design Workshop

Integrated Site Recovery - Screenshots