
Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Page 1: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Private Cloud: Sample Architectures for >1000 VM

Singapore, Oct 2011

Iwan ‘e1’ Rahabok
virtual-red-dot.blogspot.com | tinyurl.com/SGP-User-Group

M: +65 9119-9226 | [email protected]

VCAP-DCD

Page 2: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Purpose of This Document
There is a lot of talk about Cloud Computing, but what does it look like at a technical level?

• How do we really assure SLAs and have 3 tiers of service?
• If I have 1000 VM, what does the architecture look like?

This is my personal opinion.
• Please don't take it as an official, formal VMware Inc recommendation. I'm not authorised to make one.
• Also, we should generally judge the content rather than the organisation/person behind it. A technical fact is a technical fact, regardless of who said it.

Technology changes
• SSD disks, >10-core CPUs, FCoE, CNA, vStorage APIs, storage virtualisation, etc. will impact the design. A lot of new innovation is coming within the next 2 years.

• New modules/products from VMware & Ecosystem Partners will also impact the design.

This is just a sample
• Not a Reference Architecture, let alone a Detailed Blueprint.
• So please don't print it and follow it to the dot. This is for you to think about and tailor.

It is written for hands-on vSphere Admins who have attended the Design Workshop & ICM courses
• You should be at least a VCP 5, preferably a VCAP-DCD.
• Features are not explained here.
• A lot of the design considerations are covered in the vSphere Design Workshop.

Folks, some disclaimers, since I am an employee of VMware.

Page 3: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Table of Contents

Introduction
• Requirements, Assumptions, Considerations, and Design Summary

vSphere Design: Data Center
• Data Center, Cluster (DRS, HA, DPM, Resource Pool)

vSphere Design: Server
• ESXi, physical host

vSphere Design: Network
vSphere Design: Storage
vSphere Design: Security
• vCenter roles/permissions

vSphere Design: VM
vSphere Design: Management
• Performance troubleshooting

Page 4: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Design Methodology
Architecting a Private Cloud is not a sequential process.
• There are 6 components.
• The components are inter-linked, like a mesh.
• In the >1000 VM category, where implementation takes >2 years, new vSphere releases will change the design.

Even the bigger picture is not sequential.
• Sometimes you may even have to leave Design and go back to Requirements or Budgeting.

Again, there is no perfect answer; below is one example. This entire document is about Design only. Operations is another big space.

• I have not taken into account Audit, Change Control, ITIL, etc.

[Diagram: the 6 inter-linked components (VM, Server, Storage, Data Center, Network, Management) and the order the steps actually take.]

Page 5: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Introduction

Page 6: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Assumptions
Assumptions are needed to avoid the infamous "It depends…" answer.
• The architecture for 50 VM differs from that for 500 VM, which in turn differs from that for 5000 VM.
• A design for large VMs (16 vCPU, 128 GB) differs from a design for small VMs (1 vCPU, 1 GB).
• A design for a Server farm differs from one for a Desktop farm.

This assumes 100% virtualised, not 99%.
• It is easier to have 1 platform than 2.
• Certain things in a company you should only have 1 of (email, directory, office suite, backup). Something as big as a "platform" should be standardised. That's why they are called platforms.

Out of the 1000 VM, we assume some will be…
• Huge: 10 vCPU, 96 GB RAM, 10 TB storage.
• Latency sensitive: 0.01 ms end-to-end latency.
• Secret: holding company secrets.

We assume it will have…
• 50 databases, a mix of Oracle and SQL Server.
• Other Oracle software (charged per "cluster").

The design is "forward looking".
• Based on a 10 GE network.
• Assumes the Security team can be convinced on mixed-mode.

Page 7: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Assumptions used in this example

# VM that our design needs to cater for: 750 Production, 2250 Non-Production (1:3 ratio), 5 data centers
Data Center: 2 large ones (Singapore + HK) with private connectivity; 5 small ones (to comply with country regulations)
# Desktops/Laptops: 10000, with remote access + 2FA (RSA); need offline VDI and iPad access
DMZ Zone / SSLF Zone: Yes / Yes. Intranet also zoned
Backup: Tape
Network standard: Cisco
ITIL Compliance: In place
Change Management: In place
Overall System Mgmt SW (BMC, CA, etc.): Yes, CA
Database: Oracle, SQL Server. Some have >5 TB databases
MSCS: Required
Audit Team: External & Internal
Oracle software (BEA, DB, etc.): Yes. "Sub-clusters" will be used
IT Organisation: Different teams for Server, Storage, Network, Security, Database, etc.
Fault Tolerance: Yes (for Tier 0)
Complex App dependency: Yes (some apps span >30 VM)

Page 8: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Application considerations

Type of VM, and its impact on design:

App that holds sensitive data
• Should encrypt the data or the entire file system. vSphere 5 can't encrypt the vmdk file yet. If you encrypt the Guest OS, the backup product may not be able to do file-level backup.
• Should ensure no access by the MS AD Administrators group. Find out how it is backed up, and who has access to the tape. If IT does not even have access to the system, then vSphere may not pass the audit requirement.
• Check partner products like Intel TXT and HyTrust.
• Should it be placed on a separate cluster, or even a separate vCenter?

A group of apps with a complex power-on sequence
• I recommend setting the HA Isolation Response to shut down the VMs running on the isolated host. If they are shut down, powering them on may need App Owner involvement (especially if it needs manual intervention).

App that takes advantage of a specific CPU instruction set
• Mixing with an older CPU architecture is not possible. This is a small problem if you are buying new servers. EVC will not help, as it's only a mask. See speaker notes.

App that needs <0.01 ms end-to-end latency
• Separate cluster.

Page 9: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Application considerations

App that requires a software dongle
• The dongle must be attached to 1 ESX host; vSphere 4.1 adds this support. Best to use a network dongle. In the DR site, the same dongle must be provided too.

App with high IOPS
• May need its own datastore with dedicated spindles. There is no point having dedicated datastores if the underlying spindles are shared among multiple datastores.

App that uses a very large block size
• SharePoint uses a 256 KB block size, so a mere 400 IOPS will already saturate a GE link. For such applications, FC or FCoE is a better protocol. Any application with a 1 MB block size can easily saturate a 1 GE link.

App with very large RAM or vCPU
• This will impact DRS when an HA event occurs, as there needs to be a host that can house the VM. It will still boot as long as the reservation is not set to a high number.

App that is very sensitive to time accuracy
• Time drift is a possibility in the virtual world. Find out the business or technical impact if time deviates by 10 seconds.
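Where time accuracy matters, it helps to know which VMs sync guest time with the host via VMware Tools. A minimal PowerCLI sketch (assuming a live vCenter connection; the property path follows the vSphere API):

```powershell
# Report the VMware Tools host-time-sync setting for every VM
Get-VM | Select-Object Name,
    @{N='ToolsTimeSync'; E={ $_.ExtensionData.Config.Tools.SyncTimeWithHost }}
```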

Page 10: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Architecting a private cloud: what to consider
Architecture is an art.
• It balances a lot of things, some not even technical.
• It considers the future (unknown requirements).
• It tries to stay close to best practice.
• In no particular order, below is what I consider in this vSphere-based architecture.

My personal principle: do not design something you cannot troubleshoot.
• A good IT Architect does not set up potential risks for the Support Person down the line.
• Not all counters/metrics/info are visible in vSphere.

Considerations
• Upgradability
  • This is unique to the virtual world: a key component of cloud that people have not talked about much.
  • Once all my apps run on the virtual infrastructure, how do I upgrade the virtualisation layer itself?
  • Based on historical data, VMware releases a major upgrade every 2-3 years. vSphere 4.0 was released in May 2009, 5.0 in Sep 2011.
  • If you are laying down an architecture, check with your VMware rep for an NDA roadmap presentation.
• Debuggability
  • Troubleshooting in a virtual environment is harder than in a physical one, as boundaries are blurred and physical resources are shared.
  • 3 types of troubleshooting:
    • Configuration. This does not normally happen in production: once something is configured, it is not normally changed.
    • Stability. Stability means something hangs, crashes (BSOD, PSOD, etc.) or gets corrupted.
    • Performance. This is the hardest of the 3, especially if the slow performance is short-lived and the system performs well most of the time.

Page 11: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Architecting a private cloud: what to consider

Considerations (continued)
• Supportability
  • Related to, but not the same as, debuggability. Support relates to things that make day-to-day support easier: monitoring counters, reading logs, setting up alerts, etc. For example, centralising logs via syslog and providing intelligent search (e.g. using Splunk or Integrien) improves supportability.
  • A good design makes it harder for the Support team to make human errors. Virtualisation makes tasks easy, sometimes way too easy relative to the physical world. Consider this operational/psychological impact in your design.
  • Support also means using components that are supported by the vendors. For example, SAP support starts from certain versions onwards (older versions are not supported).
• Availability
  • Software has bugs. Hardware has faults. We mostly cater for hardware faults; what about software bugs?
  • Cater for software bugs. This is why the design has 2 VMware clusters with 2 vCenters: it lets you test cluster-related features in one cluster while keeping your critical VMs on the other.
  • A Tier 0 can be added that uses Fault Tolerant hardware (e.g. Stratus).
• Reliability
  • Related to availability, but not the same. Availability is normally achieved by redundancy. Reliability is normally achieved by keeping things simple, using proven components, separating things, and standardising.
  • You will notice a lot of standardisation in the design. The drawback of standardisation is overhead, as we have to round up to the next bracket: a VM needing 6 GB RAM ends up getting 8 GB.
• Performance
  • Storage, Network, VMkernel, VMM, Guest OS, etc. are all considered.
  • We are aiming for <1% CPU Ready Time and near-zero Memory Ballooning in Tier 1. In Tier 3, we can and should have higher ready time and some ballooning, as long as it still meets the SLA.
• Scalability
  • Includes both horizontal and vertical, both hardware and software.

Page 12: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Architecting a private cloud: what to consider

Considerations (continued)
• Cost
  • An even bigger cost is ISV licensing. Dedicating a cluster for those apps is cheaper.
  • The DR Site serves multiple purposes.
  • VMs from different Business Units are mixed in 1 cluster. If they can share the same Production LAN and SAN, the same reasoning can apply to the hypervisor.
  • Windows, Linux and Solaris VMs are mixed in 1 cluster.
• Security
  • The vSphere Security Hardening Guide splits security into 3 levels: Production, DMZ and SSLF.
  • vShield is used to complement vSphere.
  • Change the paradigm in security: from "the hypervisor as another point to secure" to "the hypervisor as an unfair advantage for the security team".
• Skills of the IT team
  • Skills include both internal and external (preferred vendors who complement the IT team).
• Improvement
  • Besides meeting current requirements, can we improve things?
  • Move toward "1 VM, 1 OS, 1 App". In the physical world, some servers serve multiple purposes. In the virtual world, teams can afford, and should choose, to run 1 App per VM.
  • We consider Desktop Virtualisation in the overall architecture.

Page 13: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Data Center Design
Data Center, Cluster, Resource Pool, DRS, DPM

Page 14: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Methodology

Define how many physical data centers are required.
• DR requirements normally dictate 2.

For each physical DC, define how many vCenters are required.
• Desktop and Server should be separated by vCenter.
  • View 5 comes with bundled vSphere (unless you are buying the add-on).
• Security requirements, not scalability, drive this one.
  • In our sample scenario, it does not warrant separation.
• Different vCenter Admins drive this segregation.

For each vCenter, define how many virtual data centers are required.
• A Virtual Data Center serves as a naming boundary.
• Different paying customers drive this segregation.

For each vDC, define how many Clusters are required.
For each Cluster, define how many ESXi hosts are required (see the sketch after the diagram below).
• Preferably 4 to 8; 2 is too small. Adjust according to workload.
• Standardise the host spec across clusters. While each cluster can have its own host type, this adds complexity.

[Diagram: inventory hierarchy. Physical DC > vCenter > Virtual DC > Cluster > ESXi. A physical DC can hold multiple vCenters, each vCenter multiple virtual DCs, and each virtual DC multiple clusters.]
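The same hierarchy can be laid down programmatically. A minimal PowerCLI sketch (vc01.example.com, SG-DC01 and Tier1-Cluster are placeholder names, not from this deck):

```powershell
# Connect to vCenter, create a virtual data center, and add a DRS/HA cluster inside it
Connect-VIServer -Server vc01.example.com      # hypothetical vCenter
$root = Get-Folder -NoRecursion                # inventory root of this vCenter
$dc   = New-Datacenter -Location $root -Name "SG-DC01"
New-Cluster -Location $dc -Name "Tier1-Cluster" `
    -HAEnabled -DrsEnabled -DrsAutomationLevel FullyAutomated
```

ESXi hosts are then added per cluster with Add-VMHost, following the 4-to-8 hosts guideline above.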

Page 15: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Data Center and Cluster
When do you decide to use a separate…
• Cluster?
• Data center?
• vCenter?

Input needed to decide the above:
• Application licensing
• Application workload
• Application hardware requirements

Page 16: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Special scenarios
There are scenarios where you might need to create a separate cluster, or even a separate vCenter.

Large VMs (>6 vCPU, >36 GB RAM)
• Separate cluster, as the ESXi host spec is different?

Databases. Do you…
• …group them into 1 cluster (to save licences, give DBAs more access to the cluster, vShield group)?
• …or put them together with the apps they support?
• …put DBs used by IT in the same cluster as DBs used by the Business?

Oracle VMs
• Separate cluster or sub-cluster?

VMs that need a hardware dongle
• Use a network-based dongle.
• Separate sub-cluster. The same will also be needed at the DR Site.

VMs holding company secrets
• Do you put them in a separate cluster? Can you trust the vCenter Admin?
• Do you put them in a separate datastore? Do you use a VSA because you can't trust the SAN Admin?
• Enhance the security with vShield.

VMs with 0.1 ms network latency requirements
• Do you put them in a separate cluster, as your ESXi has to be configured differently?

VMs with 5 ms disk latency requirements
VMs in the DMZ zone
• Same cluster. We will use vShield.

Page 17: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Overall Architecture

This shows an example of a Cloud for >500 VM. It also uses Active/Passive data centers. The overall architecture remains similar for a Large Cloud.

Page 18: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Cluster Design (1 DC)

[Diagram: cluster design for the Primary Data Center (Active). vCenter 1 (standalone) manages the Confidential Cluster and Confidential VMs on NFS storage. vCenter 2 (with Linked Mode and SRM integration) manages the Tier 1, Tier 2 and Tier 3 Clusters, the Special Clusters and the IT Cluster, connected to the SAN fabric (Tier 1/2/3 FC storage, tape backup) and an NFS LAN. vCenter 3 manages Desktop Cluster 1 to Desktop Cluster N (8 ESXi hosts each); the management VMs for Desktops reside in the IT Cluster.]

Page 19: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


The need for a Non-Prod Cluster
This is unique to the virtual data center.
• Well, we don't have "clusters" to begin with in a physical DC.

The Non-Prod Cluster serves multiple purposes:
• Run Non-Production VMs.
  • In our design, all Non-Production VMs run on the DR Site to save cost.
  • A consequence of our design is that migrating from/to Production can mean copying large data across the WAN.
• Disaster Recovery.
• Test bed for infrastructure patching or updates.
• Test bed for infrastructure upgrades or expansion.
• Evaluating or implementing new features.
  • In a Virtual Data Centre, a lot of enhancements can impact the entire data centre, e.g. Distributed Switch, Nexus 1000V, Fault Tolerance, vShield.
  • All of the above need proper testing.
  • The Non-Prod Cluster should provide a sufficiently large scope to make testing meaningful.
• Upgrade of the core virtual infrastructure.
  • e.g. from vSphere 5 to a future version (major release).
  • This needs extensive testing and a roll-back plan.

Even with all the above…
• How are you going to test SRM properly?
  • An SRM test needs 2 vCenters, 2 arrays, 2 SRM servers. If all are used in production, then where is the test environment for SRM?

[Diagram callout: the virtualisation layer sits between Business and IT. This new layer does not exist in the physical world; it is software, hence it needs its own Non-Prod environment.]

Page 20: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


The need for an IT Cluster

A special-purpose cluster:
• Runs all the IT VMs used to manage the virtual DC or provide core services.
• The central management tools will reside here too.
• Separated for ease of management & security.

This separation keeps the Business Clusters clean: "strictly for business".

IT VMs in a Large Cloud:
• VMware: vCenter (for the Server Cloud), vCenter Heartbeat, vCenter Update Manager, Symantec AppHA Server, vCloud Director
• Storage: storage management tool (may need a physical RDM to get fabric info)
• Network: network management tool, Nexus 1000V Manager (VSM)
• Core Infra: MS AD 1, MS AD 2, syslog server, file server (FTP server)
• Advanced vDC Services: Site Recovery Manager + DB, Chargeback + DB, agentless AV, object-based firewall
• Security: security management server, vShield Manager
• Admin: admin client (1 per Sys Admin), VMware Converter, vMA, vCenter Orchestrator
• Application Mgmt: App Dependency Manager
• Management: vCenter Ops + DB, Help Desk
• Desktop: View Managers + DB, ThinApp Update Server, vCenter (for the Desktop Cloud)

Page 21: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Cluster Size
I recommend 8 nodes per cluster. Why 8, and not 4, 12, 16 or 32?
• A balance between too small (4 hosts) and too large (>12 hosts).
• DRS: 8 gives DRS sufficient hosts to "maneuver". 4 is rather small from the DRS scheduler's point of view.
• With vSphere 4.1, 4 hosts do not give enough hosts to do a "sub-cluster".
• For cost reasons, some clusters can be as small as 2 nodes, but then the DPM benefit can't be used.

Best practice for a cluster is the same hardware spec with the same CPU frequency.
• Eliminates the risk of incompatibility.
• Consistent performance (from the user's point of view).
• Complies with Fault Tolerance & VMware View best practices.
• So more than 8 means it's more difficult/costly to keep them all the same: you need to buy 8 hosts at a time. Upgrading >8 servers at a time is expensive ($$) and complex, and a lot of VMs will be impacted.

Manageability
• Too many hosts are harder to manage (patching, performance troubleshooting, too many VMs per cluster, HW upgrades).
• 8 nodes allow us to isolate 1 host for VM-troubleshooting purposes. At 4 nodes, we can't afford such "luxury".

Too many paths to a LUN can be complex to manage and troubleshoot.
• Normally, a LUN is shared by 2 clusters, which are "adjacent" clusters.
• 1 ESX host is 4 paths, so 8 ESX hosts are 32 paths, and 2 clusters are 64 paths. This is a rather high number (compared with the physical world).

N+2 for Tier 1 and N+1 for others (see the admission-control sketch below).
• With 8 hosts, you can withstand 2 host failures if you design for it.
• At 4 nodes, it is too expensive, as the payload is only 50% at N+2.

Small cluster size
• From an Availability and Performance point of view, this is rather risky.
• Say you have a 3-node cluster… You are doing maintenance on Host 1 and suddenly Host 2 goes down… you are exposed with just 1 node. Assuming HA Admission Control is enabled (which it should be), the affected VMs may not even boot: when a host is placed into maintenance mode, or disconnected for that matter, it is taken out of the admission control calculation.
• Cost: too few hosts result in overhead (the "spare" host).

See slide notes for more details
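To make the N+2 / N+1 policy concrete, here is a minimal PowerCLI sketch (cluster names are placeholders; the failover level maps directly to the number of host failures tolerated):

```powershell
# Tier 1: HA admission control reserves capacity for 2 host failures (N+2)
Get-Cluster "Tier1-Cluster" | Set-Cluster -HAEnabled:$true `
    -HAAdmissionControlEnabled:$true -HAFailoverLevel 2 -Confirm:$false

# Other tiers tolerate 1 host failure (N+1)
Get-Cluster "Tier2-Cluster", "Tier3-Cluster" | Set-Cluster `
    -HAAdmissionControlEnabled:$true -HAFailoverLevel 1 -Confirm:$false
```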

Page 22: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


3-Tier cluster

The host spec can be identical, but the service can be very different. Below is an example of a 3-tier cluster design.

Tier 1
• # Hosts: 5 (always). Node spec: always identical. Failure tolerance: 2 hosts. MSCS: yes. # VM: max 18.
• Monitoring: application level; extensive alerts.
• Remarks: only for critical apps. No resource overcommit.

Tier 2
• # Hosts: 4 to 8. Node spec: maybe identical. Failure tolerance: 1 host. MSCS: limited. # VM: 10 per (N-1) host.
• Remarks: apps can be vMotioned to Tier 1 during critical runs.

Tier 3
• # Hosts: 4 to 8. Node spec: not identical. Failure tolerance: 1 host. MSCS: no. # VM: 15 per (N-1) host.
• Monitoring: infrastructure level; minimal alerts.
• Remarks: some resource overcommit.

Page 23: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


ESXi Host: CPU Sizing

ESXi Host: CPU
• 2-4 vCPU per physical core.
  • This is a general guideline.
  • Not meant for sizing Tier 1 applications; Tier 1 apps should be given 1:1 sizing.
  • More applicable for Test/Dev or Tier 3.
• A 12-core box gives 24-48 vCPU.
• Design with ~10 VM per box in Production and ~15 VM per box in Non-Production.
  • ~10 VM per box means the impact of downtime when a host fails is capped at ~10 Production VMs.
  • ~10 VM per box in an 8-node cluster means the ~10 VMs may be able to boot on the remaining 7 hosts in the event of HA, hence reducing downtime.
  • Based on a 10:1 consolidation ratio, if all your VMs are 3 vCPU, then you need 30 vCPU, which means a 12-core ESX host gives a 2.5:1 CPU oversubscription.
  • Based on a 15:1 consolidation ratio, if all your VMs are 2 vCPU, then you need 30 vCPU.
• Buffer for the following:
  • HA events
  • Performance isolation
  • Hardware maintenance
  • Peaks: month end, quarter end, year end
  • Future requirements: within 12 months
  • DR, if your cluster needs to run VMs from the Production site.
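The vCPU:pCore ratio above is easy to keep an eye on. A rough PowerCLI sketch (the cluster name is a placeholder; NumCpu on a host object is its physical core count, on a VM its vCPU count):

```powershell
# Report the vCPU:pCore oversubscription per host in a cluster
Get-Cluster "Tier3-Cluster" | Get-VMHost | ForEach-Object {
    $vmhost = $_
    $vcpu = ($vmhost | Get-VM |
        Where-Object { $_.PowerState -eq 'PoweredOn' } |
        Measure-Object -Property NumCpu -Sum).Sum
    [pscustomobject]@{
        Host  = $vmhost.Name
        Cores = $vmhost.NumCpu
        vCPU  = $vcpu
        Ratio = [math]::Round($vcpu / $vmhost.NumCpu, 2)
    }
}
```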

Page 24: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


ESXi: Sample host specification

Estimated hardware cost: US$8K per ESXi host. Configuration included in the above price:
• 2x Xeon X5650. The E series has different performance & price attributes.
• 72 GB RAM (18 slots x 4 GB) or 96 GB RAM (12 slots x 8 GB).
• 2x 10 GE ports (no hardware iSCSI).
• 2x 8 Gb FC HBA.
• 5-year warranty (next business day).
• 2x 50 GB SSD, for:
  • the swap-to-host-cache feature in ESXi 5;
  • running agent VMs that are IO intensive;
  • troubleshooting; could be handy, and only 1 disk is needed for that purpose.
• PXE boot.
  • No need for local disks.
• Installation service.
• Lights-Out Management. Avoid using WoL; use IPMI or HP iLO.

Page 25: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Blade or Rack
Both are good; both have pros and cons. The table below is a relative comparison, not an absolute one.
• Consult the principal for specific models. Below is just a guideline.
The comparison below is only for vSphere purposes, not for other use cases, say HPC or non-VMware workloads.

Blade: relative advantages
• Some blades come with built-in 2x 10 GE ports. To use them, you just need to get a 10 GE switch.
• Flexibility. Some blades virtualise the 10 GE NIC and can slice it. As usual, adding another layer adds complexity.
• Less cabling.
• Better power efficiency. Better rack-space efficiency.
• Better cooling efficiency. The larger fan (4 RU) is better than the small fan (2 RU) used in rack servers.
• Some blades can be stateless. The management software can clone 1 ESX host to another.
• Better management.

Rack: relative advantages
• A typical 2 RU rack server normally comes with 4 built-in ports.
• Better suited for <20 ESX hosts per site.
• More local storage.

Blade: relative disadvantages
• Another level or layer of virtualisation. It adds complexity and must be learned.
• Some replacements or major upgrades may require all blades to be powered off.
• Some have limited PCI slots (2 slots). Ensure that the required number of NIC ports and HBAs can be met.
• Best practice recommends 2 enclosures. The enclosure is passive in some models (it does not contain electronics), so there can be an initial cost, as each chassis needs to have switches too.
• Ownership of the SAN/LAN switches in the chassis needs to be made clear.
• Need to learn the rules of the chassis/switches. Positioning of the switch matters in some models.
• The common USB port in the enclosure may not be accessible by ESX. Check with the respective blade vendor.
• A USB dongle (which you should not use) can only be mounted in front. Make sure it's short enough that you can still close the rack door.

Rack: relative disadvantages
• The 1 RU rack server has a very small fan, which is not as good as a larger fan.
• Less suited when each DC is big enough to have 2 chassis.
• Cabling & rewiring.

Page 26: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Server Selection
All Tier 1 vendors (HP, Dell, IBM, Cisco, etc.) make great ESXi hosts.
• Hence the following guidelines are relatively minor compared to the base spec.

Additional guidelines for selecting an ESXi server:
• Does it have Embedded ESXi?
• How much local SSD (capacity and IOPS) can it handle? This is useful for a stateless desktop architecture, and when using local SSD as cache or virtual storage.
• Does it have built-in 2x 10 GE ports?
• Does the built-in NIC have hardware iSCSI capability?
• Memory cost. Most ESXi servers have around 64 to 128 GB of RAM, mostly around 72 GB. With 4 GB DIMMs, that needs a lot of DIMM slots.
• What are the server's unique features for ESXi?
• Management integration. The majority of server vendors have integrated management with vCenter. Most are free; Dell's is not free, although it has more features?
• DPM support?

Page 27: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


SAN Boot
4 methods of ESXi boot:
• Local Compact Flash
• Local disk
• LAN boot (PXE) with Auto Deploy
• SAN boot

For the 3 sample sizes, we use ESXi Embedded.
• Environments with >20 ESXi hosts should consider Auto Deploy (a sketch follows below).
• Auto Deploy is also good for environments where you need to prove to the security team that your ESXi has not been tampered with (you can simply reboot it and it is back to "normal").

Advantages of local disk over SAN boot:
• No SAN complexity.
• With SAN boot, you need to label the LUNs properly.

Disadvantages of local disk versus SAN boot:
• Need 2 local disks, mirrored.
• Certain organisations do not like local disks.
  • A disk is a moving part: lower MTBF.
  • Removing it saves power/cooling.
• SAN boot is a step toward stateless ESXi.
  • An ideal ESX host is just pure CPU and RAM. No disk, no PCI card, no identity.
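For the Auto Deploy route, a minimal rule might look like this sketch (PowerCLI with the Image Builder and Auto Deploy snap-ins; the depot path, profile name, IP range and cluster are illustrative):

```powershell
# Import an ESXi image profile and map hosts in an IP range to a cluster
Add-EsxSoftwareDepot "C:\depots\ESXi5-depot.zip"       # hypothetical offline depot
$img = Get-EsxImageProfile -Name "ESXi-5.0.0-*-standard" | Select-Object -First 1
New-DeployRule -Name "Tier3Hosts" -Item $img, (Get-Cluster "Tier3-Cluster") `
    -Pattern "ipv4=10.10.4.10-10.10.4.60"
Add-DeployRule -DeployRule "Tier3Hosts"                # activate the rule
```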

Page 28: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Storage Design

Page 29: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Methodology

The flow is SLA, then Datastore, then Mapping, then QoS:
• SLA. For each VM, gather:
  • Capacity (GB)
  • Performance (IOPS) requirements
  • Importance to business: Tier 1, 2, 3
• Datastore. Define the datastore profile: set the standard (Storage Driven Profile). See the next slide for details.
• Mapping. Map each VM to a datastore; create another datastore if capacity or performance is insufficient. Map clusters to datastores.
• QoS. Once mapping is done, turn on QoS if needed:
  • Turn on Storage IO Control if a particular VM needs a certain guarantee.
  • Turn on Storage IO Control if we want fairness among all VMs within the datastore.
  • Storage IO Control is per datastore. If the underlying LUN shares spindles with other LUNs, it may not achieve the desired result. Consult your storage vendor on this, as they have visibility/control of the entire array.
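Enabling Storage IO Control itself is a one-liner per datastore. A PowerCLI sketch (the "T1-*" naming convention is an assumption, not from this deck):

```powershell
# Enable Storage IO Control on all Tier 1 datastores
Get-Datastore -Name "T1-*" | Set-Datastore -StorageIOControlEnabled $true
```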

Page 30: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


SLA: Types of Datastores
Not all datastores are equal.
• Always know the underlying IOPS & SLA that the array can provide for a given datastore.

You should always know where to place a VM.
• Use datastore groups.
• Always have a mental picture of where your Tier 1 VMs reside. It can't be "somewhere in the cloud".

Types of datastore:
• Business VM
  • Tier 1 VM, Tier 2 VM, Tier 3 VM, Single VM.
  • Each tier may have multiple datastores.
• DMZ VM
  • Mounted only by ESX hosts that have the DMZ network?
• IT VM
• Isolated VM
• Template
• Desktop VM
• SRM Placeholder
• Datastore Heartbeat?
  • Do we dedicate datastores for it?

1 datastore = 1 LUN
• Relative to "1 LUN = many VMFS", it gives better performance due to fewer SCSI reservations.

Page 31: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Special-Purpose Datastores

1 low-cost datastore for ISOs and Templates
• Need 1 per vCenter.
• Need 1 per physical data center; otherwise you will transfer GBs of data across the WAN.
• Around 500 GB.
• ISO directory structure:
  \ISO\
    \OS\Windows
    \OS\Linux
    \Non OS\  (store things like anti-virus, utilities, etc.)

1 staging/troubleshooting datastore
• To isolate a VM; proof to the Apps team that the datastore is not affected by other VMs.
• For storage performance studies or issues; makes it easier to correlate with data from the array.
• The underlying spindles should have enough IOPS & size for the single VM.
• Our sizing: 500 GB.

1 SRM Placeholder datastore
• So you always know where it is. Sharing with another datastore may confuse others.
• Used in SRM 5 to place the VM metadata so it can be seen in vCenter.
• 10 GB is enough. Low performance.

Page 32: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


SLA: 3-Tier pools of storage
Create 3 tiers of storage.
• These become the types of storage pool provided to VMs.
• Paves the way for standardisation.
  • 1 size for each tier. Keep it consistent; choose an easy number.
  • 20% free capacity for VM swap files, snapshots, logs, thin-volume growth, and Storage vMotion (inter-tier). A sketch for checking this follows the table below.
• Use thin provisioning at the array level, not the ESX level.
• Separate Production and Non-Production.

Example
• Replication is to the DR Site via array replication, not within the same building.
• Snapshot = protected with array-level snapshots for fast restore.
• RAID level does not matter so much if the array has sufficient cache (battery-backed, naturally).
• Data drives larger than 1 TB will be provisioned as RDM, in virtual-compatibility mode unless the Apps team says otherwise.

Tier 1: FC; 3000 IOPS; 10 ms latency; RAID 10; RPO 1 hour; RTO 1 hour (with SRM); size 1.0 TB; 70% limit; replicated hourly; array snapshots: yes; ~10 VM; EagerZeroedThick.
Tier 2: FC; 2000 IOPS; 15 ms latency; RAID 5; RPO 4 hours; RTO 4 hours (with SRM); size 2.0 TB; 80% limit; replicated 4-hourly; array snapshots: no; ~20 VM; normal thick.
Tier 3: FC; 1000 IOPS; 20 ms latency; RAID 5; RPO 8 hours; RTO 8 hours; size 3.0 TB; 80% limit; not replicated; array snapshots: no; ~30 VM; thin provisioned.

Consult your storage vendor for an array-specific design.
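The 20% free-capacity rule mentioned above is easy to police. A PowerCLI sketch (property names per recent PowerCLI; older releases expose CapacityMB/FreeSpaceMB instead):

```powershell
# Flag datastores that have dropped below 20% free capacity
Get-Datastore | Select-Object Name, CapacityGB, FreeSpaceGB,
    @{N='FreePct'; E={ [math]::Round(100 * $_.FreeSpaceGB / $_.CapacityGB, 1) }} |
  Where-Object { $_.FreePct -lt 20 }
```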

Page 33: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


3-Tier Storage?
Below is a sample diagram showing disk grouping inside an array.
• The array has 48 disks. Hot spares are not shown for simplicity.
• This example only has 1 RAID group (2+2) for simplicity.

Design considerations
• Datastore 1 and Datastore 2 performance can impact one another, as they share physical spindles.
  • The only way they don't is if there are "Share" and "Reservation" concepts at the "meta slice" level.
• Datastore 3, 4, 5 and 6 performance can impact one another.
• DS 1 and DS 3 can impact each other, since they share the same controller (or SP). This contention happens if the shared component becomes the bottleneck (e.g. cache, RAM, CPU).
  • The only way to prevent this is to implement "Share" or "Reservation" at the SP level.

Page 34: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Mapping: Cluster - Datastore
Always know which cluster mounts which datastores.
• Keep the diagram simple; not too much info. The idea is to have a mental picture that you can remember.
• If your diagram has too many lines, too many datastores, too many clusters, then it may be too complex. Create a Pod when that happens. Modularisation can be good.

Page 35: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Mapping: Datastore Replication

Page 36: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Mapping: Datastore – VM

Criteria to use when placing a VM into a tier:
• How critical is the VM? Importance to the business.
• What are its performance and availability requirements?
• What are its point-in-time restoration requirements?
• What are its backup requirements?
• What are its replication requirements?

Have a document that lists which VM resides on which datastore group (a PowerCLI sketch follows).
• The content can be generated using PowerCLI or Orchestrator, showing datastores and their VMs.
• While it rarely happens, you can't rule out datastore metadata getting corrupted.
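A minimal PowerCLI sketch of that document generator (the CSV file name is arbitrary):

```powershell
# Dump the datastore-to-VM mapping, with sizes, to a CSV file
Get-Datastore | ForEach-Object {
    $ds = $_
    Get-VM -Datastore $ds |
        Select-Object @{N='Datastore'; E={ $ds.Name }}, Name, ProvisionedSpaceGB
} | Export-Csv datastore-vm-map.csv -NoTypeInformation
```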

A VM normally changes tiers throughout its life cycle.
• Criticality is relative and might change for a variety of reasons, including changes in the organisation, operational processes, regulatory requirements, disaster planning, and so on.
• Be prepared to do Storage vMotion.

[Table: datastore group to VM mapping, listing VM name, size (GB) and IOPS per datastore group. Total: 12 VM, 1 TB, 1400 IOPS.]

Page 37: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Storage Calculation

We will split the System Drive and the Data Drive.
• Enables changing the OS by swapping the C:\ vmdk file.
• We use 10 GB for C:\ to cater for Win08 and give space for defragmentation.
• We use thin provisioning, preferably at the array level.

The sample calculation below is for our small cloud.
• 30 Production VMs: 26 non-DB + 3 DB + 1 file server.
  • Non-DB VM: 100 GB on average.
  • DB VM: 500 GB on average.
  • File server VM: 2 TB.
• 15 Non-Production VMs.

Production (non-DB)
• Capacity: average D:\ drive is 100 GB. Space needed: 2.6 TB. Datastores: 3 x 1 TB.
• IOPS: 100 IOPS x 26 VM = 2600 IOPS. Consult the storage team if this is too high; what if they have a similar peak period?
• Remarks: this is on the high side, so we don't have to add buffer for swap files, snapshots, or VMFS/NFS buffer.

Production (DB)
• Capacity: average D:\ drive is 500 GB. Space needed: 1.5 TB. Tiering in the Production DS.
• IOPS: 500 IOPS x 3 VM = 1500 IOPS. This is on the high side.

IT Cluster
• Capacity: average D:\ drive is 50 GB. The file server is 2 TB.
• IOPS: 100 IOPS per VM; 300 IOPS for the file server. This is on the high side.

Non-Production
• Capacity: average D:\ drive is 100 GB. Space needed: 1.5 TB. Datastores: 2 x 1 TB.
• IOPS: 100 IOPS x 15 VM. This is on the high side.

Isolated VM
• Capacity: 200 GB. IOPS: 200.

Total: ~6.1 TB, ~6000 IOPS.

Page 38: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Reasons for FC (partial list)
• A network issue does not create a storage issue.
• Troubleshooting storage does not mean troubleshooting the network too.
• 8 Gb vs 1 Gb (16 Gb vs 2 Gb in redundant mode).
  • 10 GE is still expensive, and the uplinks need to change too.
  • HP or Cisco blades may provide a good alternative here.
  • Consider the total TCO and not just the cost per box.
• FC vs IP
  • The FC protocol is more efficient & scalable than the IP protocol for storage.
  • Path failover is <30 seconds, compared with <60 seconds for iSCSI.
• Lower CPU cost
  • See the chart: FC has the lowest CPU hit to process an IO, followed by hardware iSCSI.
• Storage vMotion
  • You can estimate the time taken to move 100 GB over a 1 Gb path…
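A rough worked estimate (assuming sustained line rate, which real workloads rarely achieve): 1 Gb/s is at most 125 MB/s, so 100 GB (102,400 MB) takes at least 102,400 / 125, i.e. around 820 seconds, or roughly 14 minutes; over 8 Gb FC the same move is closer to 2 minutes.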

[Chart: relative CPU cost per I/O for NFS, software iSCSI, hardware iSCSI and FC, on ESX 3.5 vs ESX 4.0. FC has the lowest CPU cost per I/O, followed by hardware iSCSI.]

Page 39: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Network Design

Page 40: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Methodology
• Define how many VLANs you need.
• Decide if you will use 10 GE or 1 GE.
  • If you use 10 GE, define how you will use Network IO Control.
• Decide if you use IP storage or FC storage.
• Decide which vSwitch to use: local, distributed, or Nexus (a port-group sketch follows).
• Decide when to use Load-Based Teaming.
• Select blade or rack mount.
  • This has an impact on NIC ports and switches.
• Define the detailed design with the vendor.
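At the host level, the VLAN plan turns into port groups. A standard-vSwitch sketch in PowerCLI (host name, NICs and VLAN IDs are placeholders):

```powershell
# Create a standard vSwitch with two uplinks and VLAN-tagged port groups
$vmhost = Get-VMHost "esx01.example.com"
$vsw = New-VirtualSwitch -VMHost $vmhost -Name "vSwitch1" -Nic vmnic2, vmnic3
New-VirtualPortGroup -VirtualSwitch $vsw -Name "Prod-VM" -VLanId 100
New-VirtualPortGroup -VirtualSwitch $vsw -Name "vMotion" -VLanId 200
```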

Page 41: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Network Architecture

Page 42: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


ESXi Network configuration

Page 43: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Security Design

Page 44: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Separation of Duties with vSphere

VMware Admin >< AD Admin
• The AD Admin has access to NTFS. This can be too powerful if it holds confidential data.

Segregate the virtual world.
• Split vSphere access into 3: Storage, Server, Network.
• Give Network to the Network team.
• Give Storage to the Storage team.
• A role with all access to vSphere should be rarely used.
• VM owners can be given some access that they don't have in the physical world. They will like the empowerment (self-service).
A role-creation sketch follows the diagram below.

[Diagram: the enterprise IT space (MS AD Admin, Storage Admin, Network Admin, DBA, Apps Admin) alongside the vSphere space (VMware Admin, Server Admin, Storage Admin, Operators, VM Owners).]
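A minimal PowerCLI sketch of this segregation (the role name, privilege subset and AD group are illustrative; verify the exact privilege IDs in your vCenter before relying on them):

```powershell
# Build a limited role from network privileges and grant it to the network team
$priv = Get-VIPrivilege -Id "Network.Assign", "Network.Config"   # assumed privilege IDs
New-VIRole -Name "NetworkTeam" -Privilege $priv
New-VIPermission -Entity (Get-Folder "Networking") `
    -Principal "DOMAIN\net-admins" -Role "NetworkTeam"
```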

Page 45: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Folders: use them properly
• Do not use Resource Pools to organise VMs.
• Caveat: the Hosts/Clusters view is the only view where you can see both ESX hosts and VMs.

Study the hierarchy on the right.
• It is folders everywhere.
• Folders are the way to limit access.
• Certain objects don't have their own access control; they rely on folders. E.g. you cannot set permissions directly on a vNetwork Distributed Switch: to set permissions, create a folder on top of it.

Page 46: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Compliance
How do we track changes made in vCenter by authorised staff? vCenter does not track configuration drift.
• Tools like vCenter Ops Enterprise provide some level of configuration management, but not all.
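As a stop-gap, vCenter's own event log records who changed what. A PowerCLI sketch (the event type and sample size are illustrative):

```powershell
# Pull recent VM reconfiguration events: who, when, and what changed
Get-VIEvent -MaxSamples 1000 |
    Where-Object { $_ -is [VMware.Vim.VmReconfiguredEvent] } |
    Select-Object CreatedTime, UserName, FullFormattedMessage
```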

Page 47: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


VM Design

Page 48: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Standard VM sizing: follow McDonald's
1 VM = 1 App = 1 purpose. No bundling of services.
• Having multiple applications or services in 1 OS tends to create more problems. The Apps team knows this better.

Start with the Small size, especially for CPU & RAM.
• Use as few virtual CPUs (vCPUs) as possible.
  • More vCPUs impact the scheduler, hence performance.
  • They are hard to take back once you give them. Also, the app might be configured to match the processor (you will not know unless you ask the application team).
  • Maintaining a consistent memory view among multiple vCPUs consumes resources.
  • There is a licencing impact if you assign more CPUs. vSphere 4.1 multi-core can help (always verify with the ISV).
  • Unused virtual CPUs still consume timer interrupts and execute the idle loops of the guest OS.
  • In the physical world, CPUs tend to be oversized. Right-size them in the virtual world.
• RAM
  • RAM starts at 1 GB, not 512 MB. Patches can be large (330 MB for XP SP3) and need RAM.
  • Size impacts vMotion, ballooning, etc., so you want to trim the fat.
  • The Tier 1 Cluster should use Large Pages.
• Anything above XL needs to be discussed case by case. Utilise Hot Add to start small (needs DC edition).
• See speaker notes for more info.

Item | Small VM | Medium VM | Large VM | Custom
CPU | 1 | 2 | 3 | 4-8
RAM | 1 GB | 2 GB | 4 GB | 8, 12, 16 GB, etc.
Disk | 50 GB | 100 GB | 200 GB | 300, 400, etc. GB
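The standard sizes are easy to enforce at provisioning time. A PowerCLI sketch (values from the table above; -MemoryGB/-DiskGB are per recent PowerCLI, older releases use -MemoryMB/-DiskMB):

```powershell
# Simple lookup table for the standard VM sizes
$size = @{
    Small  = @{ Cpu = 1; MemGB = 1; DiskGB = 50  }
    Medium = @{ Cpu = 2; MemGB = 2; DiskGB = 100 }
    Large  = @{ Cpu = 3; MemGB = 4; DiskGB = 200 }
}
$s = $size['Medium']
New-VM -Name "app01" -ResourcePool (Get-Cluster "Tier3-Cluster") `
    -NumCpu $s.Cpu -MemoryGB $s.MemGB -DiskGB $s.DiskGB
```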

Page 49: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Operational Excellence

Page 50: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Ownership

Where do you draw the line between Storage and vSphere? Who owns the following:
• VMFS and RDM
• VMDK
• Storage DRS
• Who can initiate Storage vMotion?
• The virtual disk SCSI controller
• Who decides the storage-related design in vSphere?

Where do you draw the line between Network and vSphere?
• Who decides which one to use: vSwitch, vDS, Nexus?
• Who decides on the network-related design in vSphere?
• Who troubleshoots network problems in vSphere?

Where do you draw the line between Security and vSphere?

Page 51: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Management in the Virtual World

A VM is very different from a physical machine.

Page 52: Private Cloud Sample Architectures for >1000 VM Singapore, Oct 2011


Thank You