VMworld 2013: Extreme Performance Series: vCenter of the Universe


VMworld 2013. Presenters: Justin King, VMware; Ravi Soundararajan, VMware. Learn more about VMworld and register at http://www.vmworld.com/index.jspa?src=socmed-vmworld-slideshare


Extreme Performance Series:

vCenter of the Universe

Justin King, VMware

Ravi Soundararajan, VMware

VSVC5234

#VSVC5234

2

Goals

Help you understand vCenter Architecture

Help you use this knowledge to guide vCenter deployment

3

vCenter Deployment Options

One vCenter

Many vCenters

1 vCenter per site

Multiple vCenters using linked mode within a single site

Multiple vCenters using linked mode across sites

4

Agenda

Introduction

vCenter Architectural Deep Dive

Common Questions

Multiple vCenter deployment strategies

Conclusion

5

For Most of You, This Is vCenter

[Diagram: C# clients and API clients; vCenter server with vpxd and DB]

6

However, This is Approximately vCenter (We Will Dissect This…)

[Diagram: components shown are ESXi + HostD + VPXA (with storage and network); the vCenter server containing VPXD, DB, App Server, Inv Serv (Java), SSO, PBSM, Log, and vctomcat (Health, SRS); plus AD, vSphere Web Clients, VI Clients, API Clients, Update Manager, and Converter]

7

Understanding vCenter Control Flow: Web Client Login

[Diagram: vSphere Web Clients, App Server, SSO, AD, vCenter server]

1. Login
2. SSO authenticates
3. After the user is authenticated, the user has access to all providers registered with SSO (e.g., vCenter)

8

Understanding vCenter Control Flow: C# Client Login

[Diagram: VI Clients, VPXD, vCenter server, SSO, AD]

1. Login request to vCenter (vpxd service)
2. vpxd contacts SSO for authentication
3. User is able to view inventory

Note: vpxd no longer talks directly to AD

9

Understanding vCenter Control Flow: A PowerOn Operation

[Diagram: vSphere Web Clients, App Server, Inv Serv, VPXD, and DB on the vCenter server; ESXi + HostD + VPXA with storage and network]

1. PowerOn
2. To vpxd
3. DRS + admission control
4. Issue command to ESX. Report status.
5. Persist to DB
6a,b. Notify clients; persist to Inv Svc

Note: client is authenticated, so SSO is not invoked during the operation

10

Agenda for vCenter Architectural Deep Dive

vCenter to ESX interactions

vCenter server internals

Database

Clients

11

Architecture Deep Dive: vCenter to ESX Interactions

[Diagram: VPXD (vCenter service) and ESXi + HostD + VPXA, with storage and network]

3 main interactions:

1. Command traffic (depends on load)

2. Update host status (host sync)

3. Statistics (bursty)

12

vCenter-to-ESX Considerations: Latency and Throughput

• Data transferred is typically small (KBs, not MBs)
• Latency from VC to ESX has a larger impact than throughput
  • Latency example: 4x diff (100ms vs. 500ms) → 2x powerOn latency difference
  • Throughput example: 3x diff (512Kbps vs. 1.5Mbps) → 0 powerOn latency difference
• Other implications of high latency or low throughput
  • Impact on statistics
    • Slower stats collection
    • Slower real-time queries
  • Impact on browsing
    • Console slower
    • Host config slower
  • Other stuff should be the same
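Why latency matters more than throughput for small payloads can be sketched with a simple additive model. This is an illustration under stated assumptions, not VMware's measurement method: the number of round trips (10) and the payload size (20 KB) are hypothetical, and real operations are not strictly serial.

```python
def operation_time_s(round_trips, payload_kb, latency_ms, throughput_kbps):
    """Estimate wall-clock time for one management operation.

    Simple additive model (an assumption, not VMware's): every round trip
    pays the full network latency, and the payload is sent once at the
    link's throughput.
    """
    latency_s = round_trips * latency_ms / 1000.0
    transfer_s = (payload_kb * 8.0) / throughput_kbps  # KB -> kilobits
    return latency_s + transfer_s


# Hypothetical operation: 10 round trips carrying 20 KB in total.
ROUND_TRIPS = 10
PAYLOAD_KB = 20

base = operation_time_s(ROUND_TRIPS, PAYLOAD_KB, latency_ms=100, throughput_kbps=1500)
high_latency = operation_time_s(ROUND_TRIPS, PAYLOAD_KB, latency_ms=500, throughput_kbps=1500)
low_throughput = operation_time_s(ROUND_TRIPS, PAYLOAD_KB, latency_ms=100, throughput_kbps=512)

print(f"baseline:        {base:.2f} s")
print(f"higher latency:  {high_latency:.2f} s")
print(f"less throughput: {low_throughput:.2f} s")
```

With KB-sized payloads the transfer term stays small, so cutting throughput barely moves the total while raising latency multiplies it. The exact ratios depend entirely on the assumed round-trip count and will not match the measured figures on the slide.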

13

Architecture Deep Dive: vCenter Server Internals

[Diagram: vCenter server component overview, highlighting vpxd]

vpxd (core business logic)
• Sends tasks to appropriate hosts
• Retrieves config changes from hosts
• Pushes config updates to DB
• Inserts stats into DB
• Satisfies queries from clients
CPU/Memory important

14

Architecture Deep Dive: vCenter Server

[Diagram: vCenter server component overview, highlighting Inv Serv and App Server]

Inv Serv (Inventory Service)
• Cache of DB data
• Stores extension data (SRM, PBSM)
• Satisfies Web client queries
• Helps with Linked Mode search
• Contains embedded DB
IO crucial: install on different spindles from vpxd
Multi-threaded: CPU/mem important

App Server (Web Client Server)
• Satisfies web client requests
• Forwards to Inv Serv, SSO, etc.
• Spawns remote console service
1-1.5 CPUs should be enough

15

Architecture Deep Dive: vCenter Server

[Diagram: vCenter server component overview, highlighting SSO and vctomcat]

SSO (Single Sign-On)
• C/C++ plus Java-based STS (secure-token service)
• Handles authentication
• Communicates with AD, etc.

vctomcat
• Contains Health service
• Contains SRS
  • Stats reporting service for overview perf charts
  • Retrieves data from DB
• Contains EAM
  • ESX Agent Manager for manager VMs

16

Architecture Deep Dive: vCenter Server

[Diagram: vCenter server component overview, highlighting Log and PBSM]

Log (Log Browser service)
• Allows log viewing in web client

PBSM (Policy-based storage mgr)
• Contains SMS + policy engines
• Satisfies “Storage View” queries from clients
• Every 2 hrs, queries DB and Inv Serv for most up-to-date data
Can be CPU/Mem-intensive during queries

17

vCenter Server Resource Usage

[Chart: resource usage by service. Labels: vctomcat (SRS, EAM, Health, etc.); Inventory Service; Web Client App Server and remote console; PBSM; STS; Log Browser]

18

vCenter Server Performance Considerations (1 of 2)

Resource requirements

• Many new services

• Need sufficient CPU and Memory

• May need to tune JVM heap sizes according to inventory size

• Rules of thumb (Unofficial…please check documentation):
  • Small setups (< 1000 VMs): 2-4 vCPUs, 8-12GB
  • Medium setups (< 4000 VMs): 4-8 vCPUs, 12-24GB
  • Large setups (> 4000 VMs): 8-16 vCPUs, 24-32GB
• Embedded database for Inventory Service
  • IO requirements higher (2-3K IOPS depending on load)
  • Place on its own spindles (separate from other services)
  • Consider SSDs
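The unofficial sizing tiers above can be written as a small lookup. This is a minimal sketch: the function and type names are hypothetical, and because the slide does not say which tier exactly 4000 VMs falls into, the code assumes "large".

```python
from typing import NamedTuple


class Sizing(NamedTuple):
    tier: str
    vcpus: tuple      # (min, max) vCPUs
    memory_gb: tuple  # (min, max) GB of RAM


def rule_of_thumb_sizing(vm_count: int) -> Sizing:
    """Return the unofficial sizing tier for a given number of VMs."""
    if vm_count < 0:
        raise ValueError("vm_count must be non-negative")
    if vm_count < 1000:
        return Sizing("small", (2, 4), (8, 12))
    if vm_count < 4000:
        return Sizing("medium", (4, 8), (12, 24))
    # Assumption: exactly 4000 VMs is treated as "large".
    return Sizing("large", (8, 16), (24, 32))


print(rule_of_thumb_sizing(2500))
```

As the slide says, these ranges are rough starting points; the official documentation for the release in use takes precedence.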

19

vCenter Server Performance Considerations (2 of 2)

Inventory Structure

• Single datastore/datacenter/network can sometimes be vCenter bottleneck

• Several smaller clusters may be better than 1 big cluster

• Spreading hosts/networks/datastores across different datacenters relieves

some bottlenecks

20

Architecture Deep Dive: vCenter-to-Database Interactions

[Diagram: VPXD, DB, and ESXi + HostD + VPXA with storage and network]

VC talks to DB when…
1. Persisting statistics (5-minute intervals)
2. Persisting config changes (e.g., host syncs); higher when more tasks
3. Answering certain UI queries (e.g., cluster/datacenter charts, historical stats queries like past-day, past-week, etc.)
4. Persisting version information (for inv svc)

DB also performs these tasks:
• Stats rollups: VPX_HIST_STATX
  • 30 minutes, 2 hours, 1 day
• Purging stats
  • when entities deleted
• Purging events (if auto-purge configured)
• Purging tasks (if auto-purge configured)
• TopN computation
  • 10 min, 30 min, 2 hours, 1 day
• Satisfying SMS data refresh for Storage views (every 2 hours)

21

DB Performance Considerations (1 of 2)

Latency to DB important (often more so than ESX-to-VC latency)

• Almost everything involves the DB…

• Stats persistence

• Certain UI queries

• Updating configuration information

• Historical queries (events, alarms, task history)

• …

Recommendation:

Place DB and vCenter close together

Note: DB and vCenter on different hosts/VMs allows for independent

sizing and tuning

22

DB Performance Considerations (2 of 2)

DB traffic is write-mostly

• Stats inserts and rollups, version updates, config changes, purges

• Sufficient disk subsystem needed. If SSDs are an option, use them (2K IOPs)

Manage database disk growth

• Majority of DB data is “SEAT” data (Stats, events, alarms, tasks): 80-85% (10s

of GBs or more in big setups)

• Inventory data: 10-15% of data (usually < 10GB for large inventories)

• Choose stats levels wisely to avoid excessive growth

• Utilize automatic purging of event/task tables if possible

Recompute DB stats on highly-volatile tables (at least once a day)

• VPX_PROPERTY_BULLETIN

• VPX_TOPN*

23

Architecture Deep Dive: Client Interactions

• C# VI client refreshes frequently
  • Induces load on vpxd
  • More clients, more load
• Web client
  • Does not auto-refresh
  • Read requests satisfied by app server, not vpxd
  • Less load on vpxd
• API clients
  • If listening to subset of inventory/properties, small load on vCenter
• Limit of 2000 sessions to vCenter: includes all clients + remote console

App server: Can put in same geo or on same server as Inv Svc

[Diagram: vCenter server component overview with clients]

24

Client Considerations

Clients add load

• If you aren’t using a session, log out

Web Client App Server can go in same server as Inventory Service

• Small resource footprint

• Low latency to inventory service

For API clients, try to be a good citizen

• Avoid frequent/expensive DB calls

• Example: frequent createEventHistoryCollector calls with a complex EventFilterSpec

• Monitor specific inventory items or properties, not all entities and all properties

• Log out when you are done (don’t waste sessions!)

25

Client Notes: Simple Example of “Bad” Client in PowerCLI

Example of a good vs. bad client in PowerCLI

PowerCLI:
• Simple to use, but involves client-side filtering
• Example: Get-VM gets all VMs from server, filters list at client

$vmList = Get-VM -Name "vm1","vm2","vm3","vm4"

Good: 1 server call, client throws away all but vm1, vm2, vm3, vm4

$nameList = "vm1","vm2","vm3","vm4"
foreach ($name in $nameList) {
    Get-VM $name
}

Bad: 4 server calls, gets all VMs 4 times…excess client/server work

Also: Please log out when you are done!
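The cost difference between the two patterns can be illustrated without a vCenter. The sketch below is a Python simulation, not PowerCLI: `FakeInventoryServer` and both fetch functions are hypothetical, and the model simply takes the slide's description at face value (each call returns the whole VM list and filtering happens on the client).

```python
class FakeInventoryServer:
    """Hypothetical stand-in for a server that returns the full VM list per call."""

    def __init__(self, vm_names):
        self._vm_names = list(vm_names)
        self.calls = 0
        self.vms_transferred = 0

    def get_all_vms(self):
        self.calls += 1
        self.vms_transferred += len(self._vm_names)
        return list(self._vm_names)


def fetch_batched(server, wanted):
    """One server call; filter for every wanted name on the client."""
    wanted_set = set(wanted)
    return [vm for vm in server.get_all_vms() if vm in wanted_set]


def fetch_one_by_one(server, wanted):
    """One server call per name; each call pulls the whole list again."""
    found = []
    for name in wanted:
        found.extend(vm for vm in server.get_all_vms() if vm == name)
    return found


inventory = [f"vm{i}" for i in range(1, 1001)]
wanted = ["vm1", "vm2", "vm3", "vm4"]

good = FakeInventoryServer(inventory)
bad = FakeInventoryServer(inventory)
good_result = fetch_batched(good, wanted)
bad_result = fetch_one_by_one(bad, wanted)

print("batched:   ", good.calls, "call(s),", good.vms_transferred, "VMs transferred")
print("one by one:", bad.calls, "call(s),", bad.vms_transferred, "VMs transferred")
```

Both approaches return the same four VMs; the loop version pays for the full inventory once per name, so its cost grows with both the inventory size and the length of the name list.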

26

vCenter Architecture: Summary (Whew!)

[Diagram: complete vCenter architecture. ESXi + HostD + VPXA (with storage and network); the vCenter server containing VPXD, DB, App Server, Inv Serv (Java), SSO, PBSM, Log, and vctomcat (Health, SRS); plus AD, vSphere Web Clients, VI Clients, API Clients, Update Manager, and Converter]

27

Agenda

Introduction

vCenter Architectural Deep Dive

Common Questions

Multiple vCenter deployment strategies

Conclusion

28

You Say n VMs/Hosts, but I Can Only Reach N. Why?

How we set limits

Create a ‘large environment’

Attach clients, solutions, etc.

Run management operations (clones, powerOps, etc.)

Measure latency and throughput

Why your setup may not reach our scale

Different stats level

Different device configuration of hosts/VMs (e.g., # of datastores)

Different DB configuration (less memory, different recovery mode)

Different latencies from VC-to-ESX or VC-to-DB

Viewing different client pages

Accumulating events and tasks vs. purging them

Each might stress your vCenter/DB/network etc. more than ours

29

How Many Concurrent Operations Can I Perform? (1 of 2)

vCenter hard limits

• 640 concurrent operations before incoming requests are queued

• 2000 concurrent sessions (incoming requests plus remote console sessions)

Per-host or per-datastore limits

• A host can perform up to 8 provisioning operations at once

(provisioning = clone, VMotion, relocate)

• If host is source and destination, host can only do 4 operations at once

• A datastore can perform up to 128 VMotions at once

• A datastore can perform up to 8 Storage VMotions at once

• Limits can be changed, but changes are not officially supported

Other limits

• Datacenter/host/datastore synchronization at VC can limit concurrency

30

vCenter Concurrency (2 of 2)

Clone VM from host A to host B
• Cost to A: 1; cost to B: 1
• Each host can participate in 7 other provisioning operations

Clone VM from host A to host A
• Cost to A: 2
• Host A can only participate in 6 more operations

[Diagrams: vCenter cloning VM 1 on Host A to VM 2 on Host B; vCenter cloning VM 1 to VM 2 on Host A]

Do not use a single host as the source of all clones (i.e., spread out templates)
• Better disk performance and better concurrency
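The per-host cost accounting in the two clone examples can be sketched as follows. This is a minimal model, assuming only what the slides state: a limit of 8 provisioning operations per host, with a host charged once for each role (source, destination) it plays. The function names are hypothetical, and raising an error above the limit is a simplification, since the slides do not describe what vCenter does in that case.

```python
HOST_PROVISIONING_LIMIT = 8  # per-host limit quoted in the slides


def clone_costs(source, destination):
    """Cost charged to each host for one clone: 1 per role the host plays."""
    costs = {}
    for host in (source, destination):
        costs[host] = costs.get(host, 0) + 1
    return costs


def remaining_slots(clones, limit=HOST_PROVISIONING_LIMIT):
    """Return provisioning slots left on each host after the given clones.

    `clones` is a list of (source_host, destination_host) pairs. Raises
    ValueError if any host would exceed the limit (a simplification).
    """
    used = {}
    for source, destination in clones:
        for host, cost in clone_costs(source, destination).items():
            used[host] = used.get(host, 0) + cost
    for host, total in used.items():
        if total > limit:
            raise ValueError(f"{host} needs {total} slots; limit is {limit}")
    return {host: limit - total for host, total in used.items()}


print(remaining_slots([("A", "B")]))      # clone across two hosts
print(remaining_slots([("A", "A")]))      # clone within one host
print(remaining_slots([("A", "A")] * 4))  # four same-host clones
```

Four same-host clones consume all 8 slots on host A, which is consistent with the limit of 4 concurrent operations when a host is both source and destination. Spreading templates across hosts keeps any one host from being charged twice per clone.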

31

Why Should I Upgrade from VC5.0?

One big reason: In 5.1 and 5.5, stats tables are partitioned

• Stats inserts more efficient (into a small partition at a time)

• Rollups more efficient (plus, amount of data rolled up at once is throttled)

• Stats data purging more efficient (simply truncating a partition)

• vCenter can support higher stats levels for longer periods of time

• Still recommend running higher stats levels (2-4) only for temporary troubleshooting

[Diagram labels: Inserts, Rollups, Purge]

32

What Is the Real Dirt on Stats Levels?

Changing stats levels increases load on the database

Rough rules of thumb (not official VMware recommendations)

• Level 1 stats: per-VM and per-host aggregate stats
• Level 2 stats: additional per-VM/per-host stats
  • 4x or more stats than Level 1 depending on configuration
• Level 3 stats: per-instance stats
  • 6x or more stats than Level 2 depending on configuration
• Level 4 stats: additional rollup types
  • 1.4x more stats than Level 3 depending on configuration
• Use the stats calculator in vCenter
• Try to use higher stats levels only for temporary debugging
• If the stat you want is at the wrong level, let us know
• Consider VCOps for more advanced stats functionality?
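Chaining the rough multipliers above gives a feel for how quickly stats volume grows with the level. This is a worked sketch only: it reads "1.4x more stats than Level 3" as 1.4 times the Level 3 volume (an assumption), and it uses the quoted lower bounds for Levels 2 and 3, so real environments may be higher.

```python
# Multipliers relative to the previous level, taken from the rules of thumb.
# The slides give "or more" lower bounds for levels 2 and 3.
STEP_MULTIPLIER = {2: 4.0, 3: 6.0, 4: 1.4}


def relative_stats_volume(level: int) -> float:
    """Approximate stats volume at `level`, as a multiple of Level 1."""
    if level not in (1, 2, 3, 4):
        raise ValueError("stats level must be 1, 2, 3, or 4")
    volume = 1.0
    for step in range(2, level + 1):
        volume *= STEP_MULTIPLIER[step]
    return volume


for level in (1, 2, 3, 4):
    print(f"Level {level}: ~{relative_stats_volume(level):.1f}x Level 1")
```

Under these assumptions Level 3 is already about 24 times Level 1, which is why higher levels are suggested only for temporary debugging. The stats calculator in vCenter remains the better source for a specific inventory.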

33

Should I Distribute VC Services across VMs? (1 of 2)

You can distribute services (Inv Svc, SSO, vpxd, DB) to multiple

VMs, but…

• Better performance when vpxd and Inv Svc are co-located

• Better performance when Web Client service and Inventory Service are

close together

• Better performance when vpxd and DB are close together

34

Should I Distribute VC Services across VMs? (2 of 2)

Typical deployment pre-5.1

• VC and assorted services in 1 VM

• VC DB in another VM

Will still work fine with VC 5.5

Another suggestion

• Put all in 1 VM

• Make sure VM has sufficient CPU/Memory/Disk/Network

(follow best practices)

• Put Inventory Service partition on separate spindles from vpxd and DB

• Put DB partition on separate spindles

• Advantage: looks ahead to future ‘single-VM’ appliance

35

Why Are Cluster/Datacenter Charts Sometimes Slow?

These charts are computed on the fly

They require collection of data from hosts and VMs

A single slow host can hurt performance

36

Agenda

Introduction

vCenter Architectural Deep Dive

Common Questions

Multiple vCenter deployment strategies

Conclusion

37

When Should I Use Multiple vCenters?

Considerations

• Have you exceeded the single host limit?

• Do you want one vCenter per geography?

• Do you want one vCenter per organizational boundary?

(finance, engineering, etc.)

• Do you want a primary and secondary site (e.g., SRM)?

• Do you prefer to manage smaller VCs?

38

Single Site with Multiple vCenters

[Diagram: Site A with two vCenter Servers, each with ESX hosts; AD; VI Clients and API Clients]

Important considerations
• How do I decide how many vCenters I need? (Consider vCenter limits, organizational boundaries)
• Do I want a single view of inventory managed by all vCenters?
• How do I synchronize roles/permissions across vCenters?

Yes? Consider “linked mode” …

39

Linked Mode

• Single pane of glass from UI for inventory data
• Search across VC instances
• Unified roles and permissions via AD

40

Linked Mode Architecture

[Diagram: GUI, Linked Mode, and AD, with three vCenters; each vCenter has its own VC DB, ADAM, and IS, and each shows Role A]

41

Multiple vCenters in a Single Site in Linked Mode

[Diagram: Site A with three vCenter Servers, each with ESX hosts; AD; VI Clients and API Clients]

Important considerations:
• At most 10 vCenters can be linked together
• Does not work on vCenter Server Appliance (ADAM replication)
• Cross-vCenter operations not available
• API not linked-mode aware

42

Linked Mode and Single Sign-On Considerations

Linked Mode

• Should I use linked mode across multiple sites?
  • Business units that have computing needs across data centers
• What impact does bandwidth have on cross-site linked mode?
  • Except for query federation, linked mode sites only communicate via ADAM
  • Linked mode adds minimal cross-site network overhead over multi-site without linked mode
  • Bandwidth tradeoffs same as for multi-site vCenters without linked mode

Single Sign-On

• Extend the vSphere authentication domain across sites

• Use Domain accounts for permissions instead of Local OS

• Define replication partners for WAN replication

43

Agenda

Introduction

vCenter Architectural Deep Dive

Common Questions

Multiple vCenter deployment strategies

Conclusion

44

Looking Ahead (No Timelines…)

Many things, but a few main ones:

• Single VM vCenter appliance that can support increasing scale and federation

• Improved performance and scalability

• Operations across VC (like cross-VC VMotion)

45

Conclusion

Single vCenter…some key takeaways

• Services can be placed in the same VM

• IO performance is critical for vCenter and inventory service

• DB provisioning is critical

• VC-to-DB latency is important

Multiple vCenters…Why?

• Exceeding single vCenter limits

• Organizational boundaries

• Security and compliance

• Local/remote administration

Should I use linked mode?

• Single pane of glass from UI? Yes (but also possible with just Web Client…)

• Synchronized roles? Yes

46

Performance Community Resources

Performance Technology Pages

• http://www.vmware.com/technical-resources/performance/resources.html

Technical Marketing Blog

• http://blogs.vmware.com/vsphere/performance/

Performance Engineering Blog VROOM!

• http://blogs.vmware.com/performance

Performance Community Forum

• http://communities.vmware.com/community/vmtn/general/performance

Virtualizing Business Critical Applications

• http://www.vmware.com/solutions/business-critical-apps/

47

Performance Technical Resources

Performance Technical Papers

• http://www.vmware.com/resources/techresources/cat/91,96

Performance Best Practices

• http://www.youtube.com/watch?v=tHL6Vu3HoSA

• http://www.vmware.com/pdf/Perf_Best_Practices_vSphere4.0.pdf

• http://www.vmware.com/pdf/Perf_Best_Practices_vSphere4.1.pdf

• http://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.0.pdf

• http://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.1.pdf

Troubleshooting Performance Related Problems in vSphere Environments

• http://communities.vmware.com/docs/DOC-14905 (vSphere 4.1)

• http://communities.vmware.com/docs/DOC-19166 (vSphere 5)

• http://communities.vmware.com/docs/DOC-23094 (vSphere 5.x with vCOps)

48

Don’t miss:

vCenter of the Universe – Session # VSVC5234

Monster Virtual Machines – Session # VSVC4811

Network Speed Ahead – Session # VSVC5596

Storage in a Flash – Session # VSVC5603

Big Data:

Virtualized SAP HANA Performance, Scalability and Practices –

Session # VAPP5591

49

Other VMware Activities Related to This Session

HOL:

HOL-SDC-1304

vSphere Performance Optimization

Group Discussions:

VSVC1001-GD

Performance with Mark Achtemichuk

VSVC5234

THANK YOU
