
Best Practices for Deploying Citrix XenServer on HP Proliant Blade


Best practices for deploying Citrix XenApp

on XenServer on HP ProLiant servers

Technical white paper

Table of contents

Executive summary

Business case

Virtualization overview
    Benefits
    Why virtualize XenApp?

Performance testing
    Overview
    Test results

Best practices
    Unable to migrate to a 64-bit environment?
    Only virtualize suitable platforms
    Consider the cost
    Do not oversubscribe vCPUs
    Fully utilize CPU resources
    Avoid spikes in processor utilization
    Allocate sufficient memory to each VM
    Using write cache
    Monitor network performance
    Enhance availability
    Balance the distribution of VMs
    Optimize resource use
    Optimize the XenServer kernel
    Enhance manageability

Enhancing the scalability of a modern 4P server
    Bare-metal 64-bit platform
    32-bit platform

Consolidation example

Appendix A – Testing
    Test tools
    User profile
    Test scenarios
    Test topology

For more information


Executive summary

Since the first hypervisor came on the market, businesses have been challenged to consolidate

underutilized servers onto a single physical machine. While early implementations were

disappointing, resulting in large performance penalties due to virtualization overhead, there have

been significant technological advances. Today’s AMD Opteron™ and Intel® Xeon® processors

provide on-chip instructions to handle direct hardware calls from the hypervisor, minimizing the

associated overhead. As a result, the scalability of virtual machines (VMs) is much improved.

Meanwhile, businesses continue to deploy non-virtualized x86 platforms that are inherently restricted

in the number of users they can support due to limited memory addressability. The decision not to

move to a 64-bit platform is often dictated by driver and application incompatibilities that would

make migration prohibitively expensive. Virtualization offers a viable solution to this dilemma,

allowing 32-bit platforms to scale to unprecedented levels.

This paper explores a number of options for using Citrix XenServer to consolidate 32-bit workloads on

both 32- and 64-bit Microsoft® Windows® platforms, with emphasis on best practices, tuning, and

tips for virtualizing a Citrix XenApp on XenServer environment.

Target audience: This white paper provides information for IT professionals interested in virtualization.

This white paper describes testing performed in July 2008 – August 2010.

Business case

Whether you are running a small business, a remote office or a data center, the problem is the same:

you need to support multiple applications that may not co-exist well on the same server. However, no

single application is likely to overload the multi-core processors in one of today’s HP ProLiant servers;

thus, dedicating a server to a particular application would waste valuable resources. In addition, you

may need infrastructure servers to act as firewalls, Domain Name System (DNS)/Dynamic Host

Configuration Protocol (DHCP) servers, virtual desktops, desktop application servers, web servers, or

mail servers, depending on your particular environment. Are you going to dedicate a physical server

to each of these functions?

Another part of the equation may be the need to move old software and operating systems from

outdated and, possibly, failing or hard-to-repair hardware to more modern servers. Unfortunately,

your new hardware may not be able to support the older operating systems and applications, while

the alternative – updating both the hardware and applications – is expensive and increases the

potential for error. Virtualization is one way to address these issues, by better utilizing resources and,

through consolidation, reducing the number of physical servers you need.

XenApp customers are generally seeking ways to reduce their overall costs. Within a XenApp

environment, the cost of powering and cooling servers is high, constituting a significant – if not the

most significant – portion of total IT infrastructure costs. Indeed, more and more studies are indicating

that server hardware is no longer the leading data center expense; for example, the purchase price of

a new, 1U server has already been exceeded by the capital cost of the power and cooling

infrastructure needed to support it and may soon be exceeded by its lifetime energy costs. As a result,

power consumption is a key concern for enterprise customers considering the purchase of HP servers.

Many would like to reduce their overall power footprint – but without sacrificing performance. While

this goal has traditionally been impossible, recent performance characterizations performed by HP

demonstrate that a balance between performance and power consumption can be achieved.


With virtualization, you can now consolidate multiple applications on a single HP ProLiant server,

depending on the memory, CPUs, and disk space available in the host machine, and the particular

applications you wish to support. Furthermore, your VMs can be configured to run troublesome, older

operating systems.

In addition to consolidation, the benefits of hardware virtualization typically include:

- VMs are isolated and can be configured to use specific hardware resources

- VMs are easily copied and deployed, and can be moved between physical machines without service interruption

- VMs can be administered centrally

Virtualization overview

XenServer, the Citrix virtualization solution, is based on Xen, an open-source virtualization project that

supports both Intel Virtualization Technology (VT) and AMD Virtualization™ (AMD-V™) capabilities.

Xen allows a single machine to host multiple isolated environments, each running an operating system

and an application software instance.

In 2004, the Xen project released the first version of its virtualization platform (hypervisor); the

following year, the project founders formed XenSource, which subsequently introduced products such

as XenServer and XenEnterprise. In 2007, Citrix purchased XenSource, renaming these products

XenServer Standard Edition and XenServer Enterprise Edition.

The hypervisor is the layer responsible for managing partitions (domains), instantiating VMs into

domains, and scheduling and allocating resources for the domains. In some virtualized environments,

the guest is unaware that it is virtualized since the underlying hypervisor is able to emulate all system

components. I/O calls and other privileged requests are performed as normal by the guest but must

be trapped and emulated by the underlying hypervisor, thereby degrading performance. This

scenario is often referred to as full virtualization.

While first-generation hypervisors relied on emulation technology to virtualize operating systems, Xen

takes advantage of advances in operating system and CPU technologies to provide paravirtualization

and hardware-assisted virtualization, along with full 64-bit support.

With paravirtualization, the operating system is modified so that it can directly call virtualized I/O

services and other privileged operations supported by the hypervisor, eliminating the need for

resource-intensive binary translation and emulation. Drivers for storage and network interface cards

(NICs) are replaced with virtualization-aware drivers that provide a fast I/O channel through the host

domain, delivering excellent guest I/O performance.

While operating systems that are not virtualization-aware can be used with Xen, these operating

systems rely on processor extensions to assist virtualization. For a hardware-assisted domain to run on

Xen, the underlying hardware must be either Intel VT or AMD-V capable and have that feature

enabled. All current HP ProLiant servers support hardware-assisted virtualization in 32- or 64-bit

environments.


Note:

Hardware assistance for virtualization is disabled by default in HP ProLiant

servers. To enable this feature during boot, press F9 to enter the setup

mode; then select Advanced Options > Processor Options; and, lastly, select

and enable the Virtualization Technology option. You should also enable

No-Execute Memory Protection. Press F10 to save these settings and exit

the utility.

For more information, refer to the “HP ROM-Based Setup Utility User

Guide.”

If you re-flash your firmware, these settings will return to their default values.

Benefits

The benefits of virtualization include:

Enhanced server utilization

Average server utilization in the data center may be as low as 5% – 10% [1], making infrastructure

servers (such as a DNS server or Microsoft Active Directory domain controller) and other

lightly-used machines excellent candidates for consolidation.

Consolidating underutilized servers and application silos allows you to maximize the utilization of IT

resources and comply with conservation (green) initiatives.

Business continuity solution

Costly clusters of physical machines are typically used to minimize the risk of a loss of a single

server. However, virtualization allows you to provide failover and redundancy for multiple

applications on a single cluster – and the machines that make up this cluster need not be

configured identically. The provisioning of these servers is simple and flexible.

Disaster recovery solution

To help eliminate the risk of the loss of a whole location or data center, you can replicate VMs to

another site in near real-time.

Dynamic workload management

You can use VMs to support dynamic workload management, moving VMs to accommodate spikes

in demand.

Enhanced management flexibility

VMs can help increase levels of automation in the data center. Scripting and programmatic

exposure via management application programming interfaces (APIs) can enhance management

flexibility.

Why virtualize XenApp?

While the benefits of delivering applications to users through XenApp are proven, the growth in scale

and complexity of XenApp deployments has created an opportunity for you to achieve an even

greater return on your investment.

[1] The DataCenter Journal, 12 March 2009


The benefits of XenApp on XenServer include:

Reduced server/data center footprint

Server consolidation can reduce the number of physical servers required in your data center (see

Consolidation example). XenServer allows you to deploy multiple applications on the same servers

– even if these applications would be incompatible in a non-virtualized environment.

Improved failover and redundancy

In a non-virtualized environment, silos are often created to simplify application-specific redundancy

and failover, typically resulting in significant unused capacity. To retain the availability benefits of

redundancy while reducing physical server footprint, you can virtualize underutilized XenApp

servers.

Zero-downtime hardware maintenance

In a non-virtualized environment, hardware maintenance is usually associated with reduced

application availability. You must typically schedule maintenance after hours so that you can power

down servers to replace faulty or outdated hardware.

However, XenServer’s XenMotion feature allows running VMs to be migrated from one physical

server to another with no service interruption, supporting zero-downtime maintenance.

Rapid server, application, and capacity provisioning

In a non-virtualized environment, it may take hours or even days to manually increase XenApp

capacity. With XenServer, however, VMs preinstalled with XenApp can be converted into templates

and, in conjunction with a resource pool, used for rapid provisioning.

Fast, easy, portable test and demonstration environments

If you cannot justify the hardware required to create test, demonstration, and training environments,

you can use XenServer to deliver copies of production environments. As a result, you can test the

quality and impact of applications, hot fixes, and configuration changes prior to rolling them out

into production. In addition, you can create complete, portable training and demonstration

environments to introduce new services and applications throughout the organization.

Performance testing

HP has performed a number of performance characterizations designed to compare the scalability of

virtualized HP ProLiant servers deployed in 32- and 64-bit XenApp environments. To provide

baselines, bare-metal configurations were also tested.

For more information on tested configurations and the test environment, refer to Appendix A –

Testing.

Overview

HP bases the workload for tested servers on a Microsoft Office 2003-based Heavy User profile.

Heavy Users (also known as Structured Task Workers) tend to open multiple applications

simultaneously and remain active for long periods; they often leave applications open when not in

use.

To characterize scalability, HP focuses on the following criteria:

- System performance reported by Windows Performance Monitor (Perfmon)

- User response times measured using a canary script


Ideally, HP would prefer to use exactly the same tools to characterize performance in both virtualized

and non-virtualized environments. However, because of the wide range of discrete data values [2]

generated in a virtualized environment, HP prefers to employ moving averages rather than discrete

values. HP has also developed custom tools to aggregate the large amount of performance data

generated by VMs.

Despite these departures from the methodology established for bare-metal servers, HP expects the

margin of error in metrics for a virtualized server to be less than 10%.
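The smoothing described above can be sketched as a trailing moving average. This is a minimal illustration only, not HP's custom tooling: the function name, the window size, and the sample values are all hypothetical.

```python
from collections import deque


def moving_average(samples, window):
    """Return the trailing moving average of `samples` over `window` points."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    running_total = 0.0
    averages = []
    for value in samples:
        if len(recent) == window:
            # The oldest value is about to be evicted from the window.
            running_total -= recent[0]
        recent.append(value)
        running_total += value
        averages.append(running_total / len(recent))
    return averages


# Hypothetical processor utilization samples (%) that oscillate ("ring").
cpu_samples = [40, 80, 45, 75, 50, 70, 55, 65]
smoothed = moving_average(cpu_samples, window=4)
```

Once the window is full, the smoothed series varies far less than the raw samples, which makes it easier to judge when a utilization threshold has genuinely been reached.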

Note:

One option for monitoring VM performance is the use of round-robin

databases (RRDs). XenServer records persistent performance metrics in

RRDs to provide long-term access to this data and support the analysis of

historical trends. RRDs are maintained for the host server and for VMs. For

more information, refer to the HP white paper, “Analyzing Citrix XenServer

persistent performance metrics from Round Robin Database logs.”

In general, 80% processor utilization has been considered the critical performance threshold, typically

used to help specify the optimal number of users supported by a particular server configuration.

However, processor utilization does not always reach 80% during a test run. In such cases, HP

analyzes the Perfmon results to determine what has limited scalability; for example, when bare-metal

32-bit platforms are tested, scalability tends to be limited by lack of system page table entries (PTEs).

HP uses the results of the associated canary run to validate that response times were acceptable when

the optimal number of users indicated by Perfmon was active. If, however, user response times have

already become unacceptable before the 80% threshold is reached, HP accepts as optimal the

number of users supported just before response times began to degrade.
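The decision rule above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the function name, the data layout, and the 2-second response-time limit are hypothetical (the paper does not state what response time HP treats as acceptable), while the 80% processor threshold comes from the text.

```python
def optimal_users(samples, cpu_threshold=80.0, max_response_s=2.0):
    """Pick the optimal user count from a test run.

    `samples` is a list of (users, cpu_percent, canary_response_seconds)
    tuples ordered by increasing user count. `max_response_s` is an
    assumed limit for an acceptable canary response time.
    """
    last_good = None
    for users, cpu_percent, response_s in samples:
        if response_s > max_response_s:
            # Response times degraded first: accept the previous user count.
            return last_good, "response time"
        if cpu_percent >= cpu_threshold:
            # Threshold reached while response times are still acceptable.
            return users, "processor utilization"
        last_good = users
    # Neither limit was reached; some other resource bounded the run.
    return last_good, "threshold not reached"


# Hypothetical run in which processor utilization is the limiting factor.
cpu_bound_run = [(100, 20.0, 0.8), (200, 41.0, 0.9), (300, 58.0, 1.0),
                 (400, 71.0, 1.1), (500, 80.0, 1.3), (600, 93.0, 2.6)]
# Hypothetical run in which response times degrade before 80% is reached.
response_bound_run = [(100, 25.0, 0.9), (200, 48.0, 1.2), (300, 66.0, 2.4)]

cpu_bound_result = optimal_users(cpu_bound_run)
response_bound_result = optimal_users(response_bound_run)
```

When the function reports that the threshold was not reached, the Perfmon data still has to be examined by hand to find what limited scalability (system PTEs on bare-metal 32-bit platforms, for example).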

Sample test results are shown in Figure 1.

[2] Due to ringing (that is, significant oscillation of processor utilization values)


Figure 1. Sample test results showing that, for this 64-bit HP ProLiant BL685c G6 platform, response times were acceptable

when 500 users – the optimal number – were active

For more information on the test methodology, refer to Appendix A – Testing.


Test results

This section summarizes test results for virtualized 32- and 64-bit platforms.

Table 1. Optimal numbers of users supported by a range of virtualized HP ProLiant servers

                   Number    64-bit platforms                    32-bit platforms
Server model       of cores  Configuration [3]  Users  Overhead  Configuration  Users  Consolidation factor
DL380 G7           24*       -                  -      -         6/4            680    5.0
DL585 G5           16        4/4                242    16%       -              -      -
DL785 G5           32        8/4                430    3%        -              -      -
BL460c G6          16*       4/4                340    16%       4/4            401    2.7
BL465c G5          8         4/2                139    10%       -              -      -
BL465c G6          12        6/2                360    0%        6/2            378    2.1
BL465c G6 (i)      12        6/2                303    0%        -              -      -
BL465c G7          24        -                  -      -         6/4            645    4.6
BL680c G5          24        6/4                291    25%       6/4            483    3.5
BL685c G6 (ii)     16        4/4                404    2%        -              -      -
BL685c G6 (iii)    24        6/4                500    0%        -              -      -
BL685c G7 (i)      32        -                  -      -         8/4            731    (iv)

(i) Low-power processors

(ii) Four-core processors

(iii) Six-core processors

(iv) HP was unable to perform the bare-metal testing required to obtain a consolidation factor because the Enterprise Edition

of Windows Server 2003 (deployed on the tested server) does not support 32 processor threads.

* Intel HT Technology enabled

Important:

When Intel Hyper-Threading Technology (Intel HT Technology) is enabled,

the number of processor cores seen by the operating system doubles.

[3] Virtualized configuration, formatted as x/y, where x denotes the number of VMs and y denotes the number of virtual CPUs (vCPUs) allocated to

each VM. Note that, in some cases, the configuration is expressed as x/y/z, where z denotes the amount of memory (in GB) allocated to each

VM.


Best practices

To take best advantage of the benefits delivered by virtualization, you need to understand your

XenApp environment. To size cost-effective host servers when virtualizing such an environment, you

must consider the particular applications, along with the numbers of users and the specific user

profiles you wish to support.

Be aware that VM performance can vary depending on the application, the guest operating system,

and other factors. Thus, one of the biggest challenges when planning a virtualized environment is

how to address the variables that can impact host server sizing and performance, such as:

- How many VMs should be deployed on a single host?

- How many virtual CPUs should be allocated to each VM?

- How much memory is required on each VM to help eliminate memory and I/O bottlenecks?

- Would a storage area network (SAN) be a better choice than internal storage?

- Are there enough network interface cards (NICs)?

To compound this level of complexity, the XenApp environment presents unique challenges due to the

vast number of processes running simultaneously and the large memory dependencies of many of the

applications deployed. As a result, you must refine your sizing process when virtualizing such an

environment so that the host server you select can deliver the appropriate resources (processor,

memory, I/O, and network) and scalability.

Before you start planning for virtualization, however, you should first determine if your application is

a good candidate. For example, underutilized XenApp servers, XenApp data store servers, and

servers running infrastructure services may be suitable for virtualization. Conversely, XenApp servers

running resource-intensive applications or highly-utilized infrastructure servers may not be such good

candidates.

This section provides guidelines for optimizing the scalability of a virtualized HP ProLiant server.

Note:

In general terms, your virtualized server configuration (6/4/8, for example)

is considered optimal if scalability is degraded when you increase the

number of VMs, change the number of vCPUs per VM, or reduce the

amount of memory per VM.

However, it is important to point out that your environment is unique; testing is a critical part of

maximizing server scalability and consolidation ratio.
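The x/y and x/y/z configuration notation used in this paper (6/4/8, for example) can be unpacked programmatically when comparing candidate configurations. The sketch below is illustrative only; the class and function names are hypothetical and are not part of any HP or Citrix tool.

```python
from typing import NamedTuple, Optional


class VmConfig(NamedTuple):
    """A virtualized configuration written as x/y or x/y/z."""
    vms: int                          # x: number of VMs
    vcpus_per_vm: int                 # y: vCPUs allocated to each VM
    memory_gb_per_vm: Optional[int]   # z: memory (GB) per VM, if given

    @property
    def total_vcpus(self) -> int:
        return self.vms * self.vcpus_per_vm

    @property
    def total_memory_gb(self) -> Optional[int]:
        if self.memory_gb_per_vm is None:
            return None
        return self.vms * self.memory_gb_per_vm


def parse_config(text: str) -> VmConfig:
    """Parse a configuration string such as '6/4' or '6/4/8'."""
    parts = [int(part) for part in text.strip().split("/")]
    if len(parts) == 2:
        return VmConfig(parts[0], parts[1], None)
    if len(parts) == 3:
        return VmConfig(parts[0], parts[1], parts[2])
    raise ValueError(f"expected x/y or x/y/z, got {text!r}")


config = parse_config("6/4/8")
```

For 6/4/8, this gives six VMs with a total of 24 vCPUs and 48 GB of guest memory, figures that can then be checked against the cores and memory available in the host.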

Unable to migrate to a 64-bit environment?

The ideal solution for a memory-constrained 16- or 32-bit application is migration to a 64-bit

environment, where the amount of addressable memory is no longer an issue; indeed, it is not

uncommon for the latest server products to support 512 GB of memory or more. While a 64-bit

operating system can fully utilize this 512 GB, the best case for a 32-bit operating system is support

for 128 GB [4].

[4] Accessing more than 4 GB of memory requires Physical Address Extension (PAE) support. For more information, refer to

http://www.microsoft.com/whdc/system/platform/server/PAE/PAEdrv.mspx.


Note that virtualizing a 64-bit platform is associated with resource overhead that decreases the

overall user density compared with a bare-metal implementation. However, this overhead is typically

acceptable given the range of benefits delivered by virtualization (such as server consolidation,

energy-efficiency, enhanced disaster recovery capabilities, and easier system maintenance).

There are a number of reasons that may make it impractical or uneconomical to migrate to a 64-bit

environment, including the following:

- If you are supporting a 16-bit application that cannot be ported to a 32- or 64-bit deployment, you have no choice but to run the existing application in a 32-bit environment (whether virtualized or not).

- If a device driver or application is incompatible with the 64-bit environment, you again have no choice but to deploy your environment on a 32-bit edition of Windows.

It is possible to obtain excellent user densities by virtualizing your 32-bit platforms – and without

making any changes. The scalability of older platforms is often restricted to 150 users or fewer;

however, modern HP ProLiant servers can deliver consolidation factors as high as 5.0. Thus,

virtualization may allow you to replace five legacy servers with a single, virtualized server such as an

HP ProLiant DL380 G7.
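The consolidation arithmetic can be worked through as follows. This sketch assumes that the consolidation factor is the ratio of users supported by the virtualized server to users supported by one legacy server; that interpretation, the function name, and the 2,000-user population are assumptions made for illustration. The 680-user and 5.0 figures are the DL380 G7 values from Table 1.

```python
import math


def servers_needed(total_users: int, users_per_server: float) -> int:
    """Return how many servers are needed to host `total_users`."""
    if users_per_server <= 0:
        raise ValueError("users_per_server must be positive")
    return math.ceil(total_users / users_per_server)


# Table 1 figures for the DL380 G7 (32-bit platform).
virtualized_users = 680
consolidation_factor = 5.0

# Assumption: consolidation factor = virtualized users / legacy users.
legacy_users = virtualized_users / consolidation_factor

# Hypothetical user population to be hosted.
total_users = 2000
legacy_servers = servers_needed(total_users, legacy_users)
virtualized_servers = servers_needed(total_users, virtualized_users)
```

Under these assumptions, each legacy server supports 136 users, so hosting 2,000 users takes 15 legacy servers but only 3 virtualized servers.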

Note:

Virtualization adds complexity [5] to a deployment. Despite this, if you are

unable to migrate to a 64-bit environment, virtualization may be an

appropriate choice.

Only virtualize suitable platforms

Many options are available to you when identifying good candidates for virtualization. In general,

opportunities exist for the dramatic consolidation of any under-utilized legacy servers, whatever the

makes and models, whether 32- or 64-bit platforms.

Note:

HP offers tools to help you migrate from third-party servers. For example,

HP Insight Server Migration software for ProLiant supports physical-to-

ProLiant application migrations.

In many cases, 32-bit platforms make the best candidates for consolidation. Businesses have long

been striving to extract every last ounce of productivity from their legacy servers and applications.

Indeed, many are now unable to move to 64-bit platforms due to driver incompatibilities and/or

porting issues with custom applications.

With XenServer, legacy servers can be efficiently migrated and hosted on HP’s latest server families

without sacrificing performance; indeed, thanks to generational improvements in processor, memory, and

I/O capabilities, performance may even improve. Given the well-documented memory limitations

of the non-virtualized 32-bit platform and its inability to scale, virtualization can deliver dramatic

improvements in scalability – as much as 400%, as described in the HP white paper, “Consolidation

of x86 HP Server Based Computing environment with Citrix XenServer on HP ProLiant BL680c G5.”

Although the 64-bit platform eliminates the drastic memory limitations that plague 32-bit environments,

the requirement for emulation means that executing 32-bit workloads in a 64-bit environment places a

limit on scalability.

5 A significant learning curve may be required.


Note:

Since most of the emulation is handled at the chip level, the associated

overhead is significantly less than that associated with 16-bit Windows on

Windows (WoW) emulation.

Indeed, it is possible to get better performance when virtualizing a 32-bit workload on a 32-bit

platform than when virtualizing the same workload on an equivalently configured 64-bit platform.

Contrast the results HP obtained when running a 32-bit workload on an HP ProLiant BL680c G5

server blade: in a virtualized 32-bit environment, 483 users were supported; in a virtualized 64-bit

environment, 291.

Even though there may be significant overhead when virtualizing a 32-bit workload on a 64-bit

platform6, significant benefits can be achieved by consolidating older servers on to newer machines.

In addition, reducing the number of physical machines can reduce costs associated with power,

cooling, data center real estate, and licensing. In short, the savings start to add up.

Consider the cost

When planning your virtualized environment, consider the costs involved.

As shown in the Test results, HP tested a broad range of configurations, demonstrating that

virtualization overhead can vary significantly (0% – 25% for optimal configurations) in the 64-bit

environment. While the level of overhead may be significant, careful tuning can optimize scalability,

thus maximizing the return on your investment.

Note:

In a 32-bit environment, modern HP ProLiant servers support more users

when virtualized compared to a bare-metal configuration. Thus, there is

effectively no virtualization overhead.

Remember, however, that your operating system licensing costs are directly related to the number of

VMs deployed. In practice, it may be more beneficial to minimize licensing costs than deploy the

optimal number of VMs. For example, testing performed on an HP ProLiant DL785 G5 server

demonstrated that an 8/4 configuration was able to support 430 users, while a 4/8 configuration

could support 420. It is hard to imagine how support for 10 more users could justify the cost of the

four additional OS licenses that would be required.
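
The trade-off above can be made concrete with a little arithmetic. The following Python sketch uses only the two HP ProLiant DL785 G5 results quoted above; it assumes that the V/C notation means number of VMs / vCPUs per VM and that each VM requires one operating system license. The function and variable names are illustrative.

```python
# Illustrative only: user counts are the DL785 G5 results quoted in the text.
# Assumption: one operating system license is needed per VM.
configs = {
    "8/4": {"vms": 8, "users": 430},
    "4/8": {"vms": 4, "users": 420},
}


def users_per_license(config):
    """Users supported per OS license (one license per VM assumed)."""
    return config["users"] / config["vms"]


extra_licenses = configs["8/4"]["vms"] - configs["4/8"]["vms"]
extra_users = configs["8/4"]["users"] - configs["4/8"]["users"]

for name, cfg in configs.items():
    print(f"{name}: {users_per_license(cfg):.2f} users per license")
print(f"{extra_licenses} extra licenses buy {extra_users} extra users")
```

Expressing the result as users per license makes it easier to compare candidate configurations once actual license prices are known.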

Do not oversubscribe vCPUs

It might seem likely that the more virtual CPUs (vCPUs) you allocate to a particular VM, the more

users the VM will be able to support. However, HP has found that oversubscribing vCPUs (that is,

allocating more vCPUs than there are processor cores) tends to degrade server scalability because

processor resources must now be shared between VMs.

Fully utilize CPU resources

Typically, a CPU core can only run a single thread at any one time. However, Intel HT Technology

allows a core to support two threads; indeed, a quad-core processor with Intel HT Technology

enabled is recognized by the operating system as an eight-core processor (see Enabling Intel HT

Technology).

6 The Test results section provides examples of this overhead.


Each core is capable of supporting a virtual CPU (vCPU).

While it is a best practice not to oversubscribe vCPUs, you should utilize all available processor cores

to help maximize performance. Thus, if you are virtualizing an HP ProLiant server featuring two quad-

core AMD Opteron processors, you should deploy eight vCPUs; however, if you are virtualizing a

server that features two quad-core Intel Xeon processors, you can deploy 16 vCPUs when Intel HT

Technology is enabled.
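
The two examples above follow from a simple rule: one vCPU per core, or per hardware thread when Intel HT Technology is enabled. The Python sketch below captures that rule together with the oversubscription check described earlier; the function names are hypothetical and are not part of any XenServer tool.

```python
def vcpu_budget(sockets, cores_per_socket, hyperthreading=False):
    """Number of vCPUs that can be deployed without oversubscription."""
    threads_per_core = 2 if hyperthreading else 1
    return sockets * cores_per_socket * threads_per_core


def is_oversubscribed(vms, vcpus_per_vm, budget):
    """True if the VMs together request more vCPUs than the budget allows."""
    return vms * vcpus_per_vm > budget


# Two quad-core AMD Opteron processors (no Intel HT Technology)
print(vcpu_budget(2, 4))
# Two quad-core Intel Xeon processors with Intel HT Technology enabled
print(vcpu_budget(2, 4, hyperthreading=True))
```

A planned layout can then be checked with, for example, `is_oversubscribed(4, 4, vcpu_budget(2, 4))`.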

Creating balance

One of the keys to optimizing server scalability is achieving a balance between the number of VMs

you configure in a particular server and the number of vCPUs allocated to each VM. For example,

consider the following virtualized HP ProLiant DL585 G5 server configurations tested by HP:

8/2: 184 users

4/4: 242 users (that is, 30% more)

Thus, reducing the number of VMs and doubling the number of vCPUs per VM increased the

scalability of this particular server by 30%.

HP offers the following guidelines to provide a starting point when you are configuring VMs:

8 cores: 4/2

16 cores: 4/4

24 cores: 6/4

32 cores: 8/4

HP strongly recommends carrying out performance tests to determine the ideal configuration for your

particular environment.
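
For planning scripts, the starting points listed above can be held in a small lookup table. This is a minimal Python sketch, assuming that each V/C pair means number of VMs / vCPUs per VM; the names are illustrative, and core counts not listed in the guidelines are rejected rather than guessed.

```python
# HP's starting-point guidelines: core count -> (VMs, vCPUs per VM)
STARTING_POINTS = {8: (4, 2), 16: (4, 4), 24: (6, 4), 32: (8, 4)}


def starting_configuration(cores):
    """Return the suggested (VMs, vCPUs per VM) pair for a given core count."""
    try:
        return STARTING_POINTS[cores]
    except KeyError:
        raise ValueError(f"no guideline listed for {cores} cores") from None


for cores in sorted(STARTING_POINTS):
    vms, vcpus = starting_configuration(cores)
    print(f"{cores} cores: {vms} VMs x {vcpus} vCPUs = {vms * vcpus} vCPUs")
```

Each suggested configuration allocates exactly as many vCPUs as there are cores, which is consistent with the guidance to use all cores without oversubscribing them.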

Enabling Intel HT Technology

If you are using a later-generation7 Intel Xeon-powered HP ProLiant server, you can enable Intel HT

Technology to double the number of processor cores available to VMs.

Note:

Due to the associated overhead, you should not enable earlier

implementations of Intel HT Technology.

HP demonstrated the benefits of enabling Intel HT Technology on a 2P/12C8 HP ProLiant DL380 G7

server blade9. A 6/2/6 configuration was able to support 500 users in a XenApp environment, as

shown in Figure 2.

7 G6 or later

8 Signifying support for two processors (P) and a total of 12 cores (C)

9 For more information, refer to the HP white papers, “Performance of HP ProLiant DL380 G7 with Intel Xeon Processor X5680 (3.33 GHz) in 32- and 64-bit HP SBC environments” and “Performance of HP ProLiant DL380 G7 with Intel Xeon Processor X5680 (3.3 GHz) in a 32-bit virtualized HP SBC environment.”


Figure 2. Configured to use 12 cores, this virtualized server was able to support 500 users

Using Intel HT Technology effectively increased the number of available cores from 12 to 24. Taking

full advantage of these additional resources, HP doubled the number of vCPUs allocated to each VM.

Note:

This is not considered to be over-subscription.

As shown in Figure 3, the resulting 6/4/6 configuration was able to support 680 users, an increase

of 36%.


Figure 3. Doubling the number of vCPUs (from 12 to 24) fully utilized the CPU resources of this virtualized server and increased

the number of users supported by 36%

Avoid spikes in processor utilization

To avoid spikes in processor utilization, ensure your VMs are online before applying the workload.

Do not simultaneously add large numbers of users; if possible, balance the workload across your

VMs.


Allocate sufficient memory to each VM

Allocating sufficient memory to each VM is also important. For example, although test systems are

often configured with less, HP recommends configuring at least 8 GB for each production VM

(whether running 32- or 64-bit applications). This allocation should accommodate a typical workload

and provide some space for growth. Allocating 8 GB to 32-bit VMs also means that you will not have

to physically upgrade your host servers when you migrate to a 64-bit platform, which may impose an

additional memory overhead.

However, if you are running an operating system that cannot support 8 GB (such as the 32-bit version

of Windows Server 2003 Standard Edition, which can only support 4 GB), you should allocate less

memory to each VM.

Important:

To maximize scalability, HP does not recommend deploying page files

within VMs. However, in a test environment, HP deploys page files within

VMs to provide consistency with bare-metal server configurations.

Note that there are some benefits to deploying page files within VMs. If a

VM were to trap or BSOD10, for example, you would be unable to obtain a

dump file for analysis purposes unless there is a local page file.

Although XenServer does not currently allow you to over-subscribe memory, HP does not believe this

capability would add value in a XenApp environment. You merely need to ensure your total VM

memory allocation does not exceed the size of physical memory; individual VM allocations must not

exceed the limit supported by the guest operating system. Remember to reserve approximately 1 GB

for the hypervisor.
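
The memory rules in this paragraph can be checked before VMs are created. The following Python sketch is a hypothetical helper, not a XenServer API: it verifies that the total allocation fits in physical memory after reserving approximately 1 GB for the hypervisor, and that no VM exceeds the limit of its guest operating system.

```python
def validate_memory_plan(physical_gb, vm_allocations_gb, guest_limit_gb,
                         hypervisor_reserve_gb=1):
    """Return a list of problems found in a planned memory layout."""
    problems = []
    available_gb = physical_gb - hypervisor_reserve_gb
    total_gb = sum(vm_allocations_gb)
    if total_gb > available_gb:
        problems.append(
            f"total allocation {total_gb} GB exceeds {available_gb} GB available"
        )
    for index, allocation_gb in enumerate(vm_allocations_gb):
        if allocation_gb > guest_limit_gb:
            problems.append(
                f"VM {index}: {allocation_gb} GB exceeds guest limit "
                f"of {guest_limit_gb} GB"
            )
    return problems


# Six VMs of 8 GB each on a 64 GB host whose guests support 8 GB
print(validate_memory_plan(64, [8] * 6, guest_limit_gb=8))
# The same plan with guests limited to 4 GB
print(validate_memory_plan(64, [8] * 6, guest_limit_gb=4))
```

An empty list means the plan satisfies both rules; the host size and VM counts used here are example values only.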

However, allocating insufficient memory resources can lead to a significant performance penalty.

Example of insufficient memory resources

To determine the impact of allocating insufficient memory to VMs, HP tested an HP ProLiant BL465c

G7 server blade11 in a 32-bit XenApp environment when configured as follows:

6/4/4

6/4/6

Figure 4 shows the scalability of the 6/4/4 configuration.

10 A reference to the so-called blue screen of death

11 For more information, refer to the HP white paper, “Performance of HP ProLiant BL465c G7 with AMD Opteron processor Model 6174 (2.2 GHz) in a 32-bit virtualized HP SBC environment.”


Figure 4. Scalability of a 6/4/4 configuration

Lack of available memory resources in the 6/4/4 configuration caused the number of stopped

sessions to increase exponentially when 445 Heavy Users were active.

Processor utilization never reached 80%, the criterion typically used to characterize server scalability.

For this particular server configuration, HP determined that the optimal number of users was 435 (445

active sessions less 10 stopped sessions), limited by lack of memory rather than processor resources.

Figure 5 shows what happened to an individual VM when it ran low on memory resources.


Figure 5. When this VM ran low on memory resources, disk idle time decreased exponentially, limiting the number of users that

could be supported (Note that stopped sessions are not included)

When memory size was increased from 4 to 6 GB, scalability increased significantly, as shown in

Figure 6.


Figure 6. Scalability of a 6/4/6 configuration

Average processor utilization reached 80% when 650 Heavy Users were active; the number of

stopped sessions began to increase exponentially when 630 – 680 Heavy Users were active.

Thus, HP determined that the optimal number of users was 645 (650 users less five stopped sessions).

Increasing the memory allocated to each VM from 4 GB to 6 GB allowed processor resources to be

fully utilized, resulting in a 48% increase in scalability, as shown in Figure 7.
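
The figures for the two configurations can be reproduced with the short Python sketch below. It applies the rule used in the text (optimal users equals active sessions less stopped sessions) to the reported session counts; the function names are illustrative.

```python
def optimal_users(active_sessions, stopped_sessions):
    """Optimal user count: active sessions less stopped sessions."""
    return active_sessions - stopped_sessions


def percent_increase(before, after):
    """Relative increase from before to after, as a percentage."""
    return (after - before) / before * 100


users_4gb = optimal_users(445, 10)  # 6/4/4 configuration
users_6gb = optimal_users(650, 5)   # 6/4/6 configuration

print(users_4gb, users_6gb)
print(f"{percent_increase(users_4gb, users_6gb):.0f}% increase")
```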


Figure 7. Memory allocation comparison

You should always monitor I/O performance in a production environment to determine if there are

potential bottlenecks. Take particular care in the following use cases:

Applications that tax I/O subsystems do not scale as well in a virtualized environment as in a bare-

metal configuration.

If you have virtualized the existing workload on a modern 32-bit platform, be aware that the server

may be supporting significantly more users than before; potential I/O bottlenecks may now be

exposed.

Avoiding disk I/O bottlenecks

To help you avoid disk I/O bottlenecks, Microsoft recommends using the Windows performance

monitoring tool, Perfmon, to check the following metrics12:

%Idle time – Idle times for logical and physical drives should average at least 50%

Average Disk Seconds/Read and Average Disk Seconds/Write – The average time taken to

complete a read or write should average less than 25 milliseconds, with peak times of less than 50

milliseconds

If the above conditions specified by Microsoft cannot be met, a disk I/O bottleneck is likely.
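
The thresholds above lend themselves to an automated check on collected counter samples. The Python sketch below is a minimal illustration: it assumes idle time has been sampled as a percentage and that read or write times have already been converted to milliseconds. The function name and the sample values are hypothetical.

```python
def disk_bottleneck_likely(idle_pct_samples, latency_ms_samples):
    """Apply the thresholds quoted in the text to a set of samples.

    The conditions are met when average idle time is at least 50%,
    average read/write time is under 25 ms, and peak time is under 50 ms.
    """
    avg_idle = sum(idle_pct_samples) / len(idle_pct_samples)
    avg_latency = sum(latency_ms_samples) / len(latency_ms_samples)
    peak_latency = max(latency_ms_samples)
    conditions_met = (
        avg_idle >= 50 and avg_latency < 25 and peak_latency < 50
    )
    return not conditions_met


# Hypothetical samples: a healthy disk, then a disk with low idle time
print(disk_bottleneck_likely([80, 70, 60], [5, 10, 12]))
print(disk_bottleneck_likely([40, 30, 45], [5, 10, 12]))
```

The same function can be applied separately to read and write samples, and to each logical or physical drive of interest.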

Note:

In the event of an I/O bottleneck, you should tune the disk subsystem,

decrease the number of users or applications, or add memory to the server.

Using write cache

HP Smart Array controllers include a data cache, memory that can be utilized to temporarily cache

data being written to or read from disk. Since access to this memory is significantly faster than disk

access, the cache can enhance overall server performance, particularly during login operations.

Write cache is of particular interest in a XenApp environment. After buffering all the data associated

with a particular write command, the Smart Array controller indicates to the XenApp server that the

data transfer to the disk is complete – even though the data is still being written to disk. This frees up

the server’s processor to perform other tasks and accelerates data throughput.

12 For more information, visit the Microsoft website.


While HP has not yet characterized flash-backed write cache (FBWC) performance in the XenApp

environment, testing performed on battery-backed write cache (BBWC) demonstrates that

performance enhancements due to write cache may be most significant when the XenApp server is

carrying out log-intensive operations and/or when significant page file write operations are

necessary, such as during user logins. Performance gains have ranged from 50% to 250%13; actual

results would vary depending on the application(s) involved and your particular XenApp environment.

Monitor network performance

Historically, XenApp workloads have more issues with network latency than with raw bandwidth due

to efficiencies associated with Citrix Independent Computing Architecture (ICA).

Now that a single physical server is being used to host workloads that, prior to virtualization, were

run on multiple servers, be sure to monitor the host for any network bottlenecks that may have been

introduced.

There are a number of ways in which you can enhance network performance, such as deploying

additional network ports, implementing network interface card (NIC) teaming, or using HP Virtual

Connect technology.

Enhance availability

VMs are flexible, allowing you to readily implement the level of availability you need. Moreover, you

can enhance availability by utilizing a SAN created from HP StorageWorks product offerings, with

capabilities that may include:

Multiple paths for redundancy

Automatic path failover

High-availability cluster support

Example

Consider an environment in which a fully-configured HP ProLiant BL460c G6 server blade is hosting

eight VMs. In the event of a server failure, all eight VMs would fail.

An alternative would be to deploy two BL460c G6 server blades, each hosting four VMs. Now the

loss of a server would only impact four VMs; if desired, you could manually import the downed VMs

to the surviving server and restart them.

If, however, the two BL460c G6 server blades are in a pooled configuration with shared storage and

are utilizing the XenServer High Availability (HA) feature, the downed VMs could be automatically

restarted on the surviving server. Moreover, having shared storage allows running VMs to be moved

between hosts using the XenServer live migration (XenMotion) feature.

XenMotion helps to eliminate VM downtime, freeing up the server administrator to perform repairs or

upgrades to the original host.

For a business-critical environment, you might consider adding a third BL460c G6 server blade to the

pool, allowing each host to support two to three VMs. Using the live migration and workload

balancing capabilities of XenServer, VMs can automatically be moved between hosts to achieve the

best balance and optimize resource utilization on each host. In this configuration, one server can be

taken offline with little effect on overall pool performance.
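
The effect of adding hosts in this example can be expressed as a small calculation. The Python sketch below spreads a fixed number of VMs as evenly as possible across hosts and reports how many VMs are affected by the loss of the most heavily loaded host; it is a planning aid with illustrative names and does not model XenServer HA itself.

```python
def vms_per_host(total_vms, hosts):
    """Spread VMs as evenly as possible across the given number of hosts."""
    base, extra = divmod(total_vms, hosts)
    return [base + 1 if index < extra else base for index in range(hosts)]


def worst_case_impact(total_vms, hosts):
    """VMs affected if the most heavily loaded host fails."""
    return max(vms_per_host(total_vms, hosts))


for hosts in (1, 2, 3):
    layout = vms_per_host(8, hosts)
    print(f"{hosts} host(s): {layout}, worst case {worst_case_impact(8, hosts)} VMs")
```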

13 For more information, visit the HP website.


Balance the distribution of VMs

Internal bottlenecks created by poor VM tuning can burden the host system, particularly if multiple

identical VMs are deployed on the same host server. For example, if all the VMs on a host are

memory-constrained, a tremendous burden is placed on the server’s disk I/O system. To avoid this

scenario, ensure VMs are properly configured and balanced within your environment. Avoid memory

swapping at all costs.

Optimize resource use

In addition to suitable sizing, optimal VM performance requires XenServer and guest operating

systems to be appropriately configured. Do not overlook the execution of screensavers or other

resource-intensive applications; carefully scrutinize your VMs to save precious resources for users and

applications. This requirement, which should be well-known in conventional XenApp environments,

becomes even more important after virtualization.

Optimize the XenServer kernel

HP typically makes no changes to the XenServer kernel to optimize performance in the XenApp test

environment. However, if your workload is memory-intensive, you may need to increase the amount of

memory allocated to domain zero (Dom0) if you experience scalability or reliability issues.

Thus, if necessary, you can increase the amount of RAM allocated to Dom0 in the XenServer

/boot/extlinux.conf file to accommodate additional users.

For more information on Dom0, refer to http://wiki.xensource.com/xenwiki/Dom0.

Enhance manageability

Consider the following caveats concerning VM management:

While many management tools perceive VMs to be, in effect, the same as physical machines,

remember that you will also need to manage the virtualization layer. In order to minimize the

number of tools required to manage your environment, a single, integrated platform for physical

and virtual machines is recommended (such as HP ProLiant servers running XenServer and

XenApp).

Consider enabling any onboard hardware management and notification capabilities so that you

can receive pre-failure alerts, allowing you to migrate VMs to another physical host prior to failure.

Since it may be difficult to monitor application rather than server performance in a virtualized

environment, resist the temptation to blindly propagate VMs in response to performance issues. You

may be trading your silos of physical machines for silos of VMs.

Since virtualization makes it so easy to replicate services, you may find that, without even

increasing the number of physical machines, you are now managing a large number of new

servers. These additional instances translate to more patches, more managing and monitoring.

Since VMs often host seldom-used applications, they may be off for long periods of time.

Management tools may not be able to turn on these VMs to install patches, creating a potential

security risk.

The remainder of this paper describes how you can use virtualization to enhance the scalability of

today’s 4P servers, which may be limited by the capabilities of Windows Server 2003. In addition,

an example of the benefits of consolidating a legacy environment is provided.


Enhancing the scalability of a modern 4P server

HP has discovered that, in a 64-bit environment, the Enterprise Edition of Windows Server 2003 may

not be able to accommodate the number of processor cores featured in today’s 4P HP ProLiant

servers.

Note:

The limited scalability of 32-bit Windows Server 2003 platforms is well

known.

This section compares the limited scalability of an HP ProLiant BL685c G7 server blade when

deployed as a bare-metal 64-bit platform14 against the significant improvement when deployed as a

virtualized 32-bit platform15.

Bare-metal 64-bit platform

Figure 8 shows the number of Heavy Users supported by a 4P HP ProLiant BL685c G7 server blade in

a 64-bit test environment.

14 Due to its limited scalability, HP did not publish performance test results for the 64-bit platform.

15 For more information, refer to the HP white paper, “Performance of HP ProLiant BL685c G7 with AMD Opteron processors Model 6128 HE (2.0 GHz) in a 32-bit virtualized HP SBC environment.”


Figure 8. The bare-metal 64-bit platform can only provide optimal support for 208 Heavy Users

Processor utilization, the metric typically used by HP when determining the optimal number of users

supported by a particular server, barely reached 80% during this test run. However, the number of

stopped sessions began to increase exponentially when 208 users were active.

Although over 600 sessions were started during the test run, a large number of these had already

stopped when the test concluded.

Thus, HP concluded that the maximum number of users supported by a bare-metal HP ProLiant BL685c

G7 server blade in this 64-bit test environment was 208. By comparison, an HP ProLiant BL685c G6

server blade was able to support 500 users in the same environment.


That a 32-core server (G7) would support significantly fewer users than a 24-core server (G6) is

counter-intuitive. However, the disparity can be explained through HP’s use of the Enterprise Edition of

Windows Server 2003 to run the tested server.

This edition cannot support the execution of 32 threads concurrently, even under moderate loads. As

a result, you may wish to consider the following upgrade options:

Datacenter Edition of Windows Server 2003 – maximum of 32 cores

Windows Server 2008

– Web or Standard Edition – maximum of four processors

– Enterprise Edition – maximum of eight processors

– Datacenter Edition – maximum of 64 processors

Note:

As a result of the recent discovery that scalability may be limited by high

core density in a 64-bit environment, HP is actively upgrading the test

harness from Windows Server 2003/Office 2003 to Windows Server

2008/Office 2007.

Due to this software performance issue in the current test harness, HP has

not published a report of the bare-metal testing of the HP ProLiant BL685c

G7 server blade in a 64-bit test environment. However, the same

methodology was used as for the G6 model of this server.

Kernel instability

As shown in Figure 9, % Privileged Time values spiked after 390 sessions had been started, indicating

that the kernel had become unstable, unable to concurrently execute the requisite number of threads.

In turn, the length of the processor queue also spiked.


Figure 9. The kernel became unstable after 390 sessions had been started

Thus, the scalability of the HP ProLiant BL685c G7 server blade was limited when deployed as a 64-

bit platform. The following section highlights the improvement that can be achieved when this server is

deployed as a virtualized 32-bit platform.

32-bit platform

HP tested the HP ProLiant BL685c G7 server blade as a virtualized 32-bit platform and, for

comparison purposes, as a bare-metal 32-bit platform.

Bare-metal 32-bit platform

Figure 10 shows that, as expected, the scalability of the bare-metal platform was limited by lack of

system PTEs.


Figure 10. The bare-metal HP ProLiant BL685c G7 server blade was able to support 127 users as a 32-bit platform


Virtualized 32-bit platform

Figure 11 shows that the virtualized 32-bit platform was able to fully utilize the processor resources of

the HP ProLiant BL685c G7 server blade.

Figure 11. The HP ProLiant BL685c G7 server blade was able to support 731 users as a virtualized 32-bit platform

Thus, in a 32-bit environment, the HP ProLiant BL685c G7 server blade was able to support

significantly more users16 when virtualized.

Moreover, since each guest operating system only had to support four CPU cores, VMs were able to

perform without kernel limitations, unlike the bare-metal configurations, which featured 32 cores –

more than the operating system was able to support.

16 Approximately 576% of the bare-metal 32-bit figure (731 versus 127 users)


Results indicated that, due to these kernel limitations, the virtualized 32-bit platform was able to

support approximately 3.5 times as many users as the bare-metal 64-bit platform (731 versus 208). Thus, the solution for a dense 4P server

that appears to under-perform as a bare-metal 64-bit platform may be to virtualize this server and

deploy it in a 32-bit environment, as shown in Figure 12.
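
The comparison can be reproduced from the user counts reported for Figures 8, 10, and 11. The Python sketch below expresses each result as a multiple of the baseline, which avoids any ambiguity between "percent of" and "percent more"; the dictionary keys and function name are illustrative.

```python
# User counts reported in the text for the HP ProLiant BL685c G7 server blade.
users = {
    "bare-metal 64-bit": 208,
    "bare-metal 32-bit": 127,
    "virtualized 32-bit": 731,
}


def times_as_many(candidate, baseline):
    """How many times as many users the candidate platform supports."""
    return users[candidate] / users[baseline]


for baseline in ("bare-metal 64-bit", "bare-metal 32-bit"):
    factor = times_as_many("virtualized 32-bit", baseline)
    print(f"virtualized 32-bit vs {baseline}: {factor:.2f}x")
```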

Figure 12. Scalability comparison

Consolidation example

To highlight the benefits of consolidation, HP determined how many modern server blades it would

take to accommodate the workload supported by a number of legacy blades.

The challenge was as follows:

Support at least 1,500 Microsoft Office 2003 users from a single HP BladeSystem c7000 enclosure

Replace 16 HP ProLiant BL460c G1 server blades, each able to support 96 users17.

To replace the legacy blades, HP selected the 2P HP ProLiant BL465c G7 server blade, which was

configured as follows:

AMD Opteron processors Model 6174 (2.2 GHz)

– 12 cores

– 12 MB shared L3 cache

64 GB RAM

HP Smart Array P410i controller with RAID 0

– 2 x 146 GB 10,000 rpm SAS drive

– 1 GB FBWC

HP NC551i Dual port FlexFabric 10 Gb Converged Network Adapter

HP determined that a virtualized HP ProLiant BL465c G7 server blade can support 645 users in a

XenApp environment; thus, three18 of these blades can be used to accommodate the workload

previously supported by 16 legacy blades, as shown in Figure 13.

In fact, the three modern blades were able to support approximately 26% more users than the legacy systems (1,935 versus 1,536), while

leaving 13 slots available for future expansion.
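
The sizing in this example reduces to a ceiling division. The following Python sketch reproduces it from the figures given in the text; the 16-slot enclosure size is inferred from the 16 legacy blades and the 13 free slots mentioned above, and the variable names are illustrative.

```python
import math


def blades_needed(target_users, users_per_blade):
    """Smallest number of blades able to support the target user count."""
    return math.ceil(target_users / users_per_blade)


ENCLOSURE_SLOTS = 16  # inferred from the text: 16 legacy blades, 13 slots freed

legacy_capacity = 16 * 96            # 16 legacy blades, 96 users each
blades = blades_needed(1500, 645)    # 645 users per virtualized blade
new_capacity = blades * 645
extra_capacity_pct = (new_capacity - legacy_capacity) / legacy_capacity * 100
free_slots = ENCLOSURE_SLOTS - blades

print(blades, new_capacity, free_slots)
print(f"{extra_capacity_pct:.1f}% more users than the legacy blades")
```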

17 16 x 96 = 1,536 supported users

18 3 x 645 = 1,935 supported users


Figure 13. 16 legacy systems replaced by three virtualized HP ProLiant BL465c G7 server blades, with capacity to spare

Note:

The numbers of users projected for the above HP ProLiant BL465c G7

server blades do not take into consideration factors such as third-party

agents that typically consume a modest amount of system resources (such as

virus scanning, software provisioning, remote administration, and firewalls).

Be aware that varying the user profile and workload can have a significant

impact on scalability.

As with the legacy implementation, it is assumed that management servers

are installed elsewhere.

Minimizing utility costs

The cost of powering and cooling servers is high, constituting a significant – if not the most significant

– portion of total IT infrastructure costs. Indeed, more and more studies are indicating that server

hardware is no longer the leading data center expense. For example, the purchase price of a new,

1U server has already been exceeded by the capital cost of the power and cooling infrastructure

needed to support it and will soon be exceeded by its lifetime energy costs (for more information,

refer to HP ActiveAnswers).

Consolidating your workload on a small number of high-performance HP ProLiant server blades can

significantly reduce your annual utility costs. If, for example, the cost of electricity were $0.10 per

kWh, annual utility costs for the baseline HP ProLiant BL460c G1 server blade configuration would be

$5,168 for 24 x 7 operation. Utility costs for the three HP ProLiant BL465c G7 server blades are

significantly lower19, as shown in Figure 14.
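
Annual utility cost for continuous operation follows directly from power draw and the electricity rate. The Python sketch below shows the calculation under stated assumptions: a 365-day year, the $0.10 rate used in the text, and the 1,935 W figure from the footnote taken to be the draw of the consolidated configuration. The legacy power draw is not stated in the text, so it is back-calculated from the $5,168 figure and should be treated as an estimate.

```python
HOURS_PER_YEAR = 24 * 365  # 24 x 7 operation, 365-day year assumed


def annual_energy_cost(watts, rate_per_kwh=0.10):
    """Annual cost in dollars of running a constant load of the given watts."""
    return watts / 1000 * HOURS_PER_YEAR * rate_per_kwh


def implied_watts(annual_cost, rate_per_kwh=0.10):
    """Constant load in watts implied by an annual cost."""
    return annual_cost / (HOURS_PER_YEAR * rate_per_kwh) * 1000


# Assumption: 1,935 W is the draw of the three consolidated blades.
print(f"${annual_energy_cost(1935):,.2f} per year at 1,935 W")
# Back-calculated estimate of the legacy configuration's draw.
print(f"{implied_watts(5168):,.0f} W implied by $5,168 per year")
```

Substituting a local electricity rate for `rate_per_kwh` adapts the estimate to a particular data center.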

19 Based on high-performance technical computing (HPTC) requirements of 1,935 W


Note:

Power requirements for the legacy and consolidated configurations were

estimated using the HP BladeSystem Power Sizer (BPS).

Based on actual component-level power measurements for systems stressed

to their maximum capabilities, the BPS helps you plan a particular HP

BladeSystem deployment, providing power and cooling requirements, cost,

and a detailed bill of material.

Figure 14. Annual utility costs for legacy configuration are significantly higher


Appendix A – Testing

Appendix A describes the methodology used by HP to characterize the optimal number of users

supported by VMs on HP ProLiant servers.

Important:

As with any laboratory testing, the performance metrics quoted in this

paper are idealized. In a production environment, these metrics may be

impacted by a variety of factors.

HP recommends proof-of-concept testing in a non-production environment

using the actual target application as a matter of best practice for all

application deployments. Testing the actual target application in a

test/staging environment identical to, but isolated from, the production

environment is the most effective way to characterize system behavior.

This section provides more information on test tools, user profile, and test scenarios.

Test tools

To facilitate the placement and management of simulated loads on a XenApp server, HP used

Terminal Services Scalability Planning Tools (TSScaling), a suite of tools developed by Microsoft to

help organizations with Windows Server 2003 Terminal Server capacity planning.

Table A-1 describes these tools.

Table A-1. Components of TSScaling

Component          File                   Description
Automation tools   Robosrv.exe            Drives the server-side of the load simulation
Automation tools   Robocli.exe            Helps drive the client-side of the load simulation
Test tools         Qidle.exe              Determines if any scripts have failed and require operator intervention
Test tools         Tbscript.exe           A script interpreter that helps drive the client-side load simulation
Help files         TBScript.doc           Terminal Server bench scripting documentation
Help files         TSScalingSetup.doc     A scalability test environment set-up guide
Help files         TSScalingTesting.doc   A testing guide


More information

Roboserver (Robosrv.exe) and Roboclient (Robocli.exe): Terminal Server capacity planning

TSScaling: Windows Server 2003 Terminal Server Capacity and Scaling

User profile

To simulate a typical workload in a XenApp environment, HP selected the Heavy User profile. Heavy

Users (also known as Structured Task Workers) tend to open multiple applications simultaneously and

remain active for long periods. Heavy Users often leave applications open when not in use.

Table A-2 outlines the activities performed by these users.

Table A-2. Activities incorporated into the test script

Activity      Description
Access        Open a database, apply a filter, search through records, add records, and delete records.
Excel         Open, print and save a large spreadsheet.
InfoPath      Enter data20 into a form; save the form over an existing form.
Outlook       First pass: Email a short message. Second pass: Email a reply with an attachment.
Outlook_2     Create a long reply.
PowerPoint    Create a new presentation, insert clipart, and apply animation. View the presentation after each slide is created.
PowerPoint2   Open and view a large presentation with heavy animation and many colors and gradients.
Word          Create, save, print, and email a document.

Test scenarios

HP tested a bare-metal server to provide a baseline, then tested virtualized configurations to compare

their scalability.

HP used the same basic methodology, tools, and workload to characterize the performance and

scalability of the bare-metal server and of VMs running on that server.

Obtaining the baseline

To characterize the bare-metal performance of the non-virtualized server, HP used a workload based

on the activities described in User profile.

Testing was initiated by running the particular workload with a group of simulated users; start times

were staggered to eliminate authentication overhead. After these sessions finished, HP added another

group of users, then repeated the testing. Further users were added until the optimal number (see

Performance and scalability metrics) was reached.
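
The ramp-up described above can be sketched in a few lines of Python. This is only an illustration: the function names, the group size, and the stagger interval are hypothetical, since the paper does not state the values HP used.

```python
from typing import Iterator, List


def ramp_steps(group_size: int, max_users: int) -> Iterator[int]:
    """Yield the cumulative user count for each successive test pass.

    Each pass adds one more group of simulated users (group_size is an
    assumed value, not taken from the paper).
    """
    users = group_size
    while users <= max_users:
        yield users
        users += group_size


def staggered_offsets(user_count: int, stagger_s: float) -> List[float]:
    """Return start offsets in seconds so that no two sessions log on at once.

    stagger_s is an assumed interval between consecutive session starts.
    """
    return [i * stagger_s for i in range(user_count)]


if __name__ == "__main__":
    for users in ramp_steps(group_size=10, max_users=30):
        offsets = staggered_offsets(users, stagger_s=30.0)
        print(f"pass with {users} users, last session starts at {offsets[-1]:.0f} s")
```

In practice the loop would stop as soon as the metrics indicate that the optimal number of users has been reached, rather than running to a fixed maximum.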

Characterizing VM performance

HP used a similar methodology to characterize the aggregate performance of VM configurations on

the tested server. Note, however, that when characterizing VM performance, you must also consider

the demands of the hypervisor: if the number of user sessions is increased too quickly or too many

sessions are initiated concurrently, CPU utilization on the physical server can increase dramatically.

Footnote 20: Data entry for Office InfoPath 2003 requires significant processor resources.


So as not to over-saturate the tested server, HP ensured VMs were online before testing began and

controlled the number of users being added to VMs, thus minimizing spikes in processor utilization. To

ensure VMs approached saturation at the same rate, HP adopted a round-robin approach when

adding users.
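
As a minimal sketch of the round-robin approach, the following Python function distributes new user sessions across VMs one at a time, so that every VM fills at roughly the same rate. The function name and the VM names are hypothetical; the paper does not describe the tooling HP used for this step.

```python
from itertools import cycle
from typing import Dict, Iterable, List, Sequence


def assign_round_robin(user_ids: Iterable[int],
                       vm_names: Sequence[str]) -> Dict[str, List[int]]:
    """Assign each user to the next VM in turn, wrapping around at the end."""
    if not vm_names:
        raise ValueError("at least one VM is required")
    assignment: Dict[str, List[int]] = {vm: [] for vm in vm_names}
    # cycle() repeats the VM list endlessly; zip() stops when users run out.
    for user, vm in zip(user_ids, cycle(vm_names)):
        assignment[vm].append(user)
    return assignment


if __name__ == "__main__":
    plan = assign_round_robin(range(1, 8), ["vm1", "vm2", "vm3"])
    for vm, users in plan.items():
        print(vm, users)
```

With seven users and three VMs, no VM ever holds more than one session more than any other, which is the property that keeps the VMs approaching saturation together.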

Performance and scalability metrics

While the Office 2003-based workload was running, HP monitored a range of Windows

Performance Monitor (Perfmon) counters to characterize the performance and scalability of the bare-

metal server and VMs. HP also used canary scripts featuring Office 2003-based activities to establish

the number of users that could be supported before user response times became unacceptable.

HP typically uses the Perfmon % Processor Time counter to establish the optimal number of users supported by a XenApp server, defined as the number of users active when processor utilization reaches 80%. At this point, a limited number of additional users or services can still be supported; however, user response times may become unacceptable.

In a 32-bit XenApp environment, System Page Table Entries (PTEs) on a bare-metal server may

become exhausted before processor utilization reaches 80% due to the well-known scalability

limitations of the 32-bit Windows platform.

To validate metrics obtained from Perfmon, HP uses canary scripts to characterize response times for

a range of discrete activities, such as the time taken to invoke an application or for a modal box to

appear. By monitoring response times – a very practical metric – as more and more users log on, HP

has been able to demonstrate that these times are acceptable when the optimal number of users (as

determined using Perfmon counter values) is active.

With some tested servers, response times begin to increase before processor utilization reaches 80%. In such cases, HP prefers to be conservative, specifying as optimal the number of users supported when response times first become unacceptable (that is, when these times begin to increase markedly over a baseline level).
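
The selection rule described in this section can be illustrated with a short Python sketch. It returns the user count at which either processor utilization reaches 80% or canary response times rise markedly above the baseline, whichever comes first. The 80% threshold is from the text; the function name, the sample layout, and the factor of 1.5 used to mean "markedly" are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

# One sample per test pass: (active users, % Processor Time, canary response time in s)
Sample = Tuple[int, float, float]


def optimal_users(samples: List[Sample],
                  cpu_limit: float = 80.0,
                  response_factor: float = 1.5) -> Optional[int]:
    """Return the user count at the first pass that hits either limit.

    The first sample's response time is treated as the baseline.
    response_factor is an assumed definition of a "marked" increase.
    Returns None if neither limit is reached in the supplied samples.
    """
    if not samples:
        return None
    baseline = samples[0][2]
    for users, cpu_percent, response_s in samples:
        cpu_reached = cpu_percent >= cpu_limit
        response_degraded = response_s > response_factor * baseline
        if cpu_reached or response_degraded:
            return users
    return None


if __name__ == "__main__":
    cpu_bound = [(10, 20.0, 1.0), (20, 40.0, 1.1), (30, 60.0, 1.2), (40, 81.0, 1.3)]
    response_bound = [(10, 20.0, 1.0), (20, 40.0, 1.1), (30, 60.0, 1.8), (40, 81.0, 2.5)]
    print(optimal_users(cpu_bound))       # CPU limit reached first
    print(optimal_users(response_bound))  # response times degrade first
```

The second data set shows the conservative case: response times degrade at 30 users, before utilization reaches 80%, so the lower figure is reported.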

HP used the same basic methodology to characterize performance and scalability of the bare-

metal server and of VMs running on that server.

Characterizing the optimal number of users in the virtualized environment

HP ran Perfmon on each VM to log the CPU resources consumed during a particular scenario.

Individual results were also aggregated to provide a single view of the capabilities of the tested server

when virtualized. However, while plots of Perfmon counter values tend to be relatively smooth in a

non-virtualized environment, when VMs are deployed, hypervisor activity introduces sporadic

transients that make raw data difficult to interpret. By utilizing moving averages, HP was able to

smooth out these transients, creating a view of processor consumption that helped characterize the

optimal number of users supported by VMs in this environment.

After a particular scenario was run, Perfmon logs for all VMs were saved to a single file. Office Excel

was then used to plot a moving average of 10 sequential log values.
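
HP performed this smoothing in Office Excel. The same calculation can be sketched in Python as follows; the function name is hypothetical, and the sketch assumes a trailing window (each output point averages the current value and the nine before it), which the paper does not specify.

```python
from collections import deque
from typing import Deque, Iterable, List


def moving_average(values: Iterable[float], window: int = 10) -> List[float]:
    """Return the trailing moving average over `window` sequential values.

    No output is produced until a full window of values is available,
    so the result has len(values) - window + 1 points.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    buffer: Deque[float] = deque(maxlen=window)
    running_total = 0.0
    averages: List[float] = []
    for value in values:
        if len(buffer) == window:
            # The oldest value is about to be pushed out of the window.
            running_total -= buffer[0]
        buffer.append(value)
        running_total += value
        if len(buffer) == window:
            averages.append(running_total / window)
    return averages


if __name__ == "__main__":
    # Hypothetical % Processor Time samples with one hypervisor-induced spike.
    cpu_log = [40, 42, 41, 95, 43, 44, 42, 45, 46, 44, 47, 48]
    print(moving_average(cpu_log, window=10))
```

A single transient such as the 95% sample above shifts each 10-point average by only a few percentage points, which is why the smoothed plot is easier to read than the raw counter values.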

While this methodology is less precise than that used by HP in a bare-metal environment, it provides

significant insight into overall system performance and the performance of individual VMs. By

analyzing Perfmon results in conjunction with canary response times, HP was able to specify the

optimal aggregate number of users supported by VMs in each scenario.


Test topology

Figure A-1 illustrates the HP Server Based Computing (HP SBC) test environment.

Figure A-1. The tested environment – the HP ProLiant DL785 G5 server is shown


For more information

HP ProLiant servers http://www.hp.com/go/proliant

HP ActiveAnswers solution area for HP SBC, including Citrix XenApp and Microsoft Terminal Services http://www.hp.com/solutions/activeanswers/hpsbc

Citrix XenApp http://www.citrix.com/site/PS/products/feature.asp?familyID=19&productID=186&featureID=4110

Citrix XenServer http://h71019.www7.hp.com/ActiveAnswers/cache/457122-0-0-225-121.html

HP Sizer for Citrix XenApp and Microsoft Terminal Services http://h71019.www7.hp.com/ActiveAnswers/cache/70245-0-0-0-121.html

HP Solution Centers http://www.hp.com/go/solutioncenters

HP Services http://www.hp.com/hps/

AMD Opteron processors http://www.amd.com/us/products/server/Pages/server.aspx

Intel Xeon processors http://www.intel.com/products/server/processors/index.htm

To help us improve our documents, please provide feedback at

http://h20219.www2.hp.com/ActiveAnswers/us/en/solutions/technical_tools_feedback.html.

© Copyright 2009 - 2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation. AMD Opteron, AMD Virtualization, and AMD-V are trademarks of Advanced Micro Devices, Inc. Intel and Xeon are trademarks of Intel Corporation in the U.S. and other countries.

4AA2-5115ENW, Created March 2009; Updated September 2010, Rev. 2