
Microsoft Private Cloud Fast Track Reference Architecture Guide

Published: July 2012

For the latest information, please see the Microsoft Server and Cloud Platform site.


Contributors

Microsoft Consulting Services

Adam Fazio ([email protected])

Solution Architect

David Ziembicki ([email protected])

Solution Architect

Joel Yoker ([email protected])

Solution Architect

Business stakeholders

Mike Truitt ([email protected])

Senior Product Planner

Bryon Surace ([email protected])

Senior Program Manager, Windows Server

Jim Dial ([email protected])

Principal Knowledge Engineer, Server & Cloud Division, Information Experience Solutions

Copyright information

This document is provided “as-is”. Information and views expressed in this document, including URL and other Internet

Web site references, may change without notice.

Some examples depicted herein are provided for illustration only and are fictitious. No real association or connection is

intended or should be inferred.

This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You

may copy and use this document for your internal, reference purposes.

© 2012 Microsoft. All rights reserved.

Active Directory, Forefront, Hyper-V, Microsoft, SharePoint, SQL Server, Windows, Windows Server, Windows

PowerShell, and Windows Vista are trademarks of the Microsoft group of companies.

All other trademarks are property of their respective owners.

Contents

Introduction
Private Cloud Fast Track Program Description
    Business Value
    Technical Benefits
    Technical Overview
    Using this Document
    Microsoft Private Cloud Overview
    Private Cloud Architecture Principles & Concepts
        Resource Pooling
        Elasticity and Perception of Infinite Capacity
        Perception of Continuous Service Availability
        Drive Predictability
        Take a Service Provider’s Approach
        Multitenancy
        Security and Identity
    Private Cloud Reference Model
    Conceptual Architecture
        Fabric
        Management
        Service Delivery
        Operations
Reference Architecture
    Use Cases
        Service Models
        IaaS
        Data Center Consolidation and Virtualization
        Virtual Desktop Infrastructure
    Fabric Logical Architecture
        Fabric
    Server Architecture
        Rack or Blade Chassis Design
        Server and Blade Design Recommendations
        Server and Blade Storage Connectivity Recommendations
        Server and Blade Network Connectivity
        Server and Blade High Availability and Redundancy Recommendations
    Storage Architecture
        Storage Options
        SAN Storage Protocols
        Cluster Shared Volumes
        SAN Design
        Storage Automation
    Network Architecture
        Three-Tier Network Design
        Collapsed Core Network Design
        High Availability and Resiliency
        Network Security and Isolation
        Network Automation
    Virtualization Architecture
        Virtualization
        Windows Server 2008 R2 SP1 and Hyper-V Host Design
        Hyper-V Failover Cluster Design
        Hyper-V Guest Virtual Machine Design
    Management Architecture
        Management Hosts
        Management Logical Architecture
        Management Systems Architecture
        Management Scenarios Architecture
        Service Management
        Backup and Disaster Recovery
        Security
    Service Delivery Layer
    Operations
Appendix: Detailed Private Cloud Fast Track SQL Server Design Diagram

Introduction

The Microsoft® Private Cloud Fast Track Program is a joint effort between Microsoft and its hardware

partners. The goal of the program is to help organizations decrease the time, complexity, and risk of

implementing private clouds. The program provides:

Reference implementation guidance: Lab-tested and validated guidance for implementing multiple Microsoft products and technologies with hardware that meets specific minimum, vendor-agnostic requirements. Customers can use this guidance to implement a private cloud solution with hardware they already own or purchase.

Reference implementations: Microsoft hardware partners define physical architectures with

computing, network, storage, and value-added software components that meet (or exceed)

the minimum hardware requirements defined in the reference implementation guidance. Each

implementation is then validated with Microsoft and made available for purchase to

customers. For further details, see Private Cloud How To Buy.

The customer has the choice of building the solution by using the reference implementation guidance or purchasing a solution from a Microsoft hardware partner that couples the guidance with optimized hardware configurations. Although both options decrease the time, cost, and risk of implementing private clouds, purchasing a reference implementation from a Microsoft hardware partner results in the fastest, lowest-risk solution, because all of the hardware and software best-practice implementation choices have already been made by the engineering teams of Microsoft and its hardware partners. As a result, it is often also the least expensive option.

The private cloud model provides much of the efficiency and agility of cloud computing in addition to the

increased control and customization that is achieved through dedicated private resources. With the

Microsoft Private Cloud Fast Track Program, Microsoft and its hardware partners can help provide

organizations with the control and flexibility required to reap the potential benefits of the private cloud.

The Private Cloud Fast Track Program includes three documents to help you create your private cloud

solution. Refer to the following companion guides:

Microsoft Private Cloud Fast Track Reference Deployment Guide

Microsoft Private Cloud Fast Track Reference Operations Guide


Private Cloud Fast Track Program Description

The Microsoft Private Cloud Fast Track Program is a joint reference architecture for building private clouds that combines Microsoft software, consolidated guidance, and validated configurations with hardware partner compute, network, and storage architectures, and value-added software components.

Specifically, the Microsoft Private Cloud Fast Track Program utilizes the core capabilities of the Windows

Server operating system, Hyper-V technology, and Microsoft System Center 2012 to deliver the building

blocks of a private cloud infrastructure as a service offering. The key software components of every

reference implementation are the Windows Server 2008 R2 SP1 operating system, Hyper-V, and Microsoft

System Center 2012.

Business Value

The Microsoft Private Cloud Fast Track Program includes a set of three documents that provide reference

implementation guidance and reference implementations (as described previously). The program can be

used to build private clouds that are flexible and extensible. A Microsoft Private Cloud Fast Track solution

helps organizations implement virtualization and private clouds with increased ease and confidence. The

potential benefits of the Microsoft Private Cloud Fast Track Program include faster deployment, reduced

risk, and a lower cost-of-ownership.

Faster Deployment

End-to-end architectural and deployment guidance

Streamlined infrastructure planning due to predefined capacity

Enhanced functionality and automation through deep knowledge of infrastructure

Integrated management for virtual machine and infrastructure deployment

Reduced Risk

Tested end-to-end interoperability for compute, storage, and network (if the solution is purchased

from a Microsoft hardware partner)

Predefined, out-of-box solutions based on a common cloud architecture

High degree of service availability through automated load balancing

Lower Cost-of-Ownership

Near-zero downtime with exceptional fault tolerance, providing high availability

Dynamic pooling that can enhance the use of virtualization resources with Hyper-V and with

supported storage and network devices

Utilization of low-cost switches that consume less power and deliver high throughput for large

bandwidth requirements

Technical Benefits

The Microsoft Private Cloud Fast Track Program integrates multiple Microsoft products and technologies,

in addition to hardware requirements, to create reference implementation guidance. If the solution is

purchased from a Microsoft hardware partner, the reference implementation guidance is implemented

with partner hardware and sold as a reference implementation. Whether the customer implements the Microsoft-validated reference implementation guidance on their own hardware or purchases hardware from a Microsoft partner, the solution goes through a validation process. In either case, Microsoft and its

hardware partners have created a solution that is ready to meet customer needs.


Technical Overview

To establish a baseline of understanding for the term “cloud computing,” this document utilizes

terminology from the United States National Institute of Standards and Technology’s (NIST) Definition of

Cloud Computing. This is one of the more popular definitions in use today. The current release is version

16 of the definition, which was created with input from many public and private reviewers and

contributors. For more information, see Final Version of NIST Cloud Computing Definition Published.

Note: The following text in this section is an excerpt from NIST Definition of Cloud Computing (Mell and

Grance 2011).

Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared

pool of configurable computing resources (such as networks, servers, storage, applications, and services)

that can be rapidly provisioned and released with minimal management effort or service provider

interaction. This cloud model is composed of five essential characteristics, three service models, and four

deployment models.

Essential Characteristics:

On-demand self-service. A consumer can unilaterally provision computing capabilities, such as server time

and network storage, as needed automatically without requiring human interaction with each service

provider.

Broad network access. Capabilities are available over the network and accessed through standard

mechanisms that promote use by heterogeneous thin or thick client platforms (such as mobile phones,

tablets, laptops, and workstations).

Resource pooling. The provider’s computing resources are pooled to serve multiple consumers using a

multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned

according to consumer demand. There is a sense of location independence in that the customer generally

has no control or knowledge over the exact location of the provided resources but may be able to specify

location at a higher level of abstraction (such as country, state, or data center). Examples of resources

include storage, processing, memory, and network bandwidth.

Rapid elasticity. Capabilities can be elastically provisioned and released, in some cases automatically, to

scale rapidly outward and inward commensurate with demand. To the consumer, the capabilities available

for provisioning often appear to be unlimited and can be appropriated in any quantity at any time.

Measured service. Cloud systems automatically control and optimize resource use by leveraging a metering

capability¹ at some level of abstraction appropriate to the type of service (such as storage, processing,

bandwidth, and active user accounts). Resource usage can be monitored, controlled, and reported,

providing transparency for both the provider and consumer of the utilized service.

Service Models:

¹ Typically this is done on a pay-per-use or charge-per-use basis.


Software as a Service (SaaS). The capability provided to the consumer is to use the provider’s applications

running on a cloud infrastructure². The applications are accessible from various client devices through

either a thin client interface, such as a web browser (web-based email), or a program interface. The

consumer does not manage or control the underlying cloud infrastructure including network, servers,

operating systems, storage, or even individual application capabilities, with the possible exception of

limited user-specific application configuration settings.

Platform as a Service (PaaS). The capability provided to the consumer is to deploy onto the cloud

infrastructure consumer-created or acquired applications created using programming languages, libraries,

services, and tools supported by the provider.³ The consumer does not manage or control the underlying

cloud infrastructure including network, servers, operating systems, or storage, but has control over the

deployed applications and possibly configuration settings for the application-hosting environment.

Infrastructure as a Service (IaaS). The capability provided to the consumer is to provision processing,

storage, networks, and other fundamental computing resources where the consumer is able to deploy and

run arbitrary software, which can include operating systems and applications. The consumer does not

manage or control the underlying cloud infrastructure but has control over operating systems, storage, and

deployed applications; and possibly limited control of select networking components (host firewalls).

Deployment Models:

Private cloud. The cloud infrastructure is provisioned for exclusive use by a single organization comprising

multiple consumers (business units). It may be owned, managed, and operated by the organization, a third

party, or some combination of them, and it may exist on or off premises.

Community cloud. The cloud infrastructure is provisioned for exclusive use by a specific community of

consumers from organizations that have shared concerns (such as mission, security requirements, policy,

and compliance considerations). It may be owned, managed, and operated by one or more of the

organizations in the community, a third party, or some combination of them, and it may exist on or off

premises.

Public cloud. The cloud infrastructure is provisioned for open use by the general public. It may be owned,

managed, and operated by a business, academic, or government organization, or some combination of

them. It exists on the premises of the cloud provider.

Hybrid cloud. The cloud infrastructure is a composition of two or more distinct cloud infrastructures

(private, community, or public) that remain unique entities, but are bound together by standardized or

proprietary technology that enables data and application portability (such as cloud bursting for load

balancing between clouds).

² A cloud infrastructure is the collection of hardware and software that enables the five essential characteristics of cloud computing. The cloud infrastructure can be viewed as containing both a physical layer and an abstraction layer. The physical layer consists of the hardware resources that are necessary to support the cloud services being provided, and typically includes server, storage, and network components. The abstraction layer consists of the software deployed across the physical layer, which manifests the essential cloud characteristics. Conceptually, the abstraction layer sits above the physical layer.

³ This capability does not necessarily preclude the use of compatible programming languages, libraries, services, and tools from other sources.


Using this Document

This document will be of most benefit to system architects, designers, and engineers who plan, design, or

implement private cloud solutions in their organization.

The physical architecture detailed in this document was designed to achieve goals of high availability,

scalability, and performance, while providing security for the infrastructure and the virtual machines within

it. In addition to the physical architecture, this document includes a private cloud reference model and

principles that serve as the foundation for the physical architecture. Further, the document includes many

best practice recommendations from Microsoft product and enterprise service teams.


Microsoft Private Cloud Overview

The private cloud is a computing model that uses resources that are dedicated to your organization. A

private cloud shares many of the characteristics of public cloud computing including resource pooling,

self-service, elasticity, and usage-based metering. However, a private cloud is delivered in a standardized

manner with the additional control and customization available from dedicated resources.

Figure 1: Private cloud attributes

Although virtualization is an important technological component of a private cloud, the key differentiator

is the continued abstraction of computing resources from the infrastructure and the machines (virtual or

otherwise) that are used to deliver those resources. Only by delivering this abstraction can customers

achieve the potential benefits of a private cloud, which include improved agility and responsiveness,

increased business alignment and focus, and reduced total cost-of-ownership. In addition, a private cloud

can exceed the cost effectiveness of a virtualized infrastructure through higher workload density and

greater resource utilization.

Microsoft private cloud solutions are built on four key pillars:

All about the app: An application-centric cloud platform that helps you focus on business value.

Cross-platform from the metal up: Cross-platform support for multi-hypervisor environments,

operating systems, and application frameworks.

Foundation for the future: A Microsoft private cloud allows you to go beyond virtualization to a

true cloud platform.

Cloud on your terms: The ability to consume cloud on your terms, providing you the choice and

flexibility of a hybrid cloud model through common management, virtualization, identity, and

developer tools.

For more information about private cloud solutions, please see Microsoft Private Cloud Overview.


Private Cloud Architecture Principles & Concepts

Resource Pooling

Resource optimization is a principle that drives efficiency and cost reduction. It is achieved primarily

through resource pooling. Abstracting the platform from the physical infrastructure enables optimization

of resources through shared use. Multiple consumers sharing resources results in higher resource utilization

and leads to a more efficient, effective use of the infrastructure. Optimization through abstraction enables

many of the Microsoft private cloud principles, and this technique can ultimately help drive down costs

and improve agility.

Elasticity and Perception of Infinite Capacity

From a consumer’s perspective, cloud services appear to have infinite capacity. Using an electric utility provider as a metaphor, the consumer can use as much or as little of the service as needed. This utility

approach requires that capacity planning be paramount and proactive so that requests can be satisfied on

demand. Applying this principle reactively and in isolation often leads to an inefficient use of resources and

unnecessary costs. Combined with other principles, like encouraging a desired consumer behavior, this

principle allows for a balance between the cost of unused capacity and the desire for agility.

Perception of Continuous Service Availability

From the consumer’s perspective, cloud services should always appear available when needed. The

consumer should never experience an interruption of service, even if failures occur within the cloud

environment. To achieve this perception, a provider must have a mature service management approach, an

inherent application resiliency, and infrastructure redundancies in a highly automated environment. Much

like the perception of infinite capacity, the perception of continuous availability can only be achieved in

conjunction with the other Microsoft private cloud principles.

Drive Predictability

Predictability is a fundamental cloud principle whether you are a consumer or a provider. From the

vantage point of the consumer, cloud services should be consistent, and they should have the same quality

and functionality any time they are used.

To achieve predictability, a provider must deliver an underlying infrastructure that ensures a consistent experience for the hosted workloads. This consistency is realized through the homogenization of underlying

physical servers, network devices, and storage systems.

From the service management perspective of a provider, predictability is driven through the

standardization of service offerings and processes. The principle of predictability is necessary for driving

service quality.

Take a Service Provider’s Approach

When you take a service provider’s approach to delivering information technology, a key capability is to

be able to meter resource utilization and charge users for that usage. Historically, when IT departments

have been asked to deliver a service to the business, they purchase the necessary components and then


build an infrastructure that is specific to the service requirements. This process can result in an increase in

time-to-market, higher costs because of duplicate infrastructures, and unmet business expectations of

agility and cost reduction. Further compounding the issue, this model is often used when an existing

service needs to be expanded or upgraded.

IT departments can transform their organization by taking a service provider’s approach. When

infrastructure is provided as a service, IT departments can use a shared resource model that enables

economies of scale, and they can also combine other private cloud architecture principles & concepts to

achieve greater agility for providing services.

Multitenancy

Multitenancy refers to a principle in which an infrastructure can be logically subdivided and provisioned to

organizations or organizational units. The traditional example is a hosting company that provides servers

to multiple customer organizations. Increasingly, this model is being used by organizations with a

centralized IT department that provides services to multiple business or organizational units and treats

each as a customer or tenant.

Security and Identity

Security for a Microsoft private cloud is founded on three pillars: protected infrastructure, application

access, and network access.

Protected infrastructure: Uses security and identity technologies to help make sure that hosts,

information, and applications are secured across all scenarios in the data center, including physical

(on premises) and virtual (on premises and in the cloud) scenarios.

Application access: Allows IT professionals to extend vital application access to internal users,

business partners, and cloud users.

Network access: Uses an identity-centric approach to provide users (internal employees or users

in remote locations) with more secure access on numerous devices to help foster greater

productivity.

A more secure data center uses common, integrated technology to provide users simple access with a

common identity. A more secure data center also integrates management across physical, virtual, and

cloud environments so that a business can take advantage of all IT capabilities without requiring significant

financial investments.

Private Cloud Reference Model

Infrastructure as a service (IaaS) is the application of the private cloud architecture principles & concepts to

deliver infrastructure. As the cloud ecosystem matures, product features and capabilities broaden and

deepen. The following reference model can be used as a guide for delivering a holistic solution that spans

all the layers that are required for a mature IaaS. This model is a reference only, and it can assist architects

in developing a private cloud architecture. Some elements are emphasized more than others in the

technical reference architecture, and that preference is based on the experience of operating private clouds

in real-world environments.


Figure 2: Private Cloud Reference Model - IaaS view

The reference model is split into the following layers:

The software, platform, and infrastructure layers represent the technology stack. Each layer

provides services to the layer above.

The service operations and management layers represent the process perspective, and they include

the management tooling required to implement the process.

The service delivery layer represents the alignment between business and IT.

This reference model is a deliberate attempt to blend technology and process perspectives because cloud

computing is as much about service management as it is about the technologies involved in it. For more

information, see the following resources:

Information Technology Infrastructure Library (ITIL)

Microsoft Operations Framework (MOF)

Private Cloud Reference Model

Conceptual Architecture

A key driver of the layered approach to infrastructure architecture is to enable complex workflows and

automation to be delivered over time. This approach can be achieved by creating a collection of simple

automation tasks, assembling and managing procedures in the management layer, and then creating the

workflows and process automation that are controlled by the orchestration layer.


Fabric

Scale Units

In a modular architecture, the concept of a scale unit refers to the point to which a module in the

architecture can scale before another module is required. For example, an individual server is a scale

unit because it can be expanded to a certain point in terms of CPU and RAM, but when it reaches its

maximum scalability, an additional server is required to continue scaling. Each scale unit also has an

associated amount of physical installation and configuration labor. With large scale units, like a

preconfigured full rack of servers, the labor overhead can be minimized.

It is critical to know the scale limits of all components, both hardware and software, when you are

determining the optimum scale units for the overall architecture. Scale units enable the aggregation of

all the requirements that are needed for implementation (for example, space, power, HVAC, and

connectivity).
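
As a simple illustration of scale-unit thinking, the following Windows PowerShell sketch estimates how many virtual machines a hypothetical rack-level scale unit can host. All figures are illustrative assumptions, not Fast Track requirements.

    # Hypothetical scale-unit sizing sketch: estimate how many 4 GB virtual
    # machines one preconfigured rack of Hyper-V hosts can support. All
    # figures are illustrative assumptions, not Fast Track requirements.
    $hostsPerRack  = 16     # hosts in one rack-level scale unit
    $ramPerHostGB  = 192    # physical RAM per host
    $hostReserveGB = 8      # RAM reserved for the parent partition
    $reserveHosts  = 1      # spare capacity to absorb one host failure
    $ramPerVmGB    = 4      # average guest memory footprint

    $usableGB   = ($hostsPerRack - $reserveHosts) * ($ramPerHostGB - $hostReserveGB)
    $vmCapacity = [math]::Floor($usableGB / $ramPerVmGB)
    "Scale unit supports approximately $vmCapacity virtual machines"

The same arithmetic, extended with power, space, and port counts, helps compare candidate scale-unit sizes before a design is committed.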

Servers

Data center architects have constantly evolved their choice of hardware architecture. Choices range

from rack-mounted servers to tightly integrated, highly redundant blade systems to container models.

A similar spectrum exists for storage and networking equipment.

Server scale limits are well published, and examples include the number and speed of CPU cores, the

maximum amount and speed of RAM, and the number and type of expansion slots. The number and

type of onboard input/output (I/O) ports and the number and type of supported I/O cards are

particularly important. Ethernet and Fibre Channel expansion cards often provide multiport options

where a single card can have four ports. Additionally, in blade server architectures, there are often

limitations in the number of I/O cards and supported combinations. It is important to be aware of these

limitations in addition to the oversubscription ratio between blade I/O ports and blade chassis switch

modules. A single server is not typically a good scale unit for a private cloud solution because of the

overhead that is required to install and configure an individual server.

Storage

Storage architecture is a critical design consideration for private cloud solutions. The topic is

challenging because it is rapidly evolving in terms of new standards, protocols, and implementations.

Storage and supporting storage networking is critical to the performance of the environment. The

overall cost is also significantly impacted because storage tends to be costly compared to other

components of the infrastructure.

Current storage architectures have several layers that can include the storage arrays, the storage

network, the storage protocol, and for virtualization, the file system that is utilizing the physical

storage.

One of the primary objectives of a private cloud solution is to enable rapid provisioning and de-

provisioning of virtual machines, but doing so at a large scale requires tight integration with the

storage architecture and robust automation. Provisioning a new virtual machine on an existing logical

unit number (LUN) is a simple operation; however, provisioning a new LUN and adding it to a host

cluster are relatively complicated tasks that can benefit from automation.
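
For example, after the storage array has presented a new LUN to every node, the cluster-side portion of that workflow can be scripted with the FailoverClusters module included in Windows Server 2008 R2. The following is a minimal sketch; the cluster name is a placeholder, and the vendor-specific array-side LUN provisioning is omitted.

    # Minimal sketch of the cluster-side steps after a new LUN has been
    # presented to all Hyper-V nodes. Array-side provisioning is vendor-
    # specific and not shown; the cluster name is a placeholder.
    Import-Module FailoverClusters

    # Pick up the newly presented disk that is visible to the cluster
    $newDisk = Get-ClusterAvailableDisk -Cluster "HVCluster01"

    # Add it as a cluster disk, then convert it to a Cluster Shared Volume
    $resource = $newDisk | Add-ClusterDisk
    Add-ClusterSharedVolume -Cluster "HVCluster01" -Name $resource.Name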


Networking

Many network architectures include a tiered design with three or more tiers, such as core, distribution,

and access. Designs are driven by the port bandwidth and quantity required at the edge, in addition to

the ability of the distribution and core tiers to provide higher speed uplinks to aggregate traffic.

Examples of additional considerations include Ethernet broadcast boundaries and limitations, the

spanning tree algorithm, and loop avoidance technologies.

A dedicated management network is a frequent feature of advanced data center virtualization

solutions because it allows hosts to be managed through a dedicated network to help eliminate

competition with guest traffic needs and provide a degree of separation for security purposes. A

dedicated management network typically implies dedicating a network adapter per host and a port

per networked device to the management network.

With advanced data center virtualization, a frequent use case is to provide isolated networks in which

different owners, such as particular departments or applications, are provided with a dedicated

network. Multitenant networking refers to using technologies such as virtual local area networks

(VLANs) or Internet protocol security (IPsec) isolation techniques to provide dedicated networks that

utilize a single network infrastructure or wire.

Managing the network environment in an advanced data center virtualization solution can present

challenges that must be addressed. Ideally, network settings and policies are defined centrally and

applied universally by the management solution. In the case of IPsec-based isolation, this can be

accomplished by using Active Directory® Domain Services (AD DS) and Group Policy to control firewall

settings across the hosts and guests, in addition to the IPsec policies controlling network

communication.

For VLAN-based network segmentation, several components, including the host servers, host clusters,

Microsoft System Center 2012 Virtual Machine Manager, and the network switches must be configured

correctly to enable rapid provisioning and network segmentation. With Hyper-V and host clusters,

identical virtual networks must be defined on all nodes for a virtual machine to fail over to any node

and maintain its connection to the network. On a large scale, this configuration task can be

accomplished by scripting with Windows PowerShell®.
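
The following sketch illustrates the pattern. Note that the Hyper-V PowerShell module used here ships with Windows Server 2012 and later; on Windows Server 2008 R2 SP1, the same task is typically performed through System Center Virtual Machine Manager cmdlets or WMI. The cluster, switch, and adapter names are placeholders.

    # Illustrative sketch: create an identically named external virtual
    # network (virtual switch) on every cluster node so that a virtual
    # machine keeps its network connection after failover. Requires the
    # Hyper-V module (Windows Server 2012 or later); on Windows Server
    # 2008 R2 use the VMM cmdlets or WMI instead. Names are placeholders.
    Import-Module FailoverClusters

    $nodes = Get-ClusterNode -Cluster "HVCluster01" |
        Select-Object -ExpandProperty Name

    foreach ($node in $nodes) {
        Invoke-Command -ComputerName $node -ScriptBlock {
            # Bind the switch to the adapter dedicated to guest traffic
            New-VMSwitch -Name "GuestNetwork" -NetAdapterName "GuestNIC" -AllowManagementOS $false
        }
    }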

Virtualization

Decoupling hardware, operating systems, data, applications, and user state opens a wide range of

options for better management and distribution of workloads across the physical infrastructure. The

ability of the virtualization layer to migrate running virtual machines from one server to another

without downtime and many other features that are provided by hypervisor-based virtualization

technologies enable a rich set of solution capabilities. These capabilities can be utilized by the

automation, management, and orchestration layers to maintain desired states, proactively address

decaying hardware, or handle other issues that would otherwise cause faults or service disruptions.

Like the hardware layer, the automation, management, and orchestration layers must be able to

manage the virtualization layer. Virtualization provides an abstraction of software from hardware that

enables the majority of management and automation to move from manual human tasks to

automated tasks that are executed by management software.


Management

Fabric Management

Fabric management is the concept of treating discrete capacity pools of servers, storage, and networks

as a single fabric. The fabric is then subdivided into capacity clouds, or resource pools, which carry

characteristics such as delegation of access and administration, service level agreements (SLAs), and

cost metering. Fabric management centralizes and automates complex management functions that

can be carried out in a highly standardized, repeatable fashion to increase availability and lower

operational costs.

Process Automation and Orchestration

The orchestration layer that manages the automation and management components must be

implemented as the interface between the IT organization and the infrastructure. Orchestration

provides the bridge between IT business logic, such as "Deploy a new web-server virtual machine when

capacity reaches 85 percent," and the dozens of steps in an automated workflow that are required to

implement such a change.

Ideally, the orchestration layer provides a graphical interface that combines complex workflows with

events and activities across multiple management system components and forms an end-to-end IT

business process. The orchestration layer must provide the ability to design, test, implement, and

monitor these IT workflows.
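
As a hypothetical illustration of such business logic, the sketch below evaluates aggregate host memory utilization and flags when the 85 percent threshold is crossed. In a production solution this logic would live in a System Center 2012 Orchestrator runbook; the host names and the deployment action are placeholders.

    # Hypothetical sketch of the rule "deploy a new web-server virtual
    # machine when capacity reaches 85 percent". Host names and the
    # deployment action are placeholders; production logic would be an
    # Orchestrator runbook.
    $hostNames = "HV01", "HV02", "HV03"
    $threshold = 85

    $usedKB = 0; $totalKB = 0
    foreach ($name in $hostNames) {
        $os = Get-WmiObject -Class Win32_OperatingSystem -ComputerName $name
        $totalKB += $os.TotalVisibleMemorySize
        $usedKB  += ($os.TotalVisibleMemorySize - $os.FreePhysicalMemory)
    }

    $utilization = [math]::Round(100 * $usedKB / $totalKB, 1)
    if ($utilization -ge $threshold) {
        # Placeholder for the automated workflow that provisions the new
        # virtual machine (for example, a VMM deployment job).
        Write-Output "Capacity at $utilization percent - trigger web-server deployment"
    }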

Service Management System

A service management system is a set of tools that is designed to facilitate service management

processes. Ideally, these tools should integrate data and information from the entire set of tools found

in the management layer. The service management system should process and present the data as

needed.

At a minimum, the service management system should link to the configuration management system

(CMS), commonly known as the configuration management database (CMDB), and it should log and

track incidents, issues, and changes. The service management system should be integrated with the

service health modeling system so that incident tickets can be auto-generated.

User Self-Service

Self-service capability is a characteristic of private cloud computing, and it must be present in any

implementation. The intent is to permit users to approach a self-service capability and be presented

with options that are available for provisioning. The capability may be basic (provision a virtual

machine with a predefined configuration), more advanced (allow configuration options to the base

configuration), or complex (implement a platform capability or service).

Self-service capability is a critical business driver that enables members of an organization to become

more agile in responding to business needs with IT capabilities that align and conform to internal

business and IT requirements.

The interface between IT and the business should be abstracted to a well-defined, simple, and

approved set of service options. The options should be presented as a menu in a portal or be available


from the command line. The business can select these services from the catalog, start the provisioning

process, and be notified upon completion, at which point they are charged only for the services that

are actually used.
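
As a simple illustration of a command-line entry point, the sketch below presents a menu of service options and captures the user's selection. The option names and the provisioning step are hypothetical placeholders for what the approved service catalog and the automated provisioning workflow would supply.

    # Hypothetical command-line self-service menu. Option names and the
    # provisioning step are placeholders for the approved service catalog
    # and the automated provisioning workflow behind it.
    $options = @(
        "Virtual machine (predefined configuration)",
        "Virtual machine (configurable CPU and memory)",
        "Platform service (for example, a database instance)"
    )

    for ($i = 0; $i -lt $options.Count; $i++) {
        "{0}) {1}" -f ($i + 1), $options[$i]
    }

    $choice = [int](Read-Host "Select a service option") - 1
    # Placeholder: hand the request to the provisioning workflow, then
    # notify the consumer on completion for usage-based charging.
    Write-Output "Provisioning request submitted for: $($options[$choice])"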

Service Delivery

Service Catalog

Service catalog management involves defining and maintaining a catalog of services that are offered

to consumers. This catalog lists the following:

Classes of services that are available

Requirements to be eligible for each service class

Service-level attributes and targets that are included with each service class

Cost models for each service class

The service catalog might also include specific virtual machine templates that are designed for

workload patterns. Each template defines the virtual machine configuration specifics such as amount of

allocated CPU, memory, and storage.
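
The sketch below expresses such a catalog as data. Every class name, attribute, and figure is an illustrative assumption; in a real deployment each class would map to a VMM virtual machine template.

    # Hypothetical service catalog expressed as data. All class names,
    # attributes, and figures are illustrative; in practice each class
    # would map to a VMM virtual machine template.
    $serviceCatalog = @(
        @{ Class = "Gold";   CPU = 4; MemoryGB = 16; StorageGB = 200; Availability = "99.9%"; CostPerMonth = 400 },
        @{ Class = "Silver"; CPU = 2; MemoryGB = 8;  StorageGB = 100; Availability = "99.5%"; CostPerMonth = 200 },
        @{ Class = "Bronze"; CPU = 1; MemoryGB = 4;  StorageGB = 50;  Availability = "99.0%"; CostPerMonth = 100 }
    )

    foreach ($svc in $serviceCatalog) {
        "{0}: {1} vCPU, {2} GB RAM, {3} GB storage, {4} availability, {5}/month" -f $svc.Class, $svc.CPU, $svc.MemoryGB, $svc.StorageGB, $svc.Availability, $svc.CostPerMonth
    }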

Capacity Management

Capacity management defines the processes necessary to achieve the perception of infinite capacity.

Capacity must be managed to meet existing and future peak demand while controlling

underutilization. Business relationships and demand management are key inputs into effective capacity

management, and they require a service provider’s approach, as first mentioned in the Private Cloud

Architecture Principles & Concepts section of this document. Predictability and optimization of

resource usage are primary principles to achieve capacity management objectives.

Availability Management

Availability management defines the processes that are necessary to achieve the perception of

continuous availability. Continuity management defines how risk will be managed in a disaster scenario

to help make sure minimum service levels are maintained. The principles of resiliency and automation

are fundamental.

Service-Level Management

Service-level management is the process of negotiating SLAs and making sure that the agreements are

met. SLAs define target levels for cost, quality, and agility by service class, in addition to metrics to

measure actual performance. Managing SLAs is necessary for achieving the perception of infinite

capacity and continuous availability. Service-level management also requires a service provider’s

approach by IT.

Service Lifecycle Management

Service lifecycle management takes an end-to-end management view of a service. A typical journey

starts with identifying a business need, moves on to managing the business relationship, and continues until that service becomes available. Service strategy drives the service design. After launch,


the service is transitioned to operations and refined through continual service improvement. A service

provider’s approach is critical to successful service lifecycle management.

Operations

Change Management

Change management controls the lifecycle of all changes. The primary objective of change

management is to eliminate, or at least minimize, disruption while desired changes are made to

services. Change management focuses on understanding and balancing the cost and risk of making

the change versus the potential benefit of the change to the business or the service. Driving

predictability and minimizing human involvement are the core principles to achieve a mature service

management process and ensure that changes can be made without impacting the perception of

continuous availability.

Incident and Problem Management

Incident management quickly resolves, with minimal disruption, events that impact or threaten to impact services. Problem management identifies and resolves the root causes of incidents. Problem

management also tries to prevent or minimize the impact of possible incidents.

Configuration Management

Configuration management helps make sure that the assets required to deliver services are properly

controlled. The goal is to have accurate and reliable information about those assets available when and

where it is needed. This information includes details about asset configuration and the relationships

between assets.

Configuration management typically requires a CMDB, which is used to store configuration records

throughout their lifecycle. The configuration management system maintains one or more CMDBs, and

each CMDB stores the attributes of the configuration items and their relationships to other

configuration items.

Reference Architecture

Use Cases

Service Models

The following image depicts the taxonomy of cloud services and defines the separation of responsibilities

when you adopt each service model. Please see the next section for more details about the service models.


Figure 3: Taxonomy of cloud services

IaaS

IaaS abstracts hardware into a pool of computing, storage, and connectivity capabilities that are delivered

as services for a usage-based cost. IaaS provides a flexible, standard, and virtualized operating environment

that can become a foundation for platform as a service (PaaS) and software as a service (SaaS).

IaaS usually provides a standardized virtual server. The consumer takes responsibility for configuration and

operations of the guest operating system, software, and database. Compute capabilities (like performance,

bandwidth, and storage access) are also standardized. Service level agreements cover the performance and

availability of the virtualized infrastructure. The consumer takes on the operational risk that exists above

the infrastructure.

The Microsoft Private Cloud Fast Track Program aims primarily to deliver IaaS and to enable PaaS and SaaS.

Data Center Consolidation and Virtualization

Consolidation and virtualization enable enterprise customers to migrate physical computers and virtual

machines to Hyper-V virtualization technology and Hyper-V-based cloud environments. Migrating to these

technologies reduces capital and operational expenses while improving manageability of virtual and

physical environments by utilizing the products in Microsoft System Center 2012.

Goals

Deploy a highly standardized Hyper-V network and storage infrastructure to reduce the costs of

facilities, hardware, and licensing incurred by alternative solutions

Implement a holistic and robust management solution to reduce server sprawl

Transition from organically-grown virtualized environments to a private cloud solution while

implementing new capabilities and business


Virtual Desktop Infrastructure

Virtual desktop infrastructure (VDI) enables IT staff to deploy desktops in virtual machines on centralized,

data center hardware. A centralized, optimized virtual desktop enables users to access and run their

desktop and applications wherever they may be. By using virtual desktops, the IT department is able to

build a more agile and efficient IT infrastructure. Flexible desktop scenarios that are running Windows

operating systems give organizations the ability to choose the client computing scenarios that meet the

unique needs of their businesses.

Fabric Logical Architecture

The logical architecture is composed of two parts. The first part is the fabric: the physical infrastructure of servers, storage, and network that hosts and runs all customer or consumer virtual machines. The second part is fabric management: a set of virtual machines that runs the Microsoft SQL Server data management software and the System Center 2012 management infrastructure.

The recommended practice is to have two or more Hyper-V host servers in a dedicated cluster for the

fabric management virtual machines, and then have separate clusters for the fabric. For smaller scale

deployments, the fabric management virtual machines could be hosted on the fabric itself.

Fabric

The following graphic depicts the high-level minimum requirements for the fabric. The requirements are

categorized in compute, storage, and network layers. The minimums and recommendations are designed

to balance cost versus density and performance.


Figure 4: Private cloud fabric infrastructure

Server Architecture

The host server architecture is a critical component of the virtualized infrastructure and a key variable in

the consolidation ratio and cost analysis. The ability of the host server to handle the workload of a large

number of consolidation candidates increases the consolidation ratio and helps provide the desired cost

benefit.

The system architecture of the host server refers to the general category of the server hardware itself.

Examples include rack mounted servers, blade servers, and large symmetric multiprocessor servers (SMP).

The primary tenet to consider when selecting system architectures is that each Hyper-V host will contain

multiple guest operating systems with multiple workloads. Processor, RAM, storage, and network capacity

are critical, as are high I/O capacity and low latency. The host server must be able to provide the required

capacity in each of these categories.

Note: The Windows Server Catalog is useful to assist customers in selecting appropriate hardware. It

contains all servers, storage, and other hardware devices that are certified for Windows Server 2008 R2 and

Hyper-V. The logo program and support policy for failover cluster solutions changed with Windows Server

2008 R2 and Windows Server 2008, and cluster solutions are not listed in the Windows Server Catalog. All

individual components that comprise a cluster configuration need to earn the appropriate "Certified for"

Windows Server 2008 R2 or Windows Server 2008 designations, and they will be listed in their device-

specific category in the Windows Server Catalog. To find out if your components are certified:

Open the Windows Server Catalog. Under Hardware Testing Status, click Certified for Windows Server 2008

R2.


Rack or Blade Chassis Design

The rack or blade chassis design should provide redundant power connectivity (that is, multiple power

distribution unit or PDU) capability for racks, or multiple hot-swappable power supplies for the blade

chassis.

Server and Blade Design Recommendations

2- to 8-socket server with a maximum of 64 logical processors enabled

64-bit CPU with virtualization technology support, data execution prevention (DEP), and second level address translation (SLAT)

64 GB RAM minimum

A minimum of 40 GB of local RAID 1 or RAID 10 disk space for the operating system partition, or an equivalent boot from a storage area network (SAN) design

For more information, see Installing Windows Server 2008 R2.

Server and Blade Storage Connectivity Recommendations

Internal serial advanced technology attachment (SATA) or serial attached SCSI (SAS) controller for direct attached storage, unless the design is 100 percent SAN-based, including boot from storage area network (SAN) for the host operating system

If you are using a Fibre Channel SAN, two or more 4 or 8 gigabit Fibre Channel (GFC) host bus adapters (HBAs)

If you are using iSCSI, two or more 1 Gb or 10 Gb network adapters or HBAs

If you are using Fibre Channel over Ethernet (FCoE), two or more 10 Gb converged network adapters (CNAs)

Note: For iSCSI, 10 Gb network adapters are recommended because of the dynamic nature of

virtualized data centers. If 1 Gb network adapters are used, throughput should be carefully monitored.

Server and Blade Network Connectivity

Use multiple network adapters and/or multiport network adapters on each host server. For converged

designs, network technologies that provide teaming or virtual network adapters can be utilized. This

arrangement assumes that two or more physical adapters can be teamed for redundancy and that multiple

virtual network adapters and/or VLANs can be presented to the hosts for traffic segmentation and

bandwidth control. For the recommended configuration by quantity and type of network adapter, see

Hyper-V: Live Migration Network Configuration Guide.

Server and Blade High Availability and Redundancy Recommendations

If you are using rack mounted servers, each server should have redundant power supplies.

If you are using rack mounted servers, each server should have redundant fans.

If you are using blade servers, each chassis should have redundant power supplies.

If you are using blade servers, each chassis should have redundant fans.


If the Hyper-V host system partition uses direct attached storage, each server should provide SAS

or SATA RAID capability for the system partition.

Storage Architecture

The storage design for any virtualization-based solution is a critical element that is typically responsible for

a large percentage of the solution’s overall cost, performance, and agility.

Storage Options

Not all workloads have the same availability requirements, nor do they achieve their requirements in the same way. In the case of data center architecture, workloads are classified as stateful or stateless. A stateful workload stores data specific to that virtual machine; if that data is lost, the workload becomes unavailable. A stateless workload uses data stored elsewhere in the data center, and it can achieve high availability through resiliency in the application. An example of a stateless workload is a front-end web server farm.

Many data centers run more stateful workloads; therefore, this architecture assumes SAN storage will be

used throughout. However, the solution implementer may want to use non-clustered Hyper-V hosts and

direct-attached storage (DAS) for stateless workloads or for special cases such as for a VDI.

After the workload type is determined, the performance and availability characteristics of the specific

workload should be analyzed as follows to determine the storage characteristics required:

Shared storage is required for Hyper-V host clustering.

The use of non-shared storage (for example, DAS) is an exception, which may be preferable,

depending on the implementation requirements.

iSCSI shared storage is required for Hyper-V guest clustering.

SAN Storage Protocols

Block-based versus File-based Storage

In Windows Server 2008 R2 with SP1, file-based storage is not supported for Hyper-V host clusters. Host clusters require block-based shared storage that is accessible to each host in the cluster.

iSCSI versus Fibre Channel versus FCoE

Fibre Channel has historically been the storage protocol of choice for enterprise data centers for a

variety of reasons, including performance and low latency. These considerations have offset the

typically higher costs of Fibre Channel. In the last several years, the continually advancing performance

of Ethernet from 1 Gb to 10 Gb and beyond has led to great interest in storage protocols that use

Ethernet transports such as iSCSI, and recently, Fibre Channel over Ethernet (FCoE).

A key advantage of the protocols that use Ethernet transport is the ability to use a converged network

architecture. Converged networks have an Ethernet infrastructure that serves as the transport for LAN

and storage traffic. This can reduce costs by eliminating dedicated Fibre Channel switches and


reducing cabling. FCoE allows for the potential benefits of using an Ethernet transport while retaining

the advantages of the Fibre Channel protocol and the ability to use Fibre Channel storage arrays.

Several enhancements to standard Ethernet are required for FCoE. The result is commonly referred to as enhanced Ethernet or Data Center Ethernet, and it requires Ethernet switches that are capable of supporting these enhancements.

For Hyper-V, iSCSI-capable storage provides an advantage in that it is the protocol that can also be

utilized by Hyper-V guest virtual machines for guest clustering. A common practice in large-scale

virtualization deployments is to use Fibre Channel and iSCSI. Fibre Channel provides the host storage

connectivity, and iSCSI is used only by guest operating systems that require built-in operating system

iSCSI connectivity, such as a guest cluster. In this case, although Ethernet and some storage I/O will be

sharing the same pipe, segregation is achieved by VLANs and quality-of-service (QoS) that can be

applied with the OEM’s networking software.

Storage Network

FCoE and iSCSI use an Ethernet transport for storage networking, which provides another architecture

choice. The choices are to use a dedicated Ethernet network with separate switches, cables, and paths,

or to use a converged network in which multiple traffic types are run over the same cabling and

infrastructure.

The following diagram illustrates the differences between traditional and converged architectures. On

the left, is a traditional architecture with separate Ethernet and Fibre Channel switches, each with

redundant paths. On the right, is a converged architecture in which both Ethernet and Fibre Channel

(through FCoE) utilize the same set of cables while still providing redundant paths. The converged

architecture requires fewer switches and cables; however, the switches must be capable of supporting

enhanced Ethernet.

Figure 5: Storage network architectures

When you plan your storage network, consider the following:

Provide logical or physical isolation between storage and Ethernet I/O.

Ensure that host bus adapters (HBAs) or converged adapters are logo certified for Windows

Server 2008 R2 with SP1.

If you use a converged network, provide QoS for storage performance.


Provide iSCSI connectivity for guest clustering.

Provide fully redundant, independent paths for storage I/O.

For FCoE, use standards-based converged network adapters, switches, and Fibre Channel storage

arrays.

Make sure that the selected storage arrays provide iSCSI connectivity over standard Ethernet so

that Hyper-V guest clusters can be utilized.

If you are using iSCSI or Fibre Channel, make sure that there are dedicated network adapters or

HBAs, switches, and paths for the storage traffic.

Cluster Shared Volumes

Windows Server 2008 R2 includes the first version of Failover Clustering to offer a distributed file access solution. Cluster shared volumes (CSV) is a feature in Windows Server 2008 R2 that is designed exclusively for use with the Hyper-V role. It enables all nodes in the cluster to access the same cluster storage volumes at the same time. CSV uses standard NTFS, and it has no special hardware requirements beyond supported block-based shared storage.

CSV provides shared access to the disk and a storage path for I/O fault tolerance (dynamic I/O redirection).

If the storage path on one node becomes unavailable, the I/O for that node is rerouted over server message block (SMB) to another node. A performance impact can be expected while running in this state; redirected I/O is designed as a temporary failover path while the primary dedicated storage path is

brought back online. This feature can use any cluster communications network and further increases the

need for high-speed networks.

CSV maintains metadata information about the volume access and requires that some I/O operations take

place over the cluster communications network. One node in the cluster is designated as the coordinator

node, and it is responsible for these disk operations. However, virtual machines have direct I/O access to

the volumes, and they only use the dedicated storage paths for disk I/O, unless a failure scenario occurs as

described previously.

CSV Limits

The following limitations are imposed by the NTFS file system and are inherited by CSV.

CSV Parameter                  Limitation
Maximum volume size            256 TB
Maximum number of partitions   128
Directory structure            Unrestricted
Maximum files per CSV          4+ billion
Maximum VMs per CSV            Unlimited

Table 1: CSV Limits

CSV Requirements

All cluster nodes must use Windows Server 2008 R2 SP1.

All cluster nodes must use the same drive letter for the system disk.

All cluster nodes must be on the same logical network subnet. VLANs are recommended for

multisite clusters running CSV.


NT LAN Manager (NTLM) authentication in the local security policy must be enabled on cluster

nodes.

SMB must be enabled for each network on each node that will carry CSV cluster communications.

Client for Microsoft Networks and File and Printer Sharing for Microsoft Networks must be enabled

in the network adapter’s properties to enable all nodes in the cluster to communicate with CSV.

The Hyper-V role must be installed on any cluster node that might host a virtual machine.
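After these requirements are met, adding a disk to CSV can be scripted by using the FailoverClusters module for Windows PowerShell that is included with Windows Server 2008 R2. The following is a minimal sketch; the disk name shown is a placeholder for the actual clustered disk in your environment.

Import-Module FailoverClusters

# Make the shared disk available to the cluster, then promote it to a CSV.
# "Cluster Disk 2" is a placeholder name.
Get-ClusterAvailableDisk | Add-ClusterDisk
Add-ClusterSharedVolume -Name "Cluster Disk 2"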

CSV Volume Sizing

Because all cluster nodes can access all CSV volumes simultaneously, in a cluster with CSV, you can now

use standard LUN allocation methodologies based on performance and capacity requirements of the

workloads running within the virtual machines. Generally speaking, isolating the virtual machine operating system I/O from the application data I/O on separate LUNs is a good start. Apply application-specific I/O considerations as well, such as segregating databases and transaction logs, and create SAN volumes and/or storage pools that match the I/O profile (for example, random read and write operations versus sequential write operations).

The CSV architecture differs from other traditional clustered file systems, which makes it free from

common scalability limitations. As a result, there is little guidance for scaling the number of Hyper-V

nodes or virtual machines on a CSV volume. Make sure that the overall I/O requirements of the

expected virtual machines running on the CSV are met by the underlying storage system and storage

network. Although rare, disks and volumes can enter a state that requires running chkdsk, which on a large volume might take a long time to complete, causing downtime of the volume roughly proportional to the volume's size.

Each enterprise application that you plan to run within a virtual machine might have unique storage

recommendations or virtualization-specific storage guidance. That guidance also applies to use with

CSV volumes. Be aware that all virtual disks running on a particular CSV will contend for storage I/O.

It is worth noting that individual SAN LUNs do not necessarily equate to dedicated disk spindles. A

SAN storage pool or RAID array may contain many LUNs. A LUN is simply a logical representation of a

disk that is provisioned from a pool of disks. Therefore, if an enterprise application requires specific

storage I/O operations per second (IOPS) or disk response times you must consider all the LUNs that

are in use in that storage pool. An application that would require dedicated physical disks were it not virtualized might require dedicated storage pools and dedicated CSV volumes when running within a virtual machine.

Consider the following when using CSV:

The CSV feature in Windows Server 2008 R2 or an equivalent clustered file system that

supports Hyper-V is recommended to enable Hyper-V Live Migration.

For maximum flexibility, configure LUNs for CSV with a single volume so that 1 LUN equals 1

CSV.

For I/O optimization or performance critical workloads, at least 4 CSVs per host cluster are

recommended for segregating operating system I/O, random read/write I/O, sequential I/O,

and other virtual machine-specific data.

Follow the vendor’s recommendations for storage with CSVs.

Create a standard size and IOPS profile for each type of CSV LUN to utilize for capacity

planning. When additional capacity is needed, provision additional standard CSV LUNs.


Consider prioritizing the network that is used for CSV traffic. For more information, see

Designating a Preferred Network for Cluster Shared Volumes Communication.
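As a brief illustration of the last consideration, the preferred network for CSV traffic in Windows Server 2008 R2 is selected by the cluster network metric (the lowest metric wins). The following Windows PowerShell sketch assumes a cluster network named "CSV Network"; substitute the name used in your environment.

Import-Module FailoverClusters

# Lower the metric of the intended CSV network so the cluster prefers it.
$network = Get-ClusterNetwork "CSV Network"
$network.Metric = 900

# Confirm the resulting order.
Get-ClusterNetwork | Format-Table Name, Metric, AutoMetric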

CSV Design Patterns

Single CSV per Cluster

In the single CSV per cluster design pattern, the SAN is configured to present a single large LUN to all

the nodes in the host cluster. The LUN is configured as a CSV in failover clustering. All virtual machine-

related files that belong to the virtual machines that are hosted on the cluster are stored on the CSV.

Optionally, data deduplication functionality that is provided by the SAN can be utilized (if it is

supported by the SAN vendor).

Figure 6: Virtual machines on a single large CSV

Multiple CSVs per Cluster

In the multiple CSV per cluster design pattern, the SAN is configured to present two or more large

LUNs to all the nodes in the host cluster. The LUNs are configured as a CSV in failover clustering.

All virtual machine-related files that belong to the virtual machines that are hosted on the cluster

are stored on the CSVs. Also, data deduplication functionality that is provided by the SAN can be

utilized (if it is supported by the SAN vendor).


Figure 7: VMs on multiple CSVs with minimal segregation

For the single and multiple CSV patterns, each CSV has the same I/O characteristics, so each

individual virtual machine has all its associated virtual hard disks (VHDs) stored on one of the CSVs.

Figure 8: Each virtual machine’s virtual disks reside together on the same CSV

Multiple I/O Optimized CSVs per Cluster

In the multiple I/O optimized CSVs per cluster design pattern, the SAN is configured to present

multiple LUNs to all the nodes in the host cluster. However, the LUNs are optimized for particular

I/O patterns like fast sequential Read performance or fast random Write performance. The LUNs

are configured as CSV in failover clustering. All VHDs that belong to the virtual machines that are

hosted on the cluster are stored on the CSVs, but they are targeted to the appropriate CSV for the

given I/O needs.


Figure 9: Virtual machines with a high degree of virtual disk segregation

In the multiple I/O optimized CSVs per cluster design pattern, each individual virtual machine has

all its associated VHDs stored on the appropriate CSV per required I/O requirements.

Figure 10: Virtual machines with a high degree of virtual disk segregation

Note: A single virtual machine can have multiple VHDs, and each VHD can be stored on a different

CSV (provided all CSVs are available to the host cluster on which the virtual machine is created).


SAN Design

High Availability

The high availability SAN design should have no single point of failure; for example, it should include redundant power from independent PDUs, redundant storage controllers, redundant target ports or network adapters per controller, and redundant Fibre Channel or IP network switches.

Performance

Storage performance is a complicated mix of drive, interface, controller, cache, protocol, SAN, host bus

adapter (HBA), driver, and operating system considerations. The overall performance of the storage

architecture is typically measured in terms of maximum throughput, maximum I/O operations per

second (IOPS), and latency or response time. Although each of the factors is important, IOPS and

latency are highly relevant to server virtualization.

Many modern SANs use a combination of high-speed disks, slower-speed disks, and large memory

caches. Storage controller cache can improve performance during burst transfers or when the same

data is accessed frequently by storing it in the cache memory, which is typically several orders of

magnitude faster than the physical disk I/O. However, cache is not a substitute for adequate disk

spindles because caches are ineffective in aiding heavy Write operations.

Drive Types

The type of hard drive that is utilized in the host server or in the storage array will have significant

impact on the overall storage architecture performance. The critical performance factors for hard disks

are the interface architecture (for example, U320 SCSI, SAS, SATA), the rotational speed of the drive

(7200, 10K, 15K RPM), and the average latency in milliseconds. Additional factors, such as the cache on

the drive and support for advanced features, can improve performance. As with the storage

connectivity, high IOPS and low latency are more critical than maximum sustained throughput for host

server sizing and guest performance.

When you select drives, this translates into selecting those with the highest rotational speed and lowest

latency possible. Utilizing 15K RPM drives over 10K RPM drives can result in up to 35 percent more

IOPS per drive. The workloads that are targeted to run within the virtual machines play a critical role in

determining acceptable disk subsystem latency. Make sure that the latency reflects a minimum

assumption that production-class workloads will be running within the virtual machines that are

running on Windows Server 2008 R2 with SP1.
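To make the sizing exercise concrete, the following Windows PowerShell sketch estimates the spindle count needed to satisfy an aggregate IOPS target. All figures (virtual machine count, per-virtual machine IOPS, read/write mix, and per-spindle IOPS) are illustrative assumptions, not vendor data; substitute measured values from your own workload profiling.

# Illustrative IOPS sizing calculation; all inputs are assumptions.
$vmCount        = 64      # expected virtual machines
$iopsPerVm      = 25      # assumed average IOPS per virtual machine
$readRatio      = 0.6     # assumed 60/40 read/write mix
$writePenalty   = 2       # RAID 10 write penalty (two physical writes per logical write)
$iopsPerSpindle = 175     # assumed IOPS for a 15K RPM drive

$frontEndIops = $vmCount * $iopsPerVm
$backEndIops  = ($frontEndIops * $readRatio) + ($frontEndIops * (1 - $readRatio) * $writePenalty)
$spindles     = [math]::Ceiling($backEndIops / $iopsPerSpindle)

"Front-end IOPS: $frontEndIops / Back-end IOPS: $backEndIops / Spindles required: $spindles"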

RAID Array Design

The RAID type should provide high availability and high performance even in the event of disk failures

and RAID parity rebuilds. In general, RAID 10 (1+0) or a proprietary hybrid RAID type is recommended for virtual machine volumes. RAID 1 is also acceptable for host boot volumes, although

many proprietary RAID types and additional SAN capabilities can be employed. In general, the RAID

type must be able to tolerate a single drive failure and not sacrifice performance for capacity.


Multipathing

In all cases, multipathing should be used. Generally, storage vendors will build a device specific module

(DSM) on top of Microsoft Multipath I/O (MPIO) in Windows Server 2008 R2. Each DSM and HBA will

have a unique multipathing option and a recommended number of connections.
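In Windows Server 2008 R2, the inbox mpclaim.exe utility can be used to verify and claim multipath devices. The following sketch assumes the default Microsoft DSM; verify the flags against mpclaim /? and your vendor's DSM documentation before use.

# List storage devices and show whether MPIO has claimed them.
mpclaim.exe -s -d

# Claim all multipath-capable devices for the Microsoft DSM;
# -r schedules the reboot that the claim operation requires.
mpclaim.exe -r -i -a ""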

Fibre Channel (if Fibre Channel is used)

Fibre Channel is an option, and it is a supported storage connection protocol.

iSCSI

The iSCSI SAN should be on an isolated network, for security and performance reasons. You can use

the following options to achieve this:

A physically separate, dedicated storage network.

A physically shared network with the iSCSI SAN running on a private VLAN. The switch hardware should provide Class of Service (CoS) or Quality of Service (QoS) assurances for the private VLAN.

Encryption and Authentication

If multiple clusters and/or systems are used on the same SAN, proper segregation or device isolation

should be provided. The storage used by cluster A should be visible only to cluster A, and not to any

other cluster or to a node from a different cluster. The use of session authentication (for example,

Challenge Handshake Authentication Protocol or CHAP) is highly recommended. This provides a

degree of security in addition to segregation. Mutual CHAP or IPsec can also be used, but if so,

performance implications should be considered.

Jumbo Frames

If supported at all points in the iSCSI network, jumbo frames can increase throughput by up to 20

percent. Jumbo frames are supported in Hyper-V at the host and guest levels.
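End-to-end jumbo frame support can be verified from the host before placing iSCSI traffic on the network. The following sketch assumes a 9000-byte MTU; the target address is a placeholder.

# Show the current MTU for each interface.
netsh interface ipv4 show subinterfaces

# Send a non-fragmentable 8972-byte ping (9000 bytes minus 28 bytes of
# IP and ICMP headers). A failure indicates that a device in the path
# is not passing jumbo frames. 192.168.200.10 is a placeholder address.
ping -f -l 8972 192.168.200.10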

Data Deduplication

Data deduplication can yield significant storage cost savings in virtualization environments. Some

common considerations are the performance hits during the deduplication cycle and the maximum

efficiency that is achieved by locating similar data types on the same volume or LUN.

Thin Provisioning

In virtualization environments, thin provisioning is a common practice because it allows for efficient

use of the available storage capacity. The LUN and corresponding CSV can grow as needed, typically in an automated fashion, to ensure the availability of the LUN. However, as storage becomes over-provisioned, careful management and capacity planning are critical.

Volume Cloning

Volume cloning is another common practice in virtualization environments. Volume cloning can be

used for host and virtual machine volumes to dramatically improve host installation and virtual

machine provisioning times.


Volume Snapshots

SAN volume snapshots are a common method of providing a point-in-time, instantaneous backup of a

SAN volume or LUN. These snapshots are typically block-level, and they consume storage capacity only for blocks that change on the originating volume after the snapshot. Some SANs provide tight integration with Hyper-V and

integrate the Hyper-V VSS writer on hosts and volume snapshots on the SAN. This integration provides

a comprehensive and high-performing backup and recovery solution.

Storage Tiering

Storage tiering is the practice of physically partitioning data into multiple distinct classes based on attributes such as price or performance. Data can be dynamically moved among classes in a tiered storage implementation

based on access, activity, or other considerations.

Storage tiering is normally achieved through a combination of varying types of disks that are used for

different data types (for example, production, non-production, or back-up data types). The following

figure shows an example of storage tiering for a high I/O application such as Microsoft Exchange

Server.

Figure 11: Example of storage tiering

Storage Automation

One of the objectives of a private cloud solution is to enable rapid provisioning and deprovisioning of

virtual machines. Doing so on a large scale requires tight integration with the storage architecture and

robust automation. Provisioning a new virtual machine on an already existing LUN is a simple operation;

however, provisioning a new CSV LUN and adding it to a host cluster are relatively complicated tasks that

should be automated. Virtual Machine Manager enables end-to-end automation of this process through

SAN integration by using the Storage Management Initiative Specification (SMI-S) protocol.


Historically, many storage vendors have designed and implemented their own storage management

systems, application programming interfaces (APIs), and command-line utilities. This customization has

made it a challenge to use a common set of tools and scripts across heterogeneous storage solutions.

For the robust automation that is required in an advanced data center virtualization scenario, a SAN

solution that supports SMI-S is required. Preference is also given to SANs that support standard and

common automation interfaces like Windows PowerShell. Consider the following when you design a

storage solution:

The SAN should support SMI-S, and it should pass the Virtual Machine Manager 2012 SMI-S

validation test harness and certification. (Actual usage of SMI-S is not required if an OEM-specific

solution provides greater capability.)

The storage solution should provide mechanisms to achieve automated provisioning at a

minimum—ideally, automation of all common administrative tasks.
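As an illustration only, the following Windows PowerShell sketch shows the general shape of SMI-S based provisioning through the Virtual Machine Manager 2012 module. The provider URL, port, Run As account, pool, and LUN names are hypothetical, and cmdlet parameters vary by VMM version and SAN vendor, so treat this as a sketch to validate against your environment.

Import-Module virtualmachinemanager

# Register the array's SMI-S provider with Virtual Machine Manager.
# The URL, port, and Run As account are illustrative assumptions.
$account = Get-SCRunAsAccount -Name "StorageAdmin"
Add-SCStorageProvider -Name "array01-smis" -NetworkDeviceName "https://smis.contoso.com" -TCPPort 5989 -RunAsAccount $account

# Carve a new 500 GB logical unit out of a discovered storage pool.
$pool = Get-SCStoragePool -Name "Pool-Tier1"
New-SCStorageLogicalUnit -StoragePool $pool -Name "CSV-Volume-04" -DiskSizeMB 512000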

Network Architecture

There are a variety of design considerations for the network that supports the private cloud solution.

Three-Tier Network Design

Many network architectures include a tiered design with three or more tiers, such as core, aggregation (or distribution), and access. Designs are driven by the port bandwidth and quantity required at the edge, and by the

ability of the core and aggregation tiers to provide higher speed uplinks to aggregate traffic. Additional

considerations include Ethernet broadcast boundaries and limitations, in addition to the spanning tree

algorithm and other loop avoidance technologies.

Core

The core tier is the high-speed backbone for the network architecture. The core is typically comprised of

two modular switch chassis that provide a variety of service and interface module options. The data center

core tier might interface with other network modules.

Aggregation

The aggregation (or distribution) tier consolidates connectivity from multiple access tier switch uplinks. This

tier is commonly implemented in end-of-row switches, a centralized wiring closet, or a main distribution

frame (MDF) room. The aggregation tier provides high-speed switching and more advanced features, such

as Layer 3 routing and other policy-based networking capabilities. The aggregation tier should have

redundant, high-speed uplinks to the core tier for high availability.

Access

The access tier provides device connectivity to the data center network. This tier is commonly implemented

by using Layer 2 Ethernet switches, typically through blade chassis switch modules or top-of-rack (ToR)

switches. The access tier should provide redundant connectivity for devices, required port features, and

adequate capacity for access (device) ports and uplink ports. The network switches should support:

802.1q VLAN trunks.


An Ethernet link aggregation standard that is compatible with the rack or blade server network

adapters so that network adapter teaming can span two or more switches.

Ethernet link aggregation so that multiple uplink ports can be bonded together for high

bandwidth.

The access tier can also provide features that are related to network adapter teaming like link aggregation

control protocol (LACP). Certain teaming solutions might require LACP switch features. The following

diagrams illustrate a three-tier network model: one provides a 10 Gb Ethernet connection to devices and

the other provides a 1 Gb Ethernet connection to devices.

Figure 12: Three-tier network design

Collapsed Core Network Design

In smaller environments, a simpler network architecture than the three-tier model might be adequate. One

option is to combine the core and aggregation tiers (sometimes called a collapsed core). In this design, the

core switches provide core and aggregation functionality. The smaller number of tiers and switches provide

lower cost at the expense of future flexibility. The following diagram illustrates a design with the core and

aggregation tiers combined, and used in conjunction with the access tier.


Figure 13: Collapsed-core network design

High Availability and Resiliency

Providing redundant paths from the server through all the network tiers to the core tier is highly

recommended for high availability and resiliency. Technologies like network adapter teaming or the

spanning tree algorithm can be utilized to provide redundant path availability without looping.

Each network tier should include redundant switches. With redundant pairs of access tier switches,

individual switch resiliency is slightly less important, so the expense of redundant power supplies and other

component redundancy might not be required. At the core and aggregation tiers, full hardware

redundancy and device redundancy are recommended because of the critical nature of those tiers.

Sometimes devices fail, become damaged, or get misconfigured. For these situations, remote management

and the ability to remotely power cycle all devices becomes important to restore service rapidly. It’s

recommended that the network design allow for the loss of any switch or switch module without dropping

host server connectivity.

Network Security and Isolation

The network architecture should help enable security and isolation of network traffic. A variety of

technologies can be used individually or together to assist in security and isolation, for example:

VLANs enable traffic on one physical LAN to be subdivided into multiple virtual LANs or broadcast

domains. This is accomplished by configuring devices or switch ports to tag traffic with specific

VLAN IDs. A VLAN trunk is a network connection that can carry multiple VLANs, with each VLAN

tagged with specific VLAN IDs.

Access control lists (ACLs) enable traffic to be filtered or forwarded based on characteristics such as

protocol and the source or destination port. ACLs can be used to enable or prevent traffic from

reaching specific endpoints or to prohibit certain traffic types from reaching the network.


IPsec supports authenticating and encrypting network traffic to help protect against man-in-the-

middle attacks, network sniffing, and other data collection activities.

QoS allows rules to be set based on traffic type or attributes so that one form of traffic does not

block all others (by throttling it) or to help make sure that critical traffic has enough bandwidth

allocated.

Network Automation

Remote interfaces and management of the network infrastructure through Secure Shell (SSH) or a similar protocol are important to the automation and resiliency of the data center network. Remote access and

administration protocols can be used by management systems to automate complex or error prone

configuration activities. For example, adding a VLAN to a distributed set of access tier switches can be

automated to avoid the potential for human error.
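For example, a VLAN rollout across several access tier switches can be scripted over SSH. The following sketch uses plink.exe (from the PuTTY suite) and generic, placeholder switch commands; substitute your vendor's CLI syntax, device names, and credential handling.

# Push a new VLAN definition to each access tier switch over SSH.
# Switch names, the account, and the CLI commands are placeholders.
$switches = "tor-sw01", "tor-sw02", "tor-sw03"
$commands = "configure terminal`nvlan 240`nname iSCSI-B`nend`nwrite memory"

foreach ($switch in $switches) {
    # -batch prevents interactive prompts from stalling the script.
    $commands | plink.exe -ssh -batch -l netops $switch
}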

Virtualization Architecture

Virtualization

Virtualization is provided at multiple layers, including storage, network, and server. Virtualization supports

resource pooling at each of these layers and abstraction between the layers for greater efficiency.

Storage Virtualization

Storage virtualization refers to the abstraction (separation) of logical storage from physical storage so

that it can be accessed without regard to physical storage or heterogeneous structure. This separation

allows increased flexibility for how system administrators manage storage for end users.

Network Virtualization

Network virtualization is the process of combining hardware and software network resources and

network functionality into a single, software-based administrative entity known as a virtual network.

Network virtualization involves platform virtualization, and it is often combined with resource

virtualization.

Network virtualization is categorized as external when it combines many networks, or parts of

networks, into a virtual unit. Network virtualization is categorized as internal when it provides network-

like functionality to the software containers on a single system. Whether virtualization is internal or

external depends on the implementation that is provided by the vendors that support the technology.

Various equipment and software vendors offer network virtualization by combining any of the

following:

Network hardware, such as switches and network adapters

Networks, such as virtual LANs (VLANs) and containers such as virtual machines

Network storage devices

Network media, such as Ethernet and Fibre Channel


For more information about storage virtualization, see Storage Virtualization: the SNIA Technical

Tutorial.

Server Virtualization

Hardware virtualization uses software to create a virtual machine that emulates a physical computer.

This virtualization creates a separate operating system environment that is logically isolated from the

host server. By providing multiple virtual machines at once, this approach allows several operating

systems to run simultaneously on a single physical computer.

Hyper-V technology is based on a 64-bit hypervisor-based microkernel architecture that enables

standard services and resources to create, manage, and disable virtual machines. The Windows

hypervisor runs directly above the hardware and provides strong isolation between the partitions by

enforcing access policies for critical system resources such as memory and processors. The Windows

hypervisor does not contain non-Microsoft device drivers or code, which minimizes its attack surface

and provides a more secure architecture. For a functional overview, see the following graphic.


Figure 14: Windows Server 2008 R2 detailed Hyper-V architecture

In addition to the Windows hypervisor, there are two other major elements to consider in Hyper-V: a

parent partition and a child partition. The parent partition is a special virtual machine that runs

Windows Server 2008 R2, controls the creation and management of child partitions, and maintains

direct access to hardware resources. In this model, device drivers for physical devices are installed in


the parent partition. In contrast, the role of a child partition is to provide a virtual machine

environment to install and implement guest operating systems and applications.

For more information, please download the detailed Hyper-V architecture poster.
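The parent and child partitions are visible to management tools through the Hyper-V WMI provider. As a small example, the following Windows PowerShell query, run in the parent partition on Windows Server 2008 R2, lists the host and its virtual machines.

# Enumerate the parent partition and child partitions (virtual machines).
Get-WmiObject -Namespace "root\virtualization" -Class Msvm_ComputerSystem |
    Select-Object ElementName, EnabledState |
    Format-Table -AutoSize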

Windows Server 2008 R2 SP1 and Hyper-V Host Design

The recommendations in this section adhere to the support statements in the topic, Requirements and

Limits for Virtual Machines and Hyper-V in Windows Server 2008 R2.

Licensing

Windows Server 2008 R2 Enterprise includes use rights for up to four virtual machines. The use rights do not technically limit the number of guest operating systems that the host can run; rather, they include the right to run up to four Windows Server Enterprise guest operating systems on the licensed server. To run more than four virtual machines, you simply need to make sure that you have valid Windows Server licenses for the additional virtual machines.

In contrast, Windows Server 2008 R2 Datacenter includes unlimited virtualization use rights, which from a

licensing standpoint allows you to run as many Windows Server guest operating systems as you want on

the licensed physical server.

Microsoft Hyper-V Server 2008 R2 might be more appropriate for desktop workloads. For more

information, see Licensing for Virtual Environments.

Windows Server 2008 Enterprise and Windows Server 2008 Datacenter include virtualization use rights,

which is a license to run a specified number of Windows Server-based virtual machines on a licensed

server.

Operating System Configuration

The following are required for the Hyper-V role:

An x64-based processor. Hyper-V is available for the 64-bit versions of Windows Server 2008

Enterprise and Datacenter. Hyper-V is not available for 32-bit (x86-based) versions or Windows

Server 2008 for Itanium-based systems. However, the Hyper-V management tools are available for

the 32-bit versions.

Hardware-assisted virtualization. This is available in processors that include a virtualization

option.

Hardware-enforced data execution prevention (DEP) must be available and enabled. Specifically, you must enable this feature in the BIOS.

Additionally:

Use Windows Server 2008 R2 with the full or server core installation option. Note: there is no upgrade path from a server core installation to a full installation or vice versa, so please make this selection carefully.

Use the latest hardware device drivers.

Domain-join the Hyper-V parent partition.

Install the Hyper-V server role and the failover clustering feature.


Updates. Apply relevant Windows updates, including out-of-band (OOB) updates that are not

offered on Microsoft Update. For more information, see Hyper-V Update List for Windows Server

2008 R2.

Cluster Validation Wizard. All nodes, networks, and storage solutions must pass the Cluster

Validation Wizard.
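The role installation, validation, and cluster creation steps can all be scripted. The following Windows PowerShell sketch uses the inbox ServerManager and FailoverClusters modules in Windows Server 2008 R2; the node names, cluster name, and IP address are placeholders.

# On each node: install the Hyper-V role and the failover clustering feature.
Import-Module ServerManager
Add-WindowsFeature Hyper-V, Failover-Clustering -Restart

# From one node: validate the full configuration, then create the cluster.
Import-Module FailoverClusters
Test-Cluster -Node "HV01", "HV02", "HV03", "HV04"
New-Cluster -Name "FabricCluster1" -Node "HV01", "HV02", "HV03", "HV04" -StaticAddress 192.168.1.50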

Memory and Hyper-V Dynamic Memory

Dynamic Memory is a Hyper-V feature that helps you use physical memory more efficiently. With

Dynamic Memory, Hyper-V treats memory as a shared resource that can be reallocated automatically

among running virtual machines. Dynamic memory adjusts the amount of memory that is available to

a virtual machine, based on changes in memory demand and values that you specify. Dynamic

Memory is available for Hyper-V in Windows Server 2008 R2 Service Pack 1. You can make the

Dynamic Memory feature available by applying the service pack to the Hyper-V role in Windows

Server 2008 R2 or to Microsoft Hyper-V Server 2008 R2.

For a complete description of Dynamic Memory feature, including settings and design considerations,

refer to the Hyper-V Dynamic Memory Configuration Guide. This guide provides the specific operating

system, service pack, and integration component levels for supported operating systems. The guide

also contains the minimum recommended startup RAM setting for all supported operating systems.

In addition to the previous general guidance, specific applications or workloads (particularly those with

built-in memory management capability, such as SQL Server or Exchange Server) may provide

workload specific guidance. The Private Cloud Fast Track Reference Architecture utilizes SQL Server

2008 R2, and the SQL Server product group has published best practices guidance for Dynamic

Memory in Running SQL Server with Hyper-V Dynamic Memory.
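Current memory settings can also be inspected programmatically through the Hyper-V WMI provider. The following read-only Windows PowerShell sketch assumes the Windows Server 2008 R2 SP1 version of the Msvm_MemorySettingData class; verify the property names against the class documentation for your build.

# List per-virtual machine memory settings (values are in MB).
Get-WmiObject -Namespace "root\virtualization" -Class Msvm_MemorySettingData |
    Select-Object InstanceID, DynamicMemoryEnabled, Reservation, Limit |
    Format-Table -AutoSize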

Storage Adapters

It’s recommended that storage adapters be configured per the vendor’s recommendations.

MPIO Configuration

Microsoft Multipath I/O (MPIO) architecture supports iSCSI, Fibre Channel, and serial attached SCSI (SAS) SAN connectivity by establishing multiple sessions or connections to the storage array.

Multipath solutions use redundant physical path components (adapters, cables, and switches) to create

logical paths between the server and the storage device. If one or more of these components fails,

causing the path to fail, multipath logic uses an alternate path for I/O so that applications can still

access their data. Each network adapter (for iSCSI) or HBA should be connected by using redundant

switch infrastructures to provide continued access to storage in the event of a failure in a storage fabric

component.

Failover times vary by storage vendor, and they can be configured by using timers in the Microsoft

iSCSI Initiator, or by modifying the Fibre Channel host bus adapter driver parameter settings. Consider

the following:

Use MPIO with all iSCSI and Fibre Channel storage adapters.


Follow MPIO best practices as documented in Appendix B – MPIO & DSM Configuration and

Best Practices in the Windows Server High Availability with Microsoft MPIO whitepaper.

Follow the vendor’s recommended MPIO settings.

Network Adapters

For each node of the failover cluster, use more than one network adapter and configure at least one

network adapter for the private network. We recommend that you configure separate dedicated

networks with gigabit or faster speed for live migration traffic and cluster communication, and these

networks should be separate from the network that is used by the management operating system, and

from the network that is used by the virtual machines.

For information about the network traffic that can occur on a network used for Cluster Shared

Volumes, see “Understanding redirected I/O mode in CSV communication” in Requirements for Using

Cluster Shared Volumes in a Failover Cluster in Windows Server 2008 R2.

Use the following host server network adapter settings:

CSV. Each node should have a network adapter that carries CSV communication. Client for

Microsoft Networks and File and Printer Sharing for Microsoft Networks must be

enabled in the network adapter properties to support SMB. For more information about the

network that is used for CSV communication, see Managing the network used for Cluster

Shared Volumes.

Hardware and system settings. It is required that the storage configuration and hardware on

the failover cluster be identical and that the cluster nodes that are used for live migration have

processors that are made by the same manufacturer. If this is not possible, it is recommended

that the hardware and system settings are as similar as possible to minimize potential

problems.

Security policies. If possible, do not apply IPsec policies on a private network for live

migration because this can significantly impact the performance of the live migration.

However, IPsec should be implemented when the live migration traffic needs to be encrypted.

IP subnet configuration. Ensure that the virtual network on the source and destination nodes

(for the live migration) in the failover cluster is connected through the same IP subnet. This is

so the virtual machine can retain the same IP address after live migration. For each network in

a failover cluster where CSV is enabled, all nodes must be on the same logical subnet, which

means that multisite clusters that use CSV must use a VLAN.

Performance Settings

The following network performance improvements in Hyper-V Server 2008 R2 should be tested and

considered for production use:

TCP Checksum Offload is recommended. It can benefit CPU and overall network throughput

performance, and it is fully supported by live migration.


Support for jumbo frames was introduced in Windows Server 2008. Hyper-V in Windows

Server 2008 R2 simply extends this capability to virtual machines. Similar to physical network

scenarios, jumbo frames add the same basic performance enhancements to virtual networking

including up to six times larger payloads per packet, which improves overall throughput and

reduces CPU usage for large file transfers.

Virtual machine queue (VMQ) architecture allows the host’s network adapter to direct memory

access (DMA) packets directly into individual virtual machine memory stacks. Each virtual

machine device buffer is assigned a VMQ, which avoids needless packet copies and route

lookups in the virtual switch. The result is less data in the host’s buffers and an overall

performance improvement to I/O operations.

Note: We recommend the use of TCP Checksum Offload, jumbo frames, and VMQ.

The cluster heartbeat network should be on a distinctly separate subnet from the host

management network.

The network adapter that carries virtual machine traffic should not be shared with the host operating system, and therefore, it should not have an IP address assigned in the parent partition.

The iSCSI network should be on a distinctly separate and isolated network with a dedicated IP range that is used only for storage. In a converged network, QoS mechanisms have to be in place to provide storage traffic isolation and to limit the impact of other traffic types.

Network Adapter Teaming Configurations

Network adapter teaming can be used to enable multiple, redundant network adapters and

connections between servers and access tier network switches. Teaming can be enabled through

hardware or software-based approaches. Teaming can enable multiple scenarios including path

redundancy, failover, and load balancing. It’s recommended that network adapter teaming, or a

functional equivalent technology, be used to provide high availability to the virtual machine networks.

iSCSI Converged Network Adapter Teaming for Hardware

In this design pattern, a converged network and storage approach is used by teaming two 10 Gb

adapters and combining LAN and iSCSI SAN traffic on the same physical infrastructure with dedicated

VLANs. The network adapter teaming is provided at the blade or interconnect layer, and it is

transparent to the host operating system. Each VLAN is presented to the management operating

system as an individual physical network adapter, and in Hyper-V, a virtual switch is created for each.

By using iSCSI, host clustering and guest clustering are enabled.


Figure 15: iSCSI converged network adapter teaming for hardware

iSCSI Converged Network Adapter Teaming for Software

In this design pattern, a converged network and storage approach is used by teaming two 10 Gb

adapters and combining LAN and iSCSI SAN traffic on the same physical infrastructure with dedicated

VLANs. The network adapter teaming is provided at the software layer inside the host operating

system. Each VLAN is presented to the management operating system as a virtual network adapter,

and in Hyper-V, a virtual switch is created for each. By using iSCSI, host clustering and guest clustering

are enabled.


Figure 16: iSCSI converged network adapter teaming for software

FCoE Converged Network Adapter Teaming for Hardware

In this design pattern, a converged network and storage approach is used by teaming two 10 Gb

converged network adapters (CNA) and combining LAN and Fibre Channel SAN traffic on the same

physical infrastructure with dedicated VLANs. The network adapter teaming is provided at the blade or

interconnect layer, and it is transparent to the host operating system. Each VLAN is presented to the

operating system as an individual network adapter. In Hyper-V, a virtual switch is created for each. The

CNAs also present virtual HBAs to the host operating system. Host clustering is enabled by using Fibre

Channel, but guest clustering is not. To enable guest clustering, the SAN should also be capable of

presenting iSCSI LUNs over the Ethernet.


Figure 17: FCoE converged network adapter teaming for hardware

Fibre Channel and Ethernet

This design pattern uses a traditional, physically separated approach of Ethernet and Fibre Channel. For

the LAN, two 10 Gb adapters are teamed, combining LAN and iSCSI SAN traffic on the same physical

infrastructure with dedicated VLANs. For storage, two Fibre Channel HBAs are utilized with MPIO for

failover and load balancing. The network adapter teaming is provided at the blade or interconnect

layer, and it is transparent to the host operating system.

Each VLAN is presented to the operating system as an individual network adapter. In Hyper-V, a virtual

switch is created for each. For Fibre Channel, MPIO is provided by the host operating system combined with the Microsoft DSM or an OEM DSM. By using Fibre Channel, host clustering is enabled, but guest

clustering is not. To enable guest clustering, the SAN should also be capable of presenting iSCSI LUNs

over the Ethernet.


Figure 18: Fibre Channel and Ethernet

Hyper-V Failover Cluster Design

A failover cluster in Hyper-V is a group of independent servers that work together to increase the

availability of applications and services. The clustered servers (called nodes) are connected by physical

cables and software. If one of the cluster nodes fails, another node begins to provide service (a process

known as failover). In the case of a planned live migration, users experience no perceptible service

interruption.

The host servers are one critical component of a dynamic, virtual infrastructure. Consolidation of multiple

workloads onto the host servers requires high availability servers. Windows Server 2008 R2 provides

advances in failover clustering that enable high availability and live migration of virtual machines between

physical nodes.

Failover Cluster Topology

The Microsoft Private Cloud Fast Track Program has two standard design patterns. It’s recommended that

the server topology consist of at least two clusters running Hyper-V. The first needs at least two nodes, and

it is referred to as the management cluster. The remaining clusters are referred to as fabric host clusters.


In smaller scale scenarios or specialized solutions, the management and fabric clusters can be consolidated

onto the fabric host cluster. Take special care to provide resource availability for the virtual machines that

host the various parts of the management infrastructure. For details about the management cluster design

patterns, see the Management Architecture section later in this document.

Each failover cluster can contain up to 16 nodes. Failover clusters require some form of shared storage

such as a Fibre Channel or iSCSI SAN.

Failover Cluster Networks

A variety of networks are required for a Hyper-V failover cluster. The network requirements enable high

availability and high performance. The specific requirements and recommendations for network

configuration are published on TechNet in the Hyper-V: Live Migration Network Configuration Guide.

Network access type: Cluster and Cluster Shared Volumes
  Purpose: Preferred network used by the cluster for communications to maintain cluster health. Also used by CSV to send data between owner and non-owner nodes. If storage access is interrupted, this network is used to access the CSV or to maintain and back up the CSV. The cluster should have access to more than one network for communication to ensure a high availability cluster.
  Network traffic requirements: Usually low bandwidth and low latency; occasionally high bandwidth.
  Recommended network access: Private access

Network access type: Live migration
  Purpose: Transfer virtual machine memory and state.
  Network traffic requirements: High bandwidth and low latency during migrations.
  Recommended network access: Private access

Network access type: Management
  Purpose: Managing the Hyper-V management operating system. This network is used by Hyper-V Manager or Virtual Machine Manager.
  Network traffic requirements: Low bandwidth.
  Recommended network access: Public access, which could be teamed to fail over the cluster

Network access type: Storage
  Purpose: Access storage through iSCSI or Fibre Channel (Fibre Channel does not need a network adapter).
  Network traffic requirements: High bandwidth and low latency.
  Recommended network access: Usually dedicated and private access. Refer to your storage vendor for guidelines.

Network access type: Virtual machine access
  Purpose: Workloads running on virtual machines usually require external network connectivity to service client requests.
  Network traffic requirements: Varies.
  Recommended network access: Public access, which could be teamed for link aggregation or to fail over the cluster

Table 2: Failover cluster networks
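
As an illustration, these network roles can be assigned with the FailoverClusters module in Windows PowerShell on Windows Server 2008 R2. This is a minimal sketch; the network names ("Management", "CSV", and "iSCSI") are placeholders for whatever names appear in your cluster:

    Import-Module FailoverClusters

    # Role 3 = cluster and client traffic; 1 = internal cluster traffic only; 0 = none
    (Get-ClusterNetwork "Management").Role = 3
    (Get-ClusterNetwork "CSV").Role = 1
    (Get-ClusterNetwork "iSCSI").Role = 0

    # Give the CSV network the lowest metric so the cluster prefers it for
    # internal communication and redirected I/O traffic
    (Get-ClusterNetwork "CSV").Metric = 900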

Management Network

A dedicated management network is recommended so that hosts can be managed through a dedicated

network to prevent competition with guest traffic needs. A dedicated network provides a degree of


separation for the purposes of security and ease-of-management. A dedicated management network

typically implies dedicating a network adapter per host and port per network device to the management

network.

Additionally, many server manufacturers provide a separate out-of-band management capability that

enables remote management of server hardware outside the host operating system. Consider the

following:

Implement a dedicated network to manage the infrastructure.

Make sure that all Hyper-V hosts have a dedicated network adapter connected to the

management network for exclusive use by the parent partition.

Establish a dedicated LAN for out-of-band management adapters if your server hardware supports

them.

iSCSI Network

If you are using iSCSI, a dedicated iSCSI network is recommended so that storage traffic is not in

contention with other traffic. This typically implies dedicating two network adapters per host and two ports

per network device to the iSCSI network. If you use iSCSI, implement a dedicated iSCSI network or

VLAN. If you are using 1 Gb or 10 Gb network adapters, make sure that at least two network adapters are

dedicated to iSCSI traffic and that MPIO is enabled to ensure redundancy.
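
The following sketch shows one way to enable MPIO for iSCSI and connect to a target on Windows Server 2008 R2 by using the in-box mpclaim.exe and iscsicli.exe tools. The portal address and target IQN are illustrative placeholders:

    Import-Module ServerManager
    Add-WindowsFeature Multipath-IO

    # Claim iSCSI-attached devices for MPIO (a reboot is required)
    mpclaim -r -i -d "MSFT2005iSCSIBusType_0x9"

    # Register the target portal and log in to the target (placeholder values)
    iscsicli QAddTargetPortal 10.0.5.10
    iscsicli QLoginTarget iqn.1991-05.com.contoso:storage-target01

Repeat the login across both dedicated iSCSI network adapters so that MPIO has two paths to fail over between.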

CSV to Cluster Communication Network

Usually, when the cluster node that owns a VHD file in a CSV performs disk I/O, the node communicates

directly with the storage. However, storage connectivity failures sometimes prevent a given node from

communicating directly with the storage. To maintain functionality until the failure is corrected, the node

redirects the disk I/O through a cluster network (the preferred network for CSV) to the node where the disk

is currently mounted. This is called CSV redirected I/O mode.

Implementing a dedicated CSV to cluster communication network is recommended. If you are using non-

teamed Ethernet network adapters, confirm that all Hyper-V hosts have a dedicated network adapter

connected to the CSV network for exclusive use by the parent partition. If you are using network adapter

teaming, ensure that a virtual network adapter is presented to the parent partition for CSV traffic.

Live Migration Network

During live migration, the content of the memory of the virtual machine running on the source node needs

to be transferred to the destination node over a LAN connection. To enable high-speed transfer, a

dedicated live migration network is required.

If you are using non-teamed Ethernet network adapters for the live migration network, ensure that all

Hyper-V hosts have a dedicated network adapter connected to the live migration network for exclusive use

by the parent partition. If you are using network adapter teaming, ensure that a virtual network adapter is

presented to the parent partition for live migration traffic. For detailed configuration options, please see

the Hyper-V: Live Migration Network Configuration Guide.


Virtual Machine Network(s)

The virtual machine networks are dedicated to virtual machine LAN traffic. A virtual machine network can

be two or more 1 Gb Ethernet networks, one or more networks created through network adapter teaming,

or virtual networks created from shared 10 Gb Ethernet network adapters. If you are using 1 Gb Ethernet

network adapters to create virtual networks, ensure that all Hyper-V hosts have two or more dedicated

network adapters connected to the virtual machine network for exclusive use by the guest virtual

machines. If you are using 10 Gb network adapters, confirm that virtual network adapter teaming is

presented to the virtual switch to ensure redundancy.

Host Failover Cluster Storage

CSV is a feature that simplifies the configuration and management of Hyper-V virtual machines in failover

clusters. With CSV on a failover cluster that runs Hyper-V, multiple virtual machines can use the same LUN

(disk), and fail over independently of one another. CSV provides increased flexibility for volumes in

clustered storage—for example, it allows you to keep system files separate from data to optimize disk

performance, even if the system files and the data are contained within VHD files. If you choose to use live

migration for your clustered virtual machines, CSV can also provide performance improvements for the live

migration process. CSV is available in versions of Windows Server 2008 R2 and of Hyper-V Server 2008 R2

that include failover clustering. CSV should be enabled so that it can be utilized for storing multiple virtual

machines on a single LUN.
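
Enabling CSV and adding a disk to it can be scripted with the FailoverClusters module, as in this minimal sketch (the disk name is a placeholder):

    Import-Module FailoverClusters

    # One-time: enable Cluster Shared Volumes on the cluster
    (Get-Cluster).EnableSharedVolumes = "Enabled"

    # Move an available cluster disk into CSV and verify
    Add-ClusterSharedVolume -Name "Cluster Disk 1"
    Get-ClusterSharedVolume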

Hyper-V Guest Virtual Machine Design

Standardization is a key tenet of private cloud architectures and virtual machines. A standardized collection

of virtual machine templates can drive predictable performance and greatly improve capacity planning

capabilities. As an example, the following table illustrates what a basic virtual machine template library

would look like.

Template | Specs | Network | Operating system | Unit cost
Template 1 – Small | 1 vCPU, 2 GB memory, 50 GB disk | VLAN 20 | Windows Server 2003 R2 | 1
Template 2 – Med | 2 vCPU, 4 GB memory, 100 GB disk | VLAN 20 | Windows Server 2003 R2 | 2
Template 3 – X-Large | 4 vCPU, 8 GB memory, 200 GB disk | VLAN 20 | Windows Server 2003 R2 | 4
Template 4 – Small | 1 vCPU, 2 GB memory, 50 GB disk | VLAN 10 | Windows Server 2008 | 1
Template 5 – Med | 2 vCPU, 4 GB memory, 100 GB disk | VLAN 10 | Windows Server 2008 | 2
Template 6 – X-Large | 4 vCPU, 8 GB memory, 200 GB disk | VLAN 10 | Windows Server 2008 | 4

Table 3: Example virtual machine template library
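
With a template library like the one above in place, a self-service request reduces to a one-line deployment per virtual machine. This sketch assumes the Virtual Machine Manager 2012 PowerShell module and illustrative server, host, and template names:

    Import-Module virtualmachinemanager
    Get-SCVMMServer -ComputerName "vmm01.contoso.com" | Out-Null

    $template = Get-SCVMTemplate -Name "Template 2 - Med"
    $vmHost = Get-SCVMHost -ComputerName "hv-node03.contoso.com"

    New-SCVirtualMachine -Name "FINANCE-APP01" -VMTemplate $template `
        -VMHost $vmHost -Path "C:\ClusterStorage\Volume1"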

Virtual Machine Storage

The various disk types available in Hyper-V are detailed in the sections below.


Dynamically Expanding Disks

Dynamically expanding VHDs provide storage capacity as needed to store data. The size of the VHD file is

small when the disk is created and grows as data is added to the disk. The size of the VHD file does not

shrink automatically when data is deleted from the virtual hard disk. However, you can compact the disk to

decrease the file size after data is deleted by using the Edit Virtual Hard Disk Wizard.

Fixed Size Disks

Fixed VHDs provide storage capacity by using a VHD file that is in the size specified for the virtual hard disk

when the disk is created. The size of the VHD file remains fixed regardless of the amount of data stored.

However, you can use the Edit Virtual Hard Disk Wizard to increase the size of the VHD, which increases

the size of the VHD file. By allocating the full capacity at the time of creation, fragmentation at the host

level is not an issue (fragmentation inside the VHD itself must be managed within the guest operating

system).

Differencing Disks

Differencing VHDs provide storage to enable you to make changes to a parent VHD without altering the

disk. The size of the VHD file for a differencing disk grows as changes are stored to the disk.

Pass-Through Disks

Hyper-V enables virtual machine guest operating systems to directly access local disks or SAN LUNs that

are attached to the physical server without requiring the volume to be presented to the host server. The

virtual machine guest accesses the disk directly (by using the disk’s GUID) without having to use the host’s

file system. The performance difference between fixed-disk and pass-through disks is negligible, so the

decision about which to use is based on manageability. For instance, if the data on the volume will be very

large (hundreds of gigabytes), a VHD is not portable at that size, given the extreme amounts of time it

takes to copy. When planning a backup strategy, consider that the data on pass-through disks can be

backed up only from within the guest operating system.

When utilizing pass-through disks, no VHD file is created because the LUN is used directly by the guest

operating system. Because there is no VHD file, there is no dynamic sizing or snapshot capability.

iSCSI Initiator

Hyper-V can also utilize iSCSI storage by directly connecting to iSCSI LUNs through the guest’s virtual

network adapters. This is mainly used to access large volumes on SANs to which the Hyper-V host is not

connected, or for guest-clustering. Guest operating systems cannot boot from iSCSI LUNs that are accessed

through the virtual network adapters without using a non-Microsoft iSCSI initiator.

Take the following information into consideration when you plan your virtual machine storage:

Fixed disks. Use for production environments. Fixed disks provide increased performance and

ease the monitoring of storage availability. Using fixed disks allocates the full size of the disk upon

creation.

Dynamically expanding disks. Provide a viable option for production use. However, they carry

other risks, such as storage oversubscription and fragmentation, so use dynamically expanding

disks with caution and monitor your virtual to physical storage use.

Differencing disks. Never recommended for production server workloads.

Pass-through disks. Use only in cases in which absolute maximum performance is required and

the loss of features such as snapshots and portability is acceptable. Because the performance


difference between pass-through and fixed-disks is minimal, there should be very few scenarios in

which pass-through disks are required.

iSCSI. For iSCSI in guest operating systems, make sure that a separate virtual network is used to

access the iSCSI storage to obtain acceptable performance. If the iSCSI network on the virtual

machine is shared with Ethernet traffic, use QoS to provide performance assurances to the

different networks. Consider using jumbo frames within the guest operating system to improve

iSCSI performance.
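
Each of the VHD types above can be created with the in-box diskpart utility in Windows Server 2008 R2, as in this sketch (the paths and the 50 GB size are illustrative):

    # diskpart script: fixed, dynamically expanding, and differencing VHDs
    $cmds = 'create vdisk file=C:\VMs\fixed.vhd maximum=51200 type=fixed',
            'create vdisk file=C:\VMs\dynamic.vhd maximum=51200 type=expandable',
            'create vdisk file=C:\VMs\diff.vhd parent=C:\VMs\parent.vhd'
    Set-Content -Path C:\VMs\make-vhds.txt -Value $cmds
    diskpart /s C:\VMs\make-vhds.txt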

Virtual Machine Networking

Hyper-V guest operating systems support two types of virtual network adapters: synthetic and

emulated.

Synthetic adapters make use of the Hyper-V virtual machine bus architecture, and they are

the high-performance, native devices. Synthetic devices require that the Hyper-V integration

services be installed within the guest operating system.

Emulated adapters are available to all guest operating systems even if integration services are

not available. They perform much more slowly and should only be used if synthetic devices are

unavailable.

It’s recommended that you utilize synthetic virtual network adapters when possible. Use emulated

network adapters only for unsupported guest operating systems or in special circumstances such as if

the guest operating system needs to pre-boot execution environment (PXE) boot.

You can create the following types of virtual networks on the server running Hyper-V to provide a

variety of communications channels:

Private network. Communications between virtual machines only.

Internal network. Communications between the host server and virtual machines.

External network. Communications between a virtual machine and a physical network by

creating an association to a physical network adapter on the host server.

For the private cloud scenario, the recommendation is to use one or more external networks per virtual

machine, and segregate the networks with VLANs and other network security infrastructure as needed.
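
For reference, the three virtual network types map directly onto virtual switch creation in the Hyper-V PowerShell module (in-box beginning with Windows Server 2012; on Windows Server 2008 R2 the same networks are created in Hyper-V Manager or through Virtual Machine Manager). The names and the physical adapter are illustrative:

    New-VMSwitch -Name "Private-Net" -SwitchType Private
    New-VMSwitch -Name "Internal-Net" -SwitchType Internal
    New-VMSwitch -Name "VM-External" -NetAdapterName "Ethernet 3" -AllowManagementOS $false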

Virtual Processors

Please reference the following tables for the number of virtual processors that are supported in a

Hyper-V guest operating system. Note that the following list is somewhat dynamic. Improvements to

the integration services for Hyper-V are periodically released, which add support for additional

operating systems. For more information, please see About Virtual Machines and Guest Operating

Systems.

Server Guest Operating System Virtual Processor Support

Server guest operating system | Editions | Virtual processors
Windows Server 2008 R2 with SP1 | Standard, Enterprise, Datacenter, and Web | 1, 2, 3, or 4
Windows Server 2008 R2 | Standard, Enterprise, Datacenter, and Windows Web Server 2008 R2 | 1, 2, 3, or 4
Windows Server 2008 | Standard, Standard without Hyper-V, Enterprise, Enterprise without Hyper-V, Datacenter, Datacenter without Hyper-V, Windows Web Server 2008, and HPC | 1, 2, 3, or 4
Windows Server 2003 R2 with SP2 | Standard, Enterprise, Datacenter, and Web | 1 or 2
Windows Home Server 2011 | Standard | 1, 2, or 4
Windows Storage Server 2008 R2 | Essentials | 1, 2, or 4
Windows Small Business Server 2011 | Essentials | 1 or 2
Windows Small Business Server 2011 | Standard | 1, 2, or 4
Windows Server 2003 R2 x64 with SP2 | Standard, Enterprise, and Datacenter | 1 or 2
Windows Server 2003 with SP2 | Standard, Enterprise, Datacenter, and Web | 1 or 2
Windows Server 2003 x64 with SP2 | Standard, Enterprise, and Datacenter | 1 or 2
Windows 2000 Server with SP4 (support for this operating system ended on July 13, 2010) | Server, Advanced Server | 1
CentOS 6.0 and 6.1 | x86 and x64 | 1, 2, or 4
CentOS 5.2-5.7 | x86 and x64 | 1, 2, or 4
Red Hat Enterprise Linux 6.0 and 6.1 | x86 and x64 | 1, 2, or 4
Red Hat Enterprise Linux 5.7 | x86 and x64 | 1, 2, or 4
Red Hat Enterprise Linux 5.6 | x86 and x64 | 1, 2, or 4
Red Hat Enterprise Linux 5.5 | x86 and x64 | 1, 2, or 4
Red Hat Enterprise Linux 5.4 | x86 and x64 | 1, 2, or 4
Red Hat Enterprise Linux 5.3 | x86 and x64 | 1, 2, or 4
Red Hat Enterprise Linux 5.2 | x86 and x64 | 1, 2, or 4
SUSE Linux Enterprise Server 11 with SP1 | x86 and x64 | 1, 2, or 4
SUSE Linux Enterprise Server 10 with SP4 | x86 and x64 | 1, 2, or 4

Table 4: Server guest operating system virtual processor support


Client Guest Operating System Virtual Processor Support

Client guest operating system | Editions | Virtual processors
Windows 7 with SP1 | Enterprise, Ultimate, and Professional (all 32-bit and 64-bit editions) | 1, 2, 3, or 4
Windows 7 | Enterprise, Ultimate, and Professional (all 32-bit and 64-bit editions) | 1, 2, 3, or 4
Windows Vista | Business, Enterprise, and Ultimate, including N and KN editions | 1 or 2
Windows XP with SP3 (performance might be degraded when the server running Hyper-V uses an AMD processor; see Degraded I/O Performance Using a Windows XP Virtual Machine with Windows Server 2008 Hyper-V) | Professional | 1 or 2
Windows XP with SP2 (support for this operating system ended on July 13, 2010) | Professional | 1
Windows XP x64 Edition with SP2 | Professional | 1 or 2

Table 5: Client guest operating system virtual processor support

Hyper-V supports a maximum ratio of eight virtual processors per logical processor for server

workloads, and 12 virtual processors per logical processor for VDI workloads. A logical processor is a

processing core that is seen by the management operating system or parent partition. In the case of

Hyper-Threading, each thread is considered a logical processor.

A server with 16 logical processors supports a maximum of 128 virtual processors, which equates to

128 single-processor virtual machines, 64 dual-processor virtual machines, or 32 quad-processor virtual

machines. The 8:1 or 12:1 virtual processor to logical processor ratios are the maximum supported

limits. It is recommended that lower limits be utilized rather than the maximum.

Consider the following when you are planning your processor ratio:

Use a virtual processor to logical processor ratio of up to 4:1 for production server workloads.

Use a virtual processor to logical processor ratio of up to 8:1 for production VDI workloads.

The maximum ratio that is supported in a Hyper-V guest operating system for server

workloads is 8:1.

The maximum ratio that is supported in a Hyper-V guest operating system for VDI workloads

is 12:1.
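
The capacity arithmetic behind these ratios is simple, as this illustrative sketch shows:

    # Capacity at a conservative 4:1 production ratio (values are illustrative)
    $logicalProcessors = 16   # cores, or threads if Hyper-Threading is enabled
    $ratio = 4                # vCPU : logical processor
    $vcpuPerVM = 2            # dual-processor virtual machines

    $maxVcpus = $logicalProcessors * $ratio
    $maxVMs = [math]::Floor($maxVcpus / $vcpuPerVM)
    "The host supports $maxVcpus vCPUs, or $maxVMs two-vCPU virtual machines."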


Management Architecture

The management architecture is used to manage the Hyper-V hosts.

Management Hosts

Management hosts run Windows Server 2008 R2 with SP1 (64-bit) with the Hyper-V role.

For the specified scalability, the supporting products in System Center 2012 and their dependencies run

within Hyper-V virtual machines on the management hosts.

Design Pattern 1

For typical implementations, a two-node fabric management cluster is necessary to provide high

availability of the fabric management workloads. This fabric management cluster is dedicated to the

virtual machines running the suite of products that provide IaaS management functionality, and it is

not intended to run additional workloads. To accommodate additional scale, more fabric management

host capacity might be required.

Figure 19: Fabric management infrastructure


Compute (CPU)

The fabric management compute node CPUs are expected to have fairly high utilization. A

conservative virtual processor (vCPU) to logical processor ratio of 2:1 or lower should be utilized. This

implies a minimum of two sockets per fabric management compute node with six to eight cores

per socket. During maintenance or failure of one of the two nodes, this CPU ratio will be

temporarily exceeded. It’s recommended that each fabric management compute node within the

configuration support a minimum of 12 logical CPUs and 96 vCPUs.

Memory (RAM)

Compute node memory should be sized appropriately to support the products in System Center 2012 and their dependencies that provide IaaS management functionality. Each fabric management compute node within the configuration should have a minimum of 96 GB of RAM; 128 GB is recommended.

Network

It’s recommended that you use multiple network adapters, multiport network adapters, or both on

each compute node. For converged designs, network technologies that provide network adapter

teaming or virtual network adapters can be utilized—provided that two or more physical adapters

can be teamed for redundancy, and multiple virtual network adapters and/or VLANs can be

presented to the hosts for traffic segmentation and bandwidth control. It’s also recommended that

10 Gb or higher network interfaces are used to reduce bandwidth contention and simplify the

network configuration through consolidation.

Storage Connectivity

Shared storage with sufficient connectivity is recommended, but no particular storage technology is prescribed. The following guidance is provided to assist

with storage connectivity choices. For direct attached storage to the compute node, an internal

Serial Advanced Technology Attachment (SATA) or Serial Attached SCSI (SAS) controller is required

(for boot volumes), unless the design is 100 percent SAN-based, including boot from SAN for the

host operating system. Depending on the storage device used, the following adapters are

recommended to enable shared storage access:

If you use a Fibre Channel SAN, use two or more HBAs.

If you use iSCSI, use two or more 10 Gb Ethernet network adapters or HBAs.

If you use FCoE, use two or more 10 Gb CNAs.

Management Host Storage

The management components require three types of storage:

Physical boot disks (DAS or SAN) for the fabric management host servers.

CSV LUN(s) for the fabric management virtual machines (Fibre Channel or iSCSI).

iSCSI LUNs for the virtualized SQL Server cluster.

Design Pattern 2

There are configurations in which the fabric management components run directly on the

fabric and do not have dedicated nodes. In these cases, special care and considerations should


be taken to secure the availability of the fabric management components and to make sure

that they do not encroach on the performance of other fabric workloads. In this case, the

previous hardware minimum requirements still apply, although additional scalability items are

addressed as follows:

Use a dedicated two-node cluster for the fabric management host cluster for the

fabric management components.

Fabric management components can also run directly on the fabric cluster if the scale

of the Private Cloud Fast Track configuration necessitates it. In this case, refer to the

Fabric Logical Management section earlier in this document for details about the host

configuration.

Management Logical Architecture

Design Pattern 1

The following image depicts the fabric management logical architecture if using a dedicated two-node

fabric management host cluster:

Figure 20: Fabric management infrastructure: Pattern one


Design pattern one consists of two physical nodes in a failover cluster with SAN-attached storage that

supports iSCSI and redundant network connections. This architecture provides a high availability platform

for the fabric management systems. Some components of the fabric management have additional high

availability options, and in these cases, the most effective high availability option will be used.

The fabric management components include:

Two computers running SQL Server in a guest cluster configuration (can include optional third or

fourth servers)

Two servers running Virtual Machine Manager in a guest cluster configuration

Two servers running Operations Manager that are using the built-in failover and redundancy

features

Two servers running Orchestrator that are using the built-in failover and redundancy features

Two servers running Service Manager

One computer running Service Manager Data Warehouse

One Service Manager Self-Service Portal with the Cloud Services Process Pack

One server running App Controller

One deployment server that provides WDS, PXE, and WSUS

Design Pattern 2

The following graphic depicts the fabric management logical architecture if you run the fabric

management components directly on the fabric host cluster:

Figure 21: Fabric management infrastructure: Pattern two


The fabric management architecture consists of a minimum of four physical nodes in a failover cluster with

SAN-attached storage that supports iSCSI and redundant network connections. This architecture provides a

high availability platform for the fabric management components and high capacity for workloads. In this

scenario, the high-availability options for fabric management components are scaled down to facilitate a

smaller fabric management footprint on the fabric infrastructure.

The fabric management components include:

Two computers running SQL Server in a guest cluster configuration (with an optional third node)

One server running Virtual Machine Manager

One server running Operations Manager

One server running Orchestrator

One server running Service Manager

One computer running Service Manager Data Warehouse

One Service Manager Self-Service Portal with the Cloud Service Process Pack

One server running App Controller

One deployment server that provides WDS, PXE, and WSUS

Note: Hosts have an increased memory requirement of 128 GB to accommodate fabric management

component workloads.

Management Systems Architecture

Prerequisite Infrastructure

The following section outlines the fabric management component architecture and its dependencies

within a datacenter environment.

Active Directory Domain Services

AD DS is a required foundational component. The solution supports AD DS in Windows Server 2008

and Windows Server 2008 R2 with SP1. Previous versions of the Windows operating system are not

directly supported for all workflow provisioning and deprovisioning automation.

Note: It is assumed that AD DS deployments currently exist within the datacenter. Deployment of

these services is not within the scope of this document.

Recommendations for the existing AD DS are:

Forests and domains: The preferred approach is to integrate the solution into an existing AD DS

forest and domain, but the solution does not require it. A dedicated resource forest or domain can

also be employed as an additional part of the deployment. The solution supports multiple domains

or multiple forests in a trusted environment that uses two-way forest trusts.

Trusts: The solution enables multi-domain support within a single forest in which two-way forest

trusts (that use Kerberos protocol) exist between all domains. This is referred to as multi-domain or

inter-forest support.


DNS

Domain Name System (DNS) name resolution is a required element for System Center 2012

components and the process automation solution. AD DS-integrated DNS zone data is required for

automated provisioning and deprovisioning components within Orchestrator as part of the solution.

The solution provides full support and automation for replicating AD DS-integrated DNS zone data to

Windows Server 2008 and Windows Server 2008 R2 with SP1 AD DS.

Implementations within environments that use non-Microsoft or non-AD DS-integrated DNS zone

data might be possible, but they would not provide for automated creation and removal of DNS

records that are related to virtual machine provisioning and deprovisioning processes. Use of solutions

outside of AD DS-integrated DNS zone data would require manual intervention or require

modifications to the Orchestrator run books within the System Center 2012 Cloud Services Process

Pack.

DHCP

To support dynamic provisioning and management of the physical and virtual compute capacity within

the IaaS infrastructure, use Dynamic Host Configuration Protocol (DHCP) for all physical computers

and virtual machines by default to support Orchestrator run book automation. For physical hosts like

the fabric management host cluster nodes and the fabric infrastructure cluster nodes, using DHCP is

recommended so that the physical computers and network adapters have known IP addresses to

provide centralized management.

Windows DHCP is required for automated provisioning and deprovisioning components within

Orchestrator run books as part of the solution. DHCP is used to support host cluster provisioning. The

solution provides full support and automation for Windows Server 2008 and Windows Server 2008 R2

with SP1 that are running the DHCP Server service. Use of solutions that use the DHCP Server service

outside these Windows Server operating systems requires additional testing and validation activities.

SQL Server

Two virtual machines running SQL Server will be deployed as a guest failover cluster to support the

solution (with an option to scale to a 4-node cluster). This multi-node failover cluster will contain all

the databases for each System Center 2012 product in discrete instances by product and function. This

separation of instances allows for division by unique requirements and for scaling over time as the needs of each component grow. Note that not all features are supported for failover cluster

installations— some features cannot be combined on instances and some allow configuration only at

the initial installation.

As a general rule, SQL Server Database Engine services and SQL Server Analysis Services will be hosted

in separate instances within the failover cluster. Because SQL Server Reporting Services (SSRS) is not supported in a failover cluster, SSRS will be installed on the System Center 2012 Operations Manager

Reporting Server. However, this installation will contain “Files Only,” and the SSRS configuration will

configure remote SQL Server Reporting Services databases that are hosted on the component instance

on the failover cluster.

The exceptions to this are the Analysis Services and Reporting Services configurations in Operations

Manager. For this scenario, Analysis Services and Reporting Services must be installed on the same

server and with the same database instance to support Virtual Machine Manager and Operations


Manager integration. All instances must be configured with Windows Authentication. The following

table outlines the options required for each instance.

Database Instances and Requirements

Fabric management component | Instance name (suggested) | Components | Collation (see note) | Storage requirements
Service Manager Analysis Services | SCSMAS | Analysis Services | SQL_Latin1_General_CP1_CI_AS | 2 LUNs
Service Manager Self-Service Portal (SharePoint farm) | SCSPFarm | Database Engine | SQL_Latin1_General_CP1_CI_AS | 2 LUNs
App Controller | SCACDB | Database Engine | SQL_Latin1_General_CP1_CI_AS | 2 LUNs
Operations Manager | SCOMDB | Database Engine, Full-Text Search | SQL_Latin1_General_CP1_CI_AS | 2 LUNs
Operations Manager Data Warehouse | SCOMDW | Database Engine, Full-Text Search | SQL_Latin1_General_CP1_CI_AS | 2 LUNs
Orchestrator | SCODB | Database Engine | SQL_Latin1_General_CP1_CI_AS | 2 LUNs
Service Manager | SCSMDB | Database Engine, Full-Text Search | SQL_Latin1_General_CP1_CI_AS | 2 LUNs
Service Manager Data Warehouse | SCSMDW | Database Engine, Full-Text Search | SQL_Latin1_General_CP1_CI_AS | 2 LUNs
Virtual Machine Manager | SCVMMDB | Database Engine | SQL_Latin1_General_CP1_CI_AS | 2 LUNs
Windows Server Update Services (optional) | SCWSUSDB | Database Engine | SQL_Latin1_General_CP1_CI_AS | 2 LUNs

Table 6: Database instances and requirements

Note: The default SQL collation settings are not supported for multi-lingual installations of the Service Manager component. Only use the default SQL collation if multiple languages are not required. The same collation must be used for all Service Manager databases (Management, Data Warehouse, and Reporting Services).
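
Each clustered instance can be installed unattended so that the collation and feature set in Table 6 are applied consistently. The following is a sketch of a SQL Server 2008 R2 setup command line for one instance, run from the installation media; the instance name, accounts, paths, and addresses are illustrative placeholders:

    .\setup.exe /QS /IACCEPTSQLSERVERLICENSETERMS `
        /ACTION=InstallFailoverCluster /FEATURES=SQLENGINE,FULLTEXT `
        /INSTANCENAME=SCVMMDB /FAILOVERCLUSTERNETWORKNAME=SQLVMM `
        /SQLCOLLATION=SQL_Latin1_General_CP1_CI_AS `
        /SQLSVCACCOUNT="CONTOSO\svc-sql" /SQLSVCPASSWORD="<password>" `
        /AGTSVCACCOUNT="CONTOSO\svc-sqlagent" /AGTSVCPASSWORD="<password>" `
        /SQLSYSADMINACCOUNTS="CONTOSO\SQL Admins" `
        /INSTALLSQLDATADIR="E:\SCVMMDB" /SQLUSERDBLOGDIR="F:\SCVMMDB" `
        /FAILOVERCLUSTERIPADDRESSES="IPv4;10.0.1.50;Cluster Network 1;255.255.255.0"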


Each virtual machine running SQL Server will be configured with four vCPUs, at least 16 GB of RAM (32

GB is recommended for large scale configurations), and three virtual network adapters (that are used

for LAN, cluster communications, and iSCSI). Each virtual machine running SQL Server will access iSCSI-

based shared storage with two LUNs that are configured for each hosted database.

If the needs of the solution exceed what two virtual machines are able to provide, additional virtual

machines can be added to the SQL Server failover cluster and each SQL Server instance can be moved

to a virtual machine in the cluster. This configuration requires SQL Server 2008 R2 Enterprise Edition. In

addition, where organizations can support solid-state drive storage, it should be used to provide the

necessary I/O for these databases. The instances and associated recommended node placement is

outlined as follows:

Figure 22: System Center SQL instance configuration

Note: For a more detailed version of this diagram, please see the Appendix.

SQL Server Configuration

You will need the following when you set up your SQL Server configuration:

Two high availability virtual machines (optional third or fourth node for reserve capacity and

failover)

Windows Server 2008 R2 Enterprise with SP1

SQL Server 2008 R2 Enterprise Edition

One 40 GB VHD per virtual machine running SQL Server on the host CSV

Four vCPUs per virtual machine running SQL Server

16 GB memory (32 GB is recommended, and do not enable Dynamic Memory)

Three virtual network adapters (one for client connections, one for cluster communications,

and one for iSCSI)

Storage: One VHD per virtual machine running SQL Server and 22 dedicated cluster iSCSI LUNs

(20 LUNs for System Center 2012, one LUN for quorum, and one LUN for the Microsoft

Distributed Transaction Coordinator)

SQL Server Data Locations

LUN | Purpose | Size
LUN 1, iSCSI | SQL Server cluster quorum | 1 GB
LUN 2, iSCSI | Microsoft Distributed Transaction Coordinator (MSDTC) | 1 GB
LUN 3-19 (odd), iSCSI | SQL Server databases | Varies
LUN 4-20 (even), iSCSI | SQL Server logging | Varies

Table 7: SQL Server data locations

Virtual Machine Manager

The use of Virtual Machine Manager is required for the tested solution. Two servers running Virtual

Machine Manager are deployed and configured in a failover cluster that uses a dedicated instance of

SQL Server on the virtualized SQL Server cluster. One library share in Virtual Machine Manager will be

utilized. Additional library servers can be added as needed. Virtual Machine Manager and Operations

Manager integration is configured during the installation process. The following hardware

configurations will be used:

Virtual Machine Manager Servers

Design Pattern 1: Two guest-clustered, non-high-availability virtual machines

Design Pattern 2: One host-clustered, high availability virtual machine

Windows Server 2008 R2 with SP1

Four vCPUs

8 GB memory

Two virtual network adapters

Storage: One operating system VHD, one data VHD or pass-through volume, and one iSCSI

LUN

Operations Manager

System Center 2012 Operations Manager is required for the tested solution. Two servers running

Operations Manager are deployed in a single management group that uses a dedicated instance of

SQL Server on the virtualized SQL Server cluster. An Operations Manager agent is installed on every

guest virtual machine and every management host and scale unit cluster node to support health

monitoring functionality.

Note: Operations Manager gateway servers and additional management servers are supported for

custom solutions; however, for the base reference implementation, these additional roles are not

implemented.

The Operations Manager installation uses a dedicated instance of SQL Server on the virtualized SQL

Server cluster. The installation will follow a split SQL Server configuration: SQL Server Reporting

Services and Operations Manager components will reside on the Operations Manager virtual machine,

and the SQL Server Reporting Services and Operations Manager databases will utilize a dedicated

instance on the virtualized SQL Server cluster. The estimated SQL Server database sizes are 72 GB for

the Operations Manager database and 2.1 TB for the Operations Manager Data Warehouse database.

The following hardware configurations are utilized for the solution:

Operations Manager Management Servers

Design Pattern 1: Two non-high-availability virtual machines plus one host-clustered, high-availability virtual machine

Design Pattern 2: One host-clustered, high-availability virtual machine

Windows Server 2008 R2 with SP1

Four vCPUs


16 GB memory

One virtual network adapter

Storage: One operating system VHD

Management Packs

The following Operations Manager management packs are required for the solution:

System Center Monitoring Pack for System Center 2012 – Virtual Machine Manager

System Center Monitoring Pack for Windows Server Operating System

Windows Server Failover Cluster

Windows Server Hyper-V

SQL Server

Windows Server Internet Information Services 7

Windows Server Internet Information Services 2000/2003

Windows Server Internet Information Services 2008

System Center Monitoring Pack for System Center 2012 – Orchestrator

System Center Monitoring Pack for System Center 2012 – App Controller


System Center Monitoring Pack for System Center 2012 – Operations Manager

Monitoring Pack for System Center 2012 Configuration Manager

Server OEM third-party management packs

Note: Operations Manager is integrated with Virtual Machine Manager

Service Manager

Service Manager, or a product that provides similar functionality, is recommended for the solution. The

Service Manager management server is installed on two virtual machines. A third virtual machine hosts

the Service Manager Data Warehouse database. The Service Manager database and the Service

Manager Data Warehouse database each use a dedicated instance of SQL Server on the virtualized SQL

cluster. The Service Manager portal is hosted on a fourth virtual machine. The following virtual

machine configurations will be used:

Service Manager Management Servers

Design Pattern 1: Two high-availability virtual machines

Design Pattern 2: One host-clustered, high-availability virtual machine

Windows Server 2008 R2 with SP1

Four vCPUs

16 GB memory

One virtual network adapter

Storage: One operating system VHD

Service Manager Data Warehouse Server

One high-availability virtual machine

Windows Server 2008 R2 with SP1

Four vCPUs

16 GB memory

One virtual network adapter


Storage: One operating system VHD

Service Manager Portal Servers

One high-availability virtual machine

Windows Server 2008 R2 with SP1

Four vCPUs

8 GB memory

One virtual network adapter

Storage: One operating system VHD

Note: The estimated SQL Server database sizes for Service Manager are 40 GB for the Service Manager

database and 80 GB for the Service Manager Data Warehouse database.

Orchestrator

The Orchestrator installation uses a dedicated instance of SQL Server on the virtualized SQL Server

cluster. Use two servers running Orchestrator for high availability and scale purposes. Orchestrator

provides built-in failover capability, but it does not use failover clustering. By default, if an Orchestrator

server fails, any workflows that were running on that server will be started (not restarted) on the other

Orchestrator server. (The difference between starting and restarting is that restarting implies saving or

maintaining state and enabling an instance of a workflow to keep running.) Orchestrator only assures

that it will start any workflows that were started on the failed server. The state may, and likely will, be

lost, which means a request might fail. Many workflows have some degree of state management built

in, which helps mitigate this risk.

In addition, two Orchestrator servers are deployed by default for scalability. By default, each

Orchestrator server can run a maximum of 50 simultaneous workflows. This limit can be increased

depending on server resources, but an additional server is needed to accommodate larger scale

environments.

Orchestrator Servers

Design Pattern 1: Two non-high-availability virtual machines

Design Pattern 2: One host-clustered, high-availability virtual machine

Windows Server 2008 R2 with SP1

Four vCPUs

8 GB memory

One virtual network adapter

Storage: One operating system VHD

App Controller

System Center App Controller is not required for the solution. However, if the Service Manager portal

is utilized, App Controller must also be installed. App Controller uses a dedicated instance of SQL

Server on the virtualized SQL Server cluster. A single App Controller server is installed on the host

cluster.


Service Manager provides the service catalog and service request mechanism. Orchestrator provides

the automated provisioning. App Controller provides the end-user interface for connecting to and

managing workloads post-provisioning.

App Controller Server

One host-clustered, high-availability virtual machine

Windows Server 2008 R2 with SP1

Four vCPUs

8 GB memory

One virtual network adapter

Storage: One operating system VHD

Management Scenarios Architecture

Management Scenarios

Following are the primary management scenarios that are addressed in the solution, although the

management layer can provide many more capabilities.

Fabric management

Fabric provisioning

IT service provisioning (including platform and application provisioning)

Virtual machine provisioning and deprovisioning

Fabric and IT service maintenance

Fabric and IT service monitoring

Resource optimization

Service management

Reporting (used by chargeback, capacity, service management, health, and performance)

Backup and disaster recovery

Security

Fabric Management

Fabric management is the act of pooling multiple disparate computing resources together and being

able to subdivide, allocate, and manage them as a single fabric. The following methods make fabric

management possible.

Hardware Integration

Hardware integration refers to the management system being able to perform deployment or

operational tasks directly against the underlying physical infrastructure, such as storage arrays, network

devices, or servers.

Storage Integration

Through the Virtual Machine Manager console, you can discover, classify, and provision remote

storage on supported storage arrays. Virtual Machine Manager fully automates the assignment of

storage to a Hyper-V host or Hyper-V host cluster, and it tracks the storage that is managed by Virtual

Machine Manager.


To enable the new storage features, Virtual Machine Manager uses the Microsoft Storage Management

Service to communicate with external arrays through a storage management initiative specification

(SMI-S) provider. The Storage Management Service is installed by default during the installation of

Virtual Machine Manager. You must install a supported SMI-S provider on an available server, and then

add the provider to Virtual Machine Manager. It’s recommended that the implemented solution

provide an automated mechanism for provisioning storage and attaching it to Hyper-V hosts for

automated cluster provisioning. The use of Virtual Machine Manager and SMI-S is recommended to

achieve this.

Network Integration

Networking in Virtual Machine Manager includes several enhancements that enable administrators to

efficiently provision network resources for a virtualized environment. The networking enhancements

include the following:

Creating and Defining Logical Networks

A logical network with one or more associated network sites is a user-defined grouping of IP subnets,

VLANs, or IP subnet and VLAN pairs that is used to organize and simplify network assignments. Some

possible examples include backend, frontend, lab, management, or backup. Logical networks represent

an abstraction of the underlying physical network infrastructure, which helps you model the network

based on business needs and connectivity properties. After a logical network is created, it can be used

to specify the network on which a host or a virtual machine (stand-alone or part of a service) is

deployed. Users can assign logical networks as part of virtual machine and service creation without

having to understand the network details.

Static IP and MAC Address Pool Assignment

If you associate one or more IP subnets with a network site, you can create static IP address pools from

those subnets. Static IP address pools enable Virtual Machine Manager to automatically allocate static

IP addresses to Windows-based virtual machines that are running on any managed Hyper-V, VMware

ESX or Citrix XenServer host. Virtual Machine Manager can automatically assign static IP addresses

from the pool to stand-alone virtual machines, to virtual machines that are deployed as part of a

service, and to physical computers, when it's used to deploy them as Hyper-V hosts.

Additionally, when you create a static IP address pool, you can define a reserved range of IP addresses

for load balancer virtual IP addresses (VIPs). Virtual Machine Manager automatically assigns a virtual IP

address to a load balancer during the deployment of a load-balanced service tier.
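
A sketch of this configuration with the Virtual Machine Manager 2012 cmdlets follows; the logical network name, host group, subnet, VLAN, and address range are illustrative:

    Import-Module virtualmachinemanager

    $logical = New-SCLogicalNetwork -Name "Backend"
    $subnetVlan = New-SCSubnetVLan -Subnet "10.0.20.0/24" -VLanID 20
    $site = New-SCLogicalNetworkDefinition -Name "Backend - Site1" `
        -LogicalNetwork $logical -SubnetVLan $subnetVlan `
        -VMHostGroup (Get-SCVMHostGroup -Name "All Hosts")

    New-SCStaticIPAddressPool -Name "Backend Pool" -LogicalNetworkDefinition $site `
        -Subnet "10.0.20.0/24" -IPAddressRangeStart "10.0.20.50" -IPAddressRangeEnd "10.0.20.99"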

Load Balancer Integration

You can discover and add hardware load balancers to Virtual Machine Manager. By adding load

balancers to Virtual Machine Manager, and creating associated virtual IP templates, users who create

services can automatically provision load balancers when they create and deploy a service. The use of

Virtual Machine Manager advanced network integration features is recommended to fully enable the

use of Service Templates and enable a richer self-service experience. It’s also recommended that it be

integrated with the load balancer(s) in use within the organization.


Fabric Provisioning

In accordance with the principles of standardization and automation, creating the fabric and adding

capacity should be an automated process. In Virtual Machine Manager, this is achieved through a

sequence of steps:

1. Provisioning Hyper-V hosts

2. Configuring host properties, networking, and storage

3. Creating Hyper-V host clusters

Each step in this process has dependencies:

1. Provisioning Hyper-V Hosts

1.1. A PXE boot server

1.2. Dynamic DNS registration

1.3. A standard base image to be used for Hyper-V hosts

1.4. Hardware driver files in the Virtual Machine Manager library

1.5. A host profile in the Virtual Machine Manager library

1.6. Baseboard management controller on the physical server

2. Configuring host properties, networking, and storage

2.1. Host property settings

2.2. The storage integration described above, plus additional MPIO and/or iSCSI configuration

2.3. For the network, you must have already configured the logical networks that you want to

associate with the physical network adapter. If the logical network has associated network

sites, one or more of the network sites must be scoped to the host group where the host

resides.

3. Creating Hyper-V host clusters (see the sketch after this list)

3.1. The hosts must meet all requirements for Windows Server failover clustering.

3.2. The hosts must be managed by Virtual Machine Manager.
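
As an illustration of step 3, cluster validation and creation can be scripted with the FailoverClusters module (the node names and cluster IP address are placeholders); in the tested solution, Virtual Machine Manager drives these steps through its cluster provisioning workflow:

    Import-Module FailoverClusters

    # Step 3.1: validate that the candidate nodes meet failover clustering requirements
    Test-Cluster -Node hv-node01, hv-node02, hv-node03, hv-node04

    # Create the Hyper-V host cluster
    New-Cluster -Name FABRIC-CL01 -Node hv-node01, hv-node02, hv-node03, hv-node04 `
        -StaticAddress 10.0.1.40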

Virtual Machine Manager Private Clouds

After you have configured the fabric resources, you can subdivide and allocate them for self-service

consumption through the creation of Virtual Machine Manager private clouds. During private cloud

creation, you select the underlying fabric resources that will be available in the private cloud, configure

library paths for private cloud users, and set the capacity for the private cloud. For example, you might

want to create a cloud for use by the finance department. You will be able to:

Name the cloud.

Scope it to one or more host groups.

Select which logical networks, load balancers, and virtual IP templates are available to the

cloud.

Specify which storage classifications are available to the cloud.

Select which library shares are available to the cloud for virtual machine storage.

Specify granular capacity limits to the cloud.

Select which capability profiles are available to the cloud.

o Capability profiles match the type of hypervisor platforms that are running in the

selected host groups. The built-in capability profiles represent the minimum and


maximum values that can be configured for a virtual machine for each supported

hypervisor platform.
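
The finance example could begin with a sketch like the following, which scopes a new private cloud to an existing host group by using the Virtual Machine Manager 2012 cmdlets (names are illustrative); capacity, library, and capability profile settings are then applied to the cloud object:

    Import-Module virtualmachinemanager

    $hostGroup = Get-SCVMHostGroup -Name "Finance Hosts"
    New-SCCloud -Name "Finance Cloud" -VMHostGroup $hostGroup `
        -Description "Self-service private cloud for the finance department"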

Virtual Machine Provisioning and Deprovisioning

One of the primary cloud attributes is user self-service capability. In this solution, self-service capability

refers to the ability for the user to request one or more virtual machines or to delete one or more of

their existing virtual machines. The infrastructure scenario that supports this capability is the virtual

machine provisioning and deprovisioning process.

This process is initiated from the self-service portal or some other tenant user interface that may be in

use within the organization. It triggers an automated process or workflow in the infrastructure through

Virtual Machine Manager to create or delete a virtual machine based on the input from the user or

tenant. Provisioning can be template-based, such as requesting a small, medium, or large virtual

machine template, or provisioning can be a series of selections that are made by the user. If

authorized, the provisioning process could create a new virtual machine per the user’s request, add the

virtual machine to any relevant management products in the private cloud, and enable access to the

virtual machine by the requestor.
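
Behind the portal request, the provisioning and deprovisioning operations reduce to Virtual Machine Manager cmdlet calls like the creation example shown earlier in this document and the removal sketch below (the virtual machine name is illustrative):

    Import-Module virtualmachinemanager

    # Deprovision a tenant virtual machine: stop it, then remove it
    $vm = Get-SCVirtualMachine -Name "FINANCE-APP01"
    Stop-SCVirtualMachine -VM $vm
    Remove-SCVirtualMachine -VM $vm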

IT Service Provisioning

In Virtual Machine Manager, a service is a set of virtual machines that are configured, deployed, and

managed as a single entity. An example would be a deployment of a multi-tier, line-of-business

application.

In the Virtual Machine Manager console, you use the Service Template Designer to create a service

template, which defines the configuration of the service. The service template includes information

about the virtual machines that are deployed as part of the service, including which applications to

install on the virtual machines and the networking configuration that is needed for the service.

Resource Optimization

Elasticity, the perception of infinite capacity, and the perception of continuous availability are the Microsoft private cloud architecture principles that relate to resource optimization. This

management scenario deals with optimizing resources by dynamically moving workloads around the

infrastructure based on performance, capacity, and availability metrics. Examples include the option to

distribute workloads across the infrastructure for maximum performance or consolidating as many

workloads as possible to the smallest number of hosts for a higher consolidation ratio.

Based on user settings, Virtual Machine Manager Dynamic Optimization migrates virtual machines to

perform resource balancing within host clusters that support live migration.

Dynamic optimization attempts to correct the following scenarios in priority order:

1. Virtual machines that have configuration issues on their current host.

2. Virtual machines that are causing their host to exceed configured performance thresholds.

3. Unbalanced resource consumption on hosts.

Virtual Machine Manager Power Optimization is an optional feature in Dynamic Optimization, and it is

only available when a host group is configured to migrate virtual machines through Dynamic


Optimization. Through Power Optimization, Virtual Machine Manager helps save energy by turning off

hosts that are not needed to meet resource requirements within a host cluster, and then turns the

hosts on when they are needed again.

By default, Virtual Machine Manager always performs Power Optimization when the feature is turned

on. However, you can schedule the hours and days during the week when Power Optimization is

performed. For example, you might initially schedule power optimization only on weekends, when you

anticipate low resource usage on your hosts. After observing the effects of power optimization in your

environment, you might increase the hours.

For Power Optimization, the computers must have a baseboard management controller that enables

out-of-band management.
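
Scheduled Dynamic Optimization is configured on the host group properties, and an on-demand pass can also be started from Windows PowerShell, as in this sketch against an illustrative cluster name:

    Import-Module virtualmachinemanager

    $cluster = Get-SCVMHostCluster -Name "FABRIC-CL01"
    Start-SCDynamicOptimization -VMHostCluster $cluster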

Fabric and IT Service Maintenance

A private cloud solution must provide the ability to perform maintenance on any component without

impacting the availability of the solution. Examples include the need to update or patch a host server

or add additional storage to the SAN. The system should not generate unnecessary alerts or events in

the management systems during planned maintenance.

Virtual Machine Manager includes the built-in ability to maintain the fabric infrastructure servers in a

controlled, orchestrated manner. Fabric infrastructure servers include the following physical computers,

which are managed by Virtual Machine Manager: Hyper-V hosts, Hyper-V clusters, library servers, pre-

boot execution environment (PXE) servers, the Windows Server Update Services (WSUS) server, and the

Virtual Machine Manager management server.

Virtual Machine Manager supports on-demand compliance scanning and remediation of the fabric

infrastructure. Administrators can monitor the update status of the servers, scan for compliance, and

remediate updates for selected servers. Administrators also can exempt resources from installation of

an update.

Virtual Machine Manager supports orchestrated updates of Hyper-V host clusters. When a Virtual

Machine Manager administrator performs update remediation on a host cluster, Virtual Machine

Manager places one cluster node at a time in maintenance mode and then installs updates. If the

cluster supports live migration, intelligent placement is used to migrate virtual machines off the cluster

node. If the cluster does not support live migration, Virtual Machine Manager saves state for the virtual

machines. Virtual Machine Manager update management requires the use of a Windows Server Update Services (WSUS) server.
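
The same drain-and-pause pattern that Virtual Machine Manager orchestrates can be expressed with the FailoverClusters module, as in this sketch (the node and virtual machine names are illustrative):

    Import-Module FailoverClusters

    # Live migrate a virtual machine group off the node being serviced
    Move-ClusterVirtualMachineRole -Name "FINANCE-APP01" -Node hv-node01 -MigrationType Live

    # Pause the drained node, apply updates and reboot, then resume it
    Suspend-ClusterNode -Name hv-node02
    Resume-ClusterNode -Name hv-node02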

Fabric and IT Service Monitoring

A private cloud solution must provide the ability to monitor every major component of the solution

and generate alerts based on performance, capacity, and availability metrics. Examples of availability

metrics include monitoring server availability, CPU, and storage utilization.

Monitoring of the fabric infrastructure is performed through the integration of Operations Manager

and Virtual Machine Manager. Enabling this integration allows Operations Manager to automatically

discover, monitor, and report on essential performance and health characteristics of any object that is

managed by Virtual Machine Manager:


- Health and performance of all Virtual Machine Manager managed hosts and virtual machines.

- Diagram views in Operations Manager that reflect all Virtual Machine Manager deployed hosts, services, virtual machines, private clouds, IP address pools, and storage pools.

- Performance and Resource Optimization (PRO), which can be configured at a very granular level and delegated to specific self-service users.

- Monitoring and automated remediation of physical servers, storage, and network devices.
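A minimal sketch of enabling the integration from the Virtual Machine Manager command shell follows. The Operations Manager server name is hypothetical, and the credential-related parameters may vary by release, so verify them against the VMM 2012 cmdlet reference.

# Sketch: connect VMM to Operations Manager so that fabric objects are
# discovered and monitored automatically ("OM01" is an example name).
Import-Module virtualmachinemanager

New-SCOpsMgrConnection -OpsMgrServer "OM01.contoso.com" `
    -UseVMMServerServiceAccount `
    -EnablePRO $true -EnableMaintenanceModeIntegration $true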

Note: For additional workload- and application-specific monitoring in the guest operating system, deploy an Operations Manager agent within the virtual machine operating system and enable the desired management pack. Although this is not considered part of fabric monitoring, it's also valuable to monitor the workload running inside the virtual machine.

Reporting

A private cloud solution must provide a centralized reporting capability. The reporting capability should provide standard reports that detail capacity, utilization, and other system metrics. The reporting functionality serves as the foundation for capacity- or utilization-based billing and chargeback to tenants. In a service-oriented IT model, reporting serves the following purposes:

- Systems performance and health

- Capacity metering and planning

- Service level availability

- Usage-based metering and chargeback

- Incident and problem reports that help IT focus efforts

As a result of integrating Virtual Machine Manager and Operations Manager, several reports are

created and available by default. However, metering and chargeback reports and incident and

problem reports are enabled by using Service Manager and the Cloud Services Process Pack.

Reporting should be enabled, at a minimum, for the default reports that ship with Operations Manager

and Virtual Machine Manager. In addition, it’s recommended to enable the Service Manager and

Cloud Services Process Pack reports for metering, chargeback, incidents, change requests, and more.

You could also define your own custom reports and bundle reports from non-Microsoft applications in

the solution.

Service Management System

The goal of Microsoft System Center 2012 Service Manager is to support IT service management in a

broad sense. This includes implementing Information Technology Infrastructure Library (ITIL) processes,

such as change and incident management, and it can also include processes like allocating resources

from a private cloud.

Service Manager maintains a configuration management database (CMDB). The CMDB is the

repository for nearly all configuration and management-related information in the System Center 2012

environment. For the Cloud Services Process Pack, this information includes Virtual Machine Manager

resources such as virtual machine templates and virtual machine service templates, which are copied

regularly from the Virtual Machine Manager library into the CMDB. This allows virtual machines and users to be included in Orchestrator runbooks for automated tasks like request fulfillment, metering, and chargeback.
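Orchestrator 2012 exposes its runbooks through an OData web service (port 81 by default), which is how other System Center components and scripts can drive automated tasks. The following is a rough sketch only; the server name is hypothetical and the feed parsing is approximate.

# Sketch: list available runbooks through the Orchestrator 2012 OData
# web service ("orch01" is an example server name).
$uri  = "http://orch01:81/Orchestrator2012/Orchestrator.svc/Runbooks"
$feed = Invoke-RestMethod -Uri $uri -UseDefaultCredentials

# Each ATOM entry describes one runbook; print the names.
$feed | ForEach-Object { $_.title.'#text' }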

User Self-Service

The user self-service solution consists of three elements:

- Service Manager Self-Service Portal

- Cloud Services Process Pack

- App Controller

Service Manager Self-Service Portal. By using the information in the CMDB, Service Manager can

create a service catalog that shows the services that are available to a particular user. For example, let’s

say that a user wants to create a virtual machine in the group’s cloud. Instead of passing the request

directly to Virtual Machine Manager like App Controller does, Service Manager starts a workflow to

handle the request. The workflow contacts the user’s manager to get approval for this request. If the

request is approved, the workflow starts an Orchestrator run book.

The Service Manager Self-Service Portal must run on a server with Service Manager. It consists of two roles, which must be located on a single dedicated server:

- Web content server

- Microsoft SharePoint web part

Cloud Services Process Pack. An add-in component that enables IaaS capabilities through the Service Manager Self-Service Portal and Orchestrator. The Cloud Services Process Pack provides:

- Standardized and well-defined processes for requesting and managing cloud services, which include the ability to define projects, capacity pools, and virtual machines.

- Natively supported request, approval, and notification capabilities to enable businesses to effectively manage their allocated infrastructure capacity pools.

For more information, see Cloud Services Process Pack.

App Controller. The portal that a self-service user would utilize to connect to and manage their virtual

machines and services after a request is fulfilled. App Controller connects directly to Virtual Machine

Manager by using the credentials of an authenticated user to display virtual machines and services,

and to provide a configurable set of actions.

Service Management

The service management layer provides the means for automating and adapting IT service management best practices, which are found in Microsoft Operations Framework (MOF) 4.0 and ITIL, to provide built-in processes for incident resolution, problem resolution, and change control.


MOF provides relevant, practical, and accessible guidance for IT professionals. MOF strives to seamlessly

blend business and IT goals while establishing and implementing reliable, cost-effective IT services. MOF is

a downloadable framework that encompasses the entire service management lifecycle. For more

information about MOF, see Microsoft Operations Framework 4.0.

Figure 23: MOF 4.0 Model

Backup and Disaster Recovery

In a virtualized data center, there are three commonly used backup types: host-based, guest-based, and SAN-based. The following table contrasts these backup types:

Capability | Host-Based | Guest-Based | SAN Snapshot
---------- | ---------- | ----------- | ------------
Protection of virtual machine configuration | X | | X*
Protection of host and cluster configuration | X | | X*
Protection of virtualization-specific data | X | | X
Protection of data inside the virtual machine | X | X | X
Protection of data inside the virtual machine stored on pass-through disks | | X | X
Support for VSS-based backups for supported operating systems and applications | X | X | X*
Support for continuous data protection | | X | X
Ability to granularly recover specific files or applications inside the virtual machine | | X |

Table 8: Backup type comparison

*Depends on storage vendor's level of Hyper-V integration


Data Protection Manager

System Center 2012 Data Protection Manager enables disk-based and tape-based data protection and recovery for servers that run SQL Server, Exchange Server, and SharePoint, for virtual servers and file servers, and for Windows desktops and laptops. Data Protection Manager can also centrally manage system state backup and bare-metal recovery.

When you use Data Protection Manager with Hyper-V, you should be fully aware of and incorporate

the recommendations to protect virtual machines that are located on CSVs. For more information, see

Managing Hyper-V computers.
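As an illustration, the following Data Protection Manager Management Shell sketch adds a Hyper-V virtual machine to a new disk-based protection group. All names are hypothetical, and the cmdlet parameters should be verified against the DPM 2012 documentation; for virtual machines on CSVs, follow the guidance referenced above.

# Sketch: host-based protection of a Hyper-V VM with DPM.
$pg = New-ProtectionGroup -DPMServerName "DPM01" -Name "Hyper-V Hosts"

$ps = Get-ProductionServer -DPMServerName "DPM01" |
      Where-Object { $_.ServerName -eq "HV01" }      # Hyper-V host
$ds = Get-Datasource -ProductionServer $ps -Inquire |
      Where-Object { $_.Name -match "VM01" }         # VM to protect

Add-ChildDatasource -ProtectionGroup $pg -ChildDatasource $ds
Set-ProtectionType  -ProtectionGroup $pg -ShortTerm Disk
Set-PolicyObjective -ProtectionGroup $pg -RetentionRangeInDays 10
Set-ProtectionGroup -ProtectionGroup $pg             # commit the changes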

Security

The three pillars of IT security are confidentiality, integrity, and availability. IT infrastructure threat modeling is the practice of considering what attacks might be attempted against the components in an IT infrastructure. Generally, threat modeling assumes the following conditions:

- Organizations have resources (in this case, IT components) that they want to protect.

- All resources are likely to exhibit some vulnerability.

- People might exploit these vulnerabilities to cause damage or gain unauthorized access to information.

- Properly applied security countermeasures help mitigate threats that exist because of vulnerabilities.

The IT infrastructure threat modeling process is a systematic analysis of IT components that compiles

component information into profiles. The goal of the process is to develop a threat model portfolio, which

is a collection of component profiles.

One way to establish these pillars as the basis for an IT infrastructure threat model is through MOF. MOF provides practical guidance for managing IT practices and activities throughout the entire IT lifecycle.

The Reliability Service Management Function (SMF) in the Plan Phase of the MOF addresses creating plans

for confidentiality, integrity, availability, continuity, and capacity. The Policy SMF in the Plan Phase provides

context to help you understand the reasons for the policies, and their creation, validation, and

enforcement. It also includes processes to communicate policies, incorporate feedback, and help IT

maintain compliance with directives. For more information, see:

Reliability Service Management Function

Policy Service Management Function

The Deliver Phase of MOF contains several SMFs to help ensure that project planning, solution building,

and the final release fulfill requirements and create a solution that is fully supportable and maintainable

when operating in production.


Figure 24: Security threat modeling

For more information, please see the following documents:

IT Infrastructure Threat Modeling Guide

Security Risk Management Guide

Security for the solution is founded on three pillars: protected infrastructure, application access, and

network access.

Protected Infrastructure

A defense-in-depth strategy is utilized at each layer of the solution. Security technologies and controls

must be implemented in a coordinated fashion. An entry point represents data or process flow that

crosses a trust boundary. Any portions of an IT infrastructure in which data or processes cross from a

less-trusted zone into a more-trusted zone should have a higher review priority. Users, processes, and

IT components operate at specific trust levels that vary between fully trusted and fully untrusted.

Typically, parity exists between the level of trust that is assigned to a user, process, or IT component

and the level of trust that is associated with the zone in which the user, process, or component resides.

Malicious software poses numerous threats to organizations, from intercepting a user's logon

credentials with a keystroke logger to achieving complete control over a computer or an entire

network by using a rootkit. Malicious software can cause websites to become inaccessible, destroy or

corrupt data, and reformat hard disk drives. Effects can include additional costs, such as disinfecting computers, restoring files, and re-entering or re-creating lost data. Virus attacks can also cause project teams to

miss deadlines, leading to breach of contract or loss of customer confidence. Organizations that are

subject to regulatory compliance can be prosecuted and fined.

A defense-in-depth strategy, with overlapping layers of security, is a strong way to counter these

threats. The least-privileged user account approach is an important part of that defensive strategy. This

approach directs users to follow the principle of least privilege and log on with limited user accounts.

This strategy also aims to limit the use of administrative credentials to administrators for administrative

tasks only.


Application Access

AD DS provides the means to manage the identities and relationships that make up the solution.

Integrated with Windows Server 2008 R2, AD DS provides the functionality that is needed to centrally

configure and administer system, user, and application settings.

Windows Identity Foundation enables .NET developers to externalize identity logic from their

application, improving developer productivity, enhancing application security, and enabling

interoperability. Developers can enjoy greater productivity while applying the same tools and

programming model to build on-premises software and cloud services. Developers can create more

secure applications by reducing custom implementations and using a single simplified identity model

based on claims.

Network Access

Windows Firewall with Advanced Security combines a host firewall and IPsec. Unlike a perimeter

firewall, Windows Firewall with Advanced Security runs on every computer, and it provides local

defense from network attacks that might pass through your perimeter network or originate inside your

organization. It also contributes to computer-to-computer connection security by allowing you to

require authentication and data protection for communications.
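As a minimal sketch, the built-in netsh advfirewall context can enable the firewall on every profile and add a connection security rule that requires IPsec authentication for inbound traffic. The rule name is an example, and the commands should be run from an elevated prompt.

# Sketch: enable the host firewall and require authenticated inbound
# connections for domain members (Kerberos computer authentication).
netsh advfirewall set allprofiles state on

netsh advfirewall consec add rule name="Domain Isolation" `
    endpoint1=any endpoint2=any `
    action=requireinrequestout auth1=computerkerb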

Network Access Protection (NAP) is a platform that allows network administrators to define specific

levels of network access, based on a client’s identity, the groups to which the client belongs, and the

degree to which the client complies with corporate governance policy. If a client is not compliant, NAP

provides a mechanism for automatically bringing the client into compliance (a process known as

remediation) and then dynamically increasing its level of network access. NAP is supported by Windows Server 2008 R2, Windows Server 2008, Windows 7, Windows Vista, and Windows XP with Service Pack 3. NAP includes an application programming interface that developers and vendors can use to integrate their products and utilize its health state validation, access enforcement, and ongoing compliance evaluation.
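The NAP client on a computer can be inspected and configured with the built-in netsh nap context, as the following sketch shows (run from an elevated prompt; enforcement client IDs vary by enforcement method).

# Sketch: inspect and enable the NAP client.
netsh nap client show state            # enforcement client status
netsh nap client show configuration    # local NAP client settings

# Enable the DHCP quarantine enforcement client by its well-known ID.
netsh nap client set enforcement id=79617 admin=enable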

You can logically isolate server and domain resources to limit access to authenticated and authorized

computers. You can create a logical network inside an existing physical network in which computers

share a common set of requirements for more secure communications. To establish connectivity, each

computer in the logically isolated network must provide authentication credentials to other computers

in the isolated network to prevent unauthorized computers and programs from gaining access to

resources inappropriately. Requests from computers that are not part of the isolated network will be

ignored. When you configure security on the Hyper-V server, consider the following:

- The antivirus solution on the Hyper-V server must support Hyper-V and be configured appropriately; a checklist of the commonly documented exclusions follows this list. For more information, see article 961804 in the Microsoft Knowledge Base.

- The host-based firewall should be enabled and configured.
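The following sketch lists the exclusions documented for Hyper-V in KB 961804. The paths shown are the defaults and must be adjusted for custom virtual machine, VHD, and CSV locations; how the exclusions are applied depends on the antivirus product in use.

# Checklist sketch of Hyper-V antivirus exclusions (per KB 961804).
$exclusions = @(
    'C:\ProgramData\Microsoft\Windows\Hyper-V',              # VM configuration files
    'C:\Users\Public\Documents\Hyper-V\Virtual Hard Disks',  # default VHD location
    'C:\ClusterStorage',                                     # cluster shared volumes
    'vmms.exe',                                              # VM management service
    'vmwp.exe'                                               # VM worker process
)
$exclusions | ForEach-Object { Write-Output "Exclude: $_" }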

Endpoint Protection

Desktop management and security have traditionally existed as two separate disciplines; yet, both play

central roles in helping to keep users safe and productive. Managing the system provides proper

system configuration, deploys patches against vulnerabilities, and delivers necessary security updates.

Security provides critical threat detection, incident response, and remediation of system infection.


System Center 2012 Endpoint Protection (formerly known as Forefront® Endpoint Protection 2010)

aligns these two work streams into a single infrastructure. It makes it easier to help protect critical

desktop and server operating systems against viruses, spyware, rootkits, and other threats. Endpoint

Protection provides the following security features:

- Single console for management and security: Provides a single interface for managing and securing desktops that reduces complexity and improves troubleshooting and reporting insights.

- Central policy creation: Provides a central location for administrators to create and apply all client-related policies.

- Enterprise scalability: Makes it possible to efficiently deploy clients and policies in large organizations around the globe. By using System Center 2012 Configuration Manager distribution points and an automatic software deployment model, organizations can quickly deploy updates without relying on Windows Server Update Services.

- Highly accurate and efficient threat detection: Helps to protect against the latest malware and rootkits with a low false-positive rate, and helps to keep employees productive by providing scanning that has a low impact on performance.

- Behavioral threat detection: Uses system behavior and file reputation data to identify and block attacks on client systems from previously unknown threats. Detection methods include behavior monitoring, the cloud-based Dynamic Signature Service, and dynamic translation.

- Vulnerability shielding: Helps prevent exploitation of endpoint vulnerabilities through deep protocol analysis of network traffic.

- Automated agent replacement: Automatically detects and removes common endpoint security agents to lower the time and effort that are needed to deploy new protection.

- Windows Firewall management: Makes sure that Windows Firewall is active and working properly to help protect against network-layer threats, and allows administrators to more easily manage protection across the enterprise.

For more information, please see Microsoft System Center 2012 Endpoint Protection.

Service Delivery Layer

As the primary interface with the business, the service delivery layer is expected to know or obtain answers

to the following questions:

- What services does the business want?

- What level of service are the business decision makers willing to pay for?

- How can private cloud move IT from being a cost center to becoming a strategic partner with the business?

With these questions in mind, IT departments must address two main problems within the service layer:


- How do we provide a cloud-like platform for business services that meets business objectives?

- How do we adopt an easily understood, usage-based cost model that can be used to influence business decisions?

An organization must adopt the private cloud architecture principles and concepts to meet the business objectives of a cloud-like service. For more information, see the Private Cloud Architecture Principles and Concepts section earlier in this document.

Figure 25: Service Delivery Layer of the Private Cloud Reference Model

The components of the service delivery layer are:

Financial management: Incorporates the functions and processes that are used to meet a service

provider’s budgeting, accounting, metering, and charging requirements. The primary financial

management concerns in a private cloud are providing cost transparency to the business and structuring a

usage-based cost model for the consumer. Achieving these goals is a basic precursor to encouraging desired consumer behavior.

Demand management: Involves understanding and influencing customer demands for services, plus the

capacity to meet these demands. The principles of perceived infinite capacity and continuous availability

are fundamental to stimulating customer demand for cloud-based services. A resilient, predictable

environment with predictable capacity management are necessary to adhere to these principles. Cost,

quality, and agility factors influence consumer demand for these services.

Business relationship management: Provides a strategic interface between the business and IT. If an IT

department is to adhere to the principle that it must act as a service provider, mature business relationship

management is critical. The business should define the functionality of required services and partner with

the IT department to procure a solution. The business also needs to work closely with the IT department to

define future capacity requirements to adhere to the principle of perceived infinite capacity.

Service catalog: Documents a list of services or service classes that detail the output of demand and

business relationship management. This catalog describes each service class, eligibility requirements for

each service class, service-level attributes, targets that are included with each service class (like availability

targets), and cost models for each service class. The catalog must be managed over time to reflect

changing business needs and objectives.

Service lifecycle management: Takes an end-to-end management view of a service. A typical journey

includes identifying a business need, managing the business relationship, and providing availability to the

service. Service strategy drives service design. After launch, the service is transitioned to operations and

refined through continual service improvement. Taking a service provider’s approach is critical to successful

service lifecycle management. For more information, see the Private Cloud Architecture Principles and

Concepts section earlier in this document.


Service-level management: Includes negotiating service level agreements (SLAs) and making sure that

the agreements are met. SLAs define target levels for cost, quality, and agility by service class, in addition

to the metrics for measuring actual performance. Managing SLAs is necessary to achieve the perception of

infinite capacity and continuous availability. This requires IT departments to implement a service provider’s

approach.

Continuity management and availability management: Continuity management defines how risk will

be managed in a disaster scenario to help make sure that minimum service levels are maintained.

Availability management defines processes necessary to achieve the perception of continuous availability.

The principles of resiliency and automation are fundamental.

Capacity management: Defines the processes necessary to achieve the perception of infinite capacity.

Capacity must be managed to meet existing and future peak demand while controlling underutilization.

Business relationship and demand management are key inputs into effective capacity management, and

they require a service provider’s approach. Predictability and optimization of resource usage are primary

principles to achieve capacity management objectives.

Information security management: Strives to make sure that all requirements are met for confidentiality,

integrity, and availability of the organization’s assets, information, data, and services. An organization’s

information security policies drive the architecture, design, and operations of a private cloud. Resource

segmentation and multitenancy requirements are important factors to consider during this process.

Utilizing all System Center 2012 products and components helps achieve a comprehensive service delivery

layer in the private cloud.

Operations

The operations layer defines the operational processes and procedures that are necessary to deliver IT as a

service. This layer uses IT service management concepts that can be found in prevailing best practice such

as ITIL or MOF.

The main focus of the operations layer is to carry out the business requirements that are defined at the

service delivery layer. Cloud-like service attributes cannot be achieved through technology alone; mature

IT service management is also required.

The operations capabilities are common to all three services: IaaS, PaaS, and SaaS.

Figure 26: Operations Layer of the Private Cloud Reference Model

The components of the operations layer include:

Change management: Responsible for controlling the lifecycle of all changes. The primary objective is to

implement beneficial changes with minimum disruption to the perception of continuous availability.


Change management determines the cost and risk of making changes and balances them against the

potential benefits to the business or service. Driving predictability and minimizing human involvement are

the core principles behind a mature change management process.

Service asset and configuration management: Maintains information about the assets, components, and

infrastructure that are needed to provide a service. Accurate configuration data for each component and

its relationship to other components must be captured and maintained. This data should include historical,

current, and expected future states, and it should be easily available to those who need it. Mature service

asset and configuration management processes are necessary to achieve predictability.

Release and deployment management: Responsible for seeing that changes to a service are built, tested,

and deployed with minimal disruption to the service or production environment. Change management

provides the approval mechanism (determining what will be changed and why), but release and

deployment management is the mechanism for determining how changes are implemented. Driving

predictability and minimizing human involvement in the release and deployment process are critical to

achieve cost, quality, and agility goals.

Knowledge management: Responsible for gathering, analyzing, storing, and sharing information within

an organization. Mature knowledge management processes are necessary to achieve a service provider’s

approach, and they are a key element in IT service management.

Incident and problem management: Strives to resolve disruptive, or potentially disruptive, events with

maximum speed and minimum disruption. Problem management also identifies root causes of past

incidents and seeks to identify and prevent (or minimize the impact of) future incidents. In a private cloud,

the resiliency of the infrastructure helps make sure that faults, when they occur, have minimal impact on

service availability. Resilient design promotes rapid restoration of service continuity. Driving predictability

and minimizing human involvement are necessary to achieve this resiliency.

Request fulfillment: The goal of request fulfillment is to manage user requests for services. As the IT

department adopts a service provider’s approach, it should define available services in a service catalog

based on business functionality. The catalog should encourage desired user behavior by exposing cost,

quality, and agility factors to the user. Self-service portals, when appropriate, can assist the drive towards

minimal human involvement.

Access management: Attempts to deny access to unauthorized users while making sure that authorized

users have access to needed services. Access management implements the security policies that are

defined by information security management at the service delivery layer. Maintaining smooth access for

authorized users is critical to achieve the perception of continuous availability. Adopting a service

provider’s approach to access management will also make sure that resource segmentation and

multitenancy are addressed.

Systems administration: Performs the daily, weekly, monthly, and as-needed tasks required for system

health. A mature approach to systems administration is required to achieve a service provider’s approach

and to drive predictability. The vast majority of systems administration tasks should be automated. The use

of Service Manager and Orchestrator help achieve a comprehensive and mature operational process for

the private cloud.


Appendix: Detailed Private Cloud Fast Track SQL Server Design Diagram

Figure 27: System Center 2012 SQL Server Requirements (Fast Track)