Protecting Converged Virtual Infrastructures: A reference architecture for VMware Data Protection using CommVault® Simpana® IntelliSnap™ and Dell EqualLogic Arrays

Table of Contents

Converged Virtual Infrastructure Building Block

Data Protection Challenges in a Virtual Environment

Integrated Protection for the Converged Virtual Environment

Reference Architecture

Data centers continue to evolve from an environment based on physical servers and storage to one based on virtual platforms.

Such modern data centers based on converged infrastructure (servers, storage, networking and applications) demand an

integrated approach to deploying, managing and protecting critical business resources within a consolidated virtual

environment. Due to the unprecedented scale and consolidation afforded by VMware vSphere, today’s data centers have begun

to resemble private clouds where the physical infrastructure complexity is completely hidden from users and application

owners to such a high degree that protecting and managing the data for each user and each application has become

challenging to say the least. A proven tool and a simple approach are needed to provide agility and automation to how

individual application and user data is protected and managed in modern data centers.

Integrated and scalable data protection for Microsoft and Linux applications

Too often in a highly virtualized cloud-like data center with converged infrastructure, data protection and management are an

afterthought, addressed with limited-purpose external tools leading to sub-optimal results and poor resource utilization. The

critical and ever more complex task of data management within virtual machines – including protection, recovery and long

term retention – in a densely populated data center with multiple applications like Microsoft SharePoint and Exchange and

multiple databases like Oracle and MS-SQL, needs a more integrated and flexible approach.

CommVault® Simpana® software is a proven, integrated, unified, and robust end-to-end data

management solution that seamlessly enables data protection policies in physical as well as virtual environments. CommVault

software natively integrates with VMware vStorage APIs and EqualLogic PS Series APIs for agile and automated data

protection and recovery, while minimizing impact on production systems. Organizations that use VMware and EqualLogic

virtualization infrastructure can confidently transition to a modern data center knowing that they get the same level and

granularity of protection as in physical environments, but optimized to deal with the unique challenges of virtualization.

Converged Virtual Infrastructure Building Block

Dell and CommVault introduce an integrated converged virtual infrastructure that combines the best of Dell servers, storage

and networking components with industry leading CommVault® Simpana® software. The combined solution forms the core

building block for IT departments looking to build scalable VMware based private clouds.

Figure 1: Converged Virtual Infrastructure with Dell and CommVault® Software

The building block includes all components needed to build a truly agile and responsive virtual data center, ranging from very

small environments to the very large data centers with agile and automated data protection and recovery.

Key components of the converged virtual infrastructure building block are:

- Dell PowerEdge M1000e Blade Enclosure
- Dell PowerEdge M610, M610x, M710 or M910 blade servers
- Dell PowerConnect M6220, M8024, 8024F Switches
- Dell EqualLogic PS 6510X or PS 6010XV for primary storage
- Dell EqualLogic PS 6510E for tier 2 secondary storage
- CommVault® Simpana® Software for Data Protection and Management
- Optional Dell PowerEdge DL2200 with MD1200 enclosures for tier 3 and offsite storage
- Optional Dell DX Object Store Platform for long term granular retention and archiving

Integrated and robust end-to-end data protection for the virtual machines on this converged virtual infrastructure is provided by

CommVault® Simpana® software. Automated protection and recovery can be delivered using the Simpana® software

integration with VMware vStorage APIs and EqualLogic PS APIs, while also minimizing impact on production systems.

Organizations can confidently shift to a converged modern data center knowing that they get the same level of protection as in

physical environments, but optimized to deal with the unique challenges of virtualization.

Key Benefits of the Converged Virtual Infrastructure

- Flexible: Supports workloads from the smallest virtual environments to very large and dense virtual data centers.
- Linearly scalable: Simply add identical virtualization blocks, enabling predictable and manageable growth.
- Highly available and reliable: Built-in redundancy for all key components ensures there is no single point of failure in the entire block.
- Integrated Data Protection and Recovery: Includes all components necessary for end to end data protection and recovery.

EqualLogic PS Series Arrays

The key to any successful VMware deployment is the underlying storage system. The EqualLogic PS Series storage arrays

form the backbone of the Converged Virtual Infrastructure Building Block, delivering consolidated, highly-available and

scalable iSCSI storage. At the core is the peer storage architecture that allows multiple arrays to function as a single

virtualized pool of highly-available storage. This enables businesses to shift and grow storage resources as needed with

minimal disruption to existing production workloads, a flexibility that is critical to a successful VMware deployment. By using

Ethernet as the underlying networking protocol, EqualLogic PS Series arrays also enable IT groups to avoid the cost and

complexity of specialized storage networks and allow them to capitalize on their existing investment in and knowledge of

server networking infrastructure.

The PS series is available in a wide variety of flavors for differing workloads. For instance, the Converged Virtual Infrastructure

includes the 10K RPM SAS based PS6510X for hosting VMs with moderate workloads, the 15K RPM SAS based PS6010XV

(or SSD based PS6010 XS) for VMs with high I/O requirements and SATA based PS6010E for hosting backup copies. Each of

these PS series arrays has the inherent benefit of the peer storage architecture, allowing capacity and performance to

increase seamlessly, linearly and on demand.

EqualLogic PS Array Features and Benefits

Peer storage architecture

The EqualLogic PS Series is based on a unique, peer storage architecture. In this context, peer describes the collaboration

and equal partnership of a single, simple architecture; components and arrays function as peers, working together to share

resources, evenly distribute loads, and collaborate to help optimize application performance and provide comprehensive data

protection.

The result is an intelligent storage array that can deliver rapid installation, simple management, and seamless expansion.

Using patented, page-based data mover technology, members in a storage area network (SAN) work together to automatically

manage data, load balance across resources, and expand to meet growing storage needs. Because of this shared

architecture, enterprises can use PS Series arrays as modular building blocks for simple SAN expansion.

This architecture provides the basis for numerous features and capabilities, including peer deployment, control, provisioning,

protection, and integration.

Peer deployment is a SAN configuration technology that can sense network topology, automatically build RAID sets, and

conduct a system health check to help ensure that components are fully functional. Peer deployment enables IT staff to

potentially install, configure, and deploy most EqualLogic arrays in minutes.

Peer control offers virtualized storage management with a single view. PS Series arrays are designed to be self-managing;

systems are designed to continuously monitor storage resources and automatically load balance data across controllers,

network connections, and disk drives to help deliver optimal performance. Peer control automates key functions for

configuration, management, storage pooling, and data distribution, helping minimize the complexity of storage administration.

Peer provisioning enables administrators to dynamically provision resources to meet application requirements—including not

only disk space, but also connectivity, security, performance, and data protection. When application requirements change, the

storage configuration can change seamlessly. Peer provisioning is designed to simplify expansion while systems remain

online; new arrays can be automatically added to the group and automatically connect to the SAN. Expansion is linear,

enabling administrators to scale not only disk drives but also controllers, ports, cache, and performance as the environment

grows. Peer provisioning enables enterprises to purchase storage on demand, which facilitates efficient use of both capital and

storage resources. Advanced thin provisioning capabilities are included, giving administrators additional flexibility in providing

storage to applications. Administrators can also allocate virtual storage to volumes up to preset limits and add physical

capacity on demand. In both physical and virtual server environments, cloned volumes can be rapidly assigned to different

servers to help meet changing needs.

Peer protection starts with a robust design that avoids single points of failure and is designed to provide greater than 99.999

percent availability. It also includes built-in features such as application-aware snapshots for quick recovery and remote

replication for disaster protection. These features enable administrators to quickly create end-to-end solutions that can help

provide comprehensive protection against multiple types of failure or outage.

Peer integration provides a comprehensive software toolkit to facilitate the deployment, ongoing management, and protection

of EqualLogic SANs in Microsoft® Windows® OS environments and VMware® environments.
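
The thin provisioning behaviour described under peer provisioning can be illustrated with a minimal Python sketch. This is a generic model, not the EqualLogic implementation: the class names `StoragePool` and `ThinVolume`, the page granularity, and the error behaviour are all assumptions made for illustration. A volume advertises a preset size, consumes physical pages only on first write, and the pool can grow on demand.

```python
class StoragePool:
    """Hypothetical pool of physical pages shared by thin volumes."""

    def __init__(self, physical_pages):
        self.physical_pages = physical_pages
        self.used_pages = 0

    def allocate_page(self):
        if self.used_pages >= self.physical_pages:
            raise RuntimeError("pool exhausted: add physical capacity")
        self.used_pages += 1

    def add_capacity(self, pages):
        # Physical capacity is added on demand, without touching the volumes.
        self.physical_pages += pages


class ThinVolume:
    """Hypothetical volume that reports a preset size but allocates lazily."""

    def __init__(self, reported_size_pages, pool):
        self.reported_size_pages = reported_size_pages
        self.pool = pool
        self.pages = {}  # logical page number -> data

    def write(self, page_no, data):
        if not 0 <= page_no < self.reported_size_pages:
            raise IndexError("write beyond the volume's preset limit")
        if page_no not in self.pages:
            # First write to this page: consume one physical page.
            self.pool.allocate_page()
        self.pages[page_no] = data

    def read(self, page_no):
        # Pages that were never written read back as zeros.
        return self.pages.get(page_no, b"\x00")
```

In this model a volume can advertise 100 pages while the pool holds only a handful; overwriting an existing page consumes nothing extra, and a write that would exhaust the pool fails until `add_capacity` is called.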

vStorage API for Array Integration

Dell and VMware collaborated in the

development of the VMware vStorage APIs for Array Integration (VAAI) to help improve the scalability of a virtualized

environment while enhancing business agility. These APIs provide native integration between the hypervisor hosts and the

storage arrays, enabling the VMware vSphere software to work directly with the Dell EqualLogic PS Series arrays. These APIs

enable administrators to offload certain storage workloads from the host servers to the SANs. These offloading operations can

dramatically accelerate VM deployment tasks by avoiding the requirement to read the data all the way up the storage stack

and write it back down the storage stack. Instead, this operation occurs within the storage array. Administrators can deploy

VMs even faster than before, enhancing the agility of IT and the business. Helping to reduce processor, memory, and network

burdens also makes the environment increasingly efficient—there are additional server and network resources available for

running an increased number of VMs. These APIs also enable administrators to scale the virtual environments to higher

densities than before through the use of increasingly granular protection of VMFS metadata provided by Hardware Assisted

Locking. This capability enables administrators to increase the number of VMs in each data store that can now utilize

increased volume sizes.

Some additional key features for VMware include:

- Storage Adapter for Site Recovery Manager (SRM): Enables VMware SRM to take advantage of PS Series replication for full SRM integration.
- Multipathing Extension Module for VMware® vSphere: Enhances VMware multipathing functionality with connection awareness of PS Series network load balancing.
- PSAPI Services: For a high level of automation and coordinated operations, most of the PS-Series tools and controls are available via the PS-API, enabling closer integration with Simpana® software to customize and automate many of the EqualLogic snapshot and administrative functions.
- Built-in monitoring: Includes management tools and consolidated monitoring of individual volumes, arrays, pools and tiers of storage. SAN HQ enables detailed historical reporting and analysis.

Data Protection Challenges in a Virtual Environment

Traditional data protection solutions use resource intensive agents to physically collect and move data from production storage

to a backend disk or tape target. This legacy method of using brute processing power to move data from one place to another

is not sufficient. Moreover, the traditional practice of depending on the previous night’s backup is simply no longer acceptable for

modern Service Level Agreements (SLAs). Daily change rates are too high to make this a reliable recovery strategy.

With the shift to virtualized data centers and round the clock operations, there is a need to rethink the traditional data

protection techniques. Data protection and data recovery must have minimal front end impact and cannot exclusively rely on

copying from the production to the backend. An effective solution minimizes the load on production systems, minimizes

administrative effort and eases the transition to a virtualized and eventually a cloud-based data center. Some key challenges

for data protection in a virtual environment are

Not enough time to move data for backup: High server consolidation and high VM density concentrates data ownership to a

small number of ESX servers with most resources dedicated to production workloads. There are few resources, if any, left for

traditional backup tasks that move data nightly. Moreover, high data growth and round the clock operations are rapidly

shrinking backup windows.

Multiple Recovery Points: With high data growth and change rate, relying on last night’s backup for recovery is no longer

sufficient. Organizations are demanding Recovery Point Objectives (RPO) of hours. In other words, it is necessary to be able

to recover to a few hours ago, not to last night, to minimize data loss as a result of disruption. Creating frequent recovery

points without disrupting production activity is a huge challenge.

Enforcing Data Protection policies: The ease of deploying new VMs leads to a virtual machine sprawl making it tedious and

time consuming for administrators to keep track of new virtual machines and ensure correct data protection and retention

policies are applied to them. Administrators spend a significant part of their day tracking down new VMs and manually applying

data protection policies.

Application Integration: As more and more mission critical applications are virtualized it is necessary to provide the same

level of protection and recovery capabilities for these applications, while staying within the bounds of constraints imposed by a

highly consolidated VMware environment.

Multiple Recovery Options: Administrators need to perform full VM as well as granular recovery from a single protection operation.

Long Term Retention: Moving data into virtual servers does not eliminate the need for long term retention of data. However,

it makes the process more complicated since backup now involves full VM images. Retaining full VM images for several

months if not years is extremely wasteful, even with techniques like deduplication. Businesses are looking for smarter ways to

retain critical data, especially individual objects within applications, for long duration independently of the VM backup images.

CommVault® Simpana® Software: New Approach to Data Management

CommVault® Simpana® software is a revolutionary data management solution that not only addresses the challenges arising

out of limitations in legacy data center environment, but more importantly, accelerates the shift into virtualized and cloud-

enabled data centers. With Simpana software businesses can start to realize tangible benefits on day one of deployment as

they transition from traditional environments to the modern data center. More critically, using techniques that access data

once but reuse it for multiple operations, businesses can reap the advantages of the modern data center immediately with

Simpana software, bypassing many of the pitfalls that arise from trying to force-fit legacy techniques or point solutions.

With CommVault® Simpana® software you can:

- Protect hundreds of virtual servers in minutes with no impact on physical production servers.
- Protect very large VMware environments with thousands of virtual machines.
- Automatically discover new virtual machines for guaranteed protection with minimal administrator intervention.
- Use embedded source side deduplication for rapid creation of secondary DR copies.
- Create 100% application consistent protection copies.
- Rely on Granular Message level protection for granular restores, content search and eDiscovery.

Customers can use Simpana software to rapidly modernize their data centers into private clouds to take full advantage of the

advances in virtualization technology while continuing to meet all the data management and data retention needs.

Integrated Data Protection for the Converged Virtual Infrastructure

Figure 2 illustrates an example of the converged virtual infrastructure using Dell blade servers, networking and Dell

EqualLogic storage with integrated Simpana® data protection software.

Figure 2: Converged Virtual Infrastructure Example with Optional Components

The production blade servers (highlighted in the green box) run VMware ESX and host the production virtual machines. These

virtual machines host their virtual machine disks on the primary EqualLogic group. This group consists of one or more

EqualLogic PS 6510X or PS 6010XV storage arrays connected via 10 GigE iSCSI to the production ESX servers.

A dedicated or lightly loaded utility blade server (highlighted in red) is responsible for end to end data protection and recovery

of all the VMs in the building block. The utility node runs an ESX server and a virtual machine to which most of the physical

resources are allocated. The virtual machine runs the Simpana® Virtual Server Agent (VSA) and MediaAgent (MA) that

execute data protection policies with no impact on the production ESX blades or the virtual machines. Simpana software

drives the secondary copies on the Secondary EqualLogic Group which contains Tier 2 EqualLogic PS 6510E SATA based

storage. These backup copies are deduplicated and indexed.

Also shown are optional components. The Dell DL2200 functions as a Tier 3 disk target for long term retention of data. The

architecture also includes the Dell DX6000 object store which Simpana software utilizes to drive message level protection

policies for long term retention of messages from Exchange mailboxes hosted on virtual machines.

The integrated Simpana solution on the single utility blade affords a variety of data protection solutions for this environment

depending on recovery needs. These include:

Snapshot based Protection With IntelliSnap™ for Virtual Servers: Leverages snapshot capabilities in the EqualLogic PS

storage array to create persistent snapshot based recovery copies on primary storage to protect hundreds of virtual machines

in minutes. Also supports creation of deduplicated and indexed copies on secondary EqualLogic from the snapshot copies

using vStorage API for Data Protection (VADP).

Incremental Forever Backup with vStorage API for Data Protection: Leverages VMware APIs and Change Block Tracking

(CBT) for incremental forever backups. Uses deduplication-enhanced DASH Full to create full secondary copies.

Snapshot based Protection With IntelliSnap™ for Virtual Servers

Embrace EqualLogic Snapshots with IntelliSnap™ for Virtual Servers

Traditional data protection techniques for VMware such as vStorage API for Data Protection (VADP), or VMware Consolidated

Backup (VCB) rely on an off-host agent to protect virtual machine images. As efficient as VADP is, it is still a streaming

method that moves the image files from the datastore to backup disk for protection. This method suffices for small to moderate

sized environments with reasonable backup windows. For larger environments with ever shrinking backup windows, there is

simply not enough time or bandwidth to move all the VM data. Moreover in a highly dense virtual environment, VADP backups

could impose a tremendous burden on the production ESX hosts and virtual machines each time a backup job runs. With an

estimated average of 40% data growth rate every year, this method is simply not sustainable.

IntelliSnap™ for Virtual Servers solves these issues by integrating with the EqualLogic PS APIs and the native hardware snapshot capabilities in the EqualLogic PS array to create persistent snapshot-based recovery copies for virtual machines.

For IntelliSnap™ for Virtual Servers, the Virtual Server Agent and the Media Agent module are configured on a Windows system. This system could be a virtual machine or a physical machine. In the converged virtual infrastructure block, this is hosted on a single half height blade server or utility blade. This backup ESX proxy can also run on the ESXi version of the hypervisor.

An IntelliSnap™ job leverages VMware APIs to prepare VMs for backup. However, instead of using VADP to copy data blocks, it executes a rapid hardware snapshot. The sequence is as follows:

1. Discover new VMs based on pre-defined criteria.
2. Quiesce VMs in the protection policy to ensure a consistent set of image files.
3. Determine the data stores associated with the VMs.
4. Execute a hardware snapshot using the EqualLogic PS APIs; this takes a few seconds.
5. Un-quiesce the VMs to resume normal operations.
6. Index the snapshots and the VM list inside the snapshot.

Figure 3: Virtual Server Protection Workflow using VSA, IntelliSnap and Media Agent Combination
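
The job sequence above can be sketched in Python. This is only an illustration of the ordering of the steps: the `VM` class, the `matches_policy` and `take_hw_snapshot` callables, and the log format are hypothetical stand-ins, and none of them correspond to actual Simpana, VMware, or EqualLogic API calls.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    datastore: str
    tags: set = field(default_factory=set)
    quiesced: bool = False


def run_snapshot_job(inventory, matches_policy, take_hw_snapshot):
    """Run the six steps in order; returns (snapshot index, step log)."""
    log = []

    # 1. Discover VMs that match the pre-defined criteria.
    vms = [vm for vm in inventory if matches_policy(vm)]
    log.append(("discover", [vm.name for vm in vms]))

    # 2. Quiesce the selected VMs so their image files are consistent.
    for vm in vms:
        vm.quiesced = True
    log.append(("quiesce", len(vms)))

    try:
        # 3. Determine the data stores behind those VMs.
        datastores = sorted({vm.datastore for vm in vms})
        log.append(("datastores", datastores))

        # 4. One hardware snapshot per data store (stand-in callable).
        snapshots = {ds: take_hw_snapshot(ds) for ds in datastores}
        log.append(("snapshot", sorted(snapshots.values())))
    finally:
        # 5. Always un-quiesce, even if the snapshot step failed.
        for vm in vms:
            vm.quiesced = False
        log.append(("unquiesce", len(vms)))

    # 6. Index which VMs are contained in which snapshot.
    index = {}
    for vm in vms:
        index.setdefault(snapshots[vm.datastore], []).append(vm.name)
    log.append(("index", index))
    return index, log
```

The `try`/`finally` block is a design choice of this sketch: it keeps the quiesced window as short as possible and guarantees that VMs resume normal operation even when a snapshot call raises.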

The IntelliSnap operation takes only a few minutes, and the VMs themselves are made quiescent for a very short time. The

short duration with minimal impact on virtual servers allows the creation of multiple recovery copies per day, enabling more

aggressive recovery SLAs. Simpana VSA software also includes the Live Browse capability that allows users to browse and

recover individual files from Windows virtual machines directly from the hardware snapshot copy.

Extend Hardware Snapshots with Deduplication enabled copies

EqualLogic snapshots, when used with Simpana® IntelliSnap™ technology, allow rapid-fire creation of persistent and

consistent snapshot copies. These snapshot copies are ideal for applications that require rapid recovery and multiple recovery

points per day. However, there are certain considerations to keep in mind with snapshots:

- Snapshots need reserve space to keep track of changed blocks. More snapshots require more reserve space, leading to less disk utilization.
- Consequently, snapshots provide limited retention. Retaining snapshots for more than a couple of weeks may require sizeable reserve space.
- Since snapshots operate at the volume level, indexing contents of snapshots and virtual machines is challenging. Applications running inside VMs add an additional layer of complexity.
- Snapshots depend on the original volume; if the original volume is lost, dependent snapshots are unavailable.

Most environments have longer retention requirements than what can be met only by snapshot copies. In a typical

environment with 30-90 days retention requirement, one would create snapshot based recovery copies every 4-8 hours that

are retained for 7-14 days. Longer term retention copies require conventional backup techniques.
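
As a rough worked example of the schedule described above, the following Python sketch computes how many snapshot copies sit on primary storage for a given interval and retention period. The function name `snapshot_plan` is hypothetical, and the sketch assumes every scheduled snapshot succeeds, so the worst-case recovery point equals the snapshot interval. It does not estimate reserve space, which depends on change rates not given here.

```python
def snapshot_plan(interval_hours, retention_days):
    """Return recovery-point figures for a fixed snapshot schedule."""
    if 24 % interval_hours != 0:
        raise ValueError("interval must divide 24 hours evenly in this sketch")
    per_day = 24 // interval_hours
    return {
        # Data written since the last snapshot is what can be lost.
        "worst_case_rpo_hours": interval_hours,
        "recovery_points_per_day": per_day,
        # Number of snapshots held on primary storage at steady state.
        "snapshots_retained": per_day * retention_days,
    }
```

At the two ends of the ranges quoted above, a snapshot every 8 hours kept for 7 days means 21 snapshots retained, while a snapshot every 4 hours kept for 14 days means 84.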

Simpana software extends the lifetime of a snapshot by creating offline, fully cataloged and deduplicated backup copies on

secondary disk targets from snapshots. The IntelliSnap and Virtual Server Agent combination (or IntelliSnap™ for VSA)

includes the ability to copy the contents of selective snapshots to a deduplication enabled disk target. In the converged virtual

infrastructure the secondary disk target could be a secondary EqualLogic group or the Dell DL2200 appliance.

Figure 4: Creating Secondary Copy from snapshots

When this option is selected, VSA selectively mounts a snapshot to a proxy ESX server (the utility blade) and uses VADP APIs

to copy the contents of the snapshots to a deduplicated disk library. VSA also indexes the contents of the VM images as they

are copied. Since the snapshot already contains consistent VM images, no interaction is needed with the production ESX

servers and VM as the backup copy is created. Moreover, VSA takes full advantage of Change Block Tracking (CBT) to

support incremental backup copies from snapshots.

IntelliSnap™ for VSA provides all the benefits of hardware snapshot based protection with the ability to extend these

capabilities to meet long term retention objectives as well.

Benefits of IntelliSnap™ for VSA

- Extremely fast, reliable protection for data in virtual machines, with minimal production impact. Allows critical systems to be virtualized for more rapid ROI from the shift to the virtual platform.
- Multiple recovery points per day allow users to recover from a point closer to the point of failure, rather than simply from last night’s backup.
- Retain snapshots on primary EqualLogic for 1-2 weeks, while also extending the contents of the snapshots to secondary EqualLogic for long term retention.
- Minimize the reserve space required for snapshots.
- Index the contents of the snapshots for granular file level recovery from long term copies.
- Since the backup copy is created from a snapshot, it has minimal impact on the production ESX server and virtual machines.
- Leverage Change Block Tracking (CBT) while creating secondary copies.
- Embedded deduplication reduces the backup storage footprint significantly.

Incremental Forever Backups with DASH Full

In environments that do not require multiple recovery points per day or cannot support snapshot based protection, Simpana® software offers an alternate method to protect all VMs in the infrastructure without overburdening production VMs and ESX servers.

VMware provides vStorage API for Data Protection (VADP), an API set for backup vendors to use for backing up virtual machines using an external proxy. VADP eliminates the need for configuring backup agents inside each VM. A key element of VADP is Change Block Tracking (CBT), which identifies only those blocks that have changed since the last backup. Incremental backup jobs can use CBT to quickly identify changed blocks and back them up.

The Simpana® VSA agent on the utility blade takes full advantage of VADP and CBT to enable incremental forever backups to the secondary EqualLogic pool. The distributed, self-protected index ensures any incremental job is capable of performing a single pass full VM recovery or granular file level recovery.
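
The idea behind changed-block tracking and index-driven recovery can be illustrated with a small Python model. This is a conceptual sketch only: `TrackedDisk`, `build_index`, and `restore` are hypothetical names, and the code does not use or represent the actual VADP or Simpana interfaces. Each backup copies only the blocks written since the previous backup, and an index records which backup holds the newest copy of each block so that a full image is rebuilt by reading each block exactly once.

```python
class TrackedDisk:
    """Toy disk that remembers which blocks changed since the last backup."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        # Before the first backup, every block counts as changed.
        self.changed = set(range(num_blocks))

    def write(self, block_no, data):
        self.blocks[block_no] = data
        self.changed.add(block_no)

    def backup(self):
        """Copy only the changed blocks, then reset the tracking set."""
        delta = {n: self.blocks[n] for n in sorted(self.changed)}
        self.changed.clear()
        return delta


def build_index(backups):
    """Map each block number to the latest backup job that contains it."""
    index = {}
    for job_id, delta in enumerate(backups):
        for block_no in delta:
            index[block_no] = job_id  # later jobs override earlier ones
    return index


def restore(backups, index):
    """Rebuild the full image in one pass, reading each block once."""
    return [backups[index[n]][n] for n in sorted(index)]
```

Because the first backup contains every block, the index always covers the whole disk, and later incrementals only override the entries for blocks that actually changed.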

Incremental backups with CBT are a great way to create fully recoverable backup copies without overburdening the VMs.

However, it may be necessary to create a full backup set for exporting to an alternate site or to offline media like tape. Running

a periodic full backup can impose significant overhead on the production VMs. An alternative is to create Synthetic Full backup

copies.

A traditional Synthetic Full reads the last full backup and all incrementals that follow and writes back a new consolidated full.

On a deduplication enabled disk, this is a very expensive and wasteful approach. The traditional synthetic full job reads all the

data, thereby rehydrating it, creates a new consolidated full and feeds this full backup to the deduplication engine for

processing. However, since all the data blocks are already on disk, the deduplication engine never writes these to disk. The

net result is that the Synthetic Full job wasted a lot of time and cycles reading and deduplicating data that resulted in only

reference pointers being updated on disk. This costly, time-consuming approach is illustrated in Figure 5.

Figure 4: Virtual Server Protection Workflow using an Incremental Forever Backup Approach

Figure 5: Traditional Synthetic Full

Simpana® software solves the rehydration problem by including the option of Deduplication Accelerated Synthetic Full or

DASH Full. DASH Full provides a smarter way to create a consolidated full backup on the secondary EqualLogic disk storage

pool. Since all the data blocks are already on the pool, the DASH Full operation does not waste time reading the data blocks and

re-processing them for deduplication. Instead it merely creates a new index that contains references to the appropriate data blocks

and updates reference counts in the deduplication database. As a result the DASH Full creates a new consolidated full in a

much shorter time than a traditional Synthetic Full. This dramatically more efficient approach to deduplicated, CBT enabled

VADP backup is illustrated in figure 6.

Figure 6: DASH Full
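The difference between the two synthetic full approaches can be sketched with a toy model. The code below is an illustration only, not Simpana® code: the `DedupStore` class, the SHA-256 block signatures and the offset-to-signature index are all assumptions made for the example. It shows that a traditional synthetic full has to read every block back, while an index-only consolidation reads none and still produces the same consolidated index.

```python
import hashlib
from collections import Counter


class DedupStore:
    """Toy deduplicated disk pool: unique blocks plus reference counts."""

    def __init__(self):
        self.blocks = {}            # signature -> block bytes (stored once)
        self.refcount = Counter()   # signature -> number of index references
        self.block_reads = 0        # how many data blocks were read back

    def write(self, data: bytes) -> str:
        """Deduplicate one block: store it only if its signature is new."""
        sig = hashlib.sha256(data).hexdigest()
        if sig not in self.blocks:
            self.blocks[sig] = data
        self.refcount[sig] += 1
        return sig

    def read(self, sig: str) -> bytes:
        """Rehydrate one block from the pool."""
        self.block_reads += 1
        return self.blocks[sig]


def backup(store, blocks_by_offset):
    """Write blocks and return an index mapping offset -> signature."""
    return {off: store.write(data) for off, data in blocks_by_offset.items()}


def merge_indexes(full_index, incrementals):
    """Overlay incrementals on the last full, oldest first."""
    merged = dict(full_index)
    for inc in incrementals:
        merged.update(inc)
    return merged


def traditional_synthetic_full(store, full_index, incrementals):
    """Read every block back and feed it through deduplication again."""
    merged = merge_indexes(full_index, incrementals)
    return {off: store.write(store.read(sig)) for off, sig in merged.items()}


def dash_full(store, full_index, incrementals):
    """Build the new full from index entries only; no block data is read."""
    merged = merge_indexes(full_index, incrementals)
    for sig in merged.values():
        store.refcount[sig] += 1
    return merged


store = DedupStore()
full = backup(store, {0: b"A", 1: b"B", 2: b"C"})
inc1 = backup(store, {1: b"B2"})   # block 1 changed
inc2 = backup(store, {2: b"C2"})   # block 2 changed

dash = dash_full(store, full, [inc1, inc2])
reads_after_dash = store.block_reads

trad = traditional_synthetic_full(store, full, [inc1, inc2])
reads_after_trad = store.block_reads
```

In this model both approaches end with the same index and the same set of stored blocks; only the traditional path pays for reading the data.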

Simpana® software combines the power of CBT with DASH Full to enable an incremental forever strategy for VMware backups. The incremental forever backup minimizes the impact on production virtual machines and ESX servers and reduces the amount of data that needs to be read and moved from the production storage to the backup disk. At the same time, DASH Full enables the rapid creation of a consolidated full backup with no impact on the production ESX hosts, virtual machines or production storage.

Benefits of an Incremental Forever Backup with DASH Full Include

• Incremental forever with CBT minimizes the impact of backup on the production ESX hosts, virtual machines and data stores by reading and moving only data that has changed.

• Each incremental backup enables full VM recovery in a single pass.

• Multiple incremental backups per day allow a shorter RPO.

• Fast consolidated fulls can be exported weekly or monthly to tier 3 disk, or to offline media like tape or cloud, for long-term retention.

• Creation of the consolidated full has no impact whatsoever on the production ESX hosts, virtual machines and data stores.

Application Integration

Application Integrated Protection with IntelliSnap™ for Virtual Servers

A critical consideration when virtualizing mission critical applications is the ability to protect data in a fully application consistent

manner. One solution is to treat VMs running applications as regular servers and configure traditional backup agents inside the

guest VMs to guarantee application consistent backup. While Simpana® software does support this approach with source side

deduplication for efficient processing, this method may impose an additional burden on the production hosts. Alternatively, VADP

provides a method to leverage Microsoft VSS to create application consistent VM snapshots, but application consistency is not

guaranteed in all cases.

Simpana IntelliSnap™ for Virtual Servers offers a unique solution to this challenge. VSA integrates with applications inside

virtual machines to ensure the applications are in a consistent state before creating an EqualLogic snapshot based recovery

copy. This guarantees that the snapshot copy contains an application consistent representation of the VM image. This is

achieved without placing undue burden on the virtual machine or the ESX host.

Figure 7 illustrates how IntelliSnap for VSA integrates with applications running inside a virtual machine. It uses a lightweight, application-aware, VSS-aware and license-free module inside the virtual machine hosting the application. Before the IntelliSnap job quiesces the VM, it interacts with the lightweight module to prepare the application for backup using VSS. Once the application is ready, SPE invokes the PS API to create a hardware snapshot of the datastore on the primary EqualLogic storage array. Once the EqualLogic snapshot creation is complete, the application is released for normal operations and the VM is unquiesced. The whole operation lasts a few minutes. A secondary backup copy operation can then be used to copy the contents of the snapshot, including the application-consistent VM image, to the secondary EqualLogic group. Applications can be recovered by recovering the application database files and using native application tools to bring the application back online.
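The ordering of the steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: `protect_datastore`, the event names and the `take_snapshot` callable are hypothetical stand-ins and do not correspond to any Simpana®, VSS or EqualLogic PS API call. The point of the sketch is that the application and the VM are released even if snapshot creation fails.

```python
from typing import Callable, List, Tuple

Event = Tuple[str, str]


def protect_datastore(vm: str, datastore: str,
                      take_snapshot: Callable[[str], str],
                      log: List[Event]) -> str:
    """Record the workflow steps in order and return the snapshot id."""
    log.append(("prepare_application", vm))   # in-guest module readies the app
    log.append(("quiesce_vm", vm))
    try:
        snapshot_id = take_snapshot(datastore)  # stand-in for the array call
        log.append(("snapshot_created", snapshot_id))
    finally:
        # Always release the application and the VM, even on failure.
        log.append(("release_application", vm))
        log.append(("unquiesce_vm", vm))
    return snapshot_id


log: List[Event] = []
snap = protect_datastore("exchange-vm", "datastore-01",
                         lambda ds: f"{ds}-snap", log)
steps = [name for name, _ in log]
```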

IntelliSnap for VSA supports Microsoft Exchange, Microsoft SQL and Microsoft SharePoint databases on virtual machines for VSS-consistent protection. In addition, Simpana software supports log truncation for Exchange databases inside virtual machines that are protected with IntelliSnap for VSA.

The software module inside the application VM does not actually move any data and uses few resources, if any, during the protection operation. It does not require a special license and can be remotely installed and updated. In addition, it acts as a restore target, allowing file level restores directly to the virtual machine without the need to temporarily stage the files being recovered.

Figure 7: Application Integrated Protection with IntelliSnap for VSA

Message Level Protection for Exchange

Simpana software includes the ability to provide message level protection for Exchange on VMware virtual machines. Message level protection for Exchange uses the snapshots created by IntelliSnap for VSA as the source for mailbox backup. This extends all the benefits of snapshot based off-host protection to Exchange mailboxes hosted on virtual machines, including:

• Off-host processing for no-impact mailbox backup. The backup job automatically mounts and extracts contents from the snapshot copies created by the IntelliSnap™ jobs.

• Support for message level Full, Incremental and Synthetic (DASH) Full backups.

• All recovery benefits of mailbox backups, including the ability to recover individual messages.

• Separate retention for messages, snapshot copies and VMware images, leading to lower overall storage consumption.

• Content Indexing of e-mails for end user and compliance search and information management, without the need for agents inside the virtual machine.

It is a common requirement within organizations to retain e-mail messages for significantly longer periods than backup images.

Most VMware backup tools that can only protect VMware images force retention of full VM images for long periods of time to

satisfy e-mail retention needs. This is a wasteful strategy and requires a lot more disk space for backups. As a result, most

organizations either do not virtualize Exchange servers, or if they do virtualize them, use traditional Mailbox backup agents

inside the guest VMs, in addition to VADP backups, imposing additional burden on the Exchange VMs and the underlying ESX

servers.

With Message level protection for Exchange, Simpana® software allows businesses to protect and retain messages

separately from the EqualLogic snapshot copies. For example, IT departments can retain snapshot copies and the associated

VM images for a short duration (7-14 days) for VM level recovery. Message level backups can be retained separately on

secondary EqualLogic, DL2200 or DX6000 for several months or years, without the need to preserve the full VM images for

the extended duration, leading to significantly less space overhead. Moreover, all the granular recovery capabilities of mailbox

backups are available.
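Separate retention can be pictured as a per-copy-type expiry check. The sketch below is illustrative only: the `RETENTION_DAYS` values are assumptions chosen from within the ranges mentioned above (7-14 days for snapshot copies, months to years for messages), and `is_expired` is a hypothetical helper, not a Simpana® setting or API.

```python
from datetime import date, timedelta

# Assumed retention periods in days; the text only gives ranges
# (7-14 days for snapshot copies, months to years for messages).
RETENTION_DAYS = {"snapshot": 14, "message": 730}


def is_expired(copy_type: str, created: date, today: date) -> bool:
    """True once a copy is older than the retention for its own type."""
    return today - created > timedelta(days=RETENTION_DAYS[copy_type])


today = date(2013, 3, 1)
snapshot_expired = is_expired("snapshot", date(2013, 2, 1), today)  # 28 days old
message_expired = is_expired("message", date(2013, 2, 1), today)    # same age
```

With these assumed values, a 28-day-old snapshot copy is eligible for pruning while a message level backup of the same age is kept.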

Multiple Recovery Options

Simpana® software enables several recovery options for virtual servers.

Recovery options from EqualLogic snapshots

Full VM Restore: Recover entire virtual machines from the EqualLogic snapshot copies. Simpana® software automatically

and seamlessly mounts the snapshot in question and recovers the selected VMs to the specified location. The complexities of

snapshot related operations (identify, mount, register, unmount) are completely hidden from end users.

File Level Restore with Live Browse: Recover files from Windows virtual machines directly from the EqualLogic snapshot

copies. Simpana® software mounts the appropriate snapshot copy and presents a file level view of the contents of the

selected virtual machine. All storage complexities are masked from end users.

Application Level Recovery: Recover application database files using File Level Live Browse from EqualLogic snapshot

copies.

Exchange Message Level Mining: Recover Exchange database files to an alternate location using File Level Live Browse from an application-consistent snapshot copy. Use the Offline Mining tool to view and copy individual messages.

SharePoint Document Level Mining: Recover SharePoint database files with File Level Live Browse from an application-consistent snapshot copy. Use the Offline Mining tool to view and copy individual documents.

Full Volume Restore: Recover the entire data store from a snapshot using native EqualLogic tools.

Recovery Options from Secondary EqualLogic Copy

Simpana® software supports a wide variety of restore options from secondary disk copies. IntelliSnap™ for VSA and Incremental Forever backups provide similar restore options.

Full VM Restore: Recover entire virtual machines to the original or an alternate location.

File Level Restore: Recover individual files directly to the original VM or an alternate location. Supported for Windows virtual machines and Linux virtual machines with ext3 file systems.

Message Level Recovery: When using IntelliSnap™ for VSA and Message Level Protection, Simpana® software allows

message level recovery similar to recovery from traditional Mailbox level backups.

Application Level Recovery: When using IntelliSnap™ for VSA and backup copy to secondary EqualLogic, it is possible to

recover application database files to original or alternate location.

The recovery experience is identical whether data is being recovered from a snapshot on the primary EqualLogic or from a backup copy on the secondary EqualLogic. The end user does not need to know where the data resides, only which data is to be recovered. If the recovery point is a snapshot, Simpana® VSA uses the appropriate PS API calls to mount the snapshots and recover the data. If the recovery point is on the secondary EqualLogic, Simpana® software presents the same consistent recovery user interface.

Automatic VM Discovery and Automated Data Protection

Given the ease of creating virtual machines, new ones are created all the time, almost always without the knowledge of the

storage or backup admins. The admins, who are tasked with ensuring all data in the environment is protected, have to spend

an inordinate amount of time every day identifying new virtual machines and their owners, determining their purpose and

deducing their data protection and retention policy. These manual tasks lead to tremendous loss of productivity and wastage

of time that could otherwise have been spent on projects that help drive business growth.

Simpana® VSA includes the ability to automatically discover new virtual machines based on pre-defined rules and transparently add them to data protection policies. This ensures virtual machines are protected even when administrators have no knowledge of their existence. A wide variety of auto-discovery rules is available, allowing administrators to fine-tune discovery policies to best meet their needs. There is even a catch-all policy to ensure that VMs that do not meet any of the pre-defined criteria are protected nevertheless.

With this option, administrators can set the discovery rules once and never have to spend a minute looking for new virtual machines to protect. This can save hours per person, time that can be used for more fruitful activities.

The most commonly used rule for IntelliSnap™ is the Data Store Affinity rule, which groups all VMs in a datastore into a single data protection policy, or subclient. This ensures that when a datastore LUN is snapped, all the virtual machines hosted in that datastore are in a consistent state, eliminating the possibility of a “dirty” VM image within the recovery copy. The Data Store Affinity rule also allows the backup copy process to distribute the workload of copying data blocks equally across all LUNs without over-burdening any particular LUN.
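The combination of a datastore affinity rule and a catch-all policy can be sketched as a simple grouping step. This is an illustration under assumptions, not the Simpana® discovery engine: `assign_subclients`, the VM names and the datastore names are all hypothetical.

```python
from collections import defaultdict


def assign_subclients(vms, snap_datastores, catch_all="catch-all"):
    """Group VMs by datastore; anything unmatched goes to the catch-all.

    vms: iterable of (vm_name, datastore_name) pairs.
    snap_datastores: datastores covered by the affinity rule.
    """
    subclients = defaultdict(list)
    for vm_name, datastore in vms:
        if datastore in snap_datastores:
            subclients[datastore].append(vm_name)
        else:
            subclients[catch_all].append(vm_name)
    return dict(subclients)


discovered = [
    ("web01", "datastore-01"),
    ("db01", "datastore-01"),
    ("mail01", "datastore-02"),
    ("test01", "local-disk"),   # matches no rule, so it lands in the catch-all
]
policies = assign_subclients(discovered, {"datastore-01", "datastore-02"})
```

Every discovered VM ends up in exactly one policy, so a newly created VM is never left unprotected in this model.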

Benefits of Virtual Machine AutoProtection Include

• Increased IT staff productivity, freeing up more time for administrators to work on business critical projects that drive revenue.

• Effective workload distribution ensures the best possible use of storage LUNs without overwhelming any single one, allowing production processes to run with minimal impact.

Figure 8: Automatic Discovery Rules

Reference Architecture for Virtualization Building Block

This section describes three reference architectures for the Converged Virtual Infrastructure building block. Each reference architecture provides a different level of scalability and capacity, and the architectures can be combined to build an environment of the desired size. Each building block, regardless of size, contains an integrated data protection solution with Simpana® software.

Small Reference Architecture

Designed for the small data center that is in the initial stages of virtualization. Optimized to enable virtualization of non-critical,

non-application servers.

Possible Virtual Machine configuration

| # of VMs per ESX | Total # of VMs | Avg. VM size (GB) | Avg. VM RAM (GB) | Description |
|---|---|---|---|---|
| 28-35 | 200-250 | 60-100 | 2-4 | Number of VMs is constrained by how much memory is available on each blade and the number of cores. Typical VMware environments contain 4 server class VMs per core. |

Server and Storage Configuration

| Device | Role | Qty | Configuration | Description |
|---|---|---|---|---|
| Server Configuration | | | | |
| MD1000e | Blade Enclosure | 1 | | |
| PE M710 | Production ESX Servers | 7 | 48-96 GB RAM per blade | Primary ESX servers |
| PE M610 | ESX Server/Utility Blade | 1 | 96 GB RAM | Optional vCenter, optional Simpana® CommServe®, SAN Management software (SAN HQ) |
| PE M610 | ESX Server/Data Protection Utility Blade | 1 | 96 GB RAM | ESX Proxy running Simpana® VSA and MediaAgent with integrated deduplication |
| Storage Configuration | | | | |
| PS 6510X | Primary Storage | 1 | 28.8 TB, ~24 TB usable | About 20% usable space (4 TB) is reserved for snapshots, VMware swap files and Simpana® DDB volume. This leaves about 20 TB of storage for VMs. |
| PS 6010XV | Alternate Primary Storage | 2-3 | 19.2 to 28.8 TB, ~15 to ~24 TB usable | 20% usable is reserved for snapshots, VMware swap files and Simpana® DDB. Use when VMs run critical apps demanding high IO performance. |
| PS 6010E | Secondary Storage | 1 | 32 TB, ~28 TB usable | Used as backup target. Present 2 TB LUNs to Simpana® VSA/MA virtual machine. |
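The primary storage figures for the Small configuration can be checked with simple arithmetic. The sketch below uses only the numbers quoted above (about 24 TB usable, about 4 TB reserved); the helper names and the decimal conversion of 1 TB = 1000 GB are assumptions made for the example.

```python
GB_PER_TB = 1000  # assumption: decimal units


def vm_storage_tb(usable_tb, reserved_tb):
    """Space left for VMs after the snapshot, swap and DDB reserve."""
    return usable_tb - reserved_tb


def vms_that_fit(storage_tb, avg_vm_gb):
    """Whole number of VMs of a given average size that fit in the space."""
    return (storage_tb * GB_PER_TB) // avg_vm_gb


small_vm_storage = vm_storage_tb(usable_tb=24, reserved_tb=4)
vms_at_100_gb = vms_that_fit(small_vm_storage, avg_vm_gb=100)
vms_at_80_gb = vms_that_fit(small_vm_storage, avg_vm_gb=80)
```

Under these assumptions, the 20 TB left for VMs holds 200 VMs at an average of 100 GB and 250 VMs at 80 GB, which matches the 200-250 range quoted for this configuration. Smaller average VM sizes would fit more VMs on disk, at which point the blade memory and core counts noted in the table become the constraint.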

Medium Reference Architecture

Possible Virtual Machine configuration

| # of VMs per ESX | Total # of VMs | Avg. VM size (GB) | Avg. VM RAM (GB) | Description |
|---|---|---|---|---|
| 50-70 | 400-500 | 60-100 | 2-8 | Number of VMs is constrained by how much memory is available on each blade and the number of cores. Typical VMware environments contain 4 server class VMs per core. |

Server and Storage Configuration

| Device | Role | Qty | Configuration | Description |
|---|---|---|---|---|
| Server Configuration | | | | |
| MD1000e | Blade Enclosure | 1 | | |
| PE M710 | Production ESX Servers | 7 | 144 GB RAM per blade | Primary ESX servers |
| PE M610 | Alternate Production ESX Servers | 14 | 48-96 GB RAM per blade | Alternative to the M710s. Can also use a combination of M610 and M710 depending on desired VM density. |
| PE M610 | ESX Server/Utility Blade | 1 | 96 GB RAM | Optional vCenter, optional Simpana® CommServe®, SAN Management software (SAN HQ) |
| PE M610 | ESX Server/Data Protection Utility Blade | 1 | 96 GB RAM | ESX Proxy running Simpana® VSA and MediaAgent with integrated deduplication |
| Storage Configuration | | | | |
| PS 6510X | Primary Storage | 2 | 57.6 TB, ~48 TB usable | About 20% usable space (8 TB) is reserved for snapshots, VMware swap files and Simpana® DDB volume. This leaves about 40 TB of storage for VMs. |
| PS 6010XV | Alternate Primary Storage | 5-6 | 48 to 57.6 TB, ~42 to ~48 TB usable | 20% usable is reserved for snapshots, VMware swap files and Simpana® DDB. Use when VMs run critical apps demanding high IO performance. |
| PS 6510E | Secondary Storage | 1 | 48-96 TB, ~40-80 TB usable | Used as backup target. Present 2 TB LUNs to Simpana® VSA/MA virtual machine. |
| PS 6010XV or PS 6010XVS | Deduplication Database Storage (optional) | 1 | 4.8 TB | Used to store Simpana® deduplication database if primary storage capacity is constrained. |

Large Reference Architecture

Possible Virtual Machine configuration

| # of VMs per ESX | Total # of VMs | Avg. VM size (GB) | Avg. VM RAM (GB) | Description |
|---|---|---|---|---|
| 130-165 | 800-1000 | 60-100 | 4-16 | Number of VMs is constrained by how much memory is available on each blade and the number of cores. Typical VMware environments contain 4 server class VMs per core. |

Server and Storage Configuration

| Device | Role | Qty | Configuration | Description |
|---|---|---|---|---|
| Server Configuration | | | | |
| MD1000e | Blade Enclosure | 1 | | |
| PE M910 | Production ESX Servers | 6 | 256-512 GB RAM per blade | Primary ESX servers |
| PE M610 | ESX Server/Utility Blade | 1 | 96 GB RAM | Optional vCenter, optional Simpana® CommServe®, SAN Management software (SAN HQ) |
| PE M610 | ESX Server/Data Protection Utility Blade | 3 | 96 GB RAM per blade | ESX Proxy running Simpana® VSA and MediaAgent with integrated deduplication |
| Storage Configuration | | | | |
| PS 6010XV or PS 6010XVS | Deduplication Database Storage | 1 | 4.8 TB | Used for hosting DDB for the multiple Simpana® MAs that participate in deduplication |
| PS 6510X | Primary Storage | 4 | 115.2 TB, ~96 TB usable | About 20% usable space (16 TB) is reserved for snapshots, VMware swap files and Simpana® DDB volume. This leaves about 80 TB of storage for VMs. |
| PS 6010XV | Additional Primary Storage | 2-3 | 19.2 to 28.8 TB, ~15 to ~24 TB usable | Use for VMs running critical apps that demand high IO performance. |
| PS 6510E | Secondary Storage | 2 | 96-192 TB, ~80-160 TB usable | Used as backup target. Present 2 TB LUNs to Simpana® VSA/MA virtual machines. |

Performance Summary

Dell and CommVault conducted extensive performance benchmark tests on the Medium Reference Architecture to highlight the capabilities of the integrated EqualLogic and CommVault® solution for the Converged Virtual Infrastructure. Some highlights include:

• Protect 500 virtual machines in 30 minutes with IntelliSnap™ and EqualLogic snapshots.

• A single Simpana® Virtual Server Agent can protect 500 VMs in a daily 8 hour window (assuming incremental forever backups).

• Deduplication accelerated DASH Full completes in 9 hours for all 500 virtual machines.

Test Details

The tests were run for an environment equivalent to the Medium Reference Architecture described above. The virtual machines were configured as follows:

| # of VMs | Size of each VM | # of Data Stores | Total Data Size |
|---|---|---|---|
| 500 | 30 GB | 15 | 15 TB |

Space on each virtual machine was about 90% full with representative data that was unique to the virtual machine, to simulate real world conditions. The only common content across all 500 virtual machines was the operating system binaries.

Simpana® Virtual Server Agent (VSA) and MediaAgent (MA) were configured on a virtual machine on a dedicated Utility Blade. This single Simpana VSA was able to protect all 500 virtual machines in the converged infrastructure. Simpana VSA integrates with the PS APIs to create persistent snapshot based recovery copies in the primary EqualLogic group. VSA also uses VADP to create backup copies on the secondary EqualLogic group for longer retention.

Test Results: IntelliSnap™ for VSA

This sequence of tests measures the performance when using IntelliSnap™ for VSA along with EqualLogic snapshots to

protect 500 virtual machines. This sequence also measures the time required and the throughput to create deduplicated

secondary backup copies on the secondary EqualLogic array. The following table summarizes the results.

| Job Type | Time | Average Processing Rate | Description |
|---|---|---|---|
| IntelliSnap™ for VSA | 30 minutes | - | Make quiescent 500 virtual machines across 15 jobs, engage PS API to create a datastore level hardware snapshot and thaw the VMs. 15 jobs completed in 30 mins. |
| Full Secondary Copy | 24 hours | 625 GB/hr | Copy the contents of hardware snapshots to the secondary EqualLogic. Deduplication occurs inline. Usually done once a week or once a month. |
| Incremental Secondary Copy | 8 hours | - | Copy changed blocks from previously created hardware snapshots using VADP and CBT to the secondary EqualLogic. |

Test Results: Incremental Forever with DASH Full

This sequence of tests measures the performance when using an Incremental Forever protection strategy with DASH Full to back up 500 virtual machines to the secondary EqualLogic storage. Hardware snapshots on the primary storage are not used in this case. The following table summarizes the results.

| Job Type | Time | Average Processing Rate | Description |
|---|---|---|---|
| First Full Backup | 30 hours | 500 GB/hr | First full backup using VADP. Since the data has never been backed up (or deduplicated) before, the throughput is relatively low. |
| Incremental Backup | 8 hours | - | The default backup policy. Uses Changed Block Tracking to back up only changed blocks from VMs. |
| DASH Full | 9 hours | 1,667 GB/hr | Creates a periodic consolidated Full. Can be run once a week to mimic the traditional weekend full, daily incremental schedule. Run bi-weekly or once a month, if at all. Typically used when selective Full copies are required on tier 3 disk media, on tape or on cloud. |
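The average processing rates quoted in the test results follow directly from the 15 TB data set and the job durations. The sketch below reproduces that arithmetic; the function name is illustrative, and it assumes decimal units (15 TB = 15,000 GB).

```python
def processing_rate_gb_per_hr(total_gb, hours):
    """Average rate: total protected data divided by elapsed job time."""
    return total_gb / hours


total_gb = 500 * 30  # 500 VMs x 30 GB each = 15,000 GB (15 TB, decimal)

full_secondary_copy = processing_rate_gb_per_hr(total_gb, 24)
first_full_backup = processing_rate_gb_per_hr(total_gb, 30)
dash_full = round(processing_rate_gb_per_hr(total_gb, 9))
```

These evaluate to 625, 500 and about 1,667 GB/hr respectively. For DASH Full the figure is best read as an effective rate, since the operation updates index references rather than moving the data itself.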

Beyond the Numbers

The intention of the performance tests described above is to identify resources sufficient to accomplish data protection tasks

for converged virtual infrastructure within a typical operating window. The results are not by any means an indication of any

performance limitations of the EqualLogic PS arrays or Simpana® software. If higher performance is desired, the peer storage

architecture of EqualLogic arrays ensures additional storage controllers can be added seamlessly and easily without any

disruption.

Similarly, the modular and linearly scalable nature of Simpana® software allows you to dramatically improve performance by simply introducing additional Simpana® Virtual Server Agents (VSA) and MediaAgents (MA). For instance, by adding a second VSA/MediaAgent in the converged virtual infrastructure building block, the average processing rate can be doubled and the time taken can be cut in half for an environment with 500 virtual machines.
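As a rough illustration of this scale-out claim, the sketch below models job time when the per-agent rate is multiplied by the number of agents. It assumes perfectly linear scaling, as described above, and uses the measured first full backup rate of 500 GB/hr; the function name is hypothetical.

```python
def hours_to_protect(total_gb, rate_per_agent_gb_hr, agents):
    """Job time when agents work in parallel at the same per-agent rate."""
    return total_gb / (rate_per_agent_gb_hr * agents)


one_agent = hours_to_protect(15_000, 500, agents=1)
two_agents = hours_to_protect(15_000, 500, agents=2)
```

Under this assumption, the 30 hour first full backup drops to 15 hours with a second VSA/MediaAgent. Real deployments may see less than linear gains if storage or network bandwidth becomes the bottleneck.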

Scaling it up

Each building block in the Converged Virtual Infrastructure is an independent entity with data protection capabilities built in. By adding building blocks, you not only increase the size of the virtual environment but also add the means to protect the larger infrastructure. The Simpana® VSA included in each converged infrastructure building block ensures that data protection capacity scales linearly along with the infrastructure. With each Simpana VSA protecting VMs in its own block and all the Simpana agents working in parallel, the time to protect the virtual machines within this stack remains constant. For example, 500 VMs in a single block are backed up in 8 hours by a single Simpana VSA. 1000 VMs in two blocks with two Simpana VSAs will also be backed up in 8 hours. The following table summarizes the effect of stacking multiple building blocks and the time required for data protection.

| | 1 Building Block | 2 Building Blocks | 3 Building Blocks | 4 Building Blocks |
|---|---|---|---|---|
| # of VMs @ 30 GB | 500 | 1000 | 1500 | 2000 |
| # of VMs @ 50 GB | 300 | 600 | 900 | 1200 |
| Data store size | 15 TB | 30 TB | 45 TB | 60 TB |
| # of Simpana® VSA | 1 | 2 | 3 | 4 |
| IntelliSnap™ for VSA | | | | |
| Snapshot Copy Time | 30 mins | 30 mins | 30 mins | 30 mins |
| Full Backup Copy Time | 24 hours | 24 hours | 24 hours | 24 hours |
| Full Backup Copy Throughput | 625 GB/hr | 1,250 GB/hr | 1,875 GB/hr | 2,500 GB/hr |
| Incremental Backup Copy Time | 8 hours | 8 hours | 8 hours | 8 hours |
| Incremental Forever with DASH Full | | | | |
| First Backup Time | 30 hours | 30 hours | 30 hours | 30 hours |
| First Backup Throughput | 500 GB/hr | 1,000 GB/hr | 1,500 GB/hr | 2,000 GB/hr |
| Incremental Backup Time | 8 hours | 8 hours | 8 hours | 8 hours |
| DASH Full Time | 9 hours | 9 hours | 9 hours | 9 hours |
| DASH Full Throughput | 1,667 GB/hr | 3,333 GB/hr | 5,000 GB/hr | 6,667 GB/hr |

As the table indicates, additional building blocks increase the infrastructure available to host more virtual machines. However, the time taken to protect this larger virtual environment remains constant. This is because each building block has its own Simpana® data protection capabilities built in. The predictability of this model makes it extremely easy to plan for and deploy converged virtual infrastructure that will scale to meet the virtualization requirements of the largest of environments.

Summary

The Converged Virtual Infrastructure Block includes all the components necessary to successfully and reliably deploy a virtual data center. Each building block includes a high level of hardware component redundancy, built-in 10 Gb networking and EqualLogic Peer Storage architecture for unprecedented availability. The fully integrated Simpana® software, with the ability to protect hundreds of virtual servers in minutes, provides a level and scale of protection and recovery for the building block that is unmatched in the industry. Each self-contained, self-protected and self-managed building block allows businesses to shift rapidly to a modern and agile data center.

Note: the original work for this reference architecture was performed using CommVault Simpana v9 and completed in June 2010.

For more information about Simpana® software modules and solutions, and for up-to-date system requirements, please visit www.commvault.com.

CommVault Worldwide Headquarters • 2 Crescent Place • Oceanport, NJ 07757 • Phone: 888.746.3849 • Fax: 732.870.4525

CommVault Regional Offices: United States • Europe • Middle East & Africa • Asia-Pacific • Latin America & Caribbean • Canada • India • Oceania

©1999-2013 CommVault Systems, Inc. All rights reserved. CommVault, CommVault and logo, the “CV” logo, CommVault Systems, Solving Forward, SIM, Singular Information Management, Simpana, CommVault Galaxy, Unified Data Management, QiNetix, Quick Recovery, QR, CommNet, GridStor, Vault Tracker, InnerVault, QuickSnap, QSnap, Recovery Director, CommServe, CommCell, IntelliSnap, ROMS, Simpana OnePass, and CommValue, are trademarks or registered trademarks of CommVault Systems, Inc. All other third party brands, products, service names, trademarks, or registered service marks are the property of and used to identify the products or services of their respective owners. All specifications are subject to change without notice.