Boost storage efficiency across the enterprise
An integrated HPE Data Protector and HPE StoreOnce solution delivers scale, agility, performance, and cost efficiency
Technical white paper

Contents

Introduction
Limitations and inefficiencies of the first-generation deduplication technologies
Deduplication explained
HPE StoreOnce and HPE StoreOnce Catalyst
HPE Data Protector
Advanced integrations of HPE Data Protector with HPE StoreOnce Systems
  Flexible deployment
  Powerful and intelligent deduplication technology
  Efficient data movement
  Centralized management and control
Federated deduplication use cases
  A small remote office protection
  Medium-sized regional offices with local recovery requirements
  Large enterprises with multiple remote offices and data centers
  Virtual environment protection
Conclusion

Introduction

As data volumes double every 12 to 18 months, managing and protecting the growing volumes of information while reducing storage cost continues to be one of the top IT priorities.[1] Data deduplication is one of the most important and fastest-growing storage optimization techniques to appear in recent years. HPE StoreOnce is one of the most advanced deduplication engines on the market today. It can help organizations optimize their data protection infrastructure by reducing the amount of backup data that needs to be stored by 95 percent,[2] significantly lowering network traffic and backup footprint as well as backup and restore times. When HPE StoreOnce Systems are paired with the HPE Data Protector backup software, the integrated solution can deliver the scale, performance, agility, and cost efficiency needed to protect today’s IT environments.

Limitations and inefficiencies of the first-generation deduplication technologies

Today most enterprise backup vendors provide some form of deduplication to reduce the backup storage footprint and cost. However, the cost, efficiency, and implementation of the deduplication process must be assessed for each available option. For example, when the source and target use different deduplication algorithms, data must be rehydrated before it is sent across the wire, which leads to poor resource utilization and longer restore times.

Deploying an application-specific deduplication agent instead of running standardized deduplication across all data requires buying, deploying, and separately managing different deduplication agents, which often results in increased solution cost and management complexity (figure 1).

Figure 1. Limitations and inefficiencies of the first-generation deduplication technologies

Chunking and hashing data is an I/O-intensive process; an inefficient deduplication process can place a huge burden on server resources and slow down other applications running on the server. This can greatly impact the performance of backup or application servers, depending on where deduplication is being executed, to the point where deduplication makes them virtually unusable or prevents them from scaling to back up large volumes of data.

Deduplication appliances running legacy deduplication technologies also have some challenges. They are focused on providing high ingest rates (the speed at which they can write data to the backup media) so they can meet backup windows. As a result, they write highly fragmented data, creating a “tax” during restores.

Reconstituting these fragmented data chunks can dramatically slow the process of reading and rehydrating data from a deduplicated backup. Recovery performance is a critical criterion for businesses trying to quantify their recovery time objectives in the event of system or site failure.

[1] Three Considerations for Modernizing Data Protection with HPE, ESG, December 2015
[2] Get Protection Guarantee Program at hpe.com/storage/getprotected

Deduplication explained

The deduplication process involves breaking data into smaller blocks (chunking) and generating a unique hash (hashing) for each block, which is then used to decide whether or not to store the chunk. Data deduplication compares chunks of information to detect duplicates, and stores each unique data segment only once.

For this to happen, a deduplication engine assigns a unique identifier to each chunk of data using mathematical hash functions. Once it has identified two chunks of data as identical, the system replaces the duplicate with a link to the original chunk.
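The chunk, hash, and link steps described above can be illustrated with a minimal sketch. It is not the StoreOnce implementation: the class name `DedupStore`, the fixed 4 KB chunk size, and the use of SHA-256 are all assumptions chosen to keep the example short.

```python
import hashlib

CHUNK_SIZE = 4096  # illustrative fixed chunk size in bytes (an assumption)


class DedupStore:
    """Toy in-memory store that keeps each unique chunk exactly once."""

    def __init__(self):
        self.chunks = {}   # chunk hash (hex digest) -> chunk bytes
        self.recipes = {}  # backup name -> ordered list of chunk hashes

    def write(self, name, data):
        """Chunk, hash, and store data; return the number of new chunks."""
        recipe, new_chunks = [], 0
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = chunk  # unique chunk: store it
                new_chunks += 1
            recipe.append(digest)            # duplicates keep only this link
        self.recipes[name] = recipe
        return new_chunks

    def read(self, name):
        """Rebuild (rehydrate) a backup from its list of chunk hashes."""
        return b"".join(self.chunks[d] for d in self.recipes[name])


store = DedupStore()
monday = b"A" * 8192 + b"B" * 4096
tuesday = b"A" * 8192 + b"C" * 4096

first = store.write("monday", monday)     # the "A" block and the "B" block
second = store.write("tuesday", tuesday)  # only the "C" block is new
```

In this sketch the two backups contain six chunks in total, but the store holds only three; each backup is restored by following its list of hashes.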

There are two architectural approaches to chunking. Fixed chunking breaks data into blocks of a fixed size. Variable chunking groups the data into blocks based on patterns in the data itself. The advantage of variable chunking is that it can still recognize duplicates when a small change has merely shifted the data from one backup to the next. Variable chunking is the technique most commonly used today and leads to higher deduplication ratios.

Deduplication involves a combination of three elements:

• The deduplication engine is where the majority of processing takes place. It manages the logic and processing of the backup stream by calculating segments and hash values, identifying unique and repeated segments, and maintaining the hash lookup table.

• The deduplication store is the disk storage location managed by the deduplication engine. It stores the unique (deduplicated) segments and is often physically coupled with the deduplication engine.

• Backup agents (for example, media agents, disk agents, and application agents) manage some of the deduplication processes. Agents can be deployed separately from the deduplication engine to offload the performance impact. Agents can perform tasks such as segmenting the data, calculating the hash value of segments, and sending new data to the engine and the store. The deduplication agent talks to the deduplication engine to calculate which segments are unique.
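The division of work between an agent and the engine can be sketched as a simple exchange: the agent segments and hashes the data, asks the engine which hashes it does not yet hold, and sends only those chunks. The `Engine` class, the `agent_backup` function, and the message shapes below are hypothetical; they do not represent the Data Protector or StoreOnce Catalyst interfaces.

```python
import hashlib


class Engine:
    """Toy deduplication engine: hash lookup table plus chunk store."""

    def __init__(self):
        self.store = {}  # chunk hash -> chunk bytes

    def missing(self, hashes):
        """Return the hashes that the store does not hold yet."""
        return [h for h in hashes if h not in self.store]

    def put(self, chunks_by_hash):
        """Accept new chunks from an agent."""
        self.store.update(chunks_by_hash)


def agent_backup(engine, data, chunk_size=4096):
    """Segment and hash on the agent, then send only what the engine lacks.

    Returns the number of payload bytes sent to the engine. Recording the
    ordered hash list needed for restore is omitted for brevity.
    """
    by_hash = {}
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        by_hash[hashlib.sha256(chunk).hexdigest()] = chunk
    needed = engine.missing(list(by_hash))
    engine.put({h: by_hash[h] for h in needed})
    return sum(len(by_hash[h]) for h in needed)


engine = Engine()
monday = b"".join(bytes([i]) * 4096 for i in range(4))  # four distinct chunks
tuesday = monday + bytes([9]) * 4096                     # one chunk appended

sent_first = agent_backup(engine, monday)     # everything is new
sent_repeat = agent_backup(engine, monday)    # nothing is new
sent_changed = agent_backup(engine, tuesday)  # only the appended chunk
```

The byte counts returned here are what would cross the network between agent and engine, which is why running the agent close to the data reduces bandwidth as well as storage.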

Deduplication can take place at the application source, backup server, or target device.

• Application source deduplication removes redundant data before it is transmitted to the backup target. This type of deduplication reduces storage and bandwidth requirements, as only unique data is transmitted over the wire. However, it can be slower than target deduplication and can increase the workload on servers.

• Backup server deduplication shifts the deduplication execution onto a separate dedicated server to maximize the performance of the target device and minimize the impact on the application server. In this case, it provides network efficiency between the backup server and the target storage.

• Target deduplication runs deduplication processing at the backup target and removes redundant data from a backup stream before storing it on the local store. This method is transparent to the backup application.

HPE StoreOnce and HPE StoreOnce Catalyst

HPE StoreOnce, built upon an HPE-developed deduplication algorithm, is one of the most advanced federated deduplication solutions on the market. StoreOnce implements smart techniques such as variable data chunking, sparse indexing, and container matching to deliver a highly efficient deduplication solution.

Designed using a modular approach for flexibility, StoreOnce provides a common deduplication algorithm that can be deployed as software-only solutions or as dedicated physical or virtual appliances.

The StoreOnce product family offers a broad array of backup target options, including a StoreOnce Virtual Storage Appliance (VSA) and a series of physical appliances that range from cost-effective, single-node systems to highly available, high-capacity multi-node appliances. The target StoreOnce Systems support standard virtual tape library (VTL) and Common Internet File System (CIFS) or Network File System (NFS) interfaces, along with the HPE StoreOnce Catalyst interface.

The HPE StoreOnce Catalyst API is designed to improve the backup and recovery speed, reduce network bandwidth, and maximize storage efficiency. The Data Protector software fully supports writing backup data through all three interface types: VTL, CIFS or NFS, and StoreOnce Catalyst.

HPE Data Protector

HPE Data Protector delivers comprehensive data protection, real-time intelligence, and guided optimization across physical, virtual, and cloud infrastructures, enabling an adaptive backup and recovery environment.

HPE Data Protector backup software allows customers to centrally manage and orchestrate all data protection tasks for data centers and remote offices, manage replication and software store creation, and secure data in local storage and within the network.

The solution provides streamlined management and reporting without having to rethink how backup and restores are performed or taking on additional infrastructure spend. When complemented with HPE Backup Navigator, Data Protector delivers real-time operational intelligence, which enhances management and future planning of backup resources. It also offers analytical reporting, highly intuitive and interactive monitoring dashboards, rapid root-cause analysis and problem solving, and the ability to predict capacity needs to optimize CAPEX and OPEX investments.

Advanced integrations of HPE Data Protector with HPE StoreOnce Systems

Data Protector software in conjunction with StoreOnce Systems solves the challenges associated with traditional deduplication solutions. Data Protector seamlessly works with StoreOnce targets (software only, or both physical and virtual appliances) to deliver the scale, performance, agility, and cost efficiency that customers need to protect their enterprise from the core to the edge.

Together, this integrated data protection solution enables an advanced federated deduplication solution (figure 2) with the flexibility of performing deduplication at the source, backup server, or target system within the environment, depending on performance requirements and business needs.

Figure 2. Using the same patented deduplication technology, HPE Data Protector and HPE StoreOnce appliances enable enterprises to standardize their infrastructure across the entire IT environment

The HPE federated approach supports the notion that deduplication should be performed only once, at the most advantageous location in the environment to reduce the required network bandwidth, without the need to rehydrate data, and managed through a single pane of glass. This unique capability provides maximum flexibility in deployment and maximizes storage efficiency.

Flexible deployment

The StoreOnce federated deduplication capability provides a common modular architecture that can be deployed across a wide range of hardware, on both physical and virtual devices, from the edge of an enterprise to the data center. The StoreOnce technology is application independent, supporting any type of application data or files.

Data Protector leverages the flexible architecture of the StoreOnce deduplication engine to provide a software-based StoreOnce library that allows customers to deploy deduplicated target stores on any industry standard hardware. A single software deduplication store can be shared among multiple clients. Software deduplication can be remotely managed and deployed in remote offices without requiring onsite IT expertise.

Powerful and intelligent deduplication technology

Data Protector software deduplication is powered by StoreOnce, which offers a thin, efficient footprint that minimizes the load on CPU processing and maximizes application availability. And since it uses a small amount of memory, it can be deployed on application or backup servers and even virtual machines, without crippling performance.

A highly efficient adaptive micro-chunking technique segments data into very small blocks, with an average size of four kilobytes. These four-kilobyte chunks are up to one-sixteenth the size of the blocks used by other solutions, which enhances Data Protector’s ability to find commonality in the data stream during deduplication and results in less data stored on disk.
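To show why small, content-defined chunks find more commonality than fixed blocks, the following sketch compares the two approaches on data that has been shifted by a single inserted byte. It uses a generic rolling-hash chunker, not the StoreOnce micro-chunking algorithm; the `GEAR` table, `CUT_MASK`, and the `MIN_SIZE` and `MAX_SIZE` bounds are assumptions, with the mask chosen so that a cut point occurs roughly once every 4 KB.

```python
import hashlib
import random

_table_rng = random.Random(2016)             # fixed seed: repeatable table
GEAR = [_table_rng.getrandbits(64) for _ in range(256)]
MASK64 = (1 << 64) - 1
CUT_MASK = ((1 << 12) - 1) << 52             # top 12 bits: about 1 cut per 4 KB
MIN_SIZE, MAX_SIZE = 1024, 16384             # illustrative chunk size bounds


def variable_chunks(data):
    """Cut where a rolling hash of the most recent bytes matches a pattern."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Each step shifts older bytes toward the top; after 64 steps a
        # byte no longer influences the hash, so cuts depend on local content.
        h = ((h << 1) + GEAR[byte]) & MASK64
        size = i + 1 - start
        if (size >= MIN_SIZE and (h & CUT_MASK) == 0) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])          # trailing partial chunk
    return chunks


def fixed_chunks(data, size=4096):
    """Cut every `size` bytes regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def shared(old, new):
    """Count chunks of `new` whose hash already appears among `old`."""
    seen = {hashlib.sha256(c).digest() for c in old}
    return sum(hashlib.sha256(c).digest() in seen for c in new)


data_rng = random.Random(7)
original = bytes(data_rng.getrandbits(8) for _ in range(256 * 1024))
shifted = b"\x00" + original                 # one byte inserted at the front

fixed_hits = shared(fixed_chunks(original), fixed_chunks(shifted))
variable_hits = shared(variable_chunks(original), variable_chunks(shifted))
print("fixed:", fixed_hits, "variable:", variable_hits)
```

With fixed blocks, the inserted byte moves every block boundary, so no block of the second backup matches the first. With content-defined boundaries, the cut points realign after the change and the remaining chunks are recognized as duplicates.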

The sparse indexing and container matching algorithms reduce the number of times the deduplication engine has to read the data to determine if chunks match. Instead of reading an entire data chunk, these algorithms preview parts of it and compare them to a table of existing chunks stored in memory, improving throughput and reducing processing requirements.
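One general way to keep the in-memory table small is to index only a sample of chunk hashes and use that sample to locate previously stored segments that are likely to contain the rest. The sketch below illustrates this sampling idea under stated assumptions; it is not the StoreOnce sparse indexing or container matching code, and `SparseIndexStore`, `SAMPLE_BITS`, and `MAX_CHAMPIONS` are hypothetical names.

```python
import hashlib
from collections import Counter

SAMPLE_BITS = 4    # keep roughly 1 hash in 16 as a "hook" (illustrative)
MAX_CHAMPIONS = 2  # number of earlier segments to compare against


def is_hook(digest):
    """Sample a hash when its lowest SAMPLE_BITS bits are all zero."""
    return (digest[-1] & ((1 << SAMPLE_BITS) - 1)) == 0


class SparseIndexStore:
    def __init__(self):
        self.sparse_index = {}  # in memory: hook hash -> list of segment ids
        self.manifests = []     # "on disk": per segment, set of chunk hashes
        self.stored_chunks = 0  # chunks physically written so far

    def ingest(self, chunks):
        """Deduplicate one segment (a list of byte chunks); return new count."""
        hashes = [hashlib.sha256(c).digest() for c in chunks]
        hooks = {h for h in hashes if is_hook(h)}

        # Vote for the earlier segments that share the most hooks.
        votes = Counter()
        for hook in hooks:
            for seg_id in self.sparse_index.get(hook, []):
                votes[seg_id] += 1
        champions = [seg for seg, _ in votes.most_common(MAX_CHAMPIONS)]

        # Load only the champions' manifests and deduplicate against them.
        known = set()
        for seg_id in champions:
            known |= self.manifests[seg_id]
        new = set(hashes) - known
        self.stored_chunks += len(new)

        # Record this segment and index its hooks for future lookups.
        seg_id = len(self.manifests)
        self.manifests.append(set(hashes))
        for hook in hooks:
            self.sparse_index.setdefault(hook, []).append(seg_id)
        return len(new)


store = SparseIndexStore()
first_segment = [i.to_bytes(4, "big") * 64 for i in range(256)]
second_segment = first_segment[:248] + [
    i.to_bytes(4, "big") * 64 for i in range(1000, 1008)
]

new_first = store.ingest(first_segment)    # nothing to match against yet
new_second = store.ingest(second_segment)  # only the 8 replaced chunks
```

The trade-off in this design is that the memory-resident index holds a fraction of the hashes, at the cost of occasionally missing a duplicate when no sampled hash points to the segment that contains it.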

Efficient data movement

The StoreOnce technology enables a highly efficient use of system resources and network bandwidth. Since the entire backup stack (the software and backup appliance) uses a single algorithm, data is deduplicated only once and moved from site to site without the need for rehydration. This includes the ability to set different retention times at different sites based on business needs.

Centralized management and control

Data Protector backup software manages and controls the entire data backup and recovery process, from edge to data center, through a single pane of glass, over IP or Fibre Channel networks (figure 3). Centralized management enables IT to deploy, manage, and monitor backup agents at remote office and branch office (ROBO) locations, eliminating the need for specialized IT staff at these locations.

With the StoreOnce Catalyst integration, Data Protector manages and controls deduplication-enabled, multisite replication between sites—for locally or geographically distributed environments. Geographically distributed organizations can take control of the data at its furthest outposts and bring it to the data center in a cost-effective way.

The Data Protector GUI enables IT administrators to create both regular and encrypted StoreOnce stores on any StoreOnce target, and proactively manage the storage capacity of any StoreOnce target by configuring a quota threshold to avoid service interruption due to capacity issues.

Figure 3. Single pane of glass management for backup and recovery, deduplication, and replication across the enterprise

Federated deduplication use cases

Today, much of an organization’s critical information is created and consumed at the ROBO locations. As there is often little to no IT expertise at these small remote locations, they are exposed to data loss, and the subsequent business fallout, because they are not adequately protected. Additionally, the traditional tape-based approach for a remote office is cumbersome, expensive, and labor-intensive.

HPE Data Protector’s federated deduplication capability can be deployed across a range of different scenarios, particularly in a global, remote office environment, which can include hundreds of small, medium, and large remote offices with differing backup and recovery needs.

A small remote office protection

Small offices often have limited space, IT infrastructure, and IT staff. These standalone offices generally have a small number of applications and servers (fewer than five servers) with a relatively small amount of data that needs to be protected. And, in many cases, the network connectivity between remote sites and the data center is limited to T1 lines (1.54 Mbps) or lower.

The recommended backup strategy in these scenarios is to use application source deduplication (figure 4) with Data Protector and StoreOnce Catalyst to back up data directly to a StoreOnce appliance in the central data center, eliminating the need for a backup appliance onsite. Since the data is deduplicated before it is sent across the wire, only the unique data is transferred, dramatically reducing backup windows, especially in high-latency networks.
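A rough calculation shows the effect on the backup window. The sketch below assumes a hypothetical 50 GB backup sent over the 1.54 Mbps link mentioned above and applies the 95 percent reduction figure cited in the introduction; actual reduction depends on the data and is typically lower for a first full backup. Protocol overhead and the hash exchange are ignored.

```python
def transfer_hours(data_gb, link_mbps, reduction=0.0):
    """Hours needed to send data_gb gigabytes over a link_mbps link.

    reduction is the fraction of data removed by deduplication before
    transmission (0.95 means only 5 percent of the bytes are sent).
    """
    bits_to_send = data_gb * 1e9 * 8 * (1.0 - reduction)
    seconds = bits_to_send / (link_mbps * 1e6)
    return seconds / 3600.0


T1_MBPS = 1.54   # link speed quoted in the text
BACKUP_GB = 50   # hypothetical backup size, chosen for illustration

without_dedup = transfer_hours(BACKUP_GB, T1_MBPS)
with_dedup = transfer_hours(BACKUP_GB, T1_MBPS, reduction=0.95)
print(f"no deduplication:     {without_dedup:.1f} h")
print(f"source deduplication: {with_dedup:.1f} h")
```

Under these assumptions the transfer drops from about 72 hours to under 4 hours, which is the difference between a backup that cannot complete overnight and one that can.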

When multiple remote sites back up to the same store in the primary data center, cross-site deduplication can improve efficiency even further. For example, if all the different remote sites have the same file, it is stored only once.

With the StoreOnce Catalyst integration, Data Protector enables IT staff to manage the entire data backup and recovery operation from the central site, including the deployment of backup agents on the remote site. The extremely lean and efficient StoreOnce engine allows applications and software deduplication to coexist on the same server without crippling performance.

The StoreOnce algorithm delivers a higher deduplication ratio with a smaller chunk size, improving overall storage efficiency for most enterprise applications.

Figure 4. Application source deduplication reduces the network bandwidth and can eliminate the need for storage and server at the remote site providing a cost-effective dedupe solution for small environments

Figure 5. Backup server deduplication minimizes the impact on application performance and maximizes the performance of the target device

Medium-sized regional offices with local recovery requirements

Data Protector offers deduplication at the backup server level (figure 5), which helps to support complex configurations in regional offices. These offices often have a relatively large number of servers and large data sets with local recovery needs. A backup server is essentially a backup client with a Data Protector media agent installed and running the deduplication task and other standard media management tasks, such as mirroring using object copy. Running deduplication tasks on a dedicated server minimizes the impact on application performance and maximizes the performance of the target device.

The Data Protector media agent can run on most leading operating systems: Windows, Linux, and UNIX. A server-side deduplication strategy is very useful in medium-sized (5–15 servers) remote offices that have local recovery requirements. The Data Protector StoreOnce store can be easily created on the backup server, which backs up data locally. The data can be backed up on a local store at the backup server and then replicated to the primary data center for disaster recovery (DR) purposes.

The entire data backup and replication is done using the same algorithm and it is centrally managed via the Data Protector console. This approach reduces the load on the application servers and provides local recovery on the remote sites. Since only unique data is transferred from the remote site to the primary data center, the network bandwidth is used very efficiently, reducing the backup window, especially in high-latency networks.

Large enterprises with multiple remote offices and data centers

Large data centers that are connected to several remote sites are more complex than a single remote office location. The data centers typically have a large number of applications, different platforms and storage arrays, physical and virtual server environments, and generally have IT expertise to support this infrastructure. In these environments, the StoreOnce Backup appliance store can be deployed to back up data center applications and remote site data. Multi-node, highly scalable StoreOnce appliances with built-in availability are the best-suited solutions in this case.

Data Protector can centrally manage the entire backup and replication process. Through StoreOnce Catalyst, Data Protector can trigger the replication process on a multi-node StoreOnce System and catalog this information. With Data Protector, IT can centrally manage the entire backup, recovery, and replication process with maximum storage efficiency in large enterprise environments (figure 6).

Figure 6. Data protection in large enterprise environment with multiple remote sites and data centers

Figure 7. StoreOnce deduplication significantly reduces the amount of storage required for backing up virtual server environments

Virtual environment protection

As virtualization technologies become more mature and reliable, IT organizations are increasingly deploying mission-critical applications in virtual environments, and these environments require data protection. A large virtual environment can have thousands of virtual machines running the same operating system (for example, Microsoft Windows or Linux). The duplication of information within virtualized datastores is driving enormous consumption of backup storage resources and the associated capital expenditures.

Data Protector combined with StoreOnce provides many advanced options to protect virtual environments (figure 7). Data Protector’s policy-based protection for applications and virtual environments automates and simplifies virtual environment protection. It also frees up IT staff for high-priority projects that drive business growth. Data Protector’s deduplication capabilities offer significant cost savings through storage efficiency by eliminating the redundant operating system information across backup images and guest profiles, while still providing fast recovery of any data within the backup image.

Data Protector provides application-aware, array-based snapshots for virtual environments on HPE 3PAR and third-party storage arrays, ensuring business continuity for 24x7 global operations. Through a single pane of glass, Data Protector can manage the entire backup and recovery process across any hypervisor, including snapshots and replication in VMware, Microsoft Hyper-V, and Citrix® Xen environments.

Conclusion

The HPE Data Protector backup software offers highly flexible, centrally managed, and efficient data protection for any enterprise. It is often combined with the StoreOnce deduplication technology, which provides a common architecture across software and hardware, at remote sites and in the data center, enabling efficient deduplicated data movement from edge to core without having to rehydrate at multiple deployments. Powered by the advanced StoreOnce deduplication engine, Data Protector and StoreOnce appliances deliver federated deduplication that enables deduplication of data at any location in the backup stack.

Data Protector software provides the single point of management for the entire data movement: backup, replication, and data recovery. Together, Data Protector and StoreOnce can help organizations maximize their critical storage resources through highly efficient deduplication while meeting stringent business service-level agreements and minimizing backup infrastructure-related costs.

Learn more at hpe.com/software/dataprotector

© Copyright 2011–2012, 2016 Hewlett Packard Enterprise Development LP. The information contained herein is subject to change without notice. The only warranties for Hewlett Packard Enterprise products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. Hewlett Packard Enterprise shall not be liable for technical or editorial errors or omissions contained herein.

Microsoft and Windows are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. UNIX is a registered trademark of The Open Group. Citrix is a registered trademark of Citrix Systems, Inc. and/or one more of its subsidiaries and may be registered in the United States Patent and Trademark Office and in other countries. Linux is the registered trademark of Linus Torvalds in the U.S. and other countries. VMware is a registered trademark or trademark of VMware, Inc. in the United States and/or other jurisdictions. Oracle is a registered trademark of Oracle and/or its affiliates. SAP is the trademark or registered trademark of SAP SE in Germany and in several other countries.

4AA3-8728ENW, April 2016, Rev. 2