EMC DL1500 and DL3000 With IBM Tivoli Storage Manager

Best Practices Planning

Abstract

This white paper contains a compilation of specific configuration and best practices information for the EMC® DL1500 and DL3000 with IBM Tivoli Storage Manager.

July 2009

Copyright © 2008, 2009 EMC Corporation. All rights reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.

All other trademarks used herein are the property of their respective owners.

Part Number h5894.1

Table of Contents

Executive summary
Introduction
    Audience
    Terminology
IBM Tivoli Storage Manager
    Storage pools
    Progressive incremental backup
    Daily schedule
        TSM database
        TSM reclamation
        TSM migration
    Settings and considerations
DL overview
    The deduplication process
    Deduplication considerations
        Backup policies
        Change rate
        Data types
        Compression
        Encryption
        Multiplexing
    Sizing
    Space management
        Native format data
        Deduplicated data
    Phasing deduplication into the environment
Network connectivity
    Best practices
Using the DL in a NAS environment
    General recommendations
    LAN backup-to-disk
    Mounting NFS shares
        AIX 5.2
        HP-UX 11i
        Linux
        Solaris
    Configuring CIFS shares
Using the DL in a SAN environment
    General recommendations
        Expiring volumes
    DL SAN attach
        Device discovery (persistent binding)
        Device drivers
    Configuring the backup server
    Impact on TSM daily activities
        Client backup
        Backup storage pools
        Database backup
        TSM reclamation
Replicating deduplicated data
    Replication considerations
    Seeding (pre-populating the target system)
    Space management considerations
    Recovery/Failback
Conclusion
References
Appendix: Using replication with Tivoli Storage Manager
    DL setup
    TSM setup
    Authentication setup
        Create the keys
        Install the public key on the DL
        Test the keys
    TSM post-backup script example
    Recovery from a replicated NAS share
    Recovering replicated cartridges
        To recover at the target site
        Clean up

Executive summary

The EMC® DL1500 and DL3000 appliances provide simple and reliable disk-based backup and recovery systems with the added feature of deduplication. The DL products with deduplication integrate seamlessly into your current backup environment to provide leading-edge backup and restore operations. The DL was designed and implemented for optimal performance and ease of use, and easily scales to meet your storage needs.

Introduction

This white paper provides specific configuration and best practices recommendations for the DL1500 and DL3000 appliances when used with Tivoli Storage Manager (TSM) in NAS and SAN environments.

Audience

This white paper is intended for EMC customers, system engineers, and members of the EMC and partner professional services community who are interested in configuration and best practices information when using DL appliances with TSM.

Terminology

• Backup-to-disk (B2D) – A backup solution where data is written to hard disk instead of tape.
• Balance-rr mode – A “round-robin” algorithm for transmitting and receiving data to and from peer systems. This mode is also referred to as Mode 0 (see also Link Aggregation Control Protocol).
• Block pool – Deduplicated data stored on the DL back-end array.
• Cartridge-level replication – Process of verifying that the contents of an individual cartridge in a source VTL are present in a target VTL, sending the metadata description of the source cartridge, and rendering that cartridge as a readable image on the target DL.
• CIFS – Protocol for a Windows-based network fileshare.
• Data Protection Advisor (DPA) – An EMC data protection management solution that provides automatic and continuous data collection, conditional analysis that triggers alerts, and a single, consistent interface for reporting.
• Deduplication – Process of detecting and identifying the redundant variable-length blocks (or data segments) within a given set of data to eliminate redundancy.
• Directory/file-level replication – Process of verifying that the contents of a directory or an individual file in a NAS share on the source system are present in a NAS share on the target system, sending the metadata description of the source directory or file, and rendering that directory/file as readable data on the target DL.
• Failback operation – The replicated NAS share or VTL is transferred from the target system to the source system.
• Immediate deduplication – Process of deduplicating the backup datastream as it is ingested by the DL appliance (see also scheduled deduplication).
• Ingest – Process of receiving backup data from the backup server.
• Link Aggregation Control Protocol (LACP) – Method for controlling the bundling of several physical Ethernet ports together to form a single logical channel. LACP allows a network device to negotiate an automatic bundling of links by sending LACP packets to the peer (a directly connected device that also implements LACP). This mode is also referred to as Mode 4 (see also balance-rr mode).
• Logical Unit Number (LUN) – Identifying number of a SCSI object that processes SCSI commands.
• Locked – State in which a target NAS share or VTL is unavailable for updating with the latest replicated data. Any data replicated to a directory or file within a locked share or to a cartridge within a locked VTL will be placed in an update queue on the target system (see also unlocked).
• Metadata description – Sequence of unique blocks and pointers that represent the original data. The receipt of this metadata description by the target system results in the generation of a recoverable image.
• Namespace replication – Process of verifying that all data associated with the source is present at the target followed by replication of the metadata description of the source.
• NAS – Network-attached storage.
• Native format data – Uncompressed backup data that has not been deduplicated.
• NFS – Protocol for cross-platform network fileshares.
• Partial replication – The recovery point representing a VTL or NAS share or a readable image representing a cartridge or directory/file on the target system is missing data present in its source equivalent.
• Policy-based deduplication – The choice of immediate, scheduled, or no deduplication.
• Recovery operation – The replicated NAS share or VTL (recovery point) is made available for use on the target DL system.
• Recovery point – A replicated NAS share or VTL on the target system whose metadata description has also been replicated. Also, a replicated virtual tape or directory/file on the target system.
• Remote replication – Backup data residing on a DL is copied over a LAN or WAN to another DL for disaster recovery protection.
• Scheduled deduplication – Deduplication of data occurs after the backup stream has been ingested by the DL (see also immediate deduplication).
• Shoeshining – An interruption in the datastream to physical tape drives that requires the tape to be repositioned on the drive’s head in a manner that causes a stop-start or shoeshine motion of the tape device. This behavior is due to an inadequate data flow to the tape drive and slows overall performance.
• Source-based deduplication – Deduplication of backup data occurs at the client before it is sent to the primary storage system.
• Sync ID – Tag by which a source-target destination is identified in a DL replication configuration for cartridge-level or directory/file-level namespace replication.
• Target-based deduplication – Deduplication of backup data occurs at the target storage system.
• Truncation – DL space management function that removes the oldest blocks of data from storage having a deduplication equivalent when used capacity exceeds 70 percent.
• Unlocked – State in which a target NAS share or VTL is available for updating with the latest replicated data. If a locked NAS share or VTL is unlocked, any directory/file or cartridge with queued replication data will be updated at that time (see also locked).
• Virtual tape library (VTL) – Software emulation of a physical tape library system.

IBM Tivoli Storage Manager

IBM Tivoli Storage Manager (TSM) is based on a client/server architecture. TSM manages client backup data in increments. The backup data originates from a single client and may consist of a single file, directory, file system, or partition.

Storage pools

TSM stores data in storage pools. A storage pool is a group of storage media of the same type, on which backup and archive data can be stored. Multiple storage pools can be created, each with a specific device type and associated policies. TSM controls these repositories and the flow of data between them.

The three types of storage pools are primary, copy, and active data.

• Primary storage pools receive backup and archive data written to them by the TSM server or client.
• Copy storage pools are created by copying data from a primary storage pool. The typical application for this copy is offsite storage of data for business continuance or disaster recovery.
• Active data pools are repositories for active versions of backup data that are in primary storage pools. Active backup versions exist in both a primary storage pool and one or more active data pools. The active data pool is populated either synchronously with the backup or asynchronously.

Progressive incremental backup

Other backup applications use a scheme of full and incremental backups. A full backup backs up all the files on a given client or file system. An incremental backup backs up only those files that have changed since the last full or incremental backup. In this scheme, there is usually a full backup once per week, with incremental backups on the remaining days of the week.

TSM backups, on the other hand, are usually incremental forever, which TSM refers to as progressive incremental. In this scheme, the first time a given client or file system is backed up, all data is backed up, resulting in a traditional full backup. Each subsequent backup of that client or file system is incremental: TSM does not back up files that have not changed. Although this backup method reduces network traffic and enables faster backups and more cost-effective media utilization, it does require periodic consolidation of files to reduce the number of separate media needed for a recovery. While media consolidation can be performed outside the backup window, it ties up tape drives and media that would otherwise be available for other activities.
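To make the difference in data sent over the network concrete, the following sketch compares the two schemes over four weeks. The 1,000 GB client size and 2 percent daily change rate are illustrative assumptions, not figures from this paper, and the model ignores retention, expiration, and media consolidation.

```python
def weekly_full_scheme(client_gb, daily_change, days):
    """Data sent when every 7th day is a full backup and the rest are incrementals."""
    total = 0.0
    for day in range(days):
        total += client_gb if day % 7 == 0 else client_gb * daily_change
    return total


def progressive_incremental(client_gb, daily_change, days):
    """Data sent with one initial full backup followed by incrementals only."""
    return client_gb + client_gb * daily_change * (days - 1)


CLIENT_GB = 1000      # assumed client size
DAILY_CHANGE = 0.02   # assumed fraction of data that changes each day
DAYS = 28

traditional = weekly_full_scheme(CLIENT_GB, DAILY_CHANGE, DAYS)
progressive = progressive_incremental(CLIENT_GB, DAILY_CHANGE, DAYS)
print(f"weekly full + incrementals: {traditional:.0f} GB sent")
print(f"progressive incremental:    {progressive:.0f} GB sent")
```

Under these assumptions the weekly-full scheme sends 4,480 GB and the progressive incremental scheme sends 1,540 GB over the same 28 days.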

Daily schedule

In the TSM environment, there are several administrative operations that should occur on a regular basis, usually in a specific sequence. In a 24-hour day, these administrative operations typically occur outside of the backup window and include database backup, expirations, reclamation, and migration. Because these administrative tasks require the use of the tape library and drives, scheduling can sometimes be problematic if not planned properly. If a client backup exceeds its window, some tasks in the administrative schedule may be compromised. Optimally, overlapping of the backup window and the daily administrative tasks should be avoided.

TSM database

TSM maintains a database of details about the backup objects (files, subfile components, directories, volumes, and so on) and their associated management policies from each of its clients, as well as a mapping to the volumes containing the backup data. The database facilitates the ability to locate and recover data rapidly. The TSM server keeps this database updated and cataloged through the backup/restore process. It is backed up daily.

TSM reclamation

Backups and archives in TSM are policy-based. These policies govern:

• Which files on a server will be backed up or archived
• To which media the files will be backed up or archived
• How many versions of various files will be retained
• How long the versions will be retained

Because TSM is policy-based, backup objects, or versions, are expired rather than the entire backup or individual piece of media. As backup objects on a tape cartridge expire, the cartridge becomes more and more logically empty. Eventually the amount of data remaining on a tape drops below a predefined threshold and needs to be reclaimed. The TSM reclamation process mounts a tape to be reclaimed, mounts another tape with free space, and copies all the remaining valid data from one tape to another. The tape cartridge that was the source for the reclamation is now empty and can be reused as a scratch tape.

This TSM reclamation process is usually scheduled to run once per day and executes automatically on those tapes whose content is below the predefined reclamation threshold. TSM reclamation ensures that data is stored efficiently and that the media is optimized for restoration.

If active data pools are used, reclamation must run regularly to remove inactive versions from the active data pool. A version becomes inactive whenever a new version of that object is received from the client.
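The reclamation cycle described above can be modeled in a few lines. This is a simplified sketch rather than TSM's implementation: the `Volume` class, the `reclaim` function, the 40 percent threshold, and the volume names are all hypothetical, and a real TSM server expresses and evaluates its reclamation threshold through storage pool settings.

```python
from dataclasses import dataclass, field

RECLAIM_THRESHOLD = 0.40   # assumed: reclaim volumes holding less than 40% valid data


@dataclass
class Volume:
    name: str
    capacity: int
    objects: dict = field(default_factory=dict)   # object id -> size of unexpired data

    @property
    def used(self):
        return sum(self.objects.values())

    @property
    def valid_fraction(self):
        return self.used / self.capacity


def reclaim(volumes, target):
    """Copy remaining valid objects from sparse volumes to target; return emptied volumes."""
    scratch = []
    for vol in volumes:
        if vol is target or not vol.objects:
            continue
        fits = target.used + vol.used <= target.capacity
        if vol.valid_fraction < RECLAIM_THRESHOLD and fits:
            target.objects.update(vol.objects)   # copy valid data to the target volume
            vol.objects.clear()                  # source is now empty
            scratch.append(vol.name)
    return scratch


tape_a = Volume("A00001", 100, {"f1": 10, "f2": 5})    # 15% valid: below threshold
tape_b = Volume("A00002", 100, {"f3": 60, "f4": 25})   # 85% valid: left alone
tape_c = Volume("A00003", 100, {"f5": 20})             # volume with free space
freed = reclaim([tape_a, tape_b], tape_c)
print("returned to scratch:", freed)
```

Only the sparsely filled volume is emptied and returned to scratch; the well-utilized volume is not touched, which is why reclamation work grows as more objects expire.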

TSM migration

Migration is a daily administrative process where backup objects are migrated automatically from one storage pool to another based on the pool’s storage policies. Typically, TSM administrators will design the storage hierarchy to back up data to a primary disk pool (for performance) and then migrate the data to a primary tape storage pool during off-peak hours.

Settings and considerations

The following describes several Tivoli Storage Manager settings and best practices for optimizing the backup environment:

• Avoid running disk-intensive applications such as virus scanning on the backup client when it is backing up or restoring files.

• A single client can concurrently back up multiple images as parallel backup streams. Using multiple parallel streams increases the aggregate throughput of the backup process and shortens the backup window for the client.

• To achieve optimal deduplication, do not enable client compression. Data should be uncompressed when transmitted to the DL to increase the matches the deduplication engine can find in the datastream. Compressing data prior to sending it to the DL will most likely result in negligible redundancy and poor deduplication ratios. EMC recommends turning off client-side compression.

• For backups to NFS or CIFS shares defined on the DL, use sequential access file type device class disk pools.

• Do not use alternate pathing for tape drives.
• Delayed failover is required when using VTLs.
• For backups to VTLs, set up each TSM server with its own virtual library rather than use the library manager model. This will allow each server to be its own library manager and facilitate recovery.

DL overview

The DL1500 and DL3000 are stand-alone backup-to-disk (B2D) appliances offering a network-attached storage (NAS) front end (over IP) and an optional virtual tape library (VTL) front end (over FC). Both configurations of the DL include data deduplication to efficiently store and archive data written to the appliance. These appliances can be configured to present an NFS share, CIFS share, or VTL to the backup server. When enabled, both NAS B2D and VTL functionality can be used simultaneously. With policy-based deduplication, the data can be deduplicated as soon as the backup stream is sent to the appliance or postponed until the backup window has closed.

With its data deduplication capability, the DL appliance:

• Eliminates redundant data from backups to reduce storage, enabling longer onsite retention, and reduced replication costs

• Performs subfile comparisons on variable length data blocks to capture small block inserts and overstrikes in unstructured data

• Provides policy-based deduplication that is customer-tunable to adjust to changing backup environments and optimize backup performance

• Understands backup software metadata to optimize the backup process
• Includes built-in data compression that is additive to deduplication in the data reduction process

The DL appliance’s replication option leverages the deduplication capability of the DL, substantially reducing the amount of backup data that needs to be migrated to a remote site. DL replication provides rapid local and remote restore with the following benefits:

• Supports replication between any combination of DL1500 and DL3000 appliances
• Permits bi-directional replication between DL appliances
• Replicates deduplicated NAS shares and/or virtual tape libraries to reduce bandwidth
• Checks and validates each data block before replication so that only unique data blocks are sent to the target DL to further reduce network traffic
• Supports up to 10 (5 in DL version 1.2 and later) source DLs to one target DL

Replication of a target DL to another DL that is not one of its sources is not supported.

• Maintains a common data deduplication repository at the target DL for maximum storage savings
• Replicates entire VTLs or NAS shares or individual tape cartridges (for VTL) and files (for NAS shares)
• Automatically recovers replicated tape cartridges and files on the target system
• Defers tape cartridge/file replication during a backup window when a scheduled deduplication is used
• Provides detailed replication reporting
• Provides optional inline 128-bit AES data encryption for replicated data

The deduplication process

The deduplication process divides the native data into variable length blocks and calculates a signature for each block. For each signature, the deduplication process determines if any other identical signature already exists, and if not, hardware-compresses the unique block and writes it to its data pool. If it finds any matches to these signatures, it replaces that redundant block with a pointer to the unique block so that only one copy of that data block will exist on disk. The metadata description (signatures and pointers) of the data is stored separately from the backup data on a RAID 6 LUN on the disk array in the DL3000 and on a RAID 1 LUN in the DL1500. A metadata description is stored for all data deduplicated within the DL.

As additional data is written to the DL, its data is also broken down into variable length blocks whose signatures are compared against those for existing unique blocks. Only new unique blocks are stored along with pointers to existing unique blocks in the appropriate sequence that can be used to reconstitute the data.
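The process described above can be sketched in a few lines. This is a minimal illustration, not the DL's implementation: the window-sum boundary rule, the block size limits, SHA-256 as the signature, and zlib standing in for hardware compression are all assumptions chosen for brevity, and `split_blocks` and `BlockPool` are hypothetical names.

```python
import hashlib
import zlib

WINDOW = 16                     # bytes of trailing context that decide a boundary
MASK = 0x3F                     # cut when the low 6 bits of the window sum are all ones
MIN_BLOCK, MAX_BLOCK = 32, 256  # assumed limits on block length


def split_blocks(data: bytes):
    """Yield variable-length blocks whose boundaries depend on the content itself."""
    start = 0
    for i in range(len(data)):
        size = i - start + 1
        if size < MIN_BLOCK:
            continue
        window_sum = sum(data[i - WINDOW + 1:i + 1])
        if (window_sum & MASK) == MASK or size >= MAX_BLOCK:
            yield data[start:i + 1]
            start = i + 1
    if start < len(data):
        yield data[start:]


class BlockPool:
    def __init__(self):
        self.blocks = {}   # signature -> compressed unique block

    def ingest(self, data: bytes):
        """Store unseen blocks; return the metadata description (ordered signatures)."""
        recipe = []
        for block in split_blocks(data):
            signature = hashlib.sha256(block).hexdigest()
            if signature not in self.blocks:             # unique block: compress and store
                self.blocks[signature] = zlib.compress(block)
            recipe.append(signature)                     # pointer to the unique block
        return recipe

    def restore(self, recipe):
        """Reconstitute the original data by following the pointers in order."""
        return b"".join(zlib.decompress(self.blocks[s]) for s in recipe)


pool = BlockPool()
monday = b"".join(b"record %05d: customer data\n" % n for n in range(400))
recipe1 = pool.ingest(monday)
stored_after_first = len(pool.blocks)
recipe2 = pool.ingest(monday)      # second backup of unchanged data
print(stored_after_first, len(pool.blocks))
```

The second ingest of unchanged data adds no blocks to the pool; only a second list of pointers is recorded, which is the behavior the text describes for redundant data.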

Deduplication considerations

The ratio of the amount of storage space required to store the total number of save sets in a conventional disk storage system to the storage capacity used in a deduplication system is referred to as the deduplication ratio. There are many factors that affect deduplication ratios. Some key factors include:

Backup policies

Retaining data for longer periods of time improves the chance that common data will already exist in storage, resulting in greater storage savings and better deduplication ratios.

Change rate

When the first datastream is deduplicated, the number of unique blocks within it can vary widely depending on the data type. The deduplication process may find anywhere from little or no data redundancy to 50 percent or more.

When multiple backups of the same policy are written to the DL, however, storage savings often increase significantly with each backup as only those data blocks that are unique to each backup need to be written to disk. In conventional business operations, this unique data may represent only 1-2 percent of the data present in each additional backup set. The remainder of the backup consists of pointers to data already present on the DL.
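As a worked example of how retention and change rate drive the deduplication ratio, the sketch below divides the native capacity of all retained backups by the capacity actually stored. The 1,000 GB backup size, 30 percent redundancy in the first backup, and 2 percent unique data per later backup are assumptions for illustration. The model treats every backup as a full-size backup set and ignores hardware compression and metadata overhead.

```python
def dedup_ratio(backup_gb, first_redundancy, change_rate, backups):
    """Native capacity of all retained backups divided by capacity stored after dedup."""
    native = backup_gb * backups
    first_stored = backup_gb * (1 - first_redundancy)      # unique blocks in backup 1
    later_stored = backup_gb * change_rate * (backups - 1) # unique blocks added afterwards
    return native / (first_stored + later_stored)


week = dedup_ratio(1000, 0.30, 0.02, 7)     # 7 retained backups
month = dedup_ratio(1000, 0.30, 0.02, 30)   # 30 retained backups
print(f"7 backups retained:  {week:.1f}:1")
print(f"30 backups retained: {month:.1f}:1")
```

With these inputs the ratio rises from about 8.5:1 at 7 retained backups to about 23.4:1 at 30, which illustrates why longer retention tends to improve the ratio.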

Data types

Backups of user data such as text documents, PowerPoint presentations, spreadsheets, some database types, source code, and Exchange are known to contain redundant data and are good deduplication candidates.

Other data sources such as audio, video, and scanned images consist of unique data and are not good deduplication candidates unless backed up multiple times.

Compression

Compression does not affect deduplication of files that do not change. However, compression can cause even a small change in a file to ripple throughout the compressed file, greatly reducing the commonality that can be found between current and prior versions, or similar but not identical documents on different machines. As a result, software client-side compression prevents optimal deduplication from taking place since data that is already compressed cannot be efficiently deduplicated once it reaches target storage.

Data should be uncompressed when transmitted to the DL to increase the matches the deduplication engine can find in the datastream. Compressing data prior to sending it to the DL will most likely result in negligible redundancy and poor deduplication ratios. The DL, in addition to deduplication, hardware-compresses the deduplicated data when writing to storage.
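The ripple effect can be observed with a small experiment. The sketch below is an illustration under stated assumptions: zlib stands in for client-side compression, fixed 64-byte blocks stand in for the DL's variable-length blocks, and the sample text and the `shared_fraction` helper are invented for the example.

```python
import random
import zlib

BLOCK = 64  # fixed-size blocks keep the comparison simple; the DL uses variable-length blocks


def shared_fraction(old: bytes, new: bytes) -> float:
    """Fraction of the old version's blocks that reappear unchanged at the same offset."""
    old_blocks = [old[i:i + BLOCK] for i in range(0, len(old), BLOCK)]
    new_blocks = [new[i:i + BLOCK] for i in range(0, len(new), BLOCK)]
    same = sum(a == b for a, b in zip(old_blocks, new_blocks))
    return same / len(old_blocks)


rng = random.Random(0)
words = [b"backup", b"client", b"server", b"policy",
         b"volume", b"restore", b"schedule", b"storage"]
version1 = b" ".join(rng.choice(words) for _ in range(4000))
version2 = version1[:100] + b"X" * 10 + version1[110:]   # 10-byte edit near the start

raw_shared = shared_fraction(version1, version2)
compressed_shared = shared_fraction(zlib.compress(version1), zlib.compress(version2))
print(f"blocks in common, uncompressed: {raw_shared:.1%}")
print(f"blocks in common, compressed:   {compressed_shared:.1%}")
```

Uncompressed, the two versions differ in a single block. After compression the edit alters the encoded stream well beyond the changed bytes, so far fewer blocks line up between the two versions.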

Encryption

Encryption does not affect deduplication of files that do not change between backups. However, encryption causes even a small change in a file to ripple throughout the encrypted file, greatly reducing the commonality that can be found between current and prior versions, or similar but not identical documents on different machines. Changing the data zone encryption passphrase will also change the encrypted form of every document, preventing deduplication from finding commonality between current and prior versions. As a result, software client-side encryption prevents optimal deduplication from taking place since data that is already encrypted cannot be efficiently deduplicated once it reaches target storage.


To achieve encryption of backup data, consider the use of an encryption appliance deployed between the DL and the physical tape library.

If encryption is required without a physical encryption appliance, avoid changing the data zone passphrase frequently, so that the current version of an unchanged file is more likely to be identical to the prior version. If poor deduplication ratios are achieved, it may be appropriate to create a dedicated NAS share or VTL with deduplication turned off, to receive those backups.

Multiplexing

Multiplexing is typically used for several reasons:

• To avoid shoeshining on a physical tape drive
• To increase concurrency for a given number of devices
• To reduce complexity and user management effort for a given level of concurrency
• To shorten the backup window by writing save sets from multiple clients to a single drive simultaneously
• To reduce the load on the server

While shoeshining does not apply to a DL, the other reasons do.

Multiplexing interleaves backup streams, writing a little of datastream 1, then a little of datastream 2, and so on, so that none of the clients sending data need to wait for the other clients to finish. Unfortunately, this interleaving of the datastreams has a significant impact on deduplication efficiency. Multiplexed streams hinder the deduplication process from efficiently identifying blocks of common data because of the additional header information added to the data. In order to realize the full benefit of deduplication, EMC strongly recommends multiplexing be turned off when using the DL.
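The effect described above can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not a description of the DL's deduplication engine: it assumes fixed 4 KB blocks, a hypothetical 8-byte header per multiplexed segment, and synthetic client data. It backs up two unchanged clients a second time, once as separate streams and once multiplexed, and counts how many blocks are already in the block pool.

```python
import hashlib

BLOCK = 4096  # assumed fixed block size, for illustration only


def pseudo_data(seed, size):
    """Deterministic filler standing in for one client's backup stream."""
    out = bytearray()
    counter = 0
    while len(out) < size:
        out += hashlib.sha256(f"{seed}-{counter}".encode()).digest()
        counter += 1
    return bytes(out[:size])


def block_hashes(stream):
    """Hash every fixed-size block of a stream."""
    return [hashlib.sha256(stream[i:i + BLOCK]).digest()
            for i in range(0, len(stream), BLOCK)]


def multiplex(streams, segment=3000):
    """Interleave streams segment by segment, adding a header to each segment."""
    out = bytearray()
    offset = 0
    longest = max(len(s) for s in streams)
    while offset < longest:
        for stream_id, s in enumerate(streams):
            chunk = s[offset:offset + segment]
            if chunk:
                # Hypothetical header: marker, stream id, segment length.
                out += b"HDR" + bytes([stream_id]) + len(chunk).to_bytes(4, "big")
                out += chunk
        offset += segment
    return bytes(out)


client_a = pseudo_data("a", 64 * BLOCK)
client_b = pseudo_data("b", 64 * BLOCK)

# Block pool after a first, non-multiplexed backup of both clients.
pool = set(block_hashes(client_a)) | set(block_hashes(client_b))


def matches(stream):
    """Count blocks of `stream` that are already present in the pool."""
    return sum(h in pool for h in block_hashes(stream))


plain_hits = matches(client_a) + matches(client_b)
mux_hits = matches(multiplex([client_a, client_b]))
print(f"separate streams: {plain_hits} blocks already stored")
print(f"multiplexed:      {mux_hits} blocks already stored")
```

In this model the interleaving and the added headers shift and mix the data, so blocks in the multiplexed stream no longer line up with blocks stored earlier. An engine that uses variable-length blocks may be affected to a different degree.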

Sizing

Storage capacity needs to be sized to adequately handle the amount of data anticipated to be retained in both native and deduplicated format. Backups that are larger than expected or contain data that deduplicates poorly can require much more storage space. Experience and analysis have shown that there are three distinct use cases for the DL deduplication products. Each use case has its own storage capacity needs. Some space must always be available for new data. That amount of space, however, will depend on how the system is used.

• Immediate deduplication. With an immediate deduplication policy, deduplication may not keep up with ingest. Systems using an immediate deduplication policy ideally should include a storage buffer of 50 percent of the largest amount of data ingested on a single day during the week.

• Immediate deduplication with fast read/restore. To ensure that data required for rapid restore is available in native format, these systems include a storage buffer that accounts for the truncation threshold.

• Scheduled deduplication. These systems include a storage buffer that is sized for the largest backup in native format on a single day during the week.
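The buffer rules for the immediate and scheduled policies can be expressed as simple arithmetic. The Python sketch below is illustrative only: the function name, the policy labels, and the weekly ingest figures are assumptions, and it treats each day's figure as that day's native-format ingest. The fast read/restore case is left out because its buffer depends on the truncation threshold, for which no formula is given here. A sizing assessment remains the authoritative method.

```python
def storage_buffer_gb(daily_ingest_gb, policy):
    """Estimate the storage buffer for one week of native-format ingest.

    daily_ingest_gb: native data ingested on each day of the week, in GB.
    policy: "immediate" or "scheduled" (labels chosen for this sketch).
    """
    peak = max(daily_ingest_gb)
    if policy == "immediate":
        # 50 percent of the largest single-day ingest during the week.
        return 0.5 * peak
    if policy == "scheduled":
        # Room for the largest single-day backup in native format.
        return float(peak)
    raise ValueError(f"no buffer rule modeled for policy {policy!r}")


# Hypothetical week with a large full backup on the sixth day.
week = [400, 420, 380, 450, 410, 1200, 300]
print("immediate:", storage_buffer_gb(week, "immediate"), "GB")
print("scheduled:", storage_buffer_gb(week, "scheduled"), "GB")
```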

EMC strongly recommends performing a sizing assessment when including replication in the backup environment. The assessment also recommends the additional amount of storage to add for TSM reclamation operations.

The sizing assessment models the ingest performance and capacity required to meet the specified retention period. If replication is selected, it also reports the replication times calculated for all days in the retention period for the source system. The sizing assessment can help determine if replication can occur in the desired timeframe based on the replication network bandwidth, latency of the connection between the source and target systems, and estimated amount of data to replicate on each day. If the calculated replication times exceed the required replication window, additional source systems may be required. There are cases where a single source system meets capacity and ingest performance objectives, but not replication performance as replication performance is much lower than ingest and deduplication rates.

Bi-directional replication, if used, needs to take into account the data being sent to each DL system from the replication source. This value is added to the actual capacity needed for the system.

Refer to the EMC DL Deduplication Sizer (DLDS) Best Practices Guide for more details on the information required to assess sizing requirements and the sizing tools available for this purpose. The most up-to-date version of the DL sizing application and the EMC DL Deduplication Sizer (DLDS) Best Practices Guide are available on Powerlink at (restricted audiences only): Products > Hardware/Platforms > Disk Library Family > Sales Tools

Space management

When a backup stream is ingested, the native data is written to the share or tape cartridge. When deduplication occurs, the unique blocks contained in that backup are written to storage in a different location, and the metadata index of signatures and pointers representing the deduplicated backup is stored. The native format data and unique blocks present on the system represent the used storage space.

The DL retains the native format data for as long as possible to facilitate quick recoveries. If this native data has been deduplicated for longer-term retention, data recovery will be from the deduplicated data once the native data is removed.

EMC recommends the following practices:

• Configure TSM to use expired tapes before scratch tapes as expired tapes still consume space on the DL1500 or DL3000 until space reclamation has been run.

• Expired media is not subject to space reclamation until the tape is also relabeled. Relabeling the expired tape places it in a state that allows the space reclamation process to de-reference and subsequently delete the unique blocks associated with the backups on that tape.

• Either create a backup script that relabels tapes, or size the number of tapes so that tapes are relabeled as they are reused. TSM always uses the least recently used tapes, so if there are many unnecessary tapes, space reclamation will be inefficient.

• Do not create more virtual tape cartridges than needed. Regularly expiring and relabeling virtual tapes ensures efficient use of storage space.

Used storage space is reclaimed for reuse in several ways. The following describes the processes employed for native format data and deduplicated data.

Native format data

Recovery of the storage space occupied by native format data is primarily under the control of TSM. When TSM expires a volume, there is no direct communication of the event to the DL. This has an impact upon DL space management when the DL is configured as a VTL. As a result, the DL’s Console Manager will still indicate that logical tape cartridge as containing data and therefore still occupying storage space on the DL. For the DL to reuse this space, TSM must also relabel the logical tape cartridge by writing a data block to the beginning of the scratch volume. Relabeling is a task that is common to the management of physical tapes and may be performed via scripting.

To further self-manage its storage capacity, the DL monitors its available storage for user data and has a truncation feature that begins executing automatically when used storage capacity reaches 70 percent. The truncation feature removes the oldest native format data blocks from storage until the used storage capacity reaches 65 percent. (These levels are 82 and 77 percent, respectively, in DL3000 version 1.2 and later.) This truncation feature operates only on fully hydrated blocks of data that have a deduplicated block counterpart. It does not remove any native format data that will not undergo deduplication (deduplication policy of never) or has not yet been deduplicated.
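The truncation behavior described above amounts to a start threshold and a lower stop threshold. The following Python sketch models it under simplifying assumptions: the class and function names are hypothetical, native data is represented as a list of blocks ordered oldest first, and the 70 and 65 percent defaults come from the text (82 and 77 percent would be passed for DL3000 version 1.2 and later). It is not the appliance's actual algorithm.

```python
from dataclasses import dataclass


@dataclass
class NativeBlock:
    size_gb: int
    deduplicated: bool  # True once a deduplicated counterpart exists


def truncate(blocks, other_used_gb, capacity_gb, start=0.70, stop=0.65):
    """Return (kept_blocks, used_gb) after one truncation pass.

    blocks: native format blocks, ordered oldest first.
    other_used_gb: space used by everything else (for example unique blocks).
    """
    used = other_used_gb + sum(b.size_gb for b in blocks)
    if used / capacity_gb < start:
        return list(blocks), used  # below the start threshold: nothing to do

    kept = []
    for block in blocks:
        # Only blocks with a deduplicated counterpart may be removed, and
        # removal stops once usage has dropped to the stop threshold.
        if block.deduplicated and used / capacity_gb > stop:
            used -= block.size_gb
        else:
            kept.append(block)
    return kept, used


blocks = [
    NativeBlock(100, True),   # oldest
    NativeBlock(100, False),  # not yet deduplicated, so never truncated
    NativeBlock(100, True),
    NativeBlock(100, True),
    NativeBlock(50, True),    # newest
]
kept, used = truncate(blocks, other_used_gb=320, capacity_gb=1000)
print(f"{len(kept)} native blocks kept, {used} GB used")
```

Note that the block that has not been deduplicated survives even though it is older than blocks that are removed, matching the rule that truncation only operates on native data with a deduplicated counterpart.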


Deduplicated data

After TSM relabels a volume for reuse, the DL’s space reclamation feature identifies the unique data blocks in the block pool that are associated with the relabeled volume and analyzes them to determine whether they are required by any other virtual tape cartridges. It also deletes the metadata indices associated with the relabeled volume as well as any pointers it had to unique blocks associated with other deduplicated backups. Unique blocks that are no longer referenced by any volume are removed, and the storage space they occupy is recovered, when the DL next performs space reclamation. Space reclamation is set to run daily by default. It can also be scheduled or run manually.
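One common way to model this kind of reclamation is reference counting, sketched below in Python. This is an illustrative model only: the class, method, volume, and block names are hypothetical, and the DL's internal data structures are not documented here. Relabeling a volume drops its pointers, and a later reclamation pass deletes blocks that no volume references.

```python
from collections import Counter


class BlockPool:
    """Toy block pool that tracks which volumes reference each unique block."""

    def __init__(self):
        self.refs = Counter()  # block id -> number of volumes referencing it
        self.volumes = {}      # volume name -> set of block ids

    def write_volume(self, name, block_ids):
        """Record a deduplicated backup written to a virtual tape."""
        self.relabel(name)  # overwriting a volume drops its old pointers first
        self.volumes[name] = set(block_ids)
        for block_id in self.volumes[name]:
            self.refs[block_id] += 1

    def relabel(self, name):
        """Drop the volume's metadata and pointers; blocks stay until reclaim."""
        for block_id in self.volumes.pop(name, set()):
            self.refs[block_id] -= 1

    def reclaim(self):
        """Delete blocks no volume references and return their ids."""
        dead = sorted(b for b, count in self.refs.items() if count <= 0)
        for block_id in dead:
            del self.refs[block_id]
        return dead


pool = BlockPool()
pool.write_volume("VT0001", ["b1", "b2", "b3"])
pool.write_volume("VT0002", ["b2", "b3", "b4"])

pool.relabel("VT0001")   # TSM relabels the expired tape
freed = pool.reclaim()   # scheduled space reclamation run
print("blocks reclaimed:", freed)
```

Only the block unique to the relabeled volume is freed; blocks still shared with the other volume remain in the pool.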

There is significant overhead associated with space reclamation and it is best to run it during periods of low appliance use. Replication, reclamation, and backup stream ingest all consume system resources and therefore should not all be done at the same time without understanding overall system performance implications.

Best practices

EMC strongly recommends scheduling the start time for space reclamation to occur after the backup(s) for each day are complete.

Phasing deduplication into the environment

EMC recommends that backup clients be phased in over time to a new DL, starting with an initial full backup followed by the other backup types. Add similar clients gradually so that over time the device has its block pool populated in a controlled manner.

For example, if the DL is installed during a normal incremental schedule, the initial backups to the DL are tied to the existing technology’s last full backup. If a restore is needed, data would be required both from the previous technology as well as the DL. If the customer does not want to perform a full backup outside the normal policy schedule, they need to wait until the full backup is to occur to begin sending data to the DL.


Network connectivity

The DL1500 has six Gigabit Ethernet ports and the DL3000 has eight, configured as a bonded network interface by default. All six or eight ports service replication traffic, appliance management traffic, and NAS backup data traffic.

In version 1.2 and later, there are two segmentation configuration options for the DL Ethernet ports:

• Non-segmented – All Ethernet ports are bonded into a single interface and utilize a single IP address. This is the default Ethernet port configuration for the DL and the only network interface option for DL version 1.1 and earlier.

• Segmented – Data traffic, replication traffic, and management traffic are separated between two different physical interfaces. Each traffic type is given its own network interface (IP address, network mask, and gateway). This option is available in version 1.2 and later.

With segmentation, the DL traffic types can be placed on either of two physical interfaces depending on performance or security needs. One traffic type can be isolated to a single interface while the other two traffic types share the other interface. For example, data traffic can be isolated to a network segment so that it is not visible on a common network.

Note: If the DL is configured for network segmentation, you must use the data segment IP address, not the management segment IP address or hostname, to map CIFS shares or manage the DL. If the DL is joined to an Active Directory domain, the Microsoft Management Console (MMC) tool, which is used to manage shares and user/group access, may fail to connect because the system name cannot be resolved into the data segment IP address. In this case you must specify the DL using its data segment IP address, not the hostname, by choosing the following options in the Computer Management console: Action > Connect to another computer > Another computer.

The bonding mode used for the DL ports on the bonded interface can be either the round-robin policy (Mode 0), which transmits packets sequentially across all ports for load balancing and fault tolerance, or LACP (Mode 4), which utilizes IEEE 802.3ad dynamic link aggregation. The underlying DL appliance software and operating system manage the I/O distribution between each of the bonded ports in an attempt to optimize the aggregate bandwidth possible across the links.
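To make the round-robin policy concrete, the Python sketch below assigns packets to ports in strict rotation. The port names and the packet numbering are hypothetical, and the sketch only illustrates the distribution order of Mode 0; it does not model LACP (Mode 4), where the distribution is negotiated with the switch.

```python
from itertools import cycle


def round_robin(packets, ports):
    """Assign each packet to the next port in sequence (Mode 0 style)."""
    assignment = {port: [] for port in ports}
    for packet, port in zip(packets, cycle(ports)):
        assignment[port].append(packet)
    return assignment


# Six bonded ports, as on a DL1500; names are placeholders.
ports = [f"eth{i}" for i in range(6)]
for port, sent in round_robin(range(12), ports).items():
    print(port, sent)
```

Because consecutive packets of one conversation leave on different ports, the switch ports must be grouped so that the switch treats them as a single logical link.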

Mode 4 is available in DL version 1.1 and later. Mode 0 is available in all DL versions.

Significant performance optimization will be achieved through configuration settings on managed Ethernet switches. For Mode 0 and Mode 4, EMC recommends using managed switches to group the physical port connections to which the DL appliance is connected.

For Mode 0, use a managed switch with the ports used by the DL configured for "trunking" or "link aggregation" or "EtherChannel" without LACP or pAgP to maximize performance potential. Refer to the vendor documentation for specific instructions when configuring the port group.

For example, to configure a Cisco 6500 Series switch for a common EtherChannel group, run the following command from the Cisco switch CLI:

set port channel mod/ports... [admin_group]

mod = switch module
ports = port numbers to be added to the port channel group
admin_group = name of port channel group (optional)

For example:

set port channel 2/1-8

For Mode 4, switches that utilize IEEE 802.3ad dynamic link aggregation allow their ports to be automatically grouped together to form an ultra-high-bandwidth connection that greatly expands bandwidth capacity to the network. This aggregation of DL port connections potentially improves NAS data ingest or replication performance depending on the environment.

For example, to configure a Cisco 6500 Series switch for an EtherChannel group with LACP, run the following commands from the Cisco switch CLI:

set channelprotocol lacp mod
set port lacp-channel mod/ports

mod = switch module
ports = port numbers to be added to the port channel group

For example:

set channelprotocol lacp 2
set port lacp-channel 2/1-8

For more information on programming EtherChannel groups and the LACP mode when using Cisco 6500 Series switches, see:

http://www.cisco.com/en/US/docs/switches/lan/catalyst6500/catos/8.x/configuration/guide/channel.html#wp1020899

For Mode 0, all DL ports connect to the same Ethernet switch. Splitting bonded ports on a DL between multiple switches, however, is supported on certain switches when using Mode 4. Consult the switch vendor’s documentation.

When using Mode 0, make sure LACP is disabled when configuring the EtherChannel port group on managed switches. When using Mode 4, only connect to Ethernet switches that support IEEE 802.3ad dynamic link aggregation and are properly configured for LACP.

Best practices

EMC recommends the following best practices when connecting the DL to the network:

• Use a dedicated network by configuring a separate network or use QoS features that guarantee network bandwidth.
• Alternatively, use virtual networks (VLANs) to segregate backup from production network traffic.
• Set network interface cards (NICs) for servers and clients to full duplex. Set all routers to full duplex.
• Use only CAT 5e or CAT 6 cables (1 Gb/s rated cables).
• If using a DNS server, verify the DNS server configuration settings are correct.
• Use multiple DL Ethernet ports when connecting the bonded interface to the network. The more DL Ethernet ports used, the better the load balancing will be across the ports.
• For redundancy, connect at least two DL ports to an Ethernet switch when using a bonded interface.
• Set each switch port used by the DL to auto-negotiate/auto-sensing. The DL network interface cards are preset to auto/auto and cannot be changed.
• When connecting multiple DL Ethernet ports to the network when using Mode 4 (LACP), make sure the proper LACP settings on the switch are configured. Consult the switch vendor’s documentation for the configuration steps.
• When using Mode 0 (balance-rr mode), use a managed switch configured for "trunking" or "link aggregation" or "EtherChannel" to maximize performance potential. Refer to vendor documentation for the switch for specific instructions when configuring the port groups. Do not use the LACP settings on the switch.


Keep the following in mind:

• Concurrent operations on the DL will have significant impact on performance.
• The DL does not support jumbo frames.
• Splitting bonded ports on a DL between multiple switches is not supported when using Mode 0.
• Splitting bonded ports on a DL between multiple switches is supported on certain switches when using Mode 4. Consult the switch vendor’s documentation.
• When network segmentation is selected, the traffic types in the bonded interface must be on the same subnet. The other physical interface can be on a different subnet.


Using the DL in a NAS environment

The DL is designed to be a target-based specialized hardware deduplication appliance that can be configured to present NAS B2D shares configured as CIFS or NFS shares to the backup server. Both types of shares can exist at the same time on the DL and there can be up to 128 total shares.

This section covers DL behaviors to expect as well as configuration recommendations on how to achieve optimal performance when using it with TSM in a NAS environment.

General recommendations

For TSM, field experience has shown that NFS shares perform better than CIFS shares.

NAS B2D shares created on the DL1500 and DL3000 can be hidden from network browsing when they are created using the Hide this share from network browsing option.

Refer to “Deduplication considerations” on page 10 for additional recommendations and settings.

LAN backup-to-disk

IBM Tivoli Storage Manager is based on a client/server architecture. A TSM domain is a logical group of policies and clients within a TSM server. The TSM server controls and directs all TSM operations. Backup management information is stored or managed by the TSM server including configuration information, parameters for client systems to be backed up, the index of files backed up from those clients, and media volumes where backed up data is stored. Normally, a client operates under the control of the TSM server according to the policies an administrator establishes.

When a backup occurs, a TSM client sends data to the TSM server, which directs the data to the appropriate storage pool. The TSM server receives metadata information about the files being backed up separately and makes appropriate entries in the TSM database.

In the simplest LAN backup-to-disk configuration, there is a single TSM client, a backup server, and DL as the storage device all connected via an Ethernet LAN. Setting up this environment involves the following steps:

1. Create an NFS or CIFS share on the DL. This share will be mounted (or mapped) as a storage pool on the TSM server. If desired, multiple shares can be created to represent different storage pools.

2. Mount (or map) the share on the TSM server.

a. Create a mountpoint for the NFS share to be used by TSM and mount it using the commands for your particular operating system as described in “Mounting NFS shares” on page 18.

b. Map the CIFS share to a drive in Windows Explorer as described in “Configuring CIFS shares” on page 20.

3. Using TSM’s Integrated Solutions Console, select Storage Devices.

4. Create a device class, specifying the following properties:

Device type: FILE
Name: <name of device class>
Path: /<mountpoint>
Mount limit: <use default value>

Ensure that the mount limit is set to a high enough value to support all concurrent sessions if the default value is not sufficient.

Maximum volume capacity: <desired size of share>

5. Create a storage pool, specifying the following:

Name: <name of storage pool>
Type: Sequential access
Device class: <use device class created in step 4>
Scratch volume: 1

Refer to the section “Sequential Access Storage Pool Volumes” in the IBM Tivoli Storage Manager Administrator’s Guide for more information on creating storage pools of this type.

6. Activate the policy on the storage pool just created.

7. Perform a backup.

During a backup operation, the NFS or CIFS share on the DL designated as the storage pool volume receives the backup objects directly from the client/server. Create additional storage pool volumes as necessary.

Mounting NFS shares

The DL can serve as a NAS appliance for backup purposes. In an NFS environment, access to the shares is restricted by hostname. Root access to an NFS share is not allowed and the access rights will be changed to “nfsnobody” as a security precaution.

The DL does not support NFS v4.

NFS shares created on the DL are mounted on the NFS client in the following environments:

AIX 5.2

Enter the following command:

nfso -o nfs_use_reserved_ports=1

This setting does not persist across AIX reboots. For AIX 5.2 or later, use the -p option of nfso to make it permanent. For releases of AIX prior to 5.2, the command must be included in a bootup script.

If you are using NFS v3, mount the NFS share created on the DL using this command:

mount -o timeo=600 {nfs_server}:/{export path} /{mountpoint}

For example:

mount -o timeo=600 cosmowanda:/Q/shares/tl_nfs_1 /stoli1

To mount the NFS share using another NFS version, use the command:

mount -o vers={1|2|3},timeo=600 {nfs_server}:/{export path} /{mountpoint}

To show the list of file systems exported by the NFS server:

showmount -e {nfs_server}

and

showmount -e dfshares {nfs_server}


To show or clear the NFS statistics, respectively:

nfsstat -c [show NFS statistics]
nfsstat -z [clear NFS statistics]

HP-UX 11i

If you are using NFS v3, mount the NFS share created on the DL using this command:

mount -o timeo=600 {nfs_server}:/{export path} /{mountpoint}

For example:

mount -o timeo=600 cosmowanda:/Q/shares/tl_nfs_1 /stoli1

To mount the NFS share using another NFS version, use the command:

mount -o vers={1|2|3},timeo=600 {nfs_server}:/{export path} /{mountpoint}

To show the list of file systems exported by the NFS server:

showmount -e {nfs_server}

and

showmount -e dfshares {nfs_server}

To show or clear the NFS statistics, respectively:

nfsstat -c [show NFS statistics]
nfsstat -z [clear NFS statistics]

Linux

If you are using NFS v3, mount the NFS share created on the DL using this command:

mount -o timeo=600 {nfs_server}:/{export path} /{mountpoint}

For example:

mount -o timeo=600 cosmowanda:/Q/shares/tl_nfs_1 /stoli1

To mount the NFS share using another NFS version, use the command:

mount -o nfsvers={1|2|3},timeo=600 {nfs_server}:/{export path} /{mountpoint}

To show the list of file systems exported by the NFS server:

showmount -e {nfs_server}

To show the NFS statistics:

nfsstat -c [show NFS statistics]

Solaris

Mount the NFS share created on the DL using this command:

mount -o vers=3,timeo=600 {nfs_server}:/{export path} /{mountpoint}

For example:

mount -o vers=3,timeo=600 cosmowanda:/Q/shares/tl_nfs_1 /stoli1

To mount the NFS share using another NFS version, use the command:

mount -o vers={1|2|3},timeo=600 {nfs_server}:/{export path} /{mountpoint}


The mount {nfs_server}:/{directory} /{mountpoint} command defaults to NFS v4 in Solaris. Attempting to mount the share using NFS v4 results in a “No such file or directory” error. To avoid this error, specify the NFS version when using this command.

To show the list of file systems exported by the NFS server:

showmount -e {nfs_server}

and

showmount -e dfshares {nfs_server}

To show or clear the NFS statistics, respectively:

nfsstat -c [show NFS statistics]
nfsstat -z [clear NFS statistics]
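The per-platform mount commands above differ mainly in the name of the version option. As a convenience, the Python sketch below builds the argument list for each platform. The function name and the platform labels are hypothetical. Unlike the NFS v3 examples for AIX, HP-UX, and Linux, it always states the version explicitly, which is an assumption made here so that a client defaulting to NFS v4 (as described for Solaris) does not fail. The sketch only builds the command and does not run it.

```python
def nfs_mount_command(platform, server, export_path, mountpoint,
                      version=3, timeo=600):
    """Build the mount command for an NFS share exported by the DL.

    platform: "aix", "hpux", "linux", or "solaris" (labels for this sketch).
    """
    if version >= 4:
        # The DL does not support NFS v4.
        raise ValueError("NFS v4 is not supported by the DL")
    # Linux names the option nfsvers; the other platforms use vers.
    version_option = {"aix": "vers", "hpux": "vers",
                      "solaris": "vers", "linux": "nfsvers"}[platform]
    options = f"{version_option}={version},timeo={timeo}"
    return ["mount", "-o", options, f"{server}:{export_path}", mountpoint]


# Example using the share and mountpoint names from this section.
command = nfs_mount_command("linux", "cosmowanda",
                            "/Q/shares/tl_nfs_1", "/stoli1")
print(" ".join(command))
```

Returning a list rather than a single string makes it straightforward to hand the command to a process launcher without shell quoting problems.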

Configuring CIFS shares

Note: If the DL is configured for network segmentation, you must use the data segment IP address, not the management segment IP address or hostname, to map CIFS shares or manage the DL. If the DL is joined to an Active Directory domain, the Microsoft Management Console (MMC) tool, which is used to manage shares and user/group access, may fail to connect because the system name cannot be resolved into the data segment IP address. In this case you must specify the DL using its data segment IP address, not the hostname, by choosing the following options in the Computer Management console: Action > Connect to another computer > Another computer.

If you are operating in a Windows environment, you must specify the Windows domain information in the DL Console Manager’s Configuration section before you create the CIFS share. This involves adding the name of the workgroup or Active Directory domain used by the Windows system on which the CIFS share will be used. Once a CIFS share is created, access is allowed on a per-username basis.

Since only one value for a workgroup or Active Directory domain can be entered, either workgroup or Active Directory domain validation will be used but not both at the same time. Also, only one workgroup name or Active Directory domain value can be used by all Windows systems using this DL system.


Using the DL in a SAN environment

The DL is designed to be a target-based specialized hardware deduplication appliance that, when used with its optional virtual tape library (VTL) feature, can be configured to look like a number of tape libraries with associated tape drives to the backup servers on the SAN.

This section covers DL behaviors to expect as well as configuration recommendations on how to achieve optimal performance when using it with TSM in a Fibre Channel SAN environment.

The VTL feature is a licensed option of the DL1500 and DL3000 and is used only in SAN environments.

For the TSM versions and supported operating systems, refer to the EMC Support Matrix on Powerlink®.

General recommendations

EMC recommends configuring the VTLs as ADIC i2000 libraries with HP LTO3 drives.

Expiring volumes

When TSM deletes a volume, there is no direct communication of the event to the DL. As a result, the DL’s virtual tape representing that volume may display as empty or SCRATCH in TSM, but this same tape will appear in the DL’s Console Manager as containing data. This indicates the data on the scratch volume is still using space on the DL. To reclaim this space, TSM must relabel the volume by writing a data block to the beginning of the scratch volume. TSM can relabel volumes using the RELABELSCRATCH parameter. Support for this parameter is available in Fix Pack 5.5.1 and later levels. For more information, see the article “RELABELSCRATCH parameter supported for DEFINE and UPDATE library commands (for Virtual Tape Libraries)” on the IBM website.

DL SAN attach

The DL1500 has two 4 Gb Fibre Channel ports and the DL3000 has four 4 Gb front-end Fibre Channel ports for target mode Fibre Channel attach. All connections to these ports are made via a Fibre Channel switch. Direct attachment of any device to these front-end ports is not supported. The following recommendations apply when connecting the DL to a backup server via a Fibre Channel switch:

• When emulating an EMC Disk Library, use a Fibre Channel switch listed in the EMC Support Matrix.
• Avoid using hard (port) zoning schemes with backup devices.
• Always use a single initiator single target zoning scheme by World Wide Port Name (WWPN).
• Switch encryption solutions are not supported by the DL. This product provides an encryption feature with the optional replication feature to meet data in-flight encryption requirements.
• Limit FC extended fabric (ISL link) configurations to three hops between the backup server and the DL.

Device discovery (persistent binding)

When an operating system scans its storage buses (Fibre Channel or SCSI) for devices, it may assign a particular device to a different device name from the one it used during a previous scan. Since backup software saves information about the device names of the various tape drives it sees in its database during the configuration process, device reassignment during a scan will impact the availability of backup devices to the application. Persistent binding is the ability to fix a particular device to a particular device name. In general, this is done by associating a unique device characteristic (such as a WWPN or serial number) with a device name to maintain persistence of a particular device to a specific device name. The method for doing this varies for different operating systems and Fibre Channel disk and tape device drivers.


If persistent binding is improperly configured, the operating system may provide a device with multiple names — giving a device one device name after a boot, but giving it a different name on a subsequent boot.
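One way to detect this kind of device-name drift, as a minimal sketch: record the device-name-to-serial-number mapping after one scan and compare it with the mapping taken after a later scan or reboot. The snapshot format (one `device_name serial` pair per line) and the function name `changed_devices` are assumptions made for illustration; how the serial numbers are collected depends on the operating system and device driver.

```shell
#!/bin/sh
# Hypothetical helper: compare two snapshots of "device_name serial" pairs
# and print every serial number whose device name differs between them.
# The snapshot file format is an assumption made for this sketch.
changed_devices() {
    before=$1
    after=$2
    # While reading the first file, remember the name seen for each serial.
    # While reading the second file, report serials whose name has changed.
    awk 'NR == FNR { name[$2] = $1; next }
         ($2 in name) && name[$2] != $1 {
             printf "%s: %s -> %s\n", $2, name[$2], $1
         }' "$before" "$after"
}
```

An empty result means every known serial number kept its device name between the two snapshots.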

Device drivers

Tivoli Storage Manager (TSM) uses its own device drivers for non-IBM tape drives only. Refer to the TSM documentation for details.

Configuring the backup server

TSM operates in a client/server model consisting of servers, clients, and storage devices. When backing up to a virtual tape library on the DL, the client sends the data over the SAN to the DL.

1. Create a virtual tape library on the DL, specifying the emulation type, number of slots, number of tape drives and their emulation type. If desired, multiple VTLs can be created to service different backup servers.

EMC recommends configuring the VTLs as ADIC i2000 libraries with HP LTO3 drives in a TSM environment.

2. Create virtual tape cartridges on the DL for the VTL specifying the cartridge type, quantity, and capacity.

3. Assign the VTL to a client on the SAN to which the DL is attached.

4. Scan for devices at the TSM server. The VTL and its tape drives will appear.

5. In TSM’s Integrated Solutions Console on the media server, go to Storage Devices.

a. Create a library, specifying /dev/lbX (where “X” is the numeric value for that library) for the media changer under Device Special File.

b. Add drives to the library one at a time, specifying their name under Device Special File.

6. Label and check in volumes.

a. Add volumes by searching for all eligible volumes in regular slots in the library and accept all defaults in the dialog box. Then, discover volumes.

b. Create a Device Class of type LTO (drive type) and select the library created in step 5.

c. Create a Storage Pool, specifying the following:

Name: <name of storage pool>
Type: Sequential access
Device class: <use device class created in step 6b>
Scratch volume: <use maximum value>

Refer to the section on “Sequential Access Storage Pool Volumes” in the IBM Tivoli Storage Manager Administrator’s Guide for more information on creating storage pools of this type.

7. Activate the policy on the storage pool just created.

8. Perform a backup.
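Steps 5 and 6 can also be expressed as TSM administrative commands rather than console actions. The sketch below only prints the commands so they can be reviewed first; it does not contact a TSM server. Every object name (server, library, device class, storage pool, `DRVn` drive names), the device special files, and the scratch limit are placeholders, and the command syntax should be checked against the Administrator's Reference for your TSM version.

```shell
#!/bin/sh
# Print TSM administrative commands that mirror steps 5 and 6 above.
# All object names and device special files passed in are placeholders.
tsm_vtl_macro() {
    server=$1 lib=$2 libdev=$3 devclass=$4 pool=$5 maxscratch=$6
    shift 6
    echo "DEFINE LIBRARY $lib LIBTYPE=SCSI"
    echo "DEFINE PATH $server $lib SRCTYPE=SERVER DESTTYPE=LIBRARY DEVICE=$libdev"
    n=0
    for drivedev in "$@"; do      # remaining arguments: one device file per drive
        echo "DEFINE DRIVE $lib DRV$n"
        echo "DEFINE PATH $server DRV$n SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=$lib DEVICE=$drivedev"
        n=$((n + 1))
    done
    echo "LABEL LIBVOLUME $lib SEARCH=YES LABELSOURCE=BARCODE CHECKIN=SCRATCH"
    echo "DEFINE DEVCLASS $devclass DEVTYPE=LTO LIBRARY=$lib"
    echo "DEFINE STGPOOL $pool $devclass MAXSCRATCH=$maxscratch"
}

# Example: commands for one library with two drives (all names hypothetical).
tsm_vtl_macro TSM_SERVER1 DLLIB /dev/lb0 DLCLASS DLPOOL 500 /dev/mt0 /dev/mt1
```

The printed commands can be saved to a file and run as a macro from the administrative client once they have been reviewed.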

Impact on TSM daily activities

The VTL looks like a tape device to TSM. It behaves sequentially like tape, just faster. The DL supports creation of up to 64 VTLs and 160 tape drives. Virtual media can be sized to accommodate specific needs.

Client backup

Simultaneous backups

The DL1500 and DL3000 support multiple virtual libraries and tape drives simultaneously.

To address backup bottlenecks commonly seen with shared devices, two solutions can be deployed:

• Create a library and tape drives for each backup host’s exclusive use. This ensures the best possible performance by removing delays inherent with shared tape drives.

• Create a single library with many tape drives, but assign only a few tape drives to each backup server. In this configuration, each backup server has its own dedicated resources. This also simplifies the initial installation and avoids any driver-related issues you could encounter when sharing a tape device between unlike OS platforms.

With either solution, it is good practice to limit the number of virtual tape drives so that, when all of them are servicing I/O at the same time, each drive can sustain at least 20 MB/s. You may need to stagger backups that produce fast and slow datastreams to obtain maximum utilization of the available bandwidth.
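The 20 MB/s guideline translates into a simple upper bound on the drive count. The following is a rough sketch: `aggregate_mbs` stands for the sustained ingest rate you have measured for your own DL and SAN (an assumption, not a figure from this paper), and `max_drives` is a hypothetical helper name.

```shell
#!/bin/sh
# Largest number of virtual tape drives such that, with all drives active,
# each still gets at least min_per_drive MB/s of the aggregate ingest rate.
# aggregate_mbs is a value you measure for your own environment.
max_drives() {
    aggregate_mbs=$1
    min_per_drive=${2:-20}     # 20 MB/s is the floor recommended above
    echo $((aggregate_mbs / min_per_drive))
}

max_drives 400      # prints 20
```

Integer division rounds down, which errs on the side of fewer drives and more throughput per drive.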

Single library with multiple hosts

For multiple hosts to use the same devices, the DL requires you to create different groups for each host. A group consists of exactly one host, one target, and one or more devices. The DL does not permit multiple hosts to access the same group.

Virtual tape libraries

The VTL permits you to create combinations of any emulated library with any emulated tape drive. Some backup software applications are sensitive to combinations that do not exist in the real world, and will not work properly when unsupported combinations of library and tape drive types are specified. When they fail, it may not be obvious that it is an invalid combination.

EMC recommends configuring the VTLs as ADIC i2000 libraries with HP LTO3 drives in the TSM environment.

Until a virtual library is assigned to a SAN client, it is not enabled for use by the backup servers. To ensure correct library behavior when assigning or unassigning virtual libraries, ensure that device numbering on the bus starts with LUN 0 and that there are no LUN numbering holes. LUN numbering holes can occur when devices are unassigned.

To change the LUN values, use the Disk Library Console Manager as follows:

1. Go to Configuration.

2. Select SAN Clients.

3. Select Edit.

4. Change the LUN values for the device list as necessary.
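Before editing LUN values it helps to know whether the current assignment has a hole. As a small sketch, the hypothetical function below takes the LUN numbers shown in the SAN client's device list (typed in by hand; it does not query the DL) and reports the first missing number.

```shell
#!/bin/sh
# Check that a list of LUN numbers starts at 0 and has no holes.
# Prints "ok" or the first missing LUN. The arguments are the LUN values
# read from the SAN client's device list, one argument per LUN.
check_luns() {
    expected=0
    for lun in $(printf '%s\n' "$@" | sort -n); do
        if [ "$lun" -ne "$expected" ]; then
            echo "missing LUN $expected"
            return 1
        fi
        expected=$((expected + 1))
    done
    echo "ok"
}

check_luns 0 1 3    # prints "missing LUN 2"
```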

Virtual tape cartridges

The DL does not preallocate disk space when it creates a virtual tape cartridge. As such, there are several factors to consider:

• The virtual cartridge size represents the amount of native format data that can be stored uncompressed by that tape.

• Although a large number of virtual tapes can be created, it is very important to take cartridge reuse into consideration. When too many tapes are available, TSM will continue to use new cartridges rather than those available in the scratch pool. The DL’s space reclamation process will not be able to recover space occupied by tapes that are no longer needed.

• The cartridge size can be customized but cannot exceed the capacity of its physical equivalent. This ensures that the compressed data stored on a virtual tape cartridge will fit on a physical tape once it has been decompressed and then recompressed by the physical tape drive. The DL1500 and DL3000 will not allow you to create tape cartridges larger than their physical equivalents. A size of 100 GB has been found to be a good minimum virtual cartridge size.

Best practices

EMC recommends the following when creating and using virtual tape cartridges:

• Creating smaller-size media is preferred. It allows a virtual tape cartridge to be filled, even when backing up smaller data sets. It also aids in the transfer of data from virtual to physical media.

• Consider creating only enough media to accommodate the amount of backup data for the required retention period.

• Regularly expire and relabel tapes to avoid running out of storage capacity and allow space reclamation to complete within the daily schedule. If tapes are expired and relabeled only occasionally, depending on the amount of data represented by those tapes, space reclamation can take several days to run and as such impact other system processes.
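The recommendation to create only enough media for the retention period comes down to a ceiling division. In this sketch, `retained_gb` is your own estimate of native (uncompressed) backup data held over the retention period, and the default cartridge size is the 100 GB minimum mentioned earlier; both the function name and the example figures are illustrative.

```shell
#!/bin/sh
# Number of virtual cartridges needed to hold retained_gb of native
# backup data, rounding up so a partially filled cartridge is counted.
cartridges_needed() {
    retained_gb=$1
    cartridge_gb=${2:-100}     # 100 GB: the minimum size suggested above
    echo $(( (retained_gb + cartridge_gb - 1) / cartridge_gb ))
}

cartridges_needed 12500 100   # prints 125
```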

Backup storage pools

Typically, TSM administrators will design the storage hierarchy to back up data to a primary disk pool (for performance) and then migrate the data to a primary storage pool during off-peak hours. The DL VTL can be deployed as the primary storage pool, eliminating or reducing the need to stage the data to a disk pool first. This frees up time in the daily administrative schedule by eliminating this daily migration activity.

Database backup

The TSM database is a critical component in the recovery of a TSM server. Best practices suggest performing a full backup of the TSM database daily and maintaining copies of it onsite and offsite.

There are two ways to back up this database:

• The first and probably fastest way is to use the DL replication option, as it replicates only the unique data. See “Replicating deduplicated data” on page 25 for more information.

• The second way is to allow TSM to write to the remote library as if it were local. This requires the OS to see the remote VTL’s tape drives as local. The advantage of this method is that TSM controls the copies.

TSM reclamation

TSM reclamation occurs significantly faster in the DL than can be achieved with conventional tape drives. The time involved in mounting and positioning tapes to locate backup objects is streamlined because these activities become immediate with disk-based virtual media.

Make sure that volumes expired by TSM reclamation are also relabeled (see “Expiring volumes” on page 21). This allows the DL to recover the space previously occupied by those volumes and allows the DL’s space reclamation process to remove unique blocks associated with those volumes that are no longer referenced by other volumes in the system.

Replicating deduplicated data

DL replication does not utilize the backup application. Replication is managed by the DL and manual tracking of the replicated copies is required.

Replication is an optional licensed feature and is supported in the following configurations:

• DL1500 to DL1500

• DL3000 to DL3000

• DL1500 to DL3000

• DL3000 to DL1500

The DL systems involved in a replication configuration must be running the same major level of firmware. That is, version 1.1.x systems replicate to a target system running version 1.1.x. Likewise, version 1.2.x systems replicate to a target system running version 1.2.x. Replication between version 1.1.x and version 1.2.x systems is not supported.
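The firmware rule can be checked mechanically before replication is configured. This is a minimal sketch that assumes version strings of the form `major.minor.patch` (for example `1.2.3`); how the version is read from each DL is not covered, and `same_major_level` is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed (exit status 0) only when both firmware versions share the same
# major level, i.e. the same first two dot-separated fields (1.1.x, 1.2.x).
same_major_level() {
    src=$(echo "$1" | cut -d. -f1,2)
    tgt=$(echo "$2" | cut -d. -f1,2)
    [ "$src" = "$tgt" ]
}

if same_major_level 1.1.4 1.2.0; then
    echo "replication supported"
else
    echo "replication not supported: firmware levels differ"
fi
```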

DL replication occurs in two stages. First, a copy of the unique data associated with each replication-enabled VTL/share that is not already present at the target DL transfers to the target DL as the data deduplicates. This is referred to as continuous replication. Second, a metadata description of the replicated VTL/share or cartridge or directory/file is sent to the target system. This second stage ensures that a recoverable image is available on the target DL. This second stage is referred to as namespace replication.

In version 1.1 and later, there are two variations of namespace replication:

• VTL/NAS share-level replication (as in version 1.0)

• Individual tape cartridge or directory/file replication

With VTL/NAS share-level replication, a VTL or NAS share is made available as a recovery point when its metadata description is replicated to the target system. This replication of the metadata description can be scheduled once every 24-hour period (version 1.1 or earlier) or up to once per hour (version 1.2 and later) or be done manually as many times as desired. The 10 most recent recovery points (24 in version 1.2 and later) of a NAS share/VTL are available on the target DL for selection.

With individual tape cartridge or directory/file replication, the tape cartridge or directory/file becomes readable on the target DL as soon as its metadata description is replicated. This replication of the metadata description occurs when a tape is unmounted when using cartridge-based replication or when a CLI replication command is issued when using directory/file-based replication. The VTLs and NAS shares on the target DL contain the most recent copy of each replicated tape or directory/file.

Failure to perform a namespace replication of a replication-enabled VTL or NAS share will result in no recovery point or available cartridge or directory/file on the target DL.

See “Appendix: Using replication with Tivoli Storage Manager” on page 30 for an example of using TSM with the DL’s directory/file-based replication feature.

More information on replication can be found in the EMC DL1500 and DL3000 Replication – Best Practices Planning white paper on Powerlink.

Replication considerations

Replication transmits data over Ethernet using TCP. The source DL can optionally encrypt the data using AES-128 before sending it across the TCP link to the remote DL, where it is decrypted before it is stored. The DL uses TCP port 1062 for replication of the unique data and port 80 for replication of the metadata description. There is no physical Ethernet port on the DL reserved for the replication feature. See “Network connectivity” on page 14 for recommendations when connecting to the network.

When adding DL replication to the backup environment, keep the following in mind:

• Always assess the network between the source and target systems in both directions as part of the test and acceptance process and prior to putting the systems into production. This will ensure that sufficient bandwidth is available to support the maximum amount of data desired to be replicated within the replication window. This can also be done by performing a formal assessment if necessary. Use “ping” or similar analyzer software to get an idea of latency between the two systems as well as determine if any network bottlenecks are occurring. Run this test multiple times on different days at different times to confirm the numbers.

Use the Console Manager’s built-in Performance Analyzer (Figure 1) to measure network performance between the two systems.

Figure 1. Network performance analysis

• Make sure both the source and target DLs are configured for replication before performing any backups to the source DL.

• Seeding of the target system is almost always required. See “Seeding (pre-populating the target system)” for recommended options.

• Do not place replication configurations into production until required replication windows are met. When replication falls behind (exceeds the replication window), it is hard to catch up.

• Consider using immediate deduplication to effectively extend the replication window and lower the chance of “falling behind.” Since replication is initiated when deduplication begins, this enlarges the replication window to maximize the amount of data that can be transferred should change rates increase unexpectedly, an unusually large amount of data is backed up, or excessive network traffic slows data movement.
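The bandwidth assessment in the first consideration above can be approximated with a back-of-the-envelope calculation before any measurement is taken. The sketch below estimates the sustained rate needed to move a given amount of unique data within the replication window. It uses decimal units (1 GB = 8000 megabits) and ignores protocol overhead, latency, and encryption cost, so treat the result as a lower bound; the function name and the example figures are illustrative.

```shell
#!/bin/sh
# Minimum sustained network rate, in Mbit/s (rounded up), needed to move
# unique_gb of post-deduplication data within window_hours.
required_mbps() {
    unique_gb=$1
    window_hours=$2
    megabits=$((unique_gb * 8000))         # 1 GB = 8000 megabits (decimal)
    seconds=$((window_hours * 3600))
    echo $(( (megabits + seconds - 1) / seconds ))
}

required_mbps 200 8    # prints 56
```

Comparing this figure with the rate actually measured between the two systems shows how much headroom the link leaves when change rates rise.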

Replication performance depends on daily change rates between new backup data and data residing on the target system.

• The higher the change rate, the more data that needs to be moved within the replication window.

• Lower than expected deduplication rates will add time to the replication and may exceed the required replication window.

• If deduplication is slowed, continuous replication will also be slow. Deduplication can be slowed by poor ingest rates, fragmentation, and other factors that affect the speed at which block level match detection can occur.

• Network latency will have a significant impact on replication performance.

• 128-bit AES encryption for replication is available. Encryption may affect replication performance. It should be disabled if the customer’s WAN is already secured.

If you have opted to do VTL/NAS share-level namespace replication, the time it takes to verify that all data associated with that VTL/NAS share is present at the target DL grows as the amount of data increases. This delays the availability of a recovery point on the target DL, because the metadata description of that VTL/NAS share cannot replicate to the target DL until the verification completes. Consider the following options:

• Use smaller VTLs or NAS shares targeting specific data and performing rolling backups to these VTLs/shares to spread the replication out over time and allow the replication process to work with easily managed data sizes.

• Use cartridge-level or directory/file-level namespace replication instead. Because a data verification step occurs at the cartridge or directory/file level, the delay that can occur before the metadata description replicates is substantially reduced unless you append to large cartridges. In this case, always overwrite or use new cartridges or limit the size of the cartridge.

Seeding (pre-populating the target system)

Initial replication (first data copy) is a time-consuming operation and will usually take significantly more time than later replications because the data is typically new and unknown to the target DL.

For replication, the goal is to facilitate the transfer of the large amount of unique data produced by initial backups to the source shares/VTLs to the target DL. This is done by seeding the target DL with the same data as on the source DL so that only the minimum amount of data needs to be transferred between units on subsequent replications to maintain synchronization. The source and target systems should not be put into production until this seeding process is complete. Seeding is considered complete when the required replication window has been demonstrated to have been met.

There are a number of options that can be employed to accomplish this:

• OPTION 1: Locate the source and target DL so that they are on a dedicated Ethernet network and replicate locally. This will allow replication to proceed at the fastest possible network rate. EMC recommends replicating at least two full backups in this local configuration to demonstrate replication savings. If savings are not seen, further local replications may be needed to ensure the target system has sufficient data. (That is, the data transferred by subsequent replications is more representative of the quantity of data that will typically replicate within the window.) When the seeding completes, the target DL can be deployed to its intended location.

• OPTION 2: Locate the source and target DL so that they are both connected to the storage node, and use the storage node to perform an inline copy or dual write of data from at least two full backups to the source and target systems, using a temporary NAS share or VTL on the target DL. Deploy the target DL to its target location and perform a namespace replication of the VTL/NAS share. Delete the temporary NAS share or VTL on the target DL as well as references to the temporary copy in the backup application catalog.

• OPTION 3: Use tape to seed the data in the target DL. Using the tapes from at least two full backups present on the source DL, write the data to temporary NAS shares or a VTL on the target DL. Perform a namespace replication. Delete the temporary NAS share or VTL on the target DL as well as references to the copied save sets/images in the backup application catalog if the tape copy was performed by a backup application.

The goal of Options 2 and 3 above is to write native backup data directly to the source DL and target DL and allow each system to deduplicate the data.

If replication is not enabled for a NAS share or virtual tape library until multiple backup streams destined for that particular virtual device or NAS share are ingested, there will be a large number of unique blocks on the source DL that will need to be transferred over the network when the first namespace replication occurs. In this situation, the data verification performed as part of the namespace replication will detect that all the unique blocks on the source system must be sent over the network and there may be a significant delay before the data in the two systems is synchronized.

If there are disruptions during the replication process for any reason (for example, a network infrastructure problem), re-seeding may be necessary to eliminate a large replication backlog that can occur as a result of the disruption. Large backlogs, when they exist, make it difficult for the systems to resynchronize.

Space management considerations

When using cartridge-level or directory/file-based namespace replication, it is necessary to regularly execute the synchronization feature. It is available through the Console Manager where it can be run manually and as a CLI command for scripted operation.

Synchronization removes tapes or directories/files in the corresponding target VTL or share on which it is run that are no longer present on the source system. It marks the associated data as eligible for space reclamation on the target DL. EMC recommends regularly running synchronization to ensure that unneeded data does not accumulate on the target DL, causing it to fill unnecessarily.

Recovery/Failback

Should a NAS share or VTL become unavailable on a source DL, two methods are available to access the data: at the target system (recovery) or by replicating the share/VTL back to the source DL (failback).

In order for the replicated NAS share or VTL to be accessed by a secondary server on the target DL, a recovery must be initiated. When a recovery on the target DL is performed, the NAS shares or virtual tape libraries appear on the target DL. In order for a virtual tape library to be accessed by a secondary server on the target side (visible to the SAN client ports), you will need to enter a name for the library, a virtual tape drive model number, and the number of drives.

For example, to recover a NAS share that has been replicated to the target DL, checkpointed, and made available as a recoverable image, perform the following steps:

At the target DL:

1. Select the share to recover.

2. Specify a name for the share to recover. This name will be the mountpoint name.

NOTE: If you need to recover this share again to the same mountpoint, you will need to delete the recovered share first and then re-create the mountpoint. Alternatively, you can provide a different name for the share to recover.

3. Recover the share. When the recovery completes, “Success” displays.

4. Verify the presence of the recovered share by going to Configuration > NAS.

Go to the TSM server that will be accessing the recovered share on the target side:

1. Mount the recovered share (refer to “Mounting NFS shares” on page 18 or “Configuring CIFS shares” on page 20 for more information).

2. Ensure that the TSM application is running on the server using the following command:

ps -ef | grep dsm

“dsmserv quiet” should be running. If not, start TSM as follows:

cd /usr/tivoli/tsm/server/bin
./rc.adsmserv &

3. Update dsm.sys (on UNIX) or dsm.opt (on Windows) to point to the recovery server.

4. Start the TSM client.

5. Select the files to restore and restore as normal.

If failing back to the source DL, delete the original share or virtual tape library on the source DL first. Perform a failback on the target DL, and then perform a recovery on the source DL.

For more information on the DL replication feature, refer to the EMC DL1500 and DL3000 Replication – Best Practices Planning white paper.

Conclusion

The processes and configuration recommendations described in this white paper are intended to assist you in maximizing the performance and usage of the DL in TSM environments. Since no two backup environments are exactly the same, it may be necessary to vary individual settings to meet each specific environment’s requirements and goals.

References

• EMC DL1500 and DL3000 – Best Practices Planning white paper

• EMC DL1500 and DL3000 Version 1.1 New Features and Functions – A Detailed Review white paper

• EMC DL1500 and DL3000 Replication – Best Practices Planning white paper

• EMC DL1500 and DL3000 Copying to Physical Tape – Best Practices Planning white paper

• IBM Tivoli Storage Manager Administrator’s Guide (per platform)

Appendix: Using replication with Tivoli Storage Manager

IBM Tivoli Storage Manager (TSM) can be configured to work collaboratively with replicated NAS shares. Recommendations for configuring the DL systems as well as TSM are outlined in the following sections “DL setup” and “TSM setup.” It is important to note, however, that no scripting is needed for the replication process for the DL in VTL mode, because cartridge-based replication of a tape volume occurs automatically when it is dismounted on the source DL.

DL setup

For the replication script to work, the source and target DLs must be correctly configured for replication as follows:

• Target DL is enabled to accept replication from the source DL.

• Source DL is enabled to replicate to the target DL.

• Source NAS share is created with deduplication enabled.

• Target DL is configured with a NAS share with deduplication enabled and having the same name as the source share.

• Target NAS share must be configured as a directory/file-based target, with a sync ID the same as the share name, and access must be unlocked.

• Source NAS share must be configured for directory/file-based replication and with a sync ID the same as the share name.

• Do not configure “replicate daily.”

Test that the share contents can be replicated between the two DL systems by performing a synchronization on that share through the source DL’s Console Manager (Data Services > Replication > Source Role > NAS). Make sure this works before continuing with the next steps.

TSM setup

Configure TSM as follows:

• Mount the source share on the TSM server.

• Configure the server on which the replication script will run with SSH authentication.

• Modify the replication script to include the name of the source DL and the share to replicate.

• Place the replication script in the TSM server bin directory. The script can also be placed in a location of your choice and be run manually or by a cron job.

Authentication setup

The server on which the replication script runs must have SSH configured to log in to the (source) DL, from the command line and without a password prompt. The script will rely on ssh to send the DL the commands to replicate. Using the SSH client program’s password authentication mechanism is not only insecure, but difficult to do via a script. As a best practice, use of public key authentication is preferred. With public key authentication, you can connect securely to the DL from the host server on which the script is run without having to enter a password.

Create the keys

This procedure takes you through the steps to create ssh keys on the server from which you will run the post-backup replication script and to copy the public key to the DL.

For Linux servers only:

1. Log in as root.

2. Enter the following at the command prompt to generate the public and private keys:

ssh-keygen -t rsa

3. Elect to put the keys into the default directory.

4. Press RETURN when prompted for a passphrase.

5. Go to the default directory containing the generated keys and verify that the id_rsa.pub key exists.

Install the public key on the DL

1. Establish an ssh session with the source DL, logging in as cliadmin.

2. Append the id_rsa.pub key to the authorized_keys file in the cliadmin account on the source DL.
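The append in step 2 is worth making repeatable, so that rerunning the procedure does not add the same key twice. The sketch below shows the idea on ordinary files. The function name and both file paths are illustrative: this paper does not state where the `authorized_keys` file lives in the cliadmin account or which commands the DL's login shell permits, so adapt the step to what the DL allows.

```shell
#!/bin/sh
# Append a public key to an authorized_keys file only if that exact key
# line is not already present. Both paths are supplied by the caller.
append_key() {
    pubkey_file=$1
    auth_file=$2
    key=$(cat "$pubkey_file")
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -qxF -- "$key" "$auth_file" 2>/dev/null; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
}
```

Calling `append_key` a second time with the same key leaves the file unchanged.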

Test the keys

1. From a bash shell, test that the public key works by trying to ssh into the DL from your server without the password. This may prompt you to trust the host's credential, but it should not ask for a password or a passphrase. For example:

$ ssh $DL -l cliadmin hostname

2. Run the remote hostname command once more to confirm you get no prompts at all.

$ ssh $DL -l cliadmin hostname

If this does not work, you'll need to fix it before you can run the script.

TSM post-backup script example

This script has been tested on Sun Solaris, HP-UX, IBM AIX, and Linux.

This sample TSM replication script has the following high-level workflow:

• Searches the TSM activity log over the previous 12 hours for message ANR0514I, which indicates that a session has finished using a volume, and builds a list of files (volumes) with this status.

• Since the activity log returns more than just the filename, the script removes all the extraneous information and builds a list of files to replicate.

• Connects to the source DL.

• Replicates the list.

• Waits for replication of all files (volumes) to finish.

To use this script:

1. For UNIX clients, place the replication script shown below in /usr/tivoli/tsm/server/bin (on HP/Solaris it is /opt/tivoli/tsm/server/bin) on the TSM server on which the script will be run.

2. Edit the script to contain the name of the source DL (DL), the share name (NFS), and the directory in which to place the file containing the list of volumes to replicate (VOL). (The name of the source DL can be its hostname or IP address.)

3. Run the script once backups are completed. The script will replicate the files (volumes) in the list it generates.

#!/bin/ksh
#FINAL VERSION
#########################################
# General setup
#########################################
TSMCMD="dsmadmc -id=admin -password=admin -dataonly=yes"
TSMDIR=/usr/tivoli/tsm/server/bin
NFS=share_name
VOL=/tmp/volume.list
DL="IP address of source DL"   # replace with the hostname or IP address of the source DL
#########################################
# Executing TSM query
$TSMCMD "q ac se=anr0514I begint=-12" > $VOL;
# Getting uniq sorted list of cartridges (*.BFS volumes)
CARTLIST=`cat $VOL | grep volume | grep -i bfs | awk '{ print $NF }' | sed -e 's/\.$//' | sort -u`;
if [[ -z $CARTLIST ]]; then
    print "No volumes found.";
    exit;
fi
# Loop for each cartridge
for SHARENAME in $CARTLIST; do
    # Changing BFS to lowercase
    SHARENAME=`basename $SHARENAME | sed -e 's/BFS/bfs/'`
    echo $SHARENAME
    ssh cliadmin@$DL "syscli --replicate nas --name $NFS --path $SHARENAME"
done;
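The volume-extraction stage of the script can be tried on its own, without a TSM server or a DL, by feeding it activity log text on standard input. The sketch below reuses the same grep/awk/sed pipeline as the script. The function names are hypothetical, and the log lines in the example are made-up samples in the general shape of an ANR0514I message, not captured output.

```shell
#!/bin/sh
# Same filtering as the script above: keep lines that mention a *.BFS
# volume, take the last field, drop the trailing period, de-duplicate.
extract_volumes() {
    grep volume | grep -i bfs | awk '{ print $NF }' | sed -e 's/\.$//' | sort -u
}

# Reduce each volume to the file name passed to the replicate command,
# with the BFS suffix lowercased as in the script.
replication_paths() {
    extract_volumes | while read -r vol; do
        basename "$vol" | sed -e 's/BFS/bfs/'
    done
}

# Made-up sample input (not real activity log output):
printf '%s\n' \
    '06/30/09 22:15:01 ANR0514I Session 42 closed volume /mnt/dl/00000012.BFS.' \
    '06/30/09 23:01:55 ANR0514I Session 44 closed volume /mnt/dl/00000012.BFS.' |
    replication_paths
```

Running the pipeline this way makes it easy to confirm that each volume appears once, with the name in the form expected by the replicate command, before the script is scheduled.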

Recovery from a replicated NAS share

When the target DL is configured with a NAS share, TSM may be able to recover data directly from the replicated DL. This process assumes that the DL systems have been configured according to the “DL setup” section on page 30.

1. Pause replication on the source DL. Even though the source DL may be unavailable, the state of replication may still be “active.”

2. Unmount the share on the source DL.

3. Mount the share on the target DL. TSM now recognizes the replicated data and can manage recovery from the target share.

4. Ensure that the TSM application is running on the server using the following command:

ps -ef | grep dsm

“dsmserv quiet” should be running. If not, start TSM as follows:

cd /usr/tivoli/tsm/server/bin
./rc.adsmserv &

5. Update dsm.sys (UNIX) or dsm.opt (Windows) to point to the recovery server.

6. Start the TSM client.

7. Select the files to restore and restore as normal.
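Steps 4 and 5 above can be sketched as two small helpers. This is an illustration under stated assumptions, not a definitive procedure: the helper names (dsmserv_running, point_to_recovery_server, sample_stanza), the sample stanza, and the 192.0.2.x addresses are hypothetical. It also assumes that the client stanza identifies the server through a TCPServeraddress line. The second helper only prints the edited text; it does not modify dsm.sys.

```shell
#!/bin/sh
# Sketch only: helper names, addresses, and sample text are assumptions.

# Succeeds if a dsmserv process appears in `ps -ef` style text on stdin.
# The [d] bracket stops grep from matching its own command line.
dsmserv_running() {
    grep '[d]smserv' >/dev/null
}

# Read a dsm.sys-style stanza on stdin and print it with the
# TCPServeraddress value replaced by the address given as $1.
point_to_recovery_server() {
    awk -v addr="$1" '
        tolower($1) == "tcpserveraddress" { print "   TCPServeraddress   " addr; next }
        { print }
    '
}

# Hypothetical client stanza used to exercise the helper.
sample_stanza() {
    printf '%s\n' \
        'SErvername  tsm_server1' \
        '   COMMMethod         TCPip' \
        '   TCPPort            1500' \
        '   TCPServeraddress   192.0.2.10'
}

# Show the stanza as it would look when pointed at the recovery server.
sample_stanza | point_to_recovery_server 192.0.2.20
```

On a live system the first helper could be used as `ps -ef | dsmserv_running || echo "TSM server is not running"`. Review the edited stanza before copying it over the real options file.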

Recovering replicated cartridges

The following test scenario describes how tapes replicated to a target DL using the cartridge-based replication feature can be recovered at the target site in a disaster recovery situation.

In this test scenario, tapes in the source VTL (lb2.0.0.1) have been replicated to a secondary library (lb1.0.0.1) in the target DL using the DL’s cartridge-based replication feature. The source and target VTLs are configured with the same sync ID, which defines the VTL pairing on the source and target systems for replication. When replication completes, the cartridge-based replication feature automatically makes the replicated tapes visible to a server connected to the target DL.

An outage has caused communication with the source VTL to fail, and TSM considers that library (lb2.0.0.1) to be offline. The TSM server is still functional. If the TSM server has also gone down, the TSM administrator must first restore it by whatever means are normally used to do so.

To recover at the target site

1. In TSM, take the path to the source media changer offline. NOTE: This step is not performed in a true disaster recovery scenario.

upd path tsm_server1 lb2.0.0.1 srct=server destt=libr online=no

2. Try to restore the source cartridges. The restore should fail.

3. Check that the media is in a wait state.

q se f=d


4. Verify that the source VTL is offline.

q path

5. Check in the target VTL (lb1.0.0.1).

checkin libv lb1.0.0.1 search=yes checkl=barcode stat=private

Note: As a best practice, the tapes are checked in as private even if there is a mix of private and scratch tapes in the target library. This mitigates the risk of writing over a data tape.

6. Update the device class to use the target library.

upd devc ltoclass1 libr=lb1.0.0.1

7. Do a restore.
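The TSM commands in the steps above can be collected into one reviewable list. The following is a minimal sketch that only prints the commands and does not contact a TSM server. The function name failover_commands and its argument order are assumptions made for illustration, and checkl is used here as the abbreviation of the CHECKLabel parameter.

```shell
#!/bin/sh
# Sketch only: the function name and argument order are assumptions.
# It prints commands for review; it does not contact a TSM server.

# Usage: failover_commands SERVER SOURCE_LIB TARGET_LIB DEVCLASS
failover_commands() {
    server=$1
    source_lib=$2
    target_lib=$3
    devclass=$4
    # Take the path to the source media changer offline (test scenarios only).
    printf 'upd path %s %s srct=server destt=libr online=no\n' "$server" "$source_lib"
    # Check in the replicated tapes as private to avoid overwriting data.
    printf 'checkin libv %s search=yes checkl=barcode stat=private\n' "$target_lib"
    # Point the device class at the target library.
    printf 'upd devc %s libr=%s\n' "$devclass" "$target_lib"
}

# Names taken from the test scenario described in this section.
failover_commands tsm_server1 lb2.0.0.1 lb1.0.0.1 ltoclass1
```

Printing the commands rather than running them keeps the verification steps (q se f=d, q path) under the operator's control between each change.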

Clean up

If the above procedure was followed to test a disaster recovery scenario, perform the following steps to delete the test VTLs and media on the source and target DLs.

When the source system is recovered, in TSM:

1. Delete data on the source that is associated with the recovered volumes at the target.

delete vol xxxxxx discarddata=yes

2. Check out the tapes in the source library.

checkout libv (librname) volr=000000,000005 remove=no checkl=no

At the source DL:

1. Export the recovered tapes.

2. Delete the exported tapes.

3. Synchronize the source VTL with the target VTL. This will delete all media on the target.
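The TSM side of the clean-up can be sketched as a command generator. This is an illustration only: the function name cleanup_commands and the volume names are hypothetical, and the sketch assumes the per-volume form of checkout libv rather than the volr range shown above. It prints the commands for review and does not run them.

```shell
#!/bin/sh
# Sketch only: prints commands for review, does not run them.
# The function name and the volume names are illustrative assumptions.

# Usage: cleanup_commands SOURCE_LIB VOLUME...
cleanup_commands() {
    lib=$1
    shift
    # First discard the source data for every recovered volume.
    for vol in "$@"; do
        printf 'delete vol %s discarddata=yes\n' "$vol"
    done
    # Then check each tape out of the source library.
    for vol in "$@"; do
        printf 'checkout libv %s %s remove=no checkl=no\n' "$lib" "$vol"
    done
}

# Example with the source library from the test scenario.
cleanup_commands lb2.0.0.1 000000 000001
```

Because delete vol with discarddata=yes is destructive, it is worth confirming the volume list against the tapes actually recovered at the target before issuing any of the printed commands.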
