Symm Sdw Impact

Copyright © 2005 EMC Corporation. Do not Copy - All Rights Reserved.

Symmetrix Solutions Design Concepts - 1

© 2005 EMC Corporation. All rights reserved.

Symmetrix Solutions Design Concepts

Symmetrix and Layered Applications

Welcome to Symmetrix Solutions Design Concepts.

The AUDIO portion of this course is supplemental to the material and is not a replacement for the student notes accompanying this course.

EMC recommends downloading the Student Resource Guide from the Supporting Materials tab, and reading the notes in their entirety.

These materials may not be copied without EMC's written consent.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

EMC is a registered trademark and Celerra is a trademark of EMC Corporation.

All other trademarks used herein are the property of their respective owners.

Course Objectives

Upon completion of this course, you will be able to:

Describe the important technical data to be gathered about the use of Symmetrix

Describe how technical data is gathered for Symmetrix

Describe how to interpret and comprehend the gathered data

State the parameters to set, and tools used to control and manage Symmetrix

Discuss the best practices for configuring and deploying Symmetrix

The objectives for this course are shown here. Please take a moment to read them.

Concepts and Terminology

Upon completion of this lesson, you will be able to:

Describe Symmetrix management concepts and terminology

Describe TimeFinder and SRDF concepts

The objectives for this lesson are shown here. Please take a moment to read them.

Review of Hardware Topics

Basic Symmetrix architecture
– LUN
– RAID Protection Types
– Front End, Back End
– Dynamic Spares

Performance Basics
– Little’s Law
– Application I/O and block sizes
– Workload characterization

Let us review some of the topics that have been covered in the Foundation modules:

A logical unit number (LUN) is how a storage device is presented by the Symmetrix and recognized by the host.

RAID (Redundant Array of Independent Disks) protection is available in one of three variations: RAID 1 refers to Mirror protection; RAID-S is the older EMC technique for providing RAID protection, where there was one dedicated parity drive for a group of 3 or 7 drives; and the newer variation is RAID 5, where parity is distributed across the entire rank of drives.

The front end of the Symmetrix connects to hosts. The back end is connected to the physical drives. Patented EMC technology (Enginuity) carves up the physical storage and presents it as smaller logical volumes to hosts connected to the front end.

Little’s Law, formulated by John Little of MIT’s Sloan School of Management, has been adapted for computer systems to indicate that response time (such as how quickly an I/O is serviced by a disk) increases non-linearly as utilization increases.
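As a rough illustration of the non-linear growth described above, the following sketch combines Little's Law (N = X · R) with the textbook single-server queue approximation R = S / (1 − U). The M/M/1 queue model and the 5 ms service time are assumptions chosen for illustration only; they are not taken from the course material, and real disk behavior will differ.

```python
def response_time_ms(service_time_ms: float, utilization: float) -> float:
    """Approximate response time of a single-server queue (M/M/1 assumption)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_ms / (1.0 - utilization)


def outstanding_ios(throughput_iops: float, response_ms: float) -> float:
    """Little's Law: average I/Os in the system = throughput x time in system."""
    return throughput_iops * (response_ms / 1000.0)


if __name__ == "__main__":
    service_ms = 5.0  # hypothetical disk service time, not a measured value
    for u in (0.10, 0.50, 0.80, 0.90, 0.95):
        r = response_time_ms(service_ms, u)
        print(f"utilization {u:4.0%}  response time {r:6.1f} ms")
```

Under these assumptions, each step in utilization costs more response time than the previous one, which is the behavior the notes refer to.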

Block sizes associated with application I/O can vary, and they affect performance. Response times for large block I/Os tend to be greater, but the amount of data moved in MB/sec is larger. Small block I/Os result in more I/Os per second but lower MB/sec throughput.
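The trade-off between I/O rate and data rate follows from simple arithmetic: MB/sec = IOPS × block size. The sketch below shows the calculation; the two workloads are hypothetical figures picked for illustration, not measurements from any array.

```python
def throughput_mb_per_sec(iops: float, block_size_kb: float) -> float:
    """Data rate in MB/s for a given I/O rate and block size (1 MB = 1024 KB)."""
    return iops * block_size_kb / 1024.0


if __name__ == "__main__":
    # Hypothetical workloads: many small I/Os versus few large I/Os.
    workloads = (("small block", 5000, 4), ("large block", 400, 256))
    for name, iops, kb in workloads:
        mb = throughput_mb_per_sec(iops, kb)
        print(f"{name}: {iops} IOPS x {kb} KB = {mb:.1f} MB/s")
```

In this made-up comparison the small block workload issues far more I/Os per second yet moves less data per second than the large block workload.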

Workloads can be batch or interactive, read or write intensive. Interactive workloads, where humans are waiting for a response from the system, tend to be more response time sensitive. Batch workloads are less sensitive to response times. In addition, disk subsystems can handle more reads per second than writes, so the read/write mix has an impact on disk array performance.

Review of Symmetrix Local and Remote Replication

Layered Applications
– TimeFinder
  Mirror
  Snap
  Clones
– EMC Open Replicator
– SRDF
  Synchronous (SRDF/S)
  Asynchronous (SRDF/A)
  Automated Replication (SRDF/AR)
  Consistency Groups
  Data Mobility (SRDF/DM)
  SRDF/Star

EMC’s rich set of local and remote replication products is introduced in the EMC Technology Foundations modules. To review each product briefly:

The TimeFinder family comprises the local replication products. TimeFinder/Mirror is the most mature and highest performing local replication product. Its Business Continuance Volume (BCV) is a detachable mirror that can act either as a transparent mirror of the standard or as an independent device. TimeFinder/Snap is a space saving, pointer based copy of a source; it is suitable for read intensive applications and less suitable for write intensive ones. TimeFinder/Clone is a full volume, pointer based copy that overcomes the 4 mirror limitation of TimeFinder/Mirror by allowing up to 8 differential and 16 non-differential full copies of the source.

EMC Open Replicator for Symmetrix is a relatively new product that facilitates data transfers between a DMX and other kinds of EMC and third party arrays.

The Symmetrix Remote Data Facility (SRDF) product suite consists of EMC’s remote replication products for the Symmetrix.

SRDF/S replicates all writes to the local array remotely in real time. SRDF/A buffers writes in the local array’s cache and replicates them remotely in near real time (within seconds to minutes). SRDF/AR takes periodic point in time copies of the original data on disk and propagates the data at a later point in time. This mode of delayed replication requires less network bandwidth and saves transmission costs, at the expense of having the target lag the source by a few hours. SRDF/DM is a low cost solution that is good for moving data from one Symmetrix to another in adaptive copy mode; it is unsuitable for disaster recovery. SRDF/Star allows higher tiered customers to maintain three data centers. In the event of a failure, two data centers can continue running with very little interruption, which preserves the ability to survive a second data center failure.

Terms Related to Cache Management

Logical Volume Write Pending Ceiling
– Maximum number of writes (measured in tracks) that can be queued to a logical volume
– Slows performance to write intensive logical volumes
– Impact on local applications and SRDF/S performance

System Wide WP ceiling
– Maximum number of writes (measured in tracks) that can be queued to the Symmetrix
– When the ceiling is hit (80% of available cache)
  system performance deteriorates
  SRDF/A sessions fail

Logical volume write-pending restrictions are imposed when too much data has been written to an individual Symmetrix logical device but not yet destaged to disk. When a device reaches the logical volume write-pending (LVWP) ceiling, each new write to a track that is not already write pending for the device triggers a special task that waits for old data to be destaged to disk before the new write is accepted.

The system-wide write-pending limit is imposed when too much data has accumulated system wide in the Symmetrix system without being destaged to the back-end disks. When the Symmetrix system is at the system-wide write-pending limit, each new write, for any volume, to a track that is not already write pending triggers a special task that waits for old data to be destaged to disk before the new write is completed. At the write-pending limit, the Symmetrix system changes the priority of writes to equal that of read misses. Because the Symmetrix system is no longer prioritizing reads over writes, read response time may also be significantly impacted.
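The two ceilings described above can be pictured with a toy model. This is only a sketch of the decision logic as the notes describe it; the class name, method names, and the tiny ceiling values are hypothetical, and nothing here reflects how Enginuity is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class WritePendingModel:
    """Toy model of the device and system write-pending ceilings (in tracks)."""
    system_ceiling: int
    device_ceiling: int
    pending: dict = field(default_factory=dict)  # device -> set of WP tracks

    def system_pending(self) -> int:
        """Total write-pending tracks across all devices."""
        return sum(len(tracks) for tracks in self.pending.values())

    def must_wait_for_destage(self, device: str, track: int) -> bool:
        """True if a write to this track would wait for old data to destage."""
        tracks = self.pending.get(device, set())
        if track in tracks:
            return False  # track is already write pending, no extra slot needed
        at_device_limit = len(tracks) >= self.device_ceiling
        at_system_limit = self.system_pending() >= self.system_ceiling
        return at_device_limit or at_system_limit

    def record_write(self, device: str, track: int) -> bool:
        """Record a write if it is accepted at once; return True if delayed.

        Simplification: delayed writes are only reported, not queued.
        """
        delayed = self.must_wait_for_destage(device, track)
        if not delayed:
            self.pending.setdefault(device, set()).add(track)
        return delayed
```

Note that a rewrite of a track that is already write pending is never delayed in this model, which mirrors the "not already write pending" condition in the notes.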

I/O Performance Concepts

Bandwidth
– The data transfer rate, measured in MB/s
– Key metric for SRDF links
– Attribute of host interconnects such as Fibre Channel

I/O Size
– Exchange like (4K block size)
– Oracle like (8K block size)
– Sybase like (2K block size)

Read / Write ratio

Random I/O
– Consecutive I/O operations are on different disk areas

Sequential I/O
– Small block sequential writes good for Snap performance
– Large block sequential writes poor for Snap performance

Different sources use the term ‘bandwidth’ differently. Some define bandwidth as the link capacity expressed in Megabytes, and others in Megabits. For the purposes of our discussion we will refer to bandwidth as the data transfer rate in Megabytes per second. It is an important consideration in designing networks for SRDF and Open Replicator. Bandwidth is also an important metric for host interconnects such as Fibre Channel.
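Because link capacity is often quoted in Megabits while this course uses Megabytes per second, a quick conversion helps when sizing a replication link. The sketch below is a back-of-the-envelope estimate only: the function names and the `efficiency` parameter are hypothetical, and protocol overhead, latency, and compression are ignored unless folded into `efficiency`.

```python
def megabits_to_megabytes_per_sec(mbit_per_sec: float) -> float:
    """Convert a link rate from Megabits/s to Megabytes/s (8 bits per byte)."""
    return mbit_per_sec / 8.0


def transfer_hours(data_gb: float, link_mbit_per_sec: float,
                   efficiency: float = 1.0) -> float:
    """Estimated hours to move data_gb over the link (1 GB = 1024 MB).

    efficiency is an assumed usable fraction of the raw link rate (0 < e <= 1).
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    mb_per_sec = megabits_to_megabytes_per_sec(link_mbit_per_sec) * efficiency
    return data_gb * 1024.0 / mb_per_sec / 3600.0


if __name__ == "__main__":
    # Hypothetical example: 500 GB over a 155 Mbit/s link at 70% efficiency.
    print(f"{megabits_to_megabytes_per_sec(155):.2f} MB/s raw")
    print(f"{transfer_hours(500, 155, efficiency=0.7):.1f} hours")
```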

Knowing the customer’s application environment can facilitate the planning process. The Performance Engineering group publishes performance information for the EMC internal Speed community. Their reports are based on different workloads. In general they comprise OLTP workloads with different average I/O sizes and read/write mixes and Decision Support workloads with different I/O sizes and read/write I/O mixes.

Typically the variables in a workload are: a) I/O size in KB, b) sequential or random I/O, and c) read/write mix. It is useful to know how each of these workloads affects different EMC products. For instance, large block sequential writes pay a noticeable CopyOnWrite penalty when the write occurs to the source or target of a Snap session. By comparison, small block sequential writes do not pay as high a penalty.

Security and Change Measurement Tools

Access Control
– The ability to limit the amount of control that a channel attached host with SYMCLI can exercise on a Symmetrix
– The function of segregating devices into pools for control operations by different hosts (e.g. TimeFinder, SRDF, Configuration Manager, etc.)

DeltaMark (SDDF)
– Symmetrix Differential Data Facility is used to track changes between the logical volume and its replica
– 16 per Symmetrix logical volume
– Used for Open Replicator, TimeFinder/Snap, TimeFinder/Mirror, Change Tracker, TimeFinder/Clone

Access Control is a free feature of Solutions Enabler that restricts the ability of a channel attached host to control the Symmetrix using SYMCLI commands. The Access Control paradigm divides the Symmetrix operations into 20+ privileges or Access Rights. Selected Access Rights such as BCV, SRDF, etc. can be assigned to different hosts so they can execute different subsets of SYMCLI commands on selected pools of devices. For instance, a host dedicated to the accounting department could be assigned the privilege to issue TimeFinder commands to a pool of standards and BCVs that are earmarked for use by that department. A host from the sales department lacking that privilege could not exercise TimeFinder control on accounting’s pool of devices.

The DeltaMark feature, sometimes known as the Symmetrix Differential Data Facility (SDDF), identifies tracks that have changed on a Symmetrix LUN since the creation of a DeltaMark session. Each Symmetrix LUN supports up to 16 DeltaMark sessions. It is the mechanism by which relative changes between two volumes such as the source and target of a Snap or a clone, and a standard and BCV are measured. It is also the mechanism by which volume changes in Change Tracker and data transfers performed by Open Replicator are measured.
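The idea of tracking changed tracks per session can be sketched as follows. This is a conceptual model only: the class and method names are hypothetical and the data structure is an assumption; only the limit of 16 sessions per logical volume comes from the text above.

```python
MAX_SESSIONS_PER_VOLUME = 16  # limit stated in the course notes


class TrackedVolume:
    """Toy model of per-session changed-track tracking on one logical volume."""

    def __init__(self, track_count: int):
        self.track_count = track_count
        self.sessions = {}  # session name -> set of changed track numbers

    def start_session(self, name: str) -> None:
        """Begin tracking changes from this moment on."""
        if name in self.sessions:
            raise ValueError(f"session {name!r} already exists")
        if len(self.sessions) >= MAX_SESSIONS_PER_VOLUME:
            raise RuntimeError("no free tracking session on this volume")
        self.sessions[name] = set()

    def write(self, track: int) -> None:
        """Mark the track as changed in every active session."""
        if not 0 <= track < self.track_count:
            raise IndexError("track out of range")
        for changed in self.sessions.values():
            changed.add(track)

    def changed_tracks(self, name: str) -> list:
        """Tracks written since the named session was started."""
        return sorted(self.sessions[name])
```

A session started later sees only the writes that happen after it begins, which is why the relative change between a volume and each of its replicas can be measured independently.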

TimeFinder Concepts

TimeFinder Emulation Mode
– Enables use of TimeFinder Clones in existing TimeFinder scripts
– Underlying technology behind RAID 5 BCVs
– Settable by session

Enginuity Consistency Assist (ECA)
– Used for TimeFinder consistent splits

TimeFinder Consistent Splits
– Required for SRDF/AR, SRDF/A and local applications

When TimeFinder was originally introduced in the late 1990s, it functioned by allowing a BCV to become an additional mirror. Though revolutionary at the time, the 4 mirror limit of a Symmetrix LUN came to restrict the number of concurrent full volume copies that could be made from a single source. By implementing TimeFinder/Clones, which use DeltaMark sessions as opposed to mirror positions, the number of concurrent copies was expanded to 8 with Enginuity 71.

TimeFinder Emulation mode preserves customers’ investments in scripts that had been written for TimeFinder/Mirror. Using this mode, calls to the TimeFinder/Mirror “symmir” command are translated by the CLI into the TimeFinder/Clone command “symclone”.

Enginuity Consistency Assist is the Symmetrix feature which makes it possible to perform TimeFinder consistent splits. ECA will hold write I/O to a user defined list of Symmetrix standard volumes while BCVs are being split from them. The momentary stoppage of writes permits the creation of a consistent and re-startable copy of the data on the BCV. Similar functionality is available also on TimeFinder/Snap, TimeFinder/Clone and Open Replicator, where ECA will hold writes to the source volumes while the Target volumes are activated.

TimeFinder Consistent Splits are an important Enginuity feature and can be executed in the local Symmetrix or in an SRDF attached remote Symmetrix.

SRDF Concepts

SRDF Consistency
– Technology to assure that remote copy is consistent and restartable
– Useful in synchronous SRDF setups with more than two Symmetrix units
– Used in SRDF/Star

SRDF Groups
– Theoretical max of 16 per director (pair) and 64 per Symmetrix
– Important for managing SRDF/A

SRDF/A session priorities
– A number from 1 to 64 that regulates the order in which SRDF/A sessions will fail if the Symmetrix fails to cope with incoming writes

SRDF/A Cycle Time
– The minimum time that must elapse before a delta set switch occurs, provided the transmit and the apply cycles have finished
– Governs the amount of data loss exposure

SRDF Daemon
– Host process run on one or more hosts to manage SRDF/A cycle switching and Synchronous SRDF consistency groups.

SRDF Consistency Groups for open systems allow customers to define logical volume groups, which can be associated with a given workload. These groups of SRDF logical volumes will automatically be suspended in case there is a failure to write to any volume in the group because of network or other hardware problems. The remote SRDF logical volumes will be consistent, even if these logical volumes span multiple Symmetrix systems. One such example is a large database with its tables on one Symmetrix and its log files on another.

The Open Systems version of this capability is available for HP-UX, Solaris and IBM AIX environments and uses PowerPath to manage inter-Symmetrix communication. Consistent split capability is inherent in SRDF/A implementations, where data on the R2 is guaranteed to be consistent if a consistent SRDF/A setup encounters a failure.

In most kinds of SRDF implementations, devices belonging to an RDF group can be subdivided into smaller SYMCLI device groups and distributed among different applications. These device groups can establish, split, fail over and fail back independently of each other. In contrast, a group functioning in SRDF/A mode requires that all volumes in the RDF group be managed as a single entity, i.e. devices belonging to an RDF group must all be placed in one device group. This places a limit on the number of device groups, and consequently on the number of applications, that can be supported per director pair in the Symmetrix.

Since Enginuity 71 supports multiple SRDF/A groups within a single Symmetrix, it is possible to assign session priorities to each SRDF/A group. If cache resources become overextended, the SRDF/A group traffic will be suspended in order from lowest priority (64) to highest priority (1).
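The suspension order described above amounts to sorting sessions by priority number, largest first. The sketch below shows that ordering; the function name and the session names are hypothetical, and how the array breaks ties between equal priorities is not stated in the notes (here ties keep their input order).

```python
def suspension_order(session_priorities: dict) -> list:
    """Return SRDF/A session names in the order they would be suspended.

    Priority 64 is the lowest and is dropped first; priority 1 is the
    highest and is dropped last.
    """
    for name, priority in session_priorities.items():
        if not 1 <= priority <= 64:
            raise ValueError(f"{name}: priority must be between 1 and 64")
    return sorted(session_priorities,
                  key=lambda name: session_priorities[name],
                  reverse=True)


if __name__ == "__main__":
    # Hypothetical sessions and priorities.
    sessions = {"payroll": 1, "reporting": 64, "email": 33}
    print(suspension_order(sessions))
```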

Cycle times for each individual group can be set using Symmetrix Configuration Manager with Solutions Enabler V6.0 and Enginuity 71.

The SRDF Daemon in Open Systems is used to maintain consistency protection in SRDF/S and SRDF/A environments. In a Synchronous SRDF environment it works with Symmetrix ECA to guarantee consistency. In SRDF/A environments it plays a role in cycle-switching when multiple SRDF/A sessions must be consistently managed.

Important Information About This Technology

Upon completion of this lesson, you will be able to:

Describe the technical data that you need to gather about the Symmetrix configuration and performance measurement

Describe the technical data that you need to gather about Open Replicator usage

Describe the technical data that you need to gather about TimeFinder usage

Describe the technical data that you need to gather about SRDF usage

The objectives for this lesson are shown here. Please take a moment to read them.

Hardware Layout of Existing Environment

Capacity
– Number of applications
– Amount of storage that each application / host uses
– Amount of used and unused capacity

Availability
– What type of RAID protection is being used (R-5, R-S, R-1)

Security
– Is Access Control being used to limit host control over the Symmetrix

Connectivity
– How many hosts are connected to the array
– Are redundant paths being used
– What is the total number of FA ports available vs. in use today

To understand an existing customer environment some of the questions that need answers are:

• What is the number of applications and the amount of storage each uses?
• What is the amount of used and unused storage capacity in the entire Symmetrix?
• What type of RAID protection is being used?
• Is Access Control being used to implement security?
• How many hosts are connected to the array?
• Are redundant paths in use?
• How many FA ports are in use? Are there any unused ones?

Performance and Management of Existing Environment

Performance
– Bandwidth (MB/sec) used by each application / host
– Read/Write ratio
– Random/Sequential I/O mix
– I/O Block size
– How much cache is being used
– Devices close to their device WP ceilings
– Is the Symmetrix approaching its WP ceiling

Management
– Any GUI based products (EMC ControlCenter, Performance Manager)
– CLI based

[Figure: I/O Profile of a Host running E-Mail Application]

Different applications have different I/O profiles. For instance, an E-mail application exhibits a spike early in the morning, then subsides to a steady level before tailing off late at night. Other applications have other IOPS characteristics.

To gather performance related information about an existing environment it is necessary to find out:
- The nature of the workload
- Bandwidth expressed in MB/sec
- Its read/write ratio
- Average I/O block size
- Random vs. sequential I/O
- Throughput in IOPS
- The size of cache in the existing Symmetrix
- How close individual devices and the Symmetrix are to the Write Pending ceiling limits

Are EMC ControlCenter, Replication Manager or EMC ControlCenter Performance Manager being used for management or monitoring?

Is SYMCLI in use?

Component Usage

[Figure: four graphs showing Host Ports, System-wide Cache, Back-end Directors, and Disks]

This slide shows the four primary component types within a Symmetrix: (from the bottom left) Host Ports (the Front-end activity), System Cache, Back-end directors, and the Disks. Component usage results expose design flaws by showing imbalances and poor use of directors, cache and devices. They also show where insufficient resources could result in poor performance by exposing places where resources are running at or near their technical limits.

The graphs on this page represent the four areas of contention within the Symmetrix and provide the following key metrics:

The host ports display is I/O per sec.

The cache activity is represented by a hit% graph.

Back-end directors show I/O per sec.

The Disks graph is of SCSI commands per sec.

As each of the components gets loaded, its responses become slower. Little’s Law as applied to storage subsystems states that response times increase with greater component utilization.

TimeFinder Usage in Existing Environment

Configuration
– Which TimeFinder product is being used (Mirror, Snap or Clone)
– How much data is being replicated
– How many concurrent and how many sequential copies are being used
– How long is the life of the copies
– What is the rate and total amount of change in the source data during the life of the copy

Performance
– If using Snaps or Clones, is there an unacceptable degradation of write performance due to CopyOnWrite or CopyOnAccess penalties
– Are the copies being predominantly used for reads (e.g. backups) or writes (e.g. data warehouse loads)
– Are Business Continuance time deadlines (for backup, reporting etc.) being adequately met

Management
– Are GUI based management products being used (EMC ControlCenter, RM)
– Are SYMCLI scripts being used for automation

The questions shown here pertain to any existing TimeFinder setup the customer may currently have.

The amount of data and the number of copies being used point to the amount of storage being used for local replication. A maximum of two concurrent copies are permitted in TimeFinder/Mirror and eight with TimeFinder/Clone. The duration of the copies and the amount of data that changes during that time could be an indicator of whether Snaps may be a viable option in the future.

Performance of TimeFinder Snaps and Clones can suffer if heavy writes are executed against the source while the copy is in progress. This is because of the Copy On First Write penalty in the case of Snaps and the Copy On First Access penalty in the case of Clones. With both products, the data from the point in time of Snap or Clone activation has to be moved from the source to the target before the initial write or access can be allowed.
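The Copy On First Write penalty described above can be illustrated with a toy snapshot. This sketch models only the Snap-style case, with a Python dictionary standing in for the target save area; the class and method names are hypothetical and the code is not a description of how the Symmetrix implements the feature.

```python
class SnapSession:
    """Toy copy-on-first-write snapshot over a list of track contents."""

    def __init__(self, source: list):
        self.source = source   # live production tracks, changed by writes
        self.saved = {}        # track index -> contents at activation time
        self.copies_done = 0   # number of extra copies (the penalty)

    def write_source(self, track: int, data) -> None:
        """Write to the source, preserving the original on the first write."""
        if track not in self.saved:
            # First write since activation: the old data must be copied
            # before the new write is allowed. This is the penalty.
            self.saved[track] = self.source[track]
            self.copies_done += 1
        self.source[track] = data

    def read_snapshot(self, track: int):
        """Return the track as it looked when the session was activated."""
        if track in self.saved:
            return self.saved[track]
        return self.source[track]  # unchanged tracks are read from the source
```

Only the first write to each track pays the extra copy, so a workload that rewrites the same few tracks is penalized far less than one that touches every track once.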

Is the performance suitable for meeting the backup/reporting deadlines?

Is the management of the product being done by using GUIs or CLIs?

Open Replicator Usage in Existing Environment

Configuration
– Is data moving to or from a non-EMC array
  If yes, is the other array qualified
– Is data being transferred to (pull) or from (push) the DMX
– Are incremental pushes being done
– How much data is being replicated in how much time
– Are live pushes or pulls being done
– Is the connection over a local or a wide area SAN

Performance
– What is the network distance and quality between the control DMX and the remote array
– What if any bottlenecks are affecting the transfer (e.g. network infrastructure, speed of remote array, hot volumes on the DMX)
– Is Open Replicator consuming too high a share of SAN bandwidth and affecting local I/O
– Are the “ceiling” and “pace” parameters being used

Management
– CLI based

Scalability
– How many LUNs are being migrated

Open Replicator is a relatively new product which enables point in time data transfers to occur between EMC DMX and other types of arrays. Various arrays of third party vendors have been qualified to work with Open Replicator including several models from Hitachi, IBM, and HP StorageWorks.

Incremental copies are only possible when the data originates on the DMX and is pushed out. Cold pushes can service up to 15 targets simultaneously. Live pulls run the risk of data loss if the session is interrupted before it completes. For these reasons it is important to know about the planned direction of data flow. Open Replicator uses a DeltaMark session and is subject to the 16 session limit for every logical volume.

The performance of Open Replicator is dependent on network quality. Other factors include the abilities of the source and target arrays to transfer data. A fast array can perform poorly if the volumes involved in the data transfer are too busy because of an uneven distribution of I/O load.

Open Replicator can sometimes impact host I/O by consuming an unfairly large share of SAN bandwidth. Two parameters, “ceiling” and “pace” can be used to throttle the amount of bandwidth that Open Replicator uses. “Ceiling” sets the maximum bandwidth that Open Replicator can use on a specific director and port regardless of how many sessions may be active. “Pace” can be set to slow down specific sessions by injecting waits between transfers.

At this time, the only management interface to Open Replicator is through SYMCLI.

SRDF Configuration in Existing Environment

Configuration
– What kind of SRDF is being used (/DM, /AR, /S or /A)
– Physical and network distance between production and DR site
– Amount of data that is being replicated
– Number of LUNs being replicated
– Network interconnect type (Fibre, ESCON or GIG-E)

What is the availability of data / what is the maximum possible data loss with this solution
– Days / hours (SRDF/AR)
– Seconds / Minutes (SRDF/A)
– None (SRDF/S or SRDF/Star)
– Are the Recovery Time Objective (RTO) / Recovery Point Objective (RPO) goals being met

In order to discover what kind of SRDF setup the customer has, it is important to know which flavor of SRDF is in use, the physical and network distances between the two sites, and the volume of the source and target data. The number of LUNs being replicated and the network interconnect types are also relevant.

If SRDF/Star is being used, the distances between the Workload site and the Sync target site, between the Sync target site and the Async target site, and between the Workload site and the Async target site need to be ascertained.

It is also worth inquiring whether the actual data loss potential is satisfying the expected Recovery Point Objective.

SRDF Operations in Existing Environment

Performance statistics
– Bandwidth
– Latency
– Packet Loss
– Is TimeFinder competing with SRDF Adaptive Copy for Symmetrix back end resources

Management
– GUI based (Replication Manager, SRDF/TimeFinder Manager)
– SYMCLI based

Scalability
– Are consistency groups being used to scale across multiple Symmetrix arrays

Network attributes such as bandwidth, latency, and packet loss have an impact on SRDF performance. With SRDF/AR implementations, the Symmetrix back end, i.e. the DAs, can find themselves trying to satisfy requests for host I/O, TimeFinder synchronizations, and SRDF copy tasks at the same time, leading to performance degradation.

Management of SRDF can be done with either SRDF/TimeFinder Manager, which is part of EMC ControlCenter, or with SYMCLI, which is part of Solutions Enabler.

Gathering Symmetrix Information

Upon completion of this lesson, you will be able to:

Describe the tools and resources that allow you to gather Symmetrix configuration, management, and performance information

Discuss how to interpret this information

The objectives for this lesson are shown here. Please take a moment to read them.

Information Gathering for Symmetrix

Interview storage managers / users to learn
– Nature of application
  Database type
  Read or write intensive
– Number of hosts connected to the Symmetrix

Interview account CE or examine Symmetrix BIN file to learn
– How many logical volumes are configured in the Symmetrix
– Protection type (RAID 1, RAID-5, RAID-S or unprotected)
– Number of front end host ports

Use host based tools (e.g. iostat, sar, Perfmon) to estimate
– Read and Write IO loads on the Symmetrix

Use EMC tools (if available) to:
– Measure Read and Write I/O loads (Performance Manager, STP, symstat)
– Gauge the size of needed storage by using SYMCLI (symdev)

One of the primary sources of information in existing Symmetrix environments should be the user community. They are the ones who can describe the nature of the application that the Symmetrix is being used for, for example; the type of database, whether it is read or write intensive, and the number of hosts that use it.

The EMC CE can also provide information about the layout of the Symmetrix, the protection scheme of the logical volumes, and the number of front end ports available for host connectivity.

To assess the I/O loads on the Symmetrix, there is a choice of tools. Host based tools such as iostat and sar on Unix, and Perfmon on Windows, are easily accessible but do not provide a comprehensive picture of the load in the Symmetrix.

EMC tools such as Performance Manager, STP, and the SYMCLI commands "symstat" and "symdev" provide a more comprehensive picture of the entire array.


Interpreting sar Information

sar -d 5 (example below is on Solaris) shows average R/W I/Os/sec., average number of 512 byte blocks/sec, average service time in millisec

SunOS losap160 5.8 Generic_117350-12 sun4u 08/11/05

18:18:44   device    %busy  avque  r+w/s  blks/s  avwait  avserv
           sd191        99    1.0    112   14386     0.0     8.8
           sd191,a       0    0.0      0       0     0.0     0.0
           sd191,b       0    0.0      0       0     0.0     0.0
           sd191,c      99    1.0    112   14386     0.0     8.8
           sd191,g       0    0.0      0       0     0.0     0.0
           sd192        99    1.0    112   14310     0.0     8.9
           sd192,a       0    0.0      0       0     0.0     0.0
           sd192,b       0    0.0      0       0     0.0     0.0
           sd192,c      99    1.0    112   14310     0.0     8.9
           sd192,g       0    0.0      0       0     0.0     0.0
           sd193        99    1.0    113   14489     0.0     8.7
           sd193,a       0    0.0      0       0     0.0     0.0
           sd193,b       0    0.0      0       0     0.0     0.0
           sd193,c      99    1.0    113   14489     0.0     8.7
           sd193,g       0    0.0      0       0     0.0     0.0

System Activity Reporting (sar) is one of the two major Unix host based performance monitoring utilities. It collects data and produces reports for CPU, memory, and disk performance. The sar -d command reports disk statistics. The columns show:

• The portion of time the device was busy servicing a transfer request

• Average number of requests outstanding during that time

• Number of read/write transfers per second to or from the device, and the number of 512-byte blocks transferred per second

• Average wait time in milliseconds

• Average service time in milliseconds
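As a rough illustration of how these columns combine, the sketch below derives throughput and average I/O size from the r+w/s and blks/s values of the sd191 row in the sample output. The function name and the assumed row layout are choices made for this example; only the 512-byte block size comes from the sar description above.

```python
# Derive throughput and average I/O size from one `sar -d` row.
# Assumes the Solaris column order: device %busy avque r+w/s blks/s avwait avserv.

BLOCK_BYTES = 512  # sar reports blks/s in 512-byte units


def summarize_sar_row(row: str) -> dict:
    """Parse a single sar -d data row and compute derived figures."""
    device, busy, avque, rw_s, blks_s, avwait, avserv = row.split()
    ios_per_sec = float(rw_s)
    bytes_per_sec = float(blks_s) * BLOCK_BYTES
    avg_io_bytes = bytes_per_sec / ios_per_sec if ios_per_sec else 0.0
    return {
        "device": device,
        "mb_per_sec": bytes_per_sec / 1_000_000,
        "avg_io_kib": avg_io_bytes / 1024,
        "avserv_ms": float(avserv),
    }


summary = summarize_sar_row("sd191 99 1.0 112 14386 0.0 8.8")
print(summary)
```

For sd191 this works out to roughly 7.4 MB/s at about 64 KB per I/O, which points to a large-block workload rather than small random I/O.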


Interpreting iostat information

iostat -xtczMn (example below is on Solaris) will print out number of reads, writes (IOPs) as well as MB/sec per device

                    extended device statistics
  r/s    w/s   Mr/s   Mw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
  0.0   13.4    0.0    0.1   0.0   1.0     0.0    71.1   0   4  c0t0d0
  0.0  111.6    0.0    7.0   0.0   1.0     0.0     8.9   0  99  c3t0d10
  0.0  114.6    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d11
  0.0  115.4    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d12
  0.0  114.0    0.0    7.1   0.0   1.0     0.0     8.7   0  99  c3t0d13
  0.0  114.8    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d14
  0.0   66.8    0.0    4.2   0.0   1.0     0.0    14.9   0  99  c3t0d15
  0.0   66.6    0.0    4.2   0.0   1.0     0.0    14.9   0  99  c3t0d16
  0.0  115.4    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d17
  0.0  115.4    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d18
  0.0  115.4    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d19
  0.0  115.4    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d20
  0.0  115.4    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d21
  0.0  115.0    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d22
  0.0  106.4    0.0    6.6   0.0   1.0     0.0     9.3   0  99  c3t0d23
  1.4    1.2    0.0    0.0   0.0   0.0     0.0     0.0   0   0  c3t0d171

iostat is another Unix host based performance monitoring utility. It collects data and produces reports for terminals, disks and tapes. The columns denote:

- r/s reads per second

- w/s writes per second

- Mr/s Megabytes read per second

- Mw/s Megabytes written per second

- wait average queue length of transactions waiting for service

- actv average number of transactions actively being serviced

- asvc_t average service time in milliseconds

- %w percent of time there are transactions waiting for service

- %b percent of time the disk is busy
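The columns above can be cross-checked against each other. The following sketch, offered as an illustration rather than a prescribed method, applies Little's Law (requests in service is approximately arrival rate times service time) and computes the average write size for device c3t0d10 from the sample output. The function names are invented for this example.

```python
# Cross-check an iostat -x row with Little's Law:
# average requests in service (actv) ~= arrival rate * average service time.

def expected_actv(reads_per_sec: float, writes_per_sec: float, asvc_t_ms: float) -> float:
    """Return the queue depth implied by the I/O rate and active service time."""
    return (reads_per_sec + writes_per_sec) * (asvc_t_ms / 1000.0)


def avg_write_kb(writes_per_sec: float, mb_written_per_sec: float) -> float:
    """Average write size in KB, taking 1 MB = 1024 KB."""
    return mb_written_per_sec * 1024 / writes_per_sec


# Values for c3t0d10 from the sample output: w/s=111.6, Mw/s=7.0, asvc_t=8.9
print(round(expected_actv(0.0, 111.6, 8.9), 2))
print(round(avg_write_kb(111.6, 7.0), 1))
```

The implied queue depth comes out close to the reported actv of 1.0, and the average write is about 64 KB, which is consistent with a single-threaded large-block write stream keeping the device 99% busy.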


Performance Manager Sample Report

Diskperf (Performance Manager) provides snapshot of Disk Performance

The diskperf utility is the major Windows performance measurement utility. It controls the types of counters that can be monitored using the System Monitor Utility.


Limitations of Host Performance Measurement

The host views each logical volume as a separate physical spindle
– All modern disk arrays present logical volumes to hosts as if they were standalone devices
– Different logical volumes can share the same physical spindles
– Logical volumes sharing the same physical spindles can contend with each other for the drive actuator and degrade each other's performance

Each host assumes it has the exclusive use of its logical volumes

Can only 'see' the I/Os and response times locally

Data must be collected for volumes across each host

Since the host views every logical volume as a separate entity, the information derived from the host based tools can be deceptive. Since disk arrays can host several logical volumes on the same spindle, the host tools could easily present a distorted view of a disk’s performance.

Host based performance measurement tools only offer performance information about the host they are running on. They cannot provide a comprehensive picture that shows the performance of the whole array.


Use of symstat

         DEVICE               IO/sec         KB/sec       % Hits     %Seq  Num WP
10:37:22                    READ  WRITE   READ   WRITE    RD  WRT   READ  Tracks

10:37:22 0022 (c3t0d10s2)      0    106      0    6826   N/A  100    N/A    5673
         0023 (c3t0d11s2)      0    106      0    6826   N/A  100    N/A    5672
         0024 (c3t0d12s2)      0    106      0    6826   N/A  100    N/A    5682
         0025 (c3t0d13s2)      0    106      0    6826   N/A  100    N/A    5662
         0026 (c3t0d14s2)      0     96      0    6144   N/A   99    N/A    7499
         0027 (c3t0d15s2)      0     53      0    3413   N/A   96    N/A    5662
         0028 (c3t0d16s2)      0     53      0    3413   N/A  100    N/A    5650
         0029 (c3t0d17s2)      0     96      0    6144   N/A   99    N/A    5656
         002A (c3t0d18s2)      0    106      0    6826   N/A  100    N/A    5639
         002B (c3t0d19s2)      0    106      0    6826   N/A  100    N/A    5648
         002C (c3t0d20s2)      0    106      0    6826   N/A  100    N/A    5659
         002D (c3t0d21s2)      0    106      0    6826   N/A  100    N/A    5661
         002E (c3t0d22s2)      0    106      0    6826   N/A  100    N/A    5637
         002F (c3t0d23s2)      0    106      0    6826   N/A  100    N/A    5661
         00B4 (c3t0d140s2)     0      0      0       0   N/A  N/A    N/A    2047
         00B5 (c3t0d141s2)     0      0      0       0   N/A  N/A    N/A    5695

                           ------ ------ ------- -------  ---  ---    --- ------
         Total                 0   1358      0   87374   N/A  100    N/A   88803

symstat breaks down reads and writes by Symmetrix Logical Volumes

The SYMCLI command, symstat, captures statistics information about the Symmetrix in real time. You can examine the performance of one or more devices or directors. The statistics in this display are broken down by I/Os per second, KB/sec, Read and Write cache hits, as well as a breakdown between reads and writes. The number of write pending tracks indicates the tracks awaiting destaging from cache to disk.
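To illustrate how such a display might be post-processed, the sketch below totals a few per-device rows and picks out the device with the most write-pending tracks. The tuple layout and function names are assumptions made for this example, and only four rows from the sample display are included.

```python
# Sum per-device write rates from symstat-style rows and find the busiest device.
# Each tuple: (Symmetrix device, write IO/sec, write KB/sec, write-pending tracks).
# Values are taken from the sample display; the tuple layout is an assumption.

ROWS = [
    ("0022", 106, 6826, 5673),
    ("0026", 96, 6144, 7499),
    ("0027", 53, 3413, 5662),
    ("00B4", 0, 0, 2047),
]


def totals(rows):
    """Return total write IO/sec, total write KB/sec, and total WP tracks."""
    io = sum(r[1] for r in rows)
    kb = sum(r[2] for r in rows)
    wp = sum(r[3] for r in rows)
    return io, kb, wp


def most_write_pending(rows):
    """Return the device with the largest number of write-pending tracks."""
    return max(rows, key=lambda r: r[3])[0]


print(totals(ROWS))
print(most_write_pending(ROWS))
```

A device whose write-pending count stands out from its neighbours, as 0026 does here, is a candidate for closer inspection of its back-end placement.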


Use of symdev

losap160/usr/sengupta> symdev list -sid 53

Symmetrix ID: 000387940053

        Device Name           Directors                   Device
--------------------------- ------------- -------------------------------------
                                                                           Cap
Sym  Physical               SA :P DA :IT  Config        Attribute    Sts   (MB)
--------------------------- ------------- -------------------------------------

0000 /dev/rdsk/c3t0d0s2     15C:0 16B:C0  2-Way Mir     N/Grp'd VCM  WD      11
0001 /dev/rdsk/c3t0d1s2     15C:0 02B:C0  2-Way Mir     N/Grp'd (M)  RW   17261
0002 Not Visible            ***:* 01A:C0  2-Way Mir     N/Grp'd (m)  RW       -
0003 Not Visible            ***:* 02A:C1  2-Way Mir     N/Grp'd (m)  RW       -
0004 Not Visible            ***:* 01B:C1  2-Way Mir     N/Grp'd (m)  RW       -

symdev list will print out device capacities, which can be added up to arrive at the total storage used by an application

The “symdev” command lists information about devices in the Symmetrix. By using this command it is possible to display the sizes of all volumes in the Symmetrix. If the user knows the identities of the volumes dedicated to his application he can then sum up their capacities and arrive at the total number of MB being used for his application.
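The summing step described above could be scripted along the following lines. This is a minimal sketch that assumes the capacity is the last whitespace-separated field on each device row and that a "-" entry carries no capacity of its own; the function name and the embedded sample text are illustrative only.

```python
# Sum the Cap (MB) column for selected devices in `symdev list` output.
# Assumes capacity is the last whitespace-separated field on each device row
# and that "-" contributes no separate capacity.

SAMPLE = """\
0000 /dev/rdsk/c3t0d0s2 15C:0 16B:C0 2-Way Mir N/Grp'd VCM WD 11
0001 /dev/rdsk/c3t0d1s2 15C:0 02B:C0 2-Way Mir N/Grp'd (M) RW 17261
0002 Not Visible ***:* 01A:C0 2-Way Mir N/Grp'd (m) RW -
0003 Not Visible ***:* 02A:C1 2-Way Mir N/Grp'd (m) RW -
"""


def total_capacity_mb(listing: str, devices: set) -> int:
    """Add up capacities of the named Symmetrix devices."""
    total = 0
    for line in listing.splitlines():
        fields = line.split()
        if not fields or fields[0] not in devices:
            continue
        cap = fields[-1]
        if cap.isdigit():
            total += int(cap)
    return total


print(total_capacity_mb(SAMPLE, {"0001", "0002", "0003"}))
```

The set of device names passed in would be the volumes known to belong to the application.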


TimeFinder Configuration Information Gathering

Interview storage managers and users to learn:
– Which TimeFinder products are being actively used
– Number of actual copies made
– Life of each copy of data
– Extent of change to the source during the life of the copy

Validate the above information
– Examine SYMCLI license file (symapi_licenses.dat) on the management hosts to see which TimeFinder products are licensed
– Check the device groups (symdg list) to learn how many devices with what capacities are currently being used for TimeFinder
– Peruse some of the log files (symapi-YYYYmmdd.log) to determine which TimeFinder family commands are being used
– If Snap is being used or planned, run change tracker to determine extent of change to source volume during the life of the copy
– Is emulation mode being used either through use of RAID 5 BCVs or use of the SYMCLI_CLONE_EMULATION environment variable

By interviewing users, you can discover which TimeFinder products are in use and how many copies of data are needed for business continuity operations. It is also important to know how long a copy must be available and how much the source and the copy change during the life of the copy.

Answers to those questions would provide a clue to the suitability of Snaps in that environment. The information gathered during user interviews can be validated by examining the following host based files:

The license database resides in /var/symapi/config/ or \Program Files\EMC\SYMAPI\Config. It lists the licenses of the Symmetrix software products that can run on that host.

The composition of the device groups and the actions they are involved in can be deduced by displaying a list of the device groups using the symdg list command and by examining the log files located in the /var/symapi/log and \Program Files\emc\SYMAPI\log directories.

If TimeFinder/Snap is being planned, change tracker may be run to determine the extent of change to the source during the period that the snap is expected to exist.

If TimeFinder/Clones are being used in Emulation Mode, this can be validated by checking whether the SYMCLI_CLONE_EMULATION environment variable is set.
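Part of this validation could be automated. The sketch below counts log lines that mention each TimeFinder command and reports whether SYMCLI_CLONE_EMULATION is present in the environment. It assumes that command names appear as plain substrings in the SYMAPI log text, which has not been verified against the actual log format, and the function names are invented for this example.

```python
# Count mentions of TimeFinder command names in SYMAPI log files and
# report whether clone emulation is enabled in the current environment.
# Assumption: command names appear as plain substrings in the log text.
import os
from collections import Counter
from pathlib import Path

TIMEFINDER_COMMANDS = ("symmir", "symsnap", "symclone")


def count_command_mentions(log_dir: str, commands=TIMEFINDER_COMMANDS) -> Counter:
    """Scan symapi-*.log files in log_dir and count lines mentioning each command."""
    counts = Counter()
    for log_file in sorted(Path(log_dir).glob("symapi-*.log")):
        with open(log_file, errors="replace") as handle:
            for line in handle:
                for command in commands:
                    if command in line:
                        counts[command] += 1
    return counts


def clone_emulation_is_set(environ=os.environ) -> bool:
    """True if SYMCLI_CLONE_EMULATION is present in the environment."""
    return "SYMCLI_CLONE_EMULATION" in environ
```

On a Unix management host this might be called as count_command_mentions("/var/symapi/log"), using the log directory named above.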


TimeFinder Performance Information Gathering

If performance is deemed unsatisfactory:
– Examine Performance Manager data to discover if write contention exists on the back end
– In case of Snaps, check change tracker data to examine amount of change over life of copy

If Business Continuance operations are failing to meet time deadlines
– Assess the viability of making more copies
– Use other products from the TimeFinder portfolio (such as Snap or Clone) to make copies available earlier

A common cause of performance problems with TimeFinder/Mirror is the overloading of the Disk Adapters (DAs). This occurs when excessive Establish/Restore activity collides with host write activity on older model (pre-DMX) Symmetrixes.

Another situation where performance problems can arise is when BCVs and the standards they are paired with reside on the same spindle. A TimeFinder operation in this configuration will cause contention on the drive.

If BC Operations are failing to meet time deadlines, it may be worth creating more copies of data. If more than 2 concurrent copies of data are needed, Snaps and Clones may be appropriate.


Information Gathering for Open Replicator

Interview storage managers and other users to learn:
– How much data is being transferred in which direction (push or pull)
– Is it to a Symmetrix or a supported non-Symmetrix array
– Are recurring incremental pushes being performed
– Network quality attributes (bandwidth, latency, packet loss)

Use EMC tools to validate:
– Issue symdev list and add up the sizes of the volumes participating in the OR session
– Measure existing load on the SAN to assess impact of OR on existing infrastructure by using Performance Manager
– Use symstat to measure the throughput of the network
– Use symcfg list and symmask list -logins to ascertain which FA is logged in to which FA

Examine Data from Performance Manager to:
– Assess the backend load on volumes involved in data transfer
– Possible overloading of SAN infrastructure

If Open Replicator is part of the current environment, you can interview existing users to find out:

- How much data is being transferred

- Whether it is to a Symmetrix or non-Symmetrix array

- Are incremental pushes being used

- Network quality

Using the SYMCLI command symdev list, you can find the devices participating in the Open Replicator session and their sizes.

The symstat command can provide snapshots of Open Replicator performance.

The outputs of the symcfg and symmask commands will show how the DMX and the remote array are configured.

Data from Performance Manager will indicate if the disk spindles or the SAN infrastructure are overloaded.


Information Gathering for SRDF

Interview storage managers and other users to learn:
– Which SRDF products are in use
– Distance between production and remote sites
– Distance between each site and the other two in an SRDF/Star configuration
– How much data is being transferred
– What is the RPO and RTO
– How SRDF is being managed (GUI or CLI)

Validate the above information:
– Examine SYMCLI license file (symapi_licenses.dat) on the management hosts to see which SRDF products are licensed
– Check the device groups (symdg list) to learn how many devices with what capacities are currently being used for SRDF
– Peruse some of the log files (symapi-YYYYmmdd.log) to determine which SRDF family commands are being used
– Use network monitoring tools from network hardware providers to assess the performance of the network
– Use Symmerge (available to Performance Gurus) to verify if SRDF traffic is impeding host or TimeFinder performance

By interviewing users, you can discover information about an SRDF infrastructure such as:

- Which members of the product family are being used

- Distance between the sites

- How much data is being transferred

- What is the RTO and RPO

- If management is being done via GUI or CLI

Answers to those questions can be validated by examining the SYMAPI license database, the SYMAPI log, and by examining the outputs from the command symdg list to examine the sizes of the devices participating in SRDF.

Network hardware providers such as CNT will often provide tools to monitor network performance. These software tools can often give a good indication of how the SRDF traffic is flowing across the network.

Symmerge, an EMC proprietary tool available to SPEED community members, can be used to determine if the back end of the Symmetrix is being overextended by SRDF and/or TimeFinder.


Use of Change Tracker

Each Symmetrix logical volume can use up to 16 DeltaMark (SDDF) sessions

Using the symchg create followed by symchg mark commands you can start a DeltaMark session on a logical volume

Incremental changes can now be measured at discrete intervals

Useful for potential Snap and SRDF/AR implementations

More information can be found in the white paper Using SYMCLI to Measure Volume Changes with Change Tracker

Change Tracker uses DeltaMark bitmap technology to identify logical blocks that have been changed on a Symmetrix FBA device. Before change tracking can begin, a DeltaMark session must be created using the symchg create command. The symchg mark command is then used to perform a timestamp and mark the selected area of disk storage occupied by a data object using the DeltaMark bitmap.

After a set of devices has been marked, incremental changes can be measured at discrete time intervals. Although the measurement interval can be set in seconds, practically, the measurement intervals would be a few hours to a few days depending on the time duration over which changes are being measured.

Data change rates are important for planning SRDF/AR and Snap implementations. Hence, Change Tracker is typically used to estimate change rates on devices that are candidates for participating in an SRDF/AR implementation, or devices that would be the sources for TimeFinder/Snaps.
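Once change data has been collected, turning it into planning figures is simple arithmetic. The sketch below is a hedged illustration only: the changed-data amount and measurement window are hypothetical values standing in for what a Change Tracker report would supply, and the function names are invented for this example.

```python
# Estimate how much of a source volume changes over the planned life of a Snap.
# All figures are hypothetical; in practice they would come from Change Tracker reports.

def percent_changed(volume_mb: float, changed_mb: float) -> float:
    """Changed data as a percentage of the source volume capacity."""
    return 100.0 * changed_mb / volume_mb


def change_rate_mb_per_hour(changed_mb: float, hours: float) -> float:
    """Average change rate over the measurement window."""
    return changed_mb / hours


volume_mb = 17261          # capacity of the source volume
changed_mb = 2589          # unique data changed during the window (hypothetical)
window_hours = 24          # measurement window (hypothetical)

print(round(percent_changed(volume_mb, changed_mb), 1))
print(round(change_rate_mb_per_hour(changed_mb, window_hours), 1))
```

The percentage indicates how much save area a Snap of that volume would need over the window, while the hourly rate is the kind of input used to size an SRDF/AR cycle.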

The Solutions Enabler manual on Change Tracker and the white paper Using SYMCLI to Measure Volume Changes with Change Tracker have more information.


Parameters and Tools

Upon completion of this lesson, you will be able to:

Describe Symmetrix configuration and management parameters and tools

Describe TimeFinder management parameters and tools

Discuss SRDF management parameters and tools

Discuss Open Replicator management parameters and tools

The objectives for this lesson are shown here. Please take a moment to read them.


Symmetrix Configuration Tool

EMC Customer Service uses Symmwin to configure Symmetrix

Customers can use Symmetrix Configuration Change CLI to make online changes

Any time a new Symmetrix is configured or an old one reconfigured, the EMC account team creates a proposed configuration after consulting the customer. This configuration is validated by the Configuration Control group at EMC. In the case of more complex solutions such as SRDF over IP and Open Replicator, an approval from the Solution Validation Center at corporate is needed.

The primary tool for configuring the Symmetrix is the Symmwin program which runs on the Service Processor. It is used by EMC Customer Service to set up the array in accordance with the approved configuration.

Customers can use the SYMCLI-based Symmetrix Configuration Change CLI to make online configuration changes to the Symmetrix.


TimeFinder Management Tools

Symmetrix Management Console
– Easy, point-and-click access
– Excellent for ad hoc TimeFinder operations

EMC Replication Manager
– Discovers replication environments
– Automates replication process
– Integrates replication technologies at the application level

TimeFinder/Exchange Integration Module (TimeFinder/EIM)
– Provides a comprehensive backup management interface specifically for Windows servers that support Microsoft Exchange databases residing in Symmetrix storage
– Produces exact copies of the production volumes that hold the Exchange server information stores and logs
– Full and single mailbox restores in a fraction of the usual time

TimeFinder/SQL Integration Module (TimeFinder/SIM)
– Provides a comprehensive backup and recovery management interface specifically for Windows servers that support Microsoft SQL Server databases
– Integrates and collectively automates the command actions and behavioral features

EMC offers a rich set of tools to manage and monitor TimeFinder. Symmetrix Management Console is a simple GUI application that is suited for ad-hoc Symmetrix management operations. It features easy point and click access to TimeFinder operations. It is good for individual TimeFinder operations but it is not suitable for automation.

EMC Replication Manager provides a GUI for managing local replicas. It can use multiple TimeFinder products such as mirrors and Snaps, and it permits the user to automate the replication process.

The TimeFinder Exchange Integration Module allows Windows users to automate Exchange Backup using TimeFinder. It has a built in capacity to perform consistency checks on the data before it is backed up to tape. The BCV data can be used for Information Store, Directory, or single mailbox recovery.

The TimeFinder/SQL Integration module is available for customers needing to quickly and easily integrate TimeFinder and SQL Server.


Managing TimeFinder with Solutions Enabler

SYMCLI commands are used to manage different TimeFinder products:
– For TimeFinder/Mirror: symmir
– For TimeFinder/Snap: symsnap
– For TimeFinder/Clones: symclone

Commands are based on device and composite group structure

Device Groups
– A collection of devices, assigned to a named group, to provide a more manageable object to query status and impart control operations
– Devices can be associated as either a device group or a composite group

Most TimeFinder users use SYMCLI scripts to automate their Business Continuance operations. The commands symmir, symsnap and symclone are used to control TimeFinder/Mirror, TimeFinder/Snap, and TimeFinder/Clones respectively.

These commands are designed to act on groups of devices placed by the user into device groups. Device groups are the basic building block of the Solutions Enabler universe. Composite groups are a similar construct, but they can contain devices belonging to several Symmetrixes, while device groups cannot.
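The difference between the two group types can be pictured with a toy model. The classes below are purely illustrative and are not part of Solutions Enabler; the class names, field names, and the second Symmetrix ID used in the usage note are all hypothetical.

```python
# Toy model of the grouping rule: a device group holds devices from one
# Symmetrix, while a composite group may span several.
# Class and field names are illustrative, not part of Solutions Enabler.
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Device:
    symmetrix_id: str
    device_name: str


@dataclass
class DeviceGroup:
    name: str
    devices: list = field(default_factory=list)

    def add(self, device: Device) -> None:
        """Reject devices from a second Symmetrix."""
        if self.devices and device.symmetrix_id != self.devices[0].symmetrix_id:
            raise ValueError("a device group cannot span Symmetrix arrays")
        self.devices.append(device)


@dataclass
class CompositeGroup:
    name: str
    devices: list = field(default_factory=list)

    def add(self, device: Device) -> None:
        """Accept devices from any Symmetrix."""
        self.devices.append(device)

    def symmetrix_ids(self) -> set:
        """Return the set of Symmetrix IDs represented in the group."""
        return {d.symmetrix_id for d in self.devices}
```

Adding Device("000387940053", "0022") and then a device from a made-up second array such as "000387940099" succeeds for a CompositeGroup but raises ValueError for a DeviceGroup.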


Managing SRDF with EMC ControlCenter

SRDF Manager within EMC ControlCenter
– Easy, "point-and-click" access
– Excellent for ad hoc SRDF operations

Symmetrix Management Console features an easy point and click access to TimeFinder and SRDF operations. It is good for individual operations, but it is not suitable for automation.


Managing SRDF with Solutions Enabler

SYMCLI commands are used to manage different SRDF products:
– For SRDF/A, SRDF/DM and SRDF/S: symrdf
– For SRDF/AR: symreplicate

Commands are based on device and composite group structure

Most SRDF users use SYMCLI scripts to automate their Disaster Recovery operations. The command symrdf is used to control SRDF/S, SRDF/A, and SRDF/DM, while symreplicate is used for SRDF/AR.

The commands are designed to act on groups of devices placed by the user into device groups. Device groups are the basic building block of the Solutions Enabler paradigm. Composite groups are a similar construct, but they can contain devices belonging to several Symmetrixes, while device groups cannot span Symmetrixes.


Tools to Manage Open Replicator

The SYMCLI command symrcopy is used to manage Open Replicator

It is possible to regulate the speed of data transfer using the pace and ceiling parameters

Open Replicator is a product released in 2005. It runs on DMXs running 71 code. Though it can be used between Symmetrixes, its main purpose is to enable transfer of data between a DMX and a dissimilar array. The remote array can be a CLARiiON or a qualified array from another storage vendor.

The symrcopy command controls Open Replicator actions. Two parameters, pace and ceiling, regulate the rate of data flow between the controlling DMX array and the remote storage array. The pace parameter can throttle a single Open Replicator session and can be specified for each transfer. The ceiling parameter determines what percentage of the total FA bandwidth can be used by all Open Replicator sessions using the FA.
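For planning purposes, the effect of the ceiling can be approximated with simple arithmetic. The sketch below is an assumption-laden illustration: the FA bandwidth and data size are hypothetical inputs, the function names are invented, and the result ignores the pace setting, SAN contention, and back-end load.

```python
# Estimate the shortest transfer time for Open Replicator sessions sharing one FA,
# given the ceiling (percent of FA bandwidth they may use).
# The FA bandwidth and data size below are hypothetical planning inputs.

def max_session_bandwidth_mb_s(fa_bandwidth_mb_s: float, ceiling_percent: float) -> float:
    """Bandwidth available to all Open Replicator sessions on the FA."""
    if not 0 <= ceiling_percent <= 100:
        raise ValueError("ceiling must be between 0 and 100 percent")
    return fa_bandwidth_mb_s * ceiling_percent / 100.0


def min_transfer_hours(data_mb: float, fa_bandwidth_mb_s: float, ceiling_percent: float) -> float:
    """Lower bound on copy time; ignores pace, SAN contention, and back-end load."""
    bandwidth = max_session_bandwidth_mb_s(fa_bandwidth_mb_s, ceiling_percent)
    if bandwidth == 0:
        raise ValueError("a ceiling of 0 percent leaves no bandwidth for the copy")
    return data_mb / bandwidth / 3600.0


print(round(min_transfer_hours(data_mb=500_000, fa_bandwidth_mb_s=150, ceiling_percent=40), 2))
```

A lower ceiling protects host I/O sharing the FA at the cost of a proportionally longer copy window.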


Best Practices

Upon completion of this lesson, you will be able to:

Describe best practices for optimizing Symmetrix configuration

Discuss best practices for TimeFinder performance optimization

Discuss best practices for SRDF operations

The objectives for this lesson are shown here. Please take a moment to read them.


Optimizing Symmetrix Configurations

Allow for future growth
– How many more hosts will need to connect to the server
– At what rate will storage capacity grow

Spread load across backend infrastructure
– Have many physical spindles work in parallel

Match application I/O block size to Symmetrix backend
– 2KB I/O size causes SRDF performance to suffer

RAID 1 or RAID 1+0 (striped metavolumes) are best for write intensive applications

RAID 5 is cost effective and performs well for read intensive applications

Important considerations for optimizing a Symmetrix configuration are:

• Allowing for future growth

• Spreading the workload across as many spindles as possible, thereby improving application performance.

• Matching application I/O size with the way Symmetrix handles data is good practice. Since the smallest block of data transmitted by SRDF is 4KB, it is not a good practice to run older versions of Sybase with 2 KB I/O block size on SRDF volumes. Block sizes corresponding to higher powers of 2 (i.e. 4, 8, 16, 32, etc.) are all right to use.

• RAID 1 or RAID 1+0 volumes are best for write intensive applications

• RAID 5 volumes are cost effective and offer good read performance. They are unsuitable for write intensive applications.
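The block size point can be made concrete with a small calculation. The sketch below assumes that each write is rounded up to a whole number of 4 KB units on the SRDF link, which is an inference from the 4 KB minimum stated above rather than a documented behavior; the function name is invented for this example.

```python
# Compare application I/O sizes against the 4 KB minimum SRDF transfer unit.
# Assumption: each write is rounded up to a whole number of 4 KB units on the link.
import math

SRDF_UNIT_BYTES = 4 * 1024


def link_efficiency(io_bytes: int) -> float:
    """Fraction of transmitted bytes that carry application data."""
    units = math.ceil(io_bytes / SRDF_UNIT_BYTES)
    return io_bytes / (units * SRDF_UNIT_BYTES)


for size_kb in (2, 4, 8, 16):
    print(size_kb, "KB ->", link_efficiency(size_kb * 1024))
```

Under this assumption a 2 KB write uses only half of what is sent across the link, while 4 KB and larger power-of-two sizes use all of it.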


Optimizing TimeFinder Performance

BCV performance could cause contention to source volume if not properly placed
– BCVs and standards should be on different spindles
– Establish/Restore operations stress the back end
– Short SRDF/AR cycles require frequent establishes and can stress the Symmetrix infrastructure

Clones are prone to CopyOnAccess performance penalty
– Avoid heavy performance-sensitive writes to source until clone has finished copying
– Place sources and targets on different spindles

Snaps are susceptible to CopyOnWrite penalty
– Avoid heavy performance-sensitive writes to source for life of Snap
– Limit data changes to source to about 30%
– Spread out Save Devices across many spindles

Avoid using TimeFinder/Clones and TimeFinder/Mirrors in the same Symmetrix
– It can lead to unexpected results

Since BCVs are full image copies, they require the same amount of usable disk space as the source volume. They do not have to be the same RAID type or drive type as their source. BCVs only require incremental resynchronization. BCV establishes place a strain on the Symmetrix back end. A large number of full establishes, or short SRDF/AR cycles which require frequent establishes and splits, can lead to heavy usage of the Symmetrix resources and hamper host I/O performance.

Clone performance has no impact on reads from the source as long as there is no workload on the target. Accessing the target before the clone is fully replicated could cause disk contention with the source. This impact is referred to as the CopyOnAccess penalty.

Immediately after its creation, all tracks on a Snap source are “protected”. This means that these tracks have to be moved to the Save area prior to any new writes to the source. This causes the first write to any track on the source to be delayed, and is called the CopyOnWrite penalty. If too much data on the source volume changes, Snap loses the advantage of being a space saving copy. Spreading out Save devices over many spindles is critical to performance. Otherwise they can become a performance bottleneck when simultaneous changes occur to a lot of snapped volumes.

Using TimeFinder/Mirrors and TimeFinder/Clones in the same Symmetrix is not recommended. It can lead to unexpected results.


Recommendations for Synchronous SRDF

SRDF performance factors

Write rate to SRDF volumes
– Provide sufficient bandwidth

Network distance
– Balance response time needs with impact of latency

Network quality
– Observe packet loss limits

Symmetrix Infrastructure
– Avoid hot volumes

[Figure: R1 and R2 volumes connected by SRDF]

The following factors affect SRDF performance:

Write rate - This refers to the amount of incoming write I/O that has to be replicated. Since Synchronous SRDF completes every write to the remote Symmetrix before acknowledging completion, the available bandwidth must be able to handle the rate of incoming writes. Otherwise, writes waiting for remote acknowledgement will slow the host down.

Network distance - Network distance determines write latency. A good rule of thumb is that every 125 circuit miles adds a millisecond of latency each way. Latency caused by network distance will determine how far the remote site can be without adversely affecting application performance under SRDF/S.

Network quality - Network quality has an impact on latency. Typically a packet loss of more than 0.1% is deemed to be unsatisfactory.

Symmetrix infrastructure - Synchronous SRDF will not queue more than one write to a logical volume. This means that if there is one overworked logical volume, all writes will slow down, because the application will not be able to continue until the write to the hot volume has been processed. Avoiding hot volumes requires careful analysis on a busy Symmetrix. Striping logs is one way of getting around the problem, because logs tend to be heavily accessed. Avoiding a lot of TimeFinder establishes at the same time that SRDF load is heavy can also help.
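The distance rule of thumb translates directly into a per-write latency estimate. The sketch below assumes one round trip per synchronous write by default, which is a simplification; the round_trips parameter and the function name are introduced here for illustration, and real links add equipment and protocol overhead on top of propagation delay.

```python
# Rule-of-thumb propagation delay for synchronous replication:
# about 1 ms each way per 125 circuit miles.
# Assumes one round trip per write; real links add equipment and protocol overhead.

MILES_PER_MS_ONE_WAY = 125.0


def added_write_latency_ms(circuit_miles: float, round_trips: int = 1) -> float:
    """Latency added to each synchronous write by distance alone."""
    one_way_ms = circuit_miles / MILES_PER_MS_ONE_WAY
    return 2 * one_way_ms * round_trips


for miles in (25, 125, 500):
    print(miles, "miles ->", added_write_latency_ms(miles), "ms")
```

Comparing this figure with the application's acceptable write response time gives a first estimate of how far away the remote site can be under SRDF/S.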


Balancing SRDF/A Cache and Bandwidth

Cache – must be configured to absorb excess writes during spikes

Bandwidth – Must be sufficient to handle average write throughput of the environment

Data Loss Potential (RPO) – Sum of the current (capture) and previous (transmit and receive) delta sets in seconds

[Figure: write throughput between 13:00 and 14:00, with reference lines for the average across ½ hour, the average across an hour, and the link required for sync mode]

The diagram above shows the effect of different bandwidths in an SRDF/A environment. The higher the bandwidth the less the need for caching writes inside the Symmetrix.

A reduction in the SRDF/A cycle time causes the bandwidth requirements to go up and the cache requirements to go down.

An increase in the SRDF/A cycle time reduces the bandwidth requirements and increases the cache requirements.

When cycle times are elongated because the write load is higher than the available bandwidth, the excess writes in the transmit cycle finish draining at the point in time when the area below the curve (spare link capacity after the spike) is equal to the area above the curve (writes in excess of the link bandwidth).
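The trade-off between cache and bandwidth described above can be illustrated with a toy simulation. This is a minimal sketch, not a sizing tool: the workload numbers are invented, the link is assumed to run at a constant rate, and `simulate_backlog` is a hypothetical helper name.

```python
def simulate_backlog(write_mb_per_s, link_mb_per_s, interval_s=1.0):
    """Return the cache backlog (MB) at the end of each sampling interval.

    write_mb_per_s: host write rate sampled once per interval (made-up data).
    link_mb_per_s:  constant link bandwidth assumed available to SRDF/A.
    """
    backlog = 0.0
    history = []
    for rate in write_mb_per_s:
        # Writes above the link rate accumulate in cache; spare link
        # capacity drains the backlog, which can never go below zero.
        backlog = max(0.0, backlog + (rate - link_mb_per_s) * interval_s)
        history.append(backlog)
    return history


# Hypothetical workload: 10 s quiet, a 10 s spike, then 20 s quiet again.
workload = [50.0] * 10 + [150.0] * 10 + [50.0] * 20

narrow = simulate_backlog(workload, link_mb_per_s=100.0)
wide = simulate_backlog(workload, link_mb_per_s=150.0)

peak_narrow = max(narrow)            # cache needed with the 100 MB/s link
drained_at = narrow.index(0.0, 20)   # first interval after the spike with no backlog
peak_wide = max(wide)                # cache needed with the 150 MB/s link
```

With the 100 MB/s link the spike builds a 500 MB backlog that is drained 10 intervals after the spike ends, which is the moment the excess (10 s at 50 MB/s over the link rate) equals the spare capacity used (10 s at 50 MB/s under it). With the 150 MB/s link no backlog forms at all.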


Recommended Practices for SRDF/A

When in SRDF/A mode all devices in the RDF group must be managed together

Engineering recommends maximum 8 groups per director (pair) though theoretical maximum is 16 per director (pair)

The following should be set using Symmetrix Configuration Manager if necessary
– Priority for RDF groups determines in which order an SRDF/A group should be suspended if cache resources become low (default is 33)
– Cycle time (default is 30 sec.) can be lowered to reduce potential data loss, though this will require higher bandwidth
– Amount of cache used by SRDF/A (default is 94% of available cache) can be adjusted downwards to guarantee availability of cache for local applications

After long outages drop out of SRDF/A and synchronize the two sides using Adaptive Copy Write Pending

Unlike other modes in SRDF, devices belonging to an RDF group in SRDF/A mode have to be managed together. They cannot be subdivided into smaller groups. This feature limits the number of independently manageable SRDF/A applications to the number of RDF groups in the Symmetrix.

The theoretical limits of supported SRDF groups are 16 per director and 64 per Symmetrix. If redundancy is desired, this puts the limit at 16 groups per director pair. However, engineering recommends a maximum of 8 SRDF groups per director. To learn about the prevailing limits it is best to consult with the Solution Validation Center.

Starting with Enginuity 5671 it is possible to dynamically adjust:

a) The priority of an SRDF/A group. The priority determines the order in which an SRDF/A session will be suspended if cache resources become scarce.

b) The cycle time. This is the minimum time that must elapse before a new cycle is started. The data loss potential or RPO in SRDF/A is the amount of data in the Capture and the Transmit-Receive cycles. By reducing the cycle time, it is possible to reduce the RPO. However, this will increase the bandwidth requirements of the solution.

c) By default, SRDF/A is permitted to use 94% of the available cache in a Symmetrix. This percentage can be adjusted downward so there is more cache available for local applications.
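To make the relationship between cycle time and RPO in item (b) concrete, here is a small sketch. It rests on a simplifying assumption that is not stated in the course: a cycle lasts either the configured minimum or the time needed to transmit its data over the link, whichever is longer. Both function names are hypothetical.

```python
def effective_cycle_s(min_cycle_s: float, data_in_cycle_mb: float,
                      link_mb_per_s: float) -> float:
    """Cycle length under the assumed model.

    The configured cycle time is only a minimum; if the data cannot be
    transmitted within it, the cycle is elongated to the transmit time.
    """
    return max(min_cycle_s, data_in_cycle_mb / link_mb_per_s)


def srdfa_rpo_seconds(capture_cycle_s: float, transmit_cycle_s: float) -> float:
    """RPO expressed in seconds: capture cycle plus transmit/receive cycle."""
    return capture_cycle_s + transmit_cycle_s


# Default 30 s cycle, link keeping up: about 60 s of potential data loss.
nominal_rpo = srdfa_rpo_seconds(30, 30)

# Hypothetical overload: 6000 MB captured, 100 MB/s link -> 60 s cycles.
stretched = effective_cycle_s(30, 6000, 100)
overloaded_rpo = srdfa_rpo_seconds(stretched, stretched)
```

Under this model, halving the cycle time halves the nominal RPO only as long as the link can still transmit each cycle's data within the shorter window.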

After a failure that has caused a significant number of invalids to build up, it is best to drop out of SRDF/A and change the mode to Adaptive Copy Write Pending until the two sides are nearly synchronized. Otherwise, you run the risk of SRDF/A being dropped because the link cannot handle the excess load in SRDF/A mode.


Recommendations for SRDF/AR

Symmetrix Automated Replication provides delayed replication using SRDF Adaptive Copy mode

SRDF Adaptive Copy and BCV establishes compete with host I/O for Symmetrix DA time
– Use QOS to slow down Adaptive copy or
– Increase SRDF/AR cycle times so there are fewer TimeFinder establishes

[Diagram: SRDF/AR configuration with STD, R1/BCV, R2 and BCV volumes]

SRDF/AR is a flexible disaster recovery solution that allows production data to be replicated at a slower pace than with SRDF/A or SRDF/S in exchange for a higher data loss potential.

Apart from the larger data loss potential, the primary disadvantage of SRDF/AR is its heavy use of the Symmetrix back end resources. Specifically, the Disk Adapters or DAs are responsible for scheduling adaptive copy writes across the RDF link. Since SRDF/AR also involves continual splitting and establishing of BCVs, the establish activity also places a heavy load on the DAs. Finally, host writes also pass through the DAs in the process of being written to disk.

The multiple activities contending for DA resources can sometimes cause performance issues in a Symmetrix. One simple workaround is to slow down SRDF link traffic by using the Quality of Service parameter. Another is to lengthen the SRDF/AR cycles so as to reduce the frequency of TimeFinder establishes and splits.


Considerations for the SRDF Failover Site (R2)

The Business Resumption Server/Site needs to be:
– Same platform type
– Comparable size
– Same levels of hardware and software

Must have network and power connections equal to those at the primary site

Requires access and data security levels equivalent to the source site

Should have similar physical environment characteristics to the primary production server/site

Planning for the restart requires that all components required for operations at the production site be available at the secondary site. The more resources that have to be procured at the remote site following a site outage, the longer the total Recovery Time.


Design Pitfalls and Overall Risks

Upon completion of this lesson, you will be able to:

Describe the risks, impact and options associated with Symmetrix solutions

Describe the risks, impact and options associated with TimeFinder solutions

Describe the risks, impact and options associated with Open Replicator solutions

Describe the risks, impact and options associated with SRDF solutions

The objectives for this lesson are shown here. Please take a moment to read them.


TimeFinder/Mirror Solution Risks, Impact and Options

Solution Risks
– Back end could get overworked
– BCVs failing while established
– All four mirror positions are used

Impact
– Busy Symmetrix back end impacts host I/O and SRDF/AR
– Replication will halt until failed drive is replaced
– Limits the number of concurrent TF/Mirror BCVs

Options
– Use QOS to slow down SRDF
– Use TF/clones instead of standard BCVs - clones do not use mirror positions
– Use protection mechanisms that do not consume mirror positions

The next few slides highlight some of the possible pitfalls in implementing Symmetrix software solutions.

TimeFinder/Mirror tends to consume Disk Adapter resources during establish operations. If the establish operation coincides with SRDF/Adaptive copy activity and host writes, performance can suffer. The workaround for this problem is to slow down SRDF using QOS or to reduce the number of establishes that conflict with the other two activities.

If a BCV fails while it is established, TimeFinder/Mirror processes will stop until the drive is physically replaced. This is true even if the BCV is mirrored, because an “establish” will pair only the “moving” mirror with the standard. If drive failures are a common problem, use of TimeFinder/Clones will get around the issue.

One of the drawbacks of TimeFinder/Mirror is that it occupies one of the 4 mirror positions in a Symmetrix logical volume. This limits the number of concurrent BCV copies to two. That limit could be extended to 8 and 15 copies by using TimeFinder/Clones or TimeFinder/Snaps respectively.


TimeFinder/Snap Solution Risks, Impact and Options

Solution Risks
– Rapid changes to source data
– Large percentage of change to source data
– Cannot be cascaded

Impact
– CopyOnWrite penalty will affect host I/O
– Will negate space saving benefit of Snap
– A Snap target cannot be used as a snap source

Options
– Pick a different replication solution
– Pick a different replication solution
– Try snapping from standard volumes

TimeFinder/Snap is a great product for environments where the source or the target do not experience heavy writes, and the data does not change much during the life of the copy. It can cause problems if TimeFinder/Snap is used in the wrong environment.

Heavy writes, either to the Snap source or the target, incur a CopyOnWrite penalty that slows down write performance. A large amount of data changing on the source negates the advantage of using Snaps because disk space is no longer conserved; in effect, a second copy of the data ends up in the shared save pool. A different local replication solution may be in order in both of these cases.

Snaps have a disadvantage in that they cannot be cascaded. Therefore, a Snap target cannot be used as a Snap source.
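The CopyOnWrite penalty and the loss of the space saving described above can be seen in a toy model. This is a deliberately simplified sketch and not how TimeFinder/Snap is implemented: tracks are plain list elements, writes to the snap target are ignored, and the class and method names are hypothetical.

```python
class SnapSession:
    """Toy copy-on-first-write snapshot of a source volume."""

    def __init__(self, source_tracks):
        self.source = list(source_tracks)
        # Track index -> original data preserved for the snap (the "save pool").
        self.save_pool = {}

    def write_source(self, track, data):
        # First write to a track since the snap was taken: preserve the
        # original data before overwriting. This extra copy is the penalty.
        if track not in self.save_pool:
            self.save_pool[track] = self.source[track]
        self.source[track] = data

    def read_snap(self, track):
        # Unchanged tracks are read from the source; changed tracks come
        # from the save pool, so the snap always shows the original image.
        return self.save_pool.get(track, self.source[track])

    def save_pool_fraction(self):
        return len(self.save_pool) / len(self.source)


snap = SnapSession(["a", "b", "c", "d"])
snap.write_source(0, "A")
snap.write_source(0, "AA")   # second write to the same track: no extra copy
snap.write_source(1, "B")
half_used = snap.save_pool_fraction()   # 0.5

snap.write_source(2, "C")
snap.write_source(3, "D")
fully_used = snap.save_pool_fraction()  # 1.0: the space saving is gone
```

Once every source track has changed, the save pool holds as many tracks as the source itself, which is the situation in which a different replication solution is suggested.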


TimeFinder/Clone Solution Risks, Impact and Options

Solution Risks
– Slow initial performance
– Different command structure from TimeFinder/Mirror
– Cannot be cascaded

Impact
– CopyOnAccess penalty while data copy is underway
– New commands have to be learned
– Clone targets may not be used as Snap or Clone sources

Options
– Wait till data copy is complete
– Use TimeFinder Emulation Mode
– Try cloning from standard volumes

TimeFinder/Clones offer two notable advantages over TimeFinder/Mirrors:

They are available immediately without having to wait for synchronization of source and targets, and they permit up to 8 concurrent copies as opposed to two with TimeFinder/Mirror.

The price of immediate availability is slower write performance to the source and target while the copy is in progress. The first write to a track on the source, or the first access to it on the target, is preceded by the transfer of the original track from the source to the target. If the performance degradation becomes a problem, it might be a good idea to wait until the data copy is complete.

TimeFinder/Clones have a slightly different command structure from TimeFinder/Mirror, and represent a learning curve for users. One way of shortening the learning process is to use TimeFinder/Clones in emulation mode. In this mode, users can continue using old TimeFinder/Mirror command syntax while the Solutions Enabler software translates each mirror command into its clone equivalent. Apart from subtle differences, this process is transparent.

Clones may not be cascaded. This means clone targets cannot be used as clone or snap sources. One could always use standard volumes as clone sources to avoid this problem.


Open Replicator Solution Risks, Impact and Options

Solution Risks
– FA Port Contention
– Poor SAN network quality
– Application impact during live migration
– Hot pull data loss if migration is aborted

Impact
– Contention occurs with incoming writes
– Impacts overall migration time
– EOR causes a CopyOnWrite penalty
– Write activity during migration is lost

Options
– Limit FA usage with ceiling and pace values
– Conduct a network assessment
– If application impacts cannot be tolerated use a ‘cold’ push
– Use hot pulls only when necessary

The big selling feature of Open Replicator is that it uses existing SAN infrastructure to transfer data between two storage arrays. The disadvantage of this feature is that there may be contention between the host and the Open Replicator session for FA bandwidth. It is possible to set the “pace” and “ceiling” parameters available in the product to prevent Open Replicator from using up an unfair share of the FA port bandwidth.

Poor network quality can lead to long or aborted migration attempts. It is best to conduct a network assessment while implementing Open Replicator over long distances.

Live migration (push) is a powerful Open Replicator feature which allows the production volume to stay online while a point in time snapshot of the data is being migrated. This can lead to application impact, because every “protected” or “yet to be transferred” track of data is first transferred before an application is allowed to write to that track on the production volume. This is known as the CopyOnWrite or CopyOnFirstWrite penalty. This penalty can be avoided if a cold push is undertaken.

When data is being pulled from a remote array while it is being accessed by the host on the DMX, any attempt to access a track that has not yet been moved over will result in the data being moved first before access is permitted. This is known as the CopyOnAccess penalty. Since the new data written to the DMX is not replicated back to the remote array, there is a potential for data loss if the Open Replicator session is terminated unexpectedly prior to completion.
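The hot pull behaviour described above, including why an aborted session can lose data, can be sketched with a toy model. This is an illustration under simplifying assumptions, not Open Replicator's implementation: tracks are list elements, and the class and method names are hypothetical.

```python
class HotPull:
    """Toy model of a hot pull: the local volume is online while tracks
    are still being copied in from the remote array."""

    def __init__(self, remote_tracks):
        self.remote = list(remote_tracks)
        self.local = [None] * len(self.remote)
        self.pulled = [False] * len(self.remote)

    def _pull(self, track):
        if not self.pulled[track]:
            self.local[track] = self.remote[track]
            self.pulled[track] = True

    def read(self, track):
        self._pull(track)           # CopyOnAccess: fetch before serving
        return self.local[track]

    def write(self, track, data):
        self._pull(track)           # CopyOnAccess applies to writes too
        self.local[track] = data    # new data is NOT sent back to the remote

    def background_copy(self):
        for track in range(len(self.remote)):
            self._pull(track)

    def complete(self):
        return all(self.pulled)

    def tracks_newer_on_local(self):
        # Tracks whose latest data exists only on the local volume.
        return [t for t in range(len(self.remote))
                if self.pulled[t] and self.local[t] != self.remote[t]]


session = HotPull(["a", "b", "c"])
first_read = session.read(1)     # pulled on demand
session.write(0, "A")            # host write during the migration
at_risk = session.tracks_newer_on_local()
done_before = session.complete()

session.background_copy()
done_after = session.complete()
```

If the session were terminated while `done_before` is still false, neither side would hold a complete, current image: the remote array lacks the write to track 0 and the local volume lacks track 2. That is the data loss exposure the notes warn about.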


Synchronous SRDF Risks, Impact and Options

Solution Risks
– Distance between subsystems
– High bandwidth requirements
– Busy volumes

Impact
– Elongated response times
– Expensive
– A single busy volume can slow down all traffic for the group

Options
– Consider use of SRDF/A if write response is poor
– Consider SRDF/AR if bandwidth costs are too high
– Spread application load evenly

Synchronous SRDF offers the benefit of guaranteeing that the source and target sites are exact mirrors of each other, all the time. The price for real time replication is in two forms: Greater write response times, and bandwidth requirements that meet or exceed the peak write capacity.

If write response times are unacceptably high, one possibility might be to consider SRDF/A instead of SRDF/S. By having the target just a few seconds behind the source, performance of source applications can be significantly improved.

High bandwidth can cost millions of dollars per year. To reduce the cost of bandwidth over large distances, SRDF/A or SRDF/AR may be a better solution. Note, however, that the reduction in bandwidth requirements using SRDF/A is not significant, since the bandwidth still has to keep up with the average arrival rate of writes.

SRDF/S is very sensitive to the existence of busy volumes. One overworked volume can impact the performance of the whole group of devices adversely. If busy volumes are affecting SRDF/S, a redistribution of load across the spindles is probably appropriate.
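The difference between sizing a link for the peak write rate (SRDF/S) and for the average write rate (SRDF/A) can be shown with a few lines of arithmetic. The samples below are invented for illustration and `link_sizing` is a hypothetical helper; how far the two figures differ depends entirely on how bursty the real workload is, and for a steady workload they nearly coincide.

```python
from statistics import mean


def link_sizing(write_mb_per_s):
    """Compare the two sizing targets named in the notes.

    SRDF/S: bandwidth must meet or exceed the peak write rate.
    SRDF/A: bandwidth must keep up with the average write rate.
    """
    samples = list(write_mb_per_s)
    return {
        "srdf_s_mb_per_s": max(samples),
        "srdf_a_mb_per_s": mean(samples),
    }


# Hypothetical write-rate samples in MB/s.
sizing = link_sizing([20, 40, 120, 60])
```

For these made-up samples the peak is 120 MB/s and the average is 60 MB/s.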


SRDF/A Solution Risks, Impact and Options

Solution Risks
– Insufficient bandwidth
– Insufficient cache
– Longer distances

Impact
– SRDF/A session drops
– SRDF/A session drops
– Longer resynchronization times

Options
– Acquire adequate bandwidth
– Provide enough cache
– Ensure availability of gold copy before starting resynchronization

Unlike synchronous SRDF which waits for an acknowledgement from the remote side before it acknowledges I/O completion to the host, SRDF/A will, by default, logically suspend the links if the bandwidth of the link cannot keep up with the arrival rate of the writes. Other than slowing down writes or buying more bandwidth, there is no simple solution to this problem.

A similar problem arises if there is insufficient cache to buffer the writes. Rather than slow down the host application, SRDF/A will logically suspend the session if it runs out of cache resources. Again, there is no simple solution to the problem other than to increase cache or slow down writes.

Typically, since SRDF/A is implemented over longer distances, it takes a longer time to resynchronize the two sides after a link failure. It is therefore important to preserve a gold copy of consistent restartable data on the target side prior to starting resynchronization after a failure.


SRDF/Star Solution Risks, Impact and Options

Solution Risks
– Complex and relatively new product
– Devices used for Star cannot be easily used for other applications
– Complex failover actions

Impact
– Long implementation cycle
– Combining Star with other technologies is non-trivial
– Need to understand data implications before deciding how to fail over

Options
– Use EMC professional services
– Use “-star” option to manage devices
– Ensure availability of gold copy before starting resynchronization

Synchronous SRDF waits for an acknowledgement from the remote side before it acknowledges I/O completion to the host. With SRDF/A, the links (by default) are logically suspended if the link bandwidth cannot keep up with the arrival rate of the writes. Other than slowing down writes or buying more bandwidth, there is no simple solution to this problem.

A similar problem arises if there is insufficient cache to buffer the writes. Rather than slow down the host application, SRDF/A logically suspends the session if it runs out of cache resources.

Typically, since SRDF/A is implemented over longer distances, it takes longer to resynchronize the two sides after a link failure. Therefore, it is important to preserve a gold copy of consistent restartable data on the target side prior to starting resynchronization after a failure.


Course Summary

Key points covered in this course:

Recognizing important technical data to be gathered about the use of Symmetrix

Gathering technical data for Symmetrix

Interpreting and comprehending the gathered data

Recognizing parameters to set and tools for managing Symmetrix

Identifying the best practices for configuring and deploying Symmetrix and its underlying applications

These are the key points covered in this course. Please take a moment to review them.