
Global Storage Resource Management
How EMC® SRM enabled a bank to save over $100MM real dollars in 2006, and $40MM annually in internal chargebacks

EMC Proven™ Professional Knowledge Sharing, July 2007

Rich Ayala, VP Senior Architect
A Leading Financial Institution
[email protected] [email protected]


Table of Contents

Introduction
Goals
Solution Requirements
The Solution
Collection Layer (EMC Control Center)
Correlation Layer
Presentation Layer
Report Scheduling
Report/User Security
Report Creation
Report Samples
Host Reports
Infrastructure Reports
Implementation Considerations
Site Inventory/Assessment
Solution Design & Implementation
Support Considerations & Next Steps
Deficiencies
Conclusion

Disclaimer: The views, processes or methodologies published in this article are those of the author. They do not necessarily reflect EMC’s views, processes or methodologies.


Introduction

I recently led the design and implementation of one of the world's largest globally-integrated Storage Resource Management (SRM) systems. Now, I have an opportunity to relax, take a deep breath and reflect on the past year. As the project's Lead Technical Architect, it's been a year of many ups and downs, and much learning. We achieved our goal, and it was influenced by many things - our storage technology base, current tools, past experiences with SRM, strategic storage initiatives - as well as the size of the organization and the environment's distributed nature. With 12 PB sitting on 435 arrays, served to over 7,000 servers, connected to over 50,000 switch ports residing in over 60 data centers across four continents, we believe this to be the largest integrated SRM implementation to date in terms of total storage and server assets discovered, size and global distribution of the storage.

Like many financial services organizations, the world's third largest bank had experienced several recent acquisitions. As a result of the mergers, the legacy IT organizations had deployed SAN topologies ranging from super-large edge-core-edge SANs of 7,500 switch ports down to SANs of two 32-port switches. Not surprisingly, vendor-specific tools from EMC, HDS, Brocade, McData and Cisco were implemented in many different ways, and the skill and maturity level for use of the tools varied widely. The range of capacities and capabilities across the data centers meant that any tools we selected would have to accommodate this heterogeneous mix.

Implementing a consolidated, common approach to SRM was driven by a strategic initiative to reduce run-away storage spending, improve utilization of deployed storage, and assure that new applications were being deployed to the appropriate storage tier. The bank's storage IT staff had historically taken new demand forecasts from the Lines-of-Business (LOB), including the Investment Bank, Credit Card Services, and Retail Banks, and simply coordinated the purchase and deployment of storage for the LOBs - they were order-takers and implementers.

Goals

A change was required after years of 50-60% annual capacity growth that resulted in 80% of all storage sitting on Tier 1. To enact this change, we formalized the following corporate storage goals:

• Improve host-level utilization of SAN storage from 40% to 60%
• Reduce Tier 1 storage from 75% to 50% in one year, moving many applications to Tier 2/Tier 3
• Stop new storage deployments at data centers tagged as "exit" locations
• Unless approved, ensure that any storage "refreshes" are deployed on the default tier (Tier 2) or a lower class of storage
• Reclaim unused storage


The need for an integrated storage management solution: Storage environments have grown far too large, too complex, and most importantly, too critical to the business to manage storage on an ad hoc basis, as has been the prevailing practice.

With incomplete, often inaccurate information about the makeup of the storage environment, no way to correlate applications to the storage components they depend on, and poor visibility into storage infrastructure events, storage administrators can't ensure consistent delivery of the storage service to the applications that drive the business - capacity, performance, availability, recoverability, scalability, and associated services.

-- Hitachi Data Systems

[Chart: SAN Installed Capacity (GB), 2002-2006]

Historical issues in the environment (2002-2005):

• Coping with the sheer size and rate of growth fully consumed storage staff for many years

• Vast majority of the environment was not instrumented

• Multiple Spreadsheets were used to track everything from capacity to configuration

• No single place to go for Storage Information Reporting

• Utilization Reporting was not possible

Global storage management before there was GSRM…



To achieve these goals, GSRM had to provide data views from different perspectives: array, switch, server, file system, and database. We had to provide roll-up summaries and drill-down capability by geographic location, owning customer organization (LOB) and storage tier. The reporting mechanism had to be flexible enough to slice-and-dice across all of these perspectives. With visibility into the current environment, management would be able to see how effectively the LOBs were using their current storage and could better evaluate requests. While overall usable storage grew by 2PB in 2006, GSRM helped the bank to:

• Reduce 2005 Tier 1 storage spend from over $100MM to $0 in 2006
• Reduce Tier 1 capacity from 78% to 59% while raising Tier 2 through Tier 5 capacity from 22% to 41%
• Improve SAN filesystem utilization from 40% to 55%
• Reduce internal charge-back by more than $40MM for our internal LOB customers

If 2006 business-as-usual demand & growth had continued into 2007:

Q1 2006 Utilization & Cost:
  Tier 1: 78.00% of capacity, 4.7 PB usable, $22,177,382 per month
  Tier 2: 15.00%, 0.9 PB, $3,067,085 per month
  Tier 3/4: 7.00%, 0.4 PB, $1,048,576 per month
  Total: 100.00%, 6 PB, $26,293,043 per month ($315,516,518 annualized)

Q1 2007 Utilization & Cost:
  Tier 1: 59.00% of capacity, 4.7 PB usable, $22,177,382 per month
  Tier 2: 20.00%, 1.6 PB, $5,452,595 per month
  Tier 3/4: 19.00%, 1.5 PB, $3,932,160 per month
  Tier 5: 2.00%, 0.3 PB, $314,573 per month
  Total: 100.00%, 8.1 PB, $31,876,710 per month ($382,520,525 annualized)

2007 business-as-usual WITHOUT re-tiering:
  Tier 1: 78.00% of capacity, 6.345 PB usable, $29,939,466 per month
  Tier 2: 15.00%, 1.215 PB, $4,140,564 per month
  Tier 3/4: 7.00%, 0.54 PB, $1,415,578 per month
  Total: 100.00%, 8.1 PB, $35,495,608 per month ($425,947,300 annualized)

Savings (Q1 2007 actual vs. business-as-usual): $43,426,775 annualized

The resulting integrated SRM infrastructure provides collection, reporting and dissemination of capacity and utilization data across these various perspectives. The project also laid the foundation for a common storage management framework that is now reaping benefits over-and-above the original project objectives. Let's look at what we learned with the design, implementation and support of integrating EMC Control Center as the foundation of GSRM with a third-party ETL and reporting product.
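The savings figure above is plain arithmetic: each scenario's monthly total times twelve, with the saving being the gap between the business-as-usual projection and the actual Q1 2007 run rate. A minimal sketch of that calculation follows; the figures are taken from the table above, while the structure and helper names are mine, not part of the GSRM tooling.

# Sketch of the cost arithmetic behind the table above; the dollar figures are
# the table's own, the helper names are illustrative only.
MONTHLY_COST = {
    # tier: (Q1 2006, Q1 2007 actual, 2007 business-as-usual)
    "Tier 1":   (22_177_382, 22_177_382, 29_939_466),
    "Tier 2":   (3_067_085,   5_452_595,  4_140_564),
    "Tier 3/4": (1_048_576,   3_932_160,  1_415_578),
    "Tier 5":   (0,             314_573,          0),
}

def annualized(column: int) -> int:
    """Sum one scenario's monthly costs across tiers and annualize."""
    return 12 * sum(costs[column] for costs in MONTHLY_COST.values())

actual_2007 = annualized(1)        # ~$382.5MM annualized
business_as_usual = annualized(2)  # ~$426.0MM annualized
print(f"Savings vs. business-as-usual: ${business_as_usual - actual_2007:,.0f}")
# ~ $43.4MM, matching the Savings line (small differences are rounding)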

Solution Requirements

The intent of the project was not to replace commercial element managers or in-house developed products. In fact, many of the element managers are needed for discovery into the GSRM infrastructure. The original requirements were primarily driven by the capacity planning group, but we soon learned that the LOB users and system administrators also had a variety of requests. The term "perspectives" can mean different things to different people. The chart below illustrates the relationship between managed objects, perspectives and report hierarchy.

Managed Objects: Array, Server, File System, Database, Switch, NAS Filer

Report Perspectives / Hierarchy          Report Format
LOCATION                                 Drillable through hierarchy: Tabular + Trended Area Chart
  By Global Region
  By Country
  By State
  By City
  By Datacenter
OWNER                                    Drillable through hierarchy: Tabular + Trended Area Chart
  By Line of Business
  By Cost Center
APPLICATION                              Grouping outside hierarchy: Tabular + Trended Area Chart
TIER                                     Grouping outside hierarchy: Tabular + Trended Area Chart


Each discovered storage asset is a managed object. Each managed object has specific properties, some of which are collected generically by the SRM tool and others imported from an external source. We needed the corporate asset system to assign Owner, Location, Tier and Application. We also had to be able to drill down into Location, from the Region down through the location hierarchy to a specific Data Center. Likewise, we had to be able to drill down into the ownership of the managed object to its Cost Center. We had to associate each managed object type to the hierarchy to provide different perspectives into the relationships.

For instance, utilization summarized by owner was not sufficient. We needed to summarize utilization by location, by owner, then by tier. We had to summarize regionally, drill down to the data center, then to the array and the servers connected to that array, see the detailed LUNs allocated and what was used by each server, and how much remained un-utilized or under-utilized. Since the LOBs are billed for their storage utilization, we had to provide a clean and efficient way of correlating storage utilization and trending not only to the server, but to the LOB and Cost Center. Then, we had to deliver that information into the hands of the LOBs so they could improve their storage utilization, consider storage re-tiering, and monitor and directly affect their storage cost.

Off-the-shelf SRM tools provide the capability to assign user groups to managed objects. This is typically a manual association that must be done in the tool's GUI for each managed object, and in each instance of the tool. And, typically, the association does not easily allow for incorporation of imported hierarchies based on a managed object type (i.e. array, server, switch, and database). We knew that we would need multiple SRM instances to provide adequate coverage across more than 60 data centers. It was essential that the corporate asset system assign the imported managed object properties to provide one centrally defined, common authoritative system-of-record.
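To make the drill-down idea concrete, the roll-ups described above amount to grouping the same utilization records by successively finer keys. Below is a minimal sketch assuming a flat list of records already enriched with the imported Owner, Location and Tier properties; the field names and sample values are illustrative, not the actual CAP schema.

# Minimal roll-up sketch: group utilization records by an arbitrary key path,
# e.g. region -> data_center -> array, or lob -> tier. Field names are
# illustrative; the real CAP schema differs.
from collections import defaultdict
from typing import Iterable

def rollup(records: Iterable[dict], keys: list[str]) -> dict:
    """Sum allocated/used GB at every level of the requested hierarchy."""
    totals = defaultdict(lambda: {"allocated_gb": 0, "used_gb": 0})
    for rec in records:
        path = tuple(rec[k] for k in keys)
        # accumulate at every prefix of the path so each level can be reported
        for depth in range(1, len(path) + 1):
            bucket = totals[path[:depth]]
            bucket["allocated_gb"] += rec["allocated_gb"]
            bucket["used_gb"] += rec["used_gb"]
    return dict(totals)

records = [
    {"region": "EMEA", "data_center": "London", "array": "SYM-0042",
     "lob": "Investment Bank", "tier": "Tier 1",
     "allocated_gb": 500, "used_gb": 310},
    {"region": "EMEA", "data_center": "London", "array": "SYM-0042",
     "lob": "Retail Bank", "tier": "Tier 2",
     "allocated_gb": 200, "used_gb": 60},
]

by_location = rollup(records, ["region", "data_center", "array"])
by_owner_tier = rollup(records, ["lob", "tier"])
print(by_location[("EMEA",)])            # summarized regionally
print(by_location[("EMEA", "London")])   # drilled down to the data center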

Collection/Discovery
• Ability to plug in new infrastructure after merger/acquisition
• Flexible discovery of server assets (limit host-based agents, where possible)
• Allow option to discover servers (based on masking data) without having to deploy full host agent
• Minimize number of agents required to deploy
• Reasonable amount of hardware to accomplish discovery (less than 50 separate servers globally)
• Support heterogeneous storage (EMC Symm/DMX, CLARiiON, Centera, Celerra, CDL, HDS, NetApp)
• Support discovery of back-end array/volume properties behind virtualization front-end
• Support for 64-bit OS & databases including Solaris, Windows, Linux, AIX, HP-UX, NetWare, Oracle, Sybase, SQL Server

Integration with External Systems
• Integration of external bank system data (to provide correlation of LOB owner, location, application)
• Integration with legacy custom in-house SRM system

Flexible Reporting
• Aggregation of global data into one reporting portal
• Dashboard capability for executive at-a-glance eye-candy
• Custom report definition
• Report scheduling
• Access to reports based on user role
• Flexible tier definition
• Slice & dice into perspectives

It was important to support custom reporting both for end users and IT support. We wanted to deploy an architecture where reports would reside in one location, be easily accessible and customizable, and most importantly, provide global aggregation of our 13 SRM instances. We needed dash-boarding capabilities for executive-level, at-a-glance reporting. We also had security requirements: LOB users were to access host-level data and the relationships of the host data to arrays, etc., but not array configuration, capacity and utilization information.

Bank product management defines storage tiers annually, and they may change based on technology advancements and corporate storage strategy changes. What was considered Tier 2 last year may now be Tier 3. The specific criteria may also change. For instance, in 2007, disk size is one of the defining characteristics, and virtual tape is assigned its own tier. With tiering at the forefront of several strategic initiatives including re-tiering, utilization compliance and seemingly continuous storage refreshes, the ability to flexibly adjust the definition of a specific "tier" is important.

Page 5

Page 6: Global Storage Resource Management - Dell...•Reasonable amount of hardware to accomplish discovery (less than 50 separate servers globally) •Support Heterogeneous storage (EMC

Currently, we define storage tiers using the following criteria (a hypothetical classification sketch follows the list):

• Model of array/virtual tape
• Serial number of array
• Size of disk
• RAID protection type
• Type of disk (ATA/FC)
• Type of volume (primary/standard, point-in-time replica/BCV, remote mirror/R2)
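The sketch below shows how criteria like these could drive a simple rule table. The rules and attribute names are illustrative placeholders only; the bank's actual, annually revised tier definitions are not reproduced here.

# Hedged sketch of rule-driven tier classification using criteria like those
# listed above. The rules themselves are invented examples, not the bank's.
TIER_RULES = [
    # (predicate, tier) evaluated in order; first match wins
    (lambda v: v["array_model"].startswith("CDL"),                 "virtual tape tier"),
    (lambda v: v["disk_type"] == "ATA",                            "Tier 3/4"),
    (lambda v: v["disk_type"] == "FC" and v["disk_size_gb"] <= 146
               and v["raid"] in ("RAID 1", "RAID 10"),             "Tier 1"),
]

def classify_tier(volume: dict) -> str:
    for predicate, tier in TIER_RULES:
        if predicate(volume):
            return tier
    return "Tier 2"   # the corporate default tier per the goals above

example = {"array_model": "DMX-3", "serial": "000190101234",
           "disk_type": "FC", "disk_size_gb": 300, "raid": "RAID 5",
           "volume_type": "primary"}
print(classify_tier(example))   # -> "Tier 2" under these example rules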

We also needed a means to tie storage forecasts and storage reservations back into the data to properly project true availability of storage. For example, if 500GB of Tier 2 is to be allocated to a new project, the overall availability of Tier 2 storage in that data center should reflect that reservation. Not knowing about future storage requirements limits the view of available storage. No tool on the market could satisfy all of these requirements for global collection and aggregation, correlation of owner, location, tier and application, and putting that utilization information into the hands of the LOB user. I wish that we were able to provide all of this in our solution. We did not satisfy all the requirements, but we did satisfy many, with subsequent phases to follow. The next sections describe the solution we implemented, and indicate where we achieved our goals, where we're lacking, and what we plan to do next.
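Reservation-aware availability, had we been able to implement it, is a simple subtraction once forecasts are captured. The sketch below illustrates the 500GB Tier 2 example from the paragraph above; the function and field names are hypothetical, since this requirement was not met out of the box by the tools described here.

# Hypothetical sketch of reservation-adjusted availability.
def available_after_reservations(free_gb: float, reservations_gb: list) -> float:
    """Free capacity for new requests once approved forecasts are netted out."""
    return free_gb - sum(reservations_gb)

# A data center with 2,000 GB of free Tier 2 and a 500 GB project reservation:
print(available_after_reservations(2000, [500]))   # -> 1500 GB truly available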

The Solution

There were many isolated installations of EMC Control Center® throughout the bank as well as a few other competing SRM products. Due to the fragmented nature of the existing implementations, we decided to implement EMC Control Center in conjunction with a third-party product, called SERP from NovusCG, to provide correlation, aggregation and presentation (CAP) functions. EMC Control Center would discover and collect SAN and NAS assets and feed the CAP tool. We understood that this integration effort would require significant joint development, collaboration, and testing with the CAP tool vendor, since this was a relatively new product. The bank recognized the risk and effort associated with implementing a relatively unproven product, but also realized the benefits of helping to shape and develop CAP product features and functionality.

EMC Control Center was the best SRM candidate for the bank, considering we primarily used EMC storage but had a significant HDS footprint. While the immediate goal was to standardize global reporting, future goals include globally standardized storage monitoring/alerting, globally standardized storage performance reporting, and some facets of automated provisioning. EMC Control Center provides many of those capabilities.

Inbound data sources to the schema are fed by standard and custom CAP tool "collectors". When a standard collector (via the StorageScope API) is not available, the CAP vendor develops a custom collector. Custom collectors import data from the bank corporate asset system as well as the legacy custom-developed SRM system.


[Diagram: GSRM functional architecture. The Collection Layer consists of EMC Control Center instances with device agents discovering storage, servers and SAN fabrics at core data centers (e.g. DC1, DC2, DC4) and at remote sites (e.g. DC3), which forward their data to a core. Each core instance feeds a local SERP server (Correlation Layer), and the local SERP servers feed the Global SERP Portal (Presentation Layer). Asset, LOB, location and application data enter as a secondary data source.]

The ultimate solution included three data sources inbound to the CAP Oracle 9i schema, and therefore, three collectors:

• EMC Control Center Collector - Data source for storage infrastructure (arrays, switches, NAS filers) and server discovery, including configuration, capacity, utilization and relationship data. This collector is a standard collector and does not require customization. This collector utilizes the EMC Control Center StorageScope™ API to import data into SERP (the CAP tool).

• Corporate Asset Collector - The corporate asset & inventory system defines ownership and location of servers and storage arrays. This is a custom collector.

• In-house SRM Collector - In-house developed system for storage infrastructure (arrays) and limited server discovery. Since the in-house tool had better asset coverage early in the project, its data would gradually be superseded by EMC Control Center data as each managed object was discovered. This is another custom collector.

These three data sources are combined, summarized, correlated and made available via a Business Intelligence front-end, Business Objects. To best describe the overall architecture, the solution was broken into three major functional layers:

• Collection Layer, where the base data elements are collected from EMC Control Center distributed agents
• Correlation Layer, where these collected data elements are combined and correlated
• Presentation Layer, where reports are generated and presented to report users

Collection Layer (EMC Control Center)

Multiple EMC Control Center instances had to be deployed to cover the bank's 60+ data centers. We performed a global storage audit to collect all the asset information site-by-site to best determine the number of instances required as well as their specific locations. After analyzing the current asset locations, growth, and planned decommission sites, eleven "core" sites were identified for deployment, housing 13 separate EMC Control Center instances. These "core" instances would serve as collection points. The remaining data centers were classified as "remote" or "distributed", and would collect and forward infrastructure data to the "core" instances for subsequent processing.


Data centers were classified as core, remote or distributed based on the following factors:

• Geographic location
• Size of data center
• Growth projections of data centers
• SAN management operational considerations
• Data center migration or closure schedules
• Type of storage infrastructure discovery required (IP or FC)

The following diagram illustrates the relationship of the Remote, Distributed and Core sites. "Core" sites housed the actual EMC Control Center repository and corresponding local CAP server. "Remote" sites had a significant storage infrastructure requiring dedicated hardware for storage agents. "Distributed" sites could either be discovered into the infrastructure via IP-based discovery, or had few FC-based discovery assets; therefore, they did not require dedicated servers for those agents.

[Diagram: global site map showing Core sites (GSRM-dedicated hardware hosting ECC and SERP infrastructure servers), Remote sites (GSRM-dedicated hardware hosting ECC Symmetrix agents) and Distributed sites (non-dedicated hardware using remote IP-based or FC-based discovery) across North America, South America, Europe and Asia Pacific. Each core ECC instance feeds a local SERP, and the local SERPs feed the Global Master SERP along with the IMS and KSRM collectors. Infrastructure agents discover arrays (EMC, CLARiiON, HDS), switches and NAS filers; host agents discover servers, databases and file systems.]

Each site had to be analyzed to ensure it did not exceed the recommendations in the EMC Control Center Performance and Scalability Guide. EMC's Performance and Scalability Guide is based on the number of managed objects (switches, servers, databases, arrays) as well as the number of SAN ports to be discovered, the number of volumes in the arrays, etc. These recommendations stipulate what types of agents may co-exist on the same servers and which must reside independently based on managed object counts. We reviewed the largest of the instances with the EMC Solution Validation Center (SVC). This is a formal process where the SVC reviews a qualifier to ensure that the proposed infrastructure (server, agent placement, collection times, etc.) is adequate to meet customer expectations and supportability requirements.


We added significant hardware to the EMC Control Center collection layer to avoid issues. The result was 13 instances with good levels of performance and stability. While EMC is expanding its support of EMC Control Center agents on VMware, this v5.2 SP1 implementation was prior to much VMware support, which would have reduced the overall hardware investment required. This investment represented less than 0.5% of the annual (assuming $100M annually) bank storage spend and resulted in end-to-end visibility of storage utilization. Since then, we have upgraded to SP4, which now supports more agents on VMware. In fact, we recently reduced the overall EMC Control Center footprint by 25%, bringing the number of North American servers to 58.

[Diagram: Large ECC server infrastructure for the CDC1 core site, dependent on EMC SVC validation and approval. Key points from the design:

• Core SAN requirements: CDC1ECCP02 requires 75 GB of SAN storage; CDC1SERP05 requires 400 GB of SAN storage.
• Management SAN requirements: a total of (20) Symmetrix gatekeepers are configured on each Symmetrix array, (10) dedicated to each Symm Agent server, with (5) presented to each Symm Agent server HBA port. The Storage Agent for Symmetrix requires EMC Solutions Enabler version 6.0.3 installed prior to agent installation.
• IP network requirements: each management server requires one connection each to the Production/ESF network, the SAN Management network, the Backup network and the RIB/RSA card; each core server requires Production/ESF, Backup and RIB/RSA connections.
• Server layout: four DL380s running Symm & SDM agents with Agent Stores, each management server connected to each of Fabrics A-D (four connections per server, four servers per fabric); a DL380 hosting the ECC Repository and ECC Server process; a DL380 hosting StorageScope and the Integration Gateway; two DL380s running ESX 3.0.1 with 2 VMs each for the WLA, FCC, Centera, HDS, NAS and CLARiiON agents; and a DL380 console server.
• Notes: (1) the number of Symm/SDM servers depends on Symmetrix managed object counts and the number of FC connections required for Symmetrix discovery; (2) ESX virtualization hosts 2 VMs per DL380 with agents as shown, per EMC recommendation; (3) servers with Symm/SDM agents and Stores may be virtualized but require a minimum of (2) DL585s, and any data center virtualizing Symm/SDM agents must qualify the configuration through the EMC Solution Validation Center process; (4) in preparation for ECC v6, EMC recommends isolating StorageScope onto a dedicated server and consolidating the Repository and Server process; (5) while consoles may be installed directly on end-user desktops, accessing a centralized console via Terminal Services/Remote Desktop provides ease of management and centralized control.
• CDC1 managed object count: Symmetrix: 54 (21 large, 15 medium, 18 SRDF "discovered"), CLARiiON: 23, HDS: 50, switches: 80 (12,000 ports), Unix hosts: 749, Windows hosts: 591, databases: 670 (est. 50% of hosts), NetApp: 17, Celerra: 3, Centera: 2.]

We deployed the following EMC Control Center components:

• Symmetrix® Array and SDM Agents
• HDS Storage Agent
• CLARiiON® Storage Agent
• NAS (NetApp and Celerra) Agent
• Centera™ Agent
• FCC (FibreChannel Connectivity) Agent
• Workload Analyzer Archiver (performance archives)
• Requisite EMC Control Center Stores, Repository & Server, Console, and StorageScope™

EMC's Control Center delivers a standardized storage management infrastructure with a common approach to collecting and analyzing storage performance. More and more functions are being performed using the EMC Control Center GUI rather than CLI scripts. Operational functions are being standardized and simplified.


Correlation Layer

The correlation layer consists of the CAP collectors that perform an Extract, Transform and Load (ETL) function to import locally-collected data into the global CAP database for centralized reporting.

A dedicated server hosting the local CAP database and application is associated with each EMC Control Center instance. Each local CAP instance imports data via the collectors. The local CAP data is then imported into the global CAP repository. The local CAP server that is associated to each local EMC Control Center instance is also known as a “child” SERP. The global CAP portal is known as “master” or “mother” SERP. The combination of EMC Control Center instance and CAP server is a Management Reporting Instance (MRI).

The child SERP servers are ProLiant DL585s with 400GB of usable SAN storage and two HBAs; they run Oracle 9i, Apache web services and Business Objects. The master SERP instance consists of two servers: a DL580 that is the application server, and a Solaris 9 database server running on a Fujitsu PP2500 domain, essentially equivalent to a SunFire 6500.

The daily import cycle to load data from each StorageScope into the local CAP database begins after the daily StorageScope reports run. The task scheduler on each child SERP kicks off the local import from StorageScope and subsequently the import to the global repository. Imports are controlled via a set of property files that are read upon each import. These property files define the data sources, their location and import date ranges, along with the sites to import. As the solution entered the acceptance testing phase, it became evident that additional functionality would be required to allow import of selected data sources as well as specific days. This logic was added and controlled by editing the property files.

There was also significant work done to provide a "last-in, best wins" logic, since we were importing data that was duplicated by both EMC Control Center and the legacy SRM system. This logic essentially supersedes legacy SRM data with EMC Control Center data once a given asset is discovered into EMC Control Center.
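The import control and precedence behavior can be sketched as follows. The property keys, record fields and the merge function are my own illustration of the "last-in, best wins" behavior described above, not the vendor's actual implementation.

# Illustrative sketch of property-file-driven import control and the
# "last-in, best wins" merge. Names are hypothetical, not the SERP product's.
import configparser

def read_import_properties(path: str) -> dict:
    cfg = configparser.ConfigParser()
    cfg.read(path)
    return {
        "sources": cfg.get("import", "sources").split(","),   # e.g. ecc,asset,ksrm
        "sites": cfg.get("import", "sites").split(","),
        "date_range": (cfg.get("import", "start"), cfg.get("import", "end")),
    }

SOURCE_PRECEDENCE = {"legacy_srm": 1, "ecc_storagescope": 2}   # higher wins

def merge_assets(records: list) -> dict:
    """Keep one record per asset, letting ECC data supersede legacy SRM data."""
    best = {}
    for rec in records:
        key = rec["asset_id"]
        if key not in best or \
                SOURCE_PRECEDENCE[rec["source"]] >= SOURCE_PRECEDENCE[best[key]["source"]]:
            best[key] = rec
    return best

merged = merge_assets([
    {"asset_id": "SYM-0042", "source": "legacy_srm", "usable_gb": 14_000},
    {"asset_id": "SYM-0042", "source": "ecc_storagescope", "usable_gb": 14_336},
])
print(merged["SYM-0042"]["source"])   # -> ecc_storagescope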

Presentation Layer

The CAP Business Objects (BO) presentation layer consists of the BO "Universe" and reports. The CAP database schema is configured to the BO tool as a BO "Universe". The CAP toolset provides the generic Universe with BO perspectives, data elements, filters, variables, as well as custom objects to meet any custom requirements. The CAP product provides many generic BO reports. We also produced a customized set of reports per bank requirements. We designed each report via the BO Report Builder tool. Access to the Report Builder and specific reports can be set up via the Supervisor tool, where user groups can be defined and specific reports placed in those groups.



CAP Collectors gather the data elements available for a report. The StorageScope Collector imports many of the StorageScope data elements plus calculated data elements such as those related to trending. The custom collectors gather the data elements from the legacy SRM system. The asset collector gathers fields such as Ownership hierarchy fields of LOB and Cost Center, as well as Geography hierarchy fields of Region, Country, State, City, and Data Center.

Report Scheduling

We can schedule reports to run daily, weekly, monthly, or on a user-defined schedule. This can be done with the BO Report Builder interface, but the BCA Manager provides the most flexibility in managing the reports: they may be suspended, started or rescheduled. Once a report is opened, it may be "refreshed". Each report refresh is a complete query initiated from BO against the Oracle database. Pre-scheduling intensive reports avoids the overhead of each user refreshing the report each time it is opened. Pre-scheduling has the report ready when the user needs it, and executes off-hours so that users do not have to individually rerun the report.

Report/User Security

The Supervisor tool administers report security. We decided to segregate report access by LOB so that one LOB could not see others' reports. Each Supervisor assigned the users and reports to the appropriate user groups. To support that requirement, a separate report for each LOB was required, and each separate report had a different filter (WHERE clause) coinciding with the LOB group. We decided on LOB-specific accounts: the user logs on to the bank Single Sign-On (SSO) system, and a front-end Perl script handles user account creation. We created a table outside of the CAP tool that mapped the user's Single Sign-On user ID to the back-end LOB-specific account. Other security settings are available to define whether a user may edit a report, save a report as an Excel or PDF file, refresh a report, schedule a report, etc.
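A simplified rendering of that SSO-to-LOB account lookup is sketched below. The production front end was a Perl script against a custom mapping table; the account names and structure here are hypothetical.

# Simplified, hypothetical rendering of the SSO-to-LOB account lookup.
SSO_TO_LOB_ACCOUNT = {
    # bank SSO user id -> back-end LOB-specific Business Objects account
    "jdoe":   "bo_investment_bank",
    "asmith": "bo_retail_bank",
}

def resolve_report_account(sso_user: str) -> str:
    try:
        return SSO_TO_LOB_ACCOUNT[sso_user]
    except KeyError:
        # Unmapped users get no LOB report access rather than a default group.
        raise PermissionError(f"No LOB report account mapped for {sso_user!r}")

print(resolve_report_account("jdoe"))   # -> bo_investment_bank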


Report Creation

We use the Business Objects Report Builder to easily create or change reports. The Builder interface has an Edit Query panel for defining the query as well as an Edit Report panel for defining the report format. In the Edit Query panel, we can easily insert schema objects (DB table columns) into a query or define query filters. The objects chosen to go into a report can be generic objects that are part of the BO Universe, or custom variables with corresponding formulas.

Report Samples

The best way to illustrate the solution is to review some of the more useful reports. Currently, there are 23 reports for each LOB/Business Unit, and there are 39 organizations. There are also 45 separate infrastructure reports (array, switch, NAS filer). Many of these also have separate tabs in each report, similar to separate worksheet tabs in an Excel workbook.

Host Reports

One of the most used reports is a prompted report where the user enters the host name of a server to get a detailed configuration. The Detailed Host Lookup Report provides the data source (EMC Control Center or legacy SRM), amount of storage Assigned, OS-Configured, Unconfigured (potentially reclaimable), Local Storage, SAN storage configured to file systems, SAN storage not configured to file systems (possibly raw partitions), OS version, owning LOB and cost center, server location, and server device-to-SAN volume relationship data (OS device, data/volume group name, local/array storage flag, array volume served from, SAN volume, SAN volume size, protection, type (Primary or Replica), and Replica type, i.e. BCV or SRDF R2). It also includes specific file system information (not shown).

The Host Monthly Utilization Report provides data on file systems residing on array storage by month.

The Host Reclamation Report identifies reclamation candidates by identifying the difference between what is "assigned" (aka masked) to the server as compared to what is actually configured to the server operating system. Here, the Business Objects alerter feature is used to highlight figures, in red, that have exceeded a threshold - in this case, "Host Unconfigured" and "FS Used %."
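The reclamation logic boils down to comparing what the array has masked ("assigned") to a host against what the host OS has actually configured, and flagging the gap when it crosses a threshold, much as the BO alerter highlights it in red. A minimal sketch follows; the field names and the 20% threshold are illustrative, not the report's actual definitions.

# Minimal reclamation sketch: flag hosts where masked ("assigned") capacity
# noticeably exceeds what the OS has configured.
def reclamation_candidates(hosts: list, threshold: float = 0.20) -> list:
    flagged = []
    for h in hosts:
        unconfigured_gb = h["assigned_gb"] - h["os_configured_gb"]
        if h["assigned_gb"] and unconfigured_gb / h["assigned_gb"] > threshold:
            flagged.append({"host": h["host"], "unconfigured_gb": unconfigured_gb})
    return flagged

hosts = [
    {"host": "nyux0123", "assigned_gb": 1200, "os_configured_gb": 700},
    {"host": "nywn0456", "assigned_gb": 400,  "os_configured_gb": 390},
]
print(reclamation_candidates(hosts))
# -> [{'host': 'nyux0123', 'unconfigured_gb': 500}]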


A data element called a Sort Code (a hierarchical structure similar to a chart of accounts in a general ledger) is imported via the asset collector to provide storage utilization by application. Applications are assigned specific sort codes. The Host Array Summary by Location Report provides SAN utilization by application and server, along with location information.

The Host Tiered Storage Details report provides the break-out of tiered storage, by server, organized by Location or by Owning LOB. When a server has SAN volumes configured from more than one array, both arrays (possibly different tiers) are shown. This also identifies if the server has any "shared" storage (labeled as clustered), where a volume is masked to more than one server (in the case of a cluster or multi-hosted BCV). Volume sizes are totaled and categorized as Locally Protected Storage, Point-In-Time Copy, or Remotely Protected Storage. This categorization is important for the bank since we charge storage product by Tier and by category. Note that EMC Control Center does not support identification of non-EMC Point-In-Time Copy or Remotely Protected volumes. For example, with HDS, these volume types are grouped in with Primary volumes (Locally Protected).


Several scorecard reports were produced to measure compliance with the storage initiatives. One such report measures Array File System Utilization by LOB.

The Host Array File System Detail screen shows the Business Objects "document map" navigation feature. The user interactively scrolls through the server list and selects a specific server. While the other screen shots have been the actual report, this is an example of the interactive navigation and drill-down inherent in Business Objects. The Scorecard Report provides server detail and summary information for a LOB. "Instrumented" refers to whether the server is fully discovered into EMC Control Center with both the master and host agent installed. When a server is partially discovered, or only has a master agent installed, "No EMC Control Center Agent" is shown under the Instrumented Action column.


Since several LOBs deployed EMC Control Center host agents to both SAN-attached and non-SAN-attached servers for full tool coverage, it was important to easily identify whether the server had SAN storage. Host source identifies the source of the data - either EMC Control Center or the legacy SRM system. This report also has file system utilization information on SAN volumes.

The Array-Host Relationship Report shows the Business Objects Document Map navigation feature. This report provides three report tabs for flexible relationship lookups - By Array, By Data Center or By LOB.

The Database Detail Report shows us where tablespaces reside on SAN volumes. The relationships between database, tablespace, datafile, array and array volume are brought together along with the overall utilization of the database. Another report, not shown, illustrates the utilization of each tablespace in a database. It relies on the deployment of the EMC Control Center Oracle agent, or the EMC Control Center Common Mapping agent (for non-Oracle databases).


Infrastructure Reports

The Array Capacity & Utilization Reports by Product Line illustrate Tier 3 array information by specific data center. Graphical and detailed information is shown by array. These reports provide Total Raw, Presentable (usable), Allocated and Free capacity. The Trended Array Capacity & Utilization Report by Product Line shows tiered capacity & utilization growth over a year, by month, for a specific Region. In the sample, the arrays were not discovered into EMC Control Center until May 2006; the legacy SRM system was collecting Total capacity data prior to that.


The Switch Utilization by Fabric Report provides switch-level capacity & utilization along with free/connected port information and type of connection (connected to Host, Storage, or other switch/ISL).

The Switch Utilization by Location Chart illustrates Free Port Count, Connected Port Count and Total Port Count.

The NAS Utilization Monthly Report provides NAS filer monthly capacity and utilization trending.


Implementation Considerations

The bank opted to use professional services to discover the storage and server assets, and to install the software product. The project was broken into six phases:

• Site Inventory/Assessment
• Solution Design
• EMC Control Center Infrastructure Deployment & Storage Asset Discovery
• CAP Tool Configuration
• EMC Control Center Server Discovery
• Training

Site Inventory/Assessment

Our first step was to perform a storage site audit to determine the required EMC Control Center and CAP infrastructure. Everyone understood that this would be one of the largest EMC Control Center implementations, and the largest CAP tool implementation, so it was imperative that site storage inventories were accurate. The growth or exit strategy for each data center was another consideration. Some data centers were identified as "growth", others as "maintain" and others as "exit", which dictated future plans for each data center. Based on the site growth/exit strategy, the size of the data center, and the relationships between the data centers, specific data centers were identified as places where MRIs (management reporting instances) would reside. An MRI consisted of the EMC Control Center infrastructure and CAP application servers.

We obtained storage configuration data including array model, array & switch microcode levels, connected fabric data, port counts, SAN switches, zoning information, and masking information. We combined informal spreadsheets with inventories from storage vendors. It was important that data center managers understood the project's criticality to assure cooperation and inventory accuracy. We identified a site lead in each data center who was responsible for audit and deployment. We achieved a 75% accuracy rate at implementation due to the dynamic nature of the storage environment.

Solution Design & Implementation

The SP1 EMC Control Center Performance & Scalability Guide was used to classify each install as Large or Medium, and as "Core, Remote or Distributed". A Core site contains dedicated GSRM hardware including an EMC Control Center & CAP Repository. Remote sites are large enough to warrant dedicated hardware, including fibre-channel attached management servers for Symmetrix discovery. Remote sites feed into a Core site, consolidating the collected data.


Distributed sites leverage existing non-dedicated, fibre-channel attached servers already in the environment, and feed into one of the Core sites. Often, the business would not permit an agent to be installed for Symmetrix discovery, and proxy Symmetrix discovery was required. Proxy Symmetrix discovery is limited in that it does not support the Symmetrix Device Masking (SDM) agent, which captures Symmetrix masking data - so many Distributed sites could not collect device masking data. Without masking information for these sites, storage reclamation candidates could not be definitively identified.

We standardized our MRI configurations for maximum supportability and flexibility. The number of fabrics in the data centers determined how many HBAs would be required in the Symmetrix agent servers to discover all the Symmetrix frames. This varied between data centers based on the legacy organization that implemented the SAN management and fabric design. Some legacy organizations were very efficient in building out few, large fabrics, while others built many smaller SAN islands. Since the effect of data center/technology consolidation had not been felt at design time, it was important to provide enough HBAs in the Symmetrix agent servers to allow flexible discovery of Symmetrix-attached fabrics. We decided upon a minimum of 4 HBA ports per Symmetrix agent server. This would provide the most flexibility and efficiency in connecting to Symmetrix fabrics.

While the initial requirements of the project were only for discovery and collection to produce daily reporting, the EMC Control Center infrastructure was sized to support the usual functions of active management, performance collection and alert monitoring. This proved to be a wise decision, as EMC Control Center's scope will expand over time.

Just as an inventory of storage assets was not readily available globally, a definitive list of servers was also unavailable. We utilized what inventory we could find, joined that with suspect SAN billing data and came up with a target list of servers that would have the EMC Control Center agent installed. We soon realized that to "facilitate" the cooperation of the businesses, we had to:

• create agent installation packages to ease the agent install • scorecard the progress of the agent installs • identify LOB-side project leads

We also had to secure the approval and cooperation of the bank Engineering Board, which approves technology standards. The installation of the EMC Control Center agent became a required standard on all SAN-connected servers. We held weekly progress calls with each LOB to review the status of host agent installs.

To ease agent installs, master/host agent packages were created for each OS and placed on NIM (AIX) and Jumpstart (Sun & Linux) servers. We also established a standard host qualification process where a modified EMC Grab Lite was downloaded from a central site, executed on each target server and the output run through a qualification database to determine if any server remediation was required prior to agent install.
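The qualification step can be thought of as a lookup of the grabbed host facts against a support matrix. The sketch below is a simplified stand-in for the actual qualification database; the supported OS levels and the free-space rule are examples only, not EMC's support matrix.

# Simplified stand-in for the host qualification check run against EMC Grab
# Lite output. Supported-version examples are illustrative only.
SUPPORTED = {
    "SunOS": {"5.8", "5.9", "5.10"},
    "AIX":   {"5.2", "5.3"},
    "Linux": {"RHEL3", "RHEL4"},
}

def qualify_host(grab_facts: dict) -> list:
    """Return a list of remediation items; an empty list means ready for the agent."""
    issues = []
    os_name, os_ver = grab_facts["os"], grab_facts["os_version"]
    if os_ver not in SUPPORTED.get(os_name, set()):
        issues.append(f"unsupported {os_name} {os_ver}")
    if grab_facts.get("free_mb_var", 0) < 500:        # example space requirement
        issues.append("insufficient free space in /var for agent install")
    return issues

print(qualify_host({"os": "SunOS", "os_version": "5.9", "free_mb_var": 2048}))
# -> [] : no remediation required before the master/host agent install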


Support Considerations & Next Steps

With 99% of the storage assets and 95% of the SAN-attached servers now discovered into the infrastructure, we are now in production support mode. Phase II of GSRM includes:

• Health Monitoring
  o Installed the EMC Control Center Integration Gateway to forward alerts for Symmetrix, CLARiiON, Centera and Celerra to the Tivoli TEC framework, where global alerts are monitored by the Storage Operations Center

• Performance Analysis
  o While default array collection policies for Symmetrix and CLARiiON are established, we also enable custom Performance Manager (WLA) Analyst collections as requested for more granular array intervals, as well as switch and array collections
  o Planning to implement default host-level performance collections, as well as Core switch collection
  o Define performance alert thresholds for arrays and/or hosts
  o Each LOB architect exports an EMC Control Center Visual Storage back-end layout to Excel to supply to the LOB system administrator. This layout is manually compared to WLA array reports to best determine application and database placement. EMC Control Center roles are established to allow read-only access to these users.

• Deployment of database agents
  o To obtain utilization of databases and raw partitions, we are testing, packaging and deploying the Oracle agent as well as the Common Mapping agent for Sybase and SQL Server

To adequately support global deployment, we recommend the following resources:

• Host agent support. This is the most tedious support item. Master or host agents become inactive for various reasons, including the master agent start not being included in system startup, or OS image rebuilds where the EMC Control Center agent is inadvertently omitted. Agent/server decommissioning is an area we are still working on.

• EMC Control Center infrastructure support. Daily care-and-feeding of EMC Control Center and SERP, including space monitoring, service management, alert monitoring, new asset discovery (array, filer, switch), WLA custom analyst collections and DCP management.

• SERP application support. Report modification & scheduling.

Deficiencies

While the solution satisfies 90% of the original requirements, here are some areas where improvement is needed by both vendors, the SRM/collection vendor as well as the CAP vendor:

• CAP Extensible Database. Cannot update or insert any new data other than by the Collector. This is being addressed in the next release.

• SRM Visibility behind Virtualization Head. Cannot determine back-end storage properties behind the virtualization layer.


• Limited HDS Support. Cannot determine the type (FC/ATA) or size of the physical disk, or the type of LDev (P-vol, S-vol).

• Integrated workflow, where demand forecast, storage and port reservations are integrated into base SRM data to provide full capacity and utilization of arrays.

• CAP Tool Report Variables. Custom variables such as tier are not globally defined and must be defined in each separate Business Objects report. If a report variable formula changes, this can be very difficult. This issue is being addressed in the next release.

• Orphaned Storage definition, and the ability to easily correlate masked LUNs to server OS-configured LUNs. Neither vendor readily provides out-of-the-box orphaned storage reports. The capability will exist in the next release of the SRM tool as well as the CAP tool.

• Manual "munging" is still required. While the overall solution provides an evolutionary step forward in SRM, there are times when munging (custom data manipulation) is still required.

Conclusion

EMC Control Center feeding the Novus SERP tool accomplished what few, if any, organizations have accomplished: we have collected and disseminated globally-distributed SRM information in a format that is customized by and for the customer. The fact that the overall solution gave visibility into the storage environment to enable the bank to avoid $100MM in storage spend, as well as $40MM in internal chargeback, trumps any deficiency list. A capacity analyst reported that it used to take him two days to pull data; now it takes only two hours. Our hope is that the SRM and CAP vendors will prioritize the areas that need improvement so that these types of cost savings and improved efficiencies can grow as the tools evolve and mature.
