SAN Best Practices

Page 1: SAN Best Practices

Copyright © 2007 EMC Corporation. Do not Copy - All Rights Reserved.

Module 8: SAN Management - Best Practices - 1

© 2007 EMC Corporation. All rights reserved.

SAN Management - Best Practices

Module 8

Page 2: SAN Best Practices

SAN Best Practices

Upon completion of this module, you will be able to:

Describe methods to assess SAN configuration

Describe methods to secure SAN architecture, components, and mechanisms

Define Performance Management

List Performance Management Tools

These are the objectives for this module. Please take a moment to review them.

Page 3: SAN Best Practices

Lesson 1: Assess SAN Configuration

List best practices for documentation

Identify SAN design and modeling

Describe how to track and manage assets

Identify SAN configuration elements to regularly back up

Describe capacity planning

These are the objectives for Lesson 1. Please take a moment to review them.

Page 4: SAN Best Practices

Documentation - Best Practices

Maintain all information related to the SAN in a central location, in a searchable format

Maintain a site diagram highlighting all the different components

Review all documentation periodically for errors and/or updates

Use EMC provided SAN documentation to augment existing information

At the very minimum, a current inventory of all SAN equipment should be created and maintained. A site map, indicating connectivity and site design, should also be created and maintained.

Page 5: SAN Best Practices

Tracking and Managing Assets

Leverage management applications to track and manage assets
– Know utilization
  • Track use rates (storage, bandwidth, port consumption) by application, host, and environment
  • Develop rules for future resource allocation
– Sort assets by use group
  • Charge back to user groups

All IT assets should be inventoried, and the usage of the assets should be tracked. This allows for the management of shared resources, such as storage, switch ports, and bandwidth. It also provides for the implementation of cost center methodologies to enforce charge back policies for resource consumption and utilization.

Develop reports and fact-based metrics from this information to manage systems and plan for future growth.

An EMC product that is used to track and manage assets is EMC ControlCenter StorageScope.

Page 6: SAN Best Practices

Using Storage More Efficiently

Predict storage needs, dictate what growth is possible

Keep organizing groups:
– Host groups
– Array groups
– Switch groups
– Application groups

Review historical trends

Whenever possible, reclaim unused storage and return it to the array free pool. Do not let devices remain on host ports using valuable address space, or remain in the VCMDB to be used randomly. Know the capacity of the array, what is allocated, and what is free.

Tools such as EMC ControlCenter StorageScope can help predict storage needs more accurately by:
• Organizing hosts, arrays, and switches into groups that represent growth areas
• Providing views of historical trends, with basic metrics that can predict where future consumption may occur in the environment
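The idea of using historical trends to predict consumption can be illustrated with a small sketch. It is not how StorageScope works internally; it assumes a simple linear trend, and the function name and the sample layout (date, allocated GB) are hypothetical.

```python
from datetime import date, timedelta


def forecast_days_until_full(samples, total_gb):
    """Estimate days from the last sample until allocation reaches total_gb.

    samples: list of (date, allocated_gb) pairs, oldest first (assumed layout).
    Fits a least-squares line through the samples. Returns None when the
    trend is flat or shrinking, because no exhaustion date can be projected.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    origin = samples[0][0]
    xs = [(d - origin).days for d, _ in samples]
    ys = [gb for _, gb in samples]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("samples must span more than one day")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    full_day = (total_gb - intercept) / slope  # day offset at which the line hits capacity
    return max(0.0, full_day - xs[-1])


if __name__ == "__main__":
    start = date(2007, 1, 1)
    history = [
        (start, 1000.0),
        (start + timedelta(days=30), 1300.0),
        (start + timedelta(days=60), 1600.0),
    ]
    print(forecast_days_until_full(history, total_gb=2000.0))
```

Running the same projection per host group, array group, or application group shows which growth area is likely to exhaust its share first.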

Page 7: SAN Best Practices

SAN Design and Modeling - Limitations

Growing scale and complexity of SAN environments

Lack of expertise in design rules and best practices

Frequent changes in component interoperability

New product introductions

Making even slight changes to an enterprise SAN can be a challenging task. Capacity planning should include data gathering of the current environment, performance monitoring, and speculation on how much the environment may change or grow over a given time period. EMC offers tools, such as SAN Advisor, that aid in the design, modeling, and qualification phases.

Challenges to SAN design:

• Environments grow rapidly to support the needs of the business.

• As components change and code expires or is upgraded, building a robust SAN becomes a larger and more difficult challenge for a single person or entity to manage.

Page 8: SAN Best Practices

Capacity Planning

Capacity planning is a combination of record management, performance analysis and planning.

There are online tools available to the EMC community for qualifying and planning for future and existing SAN environments.

EMC SAN Advisor aids directly in:
– Qualifying environments
– Scenario creation
– Modeling prospective environments and changes

Tools are used to simplify the information-gathering tasks required to perform capacity planning.

EMC ControlCenter SAN Advisor imports existing SAN configurations into a database. The existing configuration and proposed changes to the configuration are continually cross-checked against SAN Advisor's validation engine. The validation engine includes EMC’s E-Lab interoperability support matrix, and over 100 rules/best practices for availability and array configuration. By simulating SAN changes and understanding their potential impacts prior to putting them into production, SAN Advisor allows users to avoid costly configuration errors and application downtime.
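To make the idea of a validation engine concrete, here is a minimal sketch of cross-checking a design against a support matrix and one availability rule. It is an illustration only, not SAN Advisor's engine: the `SanPath` model, the matrix contents, and both rules are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SanPath:
    """One host-to-array path in a design (hypothetical model)."""
    host: str
    hba: str
    switch: str
    array: str


def rule_supported(paths, matrix):
    """Flag paths whose (hba, switch, array) combination is not in the matrix."""
    return [
        f"{p.host}: {p.hba}/{p.switch}/{p.array} is not a qualified combination"
        for p in paths
        if (p.hba, p.switch, p.array) not in matrix
    ]


def rule_redundant_switches(paths):
    """Availability rule (assumed): each host reaches storage via two or more switches."""
    switches_by_host = {}
    for p in paths:
        switches_by_host.setdefault(p.host, set()).add(p.switch)
    return [
        f"{host}: only {len(switches)} switch path(s), expected at least 2"
        for host, switches in sorted(switches_by_host.items())
        if len(switches) < 2
    ]


def validate(paths, matrix):
    """Run every rule and return the combined list of findings."""
    return rule_supported(paths, matrix) + rule_redundant_switches(paths)
```

Because the rules are plain functions over the same design data, the same checks can be run against both the imported configuration and a proposed change before anything reaches production.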

Page 9: SAN Best Practices

SAN Advisor - Architecture

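[Figure: SAN Advisor architecture. SAN designers and IT administrators work through the SAN Advisor interface, which takes input from the existing SAN environment and exports to Excel or Visio. The SAN Advisor validation engine uses the ESM, Rules, and Product Info databases, which are auto-updated monthly from eLab over the Internet.]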

SAN Advisor is installed on a single server on the network. Administrative tasks are performed by an administrator logged into the SAN Advisor server. The SAN Advisor GUI can be accessed via a web browser from any machine with IP connectivity to the SAN Advisor server.

SAN Advisor validates SAN information against:
• ESM: The EMC Support Matrix database. This database contains all of the eLab published support statements.
• Rules: EMC and industry best practice rules that are used to check SAN design.
• Product Info: A meta-base that lists information about the various products SAN Advisor permits in a SAN design. This includes information on arrays and switches not found in the EMC Support Matrix.

These databases are updated on a monthly basis by the eLab and the SAN Advisor team. The updates are then made available on the SAN Advisor AutoUpdate server. SAN Advisor can be configured to automatically download and install these updates when they are available, or it can be configured to notify administrators via email when an update is available.

Page 10: SAN Best Practices

Problems SAN Advisor Solves

Accelerate SAN design and change management

Validate proposed SAN designs against the latest:
– EMC Support Matrix
– SAN design best practice rules
– Product information

Save SAN designs for use in collaborative environments

Document SAN designs using standard Office formats:
– Excel
– Visio

Before making any changes to an existing SAN, it is important to ensure that the SAN is optimally configured for interoperability and availability. Through integration with ControlCenter, SAN Advisor automatically uploads detailed information about an entire SAN environment, reducing the potential for human error and eliminating tedious manual data entry by IT staff.

After the existing SAN configuration is cross-checked against EMC best practices and the interoperability matrix, the baseline is established. A copy of the baseline can be made and used to model potential changes in a safe environment. This critical simulation step can catch incompatibilities. Ultimately, SAN Advisor recommends SAN design options, which can be followed or ignored, and helps arrive at best-choice solutions, all without interrupting production systems. SAN designers then have the option to export designs to Visio or Excel spreadsheets to share with upper management.

Once a SAN design option is selected, a report can be generated that compares it to the existing SAN baseline. The differences between the two form the action plan. The implementation team uses this action plan to assess the number of man hours required and to document the change in any enterprise change management software in use by the organization.

SAN Advisor manages multi-vendor SANs throughout their lifecycle, continually ensuring they remain optimized to meet required service levels. It supports multi-vendor storage area networks, including switches and directors from vendors such as Brocade, Cisco, and McDATA, and multi-vendor storage arrays, including EMC CLARiiON and Symmetrix, HP, HDS, IBM, Logic E-Series, and SUN StorEdge.
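The step of comparing a selected design with the baseline to produce an action plan can be sketched as a dictionary diff. This is an assumption-laden illustration, not SAN Advisor's report format: the `action_plan` function and the component/settings layout are hypothetical.

```python
def action_plan(baseline, proposed):
    """Compare two designs and list the changes needed to reach the proposal.

    baseline, proposed: dicts mapping a component name to a dict of its
    settings (hypothetical layout). Returns (action, component) tuples,
    where action is "add", "remove", or "change".
    """
    plan = []
    for name in sorted(proposed.keys() - baseline.keys()):
        plan.append(("add", name))
    for name in sorted(baseline.keys() - proposed.keys()):
        plan.append(("remove", name))
    for name in sorted(baseline.keys() & proposed.keys()):
        if baseline[name] != proposed[name]:
            plan.append(("change", name))
    return plan


if __name__ == "__main__":
    baseline = {"switch1": {"firmware": "5.0"}, "host1": {"hbas": 2}}
    proposed = {"switch1": {"firmware": "5.1"}, "host1": {"hbas": 2}, "host2": {"hbas": 2}}
    for action, component in action_plan(baseline, proposed):
        print(action, component)
```

The length of the resulting list gives a first estimate of the work involved, and each entry maps naturally to a line item in a change management record.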

Page 11: SAN Best Practices

SAN Components that Should Be Regularly Backed Up

Management Data
– Policies
– Procedures
– Alert logs

Storage Data
– Array configurations
– Array log data
– Access configuration
– LUN masking information
– Records of host port to volume configurations

Switch Data
– Export all zoning and fabric configuration details
– Records of port configuration
– Regular backups of all switch logs

Host Data
– Gold copies of known-good configurations
– HBA configuration files
– Local system files
– Map files of device listings
– Map files of LVM configurations

Whenever changes are made to a switch, or a configuration change is introduced to a fabric, backups of the currently working configuration should be taken. Point-in-time recovery is essential whenever many hosts or environments might be changed by a single change in a SAN configuration.

Switches: Keep backups of the switch logs – these logs may be important to have on record for use as part of root cause analysis (RCA).

Important: Whenever new code loads are made to the switches, full backups should be taken.

Hosts: Back up a gold copy of the configuration files.

Array configurations: Each time a modification is made to the array configuration, a backup should be taken. Backups are needed as an insurance policy against loss of data, which can occur because of:

• Hardware failures
• Human error
• Application failures
• Security breaches, such as hackers or viruses
• Catastrophic events (hurricanes, earthquakes, etc.)
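Taking a point-in-time copy whenever a configuration changes can be automated with very little code. The sketch below assumes the configuration has already been exported to a file by the switch, array, or host tooling; the `backup_config` function, the naming scheme, and the paths are hypothetical.

```python
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path


def backup_config(source, backup_dir, now=None):
    """Copy an exported configuration file to a timestamped backup.

    now: optional datetime, assumed to be in UTC; defaults to the current time.
    Returns (backup_path, sha256_hex) so the copy can be verified later.
    """
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y%m%dT%H%M%SZ")
    target = backup_dir / f"{source.name}.{stamp}"
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    digest = hashlib.sha256(target.read_bytes()).hexdigest()
    return target, digest
```

Keeping the checksum alongside the backup makes it possible to confirm, during root cause analysis or recovery, that a gold copy has not been altered since it was taken.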

Page 12: SAN Best Practices

Lesson 1: Summary

Key points covered in this lesson:

What should be regularly backed up

Documentation - Best practices

Tracking and managing assets

Using storage more efficiently

Capacity Planning

SAN Design and Modeling

Tools - SAN Advisor

These are the key points covered in Lesson 1.

Page 13: SAN Best Practices

Lesson 2: Performance Management

What is Performance Management?

Styles of Performance Management

Performance and Capacity Management

Performance Stack

Potential Bottlenecks

Fabric Performance

Storage Performance

Host Performance

These are the topics covered in Lesson 2.

Page 14: SAN Best Practices

What is Performance Management

What is it?
– Capturing metrics
– Proactively responding to performance issues
– Planning for future growth
– Monitoring trends

Who is involved?
– Application Owners
– Administrators: Database, System, and Storage

Areas and functions
– Host, Fabric, and Storage Performance
– Building baselines for the environment
– Service Level Agreement (SLA) enforcement

In a networked environment, it is necessary to have an end-point-to-end-point view of that environment. Each component of the system which constitutes either a read or write for the application needs to be monitored and analyzed.

SAN administrators need to be involved in nearly all facets of system planning, implementation, and delivery. Databases which are not properly planned for in an array’s backend inevitably cause resource contention and poor performance.

Examples of where performance bottlenecks can reside, making diagnosis or planning difficult:
• Database layout can cause disk overload.
• Server settings impact data path utilization.
• Shifting application loads create switch bottlenecks.
• Poor SQL code causes excess I/O.

Page 15: SAN Best Practices

Networked Storage Topology - Performance Stack

[Figure: performance stack. Components shown: User Applications, Database, File System, Vol. Manager, Operating Systems, Host Devices, Device Drivers, Threads, PowerPath, HBAs, Ports, Zones, Directors/CPUs, Cache, Devices/LUNs, Disks. Callouts: one "Response time experienced here" and two "Response time measured here".]

Starting from the response time the application experiences, the two main points at which to measure I/O response time in the I/O path are between the file system and the OS, and between the switch and the host ports on the arrays. Although other I/O metrics help to illustrate the entire scenario, these two points incur most I/O contention and thus report differences in time.

The above performance stack is a good place to start collecting data or analyzing the end-point-to-end-point environment. In an environment such as the one depicted above, multiple technical resources are needed to understand the macro environment and the potential problems within it. Typical problem resolution teams include System Administrators, Database Administrators, Storage Administrators, Network Administrators, Application Programmers, System Architects, etc. While roles change from environment to environment, the usual suspects of complexity in a system do not.

Page 16: SAN Best Practices

Locations for Potential Bottlenecks

[Figure: locations for potential bottlenecks, by layer.
Application
Host: Database, File System, Volume Manager, Operating System, Host Devices, HBA Drivers
Switch: Ports, InterSwitch Links, Ports
Storage: Storage Directors, Cache, Disk Directors, Disks]

The graphic above identifies where potential bottlenecks may occur in a typical SAN-attached environment. Investigation for this type of environment starts by either drilling down from the host perspective or drilling up from the array perspective. Frequently, applications and user-facing systems drive problem resolution and data collection. From that perspective, a common data collection and view method is drilling down from the host perspective.

Page 17: SAN Best Practices

Fabric Performance

Main components to monitor in Fibre Channel switch environments
– ISL utilization
– Aggregate switch utilization
– Port utilization

Fabric components are monitored using:
– ControlCenter Alerts
– Switch native tools (e.g. Connectrix Manager or Fabric Manager)

Alerts can be created that send notifications when changes occur in the SAN environment that cause a change in state. In this way the SAN environment can be monitored proactively.

ISL utilization is commonly overlooked as servers and switches are added to the environment. However, this one link, if over-subscribed, can cause significant downtime to data for any system connected to a multiple switch fabric.

Ideally, monitor the following:
• Multi-level fabrics can have bottlenecks
− ISLs can be overused
− Ensure there are enough ISLs, and use ISL aggregation where possible
• Switches can have their own problems
− Are there switch bit errors causing I/O slowdown?
− Use switch statistics to detect HBA or array fibre problems

Page 18: SAN Best Practices

Fabric Performance - ISL Performance

Kb reads and writes per second
– Use to determine if high throughput is required
– Verify that ISLs are not at maximum I/O load

Bit error rates
– Ensure that weak links in switch, HBA, or storage port are not affecting I/O

Inter-switch links can be the source of performance bottlenecks if the load on them was incorrectly planned or underestimated. Theoretical utilization of ISLs, while important, cannot provide a real time measure of their actual performance. Actual ISL performance can be measured by measuring actual Kb reads and writes per second and also bit error rates. If a particular ISL is the cause of a performance bottleneck, additional ISLs can be added or their bandwidth aggregated to minimize or eliminate the problem.
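The measurements described above translate into simple arithmetic. The following sketch is illustrative only: it assumes a nominal Fibre Channel data rate of about 100 MB/s per Gb/s of link speed in each direction, treats 1 MB as 1000 KB, assumes load spreads evenly across equal-speed ISLs, and uses a hypothetical 70% utilization target.

```python
import math

# Assumption: nominal usable FC data rate per direction, per Gb/s of link speed.
MB_PER_S_PER_GBPS = 100.0


def isl_utilization(kb_per_s, link_gbps):
    """Fraction of one ISL's nominal bandwidth used by the measured KB/s."""
    capacity_kb_per_s = link_gbps * MB_PER_S_PER_GBPS * 1000
    return kb_per_s / capacity_kb_per_s


def bit_error_rate(bit_errors, bytes_transferred):
    """Errors per bit transferred; 0.0 when nothing has been transferred."""
    if bytes_transferred <= 0:
        return 0.0
    return bit_errors / (bytes_transferred * 8)


def isls_to_add(peak_kb_per_s, link_gbps, current_isls, target=0.7):
    """Extra equal-speed ISLs needed so the peak load stays under the target.

    peak_kb_per_s is the total peak load across all ISLs between two switches.
    """
    per_isl_budget = link_gbps * MB_PER_S_PER_GBPS * 1000 * target
    needed = math.ceil(peak_kb_per_s / per_isl_budget)
    return max(0, needed - current_isls)


if __name__ == "__main__":
    print(isl_utilization(150_000, link_gbps=2))
    print(isls_to_add(300_000, link_gbps=2, current_isls=1))
```

Feeding these functions with sampled switch counters rather than theoretical figures gives the real-time view of ISL load that planned utilization alone cannot provide.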

Page 19: SAN Best Practices

Storage Performance

Fan-Out Ratio
– Determine what is supported
– Determine best and worst case scenarios for I/O in the environment

Performance on a storage port is highly dependent on the number of I/O requests per second and the size of the I/O in the request. Use the EMC fan-out recommendations for storage ports.

As an example, if the fan-out of a storage port (Symmetrix) is 12:1 (12 HBAs to one storage port), we would allow the traffic from two storage ports to traverse from one switch to another over a single ISL. For redundancy reasons, another ISL should be added for increased availability.
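One reading of the example above can be written as arithmetic: traffic from two storage ports shares one ISL, and one further ISL is added for redundancy. The function names and default values below are hypothetical and simply encode the figures from the example; actual limits should come from the EMC fan-out recommendations.

```python
import math


def isls_for_storage_ports(storage_ports, ports_per_isl=2, redundant_isls=1):
    """ISLs needed between two switches for the given number of storage ports.

    Defaults follow the example: two storage ports per ISL, plus one
    additional ISL for availability.
    """
    if storage_ports <= 0:
        return 0
    return math.ceil(storage_ports / ports_per_isl) + redundant_isls


def max_hbas(storage_ports, fan_out=12):
    """Upper bound on HBAs served at a given fan-out ratio (12:1 in the example)."""
    return storage_ports * fan_out


if __name__ == "__main__":
    print(isls_for_storage_ports(4), max_hbas(4))
```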

Array Front-End:
• Processor performance limitations on number of I/Os
• Port performance limitations on Kb throughput
• Cache hit rate

Storage Volume Considerations/Statistics:
• Cache hit rate
• Use to determine if workload is random or sequential
• Use to measure against planned or expected values of Reads and Writes and Kb per second
• Confirm host device statistics

Page 20: SAN Best Practices

Storage Performance - Array I/O

Storage can only handle an I/O load up to its rated throughput

Causes for I/O bottlenecks
– Overloaded director - too many devices
– Too many random I/Os for a single disk
– Sharing a volume with many other volumes on the same disk

Director statistics
– Throughput and I/Os
  • Use to determine if ports are at limits

Physical disk statistics
– Throughput and I/Os
  • Use to determine if disks are at limits

A storage array port can only handle an I/O load up to its rated throughput. Hence it is important to maintain a reasonable fan-out ratio to minimize the risk of the storage port being over-utilized. At the same time, too many random I/Os for a single disk, or multiple shared volumes on a single disk, can also cause performance bottlenecks. In such cases, software applications such as Symmetrix Optimizer are typically used to identify and correct hot spots such as over-utilized spindles.

Page 21: SAN Best Practices

Storage Performance - Symmetrix

Front-end utilization

Cache utilization

Back-end utilization

The latest version of ControlCenter Workload Analyzer includes the response time metric for Symmetrix devices. Devices represent the logical work units to which I/Os are performed. Symmetrix devices are mapped, directly and indirectly, to host devices, and host applications generate I/Os to these host devices, again, directly and indirectly. The term indirectly addresses the general subject of virtualization, where an application file represents a database table, which in turn is mapped to a volume manager volume, which may be directly associated with a host operating system device. That device is then mapped to the Symmetrix device, which represents a striped set of multiple devices and ultimately resides on multiple disk drives.

The introduction of the response time metric gives the performance analyst the means to isolate the storage system's direct contribution to performance, with respect to the hosts and applications connected to that system. Response time is the best measurement of service quality provided by any system. Understanding how to determine where time is spent in the system provides the best means for problem identification, bottleneck detection, system tuning, and capacity planning.

Page 22: SAN Best Practices

Storage Performance - CLARiiON

SP utilization
– Number of hosts connected to an SP
– Use all the host ports on the SP to create multiple paths

Cache utilization
– Page size
– Read and write cache ratios
– High and low watermark
– Effect of environmental issues on performance
  • When can cache get disabled?

SP Utilization: Utilization of an SP on the CLARiiON depends on many factors. Some factors are the number of hosts connected to the SP and the number and type of LUNs bound to the SP.

Cache Utilization: The CLARiiON storage-system cache is a key element in providing good response times and throughput.

Cache Page Size: CLARiiON storage systems allow administrators to set the cache page size. It is a global setting and thus affects all LUNs. The cache page size can be 2, 4, 8, or 16 KB. The page size is dependent on the application. For example, in Oracle, it should be set to the DB_BLOCK_SIZE.

Which LUNs to Cache: All LUNs benefit from read cache, and LUNs containing non-static data benefit from write cache. For example, Oracle Redo log devices should have write and read caching enabled. The only LUNs that should be considered for disabling write cache are devices that contain data such as database redo log archive and static tables.
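The cache page size guidance can be expressed as a small helper. The allowed sizes (2, 4, 8, or 16 KB) come from the text above; the fallback behaviour when the database block size is not one of those values is an assumption made for illustration, and the function name is hypothetical.

```python
VALID_PAGE_SIZES_KB = (2, 4, 8, 16)


def cache_page_size_kb(db_block_size_bytes):
    """Pick a cache page size for a given database block size.

    Returns the matching size when the block size is 2, 4, 8, or 16 KB.
    Otherwise (assumption) returns the smallest valid size that holds a
    whole block, or the largest valid size if the block is bigger than 16 KB.
    """
    block_kb = db_block_size_bytes / 1024
    for size in VALID_PAGE_SIZES_KB:
        if size >= block_kb:
            return size
    return VALID_PAGE_SIZES_KB[-1]


if __name__ == "__main__":
    # An Oracle DB_BLOCK_SIZE of 8192 bytes maps to an 8 KB cache page.
    print(cache_page_size_kb(8192))
```

Because the page size is a global setting, the choice should reflect the dominant application on the array rather than any single LUN.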

Page 23: SAN Best Practices

Host Performance - Where Does I/O Occur

Throughput Tuning
– I/O Tuning - Max I/O size
– Volume management
– File-system or Raw device

I/O Layers
– Database
– File System
– Volume Manager
– OS Physical device drivers
– HBA

Similar to the component stack for performance management, the common I/O layers in a SAN environment deal more specifically with the SAN components than with the application or macro-level environment.

I/O Tuning: Max I/O Size: The default maximum I/O size on most systems is 64 KB to 128 KB. This is sufficient for OLTP-type applications, but even in that case, database backups and redo log archival benefit from a larger value. A practical target for I/O sizes for a CLARiiON storage system, for example, is 1 MB. DSS-type applications also benefit from a larger I/O size. The following are examples of parameters to change to increase the I/O size for a file system:

• Solaris file system settings: maxphys, set in bytes; maxcontig, set in blocks, set to the same capacity as maxphys
• AIX settings on a per-hdisk level, using chdev: max_transfer, set in bytes, and max_coalesce, set in bytes
• VERITAS VxFS: vxio:vol_maxio is set in 512-byte units; it should be set as high as 2048 (which translates to 1 MB)
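Because these parameters use different units, a quick conversion helps avoid mistakes when targeting a given I/O size. The sketch below only does the unit arithmetic; the helper names are hypothetical, and the file system block size used for maxcontig is an assumption that must be checked on the actual system.

```python
SECTOR_BYTES = 512  # vol_maxio is expressed in 512-byte units


def vol_maxio_units(target_io_bytes):
    """Convert a target maximum I/O size in bytes to 512-byte units."""
    return target_io_bytes // SECTOR_BYTES


def maxcontig_blocks(maxphys_bytes, fs_block_bytes):
    """Blocks needed for maxcontig to cover the same capacity as maxphys.

    fs_block_bytes is the file system block size (assumed known; for
    example 8192 on a file system created with 8 KB blocks).
    """
    return maxphys_bytes // fs_block_bytes


if __name__ == "__main__":
    one_mb = 1024 * 1024
    print(vol_maxio_units(one_mb))         # 2048 units of 512 bytes = 1 MB
    print(maxcontig_blocks(one_mb, 8192))  # 128 blocks of 8 KB = 1 MB
```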

File System or Raw Partition: Some databases offer the option of implementing the tables on raw partitions or file systems. Each has advantages.

Page 24: SAN Best Practices

Host Performance - HBA Utilization

Use two or more HBAs per host
– HBA to host ratio should be 2:1 or more

The number of HBAs in a host depends on the I/O characteristics of the application and the capacity of the host

Configure number of HBAs to I/O potential of a host

Use the appropriate load balancing algorithm for making use of all available channels

Make sure a healthy Fan-in Ratio is maintained
– A high fan-in ratio can cause too much “chatter” on the HBA
– SCSI errors could be the result of FA over-subscription

The utilization of an HBA can play an important role in the overall performance of a SAN. To prevent single points of failure, it is commonplace to install two or more HBAs in hosts, especially in mission-critical environments. While it is common to deploy multi-pathing software on a host, it is also important to use the load balancing algorithm that is suited to that particular environment. Additionally, it is important to maintain a healthy fan-in ratio to prevent excessive control chatter, SCSI timeouts, and errors.

Page 25: SAN Best Practices

Host Performance - Physical Device Statistics

Response time
– Use to identify overload and application bottleneck
– Composed of queue time plus service time

Reads and writes per second
– High I/O rates are often the cause of large queues

Kb reads and writes per second
– Use to determine if high throughput is required

Relationships through the I/O stack
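The relationship between these statistics can be shown with a short sketch. Response time is computed as queue time plus service time, as stated above; the `DeviceSample` layout, the 20 ms limit, and the classification labels are hypothetical and would need to be replaced with values from the environment's own baseline.

```python
from dataclasses import dataclass


@dataclass
class DeviceSample:
    """One measurement interval for a host physical device (assumed layout)."""
    name: str
    queue_ms: float
    service_ms: float
    ios_per_s: float
    kb_per_s: float

    @property
    def response_ms(self):
        # Response time is composed of queue time plus service time.
        return self.queue_ms + self.service_ms

    @property
    def avg_io_kb(self):
        # Average I/O size hints at whether throughput or I/O rate dominates.
        return self.kb_per_s / self.ios_per_s if self.ios_per_s else 0.0


def classify(sample, response_limit_ms=20.0):
    """Label a sample; the limit is a placeholder, not a recommended value."""
    if sample.response_ms <= response_limit_ms:
        return "ok"
    if sample.queue_ms > sample.service_ms:
        return "queueing"      # I/Os wait mostly in the queue: likely overload
    return "slow service"      # each I/O itself is slow: look further down the stack


if __name__ == "__main__":
    busy = DeviceSample("dev1", queue_ms=30.0, service_ms=5.0, ios_per_s=1000.0, kb_per_s=8000.0)
    print(busy.response_ms, busy.avg_io_kb, classify(busy))
```

Splitting response time into its two parts indicates where to look next: long queues point at the device being overloaded by the host, while long service times point toward the fabric or the array.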

Page 26: SAN Best Practices

Host Performance - File Systems

Small Files (Mail)
– Lots of small average read I/O
– Requires lots of disks
  • Random reads have low cache hit rate
– Journal file systems have extra write load
  • Every file open/close updates journal file (like database redo log)

Large Files (Medical Images, Video)
– Large block size
– Need disk striping
– Put load on switches and directors

File Systems: File systems may offer some benefit from OS-level caching. Rereads of blocks from the OS cache are very fast, taking some of the read cache load off of the storage system. This helps in systems configured with very large amounts of RAM (above 8 GB) that can load entire indexes and tables into file system buffers.

Advantages of file systems:
• Offer coalescing of writes, which is useful in maximizing bandwidth in sequential access operations.
• Are easier to back up and mount from a backup host.
• Are easier to manage and have more tools for analysis than raw partitions.

A file system should not be used for certain database writes, for example Oracle redo logs, unless the write caching can be bypassed. For example, on Solaris, Oracle opens the redo log file with O_DSYNC to force writes through to the physical medium.

Disadvantages of file systems are as follows:
• There is an extra layer of indirection and logic.
• File system buffering requires that a sync of the file system be done before a backup is commenced.

Advanced File Systems: Some file systems offer advanced features that improve performance. Advanced file systems, such as XFS, JFS, and VxFS are preferred over UFS file systems. These advanced file systems offer improved journaling and performance (by eliminating double buffering).

Page 27: SAN Best Practices

How are Performance Tools Helpful

Host applications and server characteristics make up the I/O environment

Analyze the environment before troubles arise

Use the right tools to extract the necessary metrics

Detailed analysis means collecting data on each step of the I/O path

Drill up to find I/O source

Drill down to check for overload

[Figure: I/O path layers for drill-up and drill-down analysis.
Host: Database Objects, Files, Logical Volumes, Host Devices
Network: Ports, InterSwitch Links, Ports
Storage: Host Directors, Cache, Disk Directors, Disks]

Start troubleshooting at the network layer and drill up and/or down to find root cause of problem.

Page 28: SAN Best Practices

Performance Tools - EMC ControlCenter

ControlCenter provides a single point-of-reference for viewing real-time data and historic performance data

Using ControlCenter, compare and contrast SAN components in the environment

Utilizing ControlCenter relationship view, it is possible to visualize and collect some data on the different components within the SAN / Storage environments.

Performance Tools – Performance Manager

Collects, correlates, and graphically presents performance data

Enables performance planning, tuning, and troubleshooting

Components monitored
– Open Systems Servers
– Mainframes
– SAN Switches and Directors
– CLARiiON and Symmetrix Storage Arrays

Provides web reporting for worldwide monitoring

[Figure: Performance Manager monitoring Servers, SAN, and Storage]

Performance Manager is a component of EMC ControlCenter. It is a performance monitoring and analysis tool.

Lesson 2 Summary

Key points covered in this lesson:

Use the performance stack to understand the big picture of I/O in a SAN environment.

Identify key components to monitor in switched SAN storage environments.

ISL connections in SAN switch environments need to be closely monitored.

The entire environment must be analyzed for end-to-end performance management.

Understand limitations and supportability of each component layer within the SAN.

These are the key points covered in Lesson 2.

Lesson 3: Secure SAN Architecture, Components, and Mechanisms

Describe Information-Centric Security

List ways to implement access control
– Physical access
– Administrative access
– Host access to storage

Describe methods to protect
– Storage infrastructure
– Data in flight
– Management infrastructure
– Replication infrastructure

The objectives for this lesson are shown here. Please take a moment to read them.

Information-centric Security

[Figure: information-centric security, with labels SAN, Infrastructure, Data, and People]

• Understand Business Risk to establish Policy and Priority
• Enforce policy through process and technology to secure
  • People
  • Data
  • Information Infrastructure
• Audit to ensure policy compliance

Process

• Understand Business Risk
• Establish Policy and Priority
• Enforce policy through process and technology.

Information-centric security policy addresses:
− People: establish and assure users’ identity
− Data: encryption and key management solutions
− Information infrastructure: security must be built in to make it seamless and transparent

You must continually audit the environment to ensure policy compliance.

Information-Centric Security

Assess the security of information to establish risk and priority

Plan and implement Information-centric security programs

Identity and Access Management Solutions: Assure identity and control information access

Secure Information Infrastructure: Deliver inherently secure infrastructure through a common, integrated security platform

Data Protection Solutions: Directly protect sensitive information and transactions

Integrated Security Information Management Solutions: Collect, correlate, and analyze security information in real time and over time

[Figure: five-step cycle: 1. Assess Risk, 2. Secure Identities and Access, 3. Secure Infrastructure, 4. Secure Data, 5. Assure Policy Compliance]

• Assess the current state of information security to understand the risks to data. Having a baseline allows the development of effective security policies. Things that need to be established are:
  • How secure information is from the start
  • Distinguishing what information is sensitive from what isn’t
  • Knowing:
    • Where sensitive information is stored
    • Who has access
    • How it is currently secured

• Secure identities and access. Identity and access management solutions establish and ensure users’ identity. They also control, at a very granular level, the information those users are authorized to access.

• Secure the infrastructure. An inherently secure infrastructure acts as the foundation on top of which security protections are directly built.

• Protect the confidentiality and integrity of data. This requires protecting not only the data itself, but also the transaction process by which this data is made available to users.

• Assure Policy Compliance.

Controlling Physical Access

Centralized Server + Storage infrastructure

Hardware located in locked location
– Only authorized users have physical access
– Only authorized personnel can make physical or logical changes to the topology
– Control access using badge control, cameras, security personnel
– Regularly audit access logs

SAN components racked in lockable cabinets

Physical Security

It is imperative to maintain a secure location and network infrastructure, for if either of these is compromised then the SAN portion of the infrastructure is also quickly compromised. The following steps can help ensure that physical security is not compromised:

Segregate equipment in a physically secure location with ID-card-controlled access points for entry. Only authorized personnel should be allowed physical access to the site. Only authorized personnel should have the ability to make physical or logical changes to the topology (e.g., move cables between ports, reconfigure access, add/remove devices to the network, etc.).

Controlling Administrative Access

Username/Password
– Named Accounts
– Strong passwords

Separation of duties
– Limit functions a given user can access based on business requirements, e.g., Network Admin, Storage Admin, Security Admin

Secure Access to management console
– Restrict IP address
– Remote access only via secure tunnel network (e.g., VPN)

Access control can be complex. Without documented policies, access control may fail to meet auditing requirements.

Most FC SAN components provide some form of username/password combination to control administrative access. Require named user accounts, as opposed to default admin accounts. The username appears in most audit logs, so having unique usernames aids in determining who is responsible for what actions in the SAN.

Different access levels can be used to separate data access from data management. Most SAN management applications provide several levels of user access. This provides a finer granularity of control over what functions a specific account can perform.

Switch management should be done from secure networks.

Authorization in this sense should be based on site-specific policies. Each site has access lists, which are records of who can read and who can write configurations. All access to systems and SAN management devices is based on user IDs and roles, which are assigned within such management systems.

Access controls should be well defined within the site and implemented with authorization-based tools. Access to the authorization-based tools themselves should also be well defined.
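The combination of named accounts, separation of duties, and audit logging described above can be sketched in Python as a small role-based check. The role names echo the slide; the user names, permission names, and the `authorize` helper are hypothetical and do not correspond to any particular SAN management product.

```python
# Hypothetical roles and permissions; real SAN management tools define their own.
ROLE_PERMISSIONS = {
    "storage_admin": {"view_config", "change_lun_masking"},
    "network_admin": {"view_config", "change_zoning"},
    "security_admin": {"view_config", "view_audit_log", "manage_accounts"},
}

# Named accounts (no shared default "admin" account), each mapped to one role.
USER_ROLES = {"jsmith": "storage_admin", "akhan": "network_admin"}

# Every attempt is recorded with the user name, so actions can be traced later.
audit_log = []


def authorize(user, action):
    """Return True if the user's role permits the action; log every attempt."""
    role = USER_ROLES.get(user)
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    audit_log.append((user, action, allowed))
    return allowed
```

Because unknown users have no role, they are denied by default, and the denial still appears in the audit log.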

Authentication, Authorization, and Auditing

Authentication
– Host to storage via IP or Fibre Channel
– Authentication for all storage transports is DH-CHAP
– Secure Authentication requires FC-SP for FC Devices

Authorization
– A management function defining access and action
– Typically, role-based authorization

Auditing
– Record and monitor port activity
– Fundamental means for intrusion detection
– Intrusion detection is done post event
– Security and integrity of logs must be protected

Storage connectivity using either IP or Fibre Channel requires the host to authenticate itself to the storage. This is connection authentication.

The required authentication for all storage transports is DH-CHAP (Diffie-Hellman Challenge Handshake Authentication Protocol). Fibre Channel storage uses FC-SP (Fibre Channel Security Protocol). Other authentication protocols worth mentioning are:

FCAP: Based on certificates. Authentication is done through bi-directional communication of certificates. Each entity in the environment has its own certificate, which is issued by a trusted certificate authority.

FCPAP: Based on passwords. Password credential material must be shared among all entities. Authentication is bi-directional.

Intrusion detection is typically done as post-processing of records and events to determine if something bad has happened. In the event that the site or system must be audited, especially for intrusion detection, the integrity of logs and data must be known or checked for authenticity. In many scenarios, intruders alter logs to hide the trail of events.
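One common way to make log tampering detectable is hash chaining, sketched below in Python. This technique is not named in the course material and is offered only as an illustration; the function names are hypothetical.

```python
import hashlib


def chain_logs(entries):
    """Return (entry, digest) pairs where each digest covers the entry and
    the previous digest, so editing one entry breaks every later link."""
    prev = b"\x00" * 32
    chained = []
    for entry in entries:
        digest = hashlib.sha256(prev + entry.encode("utf-8")).digest()
        chained.append((entry, digest))
        prev = digest
    return chained


def verify_chain(chained):
    """Recompute the chain and report whether every stored digest matches."""
    prev = b"\x00" * 32
    for entry, digest in chained:
        if hashlib.sha256(prev + entry.encode("utf-8")).digest() != digest:
            return False
        prev = digest
    return True
```

The chain only helps if the most recent digest is kept somewhere the intruder cannot rewrite, for example on a separate log host.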

The management infrastructure needs to record port activity. This record is used to determine whether traffic destined for the port meets security policies.

Auditing means active management of the SAN to determine whether any anomalies in traffic patterns are present. When anomalies are discovered, management action must be taken to verify the change in activity.

Authentication

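[Figure: two DH-CHAP scenarios in which Switch B authenticates Switch A. In the first, authentication takes place directly between Switch A and Switch B. In the second, Switch B acts as a RADIUS client of a RADIUS server.]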

DH-CHAP is a challenge-response authentication protocol based on shared secrets. Systems that use DH-CHAP for authentication have access to a secret that is shared between the two systems. Authentication is performed during the fabric login phase and enforces fabric and device access through an authentication method. DH-CHAP does not include encryption and is open to violation if one of the systems participating in the shared secret is compromised.

RADIUS (Remote Authentication Dial-In User Service) is a distributed client/server system that secures networks against unauthorized access. RADIUS servers are responsible for receiving user connection requests, authenticating the user, and then returning all configuration information necessary for the client to deliver service to the user. A RADIUS server can act as a proxy client to other RADIUS servers or other kinds of authentication servers.
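The challenge-response idea can be illustrated with a simplified, CHAP-style sketch in Python. It deliberately omits the Diffie-Hellman exchange and the FC-SP message formats that real DH-CHAP uses, and the function names and the choice of SHA-256 are assumptions for this example.

```python
import hashlib
import hmac
import os


def make_challenge():
    """Authenticator (Switch B) generates a random, single-use challenge."""
    return os.urandom(16)


def compute_response(secret, challenge):
    """Claimant (Switch A) hashes the shared secret with the challenge;
    the secret itself never crosses the link."""
    return hashlib.sha256(secret + challenge).digest()


def verify(secret, challenge, response):
    """Authenticator recomputes the expected response and compares the two
    values in constant time."""
    expected = compute_response(secret, challenge)
    return hmac.compare_digest(expected, response)
```

A fresh challenge for every login keeps a captured response from being replayed, but anyone who obtains the shared secret can still answer correctly, which is the weakness noted above.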

Controlling Host Access to Storage - Zoning

Default Zone disabled

Zoning Enforced
– Soft (WWN) Zoning
– Hard (Port) Zoning

Port Binding
– Associates physical port ID and host WWN

Zoning restricts which components in the SAN can communicate with one another by grouping sets of nodes into zones, so access to data is limited to a certain set of nodes. This restriction is software only and is NOT a security feature; zoning is designed to logically divide a fabric. For security in the fabric, Port Binding must be used.

Default zoning states that all devices can see each other. The default zone must be disabled.

Soft Zoning is based on the node’s WWN regardless of the physical switch port to which the node is attached. Soft Zoning is the zoning method recommended by EMC. Soft zoning is flexible and easy to manage. Nodes can be moved to different switch ports without changing the node’s zone membership. The security disadvantage to soft zoning is that a malicious node could spoof a valid zone WWN, and access data in that zone.

Hard Zoning is enforced using physical switch port numbers for zone membership. The security disadvantage of hard zoning is that anyone with physical access to the switch can move their device into your port and access data in that zone.

Port Binding can be used in conjunction with zoning to add security by associating a physical port ID with a host’s WWN.
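The way port binding closes the WWN-spoofing gap in soft zoning can be sketched in Python as follows. The zone name, WWNs, port numbers, and helper functions are all made up for illustration; real switches enforce these checks in firmware.

```python
# Hypothetical fabric state; the WWNs and port numbers are made up.
ZONES = {
    "oracle_zone": {"10:00:00:00:c9:aa:aa:aa", "50:06:04:8a:bb:bb:bb:bb"},
}
PORT_BINDINGS = {3: "10:00:00:00:c9:aa:aa:aa"}  # switch port -> allowed WWN


def same_zone(wwn_a, wwn_b):
    """Soft (WWN) zoning: two nodes may talk only if some zone contains both."""
    return any(wwn_a in members and wwn_b in members for members in ZONES.values())


def port_binding_ok(port, wwn):
    """Port binding: the WWN presented on a port must match the bound WWN."""
    return PORT_BINDINGS.get(port) == wwn


def may_communicate(port, initiator_wwn, target_wwn):
    """Require both the port binding check and zone membership."""
    return port_binding_ok(port, initiator_wwn) and same_zone(initiator_wwn, target_wwn)
```

A node that spoofs the zoned WWN from any port other than port 3 passes the zoning check but fails the binding check.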

Controlling Host Access to Storage – LUN Masking

LUN Masking

Source ID (S_ID) lockdown
– Associates S_ID and the host’s WWN

LUN Masking controls whether a host can see a LUN. LUN Masking is done by WWN. The security disadvantage of LUN Masking is that a malicious node could spoof a valid WWN and access data.

For added security, LUN masking can be used in conjunction with S_ID lockdown. S_ID lockdown records the FCID of the HBA together with the WWN of the HBA in the VCMDB. This protects the LUN from being accessed by an initiator using a spoofed WWN.
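A minimal Python sketch of the combined check follows. The masking table stands in for the VCMDB only conceptually; its layout, the WWN, the FCIDs, and the `can_access` helper are assumptions for this example and do not reflect the actual VCMDB format.

```python
# Hypothetical masking table; WWNs, FCIDs, and LUN numbers are made up.
MASKING = {
    "10:00:00:00:c9:aa:aa:aa": {"luns": {"001", "002"}, "locked_fcid": "0x010300"},
}


def can_access(wwn, fcid, lun):
    """Allow access only if the WWN is masked to the LUN and, when a
    lockdown entry exists, the request arrives from the recorded FCID."""
    entry = MASKING.get(wwn)
    if entry is None or lun not in entry["luns"]:
        return False
    locked = entry.get("locked_fcid")
    return locked is None or locked == fcid
```

An initiator that spoofs the WWN but logs in from a different fabric address is rejected because its FCID does not match the locked value.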

In flight Encryption

FC Authentication
– Implement FC-SP (Fibre Channel Security Protocol)
– Local
– Centralized
  RADIUS
  TACACS+

FC-SP protects data in flight by encrypting the data between host and storage.

All switches must be secure for the fabric to be secure. If one switch is connected without the same security implementation, the entire fabric is vulnerable. Switches connected to the IP network for management should use secure methods, such as SSL and SSH to allow out-of-band connections. With both of these protocols, strong key encryption should be enforced and expiration of both keys and accounts should be considered.

G_Ports and U_Ports auto negotiate based on what is connected to them. For added security the user can lock a port into a specific mode such as F_Ports for hosts.

Storage-to-Storage Security

Storage-to-Storage connections can use FCIP and/or iFCP to connect geographically dispersed sites

Protect the storage data transported between sites
– IPSec can be used to encrypt data in flight
– Encrypt data at rest before sending data to the remote site
– Virtual Private Network (VPN) Gateways can be used to encrypt data

For implementation using DWDM, the IPSec and VPN technologies are acceptable encryption/security methods.

IPSec is a set of protocols for securing IP communications by authenticating or encrypting each IP packet in a data stream.

Host Security

[Figure: hosts (Oracle Server, Email Server, Web Server) connected through the fabric to array volumes 001 through 008. Security controls shown: Zoning, LUN Masking, Device Mapping, Persistent Binding, Set-UID binaries, User privileges, OS Security.]

Zoning & LUN Masking: Anyone with access to the SAN Management tools has the ability to change zoning & device mapping.

Persistent Binding: The association of the operating system's controller-target value and the WWPN of the Symmetrix system is called binding. Persistent binding is when this bind is persistent from boot to boot and across hardware changes and failures in the fabric. Persistent binding is usually performed in the HBA driver configuration file on the host (e.g., lpfc.conf for Emulex).

Set-UID command: Several vendors provide a Set-UID command to make it easier to manage a SAN. However, it can become a liability in a distributed environment or an environment where several end users log in to the host. As a rule of thumb, the setuid command should have its set-UID bit removed or, if it is absolutely necessary to keep it, make sure system accounting and auditing are enabled.
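As an aid to auditing, the following Python sketch lists regular files that carry the set-UID permission bit, which is a different bit from the sticky bit. The function names are hypothetical, and the directory to scan is left to the administrator.

```python
import os
import stat


def is_setuid(mode):
    """True if the set-UID bit is set in a file mode value."""
    return bool(mode & stat.S_ISUID)


def find_setuid_files(root):
    """Walk a directory tree and return regular files with the set-UID bit set."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and is_setuid(st.st_mode):
                found.append(path)
    return found
```

Running `find_setuid_files` against the directory where a vendor tool is installed gives a list to review; removing the bit itself is typically done with `chmod u-s` on the file.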

Management server User Privileges: SAN Managers should have their own roles on the server, and management applications should only be available to users in that role. As a general rule, these applications should not be run while logged in as root or administrator. It also makes sense to protect this server on a restricted network so that unauthorized hosts cannot talk to it using rogue tools.

OS Security: A poorly secured host on the network puts the SAN at risk of being compromised. Secure hosts connected to a SAN behind a firewall to prevent unauthorized network access. Restrict root/administrator access to these machines.

Lesson 3: Summary

Key points covered in this lesson:

Definition for SAN Security

Threat models

Fabric, host, network, and storage security

Best practices

Host-to-Storage Security

Secure Fabric

Storage-to-Storage Security

Integration into individual environments

These are the key points covered in Lesson 3.

Module Summary

Key points covered in this module:

SAN Security

Configuration Management

Performance Management

These are the key points covered in this module. Please take a moment to review them.
