
Page 1: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Technical University/Symposia materials may not be reproduced in whole or in part without the prior written permission of IBM.

sPE0330

Best Practices for Performance,

Design and Troubleshooting IBM

Storage Connected to IBM Power

Systems

Chuck Laing

Senior Technical Staff Member

IBM GTS Service Organization

Page 2: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Agenda – The Top Ten Things SAs Should Know About Storage

Storage Overview - what's inside?

1. Know the Physical makeup

2. Know the Virtual makeup (good throughput design tips)

3. What is a Storage Pool - where do I place data?

4. What should I be aware of / what should I avoid? (Tips, Pitfalls and Tuning)

• To Stripe or not to Stripe, that is the question!

5. Zoning configuration and dual connectivity

• Checking that multipathing is working on the host

6. Documentation - why it matters

7. Topology Diagrams

8. Disk Mapping (view at a glance)

9. Easy Storage Inquiry Tools

10. How to Improve Performance

• Bottlenecks

Page 3: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Throughput and Performance – Key Optimization Factors

• Throughput
  – Spreading and balancing IO across hardware resources
    • Controllers
    • Ports and zoning connections
    • PCI cards
    • CPUs, RAM
    • Disk spindles
  – Compression
  – Thin Provisioning
  – Easy Tier - SSD
  – Etc.

• IO Performance Tuning
  – Using utilities, functions and features to tweak (back end and front end)
    • Queue depths
    • HBA transfer rates
      – FC adapters
    • LVM striping vs. spreading
    • Data placement
      – Random versus sequential
      – Spreading versus isolation
      – Application characteristics

Configuring throughput optimally increases potential performance scalability

Page 4: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

IBM Spectrum Storage Solutions

Family of storage management and optimization software – Control, Virtualize, Accelerate, Scale, Protect, Archive – for FlashSystem and any storage in a private, public or hybrid cloud.

What you may remember … (based on technology from):

• IBM Spectrum Control – Tivoli Storage Productivity Center (TPC) and the management layer of Virtual Storage Center (VSC)
• IBM Spectrum Protect – Tivoli Storage Manager (TSM)
• IBM Spectrum Archive – Linear Tape File System (LTFS)
• IBM Spectrum Virtualize – SAN Volume Controller (SVC)
• IBM Spectrum Accelerate – software from the XIV System
• IBM Spectrum Scale – Elastic Storage (GPFS)

Page 5: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Build with a Strong Foundation

Spectrum Accelerate (XIV Gen 3):
• Grid building block – Data Module (1-15): CPU, memory (360 GB/720 GB), 12 disk drives (1, 2, 3, 4 or 6 TB), optional SSD cache (360/720)
• External connect – Interface/Data Modules (4-9): 24 FC ports (8 Gb); iSCSI ports – 22 (1 GbE) or 12 (10 GbE) on the Model 214
• Internal interconnect – 2 InfiniBand switches, 3 UPSs (Gen 3)

DS8870:
• Model 961 base and Model 96E expansion; a 961 supports up to 3 x 96E expansion frames (maximum of four frames)
• 2.5" small-form-factor drives; 3.5" nearline; 1U High Performance Flash Enclosures
• 6 Gb/s SAS (SAS-2)
• Maximum of 1,536 drives plus 240 Flash cards
• Top-of-rack exit option for power and cabling; each frame has its own redundant set of power cords
• Two POWER7+ servers with 4.228 GHz processors; 2, 4, 8 and 16 core processor options per storage controller
• Dual active/active controllers; up to 1 TB of processor memory
• Host adapters: up to 64 x 16 Gb/s or 128 x 8 Gb/s ports, or a combination of 8/16 Gb/s ports; each port supports FCP and FICON at the port level
• The base frame and first expansion frame allow 16 adapters. For 8 Gb/s, both 4-port and 8-port host adapter cards are available; for 16 Gb/s, a 4-port host adapter card is available
• The all-Flash configuration has all host adapters in the base frame
• Efficient front-to-back cooling (cold aisle/hot aisle)

FlashSystem V7000 / V9000:
• Great for multiple mixed workloads that drive huge I/O
• Scale out for more all-flash capacity, IOPS and bandwidth
• Up to 2.5M IOPS at 200 µs (0.2 ms)
• Up to 228 TB usable, 1.1 PB effective

Page 6: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

IBM Comprehensive Software Defined Storage Capabilities - IBM Spectrum Storage Solutions – BP is to virtualize data

Serves both traditional applications and new-generation applications, with the flexibility to use IBM and non-IBM servers and storage or cloud services.

• Storage and Data Control: Spectrum Control and Spectrum Protect – storage management, policy automation, analytics and optimization, snapshot and replication management, integration and API services, data protection
• Data Access: Spectrum Virtualize (virtualized SAN block), Spectrum Scale (global file and object), Spectrum Accelerate (hyperscale block), Spectrum Archive (data retention), self-service storage
• Underlying infrastructure: IBM Storwize, XIV, DS8000, FlashSystem and tape systems; non-IBM storage, including commodity servers and media; IBM and non-IBM clouds

Page 7: BP Perf and Troublshooting -Edge-v1.9

sPE0330 © Copyright IBM Corporation 2015

Foundation - Build a SAN Environment

Seems simple enough, right?

• Build the pools at the storage device (a minimal CLI sketch of these storage-side steps appears after this list)
  • Choose your storage type by disk characteristics, speeds and feeds
• Create volumes from those pools
  • Use Easy Tier, compression and related technology to render the best performance
• Connect hosts to the storage through the SAN fabric
  • Zone for redundancy and resiliency
  • Configure settings to Best Practices
• Configure hosts to take advantage of the storage foundation
  • Configure VIOS, LPARs, VMs, etc.
  • Distribute virtual aspects appropriately
• Map the volumes to the hosts
  • Create the file systems, LVs, LPs, PPs, PVs, VGs, etc.
• Place the applications on the configured hosts
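As a hedged illustration only, the storage-side steps above might look like this on a Spectrum Virtualize/SVC CLI (pool, volume and host names and the WWPNs are hypothetical; options vary by code level):

# Build a pool from back-end MDisks (names and extent size are examples)
mkmdiskgrp -name Pool0 -ext 256 -mdisk mdisk0:mdisk1:mdisk2:mdisk3

# Create a volume (vdisk) from the pool, owned by I/O group 0
mkvdisk -name vol_app01 -mdiskgrp Pool0 -iogrp 0 -size 100 -unit gb

# Define the host with both of its HBA WWPNs, then map the volume to it
mkhost -name host_app01 -hbawwpn 10000000C9AAAAAA:10000000C9BBBBBB
mkvdiskhostmap -host host_app01 vol_app01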

Page 8: BP Perf and Troublshooting -Edge-v1.9

sPE0330 © Copyright IBM Corporation 2015

Foundation - Slow Performance or Outage Occurs

Now what?

• You followed the recipe
• You took advantage of all the technology, features and functions by:
  • Minimizing and automatically migrating volume IO hotspots – using Easy Tier in pools
  • Dual connecting all ports from the storage to the hosts
  • Using good, redundant, performing storage foundation building blocks
• What happened?
  • The cookies came out of the oven with clumps of salt, baking soda and brown sugar in spots.

Page 9: BP Perf and Troublshooting -Edge-v1.9

sPE0330 © Copyright IBM Corporation 2015

Foundation - What Causes Performance Degradation and Outages?

• The 3 most common root causes are:
  • Configuration changes
  • Hardware component failure
  • IO load shift or increase
• You should consider designing the environment to withstand load shift in the event of half the environment failing, such as:
  • Controller outages
  • Fabric outages
  • Server outages
• You should design configurations to known Best Practices
  • …Just because you can do something, should you?

Page 10: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Foundation - Top 10 Most Common Logical Issues Found Globally / Trending

1. Incorrect Zoning Practices and Connections

• Oversubscribed SVC to Host Systems ports (Too many logical paths per vdisk)

• Oversubscribed Storage Controller ports to SVC Nodes (Too many connections)

• Single Points of Failure (SPoF) (Undersubscribed - not enough logical paths per vdisk)

2. Unsupported or down-level host multipathing drivers (SDD or non SDD drivers)

3. Incorrect load balancing

• Improper front-end load balancing (SVC preferred Node vdisk to host)

• Misconfigured Back-end Storage Controller balancing

4. Incorrect Volume Pool configurations

• Ineffective SVC Cache utilization (too many MDGs with too few disk spindles)

• Data placement - Improper application sharing versus isolation

• Improper Tiering decisions

5. Lack of documentation for proper management and troubleshooting

6. Down level or problematic microcode

7. Fabric port topology bottlenecks (incorrect physical topology)

8. Incorrect physical Fibre cabling practices (cracked glass)

9. Insufficient cable labeling practices

10. Suboptimal cooling

10

Page 11: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

BP - Attaching Servers

Foundation – Zoning / Mapping Volumes from Pools – A Deeper Dive

1. Incorrect zoning practices and connections - can cause fabric congestion
   • Oversubscribed SVC to Host System ports (too many logical paths per vdisk)
   • Oversubscribed Storage Controller ports to SVC nodes (too many connections)
   • Single Points of Failure (SPoF) (not enough logical paths per vdisk)

2. Unsupported or down-level host multipathing drivers (SDD or non-SDD drivers) – can cause "sick but not dead" environments

Page 12: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

New Storage Zoning Schema per I/O group – 12-port node
Evolution and Types of Zones – non-cluster type

Making 1 zone per node per fabric, with the same 8 ports from a single back-end storage unit, will ensure the maximum login count of 16 is not exceeded.

[Diagram: Spectrum Virtualize DH8 – 12 FC ports per node. I/O Group 0 with Node 1 and Node 2, each with physical ports 1-4 in slot 1, ports 5-8 in slot 2 and ports 9-12 in slot 5 (logical port/WWPN number embedded per port). Ports are split between Production SAN Fabric C and Production SAN Fabric D and grouped into STG Zone-1 through STG Zone-4, with ports dedicated to Host/STG traffic, replication, and node-to-node traffic.]

Page 13: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Back-end Storage to Spectrum Virtualize Zones
Storage Zone Type – How many storage zones?

[Diagram: a back-end storage unit with HBA1 (ports P1, P2) and HBA2 (ports P1, P2) connected to SAN Fabric 1 and SAN Fabric 2, with candidate storage zones STG Zone-1 through STG Zone-8.]

Page 14: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Looking at Power System Zoning between Spectrum Virtualize and Standalone Power Systems

• Is this BP - redundant pathing for 2 HBA ports?
• What could be wrong?

[Diagram: a host with HBA ports B1 and A1 cabled to Fabric1 Core1 and Fabric2 Core1.]

Page 15: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015 15

Looking at Power System Zoning between Spectrum Virtualize and Standalone Power Systems – Correct

4 paths per vdisk

[Diagram: the corrected layout – host HBA port B1 zoned through Fabric1 Core1 and port A1 through Fabric2 Core1, giving 4 paths per vdisk.]

Page 16: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Foundation - Zoning Multi-HBA Hosts for Resiliency

• Sys Admins – provide the PCI slot to port WWPN identity to the Storage Admins
• Storage Admins – define the SVC host definitions to match
  – Avoid single points of hardware failure at the host HBA, fabric and SVC
  – Make four zones, one for each pseudo host per fabric (Red, Blue, Orange and Green); a hedged Brocade CLI sketch follows below

[Diagram: a physical host with HBA1 (ports P1, P2) and HBA2 (ports P1, P2) attached to SAN Fabric 1 and SAN Fabric 2; the ports are split across two SVC-defined pseudo hosts (Pseudo Host1 and Pseudo Host2), each combining ports from both HBAs.]
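As an illustrative sketch only (alias and zone names are hypothetical, WWPNs are placeholders), the four per-fabric zones above might be built on a Brocade fabric like this:

# On Fabric 1 - create aliases for the host HBA port and the SVC node ports (example WWPNs)
alicreate "host01_hba1_p1", "10:00:00:00:c9:aa:aa:01"
alicreate "svc_iogrp0_n1p1", "50:05:07:68:01:10:00:01"
alicreate "svc_iogrp0_n2p1", "50:05:07:68:01:10:00:02"

# One zone per pseudo host per fabric (Red zone shown; repeat for Blue, Orange and Green)
zonecreate "z_red_host01_svc", "host01_hba1_p1; svc_iogrp0_n1p1; svc_iogrp0_n2p1"

# Add the zone to the active configuration, save and enable it
cfgadd "cfg_fabric1", "z_red_host01_svc"
cfgsave
cfgenable "cfg_fabric1"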

Page 17: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

LPM Could Go to Frame2 or Frame3

Both active and inactive ports will be active during the LPM. Upon LPM completion the previously active ports will show inactive and the previously inactive ports will show active.

Map the same vdisks to the inactive LPAR in the same fashion as the active LPAR.

[Diagram: SVC-attached SAN with Frame1, Frame2 and Frame3 hypervisors. Frame1 hosts VIO Server1 (FCA1 P1/P2, FCA2 P3/P4) and VIO Server2 (FCA3 P5/P6, FCA4 P7/P8) serving active client logical partitions LPAR1 and LPAR2 through virtual FC adapters with inactive and active vWWPN pairs (VP1.1a/VP1.1i … VP8.2a/VP8.2i) on the Pseudo1/Pseudo2 paths; the inactive pseudo LPAR1b on the LPM target frame holds the inactive vWWPNs.]

Page 18: BP Perf and Troublshooting -Edge-v1.9

LPM Could Go to Frame2 or Frame3

During LPM the number of paths doubles from 4 to 8. Starting with 8 paths per vdisk would render an unsupported 16 paths during this time, which could lead to IO interruption.

[Diagram: the same dual-VIOS, multi-frame layout as the previous slide, showing the active client logical partition (Pseudo LPAR1b) during LPM with both the active and the inactive vWWPN pairs logged in.]

Page 19: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Dual VIOS to Multiple LPARs
Is it resilient? – One VIOS failure

[Diagram: Frame1 hypervisor with VIO Server1 and VIO Server2 attached to Spectrum Virtualize through the SAN; VIO Server1 is marked failed (x), and the active client logical partitions (LPAR1, LPAR2) continue to run on the virtual FC paths served by VIO Server2.]

Page 20: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Dual VIOS to Multiple LPARs
Is it resilient? – One SAN fabric failure

[Diagram: the same dual-VIOS layout with one SAN fabric marked failed (x); the active client logical partitions continue to run on the virtual FC paths through the surviving fabric.]

Page 21: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Types of Zones

Host ESX to SVC Zones

2 + 2 = 4 paths per LUN

Page 22: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Connectivity - Host to SVC Zoning Best Practices

• Good communication between the SA and the Storage Admin can uncover issues quickly
  – Correct datapathing has 3 factors (a hedged host-side check follows below):
    • Proper zoning
    • Proper SVC host definitions (SVC logical config of the host definition)
    • Proper redundancy for the SVC preferred / non-preferred pathing

Correct (4 paths per vdisk):

DEV#: 5 DEVICE NAME: hdisk5 TYPE: 2145 ALGORITHM: Load Balance
SERIAL: 600507680181059C4000000000000007
==============================================================
Path# Adapter/Path Name State Mode Select Errors
0 fscsi0/path0 OPEN NORMAL 1996022 0
1* fscsi0/path1 OPEN NORMAL 29 0
2 fscsi2/path2 OPEN NORMAL 1902495 0
3* fscsi2/path3 OPEN NORMAL 29 0

Incorrect (16 paths per vdisk):

DEV#: 3 DEVICE NAME: hdisk3 TYPE: 2145 ALGORITHM: Load Balance
SERIAL: 600507680181059BA000000000000005
============================================================
Path# Adapter/Path Name State Mode Select Errors
0 fscsi0/path0 OPEN NORMAL 558254 0
1* fscsi0/path1 OPEN NORMAL 197 0
2* fscsi0/path2 OPEN NORMAL 197 0
3 fscsi0/path3 OPEN NORMAL 493559 0
4 fscsi2/path4 OPEN NORMAL 493330 0
5* fscsi2/path5 OPEN NORMAL 197 0
6* fscsi2/path6 OPEN NORMAL 197 0
7 fscsi2/path7 OPEN NORMAL 493451 0
8 fscsi5/path8 OPEN NORMAL 492225 0
9* fscsi5/path9 OPEN NORMAL 197 0
10* fscsi5/path10 OPEN NORMAL 197 0
11 fscsi5/path11 OPEN NORMAL 492660 0
12 fscsi7/path12 OPEN NORMAL 491988 0
13* fscsi7/path13 OPEN NORMAL 197 0
14* fscsi7/path14 OPEN NORMAL 197 0
15 fscsi7/path15 OPEN NORMAL 492943 0
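A quick way to check this on an AIX/VIOS host running SDDPCM, as a hedged sketch (device names are examples; output columns match the listings above):

# Per-vdisk path listing with select counts (SDDPCM)
pcmpath query device

# Or check a single hdisk with native MPIO tooling
lspath -l hdisk5 -F "name parent path_id status"

# Count paths per SVC (2145) MPIO disk - anything other than the designed number (e.g. 4) is worth a look
for d in $(lsdev -Cc disk | grep 2145 | awk '{print $1}'); do
    echo "$d: $(lspath -l "$d" | wc -l) paths"
done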

Page 23: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015 23

Host Multipath Configuration Best Practices examples

-Running a script can show current status

• For AIX VIO - driver used - sddpcm – https://w3-connections.ibm.com/wikis/home?lang=en-

us#!/wiki/Global%20Server%20Management%20Distributed%20SL/page/Script%20to%20Capture%20SAN%20Path%20Info

– Multipath installed = Yes, Set to: 4 paths/ hdisk, fcsi_settings:2145: fast_fail,

– Multipath Policy =load_balance

• For Linux/ESX/VMWare - – https://w3-connections.ibm.com/files/app#/file/1ba027ec-5d20-4c60-a281-f18f16192f7a

– Device –mapper – multipath HBA elements=4,

–For Windows – driver used = MPIO=SDDDSM – https://w3-connections.ibm.com/files/app#/file/3e52f54c-a445-4b17-aa5d-a5da43d4bedb

– Multipath installed = Yes, HBA elements = 4, MPIO Policy = Optimized

• For Solaris – driver MPxIO • Https://w3-connections.ibm.com/files/app#/file/66ea3228-4b26-48bd-a8fd-55751a02fc42

• Multipath installed = MPxIO, Path Subscription= 4, MPIO Policy = round-robin

Content input from : Bill Marshall, Jason Moras, Brad Worthen, Ramesh Palakodeti
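A hedged AIX example of checking and setting the attributes referenced above (adapter and disk names are examples; -P stages the change until the next reboot or device reconfiguration):

# Verify the FC SCSI protocol device settings
lsattr -El fscsi0 -a fc_err_recov -a dyntrk

# Recommended values for SAN-attached multipath disks
chdev -l fscsi0 -a fc_err_recov=fast_fail -a dyntrk=yes -P

# Check the path-selection algorithm and reservation policy on an SVC (2145) hdisk
lsattr -El hdisk5 -a algorithm -a reserve_policy
chdev -l hdisk5 -a algorithm=load_balance -a reserve_policy=no_reserve -P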

Page 24: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Load Balancing

24

3. Incorrect load balancing

• Improper Front-end load balancing (SVC preferred Node vdisk to host)

• Misconfigured Back-end Storage Controller balancing

A Deeper Dive

Page 25: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Examples of Correct Host to SVC Volume Balancing

• Preferred path for vdisk1 is SVC N1P2 & N1P3; non-preferred path for vdisk1 is SVC N2P2 & N2P3
• Preferred path for vdisk2 is SVC N2P2 & N2P3; non-preferred path for vdisk2 is SVC N1P2 & N1P3

[Diagram: vdisk1 through vdisk4 alternate their preferred node between Node 1 and Node 2 so the load is balanced across the I/O group.]

Page 26: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Spectrum Virtualize (# of volumes to host ports from nodes / I/O groups)

Imbalanced data I/O loads

Page 27: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Back-end Load Balancing
Which has better throughput?

[Diagram: two alternative back-end cabling layouts (1 and 2), each connecting the controller to redundant SAN fabrics.]

Page 28: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Storage Pool Configurations

28

4. Incorrect Volume Pool configurations

• Ineffective SVC Cache utilization (too many MDGs with too few disk spindles)

• Improper Tiering decisions

• Data placement - Improper application sharing versus isolation

Looking Deeper

Page 29: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Storage Pools
Easy Tier v3: Support for up to 3 Tiers

• Supports any combination of 1-3 tiers:

  Tier 0    | Tier 1 | Tier 2
  ----------+--------+-------
  Flash/SSD | ENT    | NL
  Flash/SSD | ENT    | NONE
  Flash/SSD | NL     | NONE
  NONE      | ENT    | NL
  Flash/SSD | NONE   | NONE
  NONE      | ENT    | NONE
  NONE      | NONE   | NL

Page 30: BP Perf and Troublshooting -Edge-v1.9

Storage Pools - “Example Only”

Drive Selection in an Easy Tier Environment

Page 31: BP Perf and Troublshooting -Edge-v1.9

Tiered Storage Strategy Overview
Right tiering - is there more on tier 1 than there should be?

You should consider cost versus performance: cost per gigabyte rises as you move up the performance pyramid.

Storage pyramid – typical distribution of capacity:
• Tier 0: Ultra high performance applications – 1-3%
• Tier 1: High performance; mission-critical, revenue-generating applications – 15-20%
• Tier 2: Medium performance; non-mission-critical; backup, recovery and vital data – 20-25%
• Tier 3: Low performance; archives and long-term retention (archival/tape) – 50-60%

Page 32: BP Perf and Troublshooting -Edge-v1.9

sPE0330

Storage Pools – Tiered Storage Classification
(Axis: cost / performance / availability. Tier rating is based on performance AND reliability. RAID 6 recommended on drives greater than 900 GB.)

TIER 0 – FlashSystems preferred, solid state drives alternate
• Description: Ultra high performance. Meet QoS for high end.
• Technical examples (high-level guidance; local variations on technology exist): DS8870/SVC with 400 GB SSD recommended, RAID 5 – small block recommended; FlashSystem 840/900 RAID 5 – excellent "Turbo Tier 1" when coupled with XIV Gen3 (SSD cache).
• Performance range capability: DS8870 – greater than 250,000 IOPS / 5,500+ MB/s; FlashSystem 840/900 – greater than 500,000 IOPS mixed workload (70/30).

TIER 1(a) / TIER 1(b)
• Description: High performance. Drive up utilization of high-end storage subsystems and still maintain performance QoS objectives. For low capacity requirements, smaller, less powerful devices may meet the tier definition.
• Technical examples: DS8870 with SAS 600 GB 15K disk drives in RAID 5/RAID 6 arrays (300 GB 15K only if Disk Magic shows the need; RAID 6 should be seriously considered); XIV Gen3 model 214 with 2 TB or 3 TB SAS drives (11-module or greater unit; 11.2 code version and SSDs (solid state drives) required; XIV Gen2 removed from strategy; lower cost – seriously compare to DS8; recommended for non-mainframe).
• Performance range capability: DS8870 – 200,000+ IOPS / 5,000 MB/s; V9000 – TBD; XIV 2 TB Gen3 – less than 130,000 IOPS / 3,400 MB/s (15 modules) or less than 95,000 IOPS / 3,200 MB/s (11 modules); XIV 3 TB Gen3 – less than 120,000 IOPS / 3,400 MB/s (15 modules) or less than 85,000 IOPS / 3,200 MB/s (11 modules).

Page 33: BP Perf and Troublshooting -Edge-v1.9

sPE0330

Storage Pools – Tiered Storage Classification (continued)
(Axis: cost / performance / availability. Tier rating is based on performance AND reliability. RAID 5 not recommended on drives greater than 900 GB.)

TIER 2
• Description: Medium performance. Meet QoS for the applications/data that reside here. For low capacity requirements, smaller, less powerful devices may meet the tier definition.
• Technical examples: DS8870 with SAS 600 GB 10K disk drives in RAID 5/RAID 10 arrays; V7000 (Gen2) with SAS 600 GB using RAID 5; V7000 in-rack with 450 GB SAS for PureSystems (note that the V7000 Flex version is NOT recommended).
• Performance range capability: DS8870 – less than 80,000 IOPS or less than 3,000 MB/s; V7000 Gen2 (block) – less than 75,000 IOPS or less than 2,500 MB/s.

TIER 2b
• Description: Medium performance / cloud and commodity storage.
• Technical examples: SDS block offering on-prem – QuantaStor with 4 TB 72-drive chassis running RAID 10 or RAID 5 as specified, with compression only; on-prem 4 TB RAID 5 4+1 x 2 with software RAID 0 on top; V7000 (Gen2) with 1.2 TB 10K SAS drives using RAID 6; XIV Gen3 model 214 with 4 TB SAS.
• Performance range capability: XIV – less than 100,000 IOPS or less than 2,500 MB/s; SDS block – less than 50,000 IOPS or less than 1,000 MB/s.

Page 34: BP Perf and Troublshooting -Edge-v1.9

sPE0330

Storage Pools – Tiered Storage Classification (continued)
(Axis: cost / performance / availability. Tier rating is based on performance AND reliability. RAID 5 not recommended on drives greater than 900 GB.)

TIER 3
• Description: Low performance. Meet QoS for the applications/data that reside here.
• Technical examples: DS8870 with NL-SAS technology using RAID 6 when the customer already has a DS8K with room for this storage; V7000 (Gen2) with NL-SAS using RAID 6.
• Performance range capability: DS8870 – less than 25,000 IOPS or less than 1,000 MB/s; V7000 (block) – less than 30,000 IOPS or less than 300 MB/s.

TIER 3b
• Description: Low performance. Meet QoS for the applications/data that reside here.
• Technical examples: SDS block offering on-prem – 4 TB 72-drive chassis running RAID 5, with compression.
• Performance range capability: SDS and file block – up to 35,000 IOPS and 280 MB/s when configured as specified; SDS object – performance is highly configuration dependent and measured in GETs and PUTs; less than 25,000 IOPS or less than 300 MB/s.

TIER 4
• Description: Archival, long-term retention, backup.
• Technical examples: virtual engines, tape ATLs, ProtecTIER.
• Performance range capability: N/A – tier based on features.

Page 35: BP Perf and Troublshooting -Edge-v1.9

sPE0330

Storage Pools – 'Block' Decision Tree

[Decision flow, summarized:]
• Is the application highly sensitive to IO response time?
  – Yes: see slide 2 for the SAN strategy (DS8K, XIV). If virtualization is needed, a 4- or 8-node SVC cluster is recommended.
  – No: is Tier 2 needed? If yes, see Tier 2b for V7000 options or alternates; otherwise see Tier 3 for V7000 options or alternates. If virtualization is needed, a 4- or 8-node SVC cluster is recommended; see the previous slides for the SAN strategy (XIV, V7000).
• For a new small solution with iSCSI, QuantaStor (Tier 2b/Tier 3) is an option at 99.9% and lower availability. If advanced capabilities such as site-to-site replication, local instant copy, etc. are required, QuantaStor is not recommended.
• See the previous pages for guidance on tier performance levels.

Page 36: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

How many pools are too many?
SVC Cache Partitioning

Potential Risks:
1. Too many MDGs results in ineffective utilization of SVC destage / write cache partitioning.
2. Potential performance issues result during high IO workloads.
3. Results in limited and slowed data IO performance.
4. Too few MDisks in an MDG can result in degraded disk performance, impacting throughput from the back-end controller to the SVC and ultimately slowing down application read/write access IOs per second (IOPS).

Actions to correct the error:
1. Reduce the number of MDGs through SVC MDisk consolidation to 5 or fewer per cluster where possible.
2. This is not an architectural limitation but the global standard; however, it may make sense to create more MDGs if attempting to isolate workloads to different disk spindles.
3. Make larger MDGs, with a minimum of 8 MDisks in the MDG. Some considerations are:
   i. More MDisks in the MDG is better for transactional workloads.
   ii. The number of MDisks in the MDG is the most important attribute influencing performance and SVC destage write cache dedication. See the following table. (A hedged CLI check appears below.)

Note: Increasing the performance "potential" adversely increases the impact boundary, but this cannot be avoided up to minimum performance requirements.
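A hedged way to check how many pools (MDisk groups) exist on a Spectrum Virtualize/SVC cluster and how many MDisks each contains (the pool name is an example):

# List pools with their MDisk and vdisk counts
lsmdiskgrp -delim :

# Show which MDisks belong to a given pool
lsmdisk -filtervalue mdisk_grp_name=Pool0 -delim :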

Page 37: BP Perf and Troublshooting -Edge-v1.9

37

Documentation

5. Lack of documentation for proper management and troubleshooting

6. Down Level Microcode

• Looking Deeper at Documentation

Page 38: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Documentation – How do I get the information?

Are there any automated storage inquiry tools out there that will help me understand my setup?

• Storage tools – gather information such as, but not limited to:
  • LUN layout
  • LUN to host mapping
  • Storage pool maps
  • Fabric connectivity
  • Firmware / code level
– DS8QTOOL
  • Go to the following website to download the tool:
    – http://congsa.ibm.com/~dlutz/public/ds8qtool/index.htm
– SVCQTOOL
  • Go to the following website to download the tool:
    – http://congsa.ibm.com/~dlutz/public/svcqtool/index.htm

Page 39: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Documentation – Helps Data Placement in Pools
Mapping Virtual LUNs to Physical Disks (LUN → Pool → Array)

• On the host server using SDD/SDDPCM, capture the vdisk serial (UID) for the hdisk
• Ask the Storage Admin to find the disk/device UID or RAID group in the Storage Pool
• The Storage Admin cross-references the Storage Pool UID with the controller's arrays in the pools (a hedged CLI sketch follows below)

DEV#: 81 DEVICE NAME: hdisk81 TYPE: 2145 ALGORITHM: Load Balance
SERIAL: 60050768019002F4A8000000000005C7
======================================================================
Path# Adapter/Path Name State Mode Select Errors
0 fscsi0/path0 FAILED NORMAL 89154 2
1* fscsi0/path1 FAILED NORMAL 63 0
2 fscsi1/path2 OPEN NORMAL 34014 3
3* fscsi1/path3 OPEN NORMAL 77 0
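As an illustrative sketch (the vdisk and pool names are hypothetical), the UID from the listing above can be traced back to its pool and MDisks on the SVC:

# On the AIX host - get the vdisk UID (SERIAL) for the hdisk
pcmpath query device 81 | grep SERIAL

# On the SVC - find the vdisk that owns that UID and the pool it lives in
lsvdisk -filtervalue vdisk_UID=60050768019002F4A8000000000005C7
lsvdisk vol_app01        # shows mdisk_grp_name, preferred node, etc.

# Then list the MDisks (back-end arrays) behind that pool
lsmdisk -filtervalue mdisk_grp_name=Pool0 -delim :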

Page 40: BP Perf and Troublshooting -Edge-v1.9

40

Troubleshooting and Tuning

• Best Practices for IBM Storage Performance, connected to Power

Systems

Page 41: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Troubleshooting - What are some storage bottlenecks?

• After verifying that the disk subsystem is causing a system bottleneck, a number of solutions are possible. These include the following:

1. Consider using faster disks - Flash and SSD will outperform HDD, etc.
2. Eventually change the RAID implementation if this is relevant to the server's I/O workload characteristics.
   • For example, going to RAID 10 if the activity is heavy random writes may show observable gains.
3. Add more arrays/ranks to the storage pool.
   • This will allow you to spread the data across more physical disks and thus improve performance for both reads and writes.
4. Add more RAM.
   • Adding memory will increase the system memory disk cache, which in effect improves disk response times.
5. Finally, if the previous actions do not provide the desired application performance:
   • Off-load/migrate processing to another host system in the network (either users, applications, or services).

Page 42: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Troubleshooting - Systems Administrator
How do I improve disk performance on the host?

1. Reduce the number of IOs
   • Bigger caches
     • Application, file system, disk subsystem
   • Use caches more efficiently
     • No file system logging
     • No access-time updates
2. Improve average IO service times
   • Consider changing the data layout
   • Reduce locking for IOs
   • Adjust host queue depth - buffer/queue tuning
   • Adjust HBA transfer rates
   • Use SSDs or RAM disk
   • Consider changing the random versus sequential striping or spreading
   • Faster disks/interfaces, more disks
   • Short-stroke the disks and use the outer edge
   • Smooth the IOs out over time
3. Reduce the overhead to handle IOs

Page 43: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Troubleshooting - Tips – Most Common OS IO Tuning Parameters

What are the most common/important OS I/O tuning parameters?

• Device queue depth
  – Queue depth can help or hurt performance per LUN
    • Be aware of queue depth when planning the system layout; adjust only if necessary
    • To calculate, the best thing to do is go to each device "Information Center" (URLs listed in the link slide)
    • What are the default queue depths? ___
• HBA transfer rates
  – FC adapters
• LVM striping vs. spreading
• Data placement
  – Random versus sequential
  – Spreading versus isolation

• Queue depth is central to the following fundamental performance formula:
  • IO rate = number of outstanding commands (queue depth) / response time per command
  • For example:
    – IO rate = 32 commands / 0.01 seconds (10 milliseconds) per command = 3,200 IOPS

Some real-world examples (OS = default queue depth = expected IO rate at 10 ms per command; a hedged AIX check follows below):
• AIX standalone = 16 per LUN = 1,600 IOPS per LUN
• AIX VIOS = 20 per LUN = 2,000 IOPS per LUN
• AIX VIOC = 3 per LUN = 300 IOPS per LUN
• Windows = 32 per disk = 3,200 IOPS per LUN

• Content provided by Mark Chitti
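A hedged AIX sketch for checking and adjusting the queue depth discussed above (device name and value are examples; verify the supported maximum for your storage and multipath driver first):

# Current queue depth for the LUN
lsattr -El hdisk5 -a queue_depth

# Watch service times and queue behavior before changing anything
iostat -Dl 5 3

# Stage a new value (takes effect after reboot or after the disk is reconfigured)
chdev -l hdisk5 -a queue_depth=32 -P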

Page 44: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Troubleshooting - #1 Data Placement

Placing applications on the same LUNs/pools results in IO contention.

[Diagram: an extent pool of 8 ranks. LUN1 is made of strips on the outer edge of the DDMs (the "1" strips) and could hold App A, RAID 5 7+P; LUN3 is made of strips in the middle of the DDMs (the "3" strips) and could hold App B, RAID 5 7+P.]

For existing applications, use storage and server performance monitoring tools to understand current application workload characteristics such as:
• Read/write ratio
• Random/sequential ratio
• Average transfer size (blocksize)
• Peak workload (I/Os per second for random access, and MB per second for sequential access)
• Peak workload periods (time of day, time of month)
• Copy services requirements (Point-in-Time Copy, Remote Mirroring)
• Host connection utilization and throughput (HBA host connections)
• Remote mirroring link utilization and throughput

What causes THRASHING?
Most commonly, when workloads peak at the same time, or when log files and data files share physical spindles.

Page 45: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Data Placement on Power Systems
#1 - Random IO Data Layout

[Diagram: a datavg volume group built from five PVs (hdisk1-hdisk5), each a LUN or logical disk carved from a RAID array.]

# mklv lv1 -e x hdisk1 hdisk2 … hdisk5
# mklv lv2 -e x hdisk3 hdisk1 … hdisk4
…

Use a random order for the hdisks for each LV. (A slightly fuller sketch follows below.)

Slide provided by Dan Braden

What does a random LV creation order help prevent?
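Expanding the mklv lines above into a hedged, self-contained AIX sketch (volume group, LV names and sizes are examples; -e x sets the inter-physical-volume allocation policy to maximum, so each LV is spread across the listed disks):

# Build the volume group from the five SAN LUNs
mkvg -y datavg hdisk1 hdisk2 hdisk3 hdisk4 hdisk5

# Spread each LV across all PVs, varying the disk order per LV
mklv -y lv_data1 -t jfs2 -e x datavg 64 hdisk1 hdisk2 hdisk3 hdisk4 hdisk5
mklv -y lv_data2 -t jfs2 -e x datavg 64 hdisk3 hdisk1 hdisk5 hdisk2 hdisk4

# Create and mount a file system on one of them
crfs -v jfs2 -d lv_data1 -m /data1 -A yes
mount /data1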

Page 46: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Data Placement on Power Systems
#1 - Data Layout - OS Spreading versus Striping

Is there a difference? What's the difference?
– Do you know what your volumes are made of?

File system spread

Page 47: BP Perf and Troublshooting -Edge-v1.9

sPE0330 © Copyright IBM Corporation 2015

#1 - Data Placement Performance Issues!
Look for differences in the infrastructures... please!

• Leading causes of performance degradation after migration to new pools - check for:
  • Different disk foundation technology or down-level firmware
  • Different pool sizes, with different numbers, sizes, speeds and RAID types of disk (SAS, NL, FC, SSD, Flash, SATA)
  • LUN sizes are not as important as physical disk capacity, speed and type, but a LUN that is too big can cause an I/O bottleneck in the SVC ports
  • Cache utilization - could be the number of SVC MDisk groups
  • I/O congestion in the target SAN
    • Different SAN switches and firmware levels
    • Perhaps a slow-draining device - usually caused by a hung device HBA or SFP
    • Could be a lack of buffer credits
    • Could be a dual-core switch architecture with incorrect zoning or not ISL'ed correctly
    • Could be a lack of trunking
    • Could be a fixed versus auto-negotiate port speed setting
    • Could be port fill words not set to 3 for the IBM gear
  • Server side:
    • CPU, cores, HBA transfer rates, queue depth settings, BIOS settings
    • Multipath settings - can determine LUN behavior in handling the I/O
    • Is striping or spreading for random IO configured on the server hosting the application?
  • Application side - type of application
    • Are data and log files sharing the same physical spindles? (OK on Flash/SSD, XIV, or when Easy Tier is turned on, but not on other traditional technology)
    • Application IO stacking - is it the same? Look for differences.

Page 48: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Data Placement on Power Systems
#1 - Data Layout Summary

Does data layout affect IO performance more than any tunable IO parameter?

Good data layout avoids dealing with disk hot spots
– An ongoing management issue and cost

Data layout must be planned in advance
– Changes are generally painful

iostat and filemon can show unbalanced IO (a hedged example follows below)

Best practice: evenly balance IOs across all physical disks unless TIERING

Random IO best practice:
– Spread IOs evenly across all physical disks unless dedicated resources are needed to isolate specific performance-sensitive data
  • For disk subsystems:
    • Create RAID arrays of equal size and type
    • Create VGs with one LUN from every array
    • Spread all LVs across all PVs in the VG
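A hedged AIX example of spotting unbalanced IO with the tools named above (intervals, counts and the output file are arbitrary):

# Extended per-disk statistics: compare tps, read/write service times and queue depth across hdisks
iostat -DlT 60 5

# Trace LV- and PV-level IO for a minute, then review the hottest LVs/PVs in the report
filemon -o /tmp/filemon.out -O lv,pv; sleep 60; trcstop
cat /tmp/filemon.out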

Page 49: BP Perf and Troublshooting -Edge-v1.9

sPE0330 © Copyright IBM Corporation 2015

Troubleshooting questions - if it's not tiering, data layout or the storage foundation, what else could it be?

• Collect suspect device logs and send them to the engaged support team
• Ask the device vendor whether a component has failed
  • If yes, block the component (HBA) until the component is replaced
  • If no, continue to troubleshoot
• Determine what changed in the last day / 12 hours / 6 hours
  • If the configuration changed, reverse the change
  • If there was no configuration change, continue to troubleshoot
• Is the zoning correct?
• For missing SAN paths, determine whether the issue is narrow or widespread:
  • Are paths missing on only one server? If so, it is likely to be a server HBA issue
  • Are paths missing on other servers? If so, it is likely to be a SAN issue
  • Are the paths missing on all server HBA ports or only one of the ports?
  • If paths are missing on multiple servers, are the HBA ports common to one fabric?
  • Are the missing paths common to one fabric or to both fabrics?
  • Are the missing paths common to one fabric blade?
  • Are the missing paths common to one storage port?

Page 50: BP Perf and Troublshooting -Edge-v1.9

50

Incorrect Physical Topology Configurations

7. Fabric port topology bottlenecks (incorrect physical topology)

8. Incorrect physical Fibre cabling practices (cracked glass)

9. Insufficient cable labeling practices

10. Suboptimal cooling

Physical Topology

Page 51: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Racking and Stacking: SVC Best Practice:

A Right Way Example

51

Rear View Front View

Clean and Neat

Page 52: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

What's wrong? An unserviceable fabric rack:

1. Bend radius exceeded
2. Insufficient strain relief - cable weight pulls on other cables
3. Cables loose on the floor - susceptible to pinching, getting caught in the door, being stepped on, etc.

Page 53: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

SVC rack and stack – how?

• What is the impact of this?
  – Can't service it
  – Blocked air exhaust

Page 54: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

What’s this?

• What’s Wong!!

54

Page 55: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

SVC rack and stack – how?

• What's wrong here?

It was off in a corner with no surrounding air flow.

Impact: shortened component life span from overheating.

Every time the cabinet door was opened to service it, it blew:
– A power supply
– Fans were always failing
– Application outages

Page 56: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Troubleshooting - Summary
After the initial build and configuration, what next?

• Top 5 most common things that are or go wrong:

1. Growth without checking a very simple test - IO read/write response time in milliseconds
   • Lack of data spread - additional applications placed in storage pools not designed to share the load
   • Lack of data isolation - timing of application peak times is not isolated properly, causing congestion
   • Improper configuration changes or, even worse, "no" configuration changes
   • In the data lifecycle things do change and shift over time, causing IO load imbalance and impacting the overall SAN
2. Lack of automatic monitoring, alerting and daily health checks
   • No Call Home, or it is not configured or becomes out of date - wrong contact info
   • Clocks not synchronized between equipment - makes it hard to pinpoint events
3. Down-level device microcode - slips beyond its supported life or is not updated regularly
   • Server, storage or fabric firmware get out of sync and become incompatible with changes
   • Incorrect host multipathing device driver
4. Single points of failure - hardware components fail but are not reconfigured properly:
   • Server HBA WWPNs change but zoning is not updated in the fabric - incorrect WWPN
   • RAID disks fail but are not replaced in a timely manner because IO continues on redundant hardware
   • Fabric ports fail but are not replaced in a timely manner because IO continues on redundant paths
   • Improper IO pathing - too many or not enough paths by suboptimal design
5. Suboptimal physical design, or changes over time

Page 57: BP Perf and Troublshooting -Edge-v1.9

Summary

• Knowing what's inside will help you make informed decisions
• You should make a list of the things you don't know
  – Talk to the Storage Administrator or those who do know

A better admin:
1. Understands the back-end physical makeup
2. Understands the back-end virtual makeup
3. Knows what's in a Storage Pool, for better data placement
4. Avoids the pitfalls associated with IO tuning
5. Knows where to go to get the right multipathing device drivers
6. Knows why documentation matters
7. Keeps topology diagrams
8. Keeps disk mapping documentation
9. Is able to use storage inquiry tools to find answers
10. Understands how to troubleshoot storage performance bottlenecks

Page 58: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Please fill out an evaluation for sPE0330 @ibmtechu.com

Some great prizes to be won!

Page 59: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Questions-

59

Page 60: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Extras for traditional storage

Best Practices for Performance, Design and Troubleshooting IBM Storage connected to Power Systems

Page 61: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

• #1 Top Reason - Improper configuration changes or, even worse, "no" configuration changes

Brocade best practice: Trunk on all Fibre Channel ISLs

Severity: Major
Why the error has occurred: One or more ISLs does not have Trunking enabled.
Potential Risks:
1. Running an ISL without Trunking (i.e., a single Fibre connection between 2 SAN switches) carries some risks:
   • Single point of failure, causing fabric segmentation and loss of connectivity if the only (or last) connection between the 2 switches is lost
   • Performance bottleneck - ISL Trunking is designed to significantly reduce traffic congestion in storage networks
2. If there are 2 ISLs between the switches, there are multiple scenarios why a trunk has not formed:
   • There is no Trunking license
   • The links are cabled to different ASICs at either end
   • The difference in length of the cables is too great
   • There is "noise" in one cable - could be a bad connector or patch panel, a cable bent too tightly, etc.
Actions to correct the error: Add more connections between this switch and the neighbor switch running with a single connection, and/or purchase a Trunking license.

Page 62: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

• #2 Top Reason - Lack of automatic monitoring and alerting

Cisco best practice: enable Call Home

Severity: Minor
Why the error has occurred: No SMTP server is entered for the callhome transport. A good example looks like either:

smtp server: smtp.svl.ibm.com
smtp server port: 25
smtp server priority: 0

or:

smtp server: 9.30.121.1
smtp server port: 25
smtp server priority: 0

Potential Risks: Not setting up an SMTP server impacts the ability to send notifications, risking critical errors going undetected.
Actions to correct the error: Configure an SMTP server for callhome. (A hedged NX-OS sketch follows below.)
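As a hedged sketch only (the IP address and contacts are placeholders; exact syntax varies by NX-OS release), callhome email transport on a Cisco MDS switch is configured roughly like this:

switch# configure terminal
switch(config)# callhome
switch(config-callhome)# email-contact storage-admin@example.com
switch(config-callhome)# transport email smtp-server 192.0.2.25 port 25
switch(config-callhome)# transport email from mds-switch01@example.com
switch(config-callhome)# enable
switch(config-callhome)# end
switch# callhome test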

Page 63: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

• #1 Top Reason - Improper configuration changes or, even worse, "no" configuration changes

Best Practice - Brocade buffer credits on E-Ports should be greater than 20

Severity: Warning
Why the error has occurred: The buffer credits on an E-Port (ISL) are less than 20.
Potential Risks: During peak times, having too few buffer credits on E-Ports (ISLs) will lead to loss of frames, resulting in performance issues. By default, 8 BB (buffer) credits are allocated per port. Considering the SAN topology, it is highly recommended to increase the default number of buffers to 20 or more.
Actions to correct the error: Use the command "portcfglongdistance" to allocate the additional buffer credits. (A hedged example follows below.)
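A hedged Brocade FOS example (the slot/port, distance mode and distance value are placeholders; argument order varies by FOS release, and the port is disrupted, so check the command reference before applying):

# Show the current buffer credit allocation per port
portbuffershow

# Configure the E-Port for long-distance mode LD with a 25 km desired distance,
# which makes FOS reserve additional buffer credits for the link
portcfglongdistance 2/15 LD 1 25

# Verify the port configuration afterwards
portcfgshow 2/15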

Page 64: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

• #1 Top Reason - Improper configuration changes or, even worse, "no" configuration changes

Best practice - do not have Brocade L-Ports present

Severity: Warning
Why the error has occurred: One or more ports logged in as a "Loop" port.
Potential Risks: Having a Loop port in the environment will lead to performance issues. A Loop port may appear in the SAN due to improper login/registering of a host into the SAN.
Actions to correct the error: Make sure that the host properly logged in to the SAN, and make sure that the topology on the host is set to point-to-point.

Page 65: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Best practice – sync time between all devices

• #2 Top Reason - Improper configuration changes or, even worse, "no" configuration changes

Severity: Info
Why the error has occurred: Network Time Protocol (NTP) is not configured on the fabric device.
Potential Risks: Without clock synchronization it is much more difficult to correlate logs of events across multiple devices, and unsynchronized clocks may cause problems with some protocols.
Actions to correct the error: Configure the fabric device to use an NTP server that is consistent with the other devices on the same fabric. (A hedged example follows below.)
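Hedged examples of pointing fabric devices at an NTP server (the server address is a placeholder):

# Brocade FOS - set, then verify, the clock server
tsclockserver "192.0.2.123"
tsclockserver

# Cisco MDS NX-OS
switch# configure terminal
switch(config)# ntp server 192.0.2.123
switch(config)# end
switch# show ntp peers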

Page 66: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

• #1 Top Reason - Improper configuration changes or, even worse, "no" configuration changes

Best practice - correct port speed of IFL/ISL ports

Severity: Minor
Why the error has occurred: The port speed on an ISL/IFL is not set to a fixed speed. A good example should look like this:

128 1 16 658000 id 4G Online E-Port 10:00:00:05:1e:36:05:b0 "<SWITCH_NAME>" (downstream)(Trunk master)
129 1 17 658100 id 4G Online E-Port (Trunk port, master is Slot 1 Port 16 )
130 1 18 658200 id 4G Online E-Port (Trunk port, master is Slot 1 Port 16 )
131 1 19 658300 id 4G Online E-Port (Trunk port, master is Slot 1 Port 16 )

A bad example could be any of these:

48 4 0 653000 id N4 Online E-Port 10:00:00:05:1e:36:05:b0 "<SWITCH_NAME>" (Trunk master)
49 4 1 653100 id N4 Online E-Port (Trunk port, master is Slot 4 Port 0 )
50 4 2 653200 id N4 Online E-Port (Trunk port, master is Slot 4 Port 0 )
51 4 3 653300 id N4 Online E-Port (Trunk port, master is Slot 4 Port 0 )
60 4 12 653c00 id N4 Online E-Port (Trunk port, master is Slot 4 Port 13 )

Note: Whenever you use the "portcfgspeed" command, the port will go offline and come back online, so it is disruptive for that particular link. Implement the change at an appropriate time.

Potential Risks: With ISL/IFL ports in "Auto Negotiate" mode, the switches will keep checking the connectivity, which leads both switches to exchange capabilities and may lead to principal switch polling.
Actions to correct the error: Make sure that you set the port speeds of the ISLs/IFLs on all switches to a fixed value.

Page 67: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

• #1 Top Reason - Improper configuration changes or, even worse, "no" configuration changes

Best Practice - do not mix hard and soft zoning

Severity: Critical
Why the error has occurred:
1. There are two types of zoning identification:
   • Port World Wide Name (pWWN)
   • Domain, Port (D,P)
2. For easier management it is possible to assign aliases for both pWWN and D,P identifiers.
3. To ensure that all zoning implements frame-based hardware enforcement, use pWWN or D,P identification exclusively.
4. pWWN is more secure than D,P because of physical security issues, and it enables the use of FCR, FC FastWrite, Access Gateway and other features.
5. BEST PRACTICE: All zones should use frame-based hardware enforcement; the best way to do this is to use pWWN identification exclusively for all zoning configurations.
Potential Risks: Potential security and performance issues when not following Best Practice.
Actions to correct the error: Change any zones or aliases using Domain,Port to using WWPNs. The change must be planned with all responsible parties to ensure a nondisruptive change.

Page 68: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

• #1 Top Reason - Improper configuration changes or, even worse, "no" configuration changes

Best Practice - balance connections between fabrics

Severity: Critical
Why the error has occurred: Connections from the SVC to the fabrics are not balanced. The SVC passes this check if every node in the SVC cluster satisfies all of the following 3 conditions:
1. It is connected to exactly 2 independent fabrics.
2. It is connected to the same number of switches in each fabric.
3. The fiber connections to a switch (SW1) in one fabric correspond to the connections to a switch (SW2) in the other fabric - i.e., if the SVC is connected to switch SW1 with 2 fibers, then it must also be connected to SW2 with exactly 2 fibers. It is recommended that if, for example, the SVC is connected to ports 1 and 2 in SW1, it should also be connected to ports 1 and 2 in SW2, but this "strict mirroring" is not required in order to pass the check.

An independent fabric in this context can be one of two things:
• A "simple" fabric - just a group of interconnected switches.
• Two or more fabrics connected via Fibre Channel routing (FCR) - the switches will in effect make up a single fabric.

TPCHC is able to distinguish between fabrics with and without FCR. If a storage device is connected to 2 switches in the same fabric, then the switches are either in the same simple fabric or they are in separate fabrics connected via FCR. In either case the switches are NOT in independent fabrics, and the storage device fails this check. Multiple independent fabrics can have the same ID (fid), but TPCHC is able to distinguish between different fabrics using the same fid.

Potential Risks: If storage devices are not connected redundantly to two fabrics, there is a single point of failure. Furthermore, the workload cannot be spread evenly between the two fabrics.
Actions to correct the error: Verify the connections to the fabrics and take corrective action to ensure balance is in place.

Page 69: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

• #1 Top Reason - Improper configuration changes or, even worse, "no" configuration changes

Best Practice - MM/GM SVC vdisks 500 GB max size

Name: Vdisk sizing - max 500 GB if used in a Metro or Global Mirror relationship
Severity: Info
Why the error has occurred: FlashCopy (FC) enabled volumes larger than 500 GB were discovered on this SVC cluster.
Potential Risks: Suboptimal performance.
Actions to correct the error: Reduce the size of the volumes.

Page 70: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Best Practice - no disk controller ports offline

• #4 Top Reason - Single points of failure - hardware components fail but are not reconfigured properly

Name: Disk controllers degraded
Severity: Minor
Why the error has occurred: One or more controllers have been reported as degraded. This may be an unexpected condition, resulting from back-end storage ports that were once configured but are no longer logged in to the SVC.
Potential Risks:
1. IO loss and corrupted data
2. Data integrity is at risk
3. This condition risks possible saturation of bandwidth on the existing configured ports, potentially restricting the available data I/O bandwidth
Actions to correct the error:
1. Check the back-end storage controller for failed arrays, volumes or ports. Take appropriate action to correct the condition.
2. Check fabric conditions for port failures.
3. It may be appropriate to open a hardware PMR with support to diagnose the cause of this condition.

Page 71: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

• #4 Top Reason - Single points of failure - hardware components fail but are not reconfigured properly

Best Practice - correct degraded host port configurations

Severity: Major
Why the error has occurred: One or more hosts have reported a host-to-SVC path status that is degraded. This condition may be unexpected; check with the host owners for hosts that have failed HBAs.
Potential Risks:
1. Undetected loss of redundancy.
2. If this condition is unexpected, the host and the applications residing on it could be impacted by a loss of IO load balancing and imminent IO bottlenecks.
3. The data residing on these hosts is more susceptible to failure because the remaining port is in a single-point-of-failure condition. In the event of the remaining port failing, the following could occur:
   1. I/O loss
   2. Database corruption
   3. Lengthy restore from tape backups
   4. Extended problem determination
Actions to correct the error:
1. Check fabric ports for unexpected failures or offline ports.
2. Check the SVC logical host definitions for wrong WWPN information, in the event that a host HBA has been replaced but not updated in the SVC logical definition.
3. Check with the host owners (Sys Admins) for decommissioned hosts that can be removed from the SVC logical definitions.
4. Dual HBAs should be architected on the clustered servers and the remaining non-clustered servers in order to add resiliency and protect the critical data. The client should be advised of the current risks associated with the current SPoFs.
5. Best practice is to dual-port every host connection to the fabric. Further "proper" testing should be done during a maintenance window.
6. Phase 1 - testing the redundancy between the fabric and the host:
   1. Open a change record to reflect the change (make sure all necessary approvers are notified).
   2. Identify and verify which host HBAs are active for I/O activity by performing a test read and write to the SAN disk from the host.
   3. Stop I/O between the host and the disk storage.
   4. On the SAN fabric, block the switch port on the "even" fabric zoned between the host and the storage device.
   5. Perform another read/write test to the same LUN.
   6. Identify and verify which host HBA is active for I/O activity.
   7. On the even SAN fabric, unblock the switch port.
   8. On the SAN fabric, block the switch port on the "odd" fabric zoned between the host and the storage device.
   9. Perform another read/write test to the same LUN.
   10. Identify and verify which host HBA is active for I/O activity.
   11. On the odd SAN fabric, unblock the switch port.
   12. If the I/O activity toggles between the two HBAs, then phase 1 of the test is successful.
7. Phase 2 - testing the redundancy between the storage device and the host:
   1. Repeat the process defined in phase 1, except block, test and unblock the ports connected to the storage ports instead of the host.
   2. When a new host server or storage device is added to the environment, testing is strongly recommended.
8. Note: Ideally this type of test is best done during the initial implementation of new equipment, before it is turned over to the customer or placed in production.

Page 72: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

• #1 Top Reason - Improper configuration changes or, even worse, "no" configuration changes

Best practice - utilize all FC ports, but only ports 1 & 3 to the SVC

Severity: Warning
Why the error has occurred: Best practice is to utilize all hardware bought, so there are:
• No idle components
• Optimized usage for maximum performance and resiliency
• For XIV, FC port number 4 is pre-configured for mirroring; this can be changed if mirroring is not used, and the port utilized for host connection, to improve performance and the distribution of ports between fabrics
• For connection to the SVC - use only ports 1 & 3
Potential Risks: Wasted capacity and a lack of connectivity for hosts, where best practice is that all hosts should have connectivity to all modules.
Actions to correct the error: Cable the remaining ports or modules.

Page 73: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Best practice –XIV

Each host must have minimum 2 connections

73

Severity: Major

Why the error has occurred: This error has occurred because fewer than two connections were discovered in the XIV logical host definition.

A logical host definition should contain at least two WWPNs, i.e. a good example looks like this:

Name Type FC Ports

dssapsrvu002 default 21000000C95D3A6A,10000000C95FFA75

dssapsrvu003 default 21000000C95AAA6A,21000000C95BAA75,21000000C9510A6A,21000000C9512A75

A bad example could look like this:

Name Type FC Ports

dssapsrvu002 default 10000000C95GAA6A

Potential Risks: With a single connection, the host has a single point of failure and no I/O failover in the event of losing a port on the XIV, the fabric, or the host HBA.

The data residing on these hosts is more susceptible to failure, which could result in the following:

1. I/O loss

2. Database corruption

3. Lengthy restore from tape backups

4. Extended problem determination

Actions to correct the error:

1. Check the XIV logical host definitions for hosts with only one HBA defined (see the XCLI example below)

2. Add a second HBA definition and/or host HBA for any definitions showing fewer than two HBAs

3. Note: Take caution when removing HBA definitions, as this is a disruptive action. Host system administrators will need to rescan or rediscover the corrected paths after action is taken.
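
A quick way to audit host definitions for single-ported hosts is the XCLI; the command names below are an assumption from memory (verify them against the XIV documentation), and the WWPN shown is only the one from the good example above:

host_list                                                     # one line per logical host definition
host_list_ports host=dssapsrvu002                             # lists the WWPNs defined for this host
host_add_port host=dssapsrvu002 fcaddress=10000000C95FFA75    # adds a second WWPN; adding a port is non-disruptive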

Page 74: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Best practice – V7000u

Best practice is to utilize all node ports

74

Severity: Warning

Why the error has occurred: This error has occurred because a node, or a port on a node, is offline or has failed.

Potential Risks: Best practice is to utilize all hardware purchased, so that there are no idle components and usage is optimized for maximum performance and resiliency.

For the V7000u there might be ports reserved for services such as Global and Metro Mirroring, but it is important to make sure there are no errors in the configuration, which might result in:

1. Data integrity is at risk.

2. This condition could result in a single point of failure for any host attached to a node pair in an I/O group.

3. The attached hosts and the applications residing on them could be impacted by a loss of I/O load balancing and imminent I/O bottlenecks.

4. The data residing on these hosts is more susceptible to failure if the remaining port fails. In the event of the remaining port failing, the following could occur:

5. I/O loss

6. Database corruption

7. Lengthy restore from tape backups

8. Extended problem determination

Actions to correct the error:

1. It may be appropriate to open a hardware PMR with support to diagnose the cause of this condition

2. Open a hardware PMR to dispatch or consult with a support Product Field Engineer (PFE) to check the condition (a CLI status check is sketched below)
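
On the block side of the V7000 Unified, a minimal status check from the Storwize CLI might look like the following (command names assume a recent code level):

lsnodecanister    # both node canisters should report status 'online'
lsportfc          # every FC port should report an expected status; an unexplained offline port warrants the PMR above

Ports deliberately reserved for Metro or Global Mirror should still show as healthy; only unexplained offline nodes or ports need the hardware PMR described above.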

Page 75: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Best practice – V7000u

SMTP server must be filled in

75

Severity: Minor

Why the error has occurred: This error has occurred because SMTP was reported as not being enabled. SMTP must be turned on as a minimum requirement for email alerting to work, and a valid entry such as callhome*@*.ibm.com should be configured.

Potential Risks: Without SMTP configured, email event and inventory notifications cannot be managed or activated, risking critical errors going undetected.

Actions to correct the error:

1. Enable SMTP by doing the following (a consolidated command sequence follows at the end of this list):

1. Issue the mkemailserver CLI command. Up to six SMTP email servers can be configured to provide redundant access to

the external email network.

2. The following example creates an email server object. It specifies the name, IP address, and port number of the SMTP

email server. After you issue the command, you see a message that indicates that the email server was successfully created.

mkemailserver -ip ip_address -port port_number

where ip_address specifies the IP address of a remote email server and port_number specifies the port number for the email

server.

Add recipients of email event and inventory notifications to the email event notification facility. To do this, issue the

mkemailuser CLI command. You can add up to twelve recipients, one recipient at a time.

The following example adds email recipient manager2008 and designates that manager2008 receive email error-type event

notifications.

mkemailuser -address [email protected] -error on -usertype local

•Set the contact information that is used by the email event notification facility. To do this, issue the chemail CLI command. If

you are starting the email event notification facility, the reply, contact, primary, and location parameters are required. If you are

modifying contact information used by the email event notification facility, at least one of the parameters must be specified.

The following example sets the contact information for the email recipient manager2008.

chemail -reply [email protected] -contact manager2008 -primary 0441234567 -location 'room 256 floor 1 IBM'

•Activate the email and inventory notification function. To do this, issue the startemail CLI command. There are no parameters

for this command.

•Note: Inventory information is automatically reported to IBM when you activate error reporting.

•Optionally, test the email notification function to ensure that it is operating correctly and send an inventory email notification.

To send a test email notification to one or more recipients, issue the testemail CLI command. You must either specify all or the

user ID or user name of an email recipient that you want to send a test email to.

To send an inventory email notification to all recipients that are enabled to receive inventory email notifications, issue the

sendinventoryemail CLI command. There are no parameters for this command.
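
Pulling the steps above into one sequence, a minimal sketch looks like the following. The IP address, port and recipient are placeholders, and the exact flag spellings should be checked against the CLI reference for your code level:

mkemailserver -ip 192.0.2.25 -port 25                                        # define the SMTP email server object
mkemailuser -address storage.admin@example.com -error on -usertype local    # placeholder recipient for error events
chemail -reply storage.admin@example.com -contact 'storage team' -primary 0441234567 -location 'room 256 floor 1 IBM'
startemail                                                                   # activate email and inventory notification
testemail -all                                                               # send a test notification to all recipients
sendinventoryemail                                                           # send an inventory notification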

Page 76: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Best Practice - Brocade

76

Name: Er_bad_os per port must be below 5 per minute

Severity: Info

Why the error has occurred: The number of invalid ordered sets (platform- and port-specific).

Brocade recommends alerting if the value exceeds 5 per minute. The counter is only read once a day, which would give a threshold of 5x60x24 = 7200. Daily operation is around 8 hours per day, so the value is divided by 3 and rounded to 2500.

Potential Risks: Loss of synchronization when running an 8 Gb link, causing interruption to the data stream.

Actions to correct the error: Change the fill word settings; see Brocade check 2.32 (and the example below).
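
A hedged example of reading the counter and applying the fill-word fix on an 8 Gb Brocade port (port 17 is a placeholder, and the right fill-word mode depends on the attached device and FOS level, so confirm it before changing anything):

portstatsshow 17 | grep er_bad_os    # invalid ordered sets counter for port 17
portcfgfillword 17,3                 # sets the fill word mode; mode 3 is commonly recommended for 8 Gb links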

Page 77: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Best Practice - Brocade

77

Name: Er_rx_c3_timeout per port must be below 5 per minute

Severity: Info

Why the error has occurred: The number of class 3 frames received at this port and discarded at the transmission port due to timeout (platform- and port-specific).

For further explanation see the IBM SANswers Wiki.

Brocade recommends alerting if the value exceeds 5 per minute. The counter is only read once a day, which would give a threshold of 5x60x24 = 7200. Daily operation is around 8 hours per day, so the value is divided by 3 and rounded to 2500.

Potential Risks: Discards of frames will result in I/O timeouts and retransmission of frames, causing interruption of the data stream.

Actions to correct the error:

• Fix the reason for the timeouts, which can be exhausted links and ISLs

• Too few buffer credits assigned to ISLs and storage ports

• Slow-draining devices, meaning devices that are not able to receive and process data at the speed it is sent

Page 78: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Best Practice – Brocade / Regularly Check Status

78

Issue: Er_tx_c3_timeout per port is greater than 5 per minute

Severity: Info

Why the error has occurred: The number of transmit class 3 frames discarded at the transmission port due to timeout (platform- and port-specific).

For further explanation see the IBM SANswers Wiki.

Brocade recommends alerting if the value exceeds 5 per minute. The counter is only read once a day, which would give a threshold of 5x60x24 = 7200. Daily operation is around 8 hours per day, so the value is divided by 3 and rounded to 2500.

Potential Risks: Discards of frames will result in I/O timeouts and retransmission of frames, causing interruption of the data stream.

Actions to correct the error (see the diagnostic sketch below):

1. Fix the reason for the timeouts, which can be exhausted links and ISLs

2. Too few buffer credits assigned to ISLs and storage ports

3. Slow-draining devices, meaning devices that are not able to receive and process data at the speed it is sent
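
To find which ports are behind the class 3 discards and slow-drain behaviour, the FOS commands below are a reasonable starting point; bottleneckmon availability depends on the FOS level, so treat this as a sketch:

porterrshow                                 # per-port error summary, including the disc c3 column
portstatsshow 17 | grep er_tx_c3_timeout    # placeholder port number; the raw counter for one port
bottleneckmon --show                        # lists ports currently flagged for latency or congestion bottlenecks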

Page 79: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

• Disk mapping at a glance

– Mapping becomes important • Spreading versus isolation

79

Isolation Spreading

Track data placement and Host Vdisk mapping

Documentation –

Does it matter? Why?

Page 80: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

• Spreading versus Isolation

– Spreading the I/O across MDGs exploits the aggregate throughput offered by more physical resources working together

– Spreading I/O across the hardware resources will also render more throughput than isolating the I/O to only a subset of hardware resource

– You may reason that the more hardware resources you can spread across, the better the throughput

• Don’t spread file systems across multiple frames – Makes it more difficult to manage code upgrades, etc.

• Should you ever isolate data to specific hardware resources?

• Name a circumstance!

80

• Isolation

– In some cases more isolation on dedicated resources may produce better I/O throughput by eliminating I/O contention

– Separate FlashCopy – Source and Target LUNs – on isolated spindles

Data Placement and Host Vdisk

mapping

Page 81: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Data layout affects IO performance more than any tunable IO parameter

• If a bottleneck is discovered, then some of the things you need to do are (a host-side example follows at the end of this slide):

– Identify the hardware resources the heavy-hitting volumes are on

• Identify which D/A pair the rank resides on

• Identify which I/O enclosure the D/A pair resides on

• Identify which host adapters the heavy hitting volumes are using

• Identify which host server the problem volumes reside on

• Identify empty, unused volumes on other ranks/storage pools

– Move data off the saturated I/O enclosures to empty volumes residing on less used

ranks/storage pools

– Move data off the heavy-hitting volumes to empty volumes residing on less-used hardware resources, and perhaps to another storage device

– Balance LUN mapping across

• Backend and host HBAs

• SVC iogrps

• SVC preferred nodes

– Change the RAID type.

Traditional Data Placement –

StorAdmin – How do I improve disk performance?

81
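
On the host side, a quick way to find the heavy-hitting volumes before tracing them back to ranks, DA pairs and I/O enclosures is the AIX extended disk statistics; the VG name is a placeholder:

iostat -D 5 3            # extended per-hdisk statistics: throughput, service times and queue behaviour
lspv                     # map the busy hdisks to their volume groups
lsvg -l datavg           # placeholder VG name; shows which LVs and file systems sit on the busy disks
pcmpath query device     # ties each hdisk to its back-end LUN serial for the storage-side mapping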

Page 82: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Back-end Load Balancing

Which has better throughput?

82

[Figure: two back-end layouts showing arrays assigned to device adapter (DA) pairs DA_0 through DA_7. Left panel, "Unbalanced I/O to DA Cards": arrays are assigned to the DA pairs in a scattered, uneven order. Right panel, "Balanced I/O to DA Cards": arrays are assigned sequentially so that each DA pair carries an even share of the arrays – the balanced layout is the one with better throughput.]

Page 83: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Sequential IO Data layout

• Does understanding the backend enable good front-end configuration?

• Sequential IO (with no random IOs) best practice:

– Create RAID arrays with a data stripe width that is a power of 2

• RAID 5 arrays of 5 or 9 disks

• RAID 10 arrays of 2, 4, 8, or 16 disks

– Create VGs with one LUN per array

– Create LVs that are spread across all PVs in the VG using a PP or LV strip size >= a full stripe on the RAID array

– Do application IOs equal to, or a multiple of, a full stripe on the RAID array

– Avoid LV Striping

• Reason: Can’t dynamically change the stripe width for LV striping

– Use PP Striping

• Reason: Can dynamically change the stripe width for PP striping (a command sketch follows below)

83

Slide Provided by Dan Braden
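
A minimal AIX sketch of PP spreading rather than LV striping; the VG and LV names, PP size and LP count are placeholders:

mkvg -S -s 64 -y datavg hdisk2 hdisk3 hdisk4 hdisk5    # scalable VG, 64 MB PPs, one LUN per RAID array
mklv -e x -t jfs2 -y datalv datavg 400                 # -e x spreads the PPs across all PVs in the VG
lslv -m datalv                                         # verify the PP-to-PV spread

Because the allocation is PP-based, the spread can later be adjusted with migratepv or reorgvg, which is the flexibility the slide contrasts with fixed-width LV striping.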

Page 84: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Data Placement – Traditional Storage Pools and Striping

• Should you ever stripe with pre-virtualized volumes?

• We recommend not striping or spreading in SVC, V7000 and XIV Storage Pools

• Avoid LVM spreading with any striped storage pool

• You can use file system striping with DS8000 storage pools

– Across storage pools with a finer granularity stripe

– Within DS8000 storage pools but on separate spindles when volumes are created sequentially


84

[Diagram: Sequential Pools vs. Striped Pools, with labels "Host Stripe – RAID-0 only", "Host Stripe", "No Host Stripe" and "Host Stripe" indicating where host-level striping applies.]

Page 85: BP Perf and Troublshooting -Edge-v1.9

• Please refer to the following PPTs

provided by Dan Braden

• Disk IO Tuning

• SANBoot

85

More on Host Disk IO Tuning

Page 86: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Tip - Queue Depth Tuning

• Take some measurements

• Make some calculations

– (Storage port queue depth / total LUNs per host = queue depth)

• If a single host with 10 assigned LUNs is accessing a storage port supporting 4096 outstanding commands, then calculate 4096/10 = 409, which is capped at the 256 maximum in this case

– Are there different calculations for the different storage devices?

• Examples for volumes on homogeneous hosts (a worked example follows below):

– SVC: q = (n × 7000) / (v × p × c)

– DS8000: 2048

– XIV: 1400

– V7000: q = (n × 4000) / (v × p × c)

• Best thing to do is go to each device “Information Center” URLs listed in link slide

– Don’t increase queue depths beyond what the disk can handle! • IOs will be lost and will have to be retried, which reduced performance

• Note:

– For more information on the info needed to make the calculations, please refer to the deck by “Dan Braden” in the Extra slides at the end of this deck

86
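
A worked sketch of the SVC formula, assuming n = nodes per I/O group, v = volumes in the I/O group, p = paths per volume per host and c = hosts sharing the I/O group (check the parameter meanings against the SVC Information Center), followed by the AIX commands to view and set the value:

# q = (n x 7000) / (v x p x c) = (2 x 7000) / (100 x 2 x 4) = 17 (rounded down from 17.5)
lsattr -El hdisk4 -a queue_depth        # current queue depth for a placeholder hdisk
chdev -l hdisk4 -a queue_depth=17 -P    # deferred change; takes effect after the device is reconfigured or the host reboots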

Page 87: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015 87

SDD Driver Testing for Proper HBA Failover

• On an AIX VIO server, check the AIX system to verify that I/O activity still continues on the alternate ports by testing with SDD and/or SDDPCM commands (a condensed command sequence follows at the end of this slide)

• The Server Admin - Create a mount point for logical volume that can be manipulated to generate IO

traffic for the purpose of this test

• The Server Admin - Verify and record the selected (targets yet to be determined) data paths for preferred and alternate status (active and inactive) by using the SDDPCM "pcmpath query device" or SDD "datapath query device" command on the AIX VIO server

• Note the path selection counts on the multiple paths. Only two paths should show counts above zero (0) under the "Select" column; these are the two open paths on the preferred node. If paths 0 and 2 show numbers other than zero under the "Select" column, then do the following:

1. Take one path offline by issuing the command "pcmpath set device 0 path 0 offline" or "datapath set device 0 path 0 offline" - Path 0 should now be in a dead state.

2. Go to the mount point of the LV and edit a file to create traffic. After creating the traffic, reissue "pcmpath query device" or "datapath query device" and look at the path selection numbers. Notice that only the path selection count for Path 2, the other preferred path, increased

3. Close Path 2 by issuing the command "pcmpath set device 0 path 2 offline" or "datapath set device 0 path 2 offline"

4. Return to the mount point and add or edit files to create IO.

5. Execute the "pcmpath query device" or "datapath query device“ command, to look at the path selection count. Disk access should

now be via the other paths. (This is now load balancing to the non-prefered SVC node for this Vdisk)

Reestablish both preferred paths by executing the following commands: "pcmpath set device 0 path 0 online" and "pcmpath

set device 0 path 2 online" or "datapath set device 0 path 0 online" and "datapath set device 0 path 2 online”

Slide provided by Chuck Laing
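
A condensed version of the sequence above, using the SDDPCM commands quoted in the slide (device 0 and paths 0/2 are the placeholders used in the text):

pcmpath query device                   # record the Select counts per path
pcmpath set device 0 path 0 offline    # take the first preferred path down, then generate I/O at the mount point
pcmpath query device                   # only the remaining preferred path's Select count should grow
pcmpath set device 0 path 2 offline    # now I/O should move to the non-preferred node's paths
pcmpath set device 0 path 0 online
pcmpath set device 0 path 2 online     # restore both preferred paths when testing is complete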

Page 88: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015 88

Non SDD Driver Testing for Proper HBA Failover

• Further "proper" testing should be done during a maintenance window

• Testing the redundancy between the Fabric and the host

1. Open a change record to reflect the change (make sure all necessary approvers are notified)

2. The Server Admin - Identify and verify which host HBAs are active for I/O by performing a test read and write to the SAN disk from the host

3. The Server Admin - Stop I/O between the host and the disk storage

4. The SAN Admin - On the SAN fabric, disable the switch port on the "even" fabric zoned between the host and the storage device (see the Brocade command sketch at the end of this slide)

5. The Server Admin - Perform another read/write test to the same LUN

6. The Server Admin - Identify and verify which host HBA is active for I/O activity

7. The SAN Admin - On the even SAN fabric enable the Switch port

8. The SAN Admin - On the SAN fabric, disable the switch port on the "odd" fabric zoned between the host and the storage device.

9. The Server Admin - Perform another read/write test to the same LUN

10.The Server Admin - Identify and verify which host HBA is active for I/O activity

11. The SAN Admin - On the odd SAN fabric, enable the switch port

• If the I/O activity toggles between the two HBAs, then the test is successful

• When a new Host server or Storage device is added to the environment testing is strongly recommended

• Note: Ideally this type of test is best done during the initial implementation of new equipment, before it is turned over to the customer or placed in production

Slide provided by Chuck Laing
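
On a Brocade fabric the SAN Admin steps above typically map to the commands below; port 23 is a placeholder, so confirm the port with switchshow before disabling anything:

switchshow       # identify the switch port where the host HBA logs in
portdisable 23   # block the path on the even (or odd) fabric for the read/write test
portenable 23    # restore the port once the test on that fabric is complete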

Page 89: BP Perf and Troublshooting -Edge-v1.9

sPE0330 © Copyright IBM Corporation 2015

• For performance degradation issues (sick but not dead scenarios)

• Are you or any team member aware of any failed hardware?

• Is this a performance degradation issue (sick but not dead) to the point of stopping an application, or of high-millisecond read/write response times at the application disk level?

• The following could cause this scenario

• A failed Switch Optic

• A physical cable not securely attached

• Is the patch panel clean and securely attached?

• If no failed hardware is involved but congestion is possible:

• Are there buffer credit errors in the switch, SVC or storage error or message logs? (see the quick checks below)

• Did a large DB query just run?

• Is there a server HBA with errors? An HBA causing slow drain may need to be blocked until it is replaced

• Is there a blade server with errors?

• Are any storage adapters fencing?

Troubleshooting Questions, What else could it be?

89
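
Two quick checks that often answer the congestion questions above (a sketch; output and parameters differ by FOS and SVC code level):

porterrshow             # on each Brocade switch, watch for climbing disc c3, enc out and crc err counters
lseventlog -order date  # on the SVC/Storwize cluster, review recent events such as adapter fencing or login losses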

Page 90: BP Perf and Troublshooting -Edge-v1.9

© Copyright IBM Corporation 2015

Document VIO to LPAR mapping

• Sample script output used to produce the documentation (the commands below approximate it)

90

Content provided by Aydin Y. Tasdeler
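
The script output referred to above can be approximated directly on each VIO server with the padmin commands below; a minimal sketch:

lsmap -all         # vSCSI mappings: vhost adapter, client partition ID and backing devices (hdisks or LVs)
lsmap -all -npiv   # NPIV mappings: virtual FC adapter, client partition and the physical FC port used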