Scope - The primary focus of this presentation is how to leverage open source software to help in managing Shared Storage performance. The storage server will be the focus with particular emphasis on ESS. This solution is a small one-off solution.
© 2003 IBM Corporation
IBM GLOBAL SERVICES
New Orleans, LA
P12
Brett Allison
Leveraging Open Source to Manage SAN Performance
July 25-29, 2005
© IBM Corporation 2005
Trademarks & Disclaimer
The following terms are trademarks of the IBM Corporation:
Enterprise Storage Server® - Abbreviated: ESS
TotalStorage® Expert - Abbreviated: TSE
FAStT/DS4000/DS8000
AIX®
Other trademarks appearing in this report may be considered trademarks of their respective companies.
Perl is available under the GNU General Public License or the Artistic License
Apache is governed by the Apache License
MYSQL is governed by the GNU General Public License
PEAR is governed by the PHP License
PHP is governed by the PHP License
MRTG is governed by the GNU General Public License
RRDTool is governed by the GNU General Public License
SANavigator, EFCM: McDATA
UNIX is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company Limited.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
EMC is a registered trademark of EMC Inc.
HP-UX is a registered trademark of HP Inc.
Solaris is a registered trademark of SUN Microsystems, Inc
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Disclaimer
The views in this presentation are those of the author and are not necessarily those of IBM
Table of Contents
Scope of presentation and solution
What is Open Source Software (OSS)?
What is the problem we wanted to address?
What is a SAN?
How can we measure and manage a SAN?
How can we use OSS to manage SAN performance?
What Is Open Source Software?
Open Source Software (OSS) is software that is developed, tested and maintained through community participation
OSS refers to open standards, shared source code and public collaborative development
OSS communities are usually enabled via the Internet
OSS does not mean free-of-charge software, but that the source code is freely available
OSS allows redistribution of copies
OSS allows modifications and distribution
What is a SAN?
Diagram: Servers | Fabric: Edge Switch A/B, ISLs, Core Switch A/B, Links, Storage Switch A/B | Storage Servers
What Can We Measure on the Attached Server?
Component | Metrics
Adapter | Read Kbytes/sec, Write Kbytes/sec, I/Os per second
Virtual Path/LUN | Read Kbytes/sec, Write Kbytes/sec, I/Os per second, Reads/sec, Writes/sec
Physical Volume | Read Kbytes/sec, Write Kbytes/sec, I/Os per second, Reads/sec, Writes/sec, End-to-End Response Time
Diagram (Server View): a server with two HBAs reaches a LUN on SAN Storage through the Fabric over PATH A and PATH B
What SAN Fabric Components Can We Measure?
What can be measured?
Diagram: Fabric: Edge Switch A/B, ISLs, Core Switch A/B, Links, Storage Switch A/B
What Can We Measure on the Storage Server?
Physical: KB/sec, RT
NVS Delays
Cache Hits
Logical Volume: Reads, Writes, Sequential I/Os, KB/sec, I/O Time
Physical
Problem Definition and Constraints
Lack of centralized end-to-end view of SAN
Insufficient tools and processes to determine ESS and SAN performance issues
Inadequate visibility into fabric performance
Lack of historical reporting
Cost-prohibitive enterprise tools
What is the Solution? (See Appendix H for Requirements)
Source | Collect | Post Process | Store
Server | sar, iostat, filemon | PERL, PHP | MYSQL
ESS | TSE, DB2, PERL | PERL, PHP | MYSQL
Switch | MRTG, SNMP | | RRDTOOL
Extract/Show (all sources): Apache, PHP, Browser
Legend: OSS, Mixed, OSS+Glue
Server Collection/Post-processing/Store
Use UNIX shell scripts to gather data
Use PERL or PHP to process
Create DB tables: server_performance, server_configuration
Use PERL or PHP to import processed data
Use Apache/PHP/Browser to view server data
Diagram: Server -> Collect (Appendix A, B) -> Post Process (Appendix C) -> Store (Appendix D) -> Extract/Show (Apache, PHP, Browser)
SAN Fabric – Collection with MRTG/SNMP
Diagram: Switch -> Collect (MRTG, SNMP)
Use MRTG to gather data over SNMP
Custom Perl scripts could be used, but MRTG is designed to do this
Easy to configure
Switches are queried every 5 minutes
A separate mrtg.cfg per fabric is recommended
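As a rough illustration, a per-fabric mrtg.cfg might look like the fragment below. The working directory, switch host name, SNMP community string, port index and MaxBytes value are all placeholder assumptions, not values from this deployment.

```
# mrtg-fabric-a.cfg: hypothetical config for one fabric
WorkDir: /var/mrtg/fabric-a
Interval: 5
LogFormat: rrdtool

# One target per switch port: <ifIndex>:<community>@<switch>
Target[edge_a_port1]: 1:public@edge-switch-a
MaxBytes[edge_a_port1]: 250000000
Title[edge_a_port1]: Edge Switch A, port 1
```

Keeping one file per fabric means each fabric can be polled, moved or disabled without touching the other.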
SAN Fabric Data Storage with MRTG/RRDTool
Diagram: Switch -> Collect (MRTG, SNMP) -> Store (RRDtool)
Fast!
Easy integration with mrtg (set “LogFormat: rrdtool” in config file)
Fixed-space – rrd size will never grow past its size at creation
Portable – Easy to move rrd’s between boxes
Processing is done by rrdtool itself during data insert
Various language extensions available
– php-rrd extension
– RRDs.pm perl module
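The fixed-space property follows from the round-robin layout: an archive has a set number of slots, and each new sample overwrites the oldest one. The Python sketch below only illustrates that idea; it is not RRDtool's on-disk format. The 5-minute step matches the polling interval used for the switches, and the one-day retention is an assumed value.

```python
from collections import deque

STEP_SECONDS = 300            # one sample per 5-minute poll
RETENTION_SECONDS = 86400     # assumed: keep one day of samples
SLOTS = RETENTION_SECONDS // STEP_SECONDS   # number of slots in the archive


def make_archive(slots=SLOTS):
    """Return a fixed-length buffer; appending past capacity drops the oldest."""
    return deque(maxlen=slots)


archive = make_archive()
for sample in range(1000):    # insert more samples than there are slots
    archive.append(sample)
```

However many samples are inserted, the archive never holds more than SLOTS entries, which is why storage needs are known at creation time.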
SAN Fabric View with RRDTOOL/PERL/Apache
Diagram: Switch -> Collect (MRTG, SNMP) -> Store (RRDTOOL) -> Extract/Show (Apache, PHP, Browser)
Data presented as graphs
With MRTG using rrdtool, graphs are not automatically generated
Use a CGI script to generate images “on demand”
– mrtg-rrd, 14all.cgi, etc., or a custom CGI
– Many included with the mrtg package
Many of the CGIs include an image caching mechanism
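The caching idea can be sketched as follows: redraw a graph only when the cached image is missing or older than one polling interval. This is a minimal Python illustration, not the logic of mrtg-rrd or 14all.cgi; the `render` callback and the 300-second lifetime are assumptions.

```python
import os
import time

CACHE_SECONDS = 300  # assumed: reuse an image until the next 5-minute poll


def cached_graph(image_path, render, now=None, max_age=CACHE_SECONDS):
    """Return image_path, calling render(image_path) only when the cached
    image is missing or older than max_age seconds."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(image_path)
    except OSError:            # no cached image yet
        age = None
    if age is None or age > max_age:
        render(image_path)     # hypothetical renderer that writes the image file
    return image_path
```

Because new data arrives only once per poll, a graph requested twice within the same interval would be identical, so serving the cached file saves the rendering cost.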
Sample Port Throughput Chart
Storage Server – Collection
Create DB tables: Array_Summary, Exception
Configure TSE to collect performance and capacity data: http://www.redbooks.ibm.com/abstracts/sg246102.html?Open
Create script to execute query (See Appendix E)
Gather ESS data with DB2 queries (See Appendix F & G)
Storage Server – Post Process and Import
Create array level summary report for each shift of importance, e.g. Prime Shift, 24-hour (See Appendix I).
Create exception report
Import the summary report and the exception report above into MYSQL (See Appendix D for an example; the SQL statement will need to be modified to fit the summary report file)
Array Configuration
Columns: START DATE, START TIME, END TIME, ESS SN, MODEL, CACHE PER CLUSTER, NVS PER CLUSTER, RPM, SIZE GB, CLUSTER, ADAPTER, LOOP, DISK GROUP
2B3172.810000102412288800023-1234523.59020050121
Array Performance
Columns: ARRAY, AVG IO RATE, MAX IO RATE, DISK UTIL AVG, DISK UTIL MAX, AVG ARRAY RT MS, AVG READ KBYTES RATE, AVG WRITE KBYTES RATE, READ PCT, SEQ PCT, CACHE HOLD AVG, CACHE HOLD MIN, AVG NVS FULL, MAX NVS FULL
70.2330814919.4343.0594.6925.371076.5421.59986.21104634.41rank10
Storage Server – View – Define Reports/Charts
Diagram: MYSQL -> Business Logic, SQL Queries -> Reports/Charts: Capacity, Health Check, Server, Component, Customer, Exceptions
Storage Server – View – Define Forms
1) Select ESS Reports, then “continue”
2) Click to select the ESS, or hold the ctrl key to select multiples
Storage Server – View ESS Array Summary Report
Storage Server – Chart Array Exceptions
Based on the exception table in the previous slide, we can drill down by clicking on an exception to chart it
Storage Server – ESS Health Check Customer View
• The pie chart shows the distribution of ranks across the ESS base by score category as a percentage of total ranks.
• The table shows the count of each ESS's ranks per rank score as a percentage of total ranks.
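The percentages behind the pie chart amount to counting ranks per score category and dividing by the total number of ranks. A minimal Python sketch of that arithmetic, using made-up rank names and score categories (the real scoring rules are not shown in this deck):

```python
from collections import Counter


def score_distribution(rank_scores):
    """Map each score category to its share of all ranks, in percent."""
    counts = Counter(rank_scores.values())
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical ranks and score categories, for illustration only.
ranks = {"rank10": "good", "rank11": "good", "rank12": "warning", "rank13": "critical"}
distribution = score_distribution(ranks)
```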
Appendix A - Measure End-to-End Host Disk I/O Response Time
OS | Native Tool | Command/Object | Metric(s)
AIX | filemon | filemon -o /tmp/filemon.log -O all | read time (ms), write time (ms)
HP-UX | sar | sar -d | avserv (ms)
Solaris | iostat | iostat -xcn 2 5 | svc_t (ms)
Linux | iostat* | iostat -d 2 5 | svctm (ms)
NT/Wintel | perfmon | Physical Disk | Avg. Disk sec/Read
* The iostat package for Linux is only valid with a 2.4 or 2.6 kernel
See Appendix B for links to more information
Appendix B: Getting LUN Serial Numbers for ESS Devices
OS | Tool | Command | Key | Other Metrics
AIX, HP-UX, Solaris | ESS Util | lsvp -a | LUN SN | VG, hostname, Connection, hdisk
Wintel | SDD | datapath query device | Serial | Device Name
Linux | SDD | lsvpcfg | LUN SN | Device Name
Note: ESS Utilities for AIX/HP-UX/Solaris are available at: http://www-1.ibm.com/servers/storage/support/disk/2105/downloading.html
Host config. - http://www.redbooks.ibm.com/abstracts/tips0553.html
Appendix C: Format ‘lsvp –a’ and ‘filemon’ (Logic)
1. Process ‘lsvp -a’ file
   Build hdisk hash with key = hdisk and value = LUN SN
   Build ess hash with key = hdisk and value = last 5 chars of LUN SN
2. Process ‘filemon’ file
   Create hashes for each of the following values with hdisk as the key: Date, Start time, Physical Volume, Reads, Avg Read Time, Avg Read Size, Writes, Avg Write Time, Avg Write Size
3. Print data to file with headers and commas to separate fields
   Iterate through the hdisk hash, use the common hdisk key to index into the other hashes, and print out those that have values
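The steps above can be sketched in Python as shown below. The sketch assumes the raw ‘lsvp -a’ and ‘filemon’ output has already been parsed into simple (hdisk, LUN SN) pairs and per-hdisk dictionaries; those input shapes, the field names and the reduced column set are illustrative assumptions, not the tools' actual output formats.

```python
import csv
import io


def build_hdisk_maps(lsvp_rows):
    """lsvp_rows: iterable of (hdisk, lun_sn) pairs already parsed from 'lsvp -a'."""
    lun_by_hdisk = {}
    ess_by_hdisk = {}
    for hdisk, lun_sn in lsvp_rows:
        lun_by_hdisk[hdisk] = lun_sn
        ess_by_hdisk[hdisk] = lun_sn[-5:]   # last 5 chars of the LUN SN
    return lun_by_hdisk, ess_by_hdisk


def merge_to_csv(lsvp_rows, filemon_by_hdisk, out):
    """Write a header plus one CSV row per hdisk that has filemon values."""
    lun_by_hdisk, ess_by_hdisk = build_hdisk_maps(lsvp_rows)
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["HDISK", "LUN_SN", "ESS_SN", "READS",
                     "AVG_READ_TIME", "WRITES", "AVG_WRITE_TIME"])
    for hdisk in sorted(lun_by_hdisk):
        stats = filemon_by_hdisk.get(hdisk)
        if stats is None:       # skip hdisks with no filemon data
            continue
        writer.writerow([hdisk, lun_by_hdisk[hdisk], ess_by_hdisk[hdisk],
                         stats["reads"], stats["avg_read_ms"],
                         stats["writes"], stats["avg_write_ms"]])
```

Keying every structure on hdisk is what lets the host-side response times be tied back to a specific LUN and ESS.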
Appendix D: Import Data into MYSQL - Logic
1. Consolidate files into import directory (FTP/SFTP/RCOPY)
2. Create array of files to import
3. Loop through each file
4. If file is readable open file for reading ($filehandle = fopen($file, 'rb'))
5. Loop through each $line in $file and create SQL statement (Example):
a. $sql = "INSERT INTO server (DATE, TIME, SERVER_NAME, LUN, ESS_SN, HDISK, READS, READ_TIME, READ_SIZE, WRITES, WRITE_TIME, WRITE_SIZE) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)";
6. Run the insert, passing the fields of the line as parameters:
a. $db->query($sql, explode(',', trim($line)));
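The import loop can be sketched in Python as follows. To keep the example self-contained it uses the standard-library sqlite3 module as a stand-in for MYSQL, so the connection setup and untyped columns are assumptions; the table and column names follow the example statement above.

```python
import csv
import sqlite3

COLUMNS = ["DATE", "TIME", "SERVER_NAME", "LUN", "ESS_SN", "HDISK",
           "READS", "READ_TIME", "READ_SIZE", "WRITES", "WRITE_TIME", "WRITE_SIZE"]


def import_rows(conn, lines):
    """Insert comma-separated lines into the server table; return rows inserted."""
    placeholders = ", ".join("?" for _ in COLUMNS)
    sql = "INSERT INTO server ({}) VALUES ({})".format(", ".join(COLUMNS), placeholders)
    inserted = 0
    for row in csv.reader(lines):
        # Skip blank or malformed lines and the header line.
        if len(row) != len(COLUMNS) or row[0] == "DATE":
            continue
        conn.execute(sql, row)   # parameterized insert, one row per line
        inserted += 1
    conn.commit()
    return inserted


conn = sqlite3.connect(":memory:")   # in-memory stand-in for the MYSQL database
conn.execute("CREATE TABLE server ({})".format(", ".join(COLUMNS)))
```

Passing the fields as parameters rather than pasting them into the SQL string avoids quoting problems when a field contains a quote or comma.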
Appendix E: DB2 Query Wrapper - Logic
1. Build static configuration file containing remote DB2 aliases (TSEs)
2. Build query template with keyword substitution, for example %ESSID, %STARTTIME, %ENDTIME, %STARTDATE, etc.
3. Execute query wrapper with query template and configuration file as parameters (default values for %STARTTIME, %STARTDATE, etc)
4. Loop through static configuration file and do the following for each DB:
   1. Replace keywords with current variables (This is in the $runfile)
   2. Create a shell script to execute the query file (This is the $kshfile)
### PERL Code snippet ###
# This is not a fully functional script, it is just an example
open(KSHFILE, "> $kshfile") || &msg("die", "Could not write $kshfile! $!");   # Open shell script
print KSHFILE "db2 connect to $remote user $db2user using $db2pass\n";        # Print db2 connect info to shell
print KSHFILE "db2 -tf $runfile\n";                                           # Print command to run query file
print KSHFILE "db2 connect reset\n";                                          # Reset connection
close(KSHFILE);                                                               # Close the shell script
system("chmod +x $kshfile");                                                  # Give the shell script execute perms
system("$kshfile");   # Run the shell script; system() returns, so the loop can continue with the next DB
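The keyword substitution in step 4 can be sketched in Python as below. The template text is a fragment modeled on the Appendix F query, and the values are hypothetical; the actual wrapper is the author's PERL script, which is not shown in full.

```python
def fill_template(template, values):
    """Replace each %KEYWORD in the query template with its value."""
    query = template
    # Substitute longer keywords first so one keyword can never clobber
    # another that merely starts with the same letters.
    for keyword in sorted(values, key=len, reverse=True):
        query = query.replace("%" + keyword, values[keyword])
    return query


template = ("A.PC_DATE_B >= '%STARTDATE' AND A.PC_DATE_E <= '%ENDDATE' "
            "AND A.M_MACH_SN = '%ESSID'")
values = {"STARTDATE": "2005-01-21", "ENDDATE": "2005-01-21", "ESSID": "12345"}  # hypothetical
query = fill_template(template, values)
```

Plain text substitution like this is only appropriate when the values come from a trusted configuration file, as they do here.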
Appendix F: DB2 Query for Array Performance Data
Note: This information is relevant only if you have the TotalStorage Expert installed and access to the DB2 command line on the TSE server.
SELECT DISTINCT
A.*,
B.M_CARD_NUM,
B.M_LOOP_ID,
B.M_GRP_NUM
FROM
DB2ADMIN.VPCRK A,
DB2ADMIN.VPCFG B
WHERE (
(
A.PC_DATE_B >= '%STARTDATE' AND
A.PC_DATE_E <= '%ENDDATE' AND
A.PC_TIME_B >= '%STARTTIME' AND
A.PC_TIME_E <= '%ENDTIME' AND
A.M_MACH_SN = '%ESSID' AND
A.M_MACH_SN = B.M_MACH_SN AND
A.M_ARRAY_ID = B.M_ARRAY_ID AND
A.P_TASK = B.P_TASK
)
)
ORDER BY
A.M_ARRAY_ID, A.PC_DATE_B, A.PC_DATE_E with ur;
Appendix G: DB2 Query for Array Configuration Data
Note: This information is relevant only if you have the TotalStorage Expert installed and access to the DB2 command line on the TSE server.
SELECT DISTINCT
A.M_MACH_SN,
A.M_MODEL_N,
A.M_CLUSTER_N,
A.M_RAM,
A.M_NVS,
C.I_DDM_RPM,
C.I_DDM_GB_CAPACITY
FROM
DB2ADMIN.VPVPD A,
DB2ADMIN.VMPDX B,
DB2ADMIN.VcMDDM C
WHERE (
(
A.M_MACH_SN = B.I_VSM_SN AND
B.I_VSM_IDX = C.I_VSM_IDX
)
)
ORDER BY
A.M_MACH_SN, A.M_CLUSTER_N;
Appendix H: Requirements - URLs
Software Requirements
– DB2 SDK 7.2.6 (Default TSE DB2 is 7.2 and is not fully compatible with newer versions!)
– PERL v 5.6 or later: http://www.perl.com/download.csp
– Apache v1.3.33: http://httpd.apache.org/
– PHP 4.0 or later (5.0 has full OO support): http://www.php.net/downloads.php
– MYSQL 4.1.9 or later: http://dev.mysql.com/downloads/
– PEAR 1.3.4: http://www.go-pear.org/
  Installed packages:
  Package | Version | State
  Archive_Tar | 1.1 | stable
  Console_Getopt | 1.2 | stable
  DB | 1.6.8 | stable
  HTML_Common | 1.2.1 | stable
  HTML_QuickForm | 3.2.4pl1 | stable
  HTML_Table | 1.5 | stable
  PEAR | 1.3.4 | stable
  XML_RPC | 1.1.0 | stable
– JpGraph 2.0alpha2 (22 Jan 2005): http://www.aditus.nu/jpgraph/
– MRTG/RRDTool 2.12.1: http://people.ee.ethz.ch/~oetiker/webtools/mrtg/
Hardware Requirements
– Server with connectivity to all devices
Appendix I: Array Summary Report Logic
Read in configuration data (Appendix G)
– Create hashes with ESS as key and values for Model, Cache, RPM, DDM Size, NVS
Sort array performance data file (Appendix F) by ESS, Array
Read in Array level metrics into arrays with ESS, Array, Time Stamp as keys filtering out any records with times that are outside of the start and end time specified by user
Create exception report based on pre-defined thresholds
Loop through each array metric and calculate minimums, maximums, averages, etc.
Print out file to CSV (Use ESS as key to configuration data)
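The summary and exception steps can be sketched in Python as follows. The record layout, metric name and threshold are illustrative assumptions; the time-window filter and the join to the configuration data are left out to keep the example short.

```python
from collections import defaultdict


def summarize(records, thresholds):
    """records: iterable of (ess, array, metric, value) tuples.

    Returns (summary, exceptions): min/max/avg per (ess, array, metric), and
    the keys whose maximum exceeds the threshold defined for that metric."""
    samples = defaultdict(list)
    for ess, array, metric, value in records:
        samples[(ess, array, metric)].append(value)

    summary, exceptions = {}, []
    for key in sorted(samples):
        values = samples[key]
        stats = {"min": min(values), "max": max(values),
                 "avg": sum(values) / len(values)}
        summary[key] = stats
        limit = thresholds.get(key[2])       # threshold is looked up by metric name
        if limit is not None and stats["max"] > limit:
            exceptions.append(key)
    return summary, exceptions


# Hypothetical samples and threshold, for illustration only.
records = [("12345", "rank10", "DISK_UTIL", 20.0),
           ("12345", "rank10", "DISK_UTIL", 60.0),
           ("12345", "rank11", "DISK_UTIL", 10.0)]
summary, exceptions = summarize(records, {"DISK_UTIL": 50.0})
```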
Appendix J: Useful Links Slide
AIX documentation: http://www-1.ibm.com/servers/aix/library/index.html
Linux – iostat: http://www.linuxinsight.com/
HP-UX documentation: http://docs.hp.com/
Solaris documentation: http://docs.sun.com/app/docs
SNIA standards
The Fibre Channel HBA API Project
Perl-SNMP
RRDtool
ESS documentation
– ESS Model 800 Performance
– IBM TotalStorage Expert Reporting: How to Produce Built-In and Customized Reports
– IBM TotalStorage Expert Hands-On Usage Guide
Biography
Brett Allison has been doing distributed systems performance related work since 1997 including J2EE application analysis, UNIX/NT, and Storage technologies. His current role is Performance and Capacity Management team lead ITDS. He has developed tools, processes, and service offerings to support storage performance and capacity. He has spoken at a number of conferences and is the author of several White Papers on performance.