Operating System, Storage Performance Analysis
Robert M. Smith, Microsoft Corporation
Operating System Storage Performance Analysis © 2012 Storage Networking Industry Association. All Rights Reserved.
SNIA Legal Notice
The material contained in this tutorial is copyrighted by the SNIA unless otherwise noted. Member companies and individual members may use this material in presentations and literature under the following conditions:
Any slide or slides used must be reproduced in their entirety without modification.
The SNIA must be acknowledged as the source of any material used in the body of any document containing material from these presentations.
This presentation is a project of the SNIA Education Committee. Neither the author nor the presenter is an attorney and nothing in this presentation is intended to be, or should be construed as legal advice or an opinion of counsel. If you need legal advice or a legal opinion please contact your attorney. The information presented herein represents the author's personal opinion and current understanding of the relevant issues involved. The author, the presenter, and the SNIA do not assume any responsibility or liability for damages arising out of any reliance on or use of this information. NO WARRANTIES, EXPRESS OR IMPLIED. USE AT YOUR OWN RISK.
Abstract
OS Storage Performance Analysis
Analyzing and dealing with storage performance at the OS level can be challenging in many respects. This tutorial covers the factors that affect storage performance and the tools that can assist in analyzing performance at the operating system level. The presentation includes the following:
Factors affecting storage performance
Examples of tools to monitor storage performance
Recommendations to improve storage performance
SAN I/O Path, 1000 ft. view
OS I/O: Closer View
File System
Volume / Partition
Device Class
Command Port
User Mode
Kernel Mode
Application
Storage
Rotational Drives
“Capacity Optimized” drives
  Size (TB): 0.5, 1, 2, 3
  *IOPS: >= 120 (worst case, random “full-stroke” workloads)
  SAS or SATA
  Same performance and IOPS regardless of size
  ~8.5 ms latency (½ platter seek); worst case 16 to 19 ms (on average across manufacturers)
“Performance Optimized” drives
  Size (GB): 72, 144, 450, 600, 900
  *IOPS: 200-400 (worst case)
  SAS, FC (some SATA)
  2-4 ms latency (on average across manufacturers)
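As a rough cross-check of the figures above, worst-case random IOPS for a single drive can be approximated as the reciprocal of the per-request service time. The sketch below assumes one outstanding request at a time and ignores transfer time, caching, and command queuing; the function name is illustrative and not from the tutorial.

```python
def iops_from_latency_ms(latency_ms: float) -> float:
    """Approximate IOPS when only one request is outstanding at a time."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


# Latency figures quoted for the two drive classes
capacity_iops = iops_from_latency_ms(8.5)  # about 118, close to the quoted 120
performance_iops = [iops_from_latency_ms(ms) for ms in (4.0, 2.0)]  # 250.0, 500.0
```

Under this simplified model, 2-4 ms corresponds to 250-500 IOPS, somewhat above the quoted 200-400, which is plausible given that the model leaves out transfer time and controller overhead.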
Disk Drive Factors
SSD & Hybrid Storage
Cost: dollars per GB
SSD (Solid-State Drive)
  No moving parts
  Less power consumption
  75, 150, 300, 500, 600 GB
  OS likely has native SSD support (TRIM, etc.)
  Microsecond latency
  Flash block erase before write
  Undersized: provides spare cells for wear-leveling and bad-block mapping (e.g., a 150 GB drive might be sold as a 100 GB drive)
Hybrid Storage Solutions
  Solid-state and rotational disks in the same chassis
  “Hot” data serviced by SSD, other data serviced by rotational disks
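The "undersized" example (150 GB of flash sold as 100 GB) can be expressed as a spare-capacity calculation. This is a minimal sketch; the function and key names are hypothetical, and the over-provisioning ratio uses the common convention of spare capacity divided by advertised capacity, which the tutorial itself does not define.

```python
def spare_capacity(raw_gb: float, advertised_gb: float) -> dict:
    """Return spare flash capacity for a drive sold below its raw size."""
    if not 0 < advertised_gb <= raw_gb:
        raise ValueError("advertised_gb must be positive and no larger than raw_gb")
    spare = raw_gb - advertised_gb
    return {
        "spare_gb": spare,
        "spare_fraction_of_raw": spare / raw_gb,
        # Assumed convention: spare relative to advertised capacity
        "over_provisioning_ratio": spare / advertised_gb,
    }


example = spare_capacity(150, 100)  # the 150 GB / 100 GB case from the slide
```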
Disk Drive Factors (2)
Storage Hardware Factors
Controller Cache Configurations
  How much cache?
  What is the read/write ratio of the cache?
  How effective is the cache?
Enterprise storage usually has performance measuring capability onboard
What happens when a threshold is reached (i.e., a flush)?
  Idle flushing: does not interrupt, I/O continues
  Low and high watermark flushing: triggers flushing, minor performance impact
  Forced flushing: to free cache pages, all I/O temporarily halted
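The three flushing modes can be sketched as a small decision function with hysteresis between the high and low watermarks. Real controllers differ in the details; the class name, the 40% and 80% thresholds, and the treatment of a full cache as "forced" are all assumptions made for illustration.

```python
class CacheFlushPolicy:
    """Illustrative write-cache flush decision; thresholds are hypothetical."""

    def __init__(self, low_pct: float = 40.0, high_pct: float = 80.0):
        self.low_pct = low_pct
        self.high_pct = high_pct
        self.watermark_flushing = False

    def decide(self, dirty_pct: float, host_io_idle: bool) -> str:
        if dirty_pct >= 100.0:
            # Cache is full: pages must be freed before new writes are accepted
            self.watermark_flushing = True
            return "forced"
        if dirty_pct >= self.high_pct:
            self.watermark_flushing = True   # start draining at the high watermark
        elif dirty_pct <= self.low_pct:
            self.watermark_flushing = False  # stop once the low watermark is reached
        if self.watermark_flushing:
            return "watermark"
        if host_io_idle and dirty_pct > 0:
            return "idle"                    # opportunistic flush, no host impact
        return "none"
```

For example, a policy that reaches 85% dirty keeps returning "watermark" at 60% and only stops once the cache drains to 40% or below.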
Storage Hardware Factors (2)
Is cache “mirroring” involved?
  If so, is there a performance impact?
Are there other workloads on the storage device?
What hardware is between initiator and target?
  If SAN, how many and what types of switches?
  Virtualization appliances
    Some take the “LUNs” presented and virtualize those
    Some have onboard storage
Storage Hardware Virtualization
Virtual Disks (AKA LUNs)
  Composed of a group of physical disks, or a “chunk” of such a group, and then presented by a storage device
  Possibly indicated by:
    Non-standard size
    Device interrogations returning the storage vendor rather than the drive vendor
Virtualize to consolidate
  Aggregation of underlying LUNs (virtualization appliance)
Adds complexity
  Troubleshooting is more difficult (for example, “hot spots” are very tough to find)
Storage Layout Factors: Disk Configuration
RAID level
  e.g., 1, 5, 6, 1+0, 0+1, 5+0, 0+5, 6+0, etc.
Number of physical disk drives backing
Levels of virtualization between server(s) and disks?
Any storage pool sharing involved?
  Dedicated disks or shared storage pools?
What is the backup schedule for ALL connected hosts?
  LUN snapshots, database table scans, etc.
What decisions affected design?
Cost Consolidation Migration Risk
RAID types versus performance
Power and cooling Expansion Manageability
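To make "RAID types versus performance" concrete, the sketch below estimates host-visible IOPS from the number of disks, per-disk IOPS, and the read/write mix. The write-penalty values are the commonly cited rule-of-thumb figures (back-end I/Os per host write), not numbers from this tutorial, and the model ignores controller cache and full-stripe write optimizations.

```python
# Rule-of-thumb back-end I/Os per host write (assumed values)
WRITE_PENALTY = {"0": 1, "1": 2, "1+0": 2, "5": 4, "6": 6}


def host_iops(raid_level: str, disks: int, iops_per_disk: float,
              read_fraction: float) -> float:
    """Estimate host-visible IOPS for a RAID group under a given read/write mix."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    raw_iops = disks * iops_per_disk
    penalty = WRITE_PENALTY[raid_level]
    # Each host read costs one back-end I/O; each host write costs `penalty`
    return raw_iops / (read_fraction + (1.0 - read_fraction) * penalty)


raid5 = host_iops("5", disks=8, iops_per_disk=150, read_fraction=0.7)
raid10 = host_iops("1+0", disks=8, iops_per_disk=150, read_fraction=0.7)
```

With eight hypothetical 150-IOPS disks and a 70% read mix, the model gives roughly 630 IOPS for RAID 5 and roughly 920 for RAID 1+0, which illustrates why write-heavy workloads often favor mirroring.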
Storage Layout Factors: Design Decisions
Storage Layout Factors (2)
What happens to a storage group if a disk drive fails?
  What is the performance impact?
  How long to rebuild?
  Data could be vulnerable during rebuild
  Is anyone notified of a failure?
Storage Path Factors (FC & iSCSI)
Path is usually a “mesh”
  Multiple paths may be meshed, but not physically connected
  Redundant paths on separate fabrics are common
  Multipath I/O (MPIO) software can load balance
Designed path capacity
  Oversubscription
  Fan-in, fan-out
  Inter-switch links (ISLs)
Intermediate devices
  Core switches (fan-in / fan-out ratio)
  Routing across disparate fabrics
MPIO load-balancing policies:
  Failover Only
  Round Robin
  Round Robin with Subset
  Least Queue Depth
  Weighted Paths
  Least Blocks
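Two of the path-selection policies listed above can be sketched in a few lines. This is an illustration of the selection logic only, not the behavior of any particular MPIO implementation; the path names and function names are hypothetical.

```python
from itertools import cycle
from typing import Dict, Iterator, List


def round_robin(paths: List[str]) -> Iterator[str]:
    """Yield paths in rotation, one per I/O request."""
    return cycle(paths)


def least_queue_depth(outstanding: Dict[str, int]) -> str:
    """Pick the path with the fewest outstanding I/O requests."""
    return min(outstanding, key=outstanding.get)


selector = round_robin(["path_A", "path_B"])
first_three = [next(selector) for _ in range(3)]
chosen = least_queue_depth({"path_A": 5, "path_B": 2})
```

A Least Blocks policy would follow the same pattern as `least_queue_depth`, keyed on outstanding blocks rather than outstanding requests.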
Storage Controller Factors
Mass-Storage Controllers Range from on-board to add-in
Some have battery backup ability in either case
Basic controllers report limited diagnostic information Advanced controllers have diagnostics available
Vendor supplied tools Capable of sending events to operating system through extended logging
Enterprise storage may have multiple controllers with shared cache
Fibre Channel or SAS HBAs
Host-Bus Adapter (HBA)
  8 Gb and 16 Gb available today
  SCSI command interface to OS
  Often synonymous with Fibre Channel SAN
  Offload packet assembly and disassembly
  Provides OS a view into the SAN (though most activity is abstracted by default)
  Vendor-provided diagnostics and performance tools
  No software capture tools
Multiple HBAs, or multiple-port HBAs, enable Multipath I/O (MPIO)
  Most operating systems have native support for MPIO
Ethernet Adapters
Ethernet Network Interface Card (NIC) 10 GbE
TCP/IP and Chimney Offloads Hardware parity, CRC, ECC
Converged Network Adapter (CNA) Combines functionality of HBA and NIC Fibre Channel over Ethernet (FCoE) CPU offloads for FCoE and iSCSI Can present NIC, FCoE, or iSCSI function to host
Teaming software for throughput and availability Software analyzers likely unable to capture all traffic
Latency
Rotational Disks
  Millisecond latency
  Sequential writing to rotational drives is the most efficient
  Sequential and/or “full-stripe” writes to RAID disks are most efficient
  Latency occurs as heads have to move position across the rotating platter
  Operating system logical address may differ from the physical location on the disk device
SSD
  Microsecond latency
  Small random writes are slowest (flash block)
  Flushing
  Firmware
    Keeps improving performance and availability
Queuing
The art of keeping the I/O pipeline populated, but not congested
Can happen at many levels
  Operating system can build up thousands of I/O requests
  Can build up at switch ports (buffer credits)
  Can build up at backend storage ports (inbound queue)
  Can build up in storage controllers (HBA, NIC, etc.)
I/O throttling via queue depth setting
Individual disk devices
  Native command queuing (NCQ) for SATA AHCI
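The balance between a populated and a congested pipeline can be reasoned about with Little's Law, which relates the average number of outstanding requests to throughput and latency. The tutorial does not cite Little's Law; it is introduced here as a standard queuing identity, and the function names are illustrative.

```python
def outstanding_ios(iops: float, latency_s: float) -> float:
    """Little's Law: average outstanding I/Os = arrival rate x time in system."""
    return iops * latency_s


def latency_at_depth(queue_depth: float, iops: float) -> float:
    """Average latency implied by a sustained queue depth at a given IOPS."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return queue_depth / iops


needed_depth = outstanding_ios(iops=5000, latency_s=0.004)   # about 20 in flight
implied_latency = latency_at_depth(queue_depth=32, iops=4000)  # 0.008 s
```

Read this way, a queue depth setting below the needed depth leaves the device underused, while a depth far above it mostly adds waiting time.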
“Short-Stroking” to reduce latency
Forcing the use of a smaller area of a rotating disk to reduce seek distance, and thus latency
Also a result of “areal density”
  Outer tracks hold more data per revolution than inner tracks
  Outer edge of disk may get 150 MB/s while inner tracks get 80 MB/s
Less latency means more IOPS
Penalty is under-utilized storage space
“Advanced Format” (AF) Technology
AF refers to physical disk sector size and/or block architecture
Previous limits
  Physical disk sector size: 512 bytes
  Master Boot Record (MBR) structure sizes
  Approximately 2 terabytes maximum disk size
New capabilities:
  Physical sector size: <currently> 4096 bytes (4 KB)
  512e is a 4 KB physical sector presented as 512-byte logical sectors
  More space for error checking (CRC)
  More storage space available in same or less physical space
No corresponding increase in performance capability
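The "approximately 2 terabytes" limit follows from the MBR partition table storing sector addresses in 32-bit fields. A short worked calculation, with illustrative names:

```python
MBR_LBA_BITS = 32  # MBR partition entries use 32-bit sector addresses


def max_mbr_bytes(logical_sector_bytes: int) -> int:
    """Largest capacity addressable through a 32-bit sector count."""
    return (2 ** MBR_LBA_BITS) * logical_sector_bytes


limit_512 = max_mbr_bytes(512)    # 2 TiB with 512-byte logical sectors
limit_4k = max_mbr_bytes(4096)    # 16 TiB if logical sectors were 4096 bytes
```

Because 512e drives still present 512-byte logical sectors, the 2 TiB MBR ceiling applies to them unchanged.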
Partition Alignment
Previously a problem, with manual steps to mitigate
Current operating systems align by default
Check partition starting sector to confirm
  Using a management interface (e.g., WMI)
  Look for a starting offset of 2048 blocks
Cannot easily change
Can automate during OS installation
Affects legacy and AF drive technology
  512e AF blocks can suffer from misalignment
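The alignment check described above reduces to a modulo test on the partition's starting byte offset. The sketch assumes 512-byte logical sectors and 4096-byte physical sectors (the 512e case); the function name is illustrative, and the starting sector would come from whatever management interface is in use.

```python
def is_aligned(start_sector: int, logical_sector_bytes: int = 512,
               physical_sector_bytes: int = 4096) -> bool:
    """True if the partition starts on a physical-sector boundary."""
    start_byte = start_sector * logical_sector_bytes
    return start_byte % physical_sector_bytes == 0


modern_default = is_aligned(2048)  # 2048 x 512 bytes = 1 MiB offset
legacy_default = is_aligned(63)    # 63 x 512 = 32256 bytes, not a multiple of 4096
```

The same test can be repeated with a RAID stripe-unit size in place of the physical sector size when checking alignment against the underlying array.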
Understanding the workload
Request size
Burstiness
“Hot” data
Concurrency
Inter-arrival time (time from the arrival of one request to the next)
Locality (matters more on rotational than SSD)
Few tools can faithfully reproduce a “live” workload
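Two of these characteristics, inter-arrival time and burstiness, can be derived from request timestamps in a trace. The sketch below uses the coefficient of variation of the inter-arrival gaps as a burstiness measure; that choice is an assumption for illustration, since the tutorial does not define a specific metric.

```python
from statistics import mean, pstdev
from typing import List


def inter_arrival_times(timestamps: List[float]) -> List[float]:
    """Gaps between consecutive request arrival times, in seconds."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def burstiness(gaps: List[float]) -> float:
    """Coefficient of variation of the gaps: 0 means perfectly regular arrivals."""
    return pstdev(gaps) / mean(gaps)


gaps = inter_arrival_times([0.0, 1.0, 3.0])  # hypothetical arrival times
steady = burstiness([1.0, 1.0, 1.0])
uneven = burstiness([1.0, 3.0])
```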
Performance Counters I/O Transfer Time (Latency)
Avg. disk sec/read Avg. disk sec/write
Queuing Avg. disk read queue length Avg. disk write queue length
Throughput Avg. disk bytes/read Avg. disk bytes/write
Network Output queue length
Transfers / sec (IOPS) Disk transfers/sec Disk reads/sec Disk writes/sec
%Idle Time Can be misleading
Split I/O Fragmentation Large Requests
OS CPU OS Memory
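The counters above are related: multiplying the transfer rate by the average transfer size gives data throughput. A minimal sketch with illustrative names; the inputs stand for sampled values of "Disk transfers/sec" and an average bytes-per-transfer counter, and decimal megabytes are assumed.

```python
def throughput_mb_per_s(transfers_per_sec: float,
                        avg_bytes_per_transfer: float) -> float:
    """Data rate in MB/s (10^6 bytes) from IOPS and average transfer size."""
    return transfers_per_sec * avg_bytes_per_transfer / 1_000_000


# Hypothetical sample: 2000 transfers/sec at 8 KiB each
sample_rate = throughput_mb_per_s(2000, 8192)
```

The same IOPS figure can therefore mean very different bandwidth depending on request size, which is why both counters are worth collecting together.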
Performance Analysis Tools
Sampling Tools Samples may be instantaneous or counters Good for long-term analysis
Real-time Tools Software tracing
Kernel Drivers
Hardware tracing Nothing abstracted Can be difficult to see everything in between initiator and target Transport security may be a factor
– IPSEC – Encryption
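The distinction between instantaneous samples and counters matters when post-processing sampled data: a cumulative counter has to be converted to a rate from two samples, whereas an instantaneous value is used as read. A minimal sketch, assuming each sample is a (timestamp in seconds, cumulative count) pair; the function name is illustrative.

```python
from typing import Tuple

Sample = Tuple[float, int]  # (timestamp in seconds, cumulative counter value)


def rate_per_sec(previous: Sample, current: Sample) -> float:
    """Average rate between two samples of a cumulative counter."""
    (t0, c0), (t1, c1) = previous, current
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (c1 - c0) / (t1 - t0)


# Hypothetical samples taken 10 seconds apart
avg_iops = rate_per_sec((0.0, 1000), (10.0, 3500))
```

Longer sampling intervals smooth out bursts, which suits long-term trending but can hide short spikes that real-time tracing would show.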
Vendor Provided Tools (1)
Vendor Provided Tools Provide information about devices that may not all be reported up to OS Provide adapter-wide performance statistics Allow for adapter test Settings changes for tuning
Fabric Software End-to-end visibility Sometimes bundled with devices Ability to easily view fabric devices, including stats Help identify “hot spots” May require <all> device clock sync for accuracy
Vendor Provided Tools (2)
Sample from an HBA vendor-provided tool
Vendor Provided Tools (3)
Some common FC error counters Link Failure
Link down, zoning change (isolation)
Sync Loss Can be caused by OS reboot
Signal Loss Can be caused by OS reboot
Invalid CRC Not normal
Primitive Sequence
Other Error Factors
iSCSI CRC Digest
TCP/IP CRC Checksum
Fibre Channel Primitive Sequence Buffer_0 ED_TOV RA_TOV
Virtualization Factors: Hosts Measure overall workload over time
Try to provision storage to meet workload Stripe-Unit size Number of disks per storage pool or LUN
If latency becomes apparent, monitor queue depth If queue depth is too low, disks may not be fully utilized If queue depth is too high, disks might be queuing, or I/O might be delayed in transit
Adapter (FC, iSCSI, CNA, etc.) Consult with vendor for recommendations
Queue depth – Determine if a change is needed based on performance – Too high and it could saturate the link or cause stalling in transit
Onboard: Add disks, add controllers and disks, spread load
Keep up with host software updates and firmware
Virtualization: General
Fixed size disks for intensive performance needs Over-provisioned disks; SSD or hybrid if possible Pass-Through Disks: Very little overhead, good perf Additions/Integrations
Emulated SCSI or FC controllers may yield better perf Add additional emulated controllers with fewer disks per
Monitor memory within VM Low free memory could lead to excessive paging or trimming
Patch guests as you would physicals: Proactively look for and apply performance and stability related OS and application updates
Performance Recommendations
Update software and drivers running in storage stack Anti-Virus Firewall Other Security File Screening HBA, CNA, NIC Multipath (MPIO) software Teaming software
Discover all software in storage stack Trace Tools
Remove any non-vital software in storage stack Utilize appropriate tier of storage per workload
Performance Recommendations (2) Tune cache on storage controllers
Based on observed workload over time Based on cache effectiveness counters (cache hits, etc.)
Look for hot spots Can be hard to find Visual trace tools may help Symptom: Optimal storage performing poorly for no other reason
Be proactive with alerting SMI-S SNMP
Start with a baseline, periodic snapshot
Runbooks
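Proactive alerting against a baseline can be as simple as flagging a metric that drifts beyond a tolerance. The sketch below is one possible check; the 25% tolerance and the function name are assumptions, and a real deployment would feed it from whatever monitoring source (SMI-S, SNMP, performance counters) is in place.

```python
def exceeds_baseline(baseline: float, current: float,
                     tolerance: float = 0.25) -> bool:
    """True if the current value is more than `tolerance` above the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return current > baseline * (1.0 + tolerance)


# Hypothetical read latency: 8 ms baseline, 11 ms and 9 ms observed
alert = exceeds_baseline(0.008, 0.011)
no_alert = exceeds_baseline(0.008, 0.009)
```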
Performance Recommendations (3)
Optimize fan-in and/or fan-out ratios
  Avoid congestion points
  Monitor fabric for Buffer_0 and other errors (set alerts; automate as much as possible)
Follow best practices for iSCSI
  VLAN or dedicated hardware
  Limit protocols in use
  Limit or remove sharing
  Optimize hardware per vendor recommendations
Avoid unplanned changes, and track them in detail if made
  Snapshot before and after if possible, and keep logs
Chart all storage-related tasks and look for overlap
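Fan-in and oversubscription are simple ratios, and computing them explicitly helps when looking for congestion points. The port counts and speeds below are hypothetical, and acceptable ratios depend on the workload and vendor guidance rather than on any fixed number.

```python
def oversubscription_ratio(edge_ports: int, edge_gbps: float,
                           isl_count: int, isl_gbps: float) -> float:
    """Aggregate edge bandwidth divided by aggregate ISL bandwidth."""
    return (edge_ports * edge_gbps) / (isl_count * isl_gbps)


def fan_in_ratio(host_ports: int, storage_ports: int) -> float:
    """Number of host ports sharing each storage port."""
    return host_ports / storage_ports


# Hypothetical fabric: 24 hosts at 8 Gb behind four 16 Gb ISLs
isl_ratio = oversubscription_ratio(24, 8, 4, 16)
# Hypothetical zoning: 12 host ports mapped to 2 storage ports
fan_in = fan_in_ratio(12, 2)
```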
Performance Recommendations (4) Keep historical data about workload
Take traces periodically (automate if possible) Provides for trending and lifecycle planning
Use monitoring software and keep data for a year or two Have data readily available for engineering and vendor staff
Plan the workload as much as possible Keep charts, graphs, spreadsheets, databases
Exercise new storage layouts before production Ask vendors for help if needed with load simulation tools Also ask for help if needed with performance tools Simulate failure(s) in test environment Familiarize yourself with support model
Can analysis services be made available (with analyzer)?
Q&A / Feedback
Many thanks to the following individuals for their contributions to this tutorial.
- SNIA Education Committee
Chris Lionetti, Flavio Muratore Bruce Worthington, Joseph White, Juniper
Send any questions or comments on this presentation to SNIA: [email protected]