
IBM® Systems

December, 2016

IBM® Spectrum Archive™ Enterprise Edition V1.2.2 Performance White Paper

Takeshi Ishimoto, Spectrum Archive Development & Architect, IBM Tokyo
Carla Corral, Spectrum Archive Performance, IBM Guadalajara
Pedro Ramos, Spectrum Archive Performance, IBM Guadalajara
Khanh V. Ngo, Spectrum Archive Development, IBM Tucson
Osamu Matsumiya, Spectrum Archive Development, IBM Tokyo


Contents

PREFACE
1. IBM SPECTRUM ARCHIVE
1.1. PRODUCT OVERVIEW
1.2. REFERENCE ARCHITECTURE FOR SCALE-OUT
2. TEST METHODOLOGY
2.1. HARDWARE SETUP AND RECOMMENDATIONS
2.1.1. PC SERVER
2.1.2. TAPE HARDWARE
2.1.3. DISK SUBSYSTEM AND IBM SPECTRUM SCALE SETTING
2.1.4. SAN ZONING CONSIDERATIONS
2.1.5. SOFTWARE VERSIONS
2.2. TEST PROCEDURES
3. MIGRATION PERFORMANCE RESULTS
3.1. PERFORMANCE RESULT WITH TS1150 TAPE DRIVE
3.2. PERFORMANCE RESULT WITH LTO 7 TAPE DRIVE
3.3. PERFORMANCE COMPARISON BETWEEN TS1150 AND LTO 7 DRIVES
3.4. PERFORMANCE SCALABILITY BY NUMBER OF TAPE DRIVES
4. CONCLUSIONS
APPENDIX - SERVER AND DISK STORAGE TUNING
ACKNOWLEDGMENTS
REFERENCES


Preface

This white paper describes the I/O performance characteristics of IBM Spectrum Archive™ Enterprise Edition Version 1.2.2 software (hereafter, IBM Spectrum Archive EE), based on IBM's in-house testing with the IBM TS1150 tape drive and the IBM LTO Ultrium 7 (hereafter, LTO 7) tape drive. It summarizes measurements of the effective data rate under different workload conditions to characterize the software's horizontal scalability as servers and tape drives are added. Specifically, the tests measured the throughput of file migration from a disk-based file system to tape storage for different file sizes and several hardware configurations. The paper also provides recommendations for meeting a given data rate requirement when installing a new system or upgrading an existing one. Chapter 1 gives a high-level overview of the software functions and how they enable a scale-out system. Chapter 2 describes the test environment and test procedures, Chapter 3 presents the test results from different perspectives, and Chapter 4 concludes with a summary of the measurements and best practices.

DISCLAIMER

The performance measurements presented in this document apply to the specific hardware configuration used for the tests. Results can vary depending on the hardware (servers, storage system, SAN) and its configuration.

The following units of measurement are used in this white paper.

Binary Units                        Decimal Units
Metric     Value     Symbol         Metric     Value     Symbol
Kibibyte   1024      KiB            Kilobyte   1000      KB
Mebibyte   1024^2    MiB            Megabyte   1000^2    MB
Gibibyte   1024^3    GiB            Gigabyte   1000^3    GB
Tebibyte   1024^4    TiB            Terabyte   1000^4    TB


1. IBM Spectrum Archive

1.1. Product Overview

Spectrum Archive EE provides seamless integration of a tape storage tier with the highly available and scalable file system provided by IBM Spectrum Scale™. It performs policy-based migration of files from disk storage to tape to free up disk space, and it allows the user to recall data back from tape on demand or by explicit prefetching. With this full, transparent integration of disk and tape, the data owner can run any application designed for disk while keeping cold data on the low-cost tape storage tier. Spectrum Archive EE runs on one or more Linux servers and makes that cluster of servers act as the gateway to tape storage. As shown in Figure 1.1, each server is configured with a few dedicated tape drives, and Spectrum Archive EE automatically distributes the I/O workload across the servers so that the aggregated performance scales out as more servers are added.

Figure 1.1: Spectrum Archive EE System
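Because the tape tier is integrated transparently, a migrated file is recalled simply by accessing it through the file system; no special application call is required. A trivial illustration (the path shown is hypothetical, not taken from the test system):

>cat /ibm/gpfs/archive/sample.dat > /dev/null

Reading the migrated file causes IBM Spectrum Archive EE to recall its data from tape back into IBM Spectrum Scale before the read completes.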

IBM Spectrum Archive EE provides the following benefits (IBM, 2016):

- A low-cost storage tier in an IBM Spectrum Scale environment.

- An active archive or big data repository for long-term storage of data that requires file system access to that content.

- File-based storage in the Linear Tape File System™ (LTFS) tape format that is open, self-describing, portable, and interchangeable across platforms.

- Lower capital expenditure and operational expenditure costs by using cost-effective and energy-efficient tape media without dependencies on external server hardware or software.

- Support for the highly scalable, automated TS4500, TS3500, and TS3310 tape libraries.

- Retention of data on tape media for long-term preservation (10+ years).

- Portability of large amounts of data by bulk transfer of tape cartridges between sites for disaster recovery and the initial synchronization of two IBM Spectrum Scale sites by using open-format, portable, self-describing tapes.

- Migration of data to newer tape or newer technology that is managed by IBM Spectrum Scale.


- Ease of management for operational and active archive storage.

- Expanded archive capacity simply by adding and provisioning media, without impacting the availability of data already in the pool.

With Spectrum Archive EE, you can perform the following management tasks on your systems (IBM, 2016):

- Create and define tape cartridge pools for file migrations.

- Migrate files in the IBM Spectrum Scale namespace to the IBM Spectrum Archive tape tier.

- Recall files that were migrated to the IBM Spectrum Archive tape tier back into IBM Spectrum Scale.

- Reconcile file inconsistencies between files in IBM Spectrum Scale and their equivalents in IBM Spectrum Archive.

- Reclaim tape space that is occupied by non-referenced files and non-referenced content that is present on the physical tapes.

- Export tape cartridges to remove them from the IBM Spectrum Archive EE system.

- Import tape cartridges to add them to the IBM Spectrum Archive EE system.

- Add tape cartridges to the IBM Spectrum Archive EE system to expand the tape cartridge pool without disrupting your system.

- Obtain inventory, job, and scan status of the IBM Spectrum Archive EE solution.

1.2. Reference Architecture for Scale-out

The reference architecture for Spectrum Archive EE provides a template of server hardware and software configurations; it is a blueprint to help IT architects plan and configure servers for use with IBM Spectrum Archive EE. It also indicates a future upgrade path for adding I/O bandwidth. In this white paper, three configuration classes are presented, each with model variations defined by the number of attached tape drives, as in the diagram below.

Figure 1.2: Configuration Options of Spectrum Archive EE for Performance Scale-out


- The Small Configuration is an entry-level configuration with a single server and two, three, or four tape drives.

- The Medium Configuration is a dual-node configuration with three or four tape drives per node.

- The Large Configurations are multi-node configurations (four server nodes) with four or five tape drives per node.

IMPORTANT: This white paper only includes measurements for the small and medium configurations. Measurements for the large configurations will be added in a future revision.

The configuration models are identified in this white paper by the naming convention "xNyDzT", where "x" is the total number of servers, "y" is the number of tape drives attached to each server, and "z" is the total number of tape drives (z = x * y).

Configuration Class   Configuration Name (xNyDzT)   Number of Nodes (x)   Number of Drives per Node (y)   Number of Drives in Total (z)
Small                 1N2D2T                        1                     2                               2
Small                 1N3D3T                        1                     3                               3
Small                 1N4D4T                        1                     4                               4
Medium                2N3D6T                        2                     3                               6
Medium                2N4D8T                        2                     4                               8
Large                 4N4D16T                       4                     4                               16
Large                 4N5D20T                       4                     5                               20

Table 1.1: Blueprint Configurations for IBM Spectrum Archive EE


2. Test Methodology

2.1. Hardware Setup and Recommendations

All performance results in this document were obtained using a system with:
- Two single-socket x86-processor servers, running IBM Spectrum Scale and IBM Spectrum Archive EE
- Eight tape drives in the tape library, and at least the same number of tape cartridges
- Shared SAN disk storage for IBM Spectrum Scale
- Fibre Channel adapters and SAN switches for the connection to the external SAN disk storage and tape drives

The exact models and types of the selected hardware components are shown in the diagram below (Figure 2.1). Besides the number of servers and tape drives, several other factors can affect the final performance: server performance, tape drive type, disk storage hardware and IBM Spectrum Scale setup, and interconnect speed. It is beyond the scope of this white paper to present a complete picture of the relative performance characteristics of all possible hardware and software configurations. However, the Appendix of this document provides some tuning tips based on the characteristics of the hardware used for these measurements.

Figure 2.1: IBM Spectrum Archive EE hardware components

2.1.1. PC server

It is recommended to use a recent PC server with a single CPU socket and three PCIe slots.


The performance tests in this white paper use the IBM System x3850 X5 server, the fifth generation of the Enterprise X-Architecture, which enables optimal performance for databases, enterprise applications, and virtualized environments. To improve the performance of IBM Spectrum Archive EE under the Non-Uniform Memory Access (NUMA) architecture, several tuning steps were applied, as described in the Appendix.

2.1.2. Tape Hardware

IBM Spectrum Archive can utilize the following current tape storage technologies for maximum cost efficiency and performance.

IBM TS1150 Enterprise Tape Drive
- Native data rate performance of up to 360 MB/s (non-compressible data)
- With a JD tape cartridge, it can store 10 TB (non-compressible data) or 30 TB (with 3:1 data compression)

IBM LTO 7 Tape Drive
- Native data rate performance of up to 300 MB/s (non-compressible data)
- With an LTO 7 tape cartridge, it can store 6 TB (non-compressible data) or 15 TB (with 2.5:1 data compression)

The selection between the two tape technologies should be based on many factors, such as reliability, cost, and data exchangeability, but from the performance perspective IBM TS1150 provides the better result. The performance test was conducted with two logical libraries in an IBM TS4500 tape library, one for the TS1150 tape drives and the other for the LTO 7 tape drives, because a single logical library cannot mix the two drive types. In the later sections, this white paper provides the results of the same test cases on both TS1150 and LTO 7 for comparison.

2.1.3. Disk Subsystem and IBM Spectrum Scale setting

General performance tuning tips can be applied to the selection of the disk storage and its configuration. The Appendix describes how the IBM Storwize V7000 in the test system was configured. The following IBM Spectrum Scale mmchconfig parameters were used for the performance test configuration, on both the single-node and multi-node setups.

>mmchconfig nsdBufSpace=50,nsdMaxWorkerThreads=1024,nsdMinWorkerThreads=1024,nsdMultiQueue=64,nsdMultiQueueType=1,nsdSmallThreadRatio=1,nsdThreadsPerQueue=48,numaMemoryInterleave=yes,maxStatCache=0,ignorePrefetchLUNCount=yes,logPingPongSector=no,scatterBufferSize=256k -N all
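After the command completes, the applied values can be checked with mmlsconfig, for example (a minimal sketch; the grep filter is illustrative):

>mmlsconfig | grep -E 'nsd|numaMemoryInterleave|maxStatCache|scatterBufferSize'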

2.1.4. SAN Zoning Considerations

The SAN is primarily responsible for managing data traffic between the servers and the storage devices (tape and disk). Zoning plays a key role in improving performance by avoiding contention and congestion.


As shown in the diagram in the Appendix, it is recommended to:

- Isolate the SAN zones for disk and tape

- Dedicate each HBA port to a small number of tape drives to avoid overloading the ports

The test showed different results with HBAs from different manufacturers; the final test was conducted using an 8 Gbps FC adapter from QLogic (note that the maximum link speed of the tape drives is 8 Gbps).

2.1.5. Software Versions

This test was conducted with the following code levels.

Software                           Version
IBM Spectrum Archive EE            1.2.2.0
IBM Spectrum Scale                 4.2.1
IBM Tape Device Driver             lin_tape-3.0.10

OS                                 Version
Linux Version                      RHEL 7.2
Linux Kernel                       3.10.0-327.el7

Firmware                           Level
IBM TS4500 Library Code            1.3.0.4
IBM TS1150 Drive Code              D3I4_68E
IBM LTO 7 Full Height Drive Code   LTO7_G9Q0
IBM Storwize V7000 Code            7.7.1.2


2.2. Test Procedures

The performance test in this white paper focuses on measuring the data rate (MB/s) of file migration from disk to tape for a variety of file sizes, and evaluates how the performance changes with different numbers of servers and tape drives. The test measures the maximum capability of IBM Spectrum Archive EE with the least amount of overhead.

The migration test was conducted by running the following steps (a sketch of this measurement loop is shown after the test parameter summary below):

1. Create files of uniform size on disk.
2. Run the mmapplypolicy command manually to find the files matching the policy criteria and to pass the list of candidates to the Spectrum Archive command ("ltfsee MIGRATE"). mmapplypolicy invokes multiple instances of the ltfsee MIGRATE command, depending on the length of the file list and the optional arguments of mmapplypolicy. Once all migrations complete, mmapplypolicy returns to the command prompt.
3. Measure the elapsed time for step 2.
4. Repeat steps 1 to 3 three times.
5. Divide the amount of data transferred by the best elapsed time to obtain the aggregated performance.

The test uses the following parameters.

Migration Source

- File size: one size is selected from 5 MiB, 10 MiB, 100 MiB, 1 GiB, and 10 GiB, and all files created for that run have the same size.

- Each file contains non-compressible random data generated from /dev/random.

- Amount of data prepared on disk: for each test run, step 1 creates files totaling 100 GiB per drive. For example, with 10 MiB files, a test with 4 drives creates 40960 files (= 4 * 100 GiB / 10 MiB = 4 * 10240) at the beginning.

Migration Target

- Number of file replicas: 1 (one tape pool is specified in the policy)

- The tapes are empty at the first run

- Target tapes are already loaded in the tape drives (there is no tape library robot movement during the test)

Command and Policy Options used

- "mmapplypolicy filesystem -P policy_file -B 10000 -m 2*T", where T is the total number of tape drives in the system.

  The -B parameter specifies how many files are passed for each invocation of the EXEC script. If the number of files exceeds the value specified by -B, mmapplypolicy starts the external program multiple times.

  The -m parameter specifies the number of threads that are created and dispatched within each mmapplypolicy process during the policy execution phase.


- The policy file contains "SIZE 10485760" after the OPTS statement.

  The SIZE parameter limits the total amount of data, in KB, in each list of files passed to the EXEC script; 10485760 KB is equivalent to 10 GiB.

<< Portion of Policy File >>

RULE EXTERNAL POOL 'ltfs'
EXEC '/opt/ibm/ltfsee/bin/ltfsee' OPTS '-p perftest@library1'
SIZE 10485760
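For reference, the sketch below shows what a complete policy file might look like. The EXTERNAL POOL rule is the one used in the test; the MIGRATE rule is illustrative only and assumes the source files reside in the default 'system' storage pool.

<< Example Policy File (sketch) >>

/* External pool served by IBM Spectrum Archive EE; 'perftest' is the tape cartridge pool in 'library1' */
RULE EXTERNAL POOL 'ltfs'
EXEC '/opt/ibm/ltfsee/bin/ltfsee' OPTS '-p perftest@library1'
SIZE 10485760

/* Illustrative rule (not from the test policy): migrate every file from the 'system' pool to tape */
RULE 'MigrateToTape' MIGRATE FROM POOL 'system' TO POOL 'ltfs' WHERE FILE_SIZE > 0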

See the IBM Knowledge Center for IBM Spectrum Archive EE and IBM Spectrum Scale for more information about mmapplypolicy parameters for performance optimization.

Test Parameters

File Size                   5 MiB    10 MiB    100 MiB    1 GiB    10 GiB
Number of files per drive   20480    10240     1024       100      10
-B parameter                10000 (all file sizes)
-m parameter                2 * T, where T is the total number of tape drives in the system
SIZE parameter              10485760 (all file sizes)
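The measurement loop in steps 1 to 5 can be summarized by the following sketch. It is a simplified illustration rather than the exact test harness; the mount point /ibm/gpfs, the working directory name, and the use of dd with /dev/urandom (the test itself used data generated from /dev/random) are assumptions.

<< Example Measurement Loop (sketch) >>

#!/bin/bash
# Simplified sketch of one migration test run; paths and names are illustrative.
FS=/ibm/gpfs                                   # IBM Spectrum Scale file system (assumed)
DRIVES=4                                       # total number of tape drives (T)
FILESIZE_MIB=10                                # file size under test
COUNT=$(( DRIVES * 102400 / FILESIZE_MIB ))    # 100 GiB of data per drive

# Step 1: create files of uniform size filled with non-compressible random data
mkdir -p $FS/perftest
for i in $(seq 1 $COUNT); do
    dd if=/dev/urandom of=$FS/perftest/file_$i bs=1M count=$FILESIZE_MIB status=none
done

# Steps 2 and 3: run the policy and measure the elapsed time of the migration
START=$(date +%s)
mmapplypolicy $FS -P policy_file -B 10000 -m $(( 2 * DRIVES ))
END=$(date +%s)

# Step 5: aggregated rate = total data migrated / elapsed time (a single run is shown here)
TOTAL_MIB=$(( COUNT * FILESIZE_MIB ))
echo "Aggregated migration rate: $(( TOTAL_MIB / (END - START) )) MiB/s"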


3. Migration Performance Results

3.1. Performance result with TS1150 tape drive

Table 3.1 shows the aggregated transfer rate of file migration with IBM TS1150 tape drives and IBM 3592 JD tape cartridges. As shown in the upper right corner, IBM Spectrum Archive EE migrates 10 GiB files at 2.3 GB/s with 8 tape drives. Given that each tape drive is capable of transferring the non-compressible data used in this test at 360 MB/s, this result is equivalent to 80% of the drives' native capability.

Table 3.1: Aggregated Migration Rate - TS1150 Tape Drive (in MB/s)

The graph below plots the test results and shows the projected performance curve for each hardware configuration. The X axis is the file size on a logarithmic scale, and the Y axis is the transfer rate in MB/s.

Figure 3.1: Migration performance scalability for TS1150



3.2. Performance result with LTO 7 tape drive

Table 3.2 shows the aggregated transfer rate of file migration with IBM LTO 7 tape drives and LTO 7 tape cartridges. As shown in the upper right corner, IBM Spectrum Archive EE migrates 10 GiB files at 1.9 GB/s with 8 tape drives. Given that each LTO 7 tape drive is capable of transferring the non-compressible data used in this test at 300 MB/s, this result is equivalent to 80% of the drives' native capability.

Table 3.2: Aggregated Migration Rate – LTO 7 Tape Drive (in MB/s)

The graph below plots the test results and shows the projected performance curve for each hardware configuration. The X axis is the file size on a logarithmic scale, and the Y axis is the transfer rate in MB/s.

Figure 3.2: Migration performance scalability for LTO 7


3.3. Performance comparison between TS1150 and LTO 7 drives

The graph below compares the results obtained with the IBM TS1150 drives (Figure 3.1) and with the LTO 7 drives (Figure 3.2). The TS1150 tape drive performs better than the LTO 7 tape drive across the entire tested range, although the difference is minor for smaller files.

Figure 3.3: Migration performance scalability for TS1150 and LTO 7 (Comparison)


3.4. Performance scalability by number of tape drives

Table 3.3 presents exactly the same test results as Table 3.1, but expressed as performance per drive rather than aggregated performance. The table shows that the expected per-drive performance decreases slightly as more drives are added to the system.

Table 3.3: Performance per drive TS1150 (in MB/s)

Figure 3.4 illustrates the performance scalability for each file size; the lines show how the performance improves as drives are added for a given file size. In this graph, the scaling factor index is defined as "2" for the result of the 2-drive configuration, and the other values are calculated as the relative performance index. With perfect linear scalability, the index for the 8-drive configuration would be "8"; the actual results range from 6.2 to 7.4 for the TS1150 tape drive.
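Restated as a formula (not taken verbatim from the paper), the scaling factor index for an N-drive configuration is:

    scaling index(N) = 2 * ( R_N / R_2 )

where R_N is the aggregated migration rate (MB/s) measured with N tape drives for a given file size.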

Figure 3.4: Migration performance Scalability for TS1150 configuration


Table 3.4 gives the per-drive performance for the LTO 7 configurations, analogous to Table 3.3 above, and both tables show a similar trend.

Table 3.4: Performance per drive LTO 7 (in MB/s)

Figure 3.5 is the equivalent of Figure 3.4 for the LTO 7 tape drives, and it shows a similar trend.

Figure 3.5: Migration performance Scalability for LTO 7 configuration


4. Conclusions

IBM Spectrum Archive EE lowers the cost of the storage infrastructure by seamlessly integrating a large-capacity, economical tape tier with IBM Spectrum Scale under a single namespace. IBM Spectrum Archive EE can add tape drives or nodes and provision them in the tape tier, enabling easy expansion to meet requirements for storage capacity, I/O bandwidth, and data availability with minimal downtime and without impacting the availability of data.

The test results in this white paper demonstrate that adding tape drives to single-node and multi-node configurations produces a higher sustained data rate, building on the drives' high native data rates. IBM Spectrum Archive EE shows optimal performance for large files (10 GiB) in all configurations. The measurements also show that increasing the number of nodes and the number of drives per node improves performance. Performance for small files also improves as drives and nodes are added, but the improvement remains small even as more drives are added.

It should also be noted that the performance results depend on the hardware configuration used; they could be improved by using a faster disk storage solution (SSD or flash), which may be tested in a future revision of this white paper.

This white paper reflects the benefit of IBM Spectrum Archive EE in terms of performance and the time required to protect data by migrating it from any Spectrum Scale tier to a Spectrum Archive tape tier. The results also show that IBM Spectrum Scale can provide high-throughput, low-latency access when it is optimized for IBM Spectrum Archive EE, which reads the data of the files being migrated in a streaming manner and updates the file system metadata for stubbing.


Appendix - Server and Disk Storage Tuning

Server Optimization for NUMA architecture

“This architecture allows control over the time to access memory, which varies with the location of the data to be accessed. If data resides in local memory, access is fast. If data resides in remote memory, access is slower. The advantage of the NUMA architecture as a hierarchical shared memory scheme is its potential to improve average case access time through the introduction of fast, local memory. In the NUMA shared memory architecture, each processor has its own local memory module that it can access directly with a distinctive performance advantage. At the same time, it can also access any memory module belonging to another processor using a shared bus (or some other type of interconnect), as seen in the diagram below:

Figure A.1: NUMA Architecture

Thread migration from one core to another poses a problem for the NUMA shared memory architecture because of the way it disassociates a thread from its local memory allocations. That is, a thread may allocate memory on node 1 at startup as it runs on a core within the node 1 package. But when the thread is later migrated to a core on node 2, the data stored earlier becomes remote and memory access time significantly increases.” (Intel, 2011)

The numaMemoryInterleave parameter of Spectrum Scale is used on NUMA-based systems to improve file system performance. It was enabled for this performance testing because the servers use a NUMA configuration.
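Before applying NUMA-related tuning, the node topology of the server can be inspected with the numactl utility (a generic example, assuming the numactl package is installed; it is not part of the documented test procedure):

>numactl --hardware

This lists the NUMA nodes, the CPUs and memory attached to each node, and the inter-node distances, which helps confirm whether settings such as numaMemoryInterleave are relevant for the server.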

Disk Storage Optimization

When designing a GPFS file system on Storwize V7000 storage for optimum performance, there are two basic operating modes that match different usage types:

- General I/O workloads: By default, Storwize creates LUNs (vdisks) over multiple arrays to utilize the available storage.

- Optimal sequential performance: The Storwize V7000 uses a Redundant Array of Independent Disks (RAID), a method of configuring member drives to create highly available, high-performance systems. For sequential I/O with GPFS, the Storwize V7000 uses RAID 5 or RAID 6 arrays. There are two types of RAID configuration: a distributed array and a non-distributed array configuration. (IBM, IBM Storwize V7000 with GPFS, 2015)

RAID Array Type

Performance testing for this paper uses distributed RAID 5 arrays because the performance of the pool is more uniform: all of the available drives are used for every volume extent, and the array can tolerate the failure of one member drive. These arrays stripe data over the member drives with one parity strip on every stripe. “Distributed RAID arrays can support 4 - 128 drives and they also contain rebuild areas that are used to maintain redundancy after a drive fails. As a result, the distributed configuration dramatically reduces rebuild times and decreases the exposure volumes have to the extra load of recovering redundancy. Distributed arrays remove the need for separate drives that are idle until a failure occurs. Instead of allocating one or more drives as spares, the spare capacity is distributed over specific rebuild areas across all the member drives. After the failed drive is replaced, data is copied back to the drive from the distributed spare capacity. Unlike "hot spare" drives, read/write requests are processed on other parts of the drive that are not being used as rebuild areas. The number of rebuild areas is based on the width of the array. The size of the rebuild area determines how many times the distributed array can recover failed drives without risking becoming degraded.” (IBM, Distributed array properties, n.d.)

A distributed array (DRAID) is used for the IBM Spectrum Archive EE configuration; it allows a RAID 5 or RAID 6 array to be distributed over a larger set of drives, and the spare capacity can also service reads and writes for host I/O.

RAID Strip Size and File System Block Size

The drive assignment configuration was tuned using the total number of drives (48 drives) in a V7000 distributed array (the array width). A strip (redundancy unit) is the smallest amount of data that can be addressed, and it is best to use a GPFS block size that is a multiple of the V7000 stripe size. The V7000 has two strip size options, 128 KiB and 256 KiB; 256 KiB was used for these performance tests. To optimize the Storwize V7000 for sequential I/O with GPFS, a GPFS file system with a 2 MiB block size (2048 KiB) was created. By default the V7000 uses 256 KiB RAID strips; for a large sequential workload, the strip size should be matched to the host I/O size. For this performance purpose it is recommended to create a 10-disk RAID 5 array (8+P+Q) with the default strip size of 256 KiB. That gives an 8 * 256 KiB = 2048 KiB stripe size, which matches the file system block size. The stripe width for RAID 5 is 10 (number of disks) - 1 = 9.
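To illustrate the block size matching described above, a GPFS file system with a 2 MiB block size could be created as follows (a minimal sketch; the device name gpfs1, the stanza file nsd.stanza, and the mount point /ibm/gpfs are illustrative, not values taken from the test system):

>mmcrnsd -F nsd.stanza
>mmcrfs gpfs1 -F nsd.stanza -B 2M -T /ibm/gpfs

The 2 MiB file system block size then spans exactly one full 8 * 256 KiB RAID stripe, so large sequential writes map cleanly onto the array.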

SAN Connection

The Storwize V7000 nodes must always be connected to SAN switches only. Multiple connections are permitted from redundant storage systems to improve data bandwidth performance. Use an additional zone (Zone1 in the figure below) to dedicate the traffic between the FC ports of all nodes and all Storwize V7000 ports for best performance and availability.


Figure A.2: IBM Spectrum Archive EE configuration

Acknowledgments

The authors would like to thank Joaquin Quiroz, Vernon Miller, and Bruce for their support, reviews, comments, and feedback.


References

IBM. (2016, December). IBM Spectrum Archive Enterprise Edition V1.2.2. Retrieved from IBM Knowledge Center: http://www.ibm.com/support/knowledgecenter/ST9MBR_1.2.2/ltfs_ee_ichome.html

IBM. (2016, December). IBM Spectrum Archive Enterprise Edition V1.2.2: Installation and Configuration Guide. Retrieved from IBM Redbooks: http://www.redbooks.ibm.com/redpieces/abstracts/sg248333.html?Open

IBM. (2016). IBM TS4500 - Supported tape cartridges. Retrieved from IBM Knowledge Center: http://www.ibm.com/support/knowledgecenter/en/STQRQ9/com.ibm.storage.ts4500.doc/ts4500_ipg_cartridges_supported.html

IBM. (2016, October). IBM Storwize V7000 - Distributed array properties. Retrieved from IBM Knowledge Center: http://www.ibm.com/support/knowledgecenter/en/ST3FR7_7.7.1/com.ibm.storwize.v7000.771.doc/svc_distributedRAID.html

IBM. (2015, November 30). IBM Storwize V7000 with GPFS. Retrieved from IBM developerWorks: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/IBM%20Storwise%20V7000%20with%20GPFS

IBM. (2013, October). IBM System x3850 X5 and x3950 X5 - Types 7145, 7146, 7143, and 7191: Installation and User's Guide. Retrieved from http://publib.boulder.ibm.com/infocenter/systemx/documentation/topic/com.ibm.sysx.7145.doc/7145_iug_pdf.pdf

IBM. (n.d.). System x Documentation - Memory Modules. Retrieved from IBM Information Center: http://publib.boulder.ibm.com/infocenter/systemx/documentation/index.jsp?topic=/com.ibm.sysx.7145.doc/bb1pw_r_memorymodules.html

Intel. (2011, November 2). Optimizing Applications for NUMA. Retrieved from Intel Developer Zone: https://software.intel.com/en-us/articles/optimizing-applications-for-numa


© International Business Machines Corporation 2016
Printed in the United States of America, December 2016
All Rights Reserved

IBM, the IBM logo, Linear Tape File System, Spectrum Archive, Spectrum Scale, and System Storage are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. Linear Tape-Open, LTO, the LTO logo, Ultrium and the Ultrium logo are registered trademarks of Hewlett Packard Enterprise, IBM and Quantum in the US and other countries. Other company, product and service names may be trademarks or service marks of others.

Product data has been reviewed for accuracy as of the date of initial publication. Product data is subject to change without notice. This information could include technical inaccuracies and/or typographical errors. IBM may make improvements and/or changes in the product(s) and/or program(s) at any time without notice. References in this document to IBM products, programs, or services do not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business. Any reference to an IBM Program Product in this document is not intended to state or imply that only that program product may be used. Any functionally equivalent program that does not infringe IBM's intellectual property rights may be used instead. It is the user's responsibility to evaluate and verify the operation of any non-IBM product, program or service.

The performance data contained herein was obtained in a controlled, isolated environment. Actual results that may be obtained in other operating environments may vary significantly. While IBM has reviewed each item for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere.

THE INFORMATION PROVIDED IN THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IBM EXPRESSLY DISCLAIMS ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT. IBM shall have no responsibility to update this information. IBM products are warranted according to the terms and conditions of the agreements (e.g., IBM Customer Agreement, Statement of Limited Warranty, International Program License Agreement, etc.) under which they are provided. IBM is not responsible for the performance or interoperability of any non-IBM products discussed herein.

The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents or copyrights. Inquiries regarding patent or copyright licenses should be made, in writing, to:

IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.