IBM Confidential
John Sing
IBM Systems & Technology Group

A New Era in Technical Computing: Powerful. Comprehensive. Intuitive

IBM System x GPFS Storage Server
A Revolution in HPC Intelligent Cluster Storage
© 2012 IBM Corporation
Technical Computing: Powerful. Comprehensive. Intuitive
IBM innovation in server and storage convergence brings Big Data together with High Performance Computing (HPC).

Improved Data Availability: Declustered RAID with GPFS reduces overhead and speeds data rebuilds by roughly 4-6x

Data Integrity, Reliability & Flexibility: End-to-end checksum, 2- and 3-fault tolerance, application-optimized RAID

Superior Performance: More powerful x86 cores replace special-purpose disk controller chips

Higher Efficiency: Server and storage co-packaging improves density and efficiency

Better Value: Software-based controller reduces hardware overhead and cost

High-Speed Interconnect: Clustering and storage traffic, including failover (InfiniBand, 10 GbE)

IBM System x GPFS Storage Server
SONAS and GPFS Storage Server Positioning

Segment
  Target Market Segment
    SONAS: General purpose, scale-out file and object storage
    GSS: Technical/high performance computing where GPFS is installed
  Applicable Industries
    SONAS: Universities, government, healthcare/life sciences, oil/gas
    GSS: High Performance Compute (aka grid) workloads within industries that require them

Technology Placement
  Workloads
    SONAS: High-performance IOPs, sequential streaming, general file serving for Windows and *NIX
    GSS: Focused on high-bandwidth sequential streaming for GPFS grids
  Key Value Proposition
    SONAS: GUI-driven administration, clustered Windows CIFS support, TSM and NDMP integration, flexible storage array options (Storwize V7000, XIV, DCS3700), encapsulates GPFS controls
    GSS: Performance oriented, grows to extremely large capacity, full administrative GPFS control

Geeky Stuff
  Protocol Support
    SONAS: NFS, CIFS, HTTPS, FTP
    GSS: GPFS NSD, NFS
  Connectivity
    SONAS: 1 GbE, 10 GbE
    GSS: InfiniBand, 10 GbE, 40 GbE
  Capacity Range
    SONAS: 100 TB-15 PB; flexible options for V7000, XIV, DCS3700 backend storage
    GSS: 464 TB to dozens of PBs; fixed building block of 60-drive increments

Sales GTM
  Who owns it? How is it sold? Who gets credit? Where can it be sold?
    SONAS: System Storage, AAS, storage sellers; all markets, no restrictions
    GSS: System x, high-volume resellers, HPC/technical computing; managed availability in 2013
GPFS Storage Server: Scalable Building Block Approach to Storage

"Twin-Tailed" JBOD disk enclosures with x3650 M4 servers

Complete Storage Solution: data servers, disk (SSD and NL-SAS), software, InfiniBand and Ethernet

Model 24 (Light and Fast): 4 enclosures, 20U, 232 NL-SAS, 6 SSD, 10 GB/second
Model 26 (HPC Workhorse): 6 enclosures, 28U, 348 NL-SAS, 6 SSD, 12 GB/second
High Density HPC Option: 18 enclosures, 2 x 42U standard racks, 1044 NL-SAS, 18 SSD, 36 GB/second
What makes this different

[Diagram: a traditional deployment places custom dedicated disk controllers between the JBOD disk enclosures and the NSD file servers; in the GPFS Storage Server, GPFS Native RAID runs directly on the x3650 NSD file servers attached to the JBOD disk enclosures, with clients connected over FDR InfiniBand and 10 GbE.]

Migrate RAID and disk management to standard file servers!
Feature Detail

Declustered RAID
– Data and parity stripes are uniformly partitioned and distributed across a disk array
– Arbitrary number of disks per array (not constrained to an integral number of RAID stripe widths)

2-fault and 3-fault tolerance (RAID-D2, RAID-D3)
– Reed-Solomon parity encoding
– 2- or 3-fault-tolerant stripes: 8 data strips + 2 or 3 parity strips
– 3- or 4-way mirroring

End-to-end checksum
– Disk surface to GPFS user/client
– Detects and corrects off-track and lost/dropped disk writes

Asynchronous error diagnosis while affected I/Os continue
– If media error: verify and restore if possible
– If path problem: attempt alternate paths
– If unresponsive or misbehaving disk: power cycle the disk

Supports service of multiple disks on a carrier
– I/O operations continue for tracks whose disks have been removed during carrier service
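The declustered placement idea above can be sketched in a few lines. This is an illustrative round-robin layout under stated assumptions, not GNR's actual placement algorithm; the 8+2 stripe geometry comes from the slides, while the function name and the per-stripe rotation are assumptions made for the sketch.

```python
# Illustrative sketch only, NOT GNR's real placement algorithm: spread
# 8+2 stripes across an arbitrary number of disks so that every disk
# holds a similar share of data ("D") and parity ("P") strips.

def decluster(num_disks, num_stripes, data_strips=8, parity_strips=2):
    """Map each stripe's strips onto disks; returns {disk: [(stripe, kind), ...]}."""
    width = data_strips + parity_strips
    assert num_disks >= width, "a stripe's strips must land on distinct disks"
    placement = {d: [] for d in range(num_disks)}
    for stripe in range(num_stripes):
        for strip in range(width):
            disk = (stripe + strip) % num_disks  # rotate the start disk per stripe
            kind = "D" if strip < data_strips else "P"
            placement[disk].append((stripe, kind))
    return placement

# 20 disks, 40 stripes: 400 strips land 20 per disk, with parity spread
# evenly too, so a rebuild can read from all 19 surviving disks at once.
layout = decluster(num_disks=20, num_stripes=40)
print(len(layout[0]), sum(1 for _, kind in layout[0] if kind == "P"))  # 20 4
```

Because strips rotate across the whole pool, no single small group of disks owns a stripe, which is exactly what lets a rebuild draw on every surviving disk.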
De-clustering: Bringing Parallel Performance to Disk Maintenance

Traditional RAID: narrow data+parity arrays
– Example: 20 disks, 5 disks per traditional RAID array; 4x4 RAID stripes (data plus parity)
– After a disk failure, rebuild uses the I/O capacity of only the array's 4 surviving disks
– Because files are striped across all arrays, all file accesses are throttled by the rebuilding array's overhead

Declustered RAID: data+parity distributed over all disks
– Example: the same 20 disks in 1 declustered array; 16 RAID stripes (data plus parity)
– After a disk failure, rebuild uses the I/O capacity of all 19 surviving disks
– Load on file accesses is reduced by 4.8x (= 19/4) during array rebuild
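The 4.8x figure on this slide falls straight out of the surviving-disk counts; a quick arithmetic check:

```python
# Rebuild load comparison using the slide's 20-disk example.
traditional_survivors = 4    # 5-disk RAID array minus the failed disk
declustered_survivors = 19   # 20-disk declustered array minus the failed disk

# With rebuild I/O spread over more disks, the per-disk (and hence
# per-file-access) rebuild load drops by the ratio of survivor counts.
speedup = declustered_survivors / traditional_survivors
print(round(speedup, 1))  # 4.8
```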
Low-Penalty Disk Rebuild Overhead

[Diagram: rebuild read/write activity over time after a disk failure. In the traditional layout, rebuild reads and writes concentrate on the failed disk's small array; in the declustered layout, small read and write operations are spread thinly across all surviving disks.]

Reduces rebuild overhead by 3.5x
Non-intrusive disk diagnostics

Disk Hospital: background determination of problems
– While a disk is in the hospital, GNR non-intrusively and immediately returns data to the client using the error correction code
– For writes, GNR non-intrusively marks the write data and reconstructs it later in the background, after problem determination is complete

Advanced fault determination
– Statistical reliability and SMART monitoring
– Neighbor check, drive power cycling
– Media error detection and correction

Supports concurrent disk firmware updates
Mean time to data loss: 8+2 vs. 8+3

Parity   50 disks            200 disks           50,000 disks
8+2      200,000 years       50,000 years        200 years
8+3      250 billion years   60 billion years    230 million years

Simulation assumptions: disk capacity = 600 GB, MTTF = 600k hours, hard error rate = 1 in 10^15 bits, 47-HDD declustered arrays, uncorrelated failures and hard read errors. These MTTDL figures are due to hard errors: AFR (2-fault-tolerant) = 5 x 10^-6, AFR (3-fault-tolerant) = 4 x 10^-12.
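Read one way (an assumption on my part, since the slide does not spell out the model), the table is consistent with MTTDL ≈ 1 / (per-array AFR × number of arrays), treating each ~47-HDD declustered array as an independent failure domain:

```python
# Hedged sanity check of the table, ASSUMING mttdl = 1 / (afr * num_arrays)
# with the per-array AFR values taken from the slide's simulation notes.
afr_2ft = 5e-6    # annual data-loss rate per 8+2 array (from the slide)
afr_3ft = 4e-12   # annual data-loss rate per 8+3 array (from the slide)

def mttdl_years(afr_per_array, num_arrays):
    """Mean time to data loss in years across independent identical arrays."""
    return 1.0 / (afr_per_array * num_arrays)

print(f"{mttdl_years(afr_2ft, 1):,.0f} years")     # 200,000 years (~50 disks)
print(f"{mttdl_years(afr_2ft, 1000):,.0f} years")  # 200 years (~50,000 disks)
print(f"{mttdl_years(afr_3ft, 1):,.0f} years")     # 250,000,000,000 years
```

Under that reading, the 50-disk column is roughly one array, the 200-disk column four arrays, and the 50,000-disk column about a thousand, which matches the table's scaling.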
Data management portfolio for high performance Technical Computing

[Diagram: IBM Data Management Leadership positioning across focus areas (ease of use and reliability; managed building block; raw performance and I/O bandwidth) and industries (government, high-end research, petroleum, financial services, universities, media/entertainment, CAE, bio/life sciences), spanning smaller installations to the higher end. Products shown: IBM Tape, Tivoli Storage Manager, and HPSS; Direct Attached (DS3500 + V3700); SONAS; DCS3860/DCS3700; GPFS Storage Server; all built on GPFS.]
Summary: IBM GPFS Storage Server

What's New:
– Replaces the external hardware controller with software-based RAID
– Modular upgrades improve TCO
– Non-intrusive disk diagnostics

Client Business Value:
– Integrated and ready to go for GPFS applications
– 3 years of maintenance and support
– Improved storage affordability
– Delivers end-to-end data integrity
– Faster rebuild and recovery times
– Reduces rebuild overhead by 3.5x

Key Features:
– Declustered RAID (8+2p, 8+3p)
– 2- and 3-fault-tolerant erasure codes
– End-to-end checksum
– Protection against lost writes
– Off-the-shelf JBODs
– Standardized in-band SES management
– Built-in SSD acceleration
Doing Big Data Since 1998
IBM GPFS