
Storage and performance, Whiptail


Page 1: Storage and performance, Whiptail
Page 2: Storage and performance, Whiptail


STORAGE and PERFORMANCE

Darren Williams, Technical Director, EMEA & APAC

Page 3: Storage and performance, Whiptail


THE PROBLEM WITH PERFORMANCE

[Slide diagram: mixed workloads (a 3 TB SQL database at 17k IOPS, batch at 20k IOPS, OLTP at 10k IOPS, plus VDI, HPC, analytics, and other databases) compete for storage resources. Demand varies with write mix: 11k IOPS at 0% writes, 13k IOPS at 25% writes, 17k IOPS at 80% writes. Met with spinning disk, every storage decision becomes a "more assets" problem: 60, 72, or 96 drives, or more discs, more cache, or more arrays, costing space, energy, and personnel for the same 3 TB of data. A demand-side solution serves the combined 12 TB of batch, email, and video workload from one resource, accelerating workloads and productivity while decreasing scale and total costs.]

Page 4: Storage and performance, Whiptail


SINCE 1956, HDDs HAVE DEFINED APPLICATION PERFORMANCE

Speed
• Tens of MB/s data transfer rates
• Hundreds of write/read operations per second
• Millisecond-class latency (~0.001 s)

Design
• Motors
• Spindles
• High energy consumption
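To ground the "hundreds of operations per second" figure, here is a minimal back-of-the-envelope calculation in Python; the drive mechanics are typical 15k RPM values, and the seek time is an assumed illustrative number, not a measurement from the deck:

```python
# Rough random-IOPS ceiling for one spinning disk, from its mechanics.
rpm = 15_000
avg_seek_ms = 3.5                         # assumed typical average seek
avg_rotational_ms = 0.5 * 60_000 / rpm    # half a revolution = 2.0 ms
service_time_ms = avg_seek_ms + avg_rotational_ms
print(f"~{1000 / service_time_ms:.0f} random IOPS per drive")  # ~182
```

At roughly 180 random IOPS per spindle, the 17k IOPS workload from the earlier slide needs on the order of 90+ drives, which is exactly the "more assets" problem.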

Page 5: Storage and performance, Whiptail


FLASH ENABLES APPLICATIONS TO WRITE FASTER

Speed
• Hundreds of MB/s data transfer rates
• Thousands of write or read operations per second
• Microsecond-class latency (~0.000001 s)

Design
• Silicon
• MLC/SLC NAND
• Low energy consumption

Page 6: Storage and performance, Whiptail

6USE OF FLASH – HOST SIDE – PCIE / FLASH DRIVE DAS

• PCIe – Very fast and low latency– Expensive per GB– No redundancy– CPU/Memory stolen from host

• Flash SATA/SAS– More cost effective– Cant get more than 2 drives per blade– Unmanaged can have perf / endurance issues

6

Page 7: Storage and performance, Whiptail

USE OF FLASH – ARRAY-BASED CACHE / TIERING

• Array flash cache
  – Typically read-only
  – PVS already caches most reads
  – Effectiveness limited by a storage array designed for hard disks

• Automated storage tiering
  – "Promotes" hot blocks into the flash tier
  – Only effective for reads
  – Cache misses still result in "media" reads

Page 8: Storage and performance, Whiptail

USE OF FLASH – FLASH IN THE TRADITIONAL ARRAY

• Flash in a traditional array
  – Typically uses SLC or eMLC media
  – High cost per GB
  – Array is not designed for flash media
  – Unmanaged, will result in poor random write performance
  – Unmanaged, will result in poor endurance

Page 9: Storage and performance, Whiptail

USE OF FLASH – FLASH IN THE ALL-FLASH ARRAY

• Optimized to sustain high write and read throughput
• High bandwidth and IOPS; low latency
• Multi-protocol
• Tunable per-LUN performance
• Software designed to enhance lower-cost MLC NAND flash by optimizing high write throughput while substantially reducing wear
• RAID protection and replication

Page 10: Storage and performance, Whiptail


RACERUNNER OS

Page 11: Storage and performance, Whiptail


NAND FLASH FUNDAMENTALS: HDD WRITE PROCESS REVIEW

[Diagram: 4K data blocks; one rewritten data block.]

A physical HDD rewrites data in place at small (sector-level) granularity, with virtually limitless write and rewrite cycles; there is no erase step.

Page 12: Storage and performance, Whiptail

STANDARD NAND FLASH ARRAY WRITE I/O

[Diagram: fabric (iSCSI / FC / SRP) feeds the HBAs, a unified transport, RAID, and three banks of NAND flash x8.]

1. Write request from the host passes over the fabric through the HBAs.
2. Write request passes through the transport stack to RAID.
3. Request is written to media.

Page 13: Storage and performance, Whiptail

NAND FLASH FUNDAMENTALS: FLASH WRITE PROCESS

To rewrite any block within a 2MB NAND page:

1. NAND page contents are read to a buffer.
2. NAND page is erased (aka "flashed").
3. Buffer is written back with the previous data and any changed or new blocks, including zeroes.
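A minimal sketch of this read-erase-program cycle in illustrative Python (the 2 MB page and 4 KB block sizes follow the slide, with the page treated as the erase unit as the slide does; this is a model, not vendor code), showing why rewriting a single 4 KB block physically writes 512x the host data:

```python
# Minimal model of the read-erase-program cycle (sizes from the slide).
PAGE_SIZE = 2 * 1024 * 1024   # 2 MB NAND page
BLOCK_SIZE = 4 * 1024         # 4 KB host block

class NandPage:
    def __init__(self):
        self.data = bytearray(PAGE_SIZE)
        self.erase_count = 0                 # consumed P/E cycles

    def rewrite_block(self, offset: int, block: bytes) -> int:
        """Naive in-place rewrite of one host block; returns bytes
        physically programmed to flash."""
        buffer = bytes(self.data)            # 1. read page to a buffer
        self.data = bytearray(PAGE_SIZE)     # 2. erase ("flash") the page
        self.erase_count += 1
        self.data[:] = buffer                # 3a. write back previous data...
        self.data[offset:offset + len(block)] = block  # 3b. ...plus new block
        return PAGE_SIZE

page = NandPage()
programmed = page.rewrite_block(0, b"\xff" * BLOCK_SIZE)
print(f"write amplification: {programmed // BLOCK_SIZE}x")  # 512x
```

Every such rewrite also burns one P/E cycle on the whole page, which is why the next slide treats write amplification as an endurance problem, not just a latency problem.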

Page 14: Storage and performance, Whiptail

UNDERSTANDING ENDURANCE / RANDOM WRITE PERFORMANCE

Endurance
• Each cell has physical limits (dielectric breakdown): 2K-5K P/E cycles
• Time to erase a block is non-deterministic (2-6 ms)
• Program time is fairly static, based on geometry
• Failure to control write amplification *will* cause wear-out in a short amount of time
• Desktop workloads are among the worst for write amplification; most writes are 4-8KB

Random write performance
• Write amplification not only causes wear-out issues, it also creates unnecessary delays in small random-write workloads
• What is the point of higher-cost flash storage with latency between 2-5 ms?

(A rough wear-out estimate is sketched below.)
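A back-of-the-envelope wear-out estimate: the P/E cycle budget and write sizes come from the slide, while the capacity and host write rate are assumed example values chosen only to show the scale of the difference:

```python
# Wear-out estimate: P/E cycle budget divided by physical write rate.
pe_cycles = 3000                  # mid-range of the quoted 2K-5K
capacity_mb = 1024 * 1024         # assumed 1 TB of NAND
host_rate_mb_s = 50               # assumed sustained host write rate
budget_mb = capacity_mb * pe_cycles   # total MB the media can absorb

for label, write_amp in (("unmanaged, 512x", 512.0), ("managed, ~1.1x", 1.1)):
    seconds = budget_mb / (host_rate_mb_s * write_amp)
    print(f"{label}: {seconds / 86_400:,.1f} days to wear-out")
# unmanaged: ~1.4 days; managed: ~660 days
```

Under these assumptions an unmanaged 512x amplification wears the media out in days, while keeping amplification near 1 stretches the same cell budget by two to three orders of magnitude.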


Page 15: Storage and performance, Whiptail

RACERUNNER OS: DESIGN AND OPERATION

[Diagram: fabric (iSCSI / FC / SRP) feeds the HBAs, a unified transport, the RaceRunner Block Translation Layer (alignment | linearization), a data integrity layer, enhanced RAID, and three banks of NAND SSD x8.]

1. Write request from the host passes over the fabric through the HBAs.
2. Write request passes through the transport stack to the BTL.
3. Incoming blocks are aligned to the native NAND page size.
4. Request is written to media.

(A toy model of the alignment/linearization step follows below.)
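As a toy model of the alignment/linearization idea (illustrative only; the sizes are assumed and this is not RaceRunner's actual implementation): small random host writes are appended to an open, page-sized buffer and remapped, so the flash only ever sees full, sequential page programs:

```python
# Toy block translation layer: align and linearize incoming writes.
NAND_PAGE = 2 * 1024 * 1024   # assumed native NAND page size
HOST_BLOCK = 4 * 1024         # 4 KB host write

class BlockTranslationLayer:
    def __init__(self):
        self.l2p = {}             # logical block address -> (page, offset)
        self.open_page = bytearray()
        self.next_page = 0        # physical pages are consumed in order
        self.flash = []           # programmed pages

    def write(self, lba: int, block: bytes):
        # Remap the LBA to the next slot in the open page (linearization).
        self.l2p[lba] = (self.next_page, len(self.open_page))
        self.open_page += block
        if len(self.open_page) == NAND_PAGE:          # page-aligned boundary
            self.flash.append(bytes(self.open_page))  # one full program, no erase
            self.open_page = bytearray()
            self.next_page += 1

btl = BlockTranslationLayer()
for i in range(513):                                  # 512 x 4 KB fills one page
    btl.write(lba=i % 100, block=b"\x00" * HOST_BLOCK)  # repeated/random LBAs
print(len(btl.flash), "full page(s) programmed sequentially")
# Rewritten LBAs leave stale copies behind for later garbage collection
# instead of forcing a read-erase-program cycle per write.
```

The design choice is the same one log-structured file systems make: trade an address-translation map for write amplification close to 1, which addresses both the endurance and the random-write latency problems from the previous slide.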

Page 16: Storage and performance, Whiptail

THE DATA WAITING DAYS ARE OVER

Scalability path:
• ACCELA: 1.5TB-12TB, 250,000 IOPS, 1.9 GB/s bandwidth
• INVICTA: 2-6 nodes, 6TB-72TB, 650,000 IOPS, 7 GB/s bandwidth
• INVICTA INFINITY (Q1/13): 7-30 nodes, 21TB-360TB, 800,000-4 million IOPS, 40 GB/s bandwidth

Page 17: Storage and performance, Whiptail

THE DATA WAITING DAYS ARE OVER

              ACCELA           INVICTA          INVICTA INFINITY
Height        2U               6U-14U           16U-64U
Capacity      1.5TB-12TB       6TB-72TB         21TB-360TB
IOPS          Up to 250K       250K-650K        800K-4M
Bandwidth     Up to 1.9GB/s    Up to 7GB/s      Up to 40GB/s
Latency       120µs            220µs            250µs

Interfaces: 2/4/8 Gbit/s FC, 1/10 GbE, InfiniBand
Protocols: FC, iSCSI, NFS, QDR
Features: RAID protection and hot sparing, LUN mirroring and LUN striping, async replication, VAAI, write protection buffer
Options: vCenter plugin; INVICTA Node Kit; INFINITY Switch Kit

Page 18: Storage and performance, Whiptail

MULTI-WORKLOAD REFERENCE ARCHITECTURE

Workload engines (type and demand):
• Dell DVD Store, MS SQL Server: 1,200 transactions per second (continuous): 4,000 IOPS, 0.05 GB/s
• VMware View: 600-desktop boot storm (2:30): 109,000 IOPS, 0.153 GB/s
• SQLIO, MS SQL Server, heavy OLTP simulation, 100% 4K writes (continuous): 86,000 IOPS, 0.35 GB/s
• Batch report simulation, 100% 64K reads (continuous): 16,000 IOPS, 1 GB/s

Total demand: 215,000 IOPS, 1.553 GB/s

Platform ("Mercury"): one INVICTA, 350,000 IOPS, 3.5 GB/s, 18 TB, driven by 8 servers
HDD equivalent: about 3,800 drives at RAID 5, or 2,000 drives at RAID 10

In 2012 Mercury traveled to Barcelona, New York, San Francisco, Santa Clara, and Seattle, demonstrating the ability to accelerate multiple workloads onto solid-state storage.
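A quick sanity check of the slide's workload arithmetic (all figures are from the slide; the script itself is only illustrative):

```python
# Sum the per-workload demand figures quoted above.
workloads = {
    "Dell DVD Store OLTP":    (4_000,   0.050),
    "VMware View boot storm": (109_000, 0.153),
    "SQLIO 4K-write OLTP":    (86_000,  0.350),
    "Batch 64K-read reports": (16_000,  1.000),
}
total_iops = sum(iops for iops, _ in workloads.values())
total_bw = sum(bw for _, bw in workloads.values())
print(f"{total_iops:,} IOPS, {total_bw:.3f} GB/s")  # 215,000 IOPS, 1.553 GB/s
# Well within the array's quoted 350,000 IOPS / 3.5 GB/s headroom.
```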

Page 19: Storage and performance, Whiptail

FASTER DATABASE BENCHMARKING

AMD's systems engineering department needed to bring various database workloads up quickly and efficiently in the Opteron Lab, and to eliminate the time spent performance-tuning disk-based storage systems.

They replaced 480 short-stroked hard disk drives with one 6 TB WHIPTAIL array supporting multiple storage protocols.

Results:
• $13,000 power cost reduction; footprint cut from 35U to 2U
• 50x reduction in latency
• 40% improvement in database load times
• Improved workload cycle times for the engineering team

Page 20: Storage and performance, Whiptail

WHAT WHIPTAIL CAN OFFER

• Performance
• Cost

Highly experienced: 250+ customers since 2009 across VDI, database, analytics, and more. Best-in-class performance at the most competitive price.

IOPS ................ 250K - 4M
Throughput .......... 1.9 GB/s - 40 GB/s
Latency ............. 120µs
Power ............... 90% less
Floor space ......... 90% less
Cooling ............. 90% less
Endurance ........... 7.5 years guaranteed
Making decisions faster ... POA

Page 22: Storage and performance, Whiptail


THANK YOU

Darren Williams | Email: [email protected] | @whiptaildarren