© 2018 Arm Limited Jun He, [email protected] 2019/4/1 The latest storage status on arm64




Agenda

• Enterprise Storage Overview

• Arm’s Fit

• Key Takeaways

• Q&A


Enterprise Storage Overview


• Storage Hierarchy


Enterprise Storage Overview: From Core-Tech to Solutions

Core-Tech

• Algorithms
  • CRC, Hash
  • RAID, Erasure Coding
  • Encryption: AES, SMx
  • Compression
  • Bloom Filters
  • …

• Resource Access
  • NVMe
  • Fibre Channel
  • RDMA

File Systems and Accelerators

• File System
  • EXT4
  • XFS
  • ZFS/OpenZFS
  • BTRFS

• Accelerations
  • SPDK
  • DPDK
  • NVMe-oF

Solutions

TRADEMARK LEGAL NOTICE: All product names, logos, and brands are property of their respective owners. All company, product and service names used in this slide are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.


Enterprise Storage Overview: Trend

• With SSDs everywhere, NVMe is becoming more popular

• Compute resources are separated from storage resources to get right sizing and independent scaling

• Software-defined storage is important for system deployment, particularly for general databases and backup


Arm’s Fit


Core Tech

Algorithms

• Fundamental algorithms are optimized with Neon and specific extensions

• CRC/SHA/AES optimizations have been contributed to various popular open source projects

• A complete set of reference implementations will be contributed to ISA-L
  • RAID has been done and merged
  • CRC and multi-buffer Hash are in progress
  • AES will be next
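For readers unfamiliar with what is being optimized, the following is a minimal sketch of a plain table-driven CRC, the kind of software baseline that hardware CRC instructions replace. It is not code from ISA-L or any project named in the slides; the function names are illustrative, and the two polynomials are the standard reflected forms of CRC-32 and CRC-32C.

```python
import zlib

CRC32_POLY = 0xEDB88320   # reflected CRC-32 (IEEE 802.3) polynomial
CRC32C_POLY = 0x82F63B78  # reflected CRC-32C (Castagnoli) polynomial


def make_table(poly):
    """Build the 256-entry lookup table for a reflected 32-bit CRC."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            # Shift right one bit; fold in the polynomial if a 1 fell out.
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


def crc_lut(data, table):
    """Single-table CRC: one table lookup per input byte."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = (crc >> 8) ^ table[(crc ^ b) & 0xFF]
    return crc ^ 0xFFFFFFFF


CRC32_TABLE = make_table(CRC32_POLY)
CRC32C_TABLE = make_table(CRC32C_POLY)

if __name__ == "__main__":
    msg = b"123456789"
    # The CRC-32 result can be cross-checked against the standard library.
    print(hex(crc_lut(msg, CRC32_TABLE)), hex(zlib.crc32(msg)))
    print(hex(crc_lut(msg, CRC32C_TABLE)))
```

The byte-at-a-time loop is the part that dedicated CRC or polynomial-multiply instructions speed up; the table itself disappears in those versions.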

Resource Access

• NVMe
  • Quite a few NVMe SSDs from different vendors have been validated on Arm

• RDMA
  • Validated Mellanox ConnectX series on Arm
    – 4KB kernel page size
    – 64KB kernel page size


Core Tech: Algorithms

• Benchmarks on Arm64

[Chart: CRC-16/32/64 Benchmark. Throughput (MB/s) for CRC-16/T10, CRC-32, CRC-32C, CRC-64/ISO and CRC-64/JONES; series: LUT(1tbl), LUT(4tbl), PMULL, CRC.]

[Chart: RAID-5/6 Benchmark. Throughput (MB/s) for XOR_gen and P+Q_gen; series: warm, cold.]

[Chart: SHA Benchmark. Throughput (MB/s) for SHA-1, SHA-224, SHA-256, SHA-384 and SHA-512.]
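The XOR_gen and P+Q_gen names in the RAID benchmark refer to parity generation for RAID-5 and RAID-6. The sketch below shows what those two operations compute, in unoptimized Python. It assumes the conventional RAID-6 field, GF(2^8) with polynomial 0x11D and generator 2; the function names are illustrative and this is not the ISA-L implementation.

```python
from functools import reduce


def xor_gen(blocks):
    """P parity: byte-wise XOR of equal-length blocks (RAID-5 style)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def gf_mul2(x):
    """Multiply by 2 in GF(2^8), reducing by the assumed polynomial 0x11D."""
    x <<= 1
    return (x ^ 0x11D) & 0xFF if x & 0x100 else x


def pq_gen(blocks):
    """P and Q parity (RAID-6 style): Q is the sum of 2^i * D_i in GF(2^8)."""
    size = len(blocks[0])
    p = bytearray(size)
    q = bytearray(size)
    # Horner's rule: walk from the last block to the first, doubling Q
    # before adding each block, so block i ends up multiplied by 2^i.
    for block in reversed(blocks):
        for j, d in enumerate(block):
            p[j] ^= d
            q[j] = gf_mul2(q[j]) ^ d
    return bytes(p), bytes(q)


if __name__ == "__main__":
    data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    p, q = pq_gen(data)
    # A single lost block is the XOR of the survivors and P.
    rebuilt = xor_gen([data[0], data[2], p])
    print(p.hex(), q.hex(), rebuilt == data[1])
```

Both loops are pure byte-parallel work, which is why they map well onto vector instructions such as Neon.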


File Systems and Accelerators

File Systems

• EXT4

• XFS

• ZFS

• Btrfs

Accelerations

• SPDK
  • Enabled and tested on Arm
    – Memory barrier
    – VA address space
    – 4KB + 64KB kernel page size support
  • Fixed several UT failures
  • Optimized CRC-32/32C using the CRC extension. Significant performance improvement is observed in NVMe-oF/TCP

• DPDK
  • Added 64KB kernel page size support to pci_vfio
  • Updated IOMMU configuration setup for Arm64

• NVMe-oF
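Supporting both 4KB and 64KB kernel pages generally means querying the page size at run time instead of hard-coding 4096. The slides do not say which changes SPDK or DPDK needed, so the following is only an illustration of the idea; `page_size` and `align_up` are hypothetical helper names.

```python
import mmap
import os


def page_size():
    """Return the kernel page size in bytes, queried at run time."""
    try:
        return os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        # Fall back to the value the mmap module was initialized with.
        return mmap.PAGESIZE


def align_up(n, alignment):
    """Round n up to the next multiple of alignment (a power of two)."""
    return (n + alignment - 1) & ~(alignment - 1)


if __name__ == "__main__":
    ps = page_size()
    # A 4097-byte buffer needs two 4KB pages but only one 64KB page.
    print(ps, align_up(4097, ps))
```

Code that sizes or aligns DMA buffers with a literal 4096 would silently under-align on a 64KB-page kernel, which is the class of problem this pattern avoids.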


File Systems and Accelerators: SPDK

What is it

• Storage Performance Development Kit

• A set of tools and libraries to create high performance, scalable, user mode storage applications

• Designed for new storage HW devices (NVMe). Can achieve millions of IOPS per core. Better tail latency.


File Systems and Accelerators

• SPDK Benchmark

[Chart: IOPS (K IOPS), Kernel vs. SPDK, for RandRead, RandWrite, RandRW-read and RandRW-write.]

[Chart: Bandwidth (MB/s), Kernel vs. SPDK, for RandRead, RandWrite, RandRW-read and RandRW-write.]

FIO configuration: direct=1, bs=4096, rwmixread=50, iodepth=32, ramp=30s, run_time=180s, jobs=1

System configuration: 2.5GHz AArch64 multi-core, 96GB DDR4 Memory, 1 NVMe
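As a rough guide to reproducing the kernel-path run, the sketch below turns the listed FIO configuration into a fio command line and converts IOPS to bandwidth at the 4096-byte block size. Several things here are assumptions rather than facts from the slides: the mapping of `ramp`, `run_time` and `jobs` to fio's `ramp_time`, `runtime` and `numjobs`, the `--time_based` flag, the job name, and the device path. The I/O engine is not stated in the source, so it is left out.

```python
def fio_args(filename, rw="randrw"):
    """Build a fio argument list from the parameters listed on the slide."""
    return [
        "fio",
        "--name=arm64-nvme",       # hypothetical job name
        f"--filename={filename}",  # target device or file, caller-supplied
        f"--rw={rw}",              # randread, randwrite or randrw
        "--direct=1",
        "--bs=4096",
        "--rwmixread=50",          # only meaningful for mixed workloads
        "--iodepth=32",
        "--ramp_time=30",          # assumed equivalent of ramp=30s
        "--runtime=180",           # assumed equivalent of run_time=180s
        "--time_based",            # assumption: run for the full duration
        "--numjobs=1",             # assumed equivalent of jobs=1
    ]


def iops_to_mb_per_s(iops, block_size=4096):
    """Convert an IOPS figure to MB/s (decimal megabytes)."""
    return iops * block_size / 1e6


if __name__ == "__main__":
    # /dev/nvme0n1 is a placeholder; writing to a raw device destroys data.
    print(" ".join(fio_args("/dev/nvme0n1")))
    print(iops_to_mb_per_s(500_000))
```

The conversion explains why the two charts share a shape: at a fixed 4096-byte block size, 500K IOPS corresponds to about 2,000 MB/s, which matches the upper ends of the two axes.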


Solutions: Ceph

Quick Primer

• Open source, object-based distributed storage system

• Offers three kinds of services
  • Object storage
  • Block storage
  • File system

• Highly durable and available

• Popular in Cloud, HPC and BigData domains
  • Dominant among Cinder drivers


Solutions: Ceph

• Arm64 packages are available in main distros

• Supported in container world

• Bugfixes, features and improvements
  • CRC32 optimizations with Arm’s extension
  • 64KB kernel page size support in NVMEDevices
  • NVMeDevice crash
  • NVMeManager thread hang
  • Tested Ceph + SPDK with 4KB and 64KB kernel page size
  • Validated Ceph + RDMA with 4KB and 64KB kernel page size; full coverage test is in progress


Solutions: Ceph

Ceph+SPDK benchmark

[Chart: fio bandwidth (4KB PAGE_SIZE), MB/s, for 1 core, 2 cores and 4 cores; series: BS=4KB, BS=8KB, BS=16KB. Data labels, in source order: 4.3, 6.4, 12.6, 8.9, 13.7, 27.1, 18.7, 25.9, 41.4.]

[Chart: fio bandwidth (64KB PAGE_SIZE), MB/s, for 1 core, 2 cores and 4 cores; series: BS=4KB, BS=8KB, BS=16KB. Data labels, in source order: 4.8, 7.6, 14.3, 10, 15.6, 31.2, 21, 28.2, 45.]

Ceph+RDMA benchmark

[Chart: fio bandwidth with RDMA (4KB PAGE_SIZE), MB/s, for 1 core, 2 cores and 4 cores; series: BS=4KB, BS=8KB, BS=16KB. Data labels, in source order: 5.3, 6, 13.2, 10.7, 13.7, 36.1, 18.3, 25, 54.6.]

[Chart: fio bandwidth (64KB PAGE_SIZE), MB/s, for 1 core, 2 cores and 4 cores; series: BS=4KB, BS=8KB, BS=16KB. Data labels, in source order: 5.7, 7.6, 14.3, 10.8, 16.5, 30.5, 19.8, 28.6, 57.2.]


Solutions: Ceph

Take the Next Step

• Ceph + RDMA performance optimization

• Ceph + NVMe-oF
  • With RDMA underlying
  • Enablement with 4KB and 64KB kernel page size support
  • Performance profiling and optimization

• Follow-up on the Ceph OSD migration to Seastar (https://github.com/ceph/ceph/tree/master/src/crimson)


Solutions: OpenStack

• Block storage: Ceph RBD as Cinder backend

• Object storage: Swift compatible RADOS gateway

• 100% pass rate on interoperability tests (2018.02 guidelines)

• Moving to Kolla
  • Added Ceph bluestore OSD in Kolla
    – Blueprints, improvements and CI jobs

[Diagram: OpenStack (Cinder, Swift) on top of a Ceph cluster (Mon, OSD); components shown: librados, radosgw, librbd, Swift API, QEMU, libvirt.]


Solutions: Misc

Lustre

• Validated with ZFS backend and LDISKFS backend
• Auster tests

GlusterFS

• Built, deployed and unit tested on Arm64
• Benchmarked with gbench, fio and iozone

HDFS

• Validated with BigData software stack on Arm64

MinIO

• Built and benchmarked on Arm64 on both bare metal and Docker containers
• HighwayHash and SHA-256 are optimized for Arm64, but CRC32 is not yet

Rook

• Enabled native build on Arm64
• Ceph features and improvements


Key Takeaways


• Arm support has been widely adopted across the storage tiers

• Optimizations for core tech are important for storage performance

• New technologies and use cases bring new requirements for storage, where Arm can be a good fit


Q&A


Thank You
