Ceph Performance on OpenStack (Over 25,000 Benchmarks! Revised from 50,000) Open Standard Cloud Association (OSCA) Takehiro Kudou, Hitachi Solutions, Ltd. (http://www.slideshare.net/tkkd/) Takanori Suzuki, Dell Japan Inc. OpenStack Summit Barcelona 2016 #vBrownBag

Ceph Performance on OpenStack - Barcelona Summit


Page 1: Ceph Performance on OpenStack - Barcelona Summit

Ceph Performance on OpenStack (Over 25,000 Benchmarks! Revised from 50,000)

Open Standard Cloud Association (OSCA)

Takehiro Kudou, Hitachi Solutions, Ltd.

(http://www.slideshare.net/tkkd/)

Takanori Suzuki, Dell Japan Inc.

OpenStack Summit Barcelona 2016 #vBrownBag

Page 2: Ceph Performance on OpenStack - Barcelona Summit

OSCA Introduction

Founded in 2012, the Open Standard Cloud Association (OSCA) partners with Japan's leading companies and public organizations to solve technology problems and to accelerate next-generation open standard cloud technology and its commercial adoption.

Today, OSCA v2.0 expands its scope of activity to IoT solutions.

Page 3: Ceph Performance on OpenStack - Barcelona Summit

OSCA Activity

OSCA is always open to everyone

• Proof of Concept, Technical Blog, Whitepaper

• Seminar, Event (Lightning Talk, Networking event for Engineers)

Current working project

• Performance assessments (Ceph, I/O on Docker, NFVI for vCPE)

• Others (Networking OS comparison on Whitebox switch, BigData/IoT Labs)

Page 4: Ceph Performance on OpenStack - Barcelona Summit

OpenStack working group Initiatives

• OpenStack on Ceph performance assessment / Design Guide

• OpenStack for NFV solution

• Physical/VM to OpenStack Migration Guide

• Network Design Guide for OpenStack

• Neutron plug-in OVS vs MidoNet comparison

• Swift Step by Step Guide

Page 5: Ceph Performance on OpenStack - Barcelona Summit

© Hitachi Solutions, Ltd. 2016. All rights reserved.

Lead Engineer, Research & Development Department,

Hitachi Solutions, Ltd.

Oct. 27, 2016

Takehiro Kudou

Ceph Performance on OpenStack

(Over 25,000 Benchmarks! Revised from 50,000) ~Benchmark Result~

OpenStack Summit Barcelona 2016 #vBrownBag

Page 6: Ceph Performance on OpenStack - Barcelona Summit

Contents

1. Benchmark Method

2. Result and Analysis on Ceph 1.3

3. Tasting of Ceph 2.0 BlueStore (revised from "Result on Ceph 2.0 BlueStore")

Page 7: Ceph Performance on OpenStack - Barcelona Summit

1. Benchmark Method

Page 8: Ceph Performance on OpenStack - Barcelona Summit

1-1 Benchmark Environments

Instances on RHEL-OSP7(Juno)/RHEL-OSP9(Mitaka)

- 3 OpenStack Compute Servers

-- 9 instances per Compute Server (total 27 instances)

OSD Servers [RHCS1.3(Hammer)/RHCS2.0(Jewel)]

- 6 Ceph OSD Servers (total 18 OSD Disks)

-- [ONLY 1.3] Journal: 320GB SSD x1 (50GB x 3 partitions)

-- OSD Disk: 600GB SAS 10Krpm x3 (JBOD)

■ Benchmark from 27 instances to 18 OSD Disks

Page 9: Ceph Performance on OpenStack - Barcelona Summit

1-2 Benchmark Procedure

ssh (user@instance IP) fio -rw=[read/write] -size=1G -ioengine=libaio -iodepth=4 -invalidate=1 -direct=1 -name=test.bin -runtime=120 -bs=[BlockSize]k -numjobs=[Jobs] -group_reporting > (file name) & ssh・・・・

Benchmark Duration: 2 min for each run, 96 hours in total.

■ Fio Parameter Options

- VM: 1 | 2 | 3 | 9 | 27

- [read/write]: randread | randwrite

- [BlockSize]: 4 | 16 | 32 | 64 | 128

- [Jobs]: 1 | 4 | 8 | 16

- 3 loops

- OSD Servers: 3 | 4 | 5 | 6

- Ceph 1.3 | Ceph 2.0
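
The fio parameter options on this slide can be expanded mechanically. The following is a minimal Python sketch, not the authors' actual script: it builds one fio argument list per combination of [read/write], [BlockSize] and [Jobs], reusing the fixed options from the command on this slide. The `ssh_command` helper and the target address `cloud-user@192.0.2.10` are hypothetical placeholders.

```python
from itertools import product

# Values copied from the slide's parameter list.
RW_MODES = ["randread", "randwrite"]
BLOCK_SIZES_KB = [4, 16, 32, 64, 128]
NUM_JOBS = [1, 4, 8, 16]


def fio_args(rw, bs_kb, jobs):
    """Return the fio argument list for one parameter combination."""
    return [
        "fio", f"-rw={rw}", "-size=1G", "-ioengine=libaio", "-iodepth=4",
        "-invalidate=1", "-direct=1", "-name=test.bin", "-runtime=120",
        f"-bs={bs_kb}k", f"-numjobs={jobs}", "-group_reporting",
    ]


def ssh_command(target, rw, bs_kb, jobs):
    """Hypothetical helper: prefix the fio arguments with an ssh call."""
    return ["ssh", target] + fio_args(rw, bs_kb, jobs)


# One entry per (read/write, BlockSize, Jobs) combination: 2 x 5 x 4 = 40.
matrix = [fio_args(rw, bs, jobs)
          for rw, bs, jobs in product(RW_MODES, BLOCK_SIZES_KB, NUM_JOBS)]

if __name__ == "__main__":
    # The target below is a placeholder address, not taken from the slides.
    print(" ".join(ssh_command("cloud-user@192.0.2.10", "randread", 4, 1)))
    print(len(matrix), "combinations per VM count, loop and OSD server count")
```

Building argument lists rather than one shell string avoids quoting problems if the commands are later passed to a process launcher.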

Number of benchmarks: (1+2+3+9+27) x 2 x 5 x 4 x 3 x 4 x 2 = 40,320, revised to 20,160

+ 10,000 test data, revised to 5,000

We ran into trouble!
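
The product on this slide is easy to check. A short Python sketch follows. Treating the smaller figure (20,160) as the total without the Ceph-version factor, that is, the runs for only one of the two Ceph versions, is an assumption drawn from the numbers rather than something the slide states.

```python
from math import prod

# Factors as listed on the slide, in order.
factors = [
    1 + 2 + 3 + 9 + 27,  # VM counts, summed: 42
    2,                   # randread | randwrite
    5,                   # block sizes
    4,                   # numjobs values
    3,                   # loops
    4,                   # OSD server counts
    2,                   # Ceph 1.3 | Ceph 2.0
]

planned = prod(factors)
# Assumption: dropping the Ceph-version factor yields the smaller figure.
one_version = prod(factors[:-1])

print(planned, one_version)  # 40320 20160
```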

Page 10: Ceph Performance on OpenStack - Barcelona Summit

2. Result and Analysis on Ceph 1.3

Page 11: Ceph Performance on OpenStack - Barcelona Summit

2-1 Read Benchmark Result

No impact when the number of OSD servers changed.

Total throughput exceeded 20GB/s (160Gbps).

NICs: 10Gbps x3

It seems something (memory cache?) influenced the benchmark data.
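
The suspicion of caching follows from simple unit arithmetic. The sketch below converts the reported 20GB/s to Gbps and compares it with the NIC capacity. Reading "10Gbps x3" as 30Gbps of aggregate network bandwidth is an assumption; the slide does not say how the three links are distributed.

```python
def gbytes_to_gbits(gbytes_per_s):
    """Convert GB/s to Gbps (8 bits per byte)."""
    return gbytes_per_s * 8


observed_gbps = gbytes_to_gbits(20)   # throughput reported on the slide
nic_total_gbps = 10 * 3               # assumption: three 10Gbps links in total
exceeds_network = observed_gbps > nic_total_gbps

print(observed_gbps, nic_total_gbps, exceeds_network)  # 160 30 True
```

If the measured rate is several times what the network could carry, at least part of the data was presumably served without crossing the network.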

Page 12: Ceph Performance on OpenStack - Barcelona Summit

2-2 Write Benchmark Result

Performance increased linearly.

Performance is not good!

The SSD journal cache does not seem to affect either IOPS or throughput in this case.

Page 13: Ceph Performance on OpenStack - Barcelona Summit

2-3 Cause of Write Slow Problem

Write Access to HDD

Increase Write Queue

Continuous Incoming Data

Force Sync

Wait for Sync Finished

OSD Disk Parameter

--filestore_max_sync_interval 10 (Default: 5 seconds)

Performance concerns

- HDD rpm

- Number of HDDs

If these are not sufficient, the journal SSD does NOT work effectively.
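
To illustrate why a journal SSD cannot hide slow HDDs under sustained load, here is a deliberately simplified Python model, not Ceph's actual algorithm. The journal absorbs writes at the incoming rate while the HDD drains them at a lower rate, so the journal eventually fills and writers must wait. The 50GB journal partition size comes from the environment slide; the ingest and drain rates are hypothetical. Real FileStore behavior also depends on sync intervals and queue limits, which this model ignores.

```python
def seconds_until_journal_full(journal_gb, ingest_mb_s, drain_mb_s):
    """Return seconds until the journal fills, or None if the HDD keeps up.

    Simplified model: the journal grows by (ingest - drain) MB every second.
    """
    if drain_mb_s >= ingest_mb_s:
        return None
    return journal_gb * 1000 / (ingest_mb_s - drain_mb_s)


# 50GB partition from the slides; 400 and 150 MB/s are made-up rates.
print(seconds_until_journal_full(50, 400, 150))  # 200.0
print(seconds_until_journal_full(50, 100, 150))  # None
```

In this model a faster or larger set of HDDs raises the drain rate, which is the only way to keep the journal from filling under continuous writes.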

Page 14: Ceph Performance on OpenStack - Barcelona Summit

3. Tasting of Ceph 2.0 BlueStore (revised from "Result on Ceph 2.0 BlueStore")

Page 15: Ceph Performance on OpenStack - Barcelona Summit

3-1 Ceph 2.0 BlueStore Overview

・BlueStore, a new OSD backend

- A Tech Preview feature of Red Hat Ceph Storage 2 (Jewel)

- Direct access to Block Device (Journal bypass)

・Benchmark Environment

- RHEL-OSP9 (Mitaka)

- RHCS2(Jewel) with BlueStore

・There are critical bugs

Ref.)http://www.slideshare.net/sageweil1/bluestore-a-new-faster-storage-backend-for-ceph

http://redhatstorage.redhat.com/2016/06/23/the-milestone-of-red-hat-ceph-storage-2/

Page 16: Ceph Performance on OpenStack - Barcelona Summit

3-2 Trouble 1

[cloud-user@cephbs-15 ~]$ fio

Segmentation fault

[cloud-user@cephbs-15 ~]$ md5sum /usr/bin/fio

0ff2a797ba777aced3c7979a1309ff6c /usr/bin/fio

[cloud-user@cephbs-16 ~]$ md5sum /usr/bin/fio

4f50ea445bd7a8aaae17abcd323dc3c5 /usr/bin/fio

■ Fio binary was broken.

The data from offset 3000 (hexadecimal) onward was LOST!!!!

Hit the bug that causes dirty blobs!!

Ref.) Re: segfault in bluestore during random writes (http://www.spinics.net/lists/ceph-devel/msg31384.html)

os/bluestore: refactor dirty blob tracking along with some related fixes #10215 (https://github.com/ceph/ceph/pull/10215)

・Hex dump comparison
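
The corruption check on this slide amounts to comparing checksums and then locating where the contents diverge. A minimal Python sketch of both steps follows. The byte strings are made-up sample data, and `first_difference` is a hypothetical helper, not a tool the authors mention.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string, like md5sum prints."""
    return hashlib.md5(data).hexdigest()


def first_difference(a: bytes, b: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        # One input is a strict prefix of the other.
        return min(len(a), len(b))
    return None


# Made-up sample: the "bad" copy is zeroed from offset 0x100 onward.
good = b"A" * 512
bad = good[:256] + b"\x00" * 256

print(md5_hex(good) == md5_hex(bad))      # False
print(hex(first_difference(good, bad)))   # 0x100
```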

Page 17: Ceph Performance on OpenStack - Barcelona Summit

3-3 Trouble 2

・Half of OSDs were BROKEN!!

Page 18: Ceph Performance on OpenStack - Barcelona Summit

Summary

・Ceph 1.3

- Read Performance : Extremely high (Hitting Memory Cache?)

- Write Performance : Not so good under heavy I/O

Journal cache is not effective under heavy, long-running write access patterns.

・Ceph 2.0’s BlueStore

- Not mature enough in tech preview

- We recommend waiting until it becomes a bit more stable

Page 19: Ceph Performance on OpenStack - Barcelona Summit

Special Thanks

•Hirotada Sasaki, Red Hat K.K.

•Masayoshi Hibino, Dell Japan Inc.

•Kazuho Hirahara, Hitachi Solutions, Ltd.

You can download the benchmark graphs for Ceph 1.3 (Japanese document):

http://ja.community.dell.com/techcenter/m/mediagallery/3739/download

Page 20: Ceph Performance on OpenStack - Barcelona Summit

Ceph Performance on OpenStack

(Over 50,000 Benchmarks!)

Open Standard Cloud Association(OSCA)

Linux is a trademark of Linus Torvalds. The OpenStack(R) Word Mark and OpenStack Logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation in the United States and other countries and are used with the OpenStack Foundation's permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community. OSCA™(Open Standard Cloud Association)is a trademark of Dell Japan Inc. PowerEdge, Dell and the Dell logo are trademarks of Dell Inc. RED HAT is a registered trademark of Red Hat, Inc. Other company and product names mentioned in this document may be the trademarks of their respective owners.

This session's article, graph and drawing are provided for the ONLY purpose of reference information. The information is private data that we evaluated with a SPECIFIC circumstance. We NEVER guarantee the information.

The rights of this session's article, graph and drawing are reserved by OSCA, Hitachi Solutions, Ltd, Red Hat K.K and Dell Japan Inc. No reproduction is allowed without previous permission.

OpenStack Summit Barcelona 2016 #vBrownBag

(http://www.slideshare.net/tkkd/)