
Flying Circus Ceph Case Study (CEPH Usergroup Berlin)



Slides from the inaugural CEPH user group meeting in Berlin: a quick overview of the CEPH status at the Flying Circus.


Page 1

Case Study: Flying Circus
Berlin CEPH meetup

2014-01-27, Christian Theune <[email protected]>

Page 2

/me

• Christian Theune

• Co-Founder of gocept

• Software Developer (formerly Zope, Plone, grok), Python (lots of packages)

[email protected]

• @theuni

Page 3

Page 4

What worked for us?

• Raw image on local server

• LVM volume via iSCSI (ietd + open-iscsi); see the sketch below
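For context, a minimal sketch of what that pre-Ceph iSCSI export looks like in practice, assuming an LVM volume; the IQN, portal, and device path are invented placeholders, not taken from the slides:

  # /etc/ietd.conf on the storage server (iSCSI Enterprise Target)
  Target iqn.2014-01.com.example:vm-disk1
      Lun 0 Path=/dev/vg0/vm-disk1,Type=blockio

  # On the KVM host, with open-iscsi:
  iscsiadm -m discovery -t sendtargets -p storage.example.com
  iscsiadm -m node -T iqn.2014-01.com.example:vm-disk1 -p storage.example.com --login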

Page 5

What didn’t work (for us)

• ATA over Ethernet

• Gluster (sheepdog)

• Linux HA solution for iSCSI

Page 6

CEPH

• been watching for ages

• started work in December 2012

• production roll-out since December 2013

• about 50% migrated in production

Page 7

Our production structure

• KVM hosts with 2x1Gbps (STO and STB)

• Old storage servers with 5×600 GB RAID 5 + 1 journal, SAS 15k drives

• 5 monitors, 6 OSDs currently

• RBD from KVM hosts and backup server, 1 cluster per customer project (multiple VMs); see the RBD sketch after this list

• Acceptable performance on existing hardware
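A sketch of what that RBD usage amounts to in practice; the pool and image names are hypothetical, and the qemu invocation is abbreviated:

  # Create a 10 GiB image for a VM (size is given in MB here)
  rbd create --size 10240 customerpool/vm01-root

  # Inspect what lives in the pool
  rbd ls customerpool
  rbd info customerpool/vm01-root

  # Attach it to qemu/KVM via the built-in rbd driver (abbreviated)
  qemu-system-x86_64 ... -drive format=raw,file=rbd:customerpool/vm01-root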

Page 8

Good stuff

• No single point of failure any more!

• Create/destroy VM images on KVM hosts!

• Fail-over and self-healing work nicely

• Virtualisation for storage “as it should be”™

• High quality of concepts, implementation, and documentation

• Relatively simple to configure; a minimal ceph.conf sketch follows below
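To illustrate that last point, a minimal ceph.conf of roughly this vintage could look like the following; the fsid, monitor names, and addresses are invented placeholders, not the Flying Circus values:

  [global]
  # placeholder values -- substitute your own cluster's
  fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993
  mon initial members = mon1, mon2, mon3
  mon host = 10.0.0.1, 10.0.0.2, 10.0.0.3
  auth cluster required = cephx
  auth service required = cephx
  auth client required = cephx

  [osd]
  osd journal size = 1024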

Page 9

ceph -s (and -w)
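(The deck showed live output here.) The two commands differ only in that one stays attached:

  # One-shot cluster summary: health, monitor quorum, OSD and PG states
  ceph -s

  # Same summary, then keep streaming cluster events as they happen
  ceph -w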

Page 10

ceph osd tree
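(Again live output in the deck.) The command prints the CRUSH hierarchy, i.e. which OSDs sit under which host, with weights and up/down status:

  ceph osd tree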

Page 11

Current issues

• Bandwidth vs. latency: replicas from the RBD client?!

• Deciding on PG allocation in various situations.

• Deciding on new hardware.

• Backup has become a bottleneck.

• I can haz “ceph osd pool stats” per RBD volume? (see the sketch after this list)

• Still measuring performance. RBD definitely costs us some performance.
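On the pool-stats wish: statistics are exposed per pool, not per RBD image, so the closest approximations look like this (the pool name is hypothetical); rados bench is one simple way to keep measuring raw cluster performance:

  # I/O statistics, per pool only -- there is no per-RBD-image variant
  ceph osd pool stats customerpool

  # List the images sharing that pool
  rbd ls customerpool

  # Rough cluster write benchmark: 60 seconds of 4 MB object writes
  rados bench -p customerpool 60 write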

Page 12

Summary

• finally … FINALLY … F I N A L L Y !

• feels sooo good

• well, at least we did not want to throw up using it

• works as promised

• can’t stop praising it …