Using Ceph in OStack.de - Ceph Day Frankfurt

Burkhard Noltensmeier, teuto.net Netzdienste GmbH

Erkan Yanar, Consultant

teuto.net Netzdienste GmbH

● 18 employees
● Linux system house and web development
● Ubuntu Advantage Partner
● OpenStack Ceph Service
● Offices and data center in Bielefeld

Why OpenStack?

Infrastructure as a Service
● Cloud-init (automated instance provisioning)
● Network virtualization
● Multiple storage options
● Multiple APIs for automation

● Closed beta since September 2013
● Updated to Havana in October
● Ubuntu Cloud Archive
● 20 compute nodes
● 5 Ceph nodes
● Additional monitoring with Graphite

Provisioning and Orchestration

OpenStack Storage Types

● Block storage
● Object storage
● Image repository
● Internal cluster storage
  – Temporary image store
  – Databases (MySQL Galera, MongoDB)

Storage Requirements

● Scalability
● Redundancy
● Performance
● Efficient pooling

Key Facts for our Decision

● One Ceph cluster fits all OpenStack needs (see the configuration sketch below)
● No "single point of failure"
● POSIX compatibility via the RADOS Block Device (RBD)
● Seamless scalability
● Commercial support by Inktank
● GPL
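As an illustration of how one Ceph cluster backs Cinder block storage, a minimal Havana-era cinder.conf fragment for the RBD driver might look like the following; pool name, user, and secret UUID are placeholders, not values from the talk:

# cinder.conf fragment (illustrative values)
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = volumes
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_user = cinder
rbd_secret_uuid = <libvirt-secret-uuid>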

Rados Block Storage
● Live migration
● Efficient snapshots
● Different types of storage available (tiering)
● Cloning for fast restore or scaling (see the sketch below)
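A sketch of the snapshot and cloning workflow mentioned above; pool, image, and snapshot names are illustrative:

$ rbd snap create volumes/vm-disk@base
$ rbd snap protect volumes/vm-disk@base
$ rbd clone volumes/vm-disk@base volumes/vm-disk-clone

Clones are copy-on-write, which is what makes fast restore and scaling cheap.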

How to start

● Determine the cluster size
● Odd number of nodes so the monitors can form a quorum
● Start small with at least 5 nodes
● Either 8 or 12 disks per chassis
● One journal per disk
● 2 journal SSDs per chassis (see the sketch below)
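As a sketch of that layout, an OSD with its journal on a shared SSD partition can be created with ceph-deploy roughly like this; host and device names are illustrative:

$ ceph-deploy osd create ceph-node1:sdb:/dev/sdm1   # data on sdb, journal on SSD partition sdm1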

Rough calculation

● 3 nodes, 8 disks per node, 2 replicas
● Net = gross / 2 replicas - 1 node (33%) = 33% usable

Cluster gross
● 24 × 2 TB SATA disks, 100 IOPS each

Cluster net
● 15.8 Terabyte, 790 IOPS

Rough calculation

● 5 nodes, 8 disks per node, 3 replicas
● Net = gross / 3 replicas - 1 node (20%) = 27% usable (reproduced in the sketch below)

Cluster gross
● 40 × 2 TB SATA disks, 100 IOPS each

Cluster net
● 21.3 Terabyte, 1066 IOPS
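The arithmetic behind both slides can be reproduced with a back-of-the-envelope one-liner: raw capacity and raw IOPS are divided by the replica count and reduced by the share of one node kept free for recovery. This is only a rough model, not an official sizing formula:

$ awk 'BEGIN { disks=40; tb=2; iops=100; nodes=5; rep=3;
               f = (1 - 1/nodes) / rep;
               printf "usable: %.1f TB, %d IOPS\n", disks*tb*f, disks*iops*f }'
usable: 21.3 TB, 1066 IOPS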

Ceph specifics

● Data is distributed throughout the cluster
● Unfortunately this destroys data locality
● Trade-off between block size and IOPS: the bigger the blocks, the better the sequential performance
● Double write: SSD journals strongly advised (see the sketch below)
● Long-term fragmentation caused by small writes
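A minimal ceph.conf fragment for such a journal setup, assuming one journal partition per OSD on a shared SSD; the size and path are illustrative, not the values used in ostack.de:

[osd]
osd journal size = 10240                              ; journal size in MB
osd journal = /var/lib/ceph/osd/$cluster-$id/journal  ; symlink or partition on the SSD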

Operational Challenges

● Performance
● Availability
● QoS (Quality of Service)

Ceph Monitoring in ostack

● Ensure quality with monitoring
● Easy spotting of congestion problems
● Event monitoring (e.g. disk failure)
● Capacity management

What we did

● Disk monitoring with Icinga
● Collect data via the Ceph admin socket JSON interface (see the sketch below)
● Put it into Graphite
● Enrich it with metadata:
  – OpenStack tenant
  – Ceph node
  – OSD
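A hedged sketch of this pipeline: read one counter from an OSD's admin socket and push it into Graphite's plaintext listener. The counter path, metric name, and Graphite host are illustrative, and jq is assumed to be available; the actual collector is not part of the slides.

$ ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok perf dump \
    | jq '.osd.op_w' \
    | xargs -I{} echo "ceph.node1.osd.0.op_w {} $(date +%s)" \
    | nc graphite.example.com 2003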

Cumulated OSD performance

Single OSD performance

Sum by OpenStack tenant

Verify Ceph Performance

● fio benchmark with fixed file size (full example below):
  fio --fsync=<n> --runtime=60 --size=1g --bs=<n> ...
● Different sync options: nosync, 1, 100
● Different Cinder QoS service options
● Block sizes: 64k, 512k, 1024k, 4096k
● 1 up to 4 VM clients
● Resulting in 500 benchmark runs
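For illustration, one concrete parameter combination from that matrix could be run like this; the job name, write pattern, and target directory are placeholders for a mounted Cinder volume, not the exact settings from the talk:

$ fio --name=cephbench --directory=/mnt/cinder-vol \
      --rw=write --bs=64k --size=1g --runtime=60 --fsync=1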

Cinder Quality of Service

$ cinder qos-create high-iops consumer="front-end" \
    read_iops_sec=100 write_iops_sec=100 \
    read_bytes_sec=41943040 write_bytes_sec=41943040

$ cinder qos-create low-iops consumer="front-end" \
    read_iops_sec=50 write_iops_sec=50 \
    read_bytes_sec=20971520 write_bytes_sec=20971520

$ cinder qos-create ultra-low-iops consumer="front-end" \
    read_iops_sec=10 write_iops_sec=10 \
    read_bytes_sec=10485760 write_bytes_sec=10485760
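These QoS specs take effect only after they are associated with a volume type that users can select; a sketch with placeholder names and IDs (the association step itself is not shown on the slides):

$ cinder type-create high-iops
$ cinder qos-associate <qos-spec-id> <volume-type-id>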

Speed per Cinder QoS

Does it scale?

Effect of syncing files

Different block sizes with sync

Ceph is somewhat complex, but

● reliable
● No unpleasant surprises (so far!)
● Monitoring is important for resource management and availability!