Ceph and OpenStack - Feb 2014


Ian Colle
Director of Engineering, Inktank

ian@inktank.com | @ircolle | www.linkedin.com/in/ircolle | ircolle on freenode

inktank.com | ceph.com

AGENDA

INTRO TO CEPH

OPENSTACK AND CEPH

ROADMAP

GETTING INVOLVED

CEPH

CEPH UNIFIED STORAGE

OBJECT STORAGE
S3 & Swift
Multi-tenant
Keystone
Geo-Replication
Native API

BLOCK STORAGE
OpenStack
Linux Kernel
iSCSI
Clones
Snapshots

FILE SYSTEM
CIFS/NFS
HDFS
Distributed Metadata
Linux Kernel
POSIX


CEPH OVERVIEW

HISTORY

2004: Project starts at UCSC
2006: Open sourced for the first time
2010: Included in Linux kernel
2012: Integrated into OpenStack

PHILOSOPHY TODAY

Failure is normal
Self managing
Scale out on commodity hardware
Everything runs in software


TRADITIONAL STORAGE VS. CEPH

TRADITIONAL ENTERPRISE STORAGE    CEPH
Single Purpose                    Multi-Purpose, Unified
Hardware                          Distributed Software
Single Vendor Lock-in             Open
Hard Scale Limit                  Exabyte Scale

STRONG & GROWING COMMUNITY

[Chart: quarterly community activity from 2011-Q3 through 2013-Q3, tracking IRC chat lines, mailing list messages, and commits, all showing strong growth]


ARCHITECTURE

[Diagram: interfaces (S3/Swift, host/hypervisor, iSCSI, CIFS/NFS, SDK) feed into the object storage, block storage, and file system layers, which all sit on storage clusters made up of monitors and object storage daemons (OSDs) running across many nodes]


CRUSH

Placement is computed, not looked up: an object maps to a placement group (PG), and the PG maps to a set of OSDs.

pg = hash(object name) % num_pg
osds = CRUSH(pg, cluster state, rule set)


CRUSH: pseudo-random placement algorithm

Fast calculation, no lookup
Repeatable, deterministic
Statistically uniform distribution
Stable mapping
Limited data migration on change
Rule-based configuration
Infrastructure topology aware
Adjustable replication
Weighting
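To make the two-step mapping concrete, here is a minimal Python sketch of the idea. It is not the real CRUSH algorithm: the PG count, cluster size, and the rendezvous-style hash used to rank OSDs are illustrative assumptions. What it does show is that placement is a repeatable calculation with no lookup table, so any client can compute where an object lives.

```python
import hashlib

NUM_PGS = 128           # placement groups in the pool (assumed)
OSDS = list(range(12))  # a toy cluster of 12 OSDs (assumed)
REPLICAS = 3

def object_to_pg(name):
    """Step 1: hash the object name into a placement group."""
    return int(hashlib.md5(name.encode()).hexdigest(), 16) % NUM_PGS

def pg_to_osds(pg, osds=OSDS, replicas=REPLICAS):
    """Step 2: deterministically pick REPLICAS OSDs for the PG.
    Real CRUSH also consults the cluster map, failure domains,
    rules, and weights; this toy ranking only demonstrates that
    the mapping is repeatable and needs no central directory."""
    ranked = sorted(osds,
                    key=lambda osd: hashlib.md5(f"{pg}-{osd}".encode()).hexdigest())
    return ranked[:replicas]

pg = object_to_pg("my-object")
print(pg, pg_to_osds(pg))   # the same inputs always yield the same OSDs
```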


LIBRADOS

[Diagram: a VM runs on a hypervisor; the hypervisor uses LIBRBD, which is built on LIBRADOS, to talk to the cluster, whose monitors (M) track cluster state]
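As a quick illustration of the LIBRADOS path, here is a minimal sketch using the python-rados bindings. The ceph.conf path and the pool name 'data' are assumptions about a local test cluster; the pool must already exist.

```python
import rados

# Connect to the cluster described by a local ceph.conf (path assumed).
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

# Open an I/O context on an existing pool (name assumed) and store an object.
ioctx = cluster.open_ioctx('data')
ioctx.write_full('hello-object', b'hello ceph')
print(ioctx.read('hello-object'))

ioctx.close()
cluster.shutdown()
```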

HOW DO YOU SPIN UP HUNDREDS OF VMs INSTANTLY AND EFFICIENTLY?

[Animation: copy-on-write cloning. A 144-unit golden image is cloned instantly; the clone holds no data of its own but logically equals the parent (= 144). Client writes land only in the clone, which grows (= 148), while reads of unmodified data are served from the parent image.]
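As a sketch of how that copy-on-write flow looks through the python-rbd bindings (pool and image names, sizes, and the clone count are illustrative assumptions): one protected snapshot of a golden image can back any number of instantly created VM disks.

```python
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')          # pool name is an assumption

r = rbd.RBD()
GiB = 1024 ** 3
r.create(ioctx, 'golden-image', 10 * GiB)  # master VM image (assumed name/size)

# Snapshot the golden image and protect the snapshot so it can be cloned.
image = rbd.Image(ioctx, 'golden-image')
image.create_snap('base')
image.protect_snap('base')
image.close()

# Each clone is created instantly; data is only copied when a VM writes.
for i in range(3):
    r.clone(ioctx, 'golden-image', 'base',
            ioctx, 'vm-%03d' % i,
            features=rbd.RBD_FEATURE_LAYERING)

ioctx.close()
cluster.shutdown()
```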

OPENSTACK AND CEPH

ARCHITECTURAL COMPONENTS

RGW: A web services gateway for object storage, compatible with S3 and Swift

LIBRADOS: A library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby, PHP)

RADOS: A software-based, reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes and lightweight monitors

RBD: A reliable, fully-distributed block device with cloud platform integration

CEPHFS: A distributed file system with POSIX semantics and scale-out metadata management


CEPH WITH OPENSTACK

[Diagram: OpenStack's Swift API is served by the Ceph Object Gateway (RGW), which also integrates with the Keystone API for authentication; the Cinder, Glance, and Nova APIs use the Ceph Block Device (RBD) through the Qemu/KVM hypervisor. Both RGW and RBD are backed by the Ceph Storage Cluster (RADOS).]

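To illustrate the RGW side of this picture, here is a small sketch that talks to the S3-compatible API of a Ceph Object Gateway using the boto library; the endpoint hostname and credentials are placeholders, and the gateway is assumed to listen on plain HTTP.

```python
import boto
import boto.s3.connection

# Placeholder endpoint and credentials for a local RGW instance.
conn = boto.connect_s3(
    aws_access_key_id='RGW_ACCESS_KEY',
    aws_secret_access_key='RGW_SECRET_KEY',
    host='rgw.example.com',
    is_secure=False,
    calling_format=boto.s3.connection.OrdinaryCallingFormat(),
)

bucket = conn.create_bucket('demo-bucket')   # the bucket's objects live in RADOS via RGW
key = bucket.new_key('hello.txt')
key.set_contents_from_string('stored in RADOS through the S3 API')
print(key.get_contents_as_string())
```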

PROPOSED ICEHOUSE ADDITIONS

Swift RADOS backend? (possibly)

DevStack Ceph – in work

Enable cloning for RBD-backed ephemeral disks – in review


WHAT’S NEXT FOR CEPH?

CEPH ROADMAP

Upcoming releases: Firefly, Giant, H-Release

Cache Tiering
Erasure Coding
Read-Affinity
Object Versioning
Object Quotas
Object Expiration
Alternative Web Server for RGW
CephFS
Performance Improvement

CACHE TIERING - WRITEBACK


[Diagram: clients A and B write through a 500 TB writeback cache tier that sits in front of a 5 PB HDD object store]

CACHE TIERING - READONLY


[Diagram: clients A, B, and C read from separate read-only cache tiers (200 TB and two 150 TB caches) in front of the same 5 PB HDD object store]
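As a hedged sketch of how a writeback cache tier like the one above is attached to a base pool, using the tiering commands introduced with Firefly; the pool names are assumptions, and the ceph CLI is wrapped in Python only to keep the deck's examples in one language. An admin keyring is required.

```python
import subprocess

def ceph(*args):
    """Run a ceph CLI command (requires cluster admin access)."""
    subprocess.check_call(['ceph'] + list(args))

# Assumed pool names: 'cold-storage' is the large HDD base tier,
# 'hot-cache' is a smaller, faster pool used as the cache tier.
ceph('osd', 'tier', 'add', 'cold-storage', 'hot-cache')

# Writeback mode: clients write into the cache; objects are flushed
# and evicted to the base tier in the background.
ceph('osd', 'tier', 'cache-mode', 'hot-cache', 'writeback')

# Route client traffic for the base pool through the cache tier.
ceph('osd', 'tier', 'set-overlay', 'cold-storage', 'hot-cache')

# For the read-only variant shown above, the cache mode would instead be:
# ceph('osd', 'tier', 'cache-mode', 'hot-cache', 'readonly')
```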

With 3x replication, a 10 MB object costs you 30 MB of raw storage.

With erasure coding, the same 10 MB object costs you ~14 MB of raw storage.
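The arithmetic behind those two figures, assuming 3x replication on one side and a hypothetical k=10, m=4 erasure-coding profile on the other:

```python
def replication_cost(object_mb, replicas=3):
    """Raw storage used when the object is stored 'replicas' times."""
    return object_mb * replicas

def erasure_coding_cost(object_mb, k=10, m=4):
    """Raw storage used when the object is split into k data chunks
    plus m coding chunks; the overhead factor is (k + m) / k."""
    return object_mb * (k + m) / float(k)

print(replication_cost(10))      # 30 MB with 3x replication
print(erasure_coding_cost(10))   # 14.0 MB with a 10+4 profile
```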

NEXT STEPS

WHAT NOW?

Getting Started with Ceph

• Read about the latest version of Ceph: http://ceph.com/docs
• Deploy a test cluster using ceph-deploy: http://ceph.com/qsg (see the sketch at the end of this section)
• Deploy a test cluster on the AWS free tier using Juju: http://ceph.com/juju
• Ansible playbooks for Ceph: https://www.github.com/alfredodeza/ceph-ansible

Getting Involved with Ceph

• Most discussion happens on the mailing lists ceph-devel and ceph-users. Join or view archives at http://ceph.com/list
• IRC is a great place to get help (or help others!): #ceph and #ceph-devel. Details and logs at http://ceph.com/irc
• Download the code: http://www.github.com/ceph
• The tracker manages bugs and feature requests. Register and start looking around at http://tracker.ceph.com
• Doc updates and suggestions are always welcome. Learn how to contribute docs at http://ceph.com/docwriting
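As promised above, a rough sketch of what the ceph-deploy quick start looks like for a small test cluster; hostnames and OSD paths are hypothetical, the exact arguments vary by ceph-deploy version, and the commands are wrapped in Python only to keep the deck's examples in one language.

```python
import subprocess

def ceph_deploy(*args):
    """Invoke ceph-deploy from an admin node with passwordless SSH to the targets."""
    subprocess.check_call(['ceph-deploy'] + list(args))

nodes = ['node1', 'node2', 'node3']          # hypothetical hostnames

ceph_deploy('new', 'node1')                  # write the initial ceph.conf and monitor list
ceph_deploy('install', *nodes)               # install Ceph packages on every node
ceph_deploy('mon', 'create-initial')         # bootstrap the monitor and gather keys
ceph_deploy('osd', 'prepare', 'node2:/var/local/osd0', 'node3:/var/local/osd1')
ceph_deploy('osd', 'activate', 'node2:/var/local/osd0', 'node3:/var/local/osd1')
ceph_deploy('admin', *nodes)                 # push the admin keyring to every node
```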

Ian R. Colle
Director of Engineering

ian@inktank.com | @ircolle
www.linkedin.com/in/ircolle | ircolle on freenode
