
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service


DESCRIPTION

Presented by Kamesh Pemmaraju, Dell


Page 1: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

Best Practices for Ceph-Powered Implementations of Storage as-a-Service
Kamesh Pemmaraju, Sr. Product Mgr, Dell

Ceph Developer Day, New York City, Oct 2014

Page 2: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

Outline

• Planning your Ceph implementation
• Ceph Use Cases
• Choosing targets for Ceph deployments
• Reference Architecture Considerations
• Dell Reference Configurations
• Customer Case Study

Page 3: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

• Business Requirements
  – Budget considerations, organizational commitment
  – Avoiding lock-in – use open source and industry standards
  – Enterprise IT use cases
  – Cloud applications/XaaS use cases for massive-scale, cost-effective storage

• Sizing requirements
  – What is the initial storage capacity?
  – Is the data usage steady-state or spiky?
  – What is the expected growth rate?

• Workload requirements
  – Does the workload need high performance, or is it more capacity focused?
  – What are the IOPS/throughput requirements?
  – What type of data will be stored – ephemeral vs. persistent; object, block, or file?

• Ceph is like a Swiss Army knife – it can be tuned for a wide variety of use cases. Let us look at some of them.

Planning your Ceph Implementation

Page 4: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

Ceph is like a Swiss Army Knife – it can fit in a wide variety of target use cases

[Diagram: 2x2 map of target use cases, with Performance vs. Capacity on one axis and Traditional IT vs. Cloud Applications on the other]
• Traditional IT: Virtualization and Private Cloud (traditional SAN/NAS); High Performance (traditional SAN); NAS & Object Content Store (traditional NAS)
• Cloud Applications (Ceph targets): XaaS Compute Cloud – Open Source Block; XaaS Content Store – Open Source NAS/Object

Page 5: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service


USE CASE: OPENSTACK

[Diagram: OpenStack services (Keystone API, Swift API, Cinder API, Glance API, Nova API) and the hypervisor (QEMU/KVM) sitting on top of the Ceph Object Gateway (RGW) and Ceph Block Device (RBD), backed by the Ceph Storage Cluster (RADOS)]

Page 6: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service


USE CASE: OPENSTACK

[Diagram: the same OpenStack APIs (Keystone, Swift, Cinder, Glance, Nova) on top of the Ceph Object Gateway (RGW) and Ceph Block Device (RBD); the hypervisor (QEMU/KVM) uses RBD for volumes, ephemeral disks, and copy-on-write snapshots]
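For reference, a minimal python-rbd sketch of the layering mechanism behind these copy-on-write snapshots; the pool, image, and snapshot names are illustrative, and in a real OpenStack deployment Cinder and Glance issue the equivalent calls for you.

# Sketch: copy-on-write cloning with python-rbd (pool/image names are illustrative).
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')   # connects with the local admin keyring
cluster.connect()
ioctx = cluster.open_ioctx('volumes')                   # assumes a 'volumes' pool exists

try:
    rbd_inst = rbd.RBD()
    # Create a format-2 parent image with layering enabled (required for cloning).
    rbd_inst.create(ioctx, 'golden-image', 10 * 1024**3,
                    old_format=False, features=rbd.RBD_FEATURE_LAYERING)

    parent = rbd.Image(ioctx, 'golden-image')
    parent.create_snap('base')       # snapshot the parent
    parent.protect_snap('base')      # clones require a protected snapshot
    parent.close()

    # The clone shares unmodified data with the parent: a copy-on-write volume.
    rbd_inst.clone(ioctx, 'golden-image', 'base', ioctx, 'vm-disk-01',
                   features=rbd.RBD_FEATURE_LAYERING)
finally:
    ioctx.close()
    cluster.shutdown()

Because the clone only stores blocks that diverge from the protected snapshot, a new volume or ephemeral disk is available almost instantly.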

Page 7: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service


USE CASE: OPENSTACK

[Diagram: Red Hat Enterprise Linux OpenStack Platform on top of the Ceph Object Gateway (RGW) and Ceph Block Device (RBD), backed by the Ceph Storage Cluster (RADOS)]

CERTIFIED!

Page 8: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service


USE CASE: CLOUD STORAGE

[Diagram: a web application whose app servers talk S3/Swift to a pair of Ceph Object Gateways (RGW), backed by the Ceph Storage Cluster (RADOS)]
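As a sketch of what the app servers on this slide do, here is the S3-compatible path into RGW using the classic boto library; the endpoint and credentials are placeholders for a user created with radosgw-admin.

# Sketch: talking to the Ceph Object Gateway over its S3-compatible API with boto.
# Endpoint and keys are placeholders (create a user with radosgw-admin first).
import boto
import boto.s3.connection

conn = boto.connect_s3(
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
    host='rgw.example.com',                                      # your RGW endpoint
    port=80,
    is_secure=False,
    calling_format=boto.s3.connection.OrdinaryCallingFormat(),   # path-style URLs
)

bucket = conn.create_bucket('web-assets')
key = bucket.new_key('hello.txt')
key.set_contents_from_string('hello from RGW')    # upload an object
print(key.generate_url(3600, query_auth=True))    # signed URL valid for 1 hour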

Page 9: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service


USE CASE: WEBSCALE APPLICATIONS

[Diagram: app servers from a web application talking the native protocol (librados) directly to the Ceph Storage Cluster (RADOS)]
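A minimal sketch of the native-protocol path shown here, using the python-rados (librados) bindings; the pool name and object key are illustrative.

# Sketch: an app server writing and reading objects directly over librados.
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')   # assumes a local ceph.conf + keyring
cluster.connect()
ioctx = cluster.open_ioctx('app-data')                  # assumes an 'app-data' pool exists

try:
    ioctx.write_full('user:1234:avatar', b'...binary payload...')    # create/replace an object
    ioctx.set_xattr('user:1234:avatar', 'content-type', b'image/png')
    data = ioctx.read('user:1234:avatar')                            # read it back
    print(len(data), 'bytes read via the native RADOS protocol')
finally:
    ioctx.close()
    cluster.shutdown()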

Page 10: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service


USE CASE: PERFORMANCE BLOCK

[Diagram: KVM/RHEV hosts in front of a Ceph Storage Cluster composed of a replicated cache pool layered over a replicated backing pool]

Page 11: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service


USE CASE: PERFORMANCE BLOCK

[Diagram: the cache tier in writeback mode – client reads and writes go to the cache pool in front of the replicated backing pool]

Page 12: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service


USE CASE: PERFORMANCE BLOCK

[Diagram: the cache tier in read-only mode – writes go to the replicated backing pool while reads are served from the cache pool]
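A sketch of how such a cache tier is wired up with the standard ceph osd tier commands (Firefly-era syntax), wrapped in Python here for consistency with the other examples; the pool names are illustrative and both pools are assumed to exist already.

# Sketch: layering a replicated SSD cache pool over a backing pool with `ceph osd tier`.
# Pool names are illustrative; both pools are assumed to already exist.
import subprocess

def ceph(*args):
    """Run a ceph CLI command and raise if it fails."""
    subprocess.check_call(['ceph'] + list(args))

backing, cache = 'rbd-backing', 'rbd-cache'

ceph('osd', 'tier', 'add', backing, cache)                  # attach the cache pool to the backing pool
ceph('osd', 'tier', 'cache-mode', cache, 'writeback')       # writeback mode, as on the earlier slide
ceph('osd', 'tier', 'set-overlay', backing, cache)          # route client I/O through the cache
ceph('osd', 'pool', 'set', cache, 'hit_set_type', 'bloom')  # track object hits for flush/eviction

The read-only variant shown on this slide uses cache-mode readonly instead of writeback.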

Page 13: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service


USE CASE: ARCHIVE / COLD STORAGE

[Diagram: an application writing to a replicated cache pool in front of an erasure-coded backing pool in the Ceph Storage Cluster]
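For the erasure-coded backing pool, a sketch of the Firefly-era commands that create an erasure-code profile, the cold-storage pool, and the replicated cache in front of it; profile parameters, pool names, and placement-group counts are illustrative.

# Sketch: an erasure-coded cold-storage pool fronted by a replicated cache pool.
# Profile parameters, pool names, and PG counts are illustrative.
import subprocess

def ceph(*args):
    subprocess.check_call(['ceph'] + list(args))

# 4 data chunks + 2 coding chunks => ~67% of raw capacity usable, tolerates 2 failed hosts.
ceph('osd', 'erasure-code-profile', 'set', 'cold-profile',
     'k=4', 'm=2', 'ruleset-failure-domain=host')

ceph('osd', 'pool', 'create', 'cold-storage', '1024', '1024', 'erasure', 'cold-profile')
ceph('osd', 'pool', 'create', 'cold-cache', '256', '256', 'replicated')

ceph('osd', 'tier', 'add', 'cold-storage', 'cold-cache')          # cache in front of the EC pool
ceph('osd', 'tier', 'cache-mode', 'cold-cache', 'writeback')
ceph('osd', 'tier', 'set-overlay', 'cold-storage', 'cold-cache')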

Page 14: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

USE CASE: DATABASES

[Diagram: MySQL / MariaDB on RHEL 7 using the RBD kernel module (Ceph Block Device) and the native protocol against the Ceph Storage Cluster (RADOS)]
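A sketch of the kernel-RBD provisioning implied by this slide: create an image, map it with the RHEL 7 krbd driver, and put a filesystem on it for the MySQL/MariaDB data directory. Pool, image, and mount-point names are illustrative.

# Sketch: provisioning a kernel-mapped RBD device for a MySQL/MariaDB data directory.
# Pool, image, and mount-point names are illustrative; run as root on the database host.
import subprocess

def run(*cmd):
    subprocess.check_call(list(cmd))

run('rbd', 'create', 'db/mysql-data', '--size', '102400')    # 100 GB image in pool 'db'
run('rbd', 'map', 'db/mysql-data')                           # exposes /dev/rbd/db/mysql-data
run('mkfs.xfs', '/dev/rbd/db/mysql-data')                    # XFS on the kernel RBD device
run('mount', '/dev/rbd/db/mysql-data', '/var/lib/mysql')     # point the database at it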

Page 15: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service


USE CASE: HADOOP

[Diagram: POSIX access through the RHEL 7 CephFS kernel module – the Ceph File System (CephFS) – over the native protocol to the Ceph Storage Cluster (RADOS)]
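A sketch of the CephFS kernel mount this slide relies on; the monitor address, client name, secret file, and mount point are placeholders.

# Sketch: mounting CephFS with the RHEL 7 kernel client so Hadoop jobs see a POSIX filesystem.
# Monitor address, client name, secret file, and mount point are placeholders.
import subprocess

def run(*cmd):
    subprocess.check_call(list(cmd))

run('mkdir', '-p', '/mnt/cephfs')
run('mount', '-t', 'ceph', '10.0.1.11:6789:/', '/mnt/cephfs',
    '-o', 'name=admin,secretfile=/etc/ceph/admin.secret')

# Hadoop (or any POSIX application) can now read and write under /mnt/cephfs.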

Page 16: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

• Tradeoff between cost and reliability (use-case dependent)

• Use the CRUSH configuration to map out your failure domains and performance pools

• Failure domains
  – Disk (OSD and OS)
  – SSD journals
  – Node
  – Rack
  – Site (replication at the RADOS level, block replication, consider latencies)

• Storage pools (see the CRUSH rule sketch below)
  – SSD pool for higher performance
  – Capacity pool

• Plan for failure domains of the monitor nodes

• Consider failure and replacement scenarios, lowered redundancy, and performance impacts

Architectural considerations – Redundancy and replication considerations
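As a sketch of the SSD pool vs. capacity pool point above, the commands below (wrapped in Python) create two simple CRUSH rules rooted in different parts of the CRUSH hierarchy and bind a pool to each. The root bucket names ssd and capacity are assumptions about how the CRUSH map is laid out, and crush_ruleset is the Firefly-era name of the pool setting.

# Sketch: separate CRUSH rules for an SSD performance pool and an HDD capacity pool.
# Assumes the CRUSH map already has root buckets named 'ssd' and 'capacity'.
import subprocess

def ceph(*args):
    subprocess.check_call(['ceph'] + list(args))

# Rules that replicate across hosts under each root (host = failure domain).
ceph('osd', 'crush', 'rule', 'create-simple', 'ssd-rule', 'ssd', 'host')
ceph('osd', 'crush', 'rule', 'create-simple', 'capacity-rule', 'capacity', 'host')

ceph('osd', 'pool', 'create', 'fast-pool', '512', '512', 'replicated')
ceph('osd', 'pool', 'create', 'bulk-pool', '2048', '2048', 'replicated')

# Bind each pool to its rule (look up the actual rule IDs with `ceph osd crush rule dump`).
ceph('osd', 'pool', 'set', 'fast-pool', 'crush_ruleset', '1')
ceph('osd', 'pool', 'set', 'bulk-pool', 'crush_ruleset', '2')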

Page 17: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

Server Considerations
• Storage node (see the sizing sketch below):
  – One OSD per HDD, 1–2 GB RAM, and roughly 1 GHz of CPU (core) per OSD
  – SSDs for journaling and for SSD pools (cache tiering) in Firefly
  – Erasure coding will increase usable capacity at the expense of additional compute load
  – SAS JBOD expanders for extra capacity (beware of extra latency, oversubscribed SAS lanes, and a large footprint for a failure zone)

• Monitor nodes (MON): an odd number for quorum; the service can be hosted on storage nodes for smaller deployments, but larger installations will need dedicated nodes

• Dedicated RADOS Gateway nodes for large object store deployments and for federated gateways in multi-site setups
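The rules of thumb above turn into a quick back-of-the-envelope calculation; a small sketch using the slide's own figures, with the drive count and core speed as example inputs.

# Back-of-the-envelope OSD node sizing from the rules of thumb above
# (1 OSD per HDD, 1-2 GB RAM per OSD, ~1 GHz of CPU per OSD). Inputs are examples.
def size_osd_node(hdds_per_node, ram_gb_per_osd=2, ghz_per_osd=1.0, core_ghz=2.5):
    osds = hdds_per_node                    # one OSD daemon per data drive
    ram_gb = osds * ram_gb_per_osd          # plus headroom for OS, journals, recovery
    cores = osds * ghz_per_osd / core_ghz   # CPU cores needed at the given clock speed
    return osds, ram_gb, cores

osds, ram_gb, cores = size_osd_node(hdds_per_node=12)
print(f"{osds} OSDs -> ~{ram_gb} GB RAM, ~{cores:.1f} cores (before OS overhead)")
# 12 OSDs -> ~24 GB RAM, ~4.8 cores (before OS overhead)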

Page 18: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

Networking Considerations
• Dedicated or shared network
  – Be sure to involve the networking and security teams early when designing your networking options
  – Network redundancy considerations
  – Dedicated client (public) and OSD (cluster) networks – see the ceph.conf sketch below
  – VLANs vs. dedicated switches
  – 1 Gb/s vs. 10 Gb/s vs. 40 Gb/s!

• Networking design
  – Spine and leaf
  – Multi-rack
  – Core fabric connectivity
  – WAN connectivity and latency issues for multi-site deployments
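For the dedicated client and OSD networks, the split is expressed in ceph.conf through the public_network and cluster_network options; a minimal sketch that renders such a fragment, with placeholder subnets.

# Sketch: rendering the ceph.conf [global] options that split client traffic
# (public_network) from OSD replication/recovery traffic (cluster_network).
# The subnets are placeholders for your own VLANs or dedicated switches.
import configparser
import sys

conf = configparser.ConfigParser()
conf['global'] = {
    'public_network': '10.10.1.0/24',    # client / front-side network
    'cluster_network': '10.10.2.0/24',   # OSD-to-OSD back-side network
}
conf.write(sys.stdout)   # in practice this fragment is merged into /etc/ceph/ceph.conf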

Page 19: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

Ceph additions coming to the Dell Red Hat OpenStack solution – pilot configuration

Components
• Dell PowerEdge R620/R720/R720XD servers
• Dell Networking S4810/S55 switches, 10GbE
• Red Hat Enterprise Linux OpenStack Platform
• Dell ProSupport
• Dell Professional Services
• Available with or without High Availability

Specs at a glance
• Node 1: Red Hat OpenStack Manager
• Node 2: OpenStack Controller (2 additional controllers for HA)
• Nodes 3–8: OpenStack Nova Compute
• Nodes 9–11: Ceph, 12 x 3 TB raw storage
• Network switches: Dell Networking S4810/S55
• Supports ~170–228 virtual machines

Benefits
• Rapid on-ramp to OpenStack cloud
• Scale-up, modular compute and storage blocks
• Single point of contact for solution support
• Enterprise-grade OpenStack software package

Storage bundles

Page 20: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

Example Ceph Dell Server Configurations

Type | Size | Components

Performance | 20 TB | R720XD; 24 GB DRAM; 10 x 4 TB HDD (data drives); 2 x 300 GB SSD (journal)

Capacity | 44 TB / 105 TB* | R720XD; 64 GB DRAM; 10 x 4 TB HDD (data drives); 2 x 300 GB SSD (journal) – plus MD1200 with 12 x 4 TB HDD (data drives)

Extra Capacity | 144 TB / 240 TB* | R720XD; 128 GB DRAM; 12 x 4 TB HDD (data drives) – plus MD3060e (JBOD) with 60 x 4 TB HDD (data drives)
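The Size column is consistent with simple replication arithmetic over the listed raw drive capacity; a small sketch of that calculation, with an erasure-coded variant included for comparison (not what these bundles assume).

# Usable-capacity arithmetic behind the Size column above.
# The 20 / 44 / 144 TB figures match the listed raw drive capacity with 2x replication.
def usable_replicated(raw_tb, replicas=2):
    return raw_tb / replicas

def usable_erasure_coded(raw_tb, k=4, m=2):
    return raw_tb * k / (k + m)    # e.g. 4+2 EC keeps ~67% of raw capacity

performance_raw = 10 * 4           # 10 x 4 TB HDD
capacity_raw = (10 + 12) * 4       # R720XD drives + MD1200 shelf
extra_raw = (12 + 60) * 4          # R720XD drives + MD3060e JBOD

for name, raw in [('Performance', performance_raw),
                  ('Capacity', capacity_raw),
                  ('Extra Capacity', extra_raw)]:
    print(f"{name}: {raw} TB raw -> {usable_replicated(raw):.0f} TB usable at 2x replication")
# Performance: 40 TB raw -> 20 TB usable at 2x replication
# Capacity: 88 TB raw -> 44 TB usable at 2x replication
# Extra Capacity: 288 TB raw -> 144 TB usable at 2x replication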

Page 21: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

• Dell, Red Hat, and Inktank have partnered to bring a complete enterprise-grade storage solution for RHEL-OSP + Ceph

• The joint solution provides:
  – Co-engineered and validated Reference Architecture
  – Pre-configured storage bundles optimized for performance or capacity
  – Storage enhancements to existing OpenStack bundles
  – Certification against RHEL-OSP
  – Professional Services, Support, and Training
    › Collaborative support for Dell hardware customers
    › Deployment services & tools

What Are We Doing To Enable?

Page 22: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

UAB Case Study

Page 23: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

• 900 researchers

• Data sets challenging resources

• Research data scattered everywhere

• Transferring datasets took forever and clogged shared networks

• Distributed data management reduced productivity and put data at risk

• Needed centralized repository for compliance

Overcoming a data deluge
A US university that specializes in cancer and genomic research


Page 24: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

Research Computing System (Originally)
A collection of grids, proto-clouds, tons of virtualization and DevOps

[Diagram: two HPC clusters and HPC storage connected over DDR/QDR InfiniBand; interactive services on 1 Gb Ethernet behind the University Research Network; research data scattered across laptops, thumb drives, and local servers]


Page 25: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

• Housed and managed centrally, accessible across campus network
  − File system + cluster, can grow as big as you want
  − Provisions from a massive common pool
  − 400+ TBs at less than 41¢/GB; scalable to 5 PB

• Researchers gain
  − Work with larger, more diverse data sets
  − Save workflows for new devices & analysis
  − Qualify for grants due to new levels of protection

• Demonstrating utility with applications
  − Research storage
  − Crashplan (cloud back up) on POC
  − Gitlab hosting on POC

Solution: a scale-out storage cloud
Based on OpenStack and Ceph

“We’ve made it possible for users to satisfy their own storage needs with the Dell private cloud, so that their research is not hampered by IT.”

David L. Shealy, PhDFaculty Director, Research Computing

Chairman, Dept. of Physics


Page 26: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

Research Computing System (Today)
Centralized storage cloud based on OpenStack and Ceph

[Diagram: a POC cloud services layer – a virtualized server and storage computing cloud based on OpenStack, Crowbar and Ceph – with five Ceph nodes and an OpenStack node on 10 Gb Ethernet, alongside the existing HPC clusters and HPC storage on DDR/QDR InfiniBand, behind the University Research Network]


Page 27: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

• Designed to support emerging data-intensive scientific computing paradigm
  − 12 x 16-core compute nodes
  − 1 TB RAM, 420 TBs storage
  − 36 TBs storage attached to each compute node

• Individually customized test/development/production environments
  − Direct user control over all aspects of the application environment
  − Rapid setup and teardown

• Growing set of cloud-based tools & services
  − Easily integrate shareware, open source, and commercial software

Building a research cloud
Project goals extend well beyond data management

“We envision the OpenStack-based cloud to act as the gateway to our HPC resources, not only as the purveyor of services we provide, but also enabling users to build their own cloud-based services.”

John-Paul Robinson, System Architect


Page 28: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

Research Computing System (Next Gen)
A cloud-based computing environment with high-speed access to dedicated and dynamic compute resources

[Diagram: an expanded cloud services layer – a virtualized server and storage computing cloud based on OpenStack, Crowbar and Ceph – with multiple OpenStack nodes and Ceph nodes on 10 Gb Ethernet, alongside the HPC clusters and HPC storage on DDR/QDR InfiniBand, behind the University Research Network]


Page 29: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

THANK YOU!

Page 30: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service

Contact Information
Reach Kamesh for additional information:

[email protected]

@kpemmaraju

http://www.cloudel.com

Page 31: Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service