ARCHITECTING CEPH SOLUTIONS
Brent Compton & Kyle Bader, Red Hat Storage, January 2016
CLUSTER BUILDING BLOCKS
(layered stack, top to bottom)
• WORKLOADS
• ACCESS: Ceph block & object clients
• PLATFORM: Ceph storage cluster on standard servers and media (HDD, SSD, PCIe)
• NETWORK: standard NICs and switches
CLUSTER DESIGN CONSIDERATIONS
1. Qualify need for scale-out storage
2. Design for target workload IO profile(s)
3. Choose storage access method(s)
4. Identify capacity
5. Determine fault-domain risk tolerance
6. Select data protection method
→ Target cluster architecture
TARGET CLUSTER ARCHITECTURE
Target architectures are framed along two axes:
• Cluster size: OpenStack Starter (100TB), S (500TB), M (1PB), L (2PB)
• Optimization profile: IOPS optimized, throughput optimized, cost/capacity optimized
BROAD SERVER SIZE TRENDS
Across cluster sizes (OpenStack Starter 100TB, S 500TB, M 1PB, L 2PB):
• IOPS OPTIMIZED: 2-4x PCIe/NVMe slot servers (PCIe), or 12x 2.5” SSD bay servers (SAS/SATA)
• THROUGHPUT OPTIMIZED: 12-16x 3.5” bay servers, scaling to 24-36x 3.5” bay servers at larger cluster sizes
• COST/CAPACITY OPTIMIZED: 60-72x 3.5” bay servers
BROAD SERVER CONFIGURATION TRENDS
(applies across the OpenStack Starter through L cluster sizes)

IOPS OPTIMIZED
• Ceph RBD (block)
• OSDs on all-flash media (SATA SSD or PCIe)
• High-bin, dual-socket CPU
• 2x replication with backup, or 3x replication
• Multiple OSDs per drive (if PCIe)

THROUGHPUT OPTIMIZED
• Ceph RBD (block) or RGW (object)
• OSDs on HDD media with dedicated SSD write journals (4:1 HDD-to-SSD ratio; see the sketch after this list)
• Mid-bin, dual-socket CPU (single socket adequate for servers with <=12 OSDs)
• 3x replication (RBD, or read-intensive RGW) or erasure coding (write-intensive RGW)
• High-bandwidth networking, >10Gb (for servers with >12 OSDs)

COST/CAPACITY OPTIMIZED
• Ceph RGW (object)
• OSDs on HDD media (write journals co-located on HDDs)
• Mid-bin, single-socket CPU (dual socket for servers with >12 OSDs)
• Erasure-coded data protection (vs. replication)
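The 4:1 journal ratio above simply assigns four HDD-backed OSDs to each SSD journal device. A minimal Python sketch of that mapping, using hypothetical device names (real deployments of this era would carve the SSD partitions with ceph-disk or ceph-deploy):

```python
# Sketch: assign 12 HDD-backed OSDs to SSD journal devices at a 4:1 ratio.
# Device names are hypothetical placeholders, not from the slides.
hdds = [f"/dev/sd{chr(ord('b') + i)}" for i in range(12)]  # /dev/sdb ... /dev/sdm
ssds = ["/dev/nvme0n1", "/dev/nvme1n1", "/dev/nvme2n1"]    # 12 HDDs / 4 = 3 SSDs

RATIO = 4  # HDDs per SSD journal device, per the configuration above
layout = {hdd: ssds[i // RATIO] for i, hdd in enumerate(hdds)}

for hdd, ssd in sorted(layout.items()):
    print(f"OSD data: {hdd}  journal: {ssd}")
```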
STEP 1: QUALIFY NEED FOR SCALE-OUT STORAGE
Scale-out traits to weigh against traditional scale-up storage:
• Elastic provisioning across a cluster of standardized servers and networking
• Petabyte scale (10s, 100s, or 1000s of servers per cluster), vs. data HA confined to ‘islands’ of scale-up storage servers
• Performance and capacity scaled independently; incremental rather than forklift upgrades
STEP 2: DESIGN FOR TARGET WORKLOADS
• Performance vs. ‘cheap-and-deep’?
• Performance: throughput- vs. IOPS-intensive?
• Small block vs. large block?
• Sequential vs. random IO?
• Read vs. write mix?
• Latency: absolute vs. consistency targets?
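One way to make these questions concrete is to capture each workload as a small profile worksheet. The sketch below is illustrative only; the field names and example values are assumptions, not from the deck:

```python
# Hypothetical worksheet recording the step-2 answers for one workload.
workload_profile = {
    "name": "OpenStack VM volumes",         # example workload (assumed)
    "intensity": "IOPS",                    # "IOPS" vs. "throughput"
    "block_size": "small (4-16K)",          # small vs. large block
    "io_pattern": "random",                 # sequential vs. random
    "read_write_mix": (0.7, 0.3),           # read vs. write mix
    "latency_target": {"absolute_ms": 5,    # absolute target ...
                       "p99_ms": 20},       # ... vs. consistency (tail) target
}
```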
STEP 3: CHOOSE STORAGE ACCESS METHODS
DISTRIBUTED FILE* | OBJECT | BLOCK**
(all served by a single CEPH STORAGE CLUSTER)
* Support for CephFS is not yet included in Red Hat Ceph Storage
** RBD supported with replicated data protection only
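Object and block access both sit on the same cluster. A minimal sketch using the python-rados and python-rbd bindings, assuming a running cluster, a readable /etc/ceph/ceph.conf, and an existing pool named 'rbd':

```python
import rados  # python-rados binding
import rbd    # python-rbd binding

# Connect using the cluster's config file (path assumed).
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')  # pool name assumed to exist

# Object access: write and read a named object directly.
ioctx.write_full('demo-object', b'hello ceph')
print(ioctx.read('demo-object'))

# Block access: create a 1 GiB RBD image in the same pool.
rbd.RBD().create(ioctx, 'demo-image', 1 * 1024**3)

ioctx.close()
cluster.shutdown()
```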
STEP 4: IDENTIFY CAPACITY
Place the target usable capacity (OpenStack Starter 100TB, S 500TB, M 1PB, L 2PB) against the chosen optimization profile (IOPS, throughput, or cost/capacity optimized).
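A back-of-envelope sizing sketch for this step; the drive count, drive size, and replication factor below are assumptions for illustration, not recommendations from the deck:

```python
import math

# Hypothetical inputs: an M-size (1 PB usable) throughput-optimized cluster
# built from 24x 3.5" bay servers with 4 TB HDDs and 3x replication.
target_usable_tb = 1000          # M = 1 PB
drives_per_server = 24           # assumed server form factor
drive_size_tb = 4                # assumed HDD capacity
usable_fraction = 1 / 3          # 3x replication

usable_per_server_tb = drives_per_server * drive_size_tb * usable_fraction
servers = math.ceil(target_usable_tb / usable_per_server_tb)
print(f"~{servers} servers, {servers * drives_per_server} OSDs total")
# -> ~32 servers, 768 OSDs (before headroom for recovery; see step 5)
```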
STEP 5: DETERMINE FAULT-DOMAIN RISK TOLERANCE
How much cluster capacity can you tolerate on one node?
• With fewer nodes in the cluster, performance is more degraded during recovery: each node must devote a greater % of its compute/IO utilization to recovery operations.
• With fewer nodes in the cluster, maximum node utilization is limited: each node must contribute a greater % of its reserve capacity for backfill/recovery operations.
Guidelines:
• Minimum supported (Red Hat Ceph Storage): 3 OSD nodes per cluster
• Minimum recommended (performance cluster): 10 OSD nodes per cluster, so that 1 node represents <10% of total cluster capacity
• Minimum recommended (cost/capacity cluster): 7 OSD nodes per cluster, so that 1 node represents <15% of total cluster capacity
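These guidelines reduce to simple arithmetic on one node's share of cluster capacity. A small sketch, with thresholds taken from the slide and node counts chosen for illustration:

```python
# Each node's share of an evenly weighted cluster is 1/nodes. Thresholds
# follow the slide: ~10% per node for performance clusters, ~15% for
# cost/capacity clusters; 3 nodes is the supported floor.
for nodes in (3, 7, 10, 20):
    share = 1 / nodes
    tier = ("performance" if share <= 0.10
            else "cost/capacity" if share <= 0.15
            else "minimum supported only")
    print(f"{nodes:2d} nodes: {share:5.1%} of capacity per node -> {tier}")
```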
STEP 6: SELECT DATA PROTECTION METHOD

Replication
• Data is copied n times and spread onto different disks on different servers
• The cluster can tolerate n-1 disk failures without data loss
• 3 replicas is a popular configuration

Erasure coding (analogous to network RAID)
• Data is encoded into k data chunks plus m parity chunks and spread onto different disks on different servers
• The cluster can tolerate m disk failures without data loss
• k+m = 8+3 is a popular configuration

This decision will affect the initial cost of your cluster more than any other.
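The cost impact follows directly from usable-capacity efficiency. A quick comparison of the two popular configurations named above:

```python
def replica_efficiency(n):
    """Usable fraction of raw capacity with n-way replication."""
    return 1 / n

def ec_efficiency(k, m):
    """Usable fraction of raw capacity with k data + m parity chunks."""
    return k / (k + m)

for label, eff in [("3x replication", replica_efficiency(3)),
                   ("8+3 erasure coding", ec_efficiency(8, 3))]:
    print(f"{label}: {eff:.1%} usable, {1/eff:.2f} TB raw per usable TB")
# 3x replication:     33.3% usable, 3.00 TB raw per usable TB
# 8+3 erasure coding: 72.7% usable, 1.38 TB raw per usable TB
```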
CLUSTER DESIGN CONSIDERATIONS (RECAP)
1. Qualify need for scale-out storage
2. Design for target workload IO profile(s)
3. Choose storage access method(s)
4. Identify capacity
5. Determine fault-domain risk tolerance
6. Select data protection method
→ Target cluster architecture
RESOURCES
Ceph on Supermicro Performance & Sizing Guide
http://www.redhat.com/en/resources/red-hat-ceph-storage-clusters-supermicro-storage-servers

Ceph on Cisco UCS C3160 Whitepaper
http://www.cisco.com/c/en/us/products/collateral/servers-unified-computing/ucs-c-series-rack-servers/whitepaper-C11-735004.html

Ceph on Scalable Informatics Whitepaper
https://www.scalableinformatics.com/assets/documents/Unison-Ceph-Performance.pdf
RED HAT STORAGE TEST DRIVES
Gluster test drive: bit.ly/glustertestdrive
Ceph test drive: bit.ly/cephtestdrive