Randy Bias, Co-Founder and CTO of Cloudscaling, speaks on open storage, fault tolerance and the concept of failure "blast radius" at the Open Storage Summit, hosted by Nexenta in May 2012.
CC Attribution-NoDerivs 3.0 Unported License: usage OK, no modifications, full attribution. All unlicensed or borrowed works retain their original licenses.
Cloud Storage Futures (previously: Designing Private & Public Clouds)
May 22nd, 2012
Randy Bias, CTO & Co-founder
@randybias
Part 1:
The Two Cloud Architectures
2
A Story of Two Clouds
3
Scale-out
Enterprise
... Driven by Two App Types
4
New Elastic Apps
Existing Apps
Cloud Computing ... Disrupts
5

[Timeline, 1960 to 2020: Mainframe Computing "Big Iron" is disrupted by Enterprise Computing "Client-Server", which is in turn disrupted by Cloud Computing "Web".]
IT – Evolution of Computing Models
6

|             | Mainframe "big-iron" | Enterprise "client/server" | Cloud "scale-out" |
|-------------|----------------------|----------------------------|-------------------|
| SLA         | 99.999               | 99.9                       | Always On         |
| Scaling     | Vertical             | Horizontal                 |                   |
| Hardware    | Custom               | Enterprise                 | Commodity         |
| HA Type     | Hardware             | Software                   |                   |
| Software    | Centralized          | Decentralized              | Distributed       |
| Consumption | Centralized Service  | Shared Service             | Self-service      |
Enterprise Computing(existing apps built in silos)
7
Cloud Computing(new elastic apps)
8
Scale-out apps require elastic infrastructure
10

[Diagram: APPS and INFRA layers for traditional apps vs. elastic cloud-ready apps.]
Scale-out Cloud Technology
Scale-out Principles
• Small failure domains
• Risk acceptance vs. risk mitigation
• More boxes for throughput & redundancy
• Assume app manages complexity:
• Data replication
• Assumes infrastructure is unreliable:
• Server & data redundancy
• Geo-distribution
• Auto-scaling
11
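The "assume the app manages complexity" principle can be sketched as a quorum write: the application writes each object to several replicas itself and accepts the write when a majority succeed, instead of relying on redundant hardware. A minimal illustration; the replica names, quorum size, and transport are assumptions, not from the talk:

```python
def quorum_write(obj, replicas, write_fn, quorum=2):
    """App-level replication: try every replica, accept the write when
    at least `quorum` of them succeed. Infrastructure is assumed
    unreliable, so replica failures are tolerated, not prevented."""
    acks = 0
    for node in replicas:
        try:
            write_fn(node, obj)
            acks += 1
        except OSError:
            continue  # risk acceptance: a dead node is expected, not fatal
    return acks >= quorum

# a fake transport where one of three hypothetical replicas is down
def fake_write(node, obj):
    if node == "node-b":
        raise OSError("node-b unreachable")

print(quorum_write("blob", ["node-a", "node-b", "node-c"], fake_write))  # True
```

This is risk acceptance rather than risk mitigation: the write succeeds despite a failed node, and repair happens later instead of being prevented up front.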
What’s a failure domain?
• “Blast radius” during a failure
• What is impacted?
• Public SAN failures:
• FlexiScale SAN failure in 2007
• UOL Brazil in 2011:
• http://goo.gl/8ct9n
• There are many more
• Enterprise HA ‘pairs’ typically support BIG failure domains
12
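The "blast radius" idea can be made concrete: the fraction of the fleet impacted by one failure is roughly the size of the failure domain divided by the size of the fleet. A minimal sketch with hypothetical numbers (not figures from the talk):

```python
def blast_radius(total_nodes: int, domain_size: int) -> float:
    """Fraction of nodes impacted when a single failure domain fails."""
    return domain_size / total_nodes

fleet = 1000
# one big shared SAN behind everything: one failure takes out 100%
print(blast_radius(fleet, fleet))  # 1.0
# per-rack storage, 20 nodes per rack: one rack failure takes out 2%
print(blast_radius(fleet, 20))     # 0.02
```

This is why enterprise HA "pairs" fronting large shared storage have big blast radii, while scale-out designs keep failure domains small.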
Two Different Architectures for Two Kinds of Apps
13

[2x2 chart: vertical axis "Location of complexity" (Infra vs. App); horizontal axis "Primary scaling dimension" (Up vs. Out). The "Enterprise" Cloud sits at Infra / Up; the Elastic Scale-out Cloud sits at App / Out.]
Part 2:
Storage Architectures Are Changing
14
Two Different Storage Architectures for Two Kinds of Clouds
15

"Classic" Storage:
• Uptime in infra: every part is redundant
• Data mgmt in infra: bigger SAN/NAS/DFS

"Scale-out" Storage:
• Uptime in apps: minimal h/w redundancy
• Data mgmt in apps: smaller failure domains
Difference in Tiers
16

| Tier | $    | Purpose           | Classic                        | Scale-out                                                    |
|------|------|-------------------|--------------------------------|--------------------------------------------------------------|
| 1    | $$$$ | Mission Critical  | SAN, then NAS; 10-15K RPM; SSD | On-demand SAN (EBS); DynamoDB (AWS); variable service levels |
| 2    | $$   | Important         | NAS, then SAN; 7.2K RPM        | DAS; app / DFS to scale out                                  |
| 3    | $    | Archive & Backups | Tape; nearline 5.4K RPM        | Object storage                                               |
The Biggest Difference Is Where Data Management Resides
17
• In scale-out systems, the apps manage the data:
• Riak: scale-out distributed data store
• Hadoop + HDFS: scale-out distributed computation system
• Cassandra: scale-out distributed columnar database
Cassandra / Netflix use case
18
• 3 x Replication
• Linearly scaling performance
• 50 - 300 nodes
• > 1M writes/second
• When is this perfect?
• data size unknown
• growth unknown
• lots of elastic dynamism
Cassandra / Netflix use case
19
• DAS (‘ephemeral store’)
• Per node perf is constant
• disk
• CPU
• network
• Client write times constant
• Nothing special here
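Because per-node disk, CPU, and network throughput stay constant, aggregate throughput is simply the per-node rate times the node count; that is what "linearly scaling performance" means here. A back-of-envelope sketch (the per-node write rate is an illustrative assumption, not a number from the talk):

```python
def cluster_writes_per_sec(nodes: int, per_node: int) -> int:
    """Aggregate write throughput when each node contributes a
    constant, independent write rate (linear scaling)."""
    return nodes * per_node

PER_NODE = 3500  # hypothetical constant per-node writes/second
print(cluster_writes_per_sec(50, PER_NODE))   # 175000
print(cluster_writes_per_sec(300, PER_NODE))  # 1050000 -> past 1M writes/s
```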
Cassandra / Netflix use case
20
• On-demand & app-managed
• Cost per GB/hr: $0.006
• Cost per GB/mo: $4.14
• Includes storage, DB, storage admin, network, network admin, and so on
Part 3:
Scale-out Storage ... Now & Future
21
Only Change is Certain
22
There are a few basic approaches being taken ...
23
Scale-out SAN
24
• In-rack SAN == faster, bigger DAS w/ better stat-muxing
• Dedicated storage SW, 9K jumbo frames, SSD caches (ZIL/L2ARC)
• Max HW redundancy, but no replication
• Accept normal DAS failure rates
• Assume the app handles data replication
• Like AWS ‘ephemeral storage’
• KT architecture
• Customers didn’t “get it”:
• “Ephemeral SAN” not well understood
AWS EBS - “Block-devices-as-a-Service”
• Scale-out SAN (sort of)
• Block scheduler
• Async replication
• Some failure tolerance
• Scheduler:
• Allocates customer block devices across many failure domains
• Customers run RAID inside their VMs to increase redundancy
25
[Diagram: behind the cloud API, the EBS scheduler in the cloud control system allocates block devices (N1, N2) for VMs (VM1, VM2) across EBS clusters spread over multiple racks on the core network, with intra-rack and inter-rack cluster async replication.]
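The scheduler's job described above, placing a customer's block devices across many failure domains, can be sketched as a simple round-robin allocator. This is a hypothetical illustration of the placement idea, not AWS's actual scheduler:

```python
def place_volumes(volumes, racks):
    """Spread a customer's volumes round-robin across racks (failure
    domains), so no two volumes share a rack until every rack is used."""
    placement = {}
    for i, vol in enumerate(volumes):
        placement[vol] = racks[i % len(racks)]
    return placement

racks = ["rack-1", "rack-2", "rack-3"]
print(place_volumes(["vol-a", "vol-b"], racks))
# {'vol-a': 'rack-1', 'vol-b': 'rack-2'}
```

With the two volumes in different racks, a single rack failure loses at most one of them, and RAID inside the VM across both volumes keeps the data available.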
DAS + Big Data (Storage + Compute + DFS)
• Storage capability:
• Replication
• Disk & server failure
• Data rebalancing
• Data locality
• rack awareness
• Checksums (basic)
• Also:
• Built in computation
26
Distributed File Systems (DFS) over DAS
• Storage capability:
• Replication
• Disk & server failure
• Data rebalancing
• Checksums (w/ btrfs)
• Block devices
• Also:
• No computation
27
Ceph architecture
✴ Looks familiar, doesn’t it?
Why is DFS at the Physical Layer Dangerous for Scale-out?
28
DFS
==
DAS + Database Replication / Scaling
29
• Storage capability:
• Async/Sync Replication
• Server failure
• Checksums (sort of)
• Also:
• Std RDBMS
• SQL i/f
• Well understood
Object Storage
30
• Storage capability:
• Replication
• Disk & server failure
• Data rebalancing
• Checksums (sometimes)
• Also:
• Looks like a big web app
• Uses a DHT/CHT to ‘index’ blobs
• Very simple
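The "uses a DHT/CHT to 'index' blobs" point above can be sketched with a consistent-hash ring: node names are hashed onto a ring, each blob key is hashed, and the blob lands on the first node clockwise from its key. A minimal illustration of the technique, not any particular product's implementation:

```python
import bisect
import hashlib

def _h(s: str) -> int:
    """Hash a string to a point on the ring."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    """Minimal consistent-hash ring mapping blob keys to storage nodes."""
    def __init__(self, nodes):
        self._points = sorted((_h(n), n) for n in nodes)
        self._keys = [p for p, _ in self._points]

    def node_for(self, blob_key: str) -> str:
        # first node clockwise from the key's position, wrapping around
        i = bisect.bisect(self._keys, _h(blob_key)) % len(self._points)
        return self._points[i][1]

ring = Ring(["node-a", "node-b", "node-c"])
print(ring.node_for("photo-123.jpg"))  # deterministic: same key, same node
```

The appeal is the "very simple" property from the slide: lookup is pure hashing with no central index, and adding or removing a node only remaps the keys adjacent to it on the ring.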
Where does OpenStorage Fit?
31

| Scale-out Solution | Purpose / Tier | Virtual or Physical? | Fit                                        |
|--------------------|----------------|----------------------|--------------------------------------------|
| Scale-out SAN      | Tier-1/2       | Physical             | In-rack SAN                                |
| EBS                | Tier-1         | Physical             | EBS clusters (scale-out SAN)               |
| DAS + Big Data     | Tier-2         | Virtual              | Reliable, bit-rot resistant DAS            |
| DAS + DFS          | Tier-2         | Physical / Virtual   | Reliable, bit-rot resistant DAS (unproven) |
| DAS + DB           | Tier-2         | Virtual              | In-VM reliable DAS                         |
| Object Storage     | Tier-3         | Physical             | Reliable, bit-rot resistant DAS            |
Summarizing ZFS Value in Scale-out
• Data integrity & bit rot are problems that few solve today
• Most SAN/NAS solutions don’t ‘scale down’
• Commodity x86 servers are winning
• There are two scale-out places ZFS wins:
• Small SAN clusters
• Best DAS management
32
Summary
33
Conclusions / Speculations
• Build the right cloud
• Which means the right storage for *that* cloud
• A single cloud might support both ...
• Open storage can be used for both ...
• ... WITH the appropriate design/forethought
34
35
Q&A@randybias