
OpenStack Meetup - Swift in the Small


DESCRIPTION

Strategies for starting with a single-node OpenStack Object Storage (Swift) deployment and growing up to 4 zones.


Page 1

OpenStack Meetup

June 29, 2011

Computer History Museum

Mountain View, CA

Joe Arnold - Cloudscaling

twitter: @joearnold

blog: http://joearnold.com

Swift in the Small


- The theme of tonight is Corporate IT.
- The promise of OpenStack for Corporate IT is the ability to take advantage of:
-- all the great tooling,
-- all the great services,
-- all the compatible applications that use infrastructure cloud services as a platform.
- It gives the ability to deploy cloud infrastructure in-house.

- Tonight I’ll be covering OpenStack Object Storage, Swift -- in the small.
- A show of hands: how many have downloaded and installed Swift?

Page 2


- Swift is an Object Storage system that was designed for scale.
- This was one of the first clusters we deployed.
- It’s a petabyte of usable storage. It can serve a lot of users.
- For the spinning disks of aluminum, the bent sheet metal, the forged iron for the racks, the strands of glass, the silicon wafers, etc., a deployment like this is a great deal at between $500,000 and a million dollars.
- But not everyone needs a petabyte out of the gate.
- Even for these deployments, we have staging clusters in the range of 80-100 TB.

Page 3


- The challenge for this ‘Corporate IT’ theme is what a small-scale Object Storage (Swift) cluster would look like.
- What does it take, and what compromises are made, when scaling down something designed for large scale?

- This, for example, is a 4U, 36-drive system from ComputerLINK. ComputerLINK was nice enough to provide a demo unit for the meetup tonight.
- I’ll be powering it up in a few minutes, and if you’re interested, you can come over and we can start pulling drives and watch data get replicated around.

Page 4

[Slide diagram: data distributed across five zones]


Why is this a challenge? — Zones

- Swift is designed for large-scale deployments.
- The mechanisms for replication and data distribution are built on the concept that data is distributed across isolated failure boundaries. These isolated failure boundaries are called zones.

- Unlike RAID systems, data isn’t chopped up and distributed throughout the system.
- With Swift, whole files are distributed throughout the system. Each copy of the data resides in a different zone.

- Swift stores 3 copies of the data, so at least 4 zones are required (in case 1 zone fails).
- Preferably 5 zones (so that 2 zones can fail).

- In the big clusters, failure boundaries can be separate racks with their own networking components.
- In medium deployments, a physical node can represent a zone.

- For smaller deployments with fewer than 4 nodes, drives need to be grouped together to form pseudo-failure boundaries.
- A grouping of drives is simply declared a zone.

- Here is a scheme for starting small and growing the cluster bit-by-bit (well... terabyte-by-terabyte).

Page 5

4 Disks, 4 Zones


- For a single storage node, the minimum configuration would have 4 drives for data + 1 boot drive.
- Each disk is a zone.
- If a single drive fails, its data will be replicated to the remaining 3 drives in the system.
- The system would grow 4 disks at a time (one in each zone) until the chassis is full.
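As a concrete sketch of this layout (not from the talk; the partition power, IP, port, device names, and weights below are illustrative assumptions), the object ring for one node with 4 data drives might be built with swift-ring-builder like so:

    # Create the object ring: 2^18 partitions, 3 replicas,
    # at most one move per partition per hour (assumed values).
    swift-ring-builder object.builder create 18 3 1

    # One storage node (127.0.0.1); each data drive is its own zone.
    # Device format: z<zone>-<ip>:<port>/<device> <weight>
    swift-ring-builder object.builder add z1-127.0.0.1:6010/sdb1 100
    swift-ring-builder object.builder add z2-127.0.0.1:6010/sdc1 100
    swift-ring-builder object.builder add z3-127.0.0.1:6010/sdd1 100
    swift-ring-builder object.builder add z4-127.0.0.1:6010/sde1 100

    # Assign partitions across the four zones.
    swift-ring-builder object.builder rebalance

Growing 4 disks at a time would repeat the add lines with new device names, one per zone, followed by another rebalance.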

Page 6

[Slide diagram: two nodes; Zones 1-2 on one node, Zones 3-4 on the other]


- The strategy here is to split the zones evenly across the two nodes.

- The addition of a second node increases availability (assuming that load balancing is configured),
- but it does not create a master-slave configuration. If one of the nodes is down, ½ of your zones are unavailable.

- The good news is that if one of the nodes is down (½ of your zones), data is still accessible.
- This is because at least one of the zones holding each copy will still be up on the remaining node.

- The bad news is that there is still a 1 in 2 chance that writes will fail,
- because at least two of the three zones need to be written to for the write to be considered successful. (An object’s three replicas land in 3 of the 4 zones; only the 2 of the 4 possible zone-triples that include both surviving zones can still reach two successful writes.)
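The same ring, sketched for the two-node split (the IPs and device names are again assumed):

    # Node 1 (192.168.1.1) carries zones 1 and 2;
    # node 2 (192.168.1.2) carries zones 3 and 4.
    swift-ring-builder object.builder add z1-192.168.1.1:6010/sdb1 100
    swift-ring-builder object.builder add z2-192.168.1.1:6010/sdc1 100
    swift-ring-builder object.builder add z3-192.168.1.2:6010/sdb1 100
    swift-ring-builder object.builder add z4-192.168.1.2:6010/sdc1 100
    swift-ring-builder object.builder rebalance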

Page 7

[Slide diagram: three nodes; each holds one whole zone (1, 2, or 3) plus ⅓ of Zone 4]


- The addition of a third node further distributes the zones across the nodes.

- Something strange is going on here: we put whole zones on each node,
- but break up zone 4 into thirds and distribute it across the three nodes.
- This is done to enable smoother rebalancing when going to 4 nodes.

- Again, if a single node is down, data will be available, but there will be a 1 in 5 chance that a write would fail.
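One way to express the three-node layout in the ring, assuming each node contributes a full-weight drive for its whole zone plus a drive for zone 4 at one-third weight:

    # Node 1: all of zone 1, plus 1/3 of zone 4.
    swift-ring-builder object.builder add z1-192.168.1.1:6010/sdb1 100
    swift-ring-builder object.builder add z4-192.168.1.1:6010/sdc1 33
    # Node 2: all of zone 2, plus 1/3 of zone 4.
    swift-ring-builder object.builder add z2-192.168.1.2:6010/sdb1 100
    swift-ring-builder object.builder add z4-192.168.1.2:6010/sdc1 33
    # Node 3: all of zone 3, plus 1/3 of zone 4.
    swift-ring-builder object.builder add z3-192.168.1.3:6010/sdb1 100
    swift-ring-builder object.builder add z4-192.168.1.3:6010/sdc1 33
    swift-ring-builder object.builder rebalance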

Page 8

[Slide diagram: four nodes, one zone per node]


- The strategy of breaking up Zone 4 into thirds on 3 nodes is to make this transition to a fourth node easier.

- The cluster can be reconfigured with zone 4 entirely on that new server,
- then the remaining zones can slowly be rebalanced to fold in the newly vacated drives on their nodes.

- Now, if a single node fails, writes will be successful as at least two zones will be available.
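A sketch of that transition, assuming the new node is 192.168.1.4 and the old zone-4 thirds live on sdc1 of each existing node (the exact search-value syntax for set_weight and remove depends on the Swift version):

    # Put zone 4 entirely on the new node.
    swift-ring-builder object.builder add z4-192.168.1.4:6010/sdb1 100

    # Drain the zone-4 thirds on the original three nodes.
    swift-ring-builder object.builder set_weight z4-192.168.1.1:6010/sdc1 0
    swift-ring-builder object.builder set_weight z4-192.168.1.2:6010/sdc1 0
    swift-ring-builder object.builder set_weight z4-192.168.1.3:6010/sdc1 0
    swift-ring-builder object.builder rebalance

    # Once replication has emptied the drained devices, remove them
    # and fold the freed drives back into zones 1-3.
    swift-ring-builder object.builder remove z4-192.168.1.1:6010/sdc1
    swift-ring-builder object.builder remove z4-192.168.1.2:6010/sdc1
    swift-ring-builder object.builder remove z4-192.168.1.3:6010/sdc1
    swift-ring-builder object.builder add z1-192.168.1.1:6010/sdc1 100
    swift-ring-builder object.builder add z2-192.168.1.2:6010/sdc1 100
    swift-ring-builder object.builder add z3-192.168.1.3:6010/sdc1 100
    swift-ring-builder object.builder rebalance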

Page 9


- Why small-scale Swift?
- Using OpenStack Object Storage is a private-cloud alternative to S3, CloudFiles, etc.
- This enables private cloud builders to start out with a single machine in their own data center and scale up as their needs grow.

- Why not use RAID? Why not use a banana? :) It’s a different storage system, used for different purposes.
- Going with a private deployment of Object Storage gives something that looks and feels just like Rackspace Cloud Files.
- App developers don’t need to attach a volume to use the storage system, and assets can be served directly to end users or to a CDN.

- The bottom line is that a small deployment can transition smoothly into a larger deployment.

- The great thing about OpenStack being open-source software is that it gives us the freedom to build and design systems however we see fit.