
Page 1: Designing and Building Multi-Region Swift Deployment

Designing and Building Multi-Region Swift Deployment

KINX (www.kinx.net) CLOUD TECH GROUP Shawn Kim


Page 2: Designing and Building Multi-Region Swift Deployment

Table of Contents

Swift

Architecture Review

Node Roles & Deployment

Ring

Enhancements

Demonstration

Q&A

Page 3: Designing and Building Multi-Region Swift Deployment

Swift

Object Store project of OpenStack

Characteristics

High availability and high scalability

By keeping replicas, Swift avoids data loss.

Recommended copies >= 3

3-layered data model: Account / Container / Object

Account: namespace that groups Containers.

Container: namespace that groups Objects; the minimum unit of ACL configuration.

Object: Data.

Ring

A data structure that determines where data is located in the cluster.

Rings are built, modified, and distributed using an external tool named swift-ring-builder.

Swift server processes just read the rings.
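Since the server processes only read the rings, locating an object is a pure lookup. Below is a minimal sketch using Swift's Python ring API; it assumes Swift is installed, a built object ring already exists under /etc/swift, and the account, container, and object names are placeholders.

from swift.common.ring import Ring
# Load the serialized ring; server processes read it but never modify it.
object_ring = Ring('/etc/swift', ring_name='object')
# Map an account/container/object path to its partition and replica nodes.
partition, nodes = object_ring.get_nodes('AUTH_demo', 'photos', 'cat.jpg')
print('partition:', partition)
for node in nodes:
    # Each node dict identifies one replica: region, zone, ip, port, device.
    print(node['region'], node['zone'], node['ip'], node['port'], node['device'])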


Page 4: Designing and Building Multi-Region Swift Deployment

Swift Cont.

Swift Architecture

Composed of Server Processes + Consistency Processes.

Server Processes provide Swift services.

Proxy server

Account / Container / Object servers

Consistency Processes maintain the consistency of A/C/O data and metadata.

A/C/O Auditor

A/C/O Replicator

Account Reaper

Container / Object Updater

Object Expirer


Youngsoo Lee
Add a description of the URL.
Page 5: Designing and Building Multi-Region Swift Deployment

Architecture Review

Problem: a region outage can cause total data loss.

Alternative: multi-region Swift.

>= 3 replicas.

Dedicated Keystones with DB synchronization.

Multiple proxies with GSLB.

Figure 1. Multi-Region Architecture

Page 6: Designing and Building Multi-Region Swift Deployment

Architectural Review - Design Decisions

Replication >= 3

Proxy and Keystone are deployed on the same node.

Proxy and Keystone share the same DNS name and SSL configuration.

Easy Keystone HA configuration.

Multi-Region architecture: region-level high availability

Equally distributed replicas: #zones = #replicas; each zone holds one replica.

Each region holds at least one replica.


Page 7: Designing and Building Multi-Region Swift Deployment

Architectural Review - Design Decisions Cont.

Connection between regions

Storage nodes in different regions cannot reach each other over internal IPs.

In a multi-region environment, storage nodes communicate over public IPs, with iptables rules configured for security.

Swift Master

Master Role: Swift installation base and gateway for all nodes.

MGMT Role: Ring building, management and distribution.

Auth module: Keystone vs. SWAuth. Keystone wins!

Both: impossible! The Swift proxy server's pipeline supports only very simple functionality.

SWAuth: stores its data as Swift objects, but it cannot be integrated with other OpenStack services such as Cinder and Glance.

Keystone Auth version: v2.0, v3


Page 8: Designing and Building Multi-Region Swift Deployment

Architectural Review - Design Result

2 regions, 3 zones, 3 storage nodes with 9 volumes.

Node naming convention: r<digit>-[z<digit>]-<role><digit>

ex) r1-p1

ex) r2-z1-s1

These names can be referenced when the ring is built.

Figure 2. Deployed Multi-Region Swift: Minimum Configuration
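A tiny hypothetical helper makes the convention concrete: it parses a node name into its region, zone, and role parts. The parser itself is illustrative and not part of the deployment.

import re
# Parse the naming convention r<digit>-[z<digit>]-<role><digit>.
NODE_RE = re.compile(r'^r(?P<region>\d+)(?:-z(?P<zone>\d+))?-(?P<role>[a-z])(?P<index>\d+)$')
def parse_node(name):
    m = NODE_RE.match(name)
    if not m:
        raise ValueError('bad node name: %s' % name)
    return {k: v for k, v in m.groupdict().items() if v is not None}
print(parse_node('r1-p1'))     # {'region': '1', 'role': 'p', 'index': '1'}
print(parse_node('r2-z1-s1'))  # {'region': '2', 'zone': '1', 'role': 's', 'index': '1'}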

Page 9: Designing and Building Multi-Region Swift Deployment

Node Roles & Package Deployment

Keystone Node

python-openstackclient

mariadb-server-5.5

keystone

memcached


Proxy Node

python-openstackclient

swift

swift-proxy

python-swiftclient

python-keystonemiddleware

memcached

Storage Node

python-openstackclient

xfsprogs, rsync

swift

swift-account

swift-container

swift-object

The Keystone node is integrated with the Proxy nodes.

Page 10: Designing and Building Multi-Region Swift Deployment


Node Roles & Package Deployment

Swift Master + MGMT

python-openstackclient

swift

python-swiftclient

LMA (Logging, Monitoring & Alerting)

ex) SwiftStack: Zabbix + ELK

Page 11: Designing and Building Multi-Region Swift Deployment

Ring

Ring is an abbreviation of Modified Consistent Hashing Ring.

A Swift object is located using a hash function:

drive_id = md5(/account_x/container_y/object_z) MOD total_drives

11

Each drive's weight determines the length of its range.

All drives are placed in fair random order.

Ex) The hash of the object falls within the range of drive 4.

Figure 3. Drives and the hashing ring (OpenStack Swift 2014)*

*Joe Arnold & members of the SwiftStack team. OpenStack Swift: Using, Administering, and Developing for Swift Object Storage. O'Reilly, 2014.
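For concreteness, here is the naive placement formula above as a short Python sketch; TOTAL_DRIVES and the path components are illustrative values, not part of a real deployment.

from hashlib import md5
TOTAL_DRIVES = 4  # illustrative cluster size
def naive_drive_id(account, container, obj):
    # drive_id = md5(/account/container/object) MOD total_drives
    path = '/%s/%s/%s' % (account, container, obj)
    return int(md5(path.encode()).hexdigest(), 16) % TOTAL_DRIVES
print(naive_drive_id('account_x', 'container_y', 'object_z'))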

Page 12: Designing and Building Multi-Region Swift Deployment

Consistent Hashing Ring Cont.

Minimizes object movement when a disk is added or removed.


Ex)

Object is originally located at drive 4.

After adding drive 5, the hash value of the object belongs to drive 5's range.

Therefore, the object is moved to drive 5.

Other ranges are not modified, so no objects in those ranges are moved.

Figure 5. New drive added to a ring (OpenStack Swift 2014)**

Figure 4. Drives and the hashing ring (OpenStack Swift 2014)*

*, **Joe Arnold & members of the SwiftStack team. OpenStack Swift: Using, Administering, and Developing for Swift Object Storage. O'Reilly, 2014.
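To make the minimal-movement property concrete, here is a toy comparison (illustrative only, not Swift code): plain hash-MOD placement reshuffles most objects when a fifth drive is added, while a partitioned ring moves only the objects whose partitions were reassigned to the new drive.

from hashlib import md5
def h(key):
    # 32-bit hash of a key, as in the diagrams above.
    return int(md5(key.encode()).hexdigest(), 16) % 2**32
keys = ['object-%d' % i for i in range(10000)]
# Naive scheme: hash MOD drive count; adding drive 5 relocates ~80% of objects.
moved_mod = sum(1 for k in keys if h(k) % 4 != h(k) % 5)
# Partitioned ring: 1024 fixed partitions mapped to drives; adding drive 5
# reassigns about one fifth of the partitions and moves only those objects.
parts = 2 ** 10
old = {p: p % 4 for p in range(parts)}
new = {p: 4 if p % 5 == 4 else old[p] for p in range(parts)}
moved_ring = sum(1 for k in keys if old[h(k) % parts] != new[h(k) % parts])
print('modulo scheme moved %.0f%% of objects' % (100.0 * moved_mod / len(keys)))
print('partitioned ring moved %.0f%% of objects' % (100.0 * moved_ring / len(keys)))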

Page 13: Designing and Building Multi-Region Swift Deployment

(Real) Consistent Hashing Ring Cont.

In Swift rings, each drive is spread across many narrow ranges in fair random order.

When a new disk is added, its new ranges are inserted after arbitrary existing ranges.

See the ranges of drive 5.


Figure 6. Many ranges for each drive (OpenStack Swift 2014)*

*Joe Arnold & members of the SwiftStack team. OpenStack Swift: Using, Administering, and Developing for Swift Object Storage. O'Reilly, 2014.

Page 14: Designing and Building Multi-Region Swift Deployment

Modified Consistent Hashing Ring

Some 'modifications' applied to the Consistent Hashing Ring:

Partitions

The Swift ring is composed of fixed-range partitions.

A partition stores and indicates an object's location(s).

Partition power

Total partitions in cluster = 2^partition_power

An arbitrary integer defined at cluster creation.

Replica count

The number of replicas is defined when the ring is built.

Replica locks

While a partition is being moved to another disk, the ring and the other replica(s) are locked, making them temporarily immutable.

Data distribution mechanisms such as drive weight and unique-as-possible placement.


Page 15: Designing and Building Multi-Region Swift Deployment

Modified Consistent Hashing Ring Cont.

hash(path) = md5(path + per_cluster_suffix)

Partition power

partition_power = CEILING[log2(number_of_disks * 100)]

Once selected, the partition power is immutable.

A bigger partition power makes the ring bigger, with more indices.

Bigger rings use more memory.

partition = hash >> part_shift

part_shift = 32 - partition_power
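Putting the formulas on this slide together in Python (a sketch only: the real implementation lives in swift.common.ring, and the per-cluster suffix below is a placeholder):

import math
import struct
from hashlib import md5
number_of_disks = 9  # the 9 volumes from the design result on slide 8
partition_power = math.ceil(math.log2(number_of_disks * 100))  # CEILING[log2(900)] = 10
part_shift = 32 - partition_power
PER_CLUSTER_SUFFIX = b'changeme'  # placeholder; fixed once per cluster
def partition(path):
    # hash(path) = md5(path + per_cluster_suffix); the partition is the top
    # partition_power bits of the first 4 bytes of the digest.
    digest = md5(path.encode() + PER_CLUSTER_SUFFIX).digest()
    return struct.unpack('>I', digest[:4])[0] >> part_shift
print('total partitions:', 2 ** partition_power)  # 1024
print(partition('/account_x/container_y/object_z'))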


Page 16: Designing and Building Multi-Region Swift Deployment

Enhancements

Swift3: apply the S3 API-compatible middleware.

LMA Node: ELK and Zabbix

Enable Swift middlewares such as tempurl, domain_remap, and staticweb.

Test automation

Swift tox: test coverage is 50~80%.

After S3 API (Swift3) integration, use the Boto unit tests.

Swift deployment automation using the puppet-swift module. The Chef cookbook-openstack-object-storage was deprecated after the Mitaka release.

Performance optimization

Account and Container ring files on SSD drives

Account and Container storage on SSD drives


Page 17: Designing and Building Multi-Region Swift Deployment

Demonstration

Uploading sample data

Logs at each storage node

Trying a download after killing 2 object servers.

swift-ring-builder and the builder file.

swift-dispersion-populate and swift-dispersion-report


Page 18: Designing and Building Multi-Region Swift Deployment

Q&A


Q: After making how many replicas does Swift's PUT request return success?

A: Just one replica (but see the quorum note below).

Q: With Swift configured for 3 object servers and 3 replicas, PUT requests occasionally fail when 2 arbitrary object servers are stopped. Is this the default behavior, and can it be configured?

A: Unfortunately, an object's hash can map all of its replicas (all three, say) to the stopped object servers. In that case the object cannot be written, because none of its assigned object servers is reachable.

Q: With HA-configured proxy servers, when reading right after writing an object, is it possible for the end user to fail to reach the object?

A: No. By consulting the ring, Swift can locate the written object among the replica(s) already written and the candidate(s) not yet written.
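For context on the first answer above: in stock Swift the proxy generally waits for a majority quorum of backend responses before reporting success on a replicated write, mirroring swift.common.utils.quorum_size. A minimal sketch of that rule:

def quorum_size(replica_count):
    # Majority rule: with 3 replicas, 2 backend successes yield 201 Created.
    return replica_count // 2 + 1
for replicas in (1, 2, 3, 4, 5):
    print(replicas, 'replicas ->', quorum_size(replicas), 'successes required')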

Youngsoo Lee
Question: With 3 zones in total, writes failed when 2 of them were down. Is there a setting for this, or is it the default behavior?
Siheon Kim
The behavior observed during the seminar appears to have been caused by the ring.
Siheon Kim
object-server status:
- up: r1-z1-s1
- down: r1-z2-s1, r2-z1-s1
objects:
- sample.1: 201 Created
- sample.2: 201 Created
- sample.3: 201 Created
- sample.4: fail
Youngsoo Lee
Question: When reading immediately after a write, if a different proxy server handles the read, can the written data fail to be visible?
Youngsoo Lee
Question: How many replicas are created before a write returns success?
Siheon Kim
A write returns 201 Created as long as at least one replica succeeds. We are still confirming the replication behavior.