DESCRIPTION
"In this session, we’ll detail Red Hat Storage Server data replication strategies for both near replication (LAN) and far replication (over WAN), and explain how replication has evolved over the last few years. You’ll learn about: Past mechanisms. Near replication (client-side replication). Far replication using timestamps (xtime). Present mechanisms. Near replication (server side) built using quorum and journaling. Faster far replication using journaling. Unified replication. Replication using snapshots. Stripe replication using erasure coding."
RED HAT STORAGE SERVER
REPLICATION: PAST AND PRESENT
Jeff Darcy, Venky Shankar, Raghavan Pichai
GlusterFS/RHS Developers @ Red Hat
Talk Outline
Background
Local replication
Remote replication
Next steps
Questions
Background
Types of replication, goals, and challenges
Synchronous Replication
[Diagram: synchronous replication]
+ high consistency
- network sensitive
Quorum Enforcement
Replica #1, Replica #2, Replica #3
The majority can write; the minority can’t.
There can only be one majority, so there is no split-brain.
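To make the quorum rule concrete, here is a minimal majority check, a sketch rather than GlusterFS code; the function name and replica counts are illustrative.

```python
# Minimal illustration of quorum enforcement: a write is accepted only
# when a strict majority of replicas acknowledges it. Names are
# hypothetical, not GlusterFS internals.

def has_quorum(acks: int, replica_count: int) -> bool:
    """True when the acknowledging replicas form a strict majority."""
    return acks > replica_count // 2

# With 3 replicas, 2 acks form a majority and may write...
assert has_quorum(2, 3) is True
# ...while a partitioned minority of 1 cannot, so two conflicting
# majorities (split-brain) can never exist.
assert has_quorum(1, 3) is False
```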
Synchronous Replication Data Flows
[Diagrams: two synchronous data flows between a client and two servers, chain and fan-out]
Fan Out Replication
[Diagram: the client sends the write to each server directly]
Split bandwidth
Wait for slowest
Chain Replication
[Diagram: the client sends the write to the first server, which forwards it to the second]
Full bandwidth
Two hops
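To contrast the two flows in code, here is a rough sketch under invented names (the Server class and its write method are stand-ins, not GlusterFS transports): fan-out splits the client's bandwidth across replicas and waits for the slowest, while chain uses the client's full bandwidth to the first server at the cost of a second hop.

```python
# Conceptual contrast of the fan-out and chain write flows.
# The Server class and its methods are illustrative, not GlusterFS APIs.

from typing import List, Optional


class Server:
    def __init__(self, name: str, next_hop: Optional["Server"] = None):
        self.name = name
        self.next_hop = next_hop

    def write(self, data: bytes) -> None:
        print(f"{self.name} stored {len(data)} bytes")
        if self.next_hop is not None:
            # Chain: the server, not the client, forwards downstream,
            # so the client keeps full uplink bandwidth but pays a
            # second hop of latency.
            self.next_hop.write(data)


def fan_out(data: bytes, servers: List[Server]) -> None:
    # Fan-out: the client sends to every replica itself, splitting its
    # uplink bandwidth and waiting for the slowest acknowledgement.
    for server in servers:
        server.write(data)


# Fan-out to two independent replicas.
fan_out(b"x" * 4096, [Server("server-a"), Server("server-b")])

# Chain: server-a forwards the same write on to server-b (two hops).
Server("server-a", next_hop=Server("server-b")).write(b"x" * 4096)
```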
Asynchronous Replication
[Diagram: asynchronous replication]
+ network insensitive
- low consistency
Effect of Network Partitions
[Diagram: asynchronous replicas diverge during a network partition]
What’s the correct value?
Tradeoff Space
Synchronous (S): high consistency, network sensitive
Asynchronous (A): low consistency, network insensitive
Red Hat Storage
Synchronous Near-Replication
Raghavan P
Developer, Red Hat
Traditional replication using AFR
“Automatic File Replication”
Client-based replication
Entry, metadata, and data replication
Automated self-healing when bricks recover after failure
AFR Sequence Diagram
[Sequence diagram: Client 1 runs Lock, Pre-Op, Op, Post-Op, Unlock against Server A and Server B; Client 2’s Lock on the same file is blocked until Client 1 unlocks, then its Pre-Op proceeds]
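The lock / pre-op / op / post-op / unlock exchange above is the core of an AFR write transaction. The sketch below is a hypothetical simplification of that flow, not the actual AFR translator: pending markers set in the pre-op are cleared in the post-op only where the write succeeded, which is what later drives self-heal.

```python
# Hypothetical simplification of an AFR-style write transaction; the
# Replica class and its methods are stand-ins, not the AFR translator.

class Replica:
    def __init__(self, name):
        self.name = name
        self.pending = set()
        self.contents = {}

    def lock(self, path):
        pass   # real AFR takes locks here to serialize writers

    def unlock(self, path):
        pass

    def set_pending(self, path):
        self.pending.add(path)

    def clear_pending(self, path):
        self.pending.discard(path)

    def write(self, path, data):
        self.contents[path] = data
        return True   # report success


def afr_write(replicas, path, data):
    # 1. Lock: serialize conflicting writers on every replica.
    for r in replicas:
        r.lock(path)
    try:
        # 2. Pre-op: mark the write as pending everywhere, so a crash
        #    mid-transaction leaves evidence that self-heal can use.
        for r in replicas:
            r.set_pending(path)
        # 3. Op: do the write, remembering which replicas succeeded.
        succeeded = [r for r in replicas if r.write(path, data)]
        # 4. Post-op: clear the pending marker only where the write
        #    landed; replicas still marked pending get healed later.
        for r in succeeded:
            r.clear_pending(path)
    finally:
        # 5. Unlock: let the next (blocked) client proceed.
        for r in replicas:
            r.unlock(path)


afr_write([Replica("server-a"), Replica("server-b")], "/file", b"data")
```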
AFR improvements
In the 3.4 release: eager locking, piggybacking, server quorum
In the 3.5 release: granular self-heal
In the 3.6 release: rewrite of the code, pending counters, self-healing in the context of the self-heal daemon
NSR: new-style (a.k.a. server-side) replication
Replication runs on the back end (brick processes)
Controlled by a designated “leader”, also known as the sweeper
Advantages: client-network bandwidth use is optimized for direct (FUSE) mounts; split-brain is avoided
The sweeper is elected using the majority principle, per term
A changelog on the sweeper preserves the ordering of operations
Variable consistency models allow trading consistency for performance
NSR high-level blocks
NSR client-side translator: sends IO to the sweeper
Sweeper (leader): forwards IO to its peers; commits after all peers complete
Non-sweeper (follower): accepts IO only from the sweeper or from reconciliation; rejects IO from clients (the client retries)
Changelog
Reconciliation: uses membership to figure out which terms are missing; uses the changelogs to sync the corresponding terms
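A rough sketch of the flow described above, with invented class and method names rather than the real NSR translator interfaces: the sweeper logs each operation under the current term, forwards it to its followers, and commits once all of them complete, while a follower rejects IO that does not come from the sweeper so the client retries.

```python
# Illustrative sketch of the sweeper/follower flow; class and method
# names are hypothetical, not the actual NSR translator interfaces.

class Follower:
    def apply(self, op, from_sweeper: bool) -> bool:
        # Accept IO only from the sweeper (or from reconciliation);
        # anything else is rejected and the client is expected to retry.
        if not from_sweeper:
            return False
        # ... apply op to the local brick ...
        return True


class Sweeper:
    def __init__(self, followers, term: int):
        self.followers = followers
        self.term = term
        self.changelog = []   # per-term changelog preserves operation order

    def handle_client_io(self, op) -> bool:
        self.changelog.append((self.term, op))
        # Forward to every follower and commit only after all complete.
        return all(f.apply(op, from_sweeper=True) for f in self.followers)


sweeper = Sweeper([Follower(), Follower()], term=7)
assert sweeper.handle_client_io({"write": b"data"}) is True
```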
NSR Sequence Diagram
[Sequence diagram: Client 1 and Client 2 send requests to the sweeper, which forwards each request to the follower and completes it after the follower does]
Red Hat Storage Server
Geo-Replication
Venky Shankar
Developer, Red Hat
Geo-Replication
Asynchronous data replication
Continuous, incremental
Across geographies
One site (master) to another (slave)
Multi-slave, cascading, fan-out
Disaster Recovery
Remote Replication: Past
Overview
Single node
Change detection: crawling (xtime-based crawl)
Data synchronization: rsync
Suboptimal processing of renames, deletes, and hardlinks
Crawling and xtime
xtime: inode changed time, marked up to the root (marker xlator)
Crawling/scanning: directory crawl and file synchronization
xtime(master) > xtime(slave)
Slave xtime maintained by master
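Because xtimes are marked all the way up to the root, the crawl can skip any subtree whose master xtime is not newer than the slave's. A minimal sketch of that walk follows; the xtime lookup and sync callbacks are hypothetical stand-ins for the marker xlator and the synchronization process.

```python
# Hypothetical sketch of an xtime-based crawl. get_master_xtime,
# get_slave_xtime and sync_file are stand-ins supplied by the caller.

import os


def crawl(master_dir, get_master_xtime, get_slave_xtime, sync_file):
    # Skip the whole subtree if nothing under it changed since last sync.
    if get_master_xtime(master_dir) <= get_slave_xtime(master_dir):
        return
    for entry in os.scandir(master_dir):
        if entry.is_dir(follow_symlinks=False):
            crawl(entry.path, get_master_xtime, get_slave_xtime, sync_file)
        elif get_master_xtime(entry.path) > get_slave_xtime(entry.path):
            sync_file(entry.path)   # e.g. hand the path off to rsync
```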
Remote Replication: Present
Overview
Multi-node
Distributed (parallel) synchronization
Replica failover
Change detection: consumable journals
Data synchronization (configurable): rsync, or tar+ssh (for large numbers of small files)
Efficient processing of renames, deletes, and hardlinks
Journaling
Journaling translator (changelog): records FOPs efficiently, local to a brick; covers data, entry, and metadata changes
Change detection: O(1) relative to the number of changes
Consumer library (libgfchangelog): per brick; publish/subscribe mechanism; journals are periodically published
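To show why journal-based change detection scales with the number of changes rather than the number of files, here is a loose illustration of consuming published journal records. The record format and function are invented for the example; the real changelog format and libgfchangelog API differ.

```python
# Invented, simplified journal format for illustration only; the real
# GlusterFS changelog records and the libgfchangelog interface differ.
# Each record: "<TYPE> <target>" where TYPE is D(ata), E(ntry) or M(etadata).

def consume_changelog(records):
    """Yield (change_type, target) pairs from published journal records."""
    for record in records:
        change_type, target = record.split(" ", 1)
        yield change_type, target


journal = [
    "E /dir/newfile",    # entry change (create/rename/unlink)
    "D gfid-of-file",    # data change
    "M gfid-of-file",    # metadata change
]
for change_type, target in consume_changelog(journal):
    print(change_type, target)
```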
Remote Replication: Future
Replicating snapshots
Multi-master: vector clocks, conflict detection & resolution (sketched after this list)
libgfapi integration
Geo-replication to a Swift target
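For the multi-master direction, vector clocks are the standard way to tell whether two updates are causally ordered or concurrent (a conflict that needs resolution). A minimal comparison sketch, independent of any GlusterFS code, is below.

```python
# Minimal vector-clock comparison for multi-master conflict detection.
# Clocks map a master/site id to a counter of updates seen from it.

def dominates(a: dict, b: dict) -> bool:
    """True if clock `a` has seen at least everything `b` has."""
    return all(a.get(site, 0) >= count for site, count in b.items())


def compare(a: dict, b: dict) -> str:
    if dominates(a, b) and dominates(b, a):
        return "equal"
    if dominates(a, b):
        return "a supersedes b"
    if dominates(b, a):
        return "b supersedes a"
    return "concurrent: conflict, needs resolution"


# Each site applied an update the other has not seen yet -> conflict.
print(compare({"site1": 2, "site2": 1}, {"site1": 1, "site2": 2}))
```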
Features
Red Hat Storage Server
Replication-related Features
Jeff Darcy
Developer, Red Hat
Unified Replication
[Diagram: a leader node with a changelog replicates synchronously to a local replica and asynchronously to a remote replica, each with its own changelog]
Erasure Coding (a.k.a. “disperse”)
[Diagram: a file split into data fragments D1-D4 and parity fragments P1-P3; a lost fragment such as D2 is rebuilt from the surviving fragments]
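The disperse figure shows Reed-Solomon style coding, where any four of the seven fragments can rebuild the data. The sketch below is deliberately simpler, a single XOR parity fragment, just to make the "rebuild a missing fragment from the survivors" idea concrete; it is not the actual disperse algorithm.

```python
# Simplest-possible erasure-coding illustration: one XOR parity fragment.
# The actual disperse translator uses Reed-Solomon style coding with
# several parity fragments (e.g. 4 data + 3 parity as in the slide).

def xor_fragments(fragments):
    out = bytearray(len(fragments[0]))
    for fragment in fragments:
        for i, byte in enumerate(fragment):
            out[i] ^= byte
    return bytes(out)


data = [b"D1D1", b"D2D2", b"D3D3", b"D4D4"]   # equal-size data fragments
parity = xor_fragments(data)                   # single parity fragment

# Lose D2, then rebuild it from the surviving fragments plus parity.
survivors = [data[0], data[2], data[3], parity]
assert xor_fragments(survivors) == data[1]
```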
Also…
Volume snapshots
File snapshots
Deduplication + compression
Checksums
Tiering (a.k.a. data classification)
Tier 0: SSD, no replication
Tier 1: normal disk, sync replication
Tier 2: SMR disk, erasure coding, compression + checksums, async replication
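As a purely hypothetical illustration of such a policy (this is not GlusterFS configuration syntax), a data-classification table and placement rule might look like the following.

```python
# Hypothetical data-classification policy; illustrative only, not
# GlusterFS volume options or CLI syntax.

TIERS = {
    0: {"media": "SSD",         "replication": "none"},
    1: {"media": "normal disk", "replication": "synchronous"},
    2: {"media": "SMR disk",    "replication": "asynchronous",
        "protection": ["erasure coding", "compression", "checksums"]},
}


def classify(days_since_last_access: int) -> int:
    """Toy placement rule: hotter data lands on a lower-numbered tier."""
    if days_since_last_access < 1:
        return 0
    if days_since_last_access < 30:
        return 1
    return 2


print(TIERS[classify(90)])   # cold data -> tier 2
```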
Questions?