Transcript
Page 1: Replication Internals: Fitting Everything Together

Replication Internals

Fitting Everything Together

Page 2: Replication Internals: Fitting Everything Together

2.8, Refactored

● Architecture as of 2.8

● Unit testable; more, and faster, cpp tests

● Many changes (heartbeats, locking, future)

● Interop with 2.6

● Larger replica sets

Page 3: Replication Internals: Fitting Everything Together

Large Blocks

● Topology Manager (state machine)

● Replication Coordinator (repl facade)

● Applier (replicate/apply oplog)

● Executor (network, heartbeats, serialization)

● Commands (re-config, init, status, etc)

● External (writes, storage, query, commands)

Page 4: Replication Internals: Fitting Everything Together

Blocks

[Diagram: Applier, Topology Manager, Replication Coordinator, Oplog, CFG, CMDs, Writes, Query, Executor]

Page 5: Replication Internals: Fitting Everything Together

Blocks

[Diagram: Applier, Topology Manager, Replication Coordinator, Oplog, CFG, CMDs, Writes, Query, Executor]

Page 6: Replication Internals: Fitting Everything Together

Topology

● Maintains Authoritative State

o Heartbeat, ping, member state

o Roles and transitions

● Contains Decision Logic

● Unit Testable

● Serial Access

[Diagram: Topology Manager, CFG]
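As a rough illustration of the "state machine with serial access" idea on this slide, the sketch below models a tiny topology manager in C++. The `MemberState` values, the `TopologyManager` class and its transition table are hypothetical simplifications for illustration, not the actual server types.

```cpp
#include <set>
#include <utility>

// Hypothetical, simplified member states (the real server has more).
enum class MemberState { Startup, Secondary, Primary, Recovering };

// Toy topology manager: owns the authoritative state and the decision logic.
// It does no locking of its own; callers are assumed to serialize access.
class TopologyManager {
public:
    MemberState state() const { return _state; }

    // Changes state and returns true only if the transition is allowed.
    bool transitionTo(MemberState next) {
        static const std::set<std::pair<MemberState, MemberState>> allowed = {
            {MemberState::Startup, MemberState::Secondary},
            {MemberState::Startup, MemberState::Recovering},
            {MemberState::Secondary, MemberState::Primary},
            {MemberState::Secondary, MemberState::Recovering},
            {MemberState::Primary, MemberState::Secondary},
            {MemberState::Recovering, MemberState::Secondary},
        };
        if (allowed.count({_state, next}) == 0) return false;
        _state = next;
        return true;
    }

private:
    MemberState _state = MemberState::Startup;
};
```

Because the class has no threads, network or storage dependencies, a unit test can drive it through any sequence of transitions directly, which is the property the slide highlights.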

Page 7: Replication Internals: Fitting Everything Together

Examples

● updateConfig

● prepare*Response for commands

● getSyncSource, *

● setFollowerMode (state)

● processHeartbeat

● prepareHeartbeatResponse

Page 8: Replication Internals: Fitting Everything Together

PrepareHeartbeatResponse

Status TopologyCoordinatorImpl::prepareHeartbeatResponse(...) {
    // Check error conditions, then set response fields …
    response->setElectable(!_getMyUnelectableReason(...));
    response->setHbMsg(_getHbmsg(...));
    response->setTime(...);
    response->setOpTime(lastOpApplied);
    // Report the sync source only when one is set
    if (!_syncSource.empty()) {
        response->setSyncingTo(_syncSource);
    }

… topology_coordinator_impl.cpp:628

Page 9: Replication Internals: Fitting Everything Together

Failover Scenario

[Diagram: heartbeats among a primary (P) and two secondaries (S); labels: Health Check (rsHB), Active Primary]

Page 10: Replication Internals: Fitting Everything Together

Failover Scenario

[Diagram: heartbeats among P, S, S and P; labels: Active Primary, Failed Primary]

Page 11: Replication Internals: Fitting Everything Together

Failover Scenario

[Diagram: heartbeats among Failed, P and S; label: Health Check (rsHB)]
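The failover slides show heartbeats detecting a failed primary and a secondary taking over. A minimal sketch of that decision, under stated assumptions, might look like the following. `MemberHeartbeat`, `isUp` and `shouldSeekElection` are hypothetical names, and the rule used here (no live primary visible, strict majority reachable) leaves out priorities, oplog freshness and voting, which a real election would also weigh.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical per-member record built from heartbeat responses.
struct MemberHeartbeat {
    bool isPrimary;
    Clock::time_point lastResponse;
};

// A member counts as up if it answered a heartbeat within `timeout`.
inline bool isUp(const MemberHeartbeat& m, Clock::time_point now,
                 std::chrono::seconds timeout) {
    return now - m.lastResponse <= timeout;
}

// True when no live primary is visible and this node, counting itself,
// can see a strict majority of the replica set.
bool shouldSeekElection(const std::vector<MemberHeartbeat>& others,
                        Clock::time_point now, std::chrono::seconds timeout) {
    std::size_t visible = 1;  // this node
    for (const auto& m : others) {
        if (!isUp(m, now, timeout)) continue;
        if (m.isPrimary) return false;  // a healthy primary still exists
        ++visible;
    }
    const std::size_t setSize = others.size() + 1;
    return visible * 2 > setSize;
}
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the decision logic deterministic and easy to unit test.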

Page 12: Replication Internals: Fitting Everything Together

Blocks

[Diagram: Applier, Topology Manager, Replication Coordinator, Oplog, CFG, CMDs, Writes, Query, Executor]

Page 13: Replication Internals: Fitting Everything Together

Replication Coordinator

● Interface to other subsystems

● Uses executor to schedule

o Commands

o Elections, Initiate, Reconfig

o Role/State Changes

● Unit Testable

o With help, requires mocking out bridge for subsystems

[Diagram: Replication Coordinator]

Page 14: Replication Internals: Fitting Everything Together

Blocks

[Diagram: Applier, Replication Coordinator, Oplog, CMDs, Writes, Query, Executor, Topology Manager, CFG]

Page 15: Replication Internals: Fitting Everything Together

Examples

● process*Response for commands

● awaitReplication* (for writes or migration)

● isReplEnabled

● canAcceptWrites*

Page 16: Replication Internals: Fitting Everything Together

Accepting writes

static bool checkIsMasterForDatabase(const std::string& db, ...) {
    // Refuse the write unless this node may accept writes for the database
    if (!getReplicationCoordinator()->canAcceptWritesForDatabase(db)) {
        errorDetail->setErrCode(ErrorCodes::NotMaster);
        errorDetail->setErrMessage("Not primary while writing to " + db);
        return false;
    }
    return true;
}
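To make the check above concrete, here is a small self-contained sketch of a coordinator facade answering the "can I write?" question. `ReplicationCoordinatorSketch` and `checkCanWrite` are hypothetical stand-ins, and the rule shown (a primary accepts all writes, and the node-local `local` database is writable in any role) is a simplifying assumption rather than the server's full logic.

```cpp
#include <string>

// Hypothetical stand-in for the replication coordinator facade.
class ReplicationCoordinatorSketch {
public:
    void setPrimary(bool isPrimary) { _isPrimary = isPrimary; }

    // Assumed rule: a primary accepts writes to any database, and the
    // "local" database is writable regardless of role.
    bool canAcceptWritesForDatabase(const std::string& db) const {
        return _isPrimary || db == "local";
    }

private:
    bool _isPrimary = false;
};

// Same shape as the slide's check: fill in an error and refuse the write
// when this node may not accept it.
bool checkCanWrite(const ReplicationCoordinatorSketch& coord,
                   const std::string& db, std::string* errMsg) {
    if (!coord.canAcceptWritesForDatabase(db)) {
        *errMsg = "Not primary while writing to " + db;
        return false;
    }
    return true;
}
```

The write path only asks a yes/no question of the facade; it never inspects member state or heartbeat data itself.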

Page 17: Replication Internals: Fitting Everything Together

Blocks

[Diagram: Applier, Topology Manager, Replication Coordinator, Oplog, CFG, CMDs, Writes, Query, Executor]

Page 18: Replication Internals: Fitting Everything Together

Applier

● Reads from *upstream* oplog

● Applies operations (transformations)

● Mostly unchanged since 2.4

● Includes UpdatePosition commands

[Diagram: Applier]

Page 19: Replication Internals: Fitting Everything Together

Read + Apply Decoupled

● Background oplog reader thread (net)

● Pool of oplog applier threads (by collection)

[Diagram: Repl Source, Network, Buffer, Applier Pool, DB1, DB2, DB3, DB4, Local Oplog]
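The "pool of applier threads (by collection)" bullet can be illustrated with a sketch of how a fetched batch might be split across workers. `Op` and `partitionByCollection` are hypothetical names, and hashing the namespace is an assumed partitioning scheme. The sketch stays single-threaded and shows only the partitioning step; each resulting queue could then be handed to its own applier thread.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, minimal view of a buffered oplog operation.
struct Op {
    std::string ns;  // namespace, e.g. "test.tags"
    int seq;         // position in the fetched batch
};

// Splits a batch into one queue per applier worker, keyed by a hash of the
// namespace, so all ops for one collection land on the same worker and
// keep their original relative order.
std::vector<std::vector<Op>> partitionByCollection(const std::vector<Op>& batch,
                                                   std::size_t workers) {
    std::vector<std::vector<Op>> queues(workers);
    if (workers == 0) return queues;
    std::hash<std::string> hasher;
    for (const auto& op : batch) {
        queues[hasher(op.ns) % workers].push_back(op);
    }
    return queues;
}
```

Keeping a collection's operations on one worker means no two workers touch that collection within the batch, so per-collection ordering is preserved without extra coordination.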

Page 20: Replication Internals: Fitting Everything Together

Replication Operations

oplog entry (fields):

o = update, o2 = query

{ "ns" : "test.tags",

"op" : "u", "v" : 2, "ts": ...,

"o2" : { "_id" : 1 },

"o" : { "$set" : { "tags.4" : "e" } } }
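The entry above sets a field to an absolute value, so applying it more than once leaves the document in the same state. The sketch below shows that property with a toy in-memory collection. `UpdateEntry`, `Document`, `Collection` and `applyUpdate` are hypothetical, and the entry is flattened to a single `$set` of one string field for brevity.

```cpp
#include <map>
#include <string>

// Hypothetical flattened form of the update entry shown above.
struct UpdateEntry {
    std::string ns;     // "test.tags"
    int id;             // from o2: { "_id" : 1 }
    std::string field;  // from o: the $set path, e.g. "tags.4"
    std::string value;  // the value being set, e.g. "e"
};

using Document = std::map<std::string, std::string>;  // field path -> value
using Collection = std::map<int, Document>;           // _id -> document

// Applies a $set-style update; returns false if the target document is
// missing. Assigning an absolute value makes the operation repeatable.
bool applyUpdate(Collection& coll, const UpdateEntry& e) {
    auto it = coll.find(e.id);
    if (it == coll.end()) return false;
    it->second[e.field] = e.value;
    return true;
}
```

An operation such as "append to the array" would not have this property, which is why the entry records the resulting position and value rather than the original command.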

Page 21: Replication Internals: Fitting Everything Together

Blocks

[Diagram: Applier, Topology Manager, Replication Coordinator, Oplog, CFG, CMDs, Writes, Query, Executor]

Page 22: Replication Internals: Fitting Everything Together

Executor

● Serializes access to Topology state

● Serializes global state changes wrt db writes

● Processes network requests in IO pool

● Supports event/signal notification
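One way to picture "serializes access to Topology state" is a queue of callbacks that run one at a time. The sketch below is a toy under stated assumptions: `SerialExecutor` is a hypothetical class, any thread may call `schedule`, and exactly one thread is assumed to call `drain`. The network IO pool and event/signal support from the slide are left out.

```cpp
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Toy serial executor: callbacks run one at a time, in FIFO order, on the
// single thread that calls drain().
class SerialExecutor {
public:
    void schedule(std::function<void()> work) {
        std::lock_guard<std::mutex> lk(_mutex);
        _queue.push_back(std::move(work));
    }

    // Runs queued callbacks until the queue is empty; returns how many ran.
    int drain() {
        int ran = 0;
        for (;;) {
            std::function<void()> work;
            {
                std::lock_guard<std::mutex> lk(_mutex);
                if (_queue.empty()) return ran;
                work = std::move(_queue.front());
                _queue.pop_front();
            }
            work();  // run outside the lock so callbacks may schedule more
            ++ran;
        }
    }

private:
    std::mutex _mutex;
    std::deque<std::function<void()>> _queue;
};
```

State that is only touched from scheduled callbacks needs no lock of its own, since no two callbacks ever run at the same time.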

Page 23: Replication Internals: Fitting Everything Together

Write Request

● Sent by user

● Interpreted by command subsystem

● Checked by replication coordinator

● Executed

● Idempotent entry recorded in oplog

● ~ Replicated

● ~ Possibly verified during user write request
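The last two steps ("Replicated" and "Possibly verified during user write request") amount to asking how many members have caught up to the write. A minimal sketch of that check follows. `OpTime` as a plain integer, the `memberPositions` vector and `isReplicatedTo` are hypothetical simplifications.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an oplog position; larger means later.
using OpTime = std::uint64_t;

// True once at least `w` members (the primary included) have applied the
// write recorded at `writeOpTime`.
bool isReplicatedTo(const std::vector<OpTime>& memberPositions,
                    OpTime writeOpTime, std::size_t w) {
    std::size_t caughtUp = 0;
    for (OpTime pos : memberPositions) {
        if (pos >= writeOpTime) ++caughtUp;
    }
    return caughtUp >= w;
}
```

A caller waiting on a write would re-run a check like this whenever a member reports a new position, and return to the user once it passes or a timeout expires.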

Page 24: Replication Internals: Fitting Everything Together

Write Request

[Diagram: Applier, Replication Coordinator, Oplog, CMDs, Writes, Query, Executor, Topology Manager, CFG]

Page 25: Replication Internals: Fitting Everything Together

● Topology Manager (state machine)

● Replication Coordinator (repl facade)

● Applier (replicate/apply oplog)

● Executor (network, heartbeats, serialization)

● Commands (re-config, init, status, etc)

● External (writes, storage, query, commands)

Page 26: Replication Internals: Fitting Everything Together

Thanks

Questions?