Transcript
Page 1: MongoDB 2.8 Replication Internals: Fitting it all together

Replication Internals

Fitting Everything Together

Page 2: MongoDB 2.8 Replication Internals: Fitting it all together

2.8, Refactored

● Architecture as of 2.8

● Unit testable; more, and faster, cpp tests

● Many changes (heartbeats, locking, future)

● Interop with 2.6

● Larger replica sets

Page 3: MongoDB 2.8 Replication Internals: Fitting it all together

Large Blocks

● Topology Manager (state machine)

● Replication Coordinator (repl facade)

● Applier (replicate/apply oplog)

● Executor (network, heartbeats, serialization)

● Commands (re-config, init, status, etc)

● External (writes, storage, query, commands)

Page 4: MongoDB 2.8 Replication Internals: Fitting it all together

Blocks

[Diagram: Applier, Topology Manager (CFG), Replication Coordinator, Oplog, and Executor, with external CMDs, Writes, and Query]

Page 5: MongoDB 2.8 Replication Internals: Fitting it all together

Blocks

[Diagram: Applier, Topology Manager (CFG), Replication Coordinator, Oplog, and Executor, with external CMDs, Writes, and Query]

Page 6: MongoDB 2.8 Replication Internals: Fitting it all together

Topology

● Maintains Authoritative State

o Heartbeat, ping, member state

o Roles and transitions

● Contains Decision Logic

● Unit Testable

● Serial Access
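The "state machine" and "serial access" bullets can be illustrated with a minimal sketch. The class name, the three states, and the allowed transitions below are simplifying assumptions made for illustration, not the actual 2.8 implementation. The class holds no locks because it assumes its callers serialize access to it.

```cpp
// Hypothetical, simplified member states; a real replica set has more.
enum class MemberState { Startup, Secondary, Primary };

// Illustrative stand-in for a topology state machine.
// It is deliberately lock-free: callers must serialize access.
class TopologySketch {
public:
    MemberState state() const { return _state; }

    // Applies the transition and returns true only if it is allowed.
    bool transitionTo(MemberState next) {
        if (!isAllowed(_state, next)) return false;
        _state = next;
        return true;
    }

private:
    // Assumed transition table: a node must become a secondary
    // before it can become primary, and a primary can step down.
    static bool isAllowed(MemberState from, MemberState to) {
        switch (from) {
            case MemberState::Startup:   return to == MemberState::Secondary;
            case MemberState::Secondary: return to == MemberState::Primary;
            case MemberState::Primary:   return to == MemberState::Secondary;
        }
        return false;
    }

    MemberState _state = MemberState::Startup;
};
```

Because the decision logic is a plain object with no network or storage dependencies, it can be driven directly from a unit test.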

Page 7: MongoDB 2.8 Replication Internals: Fitting it all together

Examples

● updateConfig

● prepare*Response for commands

● getSyncSource, *

● setFollowerMode (state)

● processHeartbeat

● prepareHeartbeatResponse

Page 8: MongoDB 2.8 Replication Internals: Fitting it all together

PrepareHeartbeatResponse

Status TopologyCoordinatorImpl::prepareHeartbeatResponse(...) {

// Check error conditions, then set response fields …

response->setElectable(!_getMyUnelectableReason(...));

response->setHbMsg(_getHbmsg(...));

response->setTime(...);

response->setOpTime(lastOpApplied);

// Report the sync source only when one is set

if (!_syncSource.empty()) {

response->setSyncingTo(_syncSource);

}

… topology_coordinator_impl.cpp:628
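The response-building step above can be sketched in a self-contained form. The struct and function names here are hypothetical stand-ins, not the types used in topology_coordinator_impl.cpp, and an empty string is assumed to mean "no value".

```cpp
#include <string>

// Hypothetical response type; an empty syncingTo means "not syncing".
struct HeartbeatResponseSketch {
    bool electable = false;
    long long opTime = 0;
    std::string syncingTo;
};

// Hypothetical snapshot of the local node's view of itself.
struct NodeViewSketch {
    std::string unelectableReason;  // empty means the node is electable
    long long lastOpApplied = 0;
    std::string syncSource;         // empty means no sync source chosen
};

// Fills the response from local state, mirroring the order on the slide.
HeartbeatResponseSketch prepareResponse(const NodeViewSketch& node) {
    HeartbeatResponseSketch response;
    response.electable = node.unelectableReason.empty();
    response.opTime = node.lastOpApplied;
    if (!node.syncSource.empty()) {
        response.syncingTo = node.syncSource;
    }
    return response;
}
```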

Page 9: MongoDB 2.8 Replication Internals: Fitting it all together

Failover Scenario

[Diagram: heartbeats among a primary (P) and two secondaries (S); labels: Health Check (rsHB), Active Primary]

Page 10: MongoDB 2.8 Replication Internals: Fitting it all together

Failover Scenario

[Diagram: heartbeats among P, S, S, and P; labels: Active Primary, Failed Primary]

Page 11: MongoDB 2.8 Replication Internals: Fitting it all together

Failover Scenario

[Diagram: heartbeats among Failed, P, and S; label: Health Check (rsHB)]
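A rough sketch of the failure-detection side of this scenario follows. It assumes a member counts as down when no heartbeat has arrived within a timeout, and that a node should only consider standing for election when it can see a strict majority of the set. The type names, function names, and timeout values are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record of what heartbeats have told us about another member.
struct MemberHeartbeat {
    bool isPrimary;             // member claimed to be primary in its last reply
    long long lastHeartbeatMs;  // when the last successful heartbeat arrived
};

// A member counts as up if we heard from it within timeoutMs.
bool isUp(const MemberHeartbeat& m, long long nowMs, long long timeoutMs) {
    return nowMs - m.lastHeartbeatMs <= timeoutMs;
}

// True if no reachable member claims to be primary.
bool primaryLooksDown(const std::vector<MemberHeartbeat>& others,
                      long long nowMs, long long timeoutMs) {
    for (const MemberHeartbeat& m : others) {
        if (m.isPrimary && isUp(m, nowMs, timeoutMs)) return false;
    }
    return true;
}

// True if this node plus the reachable members form a strict majority.
bool seesMajority(const std::vector<MemberHeartbeat>& others,
                  long long nowMs, long long timeoutMs) {
    std::size_t up = 1;  // count ourselves
    for (const MemberHeartbeat& m : others) {
        if (isUp(m, nowMs, timeoutMs)) ++up;
    }
    return up * 2 > others.size() + 1;
}
```

In a three-member set, a secondary that loses the primary but still hears the other secondary sees two of three members, which is a majority.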

Page 12: MongoDB 2.8 Replication Internals: Fitting it all together

Blocks

[Diagram: Applier, Topology Manager (CFG), Replication Coordinator, Oplog, and Executor, with external CMDs, Writes, and Query]

Page 13: MongoDB 2.8 Replication Internals: Fitting it all together

Replication Coordinator

● Interface to other subsystems

● Uses executor to schedule

o Commands

o Elections, Initiate, Reconfig

o Role/State Changes

● Unit Testable

o With help: requires mocking out the bridge to other subsystems

Page 14: MongoDB 2.8 Replication Internals: Fitting it all together

Blocks

[Diagram: Applier, Replication Coordinator, Oplog, CMDs, Writes, Query, Executor, and Topology Manager (CFG)]

Page 15: MongoDB 2.8 Replication Internals: Fitting it all together

Examples

● process*Response for commands

● awaitReplication* (for writes or migration)

● isReplEnabled

● canAcceptWrites*

Page 16: MongoDB 2.8 Replication Internals: Fitting it all together

Accepting writes

static bool checkIsMasterForDatabase(const std::string& db, ...) {

if (!getReplicationCoordinator()->canAcceptWritesForDatabase(db)){

errorDetail->setErrCode(ErrorCodes::NotMaster);

errorDetail->setErrMessage("Not primary while writing to " + ns);

return false;

}

return true;

}
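The check that the write path delegates to the coordinator might look roughly like the sketch below. The struct and the rule set are assumptions made for illustration: a standalone node accepts all writes, a primary accepts all writes, and any other member accepts writes only to the unreplicated "local" database.

```cpp
#include <string>

// Hypothetical snapshot of replication state.
struct ReplStateSketch {
    bool replEnabled;  // false for a standalone node
    bool isPrimary;
};

// Assumed rules, not the exact 2.8 logic.
bool canAcceptWritesForDatabaseSketch(const ReplStateSketch& state,
                                      const std::string& db) {
    if (!state.replEnabled) return true;  // standalone: no restrictions
    if (state.isPrimary) return true;     // primary takes all writes
    return db == "local";                 // node-local data is not replicated
}
```

Keeping this decision behind one call lets the write path stay ignorant of elections and member states.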

Page 17: MongoDB 2.8 Replication Internals: Fitting it all together

Blocks

[Diagram: Applier, Topology Manager (CFG), Replication Coordinator, Oplog, and Executor, with external CMDs, Writes, and Query]

Page 18: MongoDB 2.8 Replication Internals: Fitting it all together

Applier

● Reads from *upstream* oplog

● Applies operations and transformations

● Mostly unchanged since 2.4

● Includes UpdatePosition commands


Page 19: MongoDB 2.8 Replication Internals: Fitting it all together

Read + Apply Decoupled

● Background oplog reader thread (net)

● Pool of oplog applier threads (by collection)

[Diagram: Repl Source, Network, Buffer, Applier Pool, DB1 to DB4, Local Oplog]
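One way to read "pool of applier threads (by collection)" is that a fetched batch is split so every operation on a given collection goes to the same worker, which preserves per-collection order while different collections proceed in parallel. The sketch below shows only that partitioning step, without threads. The OplogOp struct and the hash-based assignment are assumptions, not the actual scheme.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, minimal view of a fetched oplog operation.
struct OplogOp {
    std::string ns;  // namespace, e.g. "test.tags"
    int seq;         // position within the fetched batch
};

// Splits a batch into one bucket per worker, keyed by a hash of the namespace.
// Operations on the same collection keep their relative order.
// Requires workers > 0.
std::vector<std::vector<OplogOp>> partitionByCollection(
        const std::vector<OplogOp>& batch, std::size_t workers) {
    std::vector<std::vector<OplogOp>> buckets(workers);
    std::hash<std::string> hasher;
    for (const OplogOp& op : batch) {
        buckets[hasher(op.ns) % workers].push_back(op);
    }
    return buckets;
}
```

Each bucket could then be handed to one applier thread while the reader thread keeps filling the buffer from the network.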

Page 20: MongoDB 2.8 Replication Internals: Fitting it all together

Replication Operations

oplog entry (fields):

o = update, o2 = query

{ "ns" : "test.tags",

"op" : "u", "v" : 2, "ts": ...,

"o2" : { "_id" : 1 },

"o" : { "$set" : { "tags.4" : "e" } } }
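The entry above sets "tags.4" by position rather than appending to the array. A small sketch shows why a positional set is safe to replay while an append is not. The array is modeled as a plain vector of strings, and both function names are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Idempotent: sets tags[index] to value, growing the array if needed.
// Applying it twice leaves the same result as applying it once.
void applySetAt(std::vector<std::string>& tags, std::size_t index,
                const std::string& value) {
    if (tags.size() <= index) tags.resize(index + 1);
    tags[index] = value;
}

// Not idempotent: every replay appends another copy.
void applyPush(std::vector<std::string>& tags, const std::string& value) {
    tags.push_back(value);
}
```

Starting from four tags, replaying the positional set twice still yields five elements, while replaying the append twice yields six.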

Page 21: MongoDB 2.8 Replication Internals: Fitting it all together

Blocks

[Diagram: Applier, Topology Manager (CFG), Replication Coordinator, Oplog, and Executor, with external CMDs, Writes, and Query]

Page 22: MongoDB 2.8 Replication Internals: Fitting it all together

Executor

● Serializes access to Topology state

● Serializes global state changes wrt db writes

● Processes network requests in IO pool

● Supports event/signal notification
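The serialization role can be sketched as a work queue whose callbacks run one at a time, in the order they were scheduled. This is a simplified stand-in with a hypothetical class name; it omits the network IO pool and the event/signal support listed above.

```cpp
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Illustrative serial executor: many producers, one consumer.
class SerialExecutorSketch {
public:
    // May be called from any thread, including from inside a callback.
    void schedule(std::function<void()> work) {
        std::lock_guard<std::mutex> lock(_mutex);
        _queue.push_back(std::move(work));
    }

    // Called by the single executor thread. Runs queued callbacks one at a
    // time in FIFO order until the queue is empty; returns how many ran.
    int runReady() {
        int ran = 0;
        for (;;) {
            std::function<void()> work;
            {
                std::lock_guard<std::mutex> lock(_mutex);
                if (_queue.empty()) return ran;
                work = std::move(_queue.front());
                _queue.pop_front();
            }
            work();  // run outside the lock so callbacks can schedule more work
            ++ran;
        }
    }

private:
    std::mutex _mutex;
    std::deque<std::function<void()>> _queue;
};
```

Since only one callback runs at a time, state touched exclusively from callbacks needs no locking of its own, which fits the "serial access" property of the topology state.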

Page 23: MongoDB 2.8 Replication Internals: Fitting it all together

Write Request

● Sent by user

● Interpreted by command subsystem

● Checked by replication coordinator

● Executed

● Idempotent entry recorded in oplog

● ~ Replicated

● ~ Possibly verified during user write request
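The steps above can be sketched end to end. Everything here is a simplified assumption: the node is a struct with a primary flag and an in-memory oplog, oplog positions stand in for optimes, and "verified" is modeled as checking that at least w members have reached the write's position.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node with just enough state for the write path.
struct NodeSketch {
    bool isPrimary = false;
    std::vector<std::string> oplog;
};

// Returns the 1-based oplog position of the write, or 0 if it was rejected.
std::size_t handleWrite(NodeSketch& node, const std::string& idempotentEntry) {
    if (!node.isPrimary) return 0;  // the replication coordinator check
    // Executing the write against storage is omitted from this sketch.
    node.oplog.push_back(idempotentEntry);  // record the idempotent entry
    return node.oplog.size();
}

// True once at least w members (the primary included) have reached position.
bool replicatedTo(const std::vector<std::size_t>& memberPositions,
                  std::size_t position, std::size_t w) {
    std::size_t count = 0;
    for (std::size_t p : memberPositions) {
        if (p >= position) ++count;
    }
    return count >= w;
}
```

The last two steps on the slide are marked with "~" because replication happens asynchronously; the caller only waits on it when the write request asks for verification.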

Page 24: MongoDB 2.8 Replication Internals: Fitting it all together

Write Request

[Diagram: Applier, Replication Coordinator, Oplog, CMDs, Writes, Query, Executor, and Topology Manager (CFG)]

Page 25: MongoDB 2.8 Replication Internals: Fitting it all together

● Topology Manager (state machine)

● Replication Coordinator (repl facade)

● Applier (replicate/apply oplog)

● Executor (network, heartbeats, serialization)

● Commands (re-config, init, status, etc)

● External (writes, storage, query, commands)

Page 26: MongoDB 2.8 Replication Internals: Fitting it all together

Thanks

Questions?

