Smokehouse Software | Jonathan Lau | [email protected]

THE INNER WORKINGS OF AMAZON DYNAMO

Jonathan Lau, Nov 2013

The inner workings of Dynamo DB


DESCRIPTION

An introduction to the inner workings of Dynamo DB



MOTIVATION AND BIO

• Early stage companies

• Build bigger systems

• Specialize in backend systems


DISTRIBUTED VS. CENTRALIZED

• Data - Distributed: different data on each node. Centralized: one master copy.

• Replicas - Distributed: the smaller data set on each node is replicated. Centralized: the master copy is replicated into read slaves.

• Scaling - Distributed: data is sharded across the nodes by default. Centralized: sharding takes extra work.


WHAT ABOUT NOSQL?

A high performance solution != a scalable solution


DYNAMO DESIGN CONSIDERATIONS

• Distributed key value store

• Incremental scalability - Scaling one node at a time

• Decentralized design - Gossip-based protocol for membership and failure detection

• Symmetry - All the nodes have the same functionality

• Heterogeneity - The system will be deployed in an environment with huge variance in hardware and system performance.


HIGH LEVEL CONCEPT

Distribute the data across N nodes in a ring

[Figure: ring of nodes A through H; a put() or get() request for key "K", which falls in the range [C, D).]


DYNAMO’S CHALLENGES

• Data partitioning

• N-1 replicas

• High availability for writes

• Handling temporary failures

• Recovering from permanent failures

• Membership and failure detection


PARTITIONING

[Figure: nodes A, B, C and D on the ring; a request for key K falls in the range [B, C).]

• Keys are hashed with a 128-bit MD5 hash

• Consistent hashing for key partitioning

• Virtual nodes help even out the load distribution

• A request can be handled by any node on the key's preference list; the node that handles it acts as the coordinator
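The partitioning bullets above can be illustrated with a minimal sketch. It assumes a toy in-memory ring; the `Ring` class, the `md5_position` helper, the `node#i` virtual-node naming and the choice of 64 virtual nodes per physical node are all made up for illustration and are not taken from Dynamo itself.

```python
import bisect
import hashlib


def md5_position(key: str) -> int:
    """Map a string to a position on the 128-bit MD5 ring."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class Ring:
    """Toy consistent-hash ring; each physical node owns several virtual nodes."""

    def __init__(self, nodes, vnodes_per_node=64):
        # Sorted (position, physical node) pairs, one pair per virtual node.
        self._tokens = sorted(
            (md5_position(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes_per_node)
        )
        self._positions = [pos for pos, _ in self._tokens]

    def owner(self, key: str) -> str:
        """Return the node that owns the first token clockwise from the key."""
        idx = bisect.bisect_right(self._positions, md5_position(key))
        return self._tokens[idx % len(self._tokens)][1]  # wrap around the ring


ring = Ring(["A", "B", "C", "D"])
counts = {}
for i in range(10_000):
    node = ring.owner(f"key-{i}")
    counts[node] = counts.get(node, 0) + 1
print(counts)  # keys per physical node
```

Lowering `vnodes_per_node` to 1 in this sketch is a quick way to see how uneven the split becomes without virtual nodes.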


REPLICATION

• Each key is replicated on the N-1 successor nodes of its coordinator

• The nodes holding the replicas and the coordinator node together form the preference list.
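As a rough sketch of how a preference list could be derived, the function below walks the ring clockwise from the key and collects the first N distinct physical nodes. The `preference_list` name, the token layout and the 8 virtual nodes per node are assumptions for illustration only.

```python
import bisect
import hashlib


def md5_position(key: str) -> int:
    """Map a string to a position on the 128-bit MD5 ring."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


def preference_list(tokens, key, n):
    """Return the first n distinct physical nodes clockwise from the key.

    tokens is a sorted list of (position, physical_node) pairs.
    """
    positions = [pos for pos, _ in tokens]
    start = bisect.bisect_right(positions, md5_position(key))
    chosen = []
    for step in range(len(tokens)):
        node = tokens[(start + step) % len(tokens)][1]
        if node not in chosen:  # skip further virtual nodes of a chosen node
            chosen.append(node)
        if len(chosen) == n:
            break
    return chosen


# Hypothetical ring: five physical nodes with 8 virtual nodes each.
tokens = sorted(
    (md5_position(f"{node}#{i}"), node) for node in "ABCDE" for i in range(8)
)
prefs = preference_list(tokens, "K", n=3)
print(prefs)  # first entry is the coordinator, the rest hold the N-1 replicas
```

Skipping repeated physical nodes matters once virtual nodes are in play, otherwise two "replicas" could land on the same machine.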


AVAILABLE FOR WRITES

• Accepts every write, whichever version it modified

• Tracks each modification and its base version with a vector clock

• Stores the vector clock alongside every accepted write

• Conflicts are resolved during the read operation by examining the vector clocks on the objects and reconciling divergent versions

• Consistency issues arise because of network or node failures

• The oldest entries in a vector clock are purged once the clock grows past a size threshold
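The vector clock bullets above can be made concrete with a small sketch. Clocks are modelled as plain dicts from node name to counter; the helper names (`increment`, `descends`, `merge`, `reconcile`) and the node names Sx, Sy, Sz are illustrative assumptions, and clock truncation is left out.

```python
def increment(clock, node):
    """Return a copy of the clock with node's counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def descends(a, b):
    """True if clock a has seen every update recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def merge(a, b):
    """Entry-wise maximum of two clocks, used when a client reconciles versions."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in set(a) | set(b)}


def reconcile(versions):
    """Drop every (value, clock) version that another version strictly supersedes."""
    survivors = []
    for value, clock in versions:
        dominated = any(
            descends(other, clock) and other != clock for _, other in versions
        )
        if not dominated and (value, clock) not in survivors:
            survivors.append((value, clock))
    return survivors


d1 = increment({}, "Sx")   # first write, handled by Sx
d2 = increment(d1, "Sx")   # second write, also handled by Sx
d3 = increment(d2, "Sy")   # update of d2 handled by Sy
d4 = increment(d2, "Sz")   # concurrent update of d2 handled by Sz

conflicts = reconcile([("v2", d2), ("v3", d3), ("v4", d4)])
print(conflicts)           # v2 is superseded; v3 and v4 are concurrent

d5 = increment(merge(d3, d4), "Sx")  # the reconciled version written back via Sx
```

When `reconcile` returns more than one version, the clocks alone cannot pick a winner, which is why the final merge is left to the reader of the data.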


HANDLING TEMPORARY FAILURES

• Trade-off between durability and availability

• Sloppy quorum - reads and writes are performed on the first N healthy nodes from the preference list, which may not be the first N nodes on the ring.

• Hinted handoff - when a node that should receive a write is down, another node accepts the write on its behalf. The stored write carries a hint naming the intended recipient, so it can be delivered once that node recovers.
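A minimal sketch of sloppy quorum with hinted handoff follows, assuming an in-memory `store` dict and a precomputed clockwise node order for the key. The function names `sloppy_write` and `hand_back` and the record layout are hypothetical.

```python
def sloppy_write(key, value, ring_order, healthy, n, store):
    """Write to the first n healthy nodes, attaching hints for skipped nodes.

    ring_order: nodes in clockwise order starting at the key's position.
    healthy:    set of currently reachable nodes.
    store:      dict node -> list of (key, value, hint) records, updated in place.
    """
    intended = ring_order[:n]  # the replicas the key would normally live on
    down = [node for node in intended if node not in healthy]
    targets = [node for node in ring_order if node in healthy][:n]
    for node in targets:
        # A stand-in node records which unreachable node it is covering for.
        hint = down.pop(0) if node not in intended and down else None
        store.setdefault(node, []).append((key, value, hint))
    return targets


def hand_back(store, recovered):
    """Move hinted records to the recovered node and drop them from stand-ins."""
    delivered = store.setdefault(recovered, [])
    for node in list(store):
        if node == recovered:
            continue
        kept = []
        for key, value, hint in store[node]:
            if hint == recovered:
                delivered.append((key, value, None))
            else:
                kept.append((key, value, hint))
        store[node] = kept


store = {}
ring_order = ["A", "B", "C", "D", "E"]
targets = sloppy_write("K", "v1", ring_order, healthy={"B", "C", "D", "E"}, n=3, store=store)
print(targets, store["D"])  # D stands in for the unreachable A

hand_back(store, "A")       # A is reachable again
print(store["A"], store["D"])
```

The write still lands on N nodes while A is down, which is the availability side of the trade-off; the hint is what lets the data find its way home afterwards.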


REPLICA SYNCHRONIZATION

• Dynamo uses a Merkle tree to track hashes of the keys

• Only the root hash is exchanged to check whether two replicas are in sync

• If a replica is found to be out of sync, the node traverses down the tree to find the exact portion that differs.
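The Merkle tree comparison can be sketched as below. This is a simplified illustration, not Dynamo's layout: it assumes keys are grouped into a fixed, power-of-two number of leaf buckets so that both replicas build trees of the same shape, and it uses SHA-256 and the helper names `build_tree` and `diff_leaves` purely as stand-ins.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(items, leaves=8):
    """Build a Merkle tree as a list of levels; levels[-1] holds the root hash.

    leaves must be a power of two. Keys are assigned to leaf buckets by hash.
    """
    buckets = [[] for _ in range(leaves)]
    for key, value in sorted(items.items()):
        buckets[h(key.encode())[0] % leaves].append(f"{key}={value}")
    level = [h("\n".join(bucket).encode()) for bucket in buckets]
    levels = [level]
    while len(level) > 1:
        # Each parent hashes the concatenation of its two children.
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def diff_leaves(a, b):
    """Walk down from the root, descending only into subtrees that differ."""
    top = len(a) - 1
    if a[top][0] == b[top][0]:
        return []  # root hashes match: replicas are in sync
    mismatched = [0]
    for depth in range(top - 1, -1, -1):
        mismatched = [
            child
            for idx in mismatched
            for child in (2 * idx, 2 * idx + 1)
            if a[depth][child] != b[depth][child]
        ]
    return mismatched


replica_a = {f"k{i}": i for i in range(20)}
replica_b = dict(replica_a)
replica_b["k7"] = 999  # one stale value on the second replica

tree_a, tree_b = build_tree(replica_a), build_tree(replica_b)
print(diff_leaves(tree_a, tree_b))  # index of the leaf bucket holding k7
```

Only the buckets returned by `diff_leaves` need to be transferred, so two replicas that agree exchange nothing beyond the root hash.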


NODE MEMBERSHIP

• Partition and placement information is propagated via a gossip protocol

• Each node is aware of the token ranges of its peers

• Seed nodes in the cluster speed up the propagation of membership and key range information around the ring

• Nodes are not really aware of each other until an actual delete happens
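To illustrate gossip-based membership with a seed node, here is a small simulation. It is a sketch under stated assumptions: each view maps a member to a hypothetical `(version, token)` pair, every node starts out knowing only itself and the seed, and one round has each node reconcile with one randomly chosen known peer.

```python
import random


def merge(view_a, view_b):
    """Combine two membership views, keeping the newer entry for each member."""
    merged = dict(view_a)
    for member, (version, token) in view_b.items():
        if version > merged.get(member, (-1, None))[0]:
            merged[member] = (version, token)
    return merged


def gossip_round(views, rng):
    """Every node reconciles its view with one randomly chosen known peer."""
    for node in list(views):
        peers = [member for member in views[node] if member != node]
        if not peers:
            continue  # knows nobody yet; waits to be contacted
        peer = rng.choice(peers)
        views[node] = views[peer] = merge(views[node], views[peer])


names = [f"n{i}" for i in range(8)]
seed = names[0]
# Hypothetical bootstrap: every node knows only itself and the seed node.
views = {
    name: {
        member: (1, index * 100)
        for index, member in enumerate(names)
        if member in (name, seed)
    }
    for name in names
}

rng = random.Random(2013)
rounds = 0
while any(view != views[seed] for view in views.values()) and rounds < 1000:
    gossip_round(views, rng)
    rounds += 1
print(rounds, sorted(views[seed]))
```

Because every node contacts the seed early on, the seed quickly holds the full member list and passes it along, which is the speed-up the seed node provides in this toy setup.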


GET() AND PUT()

What happens during a read or write request?


GET() AND PUT()

• get() and put() are routed through a generic load balancer or a partition-aware client library

• Any of the top N nodes in the preference list for key K can act as the coordinator.

• Requests go down the preference list, and unhealthy nodes are skipped over

• Two configuration parameters: R and W, where R + W > N.
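Why R + W > N matters can be checked by brute force: when the condition holds, every possible set of R readers shares at least one replica with every possible set of W writers. The function name `quorums_always_overlap` is made up for this sketch.

```python
from itertools import combinations


def quorums_always_overlap(n, r, w):
    """True if every R-subset of the N replicas intersects every W-subset."""
    replicas = range(n)
    return all(
        set(readers) & set(writers)
        for readers in combinations(replicas, r)
        for writers in combinations(replicas, w)
    )


print(quorums_always_overlap(3, 2, 2))  # R + W > N
print(quorums_always_overlap(3, 1, 1))  # R + W <= N
```

With N = 3, R = 2, W = 2 a read always touches at least one replica that saw the latest successful write; with R = 1, W = 1 it may miss it entirely.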


MORE ON GET() AND PUT()

When a write happens:

• The coordinator generates a vector clock for the new version

• It sends the new value along with the vector clock to the N highest-ranked reachable nodes

• If at least W-1 nodes respond, the write is considered successful.

When a read happens:

• The coordinator sends a read request to the N highest-ranked reachable nodes

• It waits for R nodes to respond, and then returns the result to the client
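The write and read steps above can be walked through with a toy simulation. The `Replica` class, the `put` and `get` functions and the `up` flag are assumptions for illustration; real requests would be asynchronous network calls rather than direct method calls.

```python
class Replica:
    """Toy in-memory replica; `up` simulates whether the node is reachable."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}  # key -> (value, vector clock)

    def store(self, key, value, clock):
        if not self.up:
            return False  # no acknowledgement from an unreachable node
        self.data[key] = (value, clock)
        return True


def put(coordinator, others, key, value, w):
    """Coordinator writes locally, then needs W-1 acknowledgements from the rest."""
    clock = dict(coordinator.data.get(key, (None, {}))[1])
    clock[coordinator.name] = clock.get(coordinator.name, 0) + 1
    coordinator.store(key, value, clock)
    acks = sum(1 for replica in others if replica.store(key, value, clock))
    return acks >= w - 1


def get(coordinator, others, key, r):
    """Collect replies from reachable replicas and return the first R of them."""
    replies = [
        replica.data.get(key) for replica in [coordinator, *others] if replica.up
    ]
    if len(replies) < r:
        raise RuntimeError("fewer than R replicas answered")
    return replies[:r]


a, b, c = Replica("A"), Replica("B"), Replica("C")  # N = 3
c.up = False                                        # one replica is unreachable

ok = put(a, [b, c], "cart", "book", w=2)
print(ok, get(a, [b, c], "cart", r=2))

b.up = False                                        # now only the coordinator is up
print(put(a, [b, c], "cart", "pen", w=2))
```

In this sketch a failed write still leaves the new value on the coordinator, one reason reads may later see versions that need reconciling.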


WHAT DOES IT ALL MEAN

How does all of this tie together?


WHAT DOES IT MEAN?

• Dynamo shards the data from day 1

• Replication and redundancy are baked in from day 1

• The configuration parameters W and R have a huge effect on the trade-off between availability and durability.

• W + R > N

• Resolving inconsistencies at read time allows a more controlled conflict resolution strategy


HAPPY SCALING

Read the Dynamo design paper at http://bit.ly/QeM8AC