Replication and Consistency
Reference
The Dangers of Replication and a Solution,
Jim Gray, Pat Helland, Patrick O'Neil, and Dennis Shasha.
In Proceedings of the ACM SIGMOD international conference on Management of Data, 1996
Introduction
When you have mobility, replication allows mobile nodes to read and update the database while disconnected from the network.
Eager Replication
All replicas synchronized to the same value immediately
[Figure: replicas R shown along a time axis]
Eager Replication
All replicas synchronized to the same value
Lower update performance and longer response time
[Figure: replicas R shown along a time axis]
Lazy Replication
One replica is updated by the transaction
Replicas synchronize asynchronously
Multiple versions of data
[Figure: replicas R shown along a time axis]
Example
Consider a joint checking account. Suppose that it has $1,000 in it.
The account is replicated in three places: the wife’s checkbook, the husband’s checkbook and the bank’s ledger.
Eager replication assumes that all three books have the same account balance. It prevents the husband and wife from writing checks totaling more than $1,000.
Example
Lazy replication allows both the husband and the wife to write checks totaling $1,000 each, for a total of $2,000 in withdrawals.
When these checks arrive at the bank, or when the husband and wife communicate, someone or something reconciles the transactions.
The bank does the reconciliation by rejecting updates that cause an overdraft.
Lots of time may be spent reconciling.
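The bank's reconciliation rule can be illustrated with a minimal sketch. The `Ledger` class, its `reconcile` method, and the arrival order of the checks are assumptions made for illustration; the slides only state that the bank rejects updates that cause an overdraft.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """The bank's copy of the account (hypothetical model)."""
    balance: int
    rejected: list = field(default_factory=list)

    def reconcile(self, checks):
        """Apply checks in arrival order; reject any that would overdraw."""
        for payer, amount in checks:
            if amount <= self.balance:
                self.balance -= amount
            else:
                self.rejected.append((payer, amount))


# Each spouse writes a check against a local copy that still shows $1,000.
bank = Ledger(balance=1000)
checks = [("wife", 1000), ("husband", 1000)]  # both look valid locally
bank.reconcile(checks)

print(bank.balance)   # 0
print(bank.rejected)  # [('husband', 1000)]
```

Which check is rejected depends only on arrival order here; a real reconciliation policy could be more elaborate.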
Example
The database for a checking account is a single number, and a log of updates to that number.
Databases are usually more complex. Disconnected operation and message
delays mean lazy replication has more frequent reconciliation.
Concurrency Anomaly in Lazy Replication
Which version of the data should replica R` see?
If a committed transaction turns out to be 'wrong', there is a conflict
Conflicts have to be reconciled
[Figure: diverging replica versions R, R`, R``, R``` shown along a time axis]
Scaleup pitfall
When the nodes diverge hopelessly we get system delusion: the database is inconsistent and there is no obvious way to repair it
[Figure: diverging replica versions R, R`, R``, R``` shown along a time axis]
Regulate Replica Updates
Group: Any node with a copy can update the item (update anywhere)
Master: Only a master can update the primary copy. All other replicas are read-only. All update requests are sent to the master
Replication Strategies
Propagation vs. Ownership   Lazy                               Eager
Group                       N transactions, N object owners    1 transaction, N object owners
Master                      N transactions, 1 object owner     1 transaction, 1 object owner
Two tier                    N+1 transactions, 1 object owner; tentative local update, eager base update
Eager Replication and Mobile Nodes
Reads on disconnected clients may return stale data
Simple eager replication prohibits updates if any node is disconnected
[Figure: replicas R shown along a time axis]
Eager Replication and Mobile Nodes
For high availability, eager replication systems allow updates among members of the cluster.
When a node joins a cluster, the cluster sends the new node all replica updates since the node was disconnected.
Eager Replication and Mobile Nodes
Even if all the nodes are connected all the time, updates may fail because of deadlocks, which arise from the locking that prevents serialization errors. The probability of deadlock, and consequently of failed transactions, rises very quickly with transaction size and with the number of nodes. It is estimated that a 10-fold increase in nodes gives a 1000-fold increase in failed transactions.
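The 10-fold to 1000-fold figure is consistent with failures growing as the cube of the node count. The sketch below only restates that arithmetic; the function name and the exponent of 3 are assumptions inferred from the quoted figure, not a model taken from the slides.

```python
def failed_transaction_growth(node_factor, exponent=3):
    """Growth factor in failed transactions when the node count is
    multiplied by node_factor, assuming power-law (here cubic) scaling."""
    return node_factor ** exponent


print(failed_transaction_growth(10))  # 1000
print(failed_transaction_growth(2))   # 8
```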
Lazy Replication and Mobile Nodes
With lazy group replication, we have to wait for all nodes to come online to commit
Lazy master replication cannot work for mobile nodes: a network connection is needed for a transaction to complete
Lazy Replication and Mobile Nodes
Lazy group replication allows any node to update any local data.
When the transaction commits, a transaction is sent to every other node to apply the root transaction’s updates to the replicas at the destination node.
Two nodes may race to update the same object. This must be detected and reconciled.
Lazy Replication and Mobile Nodes
Timestamps are commonly used to detect and reconcile lazy-group transactional updates.
Each object carries the timestamp of its most recent update.
Each replica update carries the new value and is tagged with the old object timestamp.
Each node detects incoming replica updates that would overwrite earlier committed updates.
The node tests if the local replica’s timestamp and the update’s old timestamp are equal.
If so, the update is safe.
Lazy Replication and Mobile Nodes
The local replica’s timestamp advances to the new transaction’s timestamp and the object value is updated.
If the current timestamp of the local replica does not match the old timestamp seen by the root transaction, then the update may be “dangerous”. The node rejects the incoming transaction and submits it for reconciliation.
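The timestamp test described on these slides can be sketched as follows. The class names, field names, integer timestamps, and the list used as a reconciliation queue are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    value: object
    timestamp: int          # timestamp of the most recent update applied


@dataclass
class ReplicaUpdate:
    old_timestamp: int      # object timestamp the root transaction saw
    new_timestamp: int      # timestamp of the root transaction
    new_value: object


def apply_update(replica, update, reconcile_queue):
    """Apply the update if it is safe; otherwise queue it for reconciliation."""
    if replica.timestamp == update.old_timestamp:
        # Safe: nothing was committed locally since the root transaction read.
        replica.value = update.new_value
        replica.timestamp = update.new_timestamp
        return True
    # "Dangerous": applying it would overwrite an earlier committed update.
    reconcile_queue.append(update)
    return False


replica = Replica(value=1000, timestamp=1)
pending = []
first = ReplicaUpdate(old_timestamp=1, new_timestamp=2, new_value=900)
second = ReplicaUpdate(old_timestamp=1, new_timestamp=3, new_value=700)  # raced with first

print(apply_update(replica, first, pending))   # True
print(apply_update(replica, second, pending))  # False
print(replica.value, replica.timestamp)        # 900 2
```

The second update read the object at timestamp 1, but the replica has since advanced to timestamp 2, so it is rejected and left in `pending`.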
Example Replication Scenario: #1
Replicated DNS servers
One primary DNS server
Multiple replicas
• DNS1.UGA.EDU 128.192.1.9
• DNS2.UGA.EDU 128.192.1.193
• DNS3.UGA.EDU 168.24.242.249
Replicas use zone transfers to get an up-to-date database from the primary server
The database is transferred every so often, so state can be inconsistent between transfers
Lazy, master replication
Example Replication #2
Palm Pilot Synchronization
The database (your address book) lives in a PIM (Outlook, say), in Palm Desktop, and on your Palm device. Updates are allowed anywhere. You could, for example, authorize your secretary to add items to your Outlook
Lazy group update
Example Replication #3
Gnutella: when you add a new song to your computer, when do the other nodes see it? Eventually
Lazy group update
Example Replication #4
Newsgroups
Everyone can post to a newsgroup. You post to comp.risks from UWO, and your friend posts at the same time from Toronto. A reader at Waterloo will see the two posts in some order (UWO first and then Toronto, or the other way around)
Lazy group replication
Example Replication #5
Distributed databases with ACID semantics
Eager master replication
Convergence Property
If no new transactions arrive and all the nodes are connected, they will all converge to the same replicated state after exchanging replica updates
Updates may be lost because of newer updates
Commutative updates – incremental transformations that can be applied in any order
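A small sketch of why commutative updates matter for convergence: incremental transformations give the same final state in every application order, while plain overwrites do not. The helper `apply_all` and the sample amounts are assumptions for illustration.

```python
from itertools import permutations


def apply_all(start, updates):
    """Apply a sequence of update functions to a starting state."""
    state = start
    for update in updates:
        state = update(state)
    return state


# Incremental (commutative) updates: deposits and withdrawals.
increments = [lambda b: b + 50, lambda b: b - 20, lambda b: b + 5]
# Overwrites (non-commutative): each sets the balance outright.
overwrites = [lambda b: 100, lambda b: 200]

inc_results = {apply_all(1000, order) for order in permutations(increments)}
ovr_results = {apply_all(1000, order) for order in permutations(overwrites)}

print(sorted(inc_results))  # [1035]
print(sorted(ovr_results))  # [100, 200]
```

With overwrites, whichever update is applied last wins, so one of the two updates is lost; with increments, every order yields the same state.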
Two-Tier Replication
Mobile nodes
Disconnected most of the time.
Store a master version and a tentative version of each item
• The master version on a disconnected or lazy replica may be outdated
• The most recent value due to local updates is maintained as a tentative value
Base nodes
Always connected.
Store a replica of the database.
Items are mastered at base nodes
Two-Tier Transaction
Base transaction
Works only on master data
Produces new master data
Tentative transaction
Works on local tentative data
Produces new tentative versions
Also produces a base transaction to be run at a later time on the base nodes
Acceptance criteria for each transaction update
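The flow of a tentative transaction and its later base transaction can be sketched as below. Everything here is a simplified assumption: a single integer item, the class and function names, and an acceptance criterion that only requires the base result to be non-negative.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Txn:
    apply: Callable[[int], int]          # the transaction's update logic
    accept: Callable[[int, int], bool]   # (tentative_result, base_result) -> ok?


class MobileNode:
    def __init__(self, master_value):
        self.master = master_value       # possibly stale master version
        self.tentative = master_value    # tentative version from local updates
        self.pending = []                # base transactions to run later

    def run_tentative(self, txn):
        self.tentative = txn.apply(self.tentative)
        self.pending.append((txn, self.tentative))


class BaseNode:
    def __init__(self, value):
        self.value = value               # master data

    def run_base(self, txn, tentative_result):
        result = txn.apply(self.value)   # re-execute against master data
        if txn.accept(tentative_result, result):
            self.value = result
            return True
        return False                     # rejected; would go to reconciliation


def reconnect(mobile, base):
    """Run the pending base transactions, then refresh the mobile copies."""
    outcomes = [base.run_base(txn, seen) for txn, seen in mobile.pending]
    mobile.pending.clear()
    mobile.master = mobile.tentative = base.value
    return outcomes


def debit(amount):
    return Txn(apply=lambda bal: bal - amount,
               accept=lambda tentative, base: base >= 0)


mobile = MobileNode(master_value=1000)   # cached while still connected
base = BaseNode(value=600)               # master changed while mobile was away
mobile.run_tentative(debit(500))         # tentative balance: 500
mobile.run_tentative(debit(400))         # tentative balance: 100
outcomes = reconnect(mobile, base)

print(outcomes)       # [True, False]
print(base.value)     # 100
print(mobile.master)  # 100
```

The second debit succeeded tentatively on the mobile node but fails its acceptance criterion when re-run against the master data, so only the first becomes durable.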
Key Properties of Two-Tier Replication Schemes
Mobile nodes may make tentative database updates
Base transactions execute with single-copy serializability so the master base system state is the result of a serializable execution
A transaction becomes durable when the base transaction completes
Replicas at all connected nodes converge to the base system state
If all transactions commute, there are no reconciliations