Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5

DISTRIBUTED SYSTEMS: Principles and Paradigms

Second Edition

ANDREW S. TANENBAUM

MAARTEN VAN STEEN

Chapter 7: Consistency and Replication

Reasons for Replication

• Data are replicated to increase the reliability of a system.

• Data are replicated for performance:
– scaling in numbers;
– scaling in geographical area.

Caveat: the gain in performance comes at the cost of the increased bandwidth needed to maintain the replicas.

Replication

Synchronous replication:
• A read operation returns the same result on every copy;
• Write operations are atomic: they are propagated to all nodes before any other operation can happen.

To reduce the cost of synchronization, the consistency constraints can be “relaxed”.

Data-centric Consistency Models

Figure 7-1. The general organization of a logical data store, physically distributed and replicated across multiple processes.

Consistency model

• A “contract” between processes and the data store.

Continuous consistency measures deviations between replicas in terms of:
• numerical value: absolute or relative;
• staleness: the last time the replica was updated;
• ordering of update operations.

Consistency unit

Conit: the unit over which inconsistency is measured.

Continuous Consistency (1)

Figure 7-2. An example of keeping track of consistency deviations [adapted from (Yu and Vahdat, 2002)].

Continuous Consistency (2)

Figure 7-3. Choosing the appropriate granularity for a conit. (a) Two updates lead to update propagation.

Continuous Consistency (3)

Figure 7-3. Choosing the appropriate granularity for a conit. (b) No update propagation is needed (yet).

Sequential Consistency (1)

Figure 7-4. Behavior of two processes operating on the same data item. The horizontal axis is time.

Sequential Consistency (2)

A data store is sequentially consistent when:

The result of any execution is the same as if the (read and write) operations by all processes on the data store …

• were executed in some sequential order and …

• the operations of each individual process appear … in this sequence in the order specified by its program.

Sequential Consistency (3)

Figure 7-5. (a) A sequentially consistent data store. (b) A data store that is not sequentially consistent.

Sequential Consistency (4)

Figure 7-6. Three concurrently-executing processes.

6! = 720 possible execution sequences.

Sequential Consistency (5)

Figure 7-7. Four valid execution sequences for the processes of Fig. 7-6. The vertical axis is time.

Sequential Consistency (6)

Examples of invalid signatures:
• 000000:
– not permitted, because statements must be executed in program order;
• 001001:
– 00: y = z = 0, so both statements of P1 were executed before P2 and P3 started;
– 10: P2 must run after P1 starts;
– 01: P3 must complete before P1 starts, which contradicts the first point.
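
The two invalid signatures above can be checked by brute force. The sketch below assumes the three processes of Fig. 7-6 are the textbook ones (P1: x ← 1; print(y, z), P2: y ← 1; print(x, z), P3: z ← 1; print(x, y)), which the slide text itself does not reproduce. The names `STATEMENTS`, `respects_program_order` and `signature` are illustrative only.

```python
from itertools import permutations

# Assumed program of Fig. 7-6: each process writes one variable, then prints the other two.
STATEMENTS = [
    ("P1", "write", "x"), ("P1", "print", ("y", "z")),
    ("P2", "write", "y"), ("P2", "print", ("x", "z")),
    ("P3", "write", "z"), ("P3", "print", ("x", "y")),
]


def respects_program_order(sequence):
    """True if every process executes its write before its print."""
    for process in ("P1", "P2", "P3"):
        kinds = [kind for (proc, kind, _) in sequence if proc == process]
        if kinds != ["write", "print"]:
            return False
    return True


def signature(sequence):
    """Concatenate the outputs of P1, P2 and P3, in that order."""
    store = {"x": 0, "y": 0, "z": 0}
    output = {}
    for process, kind, payload in sequence:
        if kind == "write":
            store[payload] = 1
        else:
            output[process] = "".join(str(store[name]) for name in payload)
    return output["P1"] + output["P2"] + output["P3"]


# Keep only the interleavings that a sequentially consistent store may produce.
valid = [seq for seq in permutations(STATEMENTS) if respects_program_order(seq)]
signatures = {signature(seq) for seq in valid}

print(len(valid))
print("000000" in signatures, "001001" in signatures)
```

Under these assumptions, 90 of the 720 permutations respect program order, and neither 000000 nor 001001 appears among the signatures they produce.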

Causal Consistency (1)

For a data store to be considered causally consistent, it is necessary that the store obeys the following condition:

Writes that are potentially causally related …
• must be seen by all processes
• in the same order.

Concurrent writes …
• may be seen in a different order
• on different machines.
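
One common way to tell causally related writes from concurrent ones is to tag each write with a vector clock. The slides do not prescribe this mechanism, so the following is only a sketch under that assumption. The function names and the clock values (one entry per process, P1 first) are hypothetical.

```python
def happened_before(a, b):
    """True if vector clock a causally precedes vector clock b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def concurrent(a, b):
    """True if neither write causally precedes the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


# Hypothetical clocks: P1 writes a, P2 reads a and then writes b, P1 writes c.
w_a = (1, 0)  # W1(x)a
w_b = (1, 1)  # W2(x)b, issued after P2 has read a
w_c = (2, 0)  # W1(x)c, issued without having seen b

print(happened_before(w_a, w_b))  # a and b are potentially causally related
print(concurrent(w_b, w_c))       # b and c are concurrent
```

All processes must see a before b, whereas b and c may be seen in different orders on different machines.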

Causal Consistency (2)

Figure 7-8. This sequence is allowed with a causally-consistent store, but not with a sequentially consistent store.

W1(x)c and W2(x)b are concurrent

Causal Consistency (3)

Figure 7-9. (a) A violation of a causally-consistent store.

Causal Consistency (4)

Figure 7-9. (b) A correct sequence of events in a causally-consistent store.

Note: this sequence is not acceptable in a sequentially consistent store.

Critical sections

Mutual exclusion, transactions

Use synchronization variables:
• Acquire when entering the critical section
• Release when leaving the critical section
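
The acquire/release pattern can be illustrated with a local, single-machine analogy. The sketch below uses Python threads and a `threading.Lock` as the synchronization variable; it is not a distributed implementation, and the names `counter` and `worker` are illustrative.

```python
import threading

counter = 0
lock = threading.Lock()  # synchronization variable guarding `counter`


def worker(iterations):
    global counter
    for _ in range(iterations):
        lock.acquire()      # acquire when entering the critical section
        try:
            counter += 1    # update the guarded shared data
        finally:
            lock.release()  # release when leaving the critical section


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(counter)
```

Because every increment happens between an acquire and a release, no update is lost.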

Solution to Critical-Section Problem

1. Mutual Exclusion - If process Pi is executing in its critical section, then no other processes can be executing in their critical sections.

2. Progress - If no process is executing in its critical section and there exist some processes that wish to enter their critical section, then the selection of the processes that will enter the critical section next cannot be postponed indefinitely.

3. Bounded Waiting - A bound must exist on the number of times that other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted.
• Assume that each process executes at a nonzero speed.
• No assumption concerning relative speed of the N processes.

Grouping Operations (1)

Necessary criteria for correct synchronization:
• An acquire access of a synchronization variable is not allowed to perform until all updates to the guarded shared data have been performed with respect to that process.

The acquire may not complete until the guarded shared data are up to date. All remote changes must be made visible.

Grouping Operations (2)

• Before an exclusive mode access to a synchronization variable by a process is allowed to perform with respect to that process, no other process may hold the synchronization variable, not even in nonexclusive mode.

Before updating shared data, a process must enter a critical section in exclusive mode, to be sure that no other process can update the data at the same time.

Grouping Operations (3)

• After an exclusive mode access to a synchronization variable has been performed, any other process’ next nonexclusive mode access to that synchronization variable may not be performed until it has performed with respect to that variable’s owner.

If a process wants to enter a critical region in nonexclusive mode, it must first check with the owner of the synchronization variable to fetch the most recent copies of the shared data.

Grouping Operations (4)

Figure 7-10. A valid event sequence for entry consistency.

Consistency and Coherence

Consistency model
• what can be expected when a set of concurrent processes operate on data (read and write operations on a set of data items).

Coherence model
• what can be expected on a single data item (sequential consistency on a single data item).

Eventual consistency

Most applications can tolerate a high degree of inconsistency.

If no updates take place for a long time, all replicas will gradually become consistent.

E.g., DNS, web pages.

• updates are guaranteed to propagate to all replicas;
• write-write conflicts are easy to resolve, assuming that only a few processes can update the data.

Client-Centric Consistency

Figure 7-11. The principle of a mobile user accessing different replicas of a distributed database.

Client-centric consistency

Consistency for a single client.

NO guarantee of consistency for concurrent accesses by different clients.

Monotonic Reads (1)

A data store is said to provide monotonic-read consistency if the following condition holds:

If a process reads the value of a data item x …
• any successive read operation on x by that process
• will always return that same value
• or a more recent value.

E.g., a distributed e-mail database.
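
A minimal sketch of how a client could enforce monotonic reads on its own, assuming each replica exposes a version number for x that only ever increases. The classes `VersionedReplica` and `ReadSession` are hypothetical and not part of any real system.

```python
class VersionedReplica:
    """A local copy of x, tagged with the version it currently holds."""

    def __init__(self, version, value):
        self.version = version
        self.value = value


class ReadSession:
    """Client-side session that refuses reads older than what it already saw."""

    def __init__(self):
        self.last_seen = 0  # highest version of x read so far

    def read(self, replica):
        if replica.version < self.last_seen:
            raise RuntimeError("replica is stale for this client")
        self.last_seen = replica.version
        return replica.value


session = ReadSession()
print(session.read(VersionedReplica(2, "inbox with 2 messages")))
try:
    session.read(VersionedReplica(1, "inbox with 1 message"))
except RuntimeError as error:
    print("rejected:", error)
```

A real store would more likely bring the stale replica up to date than reject the read; the rejection here only makes the violated condition visible.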

Monotonic Reads (2)

Figure 7-12. The read operations performed by a single process P at two different local copies of the same data store.

(a) A monotonic-read consistent data store.

Set of operations on x1 and x2.

Monotonic Reads (3)

Figure 7-12. The read operations performed by a single process P at two different local copies of the same data store.

(b) A data store that does not provide monotonic reads.

Monotonic Writes (1)

In a monotonic-write consistent store, the following condition holds:

A write operation by a process on a data item x …
• is completed before any successive write operation on x
• by the same process.

Monotonic Writes (1)

A write operation by a process on a copy of item x is performed only if that copy has been brought up to date by means of any preceding write operation, which may have taken place on other copies of x.

This resembles data-centric FIFO consistency.

E.g., update systems: to apply an update, all previous updates must already have been applied.
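
The update-system example can be sketched as a replica that buffers a client's writes until all earlier writes by that client have been applied. This assumes each client numbers its writes 0, 1, 2, …; the class `OrderedReplica` and its fields are illustrative only.

```python
class OrderedReplica:
    """Applies each client's writes strictly in the order they were issued."""

    def __init__(self):
        self.value = None
        self.next_seq = {}  # client id -> next sequence number expected
        self.pending = {}   # (client id, seq) -> value that arrived too early

    def receive(self, client, seq, value):
        expected = self.next_seq.get(client, 0)
        if seq < expected:
            return  # duplicate of a write that was already applied
        self.pending[(client, seq)] = value
        # Apply every buffered write that is now in order.
        while (client, expected) in self.pending:
            self.value = self.pending.pop((client, expected))
            expected += 1
        self.next_seq[client] = expected


replica = OrderedReplica()
replica.receive("P", 1, "update 2")  # arrives early, so it is buffered
print(replica.value)
replica.receive("P", 0, "update 1")  # unblocks both writes, in order
print(replica.value)
```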

Monotonic Writes (2)

Figure 7-13. The write operations performed by a single process P at two different local copies of the same data store. (a) A monotonic-write consistent data store.

Monotonic Writes (3)

Figure 7-13. The write operations performed by a single process P at two different local copies of the same data store. (b) A data store that does not provide monotonic-write consistency.

Read Your Writes (1)

A data store is said to provide read-your-writes consistency, if the following condition holds:

The effect of a write operation by a process on data item x …

• will always be seen by a successive read operation on x

• by the same process.
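
One way to picture this condition is a client that remembers the identifiers of its own writes and only reads from replicas that have applied all of them. This is a sketch under that assumption; `TrackedReplica`, `WriteSession` and the write identifiers are hypothetical.

```python
class TrackedReplica:
    """A local copy of x that records which writes it has applied."""

    def __init__(self):
        self.applied = set()
        self.value = None

    def apply(self, write_id, value):
        self.applied.add(write_id)
        self.value = value


class WriteSession:
    """Client-side session that tracks the client's own writes."""

    def __init__(self):
        self.write_set = set()

    def write(self, replica, write_id, value):
        replica.apply(write_id, value)
        self.write_set.add(write_id)

    def read(self, replica):
        missing = self.write_set - replica.applied
        if missing:
            raise RuntimeError("replica has not seen this client's writes")
        return replica.value


l1, l2 = TrackedReplica(), TrackedReplica()
session = WriteSession()
session.write(l1, "w1", "new password")
print(session.read(l1))          # the client sees its own write at L1
try:
    session.read(l2)             # L2 has not received w1 yet
except RuntimeError as error:
    print("rejected:", error)
```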

Read Your Writes (2)

Figure 7-14. (a) A data store that provides read-your-writes consistency.

Read Your Writes (3)

Figure 7-14. (b) A data store that does not.

Writes Follow Reads (1)

A data store is said to provide writes-follow-reads consistency, if the following holds:

A write operation by a process …
• on a data item x following a previous read operation on x by the same process …
• is guaranteed to take place on the same or a more recent value of x that was read.

Writes Follow Reads (2)

Figure 7-15. (a) A writes-follow-reads consistent data store.

Writes Follow Reads (3)

Figure 7-15. (b) A data store that does not provide writes-follow-reads consistency.

Replica-Server Placement

Replica servers
• choose the best location to place a server

Content
• choose the best server on which to place the content

Replica-Server Placement

Figure 7-16. Choosing a proper cell size for server placement.

Content Replication and Placement

Figure 7-17. The logical organization of different kinds of copies of a data store into three concentric rings.

Content Replication and Placement

E.g., web sites:
• web server clusters
• mirroring

Server-Initiated Replicas

Figure 7-18. Counting access requests from different clients.

Server-Initiated Replicas

• Access counting to decide when to replicate or delete replicas;

• Mechanisms to guarantee that at least one copy survives.
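
The two points above can be sketched as a threshold rule on the access count of a file at a server. The slides give no thresholds, so the deletion and replication thresholds, the function name and the returned labels are all assumptions made for illustration. The sketch further assumes the deletion threshold is lower than the replication threshold.

```python
def placement_decision(access_count, delete_threshold, replicate_threshold,
                       is_last_copy):
    """Decide what a server does with its copy of a file."""
    if access_count > replicate_threshold:
        return "replicate"  # popular enough to create another replica
    if access_count < delete_threshold and not is_last_copy:
        return "delete"     # rarely used, and another copy survives elsewhere
    return "keep"


print(placement_decision(500, 10, 100, is_last_copy=False))
print(placement_decision(3, 10, 100, is_last_copy=False))
print(placement_decision(3, 10, 100, is_last_copy=True))
```

The `is_last_copy` check is what guarantees that at least one copy survives, however low the access count drops.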

State versus Operations

Possibilities for what is to be propagated:

1. Propagate only a notification of an update (invalidation protocols).

2. Transfer data from one copy to another.

3. Propagate the update operation to other copies (active replication).

Pull versus Push Protocols

Push
• Server-based protocols
• Updates are propagated without the replicas asking for them
• High degree of consistency

Pull
• Client-based protocols
• Updates are propagated only when requested by those who need to use them

Pull versus Push Protocols

Figure 7-19. A comparison between push-based and pull-based protocols in the case of multiple-client, single-server systems.

Lease

Temporary push:
• A client requests automatic updates for a limited time
• Once the period has expired, pull protocols are used, or a new lease is requested
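
A minimal sketch of the server side of a lease, assuming time is passed in explicitly as a number so that the behaviour is easy to follow. The class `LeaseServer` and its methods are hypothetical names, not an existing API.

```python
class LeaseServer:
    """Tracks which clients currently hold a lease on pushed updates."""

    def __init__(self):
        self.leases = {}  # client id -> time at which the lease expires

    def grant(self, client, now, duration):
        """Grant (or renew) a lease lasting `duration` time units from `now`."""
        self.leases[client] = now + duration

    def push_targets(self, now):
        """Clients whose lease is still valid and who therefore get pushes."""
        return sorted(c for c, expiry in self.leases.items() if now < expiry)


server = LeaseServer()
server.grant("A", now=0, duration=10)
server.grant("B", now=0, duration=5)
print(server.push_targets(now=3))   # both leases are valid
print(server.push_targets(now=7))   # B's lease has expired; B must pull or renew
server.grant("B", now=7, duration=5)
print(server.push_targets(now=8))   # B is pushed to again after renewing
```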

Consistency protocols

Implementation of a consistency model.

Primary-based protocols
• Used for sequential consistency
• Each item has a primary replica
• All writes are applied and coordinated by the primary copy

Replication-based
• Operations can be carried out on any replica.

Remote-Write Protocols

Figure 7-20. The principle of a primary-backup protocol.

Local-Write Protocols

Figure 7-21. Primary-backup protocol in which the primary migrates to the process wanting to perform an update.

Replicated-write protocols

• No single primary copy
• Writes can be performed at multiple replicas
• Two types:
– Active Replication: All operations are forwarded to all replicas
– Quorum-based: Operations are forwarded to a subset of all replicas

Active replication

• All operations are propagated to all replicas
• More complex to achieve consistency
• Need to order operations
– Use Lamport timestamps
– Central sequencer
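
The need for ordering can be illustrated with a sketch in which every replica sorts the operations by Lamport timestamp, breaking ties by process identifier, before applying them. It assumes each replica has already received all operations; a real totally-ordered multicast also needs acknowledgements before delivery. The function name and the example operations are illustrative.

```python
def apply_in_total_order(operations, initial):
    """Apply (lamport_time, process_id, function) triples in one agreed order."""
    state = initial
    # Sort on (timestamp, process id) only, so that ties are broken the same
    # way at every replica.
    for _, _, operation in sorted(operations, key=lambda op: (op[0], op[1])):
        state = operation(state)
    return state


deposit = (1, 1, lambda balance: balance + 100)  # issued by process 1
double = (1, 2, lambda balance: balance * 2)     # issued by process 2

# The two replicas receive the same operations in different network orders.
replica_a = apply_in_total_order([deposit, double], initial=1000)
replica_b = apply_in_total_order([double, deposit], initial=1000)
print(replica_a, replica_b)
```

Without the sort, the first replica would end at 2200 and the second at 2100; with it, both apply the deposit first and reach the same state.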

Quorum-Based Protocols

• Operations are sent to a subset of replicas
• Maintaining consistency
• Use voting
– If a quorum (e.g., a majority) agrees, then consistency is maintained
– Write: apply the write only if a majority of replicas agree on the update
– Read: perform the read only if a majority of replicas agree on the current data value

Quorum-Based Protocols

Figure 7-22. Three examples of the voting algorithm. (a) A correct choice of read and write set. (b) A choice that may lead to write-write conflicts. (c) A correct choice, known as ROWA (read one, write all).
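
The three cases of Fig. 7-22 can be checked against the two constraints of the voting scheme as given in the textbook: N_R + N_W > N (no read-write conflicts) and N_W > N/2 (no write-write conflicts). The values of N, N_R and N_W below are those of the textbook figure, which the slide text does not reproduce, so treat them as assumptions. `quorum_ok` is an illustrative name.

```python
def quorum_ok(n, n_read, n_write):
    """Check the read and write quorum sizes against the two constraints."""
    no_read_write_conflict = n_read + n_write > n
    no_write_write_conflict = 2 * n_write > n
    return no_read_write_conflict and no_write_write_conflict


print(quorum_ok(12, 3, 10))  # (a) a correct choice
print(quorum_ok(12, 7, 6))   # (b) two writes could each gather 6 disjoint votes
print(quorum_ok(12, 1, 12))  # (c) ROWA: read one, write all
```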
