Chapter 10 Consistency And Replication

Page 1: Chapter 10

Chapter 10

Consistency And Replication

Page 2: Chapter 10

Topics
- Motivation
- Data-centric consistency models
- Client-centric consistency models
- Distribution protocols
- Consistency protocols

Page 3: Chapter 10

Motivation
Make copies of services on multiple sites to improve ...
- Reliability (by redundancy): if the primary FS crashes, a standby FS still works
- Performance: increase processing power, reduce communication delays
- Scalability: prevent overloading a single server (size scalability), avoid communication latencies (geographic scale)

However, updates become more complex: when, who, where, and how to propagate the updates?

Page 4: Chapter 10

Concurrency Control on Remote Object

a) A remote object capable of handling concurrent invocations on its own.
b) A remote object for which an object adapter is required to handle concurrent invocations.

Page 5: Chapter 10

Object Replication

a) A distributed system for replication-aware distributed objects.
b) A distributed system responsible for replica management.

Page 6: Chapter 10

Distributed Data Store

Client's point of view: the data store appears as a single store capable of holding a certain amount of data.

Page 7: Chapter 10

Distributed Data Store

Data store's point of view:

General organization of a logical data store, physically distributed and replicated across multiple processes.

Page 8: Chapter 10

Operations on a Data Store
- Read: ri(x)b means client i (process Pi) performs a read on data item x and the read returns the value b
- Write: wi(x)a means client i (process Pi) performs a write on data item x, setting it to the new value a

Operations are not instantaneous; we distinguish:
- Time of issue (when the request is sent by the client)
- Time of execution (when the request is executed at a replica)
- Time of completion (when the reply is received by the client)

Page 9: Chapter 10

Example

Page 10: Chapter 10

Consistency Models
- A consistency model defines which interleavings of operations are valid (admissible)
- There are different levels of consistency: strong (strict, tight) and weak (loose)
- A consistency model is concerned with the consistency of a data store and specifies the characteristics of valid orderings of operations
- A data store that implements a particular consistency model provides a total ordering of operations that is valid according to this model

Page 11: Chapter 10

Consistency Models
- Data-centric models: describe the consistency experienced by all clients; clients P1, P2, P3, ... see the same kind of ordering
- Client-centric models: describe the consistency seen only by clients who request it; clients P1, P2, P3 may see different kinds of orderings

Page 12: Chapter 10

Data-Centric Consistency Models
Strong ordering:
- Strict consistency
- Linear consistency
- Sequential consistency
- Causal consistency
- FIFO consistency

Weak ordering:
- Weak consistency
- Release consistency
- Entry consistency

Page 13: Chapter 10

Strict Consistency
Definition: A DDS (distributed data store) is strictly consistent if any read on a data item x of the DDS returns the value corresponding to the result of the most recent write on x, regardless of the location of the processes doing the read or write.

Analysis:
1. In a single-processor system strict consistency comes for free; it is exactly the behavior of local shared memory with atomic reads/writes.
2. However, in a distributed system it is hard to establish a global time to determine which write is "the most recent".
3. Due to message transfer delays this model is not achievable in practice.

Page 14: Chapter 10

Example

Behavior of two processes operating on the same data item:
a) A strictly consistent store.
b) A store that is not strictly consistent.

Page 15: Chapter 10

Strict Consistency Problems

Assumption: y = 0 is stored on node 2; P1 and P2 are processes on node 1 and node 2, respectively.

Due to message delays, r(y) issued at t = t2 may return 0 or 1, and r(y) issued at t = t4 may return 0, 1, or 2.

Furthermore: if y migrates to node 1 between t2 and t3, then r(y) issued at time t2 may even return the value 2 (i.e., "back to the future").

Page 16: Chapter 10

Sequential Consistency (1)
Definition: A DDS offers sequential consistency if all processes see the same order of accesses to the DDS, whereby the reads/writes of each individual process occur in program order, and the reads/writes of different processes are performed in some sequential order.

Analysis:
1. Sequential consistency is weaker than strict consistency.
2. Every valid permutation of accesses is allowed, provided all processes see the same permutation; consequently, two runs of the same distributed application may produce different results.
3. No global time ordering is required.

Page 17: Chapter 10

Example

Each process sees all writes in the same order, even though the store is not strictly consistent.

Page 18: Chapter 10

Non-Sequential Consistency

Page 19: Chapter 10

Linear Consistency
Definition: A DDS is said to be linearly consistent (linearizable) when each operation is time-stamped and the following holds: the result of each execution is the same as if the (read and write) operations by all processes on the DDS were executed in some sequential order, and the operations of each individual process appear in this sequence in the order specified by its program. In addition, if TS(OP1(x)) < TS(OP2(y)), then operation OP1(x) should precede OP2(y) in this sequence.

Page 20: Chapter 10

Assumption
Each operation is assumed to receive a timestamp from a globally available clock of only finite precision, e.g., a set of loosely synchronized local clocks.

Linear consistency is stricter than sequential consistency, i.e., a linearly consistent DDS is also sequentially consistent.

With linear consistency, not every valid interleaving of reads and writes is allowed any more; the ordering must also obey the order implied by the timestamps of the operations.

Page 21: Chapter 10

Causal Consistency (1)
Definition: A DDS provides causal consistency if the following condition holds: writes that are potentially causally related* must be seen by all processes in the same order; concurrent writes may be seen in a different order on different machines.

* If event B is caused or influenced by an earlier event A, causality requires that everyone else also sees first A, and then B.

Page 22: Chapter 10

Causal Consistency (2)
Definition: write2 is potentially dependent on write1 when there is a read between these two writes that may have influenced write2.

Corollary: if write2 is potentially dependent on write1, the only correct sequence is write1 before write2.

Page 23: Chapter 10

Causal Consistency: Example

This sequence is allowed with a causally-consistent store, but not with a sequentially or strictly consistent store.

Page 24: Chapter 10

Causal Consistency: Example

a) A violation of a causally-consistent store.
b) A correct sequence of events in a causally-consistent store.

Page 25: Chapter 10

Implementation
Implementing causal consistency requires keeping track of which processes have seen which writes: construction and maintenance of a dependency graph expressing which operations are causally related, typically using vector timestamps.
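
As a concrete illustration (added here; not part of the original slides), a minimal vector-timestamp sketch in C. The process count NPROCS and all names are assumptions:

    #include <stdbool.h>

    #define NPROCS 4  /* assumed, fixed number of processes */

    typedef struct { int clock[NPROCS]; } VectorTS;

    /* Before process pid issues a write: advance its own entry. */
    void vts_tick(VectorTS *ts, int pid) { ts->clock[pid]++; }

    /* Merge a received timestamp into the local one (element-wise max),
       done when a process reads/receives another process's write. */
    void vts_merge(VectorTS *local, const VectorTS *recv) {
        for (int i = 0; i < NPROCS; i++)
            if (recv->clock[i] > local->clock[i])
                local->clock[i] = recv->clock[i];
    }

    /* a "happened before" b: a <= b element-wise and a != b. Writes ordered
       by this relation are potentially causally related and must be applied
       in that order at every replica. */
    bool vts_before(const VectorTS *a, const VectorTS *b) {
        bool strictly_less = false;
        for (int i = 0; i < NPROCS; i++) {
            if (a->clock[i] > b->clock[i]) return false;
            if (a->clock[i] < b->clock[i]) strictly_less = true;
        }
        return strictly_less;
    }

If neither vts_before(a, b) nor vts_before(b, a) holds, the two writes are concurrent and may be applied in different orders at different replicas.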

Page 26: Chapter 10

FIFO or PRAM Consistency
Definition: A DDS implements FIFO consistency when all writes of one process are seen in the same order by all other processes, i.e., they are received by all other processes in the order they were issued. However, writes from different processes may be seen in a different order by different processes.

Corollary: writes of different processes are treated as concurrent.

Implementation: tag each write operation of every process with (PID, sequence number).
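
A hedged sketch (added; not in the slides) of the (PID, sequence number) tagging in C; all names and sizes are assumptions:

    #include <stdbool.h>

    #define NPROCS 4

    typedef struct {
        int pid;   /* issuing process */
        int seq;   /* per-process sequence number, starting at 0 */
        int item;  /* data item written */
        int value; /* value written */
    } Write;

    /* Per-replica bookkeeping: next expected sequence number per process. */
    static int next_seq[NPROCS];

    /* FIFO consistency only requires per-sender ordering: a write from
       process w->pid may be applied iff all of that process's earlier
       writes have been applied. Writes of different senders are not
       ordered with respect to each other at all. */
    bool fifo_deliverable(const Write *w) {
        return w->seq == next_seq[w->pid];
    }

    void fifo_apply(const Write *w) {
        /* ...apply w->value to the local copy of w->item here... */
        next_seq[w->pid]++;  /* now expect that process's next write */
    }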

Page 27: Chapter 10

Example

The two writes are seen by processes P3 and P4 in different orders. This still obeys FIFO consistency, but not causal consistency, because write2 is dependent on write1.

Page 28: Chapter 10

Example (2)

Two concurrent processes; shared variables x = y = 0 initially:

Process P1:               Process P2:
x = 1;                    y = 1;
if (y == 0) print("A");   if (x == 0) print("B");

Possible results: A, B, nil, even AB?

Under FIFO consistency all four outcomes are possible: each process may see its own write before the other's write has arrived, so both conditions can be true and both A and B get printed, an outcome that sequential consistency would forbid.

Page 29: Chapter 10

Synchronization Variable
Background: it is not necessary to propagate intermediate writes.

A synchronization variable S:
- Is associated with a single operation, synchronize(S)
- synchronize(S) brings all local copies of the data store up to date

Page 30: Chapter 10

Compilation Optimization

int a, b, c, d, e, x, y;       /* variables */
int *p, *q;                    /* pointers */
int f(int *p, int *q);         /* function prototype */

a = x * x;                     /* a stored in register */
b = y * y;                     /* b as well */
c = a*a*a + b*b + a*b;         /* used later */
d = a * a * c;                 /* used later */
p = &a;                        /* p gets address of a */
q = &b;                        /* q gets address of b */
e = f(p, q);                   /* function call */

A program fragment in which some variables may be kept in registers.

Page 31: Chapter 10

Weak Consistency
Definition: A DDS implements weak consistency if the following hold:
1. Accesses to synchronization variables obey sequential consistency.
2. No access to a synchronization variable is allowed to be performed until all previous writes have completed everywhere.
3. No data access (read or write) is allowed to be performed until all previous accesses to synchronization variables have been performed.

Page 32: Chapter 10

Interpretation
A synchronization variable S supports just one operation, synchronize(S), which is responsible for all local replicas of the data store.

Whenever a process calls synchronize(S), its local updates are propagated to all replicas of the DDS, and all updates of the other processes are brought into its local replica of the DDS.

All processes see all accesses to synchronization variables in the same order.

Page 33: Chapter 10

Interpretation (2)
No data access is allowed until all previous accesses to synchronization variables have been done. By doing a synchronize before reading shared data, a process can be sure of getting the most up-to-date values.

Unlike the previous consistency models, weak consistency forces the programmer to group critical operations together.
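
To make these rules concrete, here is a toy, single-machine model in C (my own illustration, not from the slides; replica[], write_local(), and synchronize() are invented names, and network propagation is elided):

    #include <stdio.h>

    #define NREPLICAS 2

    /* Each replica holds a local copy of one data item. */
    static int replica[NREPLICAS];

    static void write_local(int r, int v) { replica[r] = v; }  /* not propagated */
    static int  read_local(int r)         { return replica[r]; }

    /* synchronize(S), simplified: push the caller's updates to all replicas
       (a full implementation would also pull in the others' updates). */
    static void synchronize(int caller) {
        for (int r = 0; r < NREPLICAS; r++)
            replica[r] = replica[caller];
    }

    int main(void) {
        write_local(0, 10);   /* intermediate write: need not be propagated */
        write_local(0, 20);
        printf("before sync, replica 1 reads %d\n", read_local(1));  /* stale: 0 */
        synchronize(0);       /* only now are the updates pushed everywhere */
        printf("after sync, replica 1 reads %d\n", read_local(1));   /* 20 */
        return 0;
    }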

Page 34: Chapter 10

Example

Via synchronization you can enforce that you get up-to-date values. Each process must synchronize if its writes are to be seen by the others.

A process issuing a read without any synchronization measures may get out-of-date values.

Page 35: Chapter 10

Non-weak Consistency

Page 36: Chapter 10

Release Consistency
Problem with weak consistency: when a synchronization variable is accessed, the DDS does not know whether this is done because the process has finished writing the shared variables or because it is about to start reading them. It must therefore take the actions required in both cases: making sure that all locally initiated writes have been completed (i.e., propagated to all other machines), as well as gathering in all writes from other machines.

Solution: provide two separate operations, acquire and release.

Page 37: Chapter 10

Details
Idea: distinguish between the memory accesses at the entry of a critical section (acquire) and those at the exit of a critical section (release).

Implementation: when a release is done, all the protected data that have been updated within the critical section are propagated to all replicas.

Page 38: Chapter 10

Definition
Definition: A DDS offers release consistency if the following three conditions hold:
1. Before a read or write operation on shared data is performed, all previous acquires done by the process must have completed successfully.
2. Before a release is allowed to be performed, all previous reads and writes by the process must have been completed.
3. Accesses to synchronization variables are FIFO consistent.
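
This acquire/release discipline is exactly what POSIX mutexes provide in shared-memory C programs, so a runnable pthread analogy may help (my illustration, not from the slides): lock plays the role of acquire, unlock the role of release.

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    static int shared_counter = 0;          /* data protected by m */

    static void *worker(void *arg) {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&m);         /* acquire: others' prior writes visible */
            shared_counter++;               /* access protected data in the CS */
            pthread_mutex_unlock(&m);       /* release: own writes made visible */
        }
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("%d\n", shared_counter);     /* deterministically 200000 */
        return 0;
    }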

Page 39: Chapter 10

Example

A valid event sequence for release consistency, even though P3 neglected to use acquire and release.

Remark: acquire is more than a lock or enter_critical_section; it waits until all updates on the protected data from other nodes have been propagated to the local replica before it enters the critical section.

Page 40: Chapter 10

Lazy Release Consistency
Problem with "eager" release consistency: when a release is done, the releasing process pushes all the modified data out to every process that already holds a copy and thus might potentially read the data in the future.

However, there is no way to tell whether those target machines will ever actually use the updated values, so this solution is somewhat inefficient: too much overhead.

Page 41: Chapter 10

Details
With "lazy" release consistency nothing is done at a release. Instead, at the next acquire the processor determines whether it already has all the data it needs. Only when it needs updated data does it send messages to the places where the data were changed in the past.

Timestamps help to decide whether a datum is outdated.

Page 42: Chapter 10

Entry Consistency
Unlike release consistency, entry consistency requires each ordinary shared variable to be protected by a synchronization variable.

When an acquire is done on a synchronization variable, only those ordinary shared variables guarded by that synchronization variable are made consistent.

A list of shared variables may be assigned to a single synchronization variable (to reduce overhead).

Page 43: Chapter 10

How to Synchronize?
- Every synchronization variable has a current owner
- The owner may enter and leave critical sections protected by this synchronization variable as often as needed, without sending any coordination messages to the others
- A process wanting to acquire a synchronization variable has to send a message to the current owner
- The current owner hands over the synchronization variable together with all updated values of its previous writes
- Several processes may simultaneously hold the synchronization variable in non-exclusive (read-only) mode
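
A hedged data-structure sketch (added; not in the slides) of this ownership protocol in C; SyncVar, fetch_latest(), and the fixed list size are assumptions:

    /* A synchronization variable guards an explicit list of data items. */
    typedef struct {
        int owner;        /* process id of the current owner */
        int guarded[8];   /* data items protected by this variable */
        int n;            /* number of guarded items */
    } SyncVar;

    /* Stub: fetch the latest value of one item from its current owner. */
    static void fetch_latest(int owner, int item) {
        (void)owner; (void)item;   /* network transfer elided */
    }

    /* On acquire by process p: only the guarded items are made consistent
       (entry consistency), then ownership is handed over. While p stays
       the owner, repeated acquires cost no messages at all. */
    void acquire(SyncVar *s, int p) {
        if (s->owner != p) {
            for (int i = 0; i < s->n; i++)
                fetch_latest(s->owner, s->guarded[i]);
            s->owner = p;
        }
    }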

Page 44: Chapter 10

Example

A valid event sequence for entry consistency

Page 45: Chapter 10

Summary of Consistency Models

(a) Consistency models not using synchronization operations:

Consistency      Description
Strict           Absolute time ordering of all shared accesses matters.
Linearizability  All processes must see all shared accesses in the same order; accesses are furthermore ordered according to a (non-unique) global timestamp.
Sequential       All processes see all shared accesses in the same order; accesses are not ordered in time.
Causal           All processes see causally-related shared accesses in the same order.
FIFO             All processes see writes from each other in the order they were issued; writes from different processes may not always be seen in that order.

(b) Models with synchronization operations:

Consistency      Description
Weak             Shared data can be counted on to be consistent only after a synchronization is done.
Release          Shared data are made consistent when a critical region is exited.
Entry            Shared data pertaining to a critical region are made consistent when a critical region is entered.

Page 46: Chapter 10

Up to Now
- System-wide consistent view on the DDS
- Independent of the number of involved processes
- Mutually exclusive atomic operations on the DDS
- Processes access only local copies
- Propagation of updates has to take place whenever it is necessary to fulfill the requirements of the consistency model

Are there still weaker consistency models?

Page 47: Chapter 10

Client-Centric Consistency
Provides guarantees about the ordering of operations only for a single client, i.e.:
- The effects of an operation depend on the client performing it
- The effects also depend on the history of the client's operations
- The guarantees are applied only when requested by the client
- There are no guarantees concerning concurrent accesses by different clients

Assumption: clients can access different replicas, e.g., mobile users.

Page 48: Chapter 10

Mobile Users

The principle of a mobile user accessing different replicas of a distributed database.

Page 49: Chapter 10

Eventual Consistency
If no updates occur for a long period of time, all replicas will gradually become consistent.

Requirements:
- Few read/write conflicts
- No write/write conflicts
- Clients can accept temporary inconsistency

Examples:
- DNS: no write/write conflicts; updates propagate slowly (1-2 days) to all caches
- WWW: few write/write conflicts; mirrors are eventually updated, and cached copies (browser or proxy) are eventually replaced

Page 50: Chapter 10

Client-Centric Consistency Models
- Monotonic Reads
- Monotonic Writes
- Read Your Writes
- Writes Follow Reads

Page 51: Chapter 10

Monotonic Reads
Definition: A DDS provides monotonic-read consistency if the following holds: if a process P reads the value of data item x, any successive read operation on x by that process will always return the same value or a more recent one (independently of the replica at location L where the new read is done).

Page 52: Chapter 10

Example System
A distributed e-mail database with distributed and replicated user mailboxes. E-mails can be inserted at any location. However, updates are propagated in a lazy (i.e., on-demand) fashion.

Page 53: Chapter 10

Example

The read operations performed by a single process P at two different local copies of the same data store.

a) A monotonic-read consistent data store.
b) A data store that does not provide monotonic reads.

Page 54: Chapter 10

Monotonic Writes
Definition: A DDS provides monotonic-write consistency if the following holds: a write operation by process P on data item x is completed before any successive write operation on x by the same process P can take place.

Remarks:
- Monotonic writes resemble FIFO consistency
- The guarantee only applies to writes from one client process P
- Different clients not requiring monotonic writes may see the writes of process P in any order

Page 55: Chapter 10

Example

The write operations performed by a single process P at two different local copies of the same data store

a) A monotonic-write consistent data store.
b) A data store that does not provide monotonic-write consistency.

Page 56: Chapter 10

Read Your Writes
Definition: A DDS provides read-your-writes consistency if the following holds: the effect of a write operation by a process P on a data item x at a location L will always be seen by a successive read operation on x by the same process.

Examples of missing read-your-writes consistency:
- Updating a website with an editor: to view the updated page you have to refresh it, otherwise the browser shows the old cached content
- Updating passwords

Page 57: Chapter 10

Example

a) A data store that provides read-your-writes consistency.

b) A data store that does not.

Page 58: Chapter 10

Writes Follow Reads
Definition: A DDS provides writes-follow-reads consistency if the following holds: a write operation by a process P on a data item x, following a previous read on x by the same process, is guaranteed to take place on the same or a more recent value of x than the one that was read before.

Page 59: Chapter 10

Example

a) A writes-follow-reads consistent data store.
b) A data store that does not provide writes-follow-reads consistency.

Page 60: Chapter 10

Implementing Client-Centric Consistency
Naive implementation (ignoring performance):
- Each write gets a globally unique identifier (WID)
- The identifier is assigned by the server that accepts the write operation for the first time
- For each client, two sets of write identifiers are maintained:
  - Read set RS(C): the WIDs relevant for the reads of client C
  - Write set WS(C): the WIDs of the writes performed by client C

Page 61: Chapter 10

Implementing Monotonic Reads

When a client C performs a read at server S, the server is handed the client's read set RS(C) to check whether all identified writes have already taken place locally at S. If not, server S has to be updated before the read is carried out!
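
A hedged sketch (added; not in the slides) of this check in C; WriteId, MAX_SET, and server_has() are assumptions:

    #include <stdbool.h>

    typedef struct { int server; long seq; } WriteId;  /* globally unique WID */

    #define MAX_SET 128

    typedef struct { WriteId w[MAX_SET]; int n; } WriteSet;

    /* Stub: has server s already applied write id? A real server would
       consult its local write log here. */
    static bool server_has(int s, WriteId id) {
        (void)s; (void)id;
        return true;
    }

    /* Before serving a read for client C at server s: every write in the
       client's read set RS(C) must have reached s. Otherwise s first has
       to fetch the missing writes from the servers that accepted them. */
    bool can_serve_read(int s, const WriteSet *rs) {
        for (int i = 0; i < rs->n; i++)
            if (!server_has(s, rs->w[i]))
                return false;   /* update server s before reading */
        return true;
    }

The write-set case of the next slide works the same way, with WS(C) in place of RS(C) and the additional step of appending the new write's WID to WS(C) afterwards.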

Page 62: Chapter 10

Implementing Monotonic Writes
If a client initiates a write at server S, the server is handed the client's write set WS(C) in order to bring S up to date first; the writes in the set are performed in the order of their timestamped WIDs. Once the new write has been carried out, its WID is added to the client's write set.

The response time of a client might thus grow with an ever-increasing write set. However, what to do if the read and write sets of a client keep growing larger and larger?

Page 63: Chapter 10

Improving Efficiency with RS and WS
Major drawback: the potential sizes of the read and write sets. Therefore:
- Group all write and read operations of a client into a so-called session (usually associated with an application)
- Every time a client closes its current session, all updates are propagated and the sets are deleted afterwards

Page 64: Chapter 10

Summary on Consistency Models
Choosing the right consistency model requires an analysis of the following trade-offs:
- Consistency and redundancy:
  - All replicas must be consistent
  - All replicas must contain the full state
  - Reduced consistency means reduced reliability
- Consistency and performance:
  - Consistency requires extra work
  - Consistency requires extra communication
  - This may result in a loss of overall performance

Page 65: Chapter 10

Distribution Protocols
- Replica placement: permanent replicas, server-initiated replicas, client-initiated replicas
- Update propagation: state versus operations, pull versus push protocols, unicasting versus multicasting
- Epidemic protocols: update propagation models, removing data

Page 66: Chapter 10

Replica Placement

The logical organization of different kinds of copies of a data store into three concentric rings.

Page 67: Chapter 10

Replica Placement
Permanent replicas:
- The initial set of replicas, created and maintained by the DDS owner(s)
- Writes are allowed
- E.g., web mirrors

Server-initiated replicas:
- Enhance performance
- Not maintained by the owner of the DDS
- Placed close to groups of clients, either manually or dynamically

Client-initiated replicas:
- Client caches: temporary, and the owner is not aware of the replica
- Placed closest to a client
- Maintained by the hosting machine (often the client itself)

Page 68: Chapter 10

Update Propagation

Page 69: Chapter 10

What Is to Be Propagated?
- Propagate only a notification of an update ("invalidation"):
  - Typical for invalidation protocols
  - May include information about which part of the DDS has been updated
  - Works best when the read-to-write ratio is low
- Propagate the updated data from one replica to another:
  - Works best when the read-to-write ratio is high
  - Several updates may be aggregated before being sent across the network
- Propagate the update operation to the other replicas ("active replication"):
  - Works well if the size of the parameters associated with each operation is small compared to the updated data
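
As a small illustration (added; not in the slides), the three alternatives can be seen as three kinds of update messages; the C names below are assumptions:

    /* What a replica sends to its peers, per propagation strategy. */
    typedef enum {
        INVALIDATE,   /* notification only: "your copy of item is stale" */
        STATE,        /* the updated data themselves */
        OPERATION     /* the operation plus its (small) parameters */
    } UpdateKind;

    typedef struct {
        UpdateKind kind;
        int  item;            /* which data item is affected */
        int  new_value;       /* used when kind == STATE */
        char op_name[16];     /* used when kind == OPERATION */
        int  op_args[4];      /* small parameters of the operation */
    } UpdateMsg;

An INVALIDATE message is the cheapest to send but forces a later fetch on the next read; a STATE message is the largest; an OPERATION message stays small but costs CPU time at every replica that replays it.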

Page 70: Chapter 10

Pull versus Push Protocols
Push protocols: updates are propagated to other replicas without those replicas having asked for them.
- Used between permanent and server-initiated replicas, i.e., to achieve a relatively high degree of consistency

Pull protocols: a server (or a client) asks another server to provide its updates.
- Used by client caches: e.g., when a client requests a web page that has not been refreshed for a longer period of time, the cache may check with the original site whether updates have been made
- Efficient when the read-to-write ratio is relatively low

Page 71: Chapter 10

Pull versus Push Protocols

Issue                     Push-based                                  Pull-based
State of server           List of client replicas and caches          None
Messages sent             Update (and possibly fetch update later)    Poll and update
Response time at client   Immediate (or fetch-update time)            Fetch-update time

A comparison between push-based and pull-based protocols in the case of multiple-client, single-server systems.

Page 72: Chapter 10

Unicasting

Potential overhead with unicasting in a LAN. Good for the pull-based approach.

Page 73: Chapter 10

Multicasting

With multicasting, an update message can be propagated more efficiently across a LAN.

Good for the push-based approach.

Page 74: Chapter 10

Epidemic Protocols
When implementing eventual consistency you may rely on epidemic protocols. They give no guarantees for absolute consistency, but after some time an epidemic protocol will have spread the updates to all replicas.

Notions:
- An infective server holds an update that it is willing to spread to other servers
- A susceptible server has not yet been infected, i.e., updated
- A removed server is one that does not want to propagate any further information

Page 75: Chapter 10

Anti-Entropy Protocol
Server P picks another server Q at random and subsequently exchanges updates with Q. There are three approaches to exchanging updates:
- P only pushes its own updates to Q
- P only pulls in new updates from Q
- P and Q exchange their updates with each other, i.e., a push-pull approach

Page 76: Chapter 10

Gossip Protocols
Rumor spreading or gossiping works as follows: if server P has been updated for data item x, it contacts another arbitrary server Q and tries to push its new update of x to Q.

However, if Q has already received this update from some other server, P is so disappointed that it stops gossiping with probability 1/k.

Page 77: Chapter 10

Gossip Protocols (2)
Although gossiping works quite well on average, you cannot guarantee that every server will be updated. In a DDS with a "large" number of replicas, the fraction s of servers remaining ignorant of an update, i.e., still susceptible, satisfies:

s = e^(-(k+1)(1-s))
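
This equation has no closed-form solution, but a short fixed-point iteration (a sketch I added; not in the slides) shows how quickly the susceptible fraction drops as k grows:

    #include <math.h>
    #include <stdio.h>

    /* Solve s = exp(-(k+1)(1-s)) by fixed-point iteration, where s is the
       fraction of servers that remain susceptible (never get the update). */
    int main(void) {
        for (int k = 1; k <= 5; k++) {
            double s = 0.5;                        /* initial guess */
            for (int i = 0; i < 1000; i++)
                s = exp(-(k + 1) * (1.0 - s));     /* iterate the equation */
            printf("k = %d: s = %.6f\n", k, s);
        }
        return 0;
    }

For k = 1 this yields s ≈ 0.20, i.e., about 20% of the servers would miss the update; by k = 5 the fraction is down to roughly 0.25%.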

Page 78: Chapter 10

Analysis of Epidemic Protocols
Advantage:
- Scalability, due to the limited number of update messages

Disadvantage:
- Spreading the deletion of a data item is quite cumbersome, due to an unwanted side effect: suppose you have deleted data item x on server S; you may nevertheless receive an old copy of x again from some other server because of still-ongoing gossiping

Page 79: Chapter 10

Consistency Protocols
- Primary-based protocols: remote-write protocols, local-write protocols
- Replicated-write protocols: active replication, quorum-based protocols

Page 80: Chapter 10

Primary-Based Protocols
Each data item x of the DDS has an associated primary, responsible for coordinating write operations on x.

The primary server may be:
- Fixed, i.e., a specific remote server (remote-write protocols)
- Dynamic, i.e., the primary migrates to the place of the next write (local-write protocols)

Page 81: Chapter 10

Remote-Write Protocols (1)

Primary-based remote-write protocol with a fixed server to which all read and write operations are forwarded.

Page 82: Chapter 10

Remote-Write Protocols (2)

The principle of the primary-backup protocol.

Page 83: Chapter 10

Local-Write Protocols (1)

Primary-based local-write protocol in which a single copy is migrated between processes.

Page 84: Chapter 10

Local-Write Protocols (2)

Primary-backup protocol in which the primary migrates to the process wanting to perform an update.

Page 85: Chapter 10

Replicated-Write Protocols
Writes can take place at multiple replicas, instead of only at a specific primary server.

Active replication:
- The operation is forwarded to all replicas
- Problem: all operations must be carried out in the same order everywhere
- Further issues: scalability, replicated invocation

Majority voting:
- Before reading or writing, ask a subset of all replicas for permission (quorum-based protocols)

Page 86: Chapter 10

Replicated Invocation for Active Replication

Page 87: Chapter 10

Solutions

a) Forwarding an invocation request from a replicated object.
b) Returning a reply to a replicated object.

Page 88: Chapter 10

Quorum-Based Protocols
Preliminaries: if a client wants to read or write, it first must request and acquire the permission of multiple servers.

Example: a DFS with file F replicated on N servers. If an update is to be made, we demand that the client first contact at least half of the servers plus one and get them to agree to the update. Once they have agreed, file F gets a new version number.

To read file F, a client must likewise contact at least half of the servers and ask them to hand out the current version number of F.

Page 89: Chapter 10

Gifford's Quorum-Based Protocol
To read a file F, a client must assemble a read quorum, an arbitrary collection of NR servers. To write a file F, a write quorum of at least NW servers is required. The following must hold:

A) NR + NW > N
B) NW > N/2

Constraint A prevents read-write conflicts; constraint B prevents write-write conflicts.
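
The two constraints are easy to check mechanically; a small sketch in C (added here; the N = 12 test values are illustrative choices matching the three cases on the next slide):

    #include <stdbool.h>
    #include <stdio.h>

    /* Gifford's constraints for N replicas, read quorum NR, write quorum NW. */
    bool valid_quorum(int n, int nr, int nw) {
        return nr + nw > n    /* A: every read quorum overlaps every write
                                 quorum, so a read always sees the newest
                                 version (no read-write conflicts) */
            && nw > n / 2;    /* B: any two write quorums overlap, so two
                                 concurrent writes cannot both succeed
                                 (no write-write conflicts) */
    }

    int main(void) {
        printf("%d\n", valid_quorum(12, 3, 10));  /* 1: a correct choice */
        printf("%d\n", valid_quorum(12, 7, 6));   /* 0: write-write conflicts */
        printf("%d\n", valid_quorum(12, 1, 12));  /* 1: ROWA (read one, write all) */
        return 0;
    }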

Page 90: Chapter 10

Examples

Three examples of the voting algorithm:
a) A correct choice of read and write set.
b) A choice that may lead to write-write conflicts.
c) A correct choice, known as ROWA (read one, write all).