Università di Roma “La Sapienza”, Dipartimento di Informatica e Sistemistica, Middleware Laboratory (MIDLAB)
Registers
Register: definition
A register is a shared variable accessed by processes through read and write operations
Distributed Systems: register abstraction
• Multiprocessor machine:
  - Processes typically communicate through registers at the hardware level
  - The set of these registers constitutes the physical memory
• Distributed message-passing system:
  - No physical shared memory
  - Processes communicate by exchanging messages over a network
• The register abstraction supports the design of distributed solutions by hiding the complexity of the underlying message-passing system and the distribution of the data
[Figure: hardware registers shared by processes p1, p2, ..., pi, ..., pn over a bus, compared with a register abstraction shared by p1, p2, ..., pi, ..., pn over a network.]
Register operations
A process accesses a register through:
- Read operation, read()→v: it returns the “current” value v of the register; this operation does not modify the content of the register
- Write operation, write(v): it writes the value v in the register and returns true at the end of the operation
Each operation starts with an invocation and terminates when the corresponding response is received
Register: Assumptions
• A register stores only positive integers and is initialized to 0
• Each value written is uniquely identified
• Processes are sequential: a process cannot invoke a new operation before the one it previously invoked (if any) has returned
Register: Notation
(X,Y) denotes a register that X processes can write and Y processes can read
- (1,1) denotes a register that only one process can write and only one process can read. It is known a priori which process can write and which can read
- (1,N) denotes a register that a single process, known a priori, can write and N processes can read
Register Semantics: Serial System, No Failures
Assumptions:
• serial access: a process does not invoke an operation on the register while an operation previously invoked by another process has not yet completed
• no failures
Sequential Specification
- Liveness. Each operation eventually terminates
- Safety. Each read operation returns the last value written
[Figure: serial execution on p1 and p2 with operations write(5), write(8), read()→5 and read()→8.]
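The sequential specification can be sketched in a few lines of Python. This is only an illustration under the slide's assumptions (serial access, no failures); the class name `SequentialRegister` is hypothetical, and the order of operations below is one serial order consistent with the safety property, not necessarily the one drawn in the figure.

```python
class SequentialRegister:
    """Hypothetical register accessed serially, with no failures."""

    def __init__(self):
        # The slides assume the register is initialized to 0.
        self._value = 0

    def read(self):
        # A read returns the current value and does not modify the register.
        return self._value

    def write(self, v):
        # A write stores v and returns True at the end of the operation.
        self._value = v
        return True


reg = SequentialRegister()
reg.write(5)         # first write
first = reg.read()   # returns the last value written
reg.write(8)         # second write
second = reg.read()  # returns the last value written
```

Because every operation completes before the next one starts, "the last value written" is always well defined here; the following slides show why this stops being true with concurrency or failures.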
Register Semantics: Concurrency
Assumptions:
• several processes can access the register
• concurrent access
• no failures
Which value should the read operation return?
[Figure: two timelines of p1 and p2, each with operations write(5), write(8) and read()→?.]
Register Semantics: Failures
Assumptions:
• several processes can access the register
• serial access
• processes can fail by crashing, i.e. from some point in time on they stop executing their algorithm forever
Failed operation: the invoking process crashes at some time between the invocation and the response of the operation
Which value should the read operation return?
[Figure: timeline of p1 and p2 with operations write(5), write(8) and read()→?.]
Register Semantics: Concurrency & Failures
A process can invoke a write operation and then crash before the corresponding response event is generated. The write operation may or may not have taken place
Register semantics: a read may return either:
- The value written by the last write operation that completed
- The value given as input to the last write operation, even if this operation fails
[Figure: timeline of p1 and p2 with operations write(5), write(8) and read()→?.]
Operations
• Every operation is characterized by two events:
  - Invocation
  - Return (a confirmation for a write operation, a value for a read)
• Each of these events occurs at a single indivisible point in time
• An operation is complete if both the invocation and the return events have occurred
• A failed operation is an operation invoked by some process pi that crashes before obtaining a return
[Figure: timeline t showing a complete operation Op (invocation, return) and a failed operation Op’ (invocation, crash).]
Precedence between Operations
• The execution of an operation invoked by a process p is the time interval delimited by its invocation event and its return event
• Given two operations o and o’, o precedes o’ if the response event of o precedes the invocation event of o’
• An operation o invoked by a process p can precede an operation o’ invoked by p’ only if o completes
• If neither of two operations precedes the other, they are said to be concurrent
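The precedence relation can be made concrete by modelling each operation as a time interval. The sketch below is illustrative only: the `Op` type, its field names and the numeric timestamps are assumptions introduced here, and a failed operation is modelled with `returned=None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Op:
    """Hypothetical record of one operation's invocation and return times."""
    name: str
    invoked: float
    returned: Optional[float]  # None models a failed operation (crash before return)


def precedes(a: Op, b: Op) -> bool:
    # a precedes b only if a completed and its return comes before b's invocation.
    return a.returned is not None and a.returned < b.invoked


def concurrent(a: Op, b: Op) -> bool:
    # Two operations are concurrent when neither precedes the other.
    return not precedes(a, b) and not precedes(b, a)


op1 = Op("Op1", invoked=0, returned=2)
op2 = Op("Op2", invoked=3, returned=5)
op3 = Op("Op3", invoked=1, returned=4)
failed = Op("Op'", invoked=0, returned=None)
```

With these values, `op1` precedes `op2`, `op1` and `op3` overlap and are therefore concurrent, and the failed operation never precedes anything because it has no return event.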
Example
[Figure: two timelines of p1 (Op1) and p2 (Op2). In the first, Op1 precedes Op2. In the second, Op1 is concurrent with Op2.]
Register Specification:
Regular-Atomic
(1,N) Regular Register: Specification
Termination. If a correct process invokes an operation, then the operation eventually receives the corresponding confirmation
Validity. A read operation returns the last value written or the value concurrently being written
[Figure: two executions in which p1 invokes write(5) and p2 invokes read()→0 and read()→5. The first is labeled "Not regular", the second "Regular".]
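The validity property can be turned into a small checker that computes which values a read is allowed to return. This is a sketch under assumptions made here: operations are given as numeric (invocation, return) intervals, the single writer's writes are listed in order, a failed write has `None` as return time, and the function name `allowed_regular` is hypothetical.

```python
INITIAL = 0  # the slides assume the register is initialized to 0


def allowed_regular(writes, read_inv, read_ret):
    """Values a read may return in a (1,N) regular register.

    writes: list of (value, invoked, returned) by the single writer, in order;
            returned is None for a failed write.
    """
    last = INITIAL
    allowed = set()
    for value, inv, ret in writes:
        if ret is not None and ret < read_inv:
            last = value        # this write precedes the read
        elif inv < read_ret:
            allowed.add(value)  # this write is concurrent with the read
    allowed.add(last)           # the last value written before the read
    return allowed


writes = [(5, 2, 4)]
during = allowed_regular(writes, read_inv=3, read_ret=5)  # overlaps write(5)
after = allowed_regular(writes, read_inv=6, read_ret=7)   # follows write(5)
```

A read that overlaps write(5) may return 0 or 5, while a read that starts after write(5) has completed may only return 5; returning 0 in that case is what makes an execution not regular.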
(1,N) Regular Register: Scenario
NOTE: In a regular register, a process can read a value v and then a value v’, even if the writer has written v’ and then v, as long as the write and the read operations are concurrent
This behavior is not allowed in an ATOMIC register
[Figure: p1 invokes write(5) and write(6); p2 invokes read()→5, read()→6 and read()→5.]
(1,N) Atomic Register: Specification
IDEA: regular register + ordering.
Properties:
Termination. If a correct process invokes an operation, then the operation eventually receives the corresponding confirmation.
Validity. A read operation returns the last value written or the value concurrently being written.
Ordering. If a read returns v2 after a preceding read has returned v1, then v1 cannot have been written after v2
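The ordering property can be checked mechanically on a recorded execution. The sketch below is an illustration under assumptions introduced here: values are unique, `write_order` lists them in the order the single writer wrote them (starting with the initial value 0), and reads are given as (value, invoked, returned) triples with numeric times. The function name `respects_ordering` is hypothetical.

```python
def respects_ordering(write_order, reads):
    """Check the ordering property of a (1,N) atomic register.

    write_order: values in the order they were written (initial value first).
    reads: list of (value, invoked, returned), issued by any processes.
    """
    rank = {v: i for i, v in enumerate(write_order)}
    for v1, _, ret1 in reads:
        for v2, inv2, _ in reads:
            # The first read precedes the second but returned a newer value.
            if ret1 < inv2 and rank[v2] < rank[v1]:
                return False
    return True


order = [0, 5, 6]
new_then_old = [(6, 1, 2), (5, 3, 4)]             # reads 6, then 5: violation
old_then_new = [(5, 1, 2), (6, 3, 4), (6, 5, 6)]  # reads 5, 6, 6: fine
overlapping = [(6, 1, 4), (5, 2, 5)]              # concurrent reads: unconstrained
```

Note that the check only constrains pairs of reads related by precedence, whichever processes issued them; two overlapping reads may return values in either order.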
(1,N) Atomic Register: Scenario
Example 1. Regular but not atomic register: write(5) precedes write(6), but process p2 first reads the value 6 and then the value 5
[Figure, example 1: p1 invokes write(5) and write(6); p2 invokes read()→6 and read()→5.]
Example 2. The register is atomic
[Figure, example 2: p1 invokes write(5) and write(6); p2 invokes read()→5, read()→6 and read()→6.]
(1,N) Atomic Register: Scenario
Not an atomic register: the precedence relation also applies to read operations issued by different processes
[Figure: p1 invokes write(5) and write(6); p2 invokes read()→6; p3 invokes read()→5.]
Scenario 1
ATOMIC and REGULAR.
[Figure: timeline of p1 and p2 with operations write(5), write(6) and read()→5.]
Scenario 2
NOT ATOMIC and NOT REGULAR
[Figure: timeline of p1 and p2 with operations write(5), write(6) and read()→6.]
Scenario 3
ATOMIC and so REGULAR. The write(5) executed by p1 fails, so it does not complete and it is concurrent with the read by p2. Validity is respected.
[Figure: p1 invokes write(5) and crashes; p2 invokes read()→0.]
Scenario 4
REGULAR but not ATOMIC. The ordering property is violated.
[Figure: p1 invokes write(5); p2 invokes read()→5; p3 invokes read()→0.]
Scenario 5
ATOMIC and REGULAR
[Figure: timeline of p1, p2 and p3 with operations write(5), write(6), read()→6 (p2) and read()→5 (p3).]
Regular Register: Implementation
on-rreg denotes the implementation of a regular register where 1 process can write (the writer) and N processes can read (the readers).
Events:
• Request: <on-rregRead, reg>
  Used to invoke a read operation on register reg
• Confirmation: <on-rregReadReturn, reg, v>
  Used to return v as the response to the read invocation on register reg; it indicates that the operation has completed
• Request: <on-rregWrite, reg, v>
  Used to invoke a write operation of value v on register reg
• Confirmation: <on-rregWriteReturn, reg>
  Confirms that the write operation has taken place at register reg and is complete
Termination. If a correct process invokes an operation, then the operation eventually receives the corresponding confirmation
Validity. A read operation returns the last value written or the value concurrently being written
NOTATION: RR is sometimes used to denote a Regular Register
Fail-Stop Algorithm: processes can crash, but crashes can be reliably detected by all the other processes
• failure model: crash
• perfect failure detector:
  - Strong completeness. The crash of a process is eventually detected by every correct process
  - Strong accuracy. No process is detected as crashed before it has actually crashed
Perfect point-to-point links:
1. Reliable delivery: let pi be any process that sends a message m to a process pj. If neither pi nor pj crashes, then pj eventually delivers m.
2. No duplication: no message is delivered by a process more than once.
3. No creation: if a message m is delivered by some process pj, then m was previously sent to pj by some process pi.
Best-Effort Broadcast (bebBroadcast):
1. Best-effort validity: for any two processes pi and pj, if pi and pj are correct, then every message broadcast by pi is eventually delivered by pj.
2. No duplication: no message is delivered more than once.
3. No creation: if a message m is delivered by some process pj, then m was previously broadcast by some process pi.
Algorithm Idea:
- Each process stores a local copy of the register
- Read-One: each read operation returns the value stored in the reader's local copy of the register
- Write-All: each write operation updates the value stored locally at every process that the writer considers not to have crashed
- A write completes when the writer receives an ack from each process that has not crashed
(1,N) regular register
NOTE. The algorithm implements an array of RRs. We consider a single entry, i.e. a single Regular Register
• value[r]: current value of the register
• writeSet[r]: used by the writer to track when a write has been propagated to all correct processes
• correct: set of correct processes
(1,N) regular register: write

upon event <on-rregWrite | reg, val> do
    trigger <bebBroadcast | [Write, reg, val]>;

upon event <bebDeliver | pj, [Write, reg, val]> do
    value[reg] := val;
    trigger <pp2pSend | pj, [Ack, reg]>;

upon event <pp2pDeliver | pj, [Ack, reg]> do
    writeSet[reg] := writeSet[reg] ∪ {pj};

upon exists r such that correct ⊆ writeSet[r] do
    writeSet[r] := ∅;
    trigger <on-rregWriteReturn | r>;

upon event <on-rregRead | reg> do
    trigger <on-rregReadReturn | reg, value[reg]>;
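The event handlers above can be mimicked by a small single-threaded simulation. This is only a sketch under strong simplifying assumptions made here: messages are delivered instantly and in order, the set `correct` plays the role of the perfect failure detector's output, and the class and method names (`ReadOneWriteAll`, `crash`) are hypothetical.

```python
class ReadOneWriteAll:
    """Hypothetical synchronous simulation of the Read-One Write-All register."""

    def __init__(self, n):
        self.value = [0] * n          # value[i]: local copy at process i
        self.correct = set(range(n))  # processes not detected as crashed

    def crash(self, i):
        # The perfect failure detector removes a crashed process from `correct`.
        self.correct.discard(i)

    def write(self, val):
        write_set = set()
        for pj in self.correct:   # bebBroadcast of [Write, val]
            self.value[pj] = val  # bebDeliver at pj: update the local copy
            write_set.add(pj)     # pp2pDeliver of pj's [Ack] at the writer
        # The write returns once correct ⊆ writeSet.
        return self.correct <= write_set

    def read(self, i):
        # Read-One: a read is purely local.
        return self.value[i]


r = ReadOneWriteAll(3)
ok_first = r.write(5)
after_first = [r.read(i) for i in range(3)]
r.crash(2)
ok_second = r.write(6)
```

The simulation hides exactly what makes the real algorithm interesting, namely message delays and acks arriving at different times, so it only shows the bookkeeping and not the concurrency.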
(1,N) regular register
Correctness:
• Termination:
  - read: trivial, it is local.
  - write: from the properties of the communication primitives and from the completeness property of the perfect failure detector.
• Validity: because of the strong accuracy property of the perfect failure detector, each write operation can complete only after all processes that have not crashed have updated their local copy of the register. So, one of the two following cases holds:
  - The read operation is not concurrent with the last write that has been invoked: the process reads the last value written
  - The read operation is concurrent with the last write: by the no-creation property of the channels, the value returned is either the last value written or the one being written, and the latter write is concurrent with the read operation
Performance:
- Write: at most 2N messages.
- Read: 0 messages, it is local
• The algorithm does not ensure validity if the failure detector is not perfect. The following scenario could happen:
• p1 invokes write(6) and then falsely suspects p2. Thus, p1 completes the write operation without waiting for the ack of p2, i.e. without being sure that the value 6 has been written in the local copy of the register at p2
[Figure: timeline of p1 and p2 with operations write(5), write(6) and read()→5.]
Fail-silent algorithm: “process crashes can never be reliably detected”
- Failure model: crash
- No perfect failure detector
Assumptions:
- N processes: 1 writer and N readers
- A majority of correct processes
Communication Primitives:
- Perfect point-to-point links
- Best-effort broadcast
IDEA:
- Each process locally stores a copy of the current value of the register
- Each written value is uniquely associated with a timestamp
- The writer and the reader processes use a set of witness processes to track the last value written
- Quorum: the intersection of any two sets of witness processes is not empty
- “Majority Voting”: each set consists of a majority of processes
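The quorum property behind majority voting (any two majorities share at least one process) can be verified exhaustively for a small system. The snippet below is an illustrative check, with a hypothetical helper `majorities` and a five-process system chosen only as an example.

```python
from itertools import combinations


def majorities(processes):
    """All subsets of `processes` that contain a strict majority."""
    need = len(processes) // 2 + 1
    return [set(c)
            for k in range(need, len(processes) + 1)
            for c in combinations(processes, k)]


procs = ["p1", "p2", "p3", "p4", "p5"]
quorums = majorities(procs)
# Every pair of majorities has a non-empty intersection.
all_intersect = all(a & b for a in quorums for b in quorums)
```

The general argument is a counting one: two sets that each contain more than N/2 of the N processes cannot be disjoint, so a read quorum always contains at least one process that took part in the last completed write.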
1. sn[r]: timestamp for register r
2. v[r]: value of register r
3. acks[r]: data structure used by the writer to track how many processes have updated their copy of the register
4. req[r] = i: denotes that a given process has invoked its i-th read operation
(1,N) regular register
When the writer writes:
- it increments the timestamp by 1
- it locally stores the value written
- it tracks 1 ack
- it broadcasts a message to propagate the write
When a process receives a write message m, it checks whether the value to be written is more recent than the last value locally written. In that case:
- it locally stores the new value and the corresponding timestamp
- it sends an ack to the writer. The ack piggybacks the register id and the timestamp associated with the value written
When the writer receives an ack:
- It compares the timestamp in the message with its current one. If they are equal, it increments by 1 the control structure that tracks the number of received acks.
- When the writer has received a majority of acks, i.e. a majority of processes have stored the value, the write completes.
When a process invokes a read on a register r:
- it increments reqid[r] by 1
- it empties readSet[r]
- it broadcasts the read request
When a process receives a read request, it sends back the value in its local copy of the register and the corresponding timestamp
When a process receives a response to a read request:
- It checks whether the response is for the current request
- If so, it inserts the pair (timestamp, value) in the readSet
When a process has received a majority of responses to its read request:
- It chooses the value with the highest timestamp
- It updates its local copy of the register and timestamp
- It returns the value
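The write and read steps of the majority-voting register can be sketched as a synchronous simulation. This is an illustration under assumptions made here, not the slides' pseudocode: the `reachable` argument stands for the processes whose messages arrive during the operation, an operation that does not gather a majority is reported as not completed instead of blocking, and the reader's update of its own local copy is omitted. The class name `MajorityVotingRegister` is hypothetical.

```python
class MajorityVotingRegister:
    """Hypothetical synchronous simulation of the majority-voting register."""

    def __init__(self, n):
        self.n = n
        self.store = [(0, 0)] * n  # (sn, v) held by each process
        self.wsn = 0               # writer's current timestamp

    def _majority(self):
        return self.n // 2 + 1

    def write(self, val, reachable):
        self.wsn += 1
        acks = 0
        for p in reachable:
            # A process stores the value and acks only if it is more recent.
            if self.wsn > self.store[p][0]:
                self.store[p] = (self.wsn, val)
                acks += 1
        return acks >= self._majority()

    def read(self, reachable):
        read_set = [self.store[p] for p in reachable]
        if len(read_set) < self._majority():
            return None  # not enough responses: the read cannot complete
        # Return the value paired with the highest timestamp.
        return max(read_set)[1]


r = MajorityVotingRegister(3)
w1 = r.write(5, reachable=[0, 1])
r1 = r.read(reachable=[1, 2])
w2 = r.write(6, reachable=[0, 2])
r2 = r.read(reachable=[1, 2])
```

In this run the read quorum {p2, p3} overlaps each write quorum in one process, which is enough for the reader to see the latest timestamp.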
Functioning Scenario: (1,3) regular register
• Π = {p1, p2, p3}
• p1 is the writer
• I = invocation
• R = response
[Figure: timeline of p1, p2 and p3 with invocation (I) and response (R) marks. write(5) reaches a write quorum that stores v=5, sn=1; read()→5 is served by a read quorum; write(6) reaches a write quorum that stores v=6, sn=2.]
(1,N) regular register
Correctness:
- Termination: from the properties of the communication primitives and the assumption of a majority of correct processes
- Validity: from the intersection property of the quorums
Performance:
- Write: at most 2N messages
- Read: at most 2N messages
Atomic Register Implementation
Events:
- Request: <on-aregRead | reg>
(1,N) Atomic Register: Specification
Termination. If a correct process invokes an operation, then the operation eventually receives the corresponding confirmation
Validity. A read operation returns the last value written or the value concurrently being written
Ordering. If a read returns v2 after a preceding read has returned v1, then v1 cannot have been written after v2
The algorithm consists of two phases
PHASE 1. We use a (1,N) regular register to build a (1,1) atomic register
PHASE 2. We use a set of (1,1) atomic registers to build a (1,N) atomic register
NOTATION:
Hereafter, rr and ra will sometimes be used to denote a regular register and an atomic register, respectively
IDEA:
- p1 is the writer and p2 is the reader of the (1,1) atomic register we aim to implement
- We use a (1,N) regular register where p1 is the writer and p2 is the reader
- Each write operation on the atomic register writes the pair (value, timestamp) into the underlying regular register
- The reader tracks the timestamp of previously read values to avoid reading an older value
To read a value stored in the atomic register, the reader:
- reads the value and the corresponding timestamp from the rr
- checks whether the timestamp of the value read from the rr is greater than the timestamp of the last value read and, if so, locally stores the new value and its timestamp
- returns the value locally stored
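The transformation from a regular register to a (1,1) atomic register can be sketched as a thin wrapper. This is a minimal illustration under assumptions made here: the underlying regular register is passed in as two callables, pairs are stored as (timestamp, value), and the class name `AtomicFromRegular` is hypothetical. The fake regular register in the usage part deliberately answers new, then old, to mimic the behavior a regular register allows under concurrency.

```python
class AtomicFromRegular:
    """Hypothetical (1,1) atomic register built on a regular register."""

    def __init__(self, regular_read, regular_write):
        self._rr_read = regular_read    # returns a (timestamp, value) pair
        self._rr_write = regular_write  # accepts a (timestamp, value) pair
        self._wts = 0                   # writer side: last timestamp used
        self._ts, self._val = 0, 0      # reader side: last pair returned

    def write(self, v):
        # Each write stores the value together with a fresh timestamp.
        self._wts += 1
        self._rr_write((self._wts, v))

    def read(self):
        ts, v = self._rr_read()
        # Adopt the pair only if it is newer than the last one returned.
        if ts > self._ts:
            self._ts, self._val = ts, v
        return self._val


# Fake regular register that answers (1,5), then (2,6), then (1,5) again.
answers = iter([(1, 5), (2, 6), (1, 5)])
reg = AtomicFromRegular(lambda: next(answers), lambda pair: None)
results = [reg.read(), reg.read(), reg.read()]
```

The third underlying read returns the older pair, but the wrapper keeps returning 6, which is exactly the ordering guarantee that the regular register alone does not give.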
Correctness:
- Termination: from the termination property of the regular register
- Validity: from the validity property of the regular register
- Ordering: from the validity property and from the fact that the reader tracks the last value read and its timestamp. A read operation always returns a value with a timestamp greater than or equal to that of the previously read value
Performance:
- Write: each write operation requires a write on a (1,N) regular register
- Read: each read operation requires a read on a (1,N) regular register
- NOTE: no additional messages w.r.t. the (1,1) regular register implementation
IDEA:
A (1,N) atomic register implies a writer p1 and N readers
The writer p1 communicates with every reader by using N (1,1) atomic registers
- p1 is the writer of the (1,N) atomic register and is also the writer of the N (1,1) atomic registers used to communicate with the readers
- the value of variable writer[r,i] is the identifier of the (1,1) atomic register whose reader is process pi
- Each time the writer wants to write a value, it writes it in all the N atomic registers
A set of N² (1,1) atomic registers is used for communication among readers
readers[r,i,j] stores the identifier of the (1,1) atomic register used by process pj to inform process pi about the last value pj read
Each time a reader pi wants to read:
- for every j, it reads the value written in register readers[r,i,j]
- it reads the value written by the writer in the atomic register shared with pi, writer[r,i]
- it decides which is the last value written, v
- it writes v in all registers readers[r,j,i], for every j
- it returns v
When the writer invokes a write operation on the (1,N) atomic register:
• It increments the timestamp
• It sets reading to false
• It invokes the write of the pair (timestamp, value) on all the N (1,1) atomic registers shared with the readers, i.e. writer[r,j] for all j
Each time a write on a (1,1) atomic register writer[r,j] completes, the writer tracks one more ack
When all the write operations on the (1,1) atomic registers complete, the write on the (1,N) atomic register completes
When a reader pi wants to read the (1,N) atomic register:
- It empties the readSet
- For each j, it invokes the read operation on the (1,1) atomic register readers[r,i,j]
When the reader receives a response from a (1,1) atomic register, it inserts the corresponding pair (timestamp, value) in the readSet
When the reader pi has read all the registers shared with the other readers, it invokes the read on the (1,1) atomic register shared with the writer
When reader pi has also read the (1,1) atomic register shared with the writer:
- pi chooses the value with the highest timestamp
- NOTICE: pi informs all the readers about the value it read by writing it into readers[r,j,i] for each j
Each time the reader pi obtains an ack for one of the writes on readers[r,j,i], it increments by 1 the number of received acks
When pi has received all the acks, it returns the chosen value as the response of the read on the (1,N) atomic register
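The second phase can be sketched by replacing every (1,1) atomic register with a plain variable. This is an illustration under assumptions made here: each underlying register is simulated by a list cell holding a (timestamp, value) pair, operations run one at a time, and the `upto` parameter is a hypothetical device to model a write that has so far updated only the first few writer registers. The class name `OneToNAtomic` is also hypothetical.

```python
class OneToNAtomic:
    """Hypothetical (1,N) atomic register built from (1,1) atomic registers."""

    def __init__(self, n):
        self.n = n
        self.ts = 0
        # writer[i]: register written by the writer and read by reader i.
        self.writer = [(0, 0)] * n
        # readers[i][j]: register written by reader j and read by reader i.
        self.readers = [[(0, 0)] * n for _ in range(n)]

    def write(self, v, upto=None):
        self.ts += 1
        # upto < n models a write still in progress.
        for j in range(self.n if upto is None else upto):
            self.writer[j] = (self.ts, v)

    def read(self, i):
        # Collect what the other readers reported, then the writer's register.
        read_set = list(self.readers[i]) + [self.writer[i]]
        best = max(read_set)  # pair with the highest timestamp
        # Inform every reader j about the value being returned.
        for j in range(self.n):
            self.readers[j][i] = best
        return best[1]


r = OneToNAtomic(3)
r.write(5)
r.write(6, upto=1)  # write(6) has reached only reader 0's register
first = r.read(0)
second = r.read(1)
```

Reader 1 still has (1,5) in its register shared with the writer, yet it returns 6 because reader 0 reported (2,6) before completing its own read; this is how the construction prevents a later read from returning an older value.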
Correctness:
Termination: from the termination of the (1,1) atomic registers
Validity: from the validity of the (1,1) atomic registers
Ordering: consider a write operation w1 that writes value v1 with timestamp s1. Let w2 be a write that follows w1, and let v2 and s2 (s1 < s2) be the value and the timestamp corresponding to w2.
Assume that a read by pi returns v2: by the algorithm, for each j in [1,N], pi has written (s2,v2) in readers[r,j,i].
By the ordering property of the underlying (1,1) atomic registers, each subsequent read will return a value with a timestamp greater than or equal to s2. Hence v1 cannot be returned.
Performance:
- Write: each write operation on a (1,N) atomic register requires N write operations on (1,1) atomic registers.
- Read: each read operation on a (1,N) atomic register requires reading N (1,1) atomic registers and writing N (1,1) atomic registers.
The algorithm is a modified version of the Read-One Write-All (1,N) Regular Register
IDEA: “the read operation writes”
The algorithm is called “Read-Impose Write-All” because a read operation imposes on all correct processes to update their local copy of the register with the value read, unless they store a more recent value
A process can use the writeSet both when reading and when writing.
The variable reading is used to distinguish the current operation
When a reader wants to read the value of the (1,N) atomic register:
- It increments by 1 the number of read requests
- It sets reading = true (“I am reading”)
- It stores the value of its local copy of the register in the variable readval
- It broadcasts a message to write that value
When the writer invokes the write of a value into the atomic register:
- It increments by 1 the number of write requests
- It broadcasts the write message
When a process delivers a write request:
- If the value in the message is more recent than the one already stored (i.e. it has a bigger timestamp), the process locally applies the value
- In any case, it sends back the ack
When an ack is delivered, if it corresponds to the current operation, the sender is inserted in the writeSet
When the process has received an ack from all correct processes:
- If it was reading (reading = true), it terminates the read on the (1,N) atomic register by returning the value of readval
- If it was writing, it returns the ack that signals the completion of the write
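The "read imposes" idea can be sketched with the same kind of synchronous simulation used for Read-One Write-All. The assumptions are introduced here: messages are delivered instantly, `correct` stands for the output of a perfect failure detector, local copies hold (timestamp, value) pairs, and the names `ReadImposeWriteAll` and `_impose` are hypothetical. The usage part sets one local copy by hand to model a write that has so far been delivered at a single process.

```python
class ReadImposeWriteAll:
    """Hypothetical synchronous simulation of Read-Impose Write-All."""

    def __init__(self, n):
        self.local = [(0, 0)] * n  # (sn, v) at each process
        self.correct = set(range(n))
        self.wsn = 0               # writer's timestamp

    def _impose(self, sn, v):
        write_set = set()
        for p in self.correct:
            # Apply the value only if it is more recent; ack in any case.
            if sn > self.local[p][0]:
                self.local[p] = (sn, v)
            write_set.add(p)
        # The operation completes once all correct processes have acked.
        return self.correct <= write_set

    def write(self, v):
        self.wsn += 1
        return self._impose(self.wsn, v)

    def read(self, i):
        sn, readval = self.local[i]
        self._impose(sn, readval)  # the read writes back what it returns
        return readval


r = ReadImposeWriteAll(3)
# Model a write(5) in progress that has been delivered only at process 0.
r.wsn = 1
r.local[0] = (1, 5)
first = r.read(0)   # returns 5 and imposes it on everyone
second = r.read(1)  # cannot return the older value 0 any more
```

Without the impose step, the read at process 1 would return 0 after process 0 had already returned 5, which is the new-then-old inversion an atomic register forbids.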
Correctness:
- Termination: as for the Read-One Write-All (1,N) Regular Register.
- Validity: as for the Read-One Write-All (1,N) Regular Register.
- Ordering: to complete a read operation, the reader has to be sure that every other correct process has, in its local copy of the register, a value with a timestamp greater than or equal to the timestamp of the value read. In this way, no subsequent read can return an older value.
Performance:
- Write: a write requires at most 2N messages
- Read: a read requires at most 2N messages
Failure model: crash
A majority of correct processes is assumed.
The algorithm is a variation of the Majority Voting (1,N) Regular Register
IDEA: a read imposes the value read on a majority of processes
When a process delivers a write message, if it contains a more recent value, the latter is locally stored
In any case, it sends the ack
A process tracks the number of acks delivered
When a majority of acks has been delivered, the current operation completes: if it was a read, the value to be read is returned
When a process invokes a read on the (1,N) atomic register, a read request is broadcast
Each process answers a read request with its local values
When a process has received a majority of read responses, it chooses the one with the highest timestamp
The process then broadcasts a request to write this value
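The majority-based variant can be sketched in the same style. The assumptions are made here for illustration: `reachable` stands for the processes that answer during an operation, every reachable process acks, an operation that cannot gather a majority is reported as not completed instead of blocking, and the names `ReadImposeWriteMajority`, `_quorum` and `_impose` are hypothetical.

```python
class ReadImposeWriteMajority:
    """Hypothetical synchronous simulation of Read-Impose Write-Majority."""

    def __init__(self, n):
        self.n = n
        self.local = [(0, 0)] * n  # (sn, v) at each process
        self.wsn = 0               # writer's timestamp

    def _quorum(self, reachable):
        return len(set(reachable)) > self.n // 2

    def _impose(self, sn, v, reachable):
        for p in reachable:
            # Store the value only if it is more recent; ack in any case.
            if sn > self.local[p][0]:
                self.local[p] = (sn, v)
        return self._quorum(reachable)

    def write(self, v, reachable):
        self.wsn += 1
        return self._impose(self.wsn, v, reachable)

    def read(self, reachable):
        if not self._quorum(reachable):
            return None  # no majority of responses: the read cannot complete
        # Phase 1: pick the pair with the highest timestamp.
        sn, v = max(self.local[p] for p in reachable)
        # Phase 2: impose it on a majority before returning.
        self._impose(sn, v, reachable)
        return v


r = ReadImposeWriteMajority(3)
# Model a write(5) in progress that has reached only process 0.
r.wsn = 1
r.local[0] = (1, 5)
first = r.read(reachable=[0, 1])   # sees (1,5) and imposes it on process 1
second = r.read(reachable=[1, 2])  # intersects the previous quorum at process 1
```

The two phases of the read (collect, then impose) are also why the slides count up to 4N messages for a read against 2N for a write.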
Correctness:
- Termination: as for the Majority Voting (1,N) Regular Register
- Validity: as for the Majority Voting (1,N) Regular Register
- Ordering: follows from the fact that the read imposes the write of the value read on a majority of processes, and from the intersection property of quorums
Performance:
- Write: at most 2N messages
- Read: at most 4N messages
(N,N) Atomic Register: Specification
Termination: If a correct process invokes an operation, then the operation eventually receives the corresponding confirmation
Atomicity: Every failed operation appears to be complete or does not appear to have been invoked at all, and every complete operation appears to have been executed at some instant between its invocation and the corresponding confirmation event.
Scenario 1
NOT ATOMIC
[Figure: timeline of p1 and p2 with operations write(5), write(8), read()→8 and read()→5.]
Scenario 2
ATOMIC
[Figure: timeline of p1, p2 and p3 with operations write(5), write(8), read()→8 and read()→5.]