A new challenge is emerging with the advent of new classes of applications and technologies such as smart environments, sensor networks, mobile systems, peer-to-peer systems, and cloud computing. In these settings, the underlying distributed system cannot be fully managed; it needs some degree of self-management that depends on the specific application domain. It is nevertheless possible to delineate some common consequences of such self-management: first, no entity can always ensure the validity of the system assumptions during the entire computation and, second, no one knows accurately who joins and who leaves the system at any time, which introduces a kind of unpredictability in the system composition (this phenomenon of arrival and departure of processes in a system is also known as churn).
As a consequence, distributed computing abstractions have to deal not only with asynchrony and failures, but also with this dynamic dimension, where a process that does not crash can leave the system at any time, so that the membership can change completely several times during the same computation. Hence, the abstractions for reliable distributed computing have to be reconsidered to take into account this new “adversary” setting. This self-defined and continuously evolving distributed system, which we call a dynamic distributed system in the following, makes abstractions more difficult to understand and master than in distributed systems where the set of processes is fixed and known by all participants. The notion of churn thus becomes a system parameter whose aim is to make tractable those systems whose composition evolves over time.
The presentation analyzes the issues in building a regular register in an environment that considers crash and Byzantine failures. It was delivered during the Retirement Seminar for Professor Santosh Shrivastava, which took place in Newcastle (UK) in September 2011.
Roberto Baldoni
Università di Roma “La Sapienza”
Retirement Seminar for Professor Santosh Shrivastava
8th of September 2011, Newcastle, UK
The Price of Mastering Churn in Distributed Systems
Roberto Baldoni, “The price of mastering churn in a distributed system”
Santosh reminds me … a set of acronyms
MIDAS (2001)
EUCOSM (2003) LUCID (2004) MAGNET (2005) VIRTUE (2007) SEGOVIA (2009)
Large and promising IP rejected --too many Chinese!
FET IP - Very strong consortium - reason for rejection: «very nice projects, however it wants to provide a real software platform for pooling together on-demand resources in a multi-tenant environment resistant to byzantine attack…. in FET program we do not fund engineering work»
Just below the bar!
Outline
• Dynamic Distributed Systems
• System Model with Churn
• Regular Registers
• Other interesting Abstractions
• Conclusion
Advent of Complex Distributed Applications
• Peer-to-peer
• Sensor Networks
• Mobile networks
• Cloud computing federations
• Internet supercomputing
• Smart environments
Managed vs. Unmanaged distributed applications (i)
Managed Distributed Application
• Existence of a manager that can control the entities comprising or running the application
• The manager guarantees a suitable environment for a duration of time sufficient for a distributed system to behave correctly wrt its system model assumptions, e.g.,
  • Providing needed/sufficient/appropriate entities to enable correct behavior of the application (global application view)
  • Providing operational guarantees of QoS and the necessary degree of synchrony in the underlying distributed platform
Managed Distributed Applications: Consequences
Main characteristics: a predefined setting, i.e.,
• The application knows, directly or indirectly, the set of processes that will participate in the computation
• The application knows if it can exploit synchrony assumptions
• The system can be carefully and "centrally" configured through an appropriate tuning phase in order to get the best performance
The application cycle is: design, deployment optimization, configuration, final deployment, operation
Managed Distributed Applications run on top of a Distributed System that is piecewise static wrt time
[Figure: timeline of a piecewise-static system whose size changes in steps: N entities, N-1 entities, N-2, N+3]
Managed vs. Unmanaged distributed applications (ii)
Unmanaged Distributed Applications
• No assumption of a manager or access to equivalent management facilities
• Each process autonomously decides to locally run a component of a distributed application when (a) joining and (b) leaving the system
• The system and/or its components do not start with a known and pre-defined setting
• “Nice” manageable system model assumptions either cannot be guaranteed or do not last for long
Unmanaged distributed applications: Consequences
• Autonomic/autonomous behavior of entities
• Self-defined, self-instantiating (& self*?) and perpetually evolving distributed system
• It is impossible to know the set of processes participating in the computation because it changes dynamically and can potentially grow without bounds
  • E.g., the system could cease existing when no process is active, and at other times the system may be made of thousands of active processes
. . . Dynamic Distributed System
Spectrum of Possible System Models
[Figure: the world as a spectrum from orderly to chaotic, with example systems: Air traffic control, Mobile ad-hoc systems, Cloud computing, Peer-to-peer]
Uncertainty in Dynamic Distributed Systems
Static Distributed Systems:
• Lack of temporal knowledge
• Failures
• Unknown communication delays
Dynamic Distributed Systems:
• Same issues as in static distributed systems, plus
• Non-monotonic and unknown size of the system
• Potentially changing properties of the “universe”
• Unclear notions of efficiency, effectiveness, scalability
• Solid theoretical foundations
• Precise problem specifications
• Rigorously correct solutions
System Model with Churn
• The distributed system is dynamic
  • In each run, infinitely many processes can arrive and depart from the system, but at any point in time the number of processes is finite (Infinite Arrival Model)
• Processes participate in a distributed computation running on top of the distributed system
  • Processes of the distributed system decide at will to join and leave the distributed computation (i.e. the computation is affected by continuous churn)
• No process is guaranteed to participate forever in the distributed computation
• Each process has a unique identifier
• Processes can crash, and a crash can be seen as a leave of the process
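The model above can be made concrete with a toy simulation. The sketch below is only an illustration under stated assumptions: the function name `simulate_churn`, the uniform random choice of leavers, and the rule that exactly round(c·n) processes are replaced per time unit are all hypothetical, chosen to keep the size of the computation constant.

```python
import itertools
import random


def simulate_churn(n, c, steps, seed=0):
    """Replace round(c * n) processes per time unit; return membership snapshots."""
    rng = random.Random(seed)
    fresh_ids = itertools.count()      # infinite arrival: identifiers never run out
    members = [next(fresh_ids) for _ in range(n)]
    history = [set(members)]
    per_step = round(c * n)
    for _ in range(steps):
        # A leave and a crash are indistinguishable in this model.
        leaving = set(rng.sample(members, per_step))
        members = [p for p in members if p not in leaving]
        # Every departure is matched by one arrival with a fresh unique id.
        members += [next(fresh_ids) for _ in range(per_step)]
        history.append(set(members))
    return history


if __name__ == "__main__":
    snapshots = simulate_churn(n=10, c=0.2, steps=50)
    print(len(snapshots[0] & snapshots[-1]), "initial processes are still present")
```

Each snapshot has finite size, while the set of identifiers ever used grows without bound as the run gets longer, which is the point of the infinite arrival model.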
System Model with Churn
Abstractions
• Shared Memory: Registers, Sets
• One-shot problem: Interval valid queries
• Agreement Problem: Leader Election
[Figure: stack labelled Churn, Distributed System, Distributed Computation, Connectivity Protocol, Communication Protocols, Abstraction]
For simplicity we assume N processes are in the distributed computation at any given time
Object Abstraction: The Regular Register
A register is a shared variable accessed by processes through read and write operations
Regular Register Architecture at node i
[Figure: at node i, the Regular Register (REG) exposes read(), write(v) and join(); it uses the Broadcast and Point-to-Point Link services, which sit on the Connectivity Layer]
Point-to-Point Link: if pi invokes the send(m) operation to pj at time t, then pj will receive m by time t+δ if it has not left the system by that time
Broadcast: if pi invokes the broadcast(m) operation at time t and does not leave the system by time t+δ, then all the processes that are in the system at time t and do not leave the system by time t+δ will deliver m by time t+δ
Regular Register:
• (liveness) If a process invokes a read or a write operation and does not leave the system, it eventually returns from that operation
• (safety) A read operation returns the last value written or a value written by a concurrent write
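The safety property of a regular register can be stated as a small checker. This is a minimal sketch, assuming a single writer (so writes do not overlap each other) and operations described by real-valued start and end times; the names `Write` and `allowed_read_values` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: object
    start: float
    end: float


def allowed_read_values(writes, read_start, read_end, initial=None):
    """Values a regular register may return for a read over [read_start, read_end]."""
    # Writes that completed before the read was invoked.
    completed = [w for w in writes if w.end < read_start]
    # Writes whose interval overlaps the read interval.
    concurrent = [w for w in writes if w.start <= read_end and w.end >= read_start]
    last = max(completed, key=lambda w: w.end).value if completed else initial
    return {last} | {w.value for w in concurrent}


if __name__ == "__main__":
    history = [Write(1, 0.0, 1.0), Write(2, 5.0, 7.0)]
    print(allowed_read_values(history, 6.0, 6.5))
```

A read that overlaps a write may return either the old or the new value, which is what distinguishes a regular register from an atomic one.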
Regular Register: write()
• The writer process pw wants to write the value v
• pw sends a broadcast message (WRITE, v, sn)
• … in the meanwhile processes join and leave the computation
OBS. Only processes that belong to the computation when pw starts the write, and that remain in the computation for the whole duration of the write, will maintain the updated copy of the register
Active processes keep the state of the computation
[Figure: within the distributed system, a subset of processes, including the writer pw, participates in the register computation]
Processes in the distributed computation vs Active Processes
[Figure: number of processes over time t, showing N, the active processes A(t), the churn (joining processes = leaving processes) and the correctness bound]
The position of the bound depends on the system model: the weaker the system model, the more «static» the system has to be. This leads to several impossibility results in the presence of churn.
[Figure: same plot, annotated with “Liveness and Safety issues”]
An Algorithm in a Synchronous System
Assumptions
• There is a bound δ such that any message sent (broadcast) at time τ ≥ t is received (delivered) by time τ + δ by the processes that are in the system during the interval [τ, τ + δ]
• A process remains in the system for at least 3δ
Algorithm
• Read: local
• Write: global
• Join: global
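To make "read local, write global, join global" concrete, here is a minimal sketch of one replica. It is an illustration under assumptions, not the algorithm as presented: timely delivery within δ is modelled by direct method calls, the waiting phases are collapsed into `finish_join`, and the names `Replica`, `write` and `join` are hypothetical. Only the message kinds WRITE, INQUIRY and REPLY come from the slides.

```python
class Replica:
    """One process's copy of the register; None stands for the undefined value."""

    def __init__(self, pid):
        self.pid = pid
        self.value, self.sn = None, -1
        self.active = False
        self.replies = []    # (value, sn) pairs collected while joining
        self.pending = []    # joiners that inquired before this replica was active

    def on_write(self, value, sn):
        # Keep only the most recent value, ordered by sequence number.
        if sn > self.sn:
            self.value, self.sn = value, sn

    def on_inquiry(self, joiner):
        # Active replicas reply at once; joining ones reply when they turn active.
        if self.active:
            joiner.replies.append((self.value, self.sn))
        else:
            self.pending.append(joiner)

    def finish_join(self):
        # Called once the waiting phases are over: adopt the freshest value seen.
        for value, sn in self.replies:
            self.on_write(value, sn)
        self.active = True
        for joiner in self.pending:
            joiner.replies.append((self.value, self.sn))
        self.pending.clear()

    def read(self):
        return self.value    # "read local": no message is exchanged


def write(members, value, sn):
    # "write global": WRITE(value, sn) reaches every process in the computation.
    for replica in members:
        replica.on_write(value, sn)


def join(newcomer, members):
    # "join global": inquire only if no WRITE arrived during the first wait.
    if newcomer.sn < 0:
        for replica in members:
            replica.on_inquiry(newcomer)
    newcomer.finish_join()


if __name__ == "__main__":
    servers = [Replica(i) for i in range(3)]
    for s in servers:
        s.active = True
    write(servers, 0, 0)
    write(servers, 1, 1)
    p = Replica(9)
    join(p, servers)
    print(p.read())
```

Making reads free shifts the cost onto joins and writes, which is why the join has to last long enough to hear either a concurrent WRITE or a REPLY from an active process.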
Synchronous System Safety: case registeri ≠ ⊥
[Figure: time diagram with processes pi, pj, ph, pk; pi invokes join() while write(1) is in progress and WRITE(1, 1) is broadcast; the register copies change from 0 to 1]
pi has received a WRITE(<val, sn>) message during the first waiting phase and has updated registeri accordingly
• the write operation lasts δ time
• the join operation lasts at least δ time
• a write message takes at most δ time to be delivered
Then the join and the write are concurrent and the join terminates with the last value written
Synchronous System Safety: case registeri = ⊥
[Figure: time diagram with processes pi, pj, ph, pk; pi invokes join() and broadcasts INQUIRY(i); ph answers with REPLY(h, 0, 0); all register copies hold 0]
If no write is concurrent with the join operation and c < 1/(3δ), then there always exists an active process that replies with the last written value
Synchronous System Safety: case registeri = ⊥
[Figure: time diagram with processes pi, pj, ph, pk; pi invokes join() and broadcasts INQUIRY(i) while write(1) is in progress; pi receives both REPLY(h, 0, 0) and WRITE(1, 1)]
pi can receive both WRITE(<val, sn>) messages and REPLY(<j, val, sn>) messages. According to the values received by time τ + 2δ, pi will update registeri to the value written by a concurrent write, or to the value written before the concurrent writes
If pi receives the write before the reply, pi does not overwrite the value, and then any following read will return the last value written.
Synchronous System
Termination. If a process invokes the join() operation and does not leave the system for at least 3δ time units, or invokes the read() operation, or invokes the write() operation and does not leave the system for at least δ time units, it terminates the invoked operation.
Safety. Over any interval of the computation, if the churn (c × n) is less than n/(3δ) (i.e., c < 1/(3δ)), a read() operation returns the last value written before the read invocation, or a value written by a write operation concurrent with it.
Horizontal Quorums for Register Persistence
[Figure: membership of the computation over consecutive windows of length 3δ; joining processes replace departed ones, and each process is marked as active or non-active]
Register persistence is preserved iff the churn is below a bound that depends on the protocol implementation
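A back-of-the-envelope calculation shows where a bound of the form c < 1/(3δ) can come from. The sketch assumes the worst case in which c·n processes are replaced per time unit and every departing process is one that was active at the start of a window of length 3δ; the function names are hypothetical and this is not the proof from the talk.

```python
def min_surviving_active(n, c, delta, window=3):
    """Lower bound on the processes active at the start of a window of length
    window * delta that are still in the computation at its end."""
    return n - window * delta * c * n


def churn_limit(delta, window=3):
    """Largest churn rate for which the lower bound above stays positive."""
    return 1 / (window * delta)


if __name__ == "__main__":
    n, delta = 100, 5
    for c in (0.01, 0.05, churn_limit(delta)):
        print(c, min_surviving_active(n, c, delta))
```

Under these assumptions at least n(1 − 3δc) holders of the value survive each window, and that quantity is positive exactly when c < 1/(3δ).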
Eventually Synchronous System
Assumptions
• There exists a time t after which there is a bound δ such that any message sent (broadcast) at time τ ≥ t is received (delivered) by time τ + δ by the processes that are in the system during the interval [τ, τ + δ]
• There exists a time t after which c < 1/(3δ)
• A process remains in the system for at least 3δ
Algorithm
• Read: global
• Write: global
• Join: global
Vertical Quorums for Register Validity in Asynchronous Periods
Validity of the read: during asynchronous periods, to be sure to read the last written value you need to read/write registers from a majority of processes in the system (you no longer have the guarantee that messages are delivered within a known bound)
Termination. Assume that |A(t)| > n/2 (i.e., a majority of processes is active at any time). If a process invokes join(), read() or write() and does not leave the system, it terminates its operation.
Safety. Assume that |A(t)| > n/2. A read operation returns the last value written before the read invocation, or a value written by a write operation concurrent with it.
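The majority rule can be sketched as follows. This is a generic quorum-read illustration, assuming each reply is a (value, sn) pair from a distinct process; `majority` and `quorum_read` are hypothetical names, and the real protocol also has to handle joins and retransmissions.

```python
def majority(n):
    """Smallest number of processes that is more than half of n."""
    return n // 2 + 1


def quorum_read(replies, n):
    """Return the value with the highest sequence number once a majority of
    the n processes has replied; return None while the quorum is incomplete."""
    if len(replies) < majority(n):
        return None
    return max(replies, key=lambda reply: reply[1])[0]


if __name__ == "__main__":
    print(quorum_read([("a", 1), ("b", 2), ("a", 1)], n=5))
```

Because any two majorities intersect, a read quorum always contains at least one process that saw the last completed write, which is why no timing bound is needed during asynchronous periods.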
Asynchronous System
• There is no bound on message transfer delays
Theorem: It is not possible to implement a regular register in a fully asynchronous dynamic system.
The result is similar to the one of [Attiya - Bar-Noy - Dolev JACM95] when considering a static system with any number of process failures
Regular Register with Byzantine Failures
• Composed of an arbitrarily large set of clients c1 ... cm
• Dynamic: servers may join and leave (infinite arrival model)
• Join_System() operation: connects new processes to the system
• Leave_System() operation: passive leave
Layers: Connection Layer (e.g. Overlay Management Protocol); (Authenticated) Communication Layer (Best-effort Semantics); Distributed Computation (i.e. Regular Register)
Computation Model
• Clients are correct
• No information about register state
• Clients trigger read() and write() operations
Computation Model
• Initially n servers are part of the register computation
• Up to f Byzantine failures (f < n/3)
• Servers maintain locally a copy of the register value
• Alternating periods of churn and stability
• No stable processes
• In churn periods, cn servers of the server set are refreshed in each time unit (c ∈ [0, 1])
[Figure: clients issue Write(v) and Read() on the servers, most of which hold v; a new server enters through Join_Server()]
Requirements
• Write Persistency: servers maintain the last value written by a write operation despite server departures
• Byzantine Resiliency: there are always at least f+1 servers maintaining the same value
• Read Validity: any read() operation returns the last value written by a completed write() or a value concurrently written
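One way a client could exploit the f+1 requirement is to accept only values vouched for by at least f+1 servers, since at most f of them can be Byzantine. The sketch below is a generic illustration, not the read() algorithm of the talk; it assumes each server contributes one authenticated (value, sn) reply, and `byzantine_read` is a hypothetical name.

```python
from collections import Counter


def byzantine_read(replies, f):
    """Return a value reported by at least f + 1 servers, preferring the
    highest sequence number; return None if no value has enough support."""
    counts = Counter(replies)    # replies are hashable (value, sn) pairs
    vouched = [pair for pair, votes in counts.items() if votes >= f + 1]
    if not vouched:
        return None
    return max(vouched, key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    replies = [("v", 3), ("v", 3), ("x", 9), ("u", 2), ("u", 2)]
    print(byzantine_read(replies, f=1))
```

A single forged reply with a very high sequence number, like ("x", 9) above, is ignored because it lacks f+1 matching copies.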
Issues in read() operations
[Figure: values held by the servers (v, x, y) at successive times t1, t2, …, ti, …, tk]
Validity Bound
Consider a generic protocol P = {AJS, AR, AW} implementing a regular register such that
1) every operation eventually terminates and
2) there exists a period of churn longer than the longest operation issued on the register
Theorem: Let AJS, AR and AW be the algorithms implementing respectively the join_Server(), read() and write() operations. Let tj, tr and tw be the maximum time intervals needed by these algorithms to terminate the operation. If
c ≥ min{ (n − 3f)/(n·tr), (n − 3f)/(n·(tj + tw)) }
then it is not possible to ensure both write persistency and read validity
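The bound is easy to evaluate for concrete parameters. The sketch assumes the theorem's condition reads c ≥ min{(n − 3f)/(n·tr), (n − 3f)/(n·(tj + tw))}; the function names and the sample numbers are hypothetical.

```python
def churn_threshold(n, f, t_join, t_read, t_write):
    """Churn rate at or above which the theorem rules out a regular register."""
    slack = n - 3 * f    # servers beyond the 3f needed to mask Byzantine ones
    return min(slack / (n * t_read), slack / (n * (t_join + t_write)))


def below_threshold(c, n, f, t_join, t_read, t_write):
    """False means the impossibility applies; True only means it does not."""
    return c < churn_threshold(n, f, t_join, t_read, t_write)


if __name__ == "__main__":
    print(churn_threshold(n=10, f=1, t_join=2, t_read=3, t_write=2))
```

Slower operations lower the threshold: the longer a join, read or write lasts, the less churn the register can tolerate.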
Validity Bound in a synchronous system
• TimelyBroadcastDelivery (TBDel): there exists a known and finite bound δ such that every message broadcast at some time t is delivered by time t + δ.
• TimelyChannelDelivery (TCDel): there exists a known and finite bound δ’ < δ such that every message sent at some time t is delivered by time t + δ’.
Pictorial Related Work and summary of results for Regular Register
[Figure: related work placed along three axes: System Model (asynchronous, eventually synchronous, synchronous), Churn Model (static, quiescent, continuous), Failure Model (crash, Byzantine). Works shown: Aguilera et al. PODC 2010; Baldoni et al. ICDCS 2009; Baldoni et al. PODC 2011]
Summary of results for Regular Register, by churn model (No Churn, Quiescent Churn, Continuous Churn); entries are listed in slide order and the column of each entry is not recoverable:
• Synchronous, crash: BFT papers; Baldoni et al. ICDCS 2009
• Synchronous, Byzantine: Baldoni et al. PODC 2011 (ba)
• Eventually synchronous, crash: Baldoni et al. ICDCS 2009
• Eventually synchronous, Byzantine: Open Problem
• Asynchronous, crash: Aguilera et al. 2009; Impossible
• Asynchronous, Byzantine: Open Problem
Other Abstractions we faced
Set object (Europar 2010, EWDC2011)
• More complex semantics than that of registers
• The set contains all its history
Main result: It is not possible to implement a set object in an eventually synchronous distributed system prone to continuous churn if:
a) Processes have only finite memory space for local computation
b) Accesses to the set are continuous
c) There are no stable processes participating in the set computation
k-bounded set in an eventually synchronous distributed system
Other Abstractions we faced
Leader Election (EDCC2010)
• There is a bounded set of (good) processes that enter the computation and remain forever (no one knows who they are)
• Churn is continuous
• Communication is synchronous with finite losses and unknown maximum transfer delay
Risk: electing an infinite sequence of processes that leave the system (bad processes)
Main result: «under these assumptions we can implement leader election»
Done in 2 steps
• The HB* Oracle: provides a list of processes deemed to be up (alive list). The list aims to:
  • Put good processes at the top of the list
  • Stabilize the position of a good process in the list
• protocol: takes the list provided by the HB* protocol and outputs the leader
[Figure: HB* exchanges unicast and multicast messages (send/receive) and outputs the alive list, from which the leader is produced]
Conclusion
• Dynamic Distributed Systems are everywhere
  • Most of today's systems are unmanaged to some extent
  • Some of the functionality has to be autonomic and not rely on a manager
• Dynamic Distributed Systems are unquestionably more complex than static ones
  • This leads to more complex solutions to solve the same problem
• Scalability and dynamicity are not synonymous
• Understanding how to implement abstractions efficiently, in a way well suited to dynamic distributed systems, is still an open and fascinating problem
One slide to remember
[Figure: number of processes over time t, showing N, the active processes A(t), the churn (joining processes = leaving processes), the correctness bound, and the “Liveness and Safety issues” annotation]
The position of the bound depends on the system model: the weaker the system model, the more «static» the system has to be. This leads to several impossibility results in the presence of churn.