CS 603 Mid-Semester Review
March 4, 2002
One or Two Day Review?
• One day: Skim material and Test Overview
  – What to do with Wednesday?
    • More on replication
    • Start on distributed processes
• Two day: Discuss material to date
  – Wednesday:
    • Finish Review
    • Work out sample question
Basics
• Why do we want distributed systems?
  – Scaling
  – Heterogeneity
  – Geographic Distribution
• What is a distributed system?
  – Transparency vs. Exposing Distribution
• Hardware Basics
  – Communication Mechanisms
Basic Software Concepts
• Hiding vs. Exposing (what is hidden: resulting system type)
  – Distribution: Distributed OS
  – Location, but not distribution: Middleware
  – None: Network OS
• Concurrency Primitives
  – Semaphores
  – Monitors
• Distributed System Models
  – Client-Server
  – Multi-Tier
  – Peer to Peer
Communication Mechanisms
• Shared Memory
  – Enforcement of single-system view
  – Delayed consistency: δ-Common Storage
• Message Passing
  – Reliability and its limits
• Stream-oriented Communications
• Remote Procedure Call
• Remote Method Invocation
RPC Example: DCE
• Language / Platform Independent
• Implementation Issues:
  – Data Conversion
  – Underlying Mechanisms
• Fault Tolerance Approaches
Java RMI
• Supports remote invocation of Java objects
• Key: Java Object Serialization
  – Stream objects over the wire
• Language specific
• Advantages
  – True object-orientation: Objects as arguments and values
  – Mobile behavior: Returned objects can execute on caller
  – Integrated security
  – Built-in concurrency (through Java threads)
• Disadvantage: Java only
• Implementation / Use
  – Registry
SOAP
• Goal: RPC protocol that works over wide area networks
  – Interoperable
  – Language independent
• Problem: Firewalls
  – Solution: HTTP/XML
• Client side: Ability to generate HTTP calls and listen for response
• Server:
  – Listen for HTTP
  – Bind to procedure
  – Respond with HTTP
• SOAP message format and use mechanisms
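The client and server steps above can be sketched with the Python standard library. This is a minimal sketch, not a full SOAP stack: it builds and parses a SOAP 1.1 envelope but does not send it over HTTP. The operation name `GetQuote`, the namespace `urn:example:quotes`, and the helper names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_soap_request(method, namespace, params):
    """Client side: return (headers, body_bytes) for an RPC-style call."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{namespace}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    payload = ET.tostring(envelope, encoding="utf-8")
    headers = {
        "Content-Type": "text/xml; charset=utf-8",
        "SOAPAction": f'"{namespace}#{method}"',
    }
    return headers, payload


def parse_soap_call(payload):
    """Server side: extract (method, params) so the call can be bound to a procedure."""
    envelope = ET.fromstring(payload)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]                       # first child of Body is the call element
    method = call.tag.split("}")[-1]     # strip the "{namespace}" prefix
    params = {child.tag: child.text for child in call}
    return method, params


# Hypothetical call: GetQuote(symbol="IBM")
headers, payload = build_soap_request("GetQuote", "urn:example:quotes", {"symbol": "IBM"})
method, params = parse_soap_call(payload)
```

Because the payload is ordinary XML carried in an HTTP POST, it passes through firewalls that already allow web traffic, which is the point of the HTTP/XML choice.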
Naming Requirements
• Disambiguate only
• Access resource given the name
• Build a name to find a resource
• Do humans need to use name?
• Static/Dynamic Resource
• Performance Requirements
Naming Approaches
• Scope
  – Global vs. Hierarchical
  – Unique ID vs. Non-Unique Description
• Namespaces
  – URN, URI, URL
• Registries
Registry Example: X.500
• Goal: Global “white pages”
  – Lookup anyone, anywhere
  – Developed by Telecommunications Industry
  – ISO standard directory for OSI networks
• Idea: Distributed Directory
  – Application uses Directory User Agent to access a Directory Access Point
Directory Information Base (X.501)
• Tree structure
  – Root is entire directory
  – Levels are “groups”
    • Country
    • Organization
    • Individual
• Entry structure
  – Unique name
    • Build from tree
  – Attributes: Type/value pairs
  – Schema enforces type rules
• Alias entries
X.500
• Directory Entry:
  – Organization level: CN=Purdue University, L=West Lafayette
  – Person level: CN=Chris Clifton, SN=Clifton, TITLE=Associate Professor
• Directory Operations
  – Query, Modify
• Authorization / Access control
  – To directory
  – Directory as mechanism to implement for others
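The "unique name built from the tree" idea can be sketched as a walk from an entry up to the root. This is a minimal sketch under assumptions: the `Entry` class, the `SCHEMA` table, the country entry `C=US`, and the leaf-first, comma-separated rendering (the LDAP-style string form) are illustrative choices, not part of the slides.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    rdn: tuple                       # (attribute type, value) naming this entry under its parent
    parent: Optional["Entry"] = None
    attributes: dict = field(default_factory=dict)

    def distinguished_name(self):
        """Walk up to the root, collecting one name component per tree level (leaf first)."""
        parts, node = [], self
        while node is not None:
            parts.append(f"{node.rdn[0]}={node.rdn[1]}")
            node = node.parent
        return ",".join(parts)


# Hypothetical schema for this sketch: allowed attribute types and their value types.
SCHEMA = {"C": str, "CN": str, "SN": str, "L": str, "TITLE": str}


def check_schema(entry):
    """Schema enforces type rules: every attribute must be known and well typed."""
    return all(k in SCHEMA and isinstance(v, SCHEMA[k]) for k, v in entry.attributes.items())


country = Entry(("C", "US"))  # assumed country level
org = Entry(("CN", "Purdue University"), country, {"L": "West Lafayette"})
person = Entry(("CN", "Chris Clifton"), org,
               {"SN": "Clifton", "TITLE": "Associate Professor"})
```

Here `person.distinguished_name()` yields `CN=Chris Clifton,CN=Purdue University,C=US`, one component per level of the tree.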
X.500 – Distributed Directory
• Directory System Agent
• Referrals
• Replication
  – Cache vs. Shadow copy
  – Access control
  – Modifications at Master only
  – Consistency
    • Each entry must be internally consistent
    • DSA giving copy must identify as copy
X.500 Subsets
• LDAP
  – X.500 without OSI
  – Intended for use over IP
• Active Directory
  – Microsoft’s answer to LDAP
  – Extensible “default” naming schema
  – Limited replication facilities
Clock Synchronization
• Definition: All nodes agree on time
  – What do we mean by time?
  – What do we mean by agree?
• Lamport Definition: Events
  – Events partially ordered
  – Clock “counts” the order
Event-based definition (Lamport ’78)
Define partial order of events
• A → B: A “happened before” B: Smallest relation such that:
  1. If A and B in same process and A occurs first, A → B
  2. If A is sending a message and B is receipt of that message, A → B
  3. If A → B and B → C, then A → C
• Clock: C(x) is time x occurs:
  – C(x) = Ci(x) where x running on node i.
  – Clocks correct if ∀ a,b: a → b ⇒ C(a) < C(b)
Lamport Clock Implementation
• Node i increments Ci between any two successive events
• If event a is sending of a message m from i to j,
  – m contains timestamp Tm = Ci(a)
  – Upon receiving m, set Cj ≥ current Cj and > Tm
• Can now define total ordering. a ⇒ b iff:
  – Ci(a) < Cj(b), or
  – Ci(a) = Cj(b) and Pi < Pj
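The implementation rules above can be sketched as a small class. This is a minimal sketch: the class and method names are assumptions, and `receive` picks max(Cj, Tm) + 1, which is one value satisfying "≥ current Cj and > Tm".

```python
class LamportClock:
    def __init__(self, process_id):
        self.process_id = process_id
        self.time = 0

    def tick(self):
        """Local event: increment the counter between successive events."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event; the message carries timestamp Tm = Ci(a)."""
        return self.tick()

    def receive(self, tm):
        """Advance to a value that is >= the current clock and > Tm."""
        self.time = max(self.time, tm) + 1
        return self.time


def totally_before(a, b):
    """a and b are (timestamp, process_id) pairs; ties are broken by process id."""
    return a < b


p1, p2 = LamportClock(1), LamportClock(2)
p1.tick()                 # local event on node 1
tm = p1.send()            # node 1 sends m with Tm = 2
recv = p2.receive(tm)     # node 2's clock jumps past Tm
```

Tuple comparison gives exactly the total order on the slide: compare timestamps first, then process ids.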
What if we want “wall clock” time?
• Ci must run at correct rate: ∃ κ << 1 such that | dCi(t)/dt - 1 | < κ
• Synchronized: ∃ small ε such that ∀ i,j: | Ci(t) - Cj(t) | < ε
• Assume transmission time between μ and μ+ξ
• Algorithm: Upon receiving message m, set Cj(t) = max(Cj(t), Tm+μ)
• Theorem: Assume every τ seconds a message with unpredictable delay ξ is sent over every arc of a network with diameter d. Then ∀ t ≥ t0 + τd, ε ≈ d(2κτ + ξ)
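A quick numeric check of the bound helps build intuition. This is a minimal sketch; the function names and the example values (d = 3, κ = 10⁻⁶, τ = 10 s, ξ = 5 ms) are assumptions chosen for illustration, not figures from the course.

```python
def on_receive(cj, tm, mu):
    """Receiver's rule: never move the clock backwards, but catch up to Tm + mu."""
    return max(cj, tm + mu)


def skew_bound(d, kappa, tau, xi):
    """Approximate synchronization bound eps ~ d * (2*kappa*tau + xi)."""
    return d * (2 * kappa * tau + xi)


# Assumed example values (not from the slides).
eps = skew_bound(d=3, kappa=1e-6, tau=10.0, xi=0.005)
print(f"{eps * 1000:.2f} ms")
```

With these values ε is about 15 ms, and almost all of it comes from the delay uncertainty ξ rather than from drift (2κτ is only 0.02 ms).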
Clock Synchronization: Limits
• Best Possible: Delay Uncertainty
  – Actually ε(1 – 1/n)
• Synchronization with Faults
  – Faulty clock
  – Communication Failure
  – Malicious processor
• Worst case: Can only synchronize if < 1/3 processors faulty
  – Better if clocks can be authenticated
Real example: NTP
I doubt you need to review this...
Process Synchronization
• Problem: Shared Resources
  – Model as sequential or parallel process
  – Assumes global state!
• Alternative: Mutual Exclusion when Needed
  – Coordinator approach
  – Token Passing
  – Timestamp
Mutual Exclusion
• Requirements
  – Does it guarantee mutual exclusion?
  – Does it prevent starvation?
  – Is it fair?
  – Does it scale?
  – Does it handle failures?
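The requirements above are easiest to apply to a concrete scheme. Below is a minimal sketch of the coordinator approach, modeled in a single process with message passing left out; the class and method names are assumptions.

```python
from collections import deque


class Coordinator:
    """Centralized lock manager for one resource; grants requests in FIFO order."""

    def __init__(self):
        self.holder = None
        self.waiting = deque()

    def request(self, pid):
        """Return True if granted now, False if the request was queued."""
        if self.holder is None:
            self.holder = pid
            return True
        self.waiting.append(pid)
        return False

    def release(self, pid):
        """Release by the current holder; return the next holder (or None)."""
        if pid != self.holder:
            raise ValueError("only the holder may release")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder
```

Checked against the list: one holder at a time gives mutual exclusion, and the FIFO queue gives fairness and prevents starvation. The coordinator is, however, a bottleneck and a single point of failure, so the last two questions get weaker answers.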
CS 603 Mid-Semester Review
March 6, 2002
Mutual Exclusion: Colored Ticket Algorithm
• Goals:
  – Decentralized
  – Fair
  – Fault tolerant
  – Space Efficient
• Idea: Numbered Tickets
  – Next number gets resource
  – Problem: Unbounded Space
  – Solution: Reissue blocks
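The numbered-ticket idea and the need to bound ticket values can be sketched as follows. This is not the colored ticket algorithm itself: it is a centralized, single-process illustration in which ticket numbers wrap around and are reissued once served. The class name and the fixed `size` are assumptions.

```python
class BoundedTicketDispenser:
    def __init__(self, size):
        self.size = size          # number of distinct ticket values available
        self.next_ticket = 0      # next value to hand out
        self.now_serving = 0      # ticket that currently owns the resource
        self.outstanding = 0      # tickets issued but not yet finished

    def take(self):
        """Issue the next ticket, reusing numbers modulo `size`."""
        if self.outstanding == self.size:
            raise RuntimeError("all ticket numbers are in use")
        ticket = self.next_ticket
        self.next_ticket = (self.next_ticket + 1) % self.size
        self.outstanding += 1
        return ticket

    def may_enter(self, ticket):
        """The next number gets the resource."""
        return self.outstanding > 0 and ticket == self.now_serving

    def leave(self, ticket):
        """Finish with the resource and pass it to the next number."""
        if not self.may_enter(ticket):
            raise ValueError("ticket does not hold the resource")
        self.now_serving = (self.now_serving + 1) % self.size
        self.outstanding -= 1
```

Space stays bounded because a number is only reissued after its previous holder has left, at the cost of refusing new tickets when all values are outstanding.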
Multi-Resource Mutual Exclusion
• New Problem: Deadlock
  – Processes using all resources
  – Each needs additional resource to proceed
• Dining Philosophers Problem
  – Coordinated vs. truly distributed solutions
• Problems with deterministic solutions
• Probabilistic solution: Lehmann & Rabin
  – Starvation / fairness properties
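The free-choice idea behind the probabilistic solution can be sketched with threads and locks. This is a minimal sketch under assumptions: function names, the meal count, and the seeds are illustrative, it needs at least two philosophers, and it shows only the basic randomized protocol, not the refinements needed for stronger fairness guarantees.

```python
import random
import threading


def philosopher(i, forks, meals, eaten, rng):
    n = len(forks)
    left, right = forks[i], forks[(i + 1) % n]
    while eaten[i] < meals:
        # Flip a coin to decide which fork to pick up first.
        first, second = (left, right) if rng.random() < 0.5 else (right, left)
        first.acquire()                      # wait for the randomly chosen fork
        if second.acquire(blocking=False):   # try the other fork without waiting
            eaten[i] += 1                    # eat
            second.release()
        first.release()                      # done eating, or back off and retry


def run(n=5, meals=3, seed=0):
    forks = [threading.Lock() for _ in range(n)]
    eaten = [0] * n
    threads = [
        threading.Thread(target=philosopher,
                         args=(i, forks, meals, eaten, random.Random(seed + i)))
        for i in range(n)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return eaten
```

No philosopher ever waits while holding a fork, so the circular wait that deadlocks symmetric deterministic solutions cannot form; the coin flip breaks the symmetry that could otherwise make everyone retry in lockstep.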
Distributed Transactions
• ACID properties
• Issues:
  – Commit Protocols
  – Fault Tolerance
  Why is this enough?
• Failure Models and Limitations
• Mechanisms:
  – Two-phase commit
  – Three-phase commit
Two-Phase Commit (Lamport ’76, Gray ’79)
• Central coordinator initiates protocol
  – Phase 1:
    • Coordinator asks if participants can commit
    • Participants respond yes/no
  – Phase 2:
    • If all votes yes, coordinator sends Commit
    • Participants respond when done
• Blocks on failure
  – Participants must replace coordinator
  – If participant and coordinator fail, wait for recovery
• While blocked, transaction must remain Isolated
  – Prevents other transactions from completing
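The two phases can be sketched in a few lines. This is a minimal single-process sketch in which method calls stand in for messages: the `Participant` class and its method names are assumptions, a missing reply is treated as a "no" vote, and the blocking behavior on coordinator failure is not modeled.

```python
class Participant:
    def __init__(self, vote_yes=True):
        self.vote_yes = vote_yes
        self.state = "initial"

    def can_commit(self):
        """Phase 1 reply: vote yes (and wait for the decision) or no."""
        self.state = "ready" if self.vote_yes else "aborted"
        return self.vote_yes

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator logic; `participants` maps a name to a Participant-like object."""
    # Phase 1: ask every participant whether it can commit.
    votes = {}
    for name, p in participants.items():
        try:
            votes[name] = p.can_commit()
        except Exception:
            votes[name] = False          # assumption: no reply counts as "no"
    decision = "commit" if all(votes.values()) else "abort"
    # Phase 2: send the decision to everyone.
    for p in participants.values():
        if decision == "commit":
            p.commit()
        else:
            p.abort()
    return decision
```

A participant in state `ready` has given up its right to abort unilaterally, which is why it must block (and keep the transaction isolated) if the coordinator disappears before phase 2.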
Transaction Model
• Transaction Model
  – Global Transaction State
  – Reachable State Graph
    • Local states potentially concurrent if a reachable global state contains both local states
  – Concurrency set C(s) is all states potentially concurrent with s
    • Sender set S(s) = {local states t | t sends m and s can receive m}
• Failure Model
  – Site failure assumed when expected message not received in time
  – Independent Recovery
Problems with 2-PC
• Blocking on failure
  – 3-PC as solution
• Theorems on recovery limits
  – Independent recovery: No two-site failure
  – Non-independent recovery
    • Anything short of total failure okay
    • Recovery protocol for total failure
3PC assuming timeout on receipt of message
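[Figure: 3PC state diagrams. Coordinator states: q1, w1, p1, a1, c1. Participant states: q2, w2, p2, a2, c2. Transition labels (message received / message sent). Coordinator: xact request/start xact, no/abort, yes/pre-commit, ack/commit. Participant: start xact/no, start xact/yes, abort/-, pre-commit/ack, commit/-.]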
Termination Protocol
• If participant times out in w2 or p2:
  – Elect new Coordinator
    If coordinator alive, would have committed/aborted
• New coordinator requests state of all processes. Termination rules:
  – If any aborted, broadcast abort
  – If any committed, broadcast commit
  – If all w2, broadcast abort
  – If any p2, send pre-commit and enter state p1
• Complete failure protocol
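The termination rules can be sketched as a decision function for the newly elected coordinator. This is a minimal sketch: the function name is an assumption, states use the participant labels w2, p2, a2 (aborted) and c2 (committed), and any combination the four rules do not cover raises an error rather than guessing.

```python
def termination_decision(states):
    """Decide what the new coordinator broadcasts, given reported local states."""
    states = list(states)
    if not states:
        raise ValueError("no states reported")
    if "a2" in states:                        # any aborted
        return "abort"
    if "c2" in states:                        # any committed
        return "commit"
    if all(s == "w2" for s in states):        # nobody has seen pre-commit
        return "abort"
    if "p2" in states:                        # someone saw pre-commit
        return "pre-commit"                   # coordinator then enters state p1
    raise ValueError("state combination not covered by the rules above")
```

For example, `termination_decision(["w2", "p2"])` returns `"pre-commit"`: a participant in p2 proves every site voted yes, so the protocol can be driven forward to commit instead of aborting.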
Test Basics
• Mechanics: Open book/notes
  – No electronic aids
• Two questions
  – Each multi-part
  – Will include scoring suggestions
• Underlying question: Do you understand the material?
  – No need to regurgitate “best in literature” answer
    • Reasonable self-designed solution fine
  – Key: Do you really understand your answer?
    • Can you build CORRECT distributed systems?
Sample Question: Clock Synchronization
• Develop synchronization protocol for a four processor system with fully-connected processors.
  – Linear envelope of real time
  – Bounded difference between clocks on correct processors.
  – Time set to 0 when the protocol begins (but not synchronized).
• Assume:
  – Clocks don't drift
  – Messages take between time 0 and e
  – At most one faulty processor
  – No authentication
• Discuss the correctness of your algorithm, including the types of faults handled.
Scoring:
• Protocol: Up to five points
• Argument for correctness: 2 points
  – requires believable proof sketch for full 2 points
• Faults supported / not supported: 1-3 points
  – 3 points requires proof sketch that it handles supported faults and examples showing failure with unsupported fault types.