Distributed Systems Fall 2009 II 1
0. Course Overview
I. Introduction
II. Fundamental Concepts of Distributed Systems
  Architecture models; network architectures: OSI, Internet and LANs; interprocess communication
III. Time and Global States
  Clocks and concepts of time; event ordering; synchronization; global states
IV. Coordination
  Distributed mutual exclusion; multicast; group communication; Byzantine problems (consensus)
V. Distribution and Operating Systems
  Protection mechanisms; processes and threads; networked OS; distributed and network file systems (NFS)
VI. Peer-to-Peer Systems
  Routing in P2P, OceanStore, BitTorrent, OneSwarm, Ants P2P, Tor, Freenet, I2P
VII. Security
  Security concepts; cryptographic algorithms; digital signatures; authentication; Secure Sockets
Architecture
Distributed systems are foremost highly complex software systems
  Nortel Networks DMS-100 switch: 25-30 million lines of code, 3000 software developers, 20-year life cycle to date
  Motorola: 20% of engineers produce hardware, 80% produce software
  Subject to all kinds of software engineering problems
Investigation of software architectures to deal with design challenges
  "… include the organization of a system as composition of components; global control structures; the protocols for communication, synchronization, and data access; the assignment of functionality to design elements; the composition of design elements; physical distribution; scaling and performance; dimensions of evolution; and selection among design alternatives. This is the software architecture level of design." [Garlan and Shaw]
Architectural paradigms pertinent to distributed systems
  layers
  client-server
Layers
Basic idea
  Breaking up the complexity of systems by designing them through layers and services
    layer: group of closely related and highly coherent functionalities
    service: functionality provided to the next higher layer
Examples of layered architectures
  operating systems (kernel, other services); historically: the THE operating system
  computer network protocol architectures
[Figure: layer n+1 on top of layer n on top of layer n-1; each layer offers its service (n-service, (n-1)-service) to the layer above]
Layers
Typical layering in distributed systems
  Platform: hardware and operating system
    Windows NT / Pentium processor
    Solaris / SPARC processor
  Middleware: achieves transparency of heterogeneity at the platform level
    achieves communication and resource sharing, e.g., remote method invocation
    Examples
      CORBA (OMG)
      DCOM (Microsoft)
      ODP (ITU-T/ISO)
      Java Remote Method Invocation (Sun)
  Note: not all communication-related functions can be abstracted
© Pearson Education 2001
Client-Server
Basic model
  Client: process wishing to access data, use resources or perform operations on a different computer
  Server: process managing data and other shared resources; it gives clients access to these resources and performs computation on their behalf
  Interaction: invocation / result message pairs
  Example
    HTTP server: client (browser) requests a page, server delivers the page
Caching of services (proxy servers)
  caching of frequently used web pages
Peer processes (not client-server)
  processes that have largely identical functionality
Client-Server
Variants
  Service provided by multiple servers
    Example: many commercial web services are implemented through different physical servers
    Motivation
      performance (e.g., cnn.com, download servers)
      reliability
    Servers maintain either a replicated or a distributed database
Client-Server
Variants
  Proxy servers: render replication/distribution transparent
  Caching
    proxy server maintains a cache store of recently requested resources
    frequently used in search engines
Client-Server
Further variants of the client-server model
  Mobile code
    code that is sent to a client process to carry out a specific task
    Examples
      Applets
      Active Messages (containing communications protocol code)
  Mobile agents
    executing program (code + data) that migrates amongst processes, carrying out an autonomous task, usually on behalf of some other process
    advantages: flexibility, savings in communication cost
    examples: virtual markets, worm programs
  Thin clients
    executing a window-based user interface on the local computer while the application executes on a compute server
    example: X11 server (runs on the application client side)
  Mobile devices for mobile computing
    personal digital assistants (PDAs)
    how to connect to the Internet
      wireless LANs/MANs
      wireless Personal Area Networks
Client-Server
Further variants of the client-server model
  Spontaneous networking
    Characteristics
      WLAN confronted with a constantly changing set of heterogeneous mobile devices
      devices roaming in heterogeneous WLAN environments
    Benefits
      no need for a wireline connection
      easy access to locally available services
Client-Server
Further variants of the client-server model
  Spontaneous networking
    Challenges
      support for convenient connection and integration
        the Internet assumes a device has an IP address in a fixed subnetwork
        possible solution: dynamic allocation of IP addresses (cf. PPP)
        problem: how to find a device if it has server capabilities
      intermittent connectivity of devices
        unavailable when in tunnels, airplanes, etc.
      privacy
        ubiquity of location information
      security
        access to the device
        access rights in a dynamic, heterogeneous environment?
Client-Server
Further variants of the client-server model
  Spontaneous networking
    Discovery services
      services available in the network
      their properties, and how to access them (including device-specific driver information)
    Interfaces of discovery services
      registration service
        accepts registration requests from servers, stores properties in a database of currently available services
      lookup service
        matches requested services with available servers
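The two interfaces of a discovery service can be sketched in a few lines of Python. This is only an in-memory illustration: the class name `DiscoveryService`, the method names and the property keys (`type`, `colour`) are assumptions made for the example and do not come from any particular discovery protocol.

```python
class DiscoveryService:
    """Toy discovery service with a registration and a lookup interface."""

    def __init__(self):
        # Database of currently available services: name -> properties.
        self._services = {}

    def register(self, name, address, **properties):
        """Registration interface: a server announces itself and its properties."""
        self._services[name] = {"address": address, **properties}

    def unregister(self, name):
        """Remove a service, e.g. when the device leaves the network."""
        self._services.pop(name, None)

    def lookup(self, **required):
        """Lookup interface: names of all services matching the required properties."""
        return sorted(
            name
            for name, props in self._services.items()
            if all(props.get(key) == value for key, value in required.items())
        )


if __name__ == "__main__":
    ds = DiscoveryService()
    ds.register("printer-1", "10.0.0.5:631", type="printer", colour=True)
    ds.register("projector-1", "10.0.0.9:80", type="projector")
    print(ds.lookup(type="printer"))
```

A real discovery service would additionally have to cope with devices that disappear without unregistering, for instance by attaching a lease time to each registration.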
Client-Server
Interfaces
  Use of a client-server architecture has an impact on the software architecture used
    what are the synchronization mechanisms between client and server
    admissible types of requests/responses
Design challenges
  Quality of service
    performance
      response times
      throughput
      timeliness
      dependent on network latencies and compute times (including scheduling)
    reliability
    adaptability
  Dependability
    fault tolerance: system is expected to continue to function correctly in the presence of faults
    security
Fundamental Interaction Model
Distributed system
  multiple processes
  connected by communication channels
Distributed algorithm
  steps to be taken by each process
  communication between processes
    synchronization
    information flow
General paradigms to capture the behavioural aspects of a message-based distributed system executing an algorithm
  Communicating Extended Finite State Machines [Brand and Zafiropulo]
    also called FIFO channel systems (first-in-first-out principle)
  I/O Automata [Lynch]
[Figure: processes Pi, Pk, Pl connected by communication channels]
CFSMs
Following [Brand and Zafiropulo]
  concurrent FSMs (≥ 2) + communication channels (= "protocol")
  every FSM represents a concurrent, communicating process with a finite number of control states
  every communication channel
    1. is full-duplex,
    2. is error-free,
    3. has a first-in-first-out service strategy,
    4. has unbounded capacity
    (1.-3. characterize a perfect full-duplex channel)
    (question: how does one model an imperfect channel?)
  one pair of channels (cij and cji) for each pair (i, j) of machines
[Figure: three machines M1, M2, M3, pairwise connected by channels]
CFSMs
Formalization
  N: a positive integer
  i, j = 1, ..., N: indices over processes
  $(Q_i)_{i=1}^{N}$: N disjoint, finite sets; $Q_i$ denotes the state set of process i
  $(A_{ij})_{i,j=1}^{N}$: disjoint sets, with $(\forall i)(A_{ii} = \emptyset)$; $A_{ij}$ denotes the message alphabet for the channel i → j
  δ: relation determining, for each pair i, j, the following functions:
    $Q_i \times A_{ij} \to Q_i$ (send from i to j)
    $Q_i \times A_{ji} \to Q_i$ (receive from j at i)
  $(q_i^0)_{i=1}^{N}$: tuple of initial states, with $(\forall i)(q_i^0 \in Q_i)$
Definition
  we call $\left((Q_i)_{i=1}^{N}, (q_i^0)_{i=1}^{N}, (A_{ij})_{i,j=1}^{N}, \delta\right)$ a protocol
CFSMs
Notation
  si ∈ Qi: state of process i
  xij ∈ Aij: a message
  ?xij: reception of a message
  !yji: sending of a message
  f((s1, .., sn)) = (f(s1), .., f(sn))
  x, y: messages
  X, Y: sequences of messages
  x, xy, xY, xXY: concatenated sequences of messages
CFSMs
Alternating Bit Protocol (cf. [Holzmann 91])
  simple protocol securing unreliable message channels
  sender sends message msgn, with sequence number n ∈ {0, 1}
  receiver acknowledges with ackn
  sender sets the new sequence number to (1 + n) mod 2
  retransmission of the current message when a wrong sequence number is received
  a symmetric variant exists
[Figure: sender FSM (states s1, s2) with transitions !msg1, ?ack1, ?ack0, !msg0; receiver FSM (states r0, r1, r2) with transitions ?msg1 !ack1 and ?msg0 !ack0]
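The behaviour of the alternating bit protocol can be illustrated with a small round-based simulation. This is a sketch under several assumptions that are not in the slides: the function name `run_abp`, the deterministic loss pattern (every `drop_every`-th transmission is lost), and the modelling of a timeout as "retransmit in the next round if no matching acknowledgement arrived".

```python
from collections import deque


def run_abp(payloads, drop_every=3, max_rounds=10_000):
    """Deliver payloads in order over two lossy FIFO channels.

    Every drop_every-th transmission (data or ack) is lost; this
    deterministic loss pattern is an assumption made for the demo.
    """
    to_receiver, to_sender = deque(), deque()
    delivered = []
    transmissions = 0   # counts all frames put on either channel
    seq = 0             # sender's current sequence bit
    expected = 0        # sequence bit the receiver waits for
    index = 0           # position of the message currently being sent

    def transmit(channel, frame):
        nonlocal transmissions
        transmissions += 1
        if transmissions % drop_every != 0:   # frame survives
            channel.append(frame)

    for _ in range(max_rounds):
        if index == len(payloads):
            return delivered
        # Sender: (re)transmit the current message with its sequence bit.
        transmit(to_receiver, (seq, payloads[index]))
        # Receiver: accept a new message, ignore a duplicate, always ack.
        if to_receiver:
            bit, data = to_receiver.popleft()
            if bit == expected:
                delivered.append(data)
                expected ^= 1
            transmit(to_sender, bit)
        # Sender: advance only on an ack carrying the current bit.
        if to_sender:
            ack = to_sender.popleft()
            if ack == seq:
                seq ^= 1
                index += 1
    raise RuntimeError("no progress within max_rounds")


if __name__ == "__main__":
    print(run_abp(["a", "b", "c"], drop_every=4))
```

With `drop_every=4` some acknowledgements are lost, so the receiver sees duplicates, recognises them by their sequence bit and does not deliver them twice. A loss pattern that destroys every acknowledgement (for example `drop_every=2` in this round structure) prevents progress, which the sketch reports as an error.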
CFSMs
Semantics of a protocol?
  set of admissible state sequences
State of a protocol?
  combination of
    the local state of each of the 1 ... N processes, plus
    the state of all channels cij ∈ Aij*
      each cij corresponds to a sequence of messages that have been sent, but not yet received
  we call this the global system state
CFSMs
How do we obtain the set of all computations of a protocol, i.e., the set of sequences of global system states?
  initially: all processes are in qi0 and all cij = ε (empty)
  the system is in a current global system state s
  a state transition is triggered by send and receive events
    send event
      adds a message to the tail of the corresponding message queue (= channel)
      changes the local state of the sending process
    receive event
      takes the message to be received from the head of the message queue
      changes the local state of the receiving process
  leads to a new global system state s'
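CFSMs
Global system state
  let
    $P = \left((Q_i)_{i=1}^{N}, (q_i^0)_{i=1}^{N}, (A_{ij})_{i,j=1}^{N}, \delta\right)$ be a protocol,
    $S = (S_1, \ldots, S_N)$ an N-tuple of local process states,
    $C$ an N×N matrix
      $C = \begin{pmatrix} \varepsilon & c_{12} & \cdots & c_{1N} \\ c_{21} & \varepsilon & \cdots & c_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ c_{N1} & c_{N2} & \cdots & \varepsilon \end{pmatrix}$
    such that for all i, j: $c_{ij} \in A_{ij}^{*}$
  we call (S, C) a global system state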
CFSMs
State transition relation
  let P be a protocol and G = {(S, C) | (S, C) is a global system state}
  the relation |— ⊆ G × G is defined as follows:
  (S, C) |— (S', C') iff ∃ i, j, xij such that either
    a) (S, C) and (S', C') are identical except that
      si' = δ(si, !xij) (sending by i)
      cij' = cij xij
    or
    b) (S, C) and (S', C') are identical except that
      si' = δ(si, ?xji) (receiving by i)
      cji = xji cji'
CFSMs
Reachable global system state
  let
    G0 be the initial global system state of a protocol,
    G a global system state of the same protocol,
    |— the state transition relation of this protocol, and
    |—* the reflexive and transitive closure of |—
  we say that G is reachable if G0 |—* G
  paths and the language accepted by a protocol can be defined through |— (as is done for NFAs)
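The transition relation and the notion of reachability can be made concrete with a small explorer. The encoding is an assumption chosen for this sketch: each process i has a dictionary mapping `(state, (kind, j, msg))` to a successor state, where `kind` is `"!"` for a send to j or `"?"` for a receive from j. Because channels are unbounded, the search is cut off at a queue length of `max_queue`; it therefore explores only a finite part of the state space.

```python
from collections import deque


def successors(delta, states, channels):
    """Yield every global state reachable from (states, channels) in one step."""
    for i, local in enumerate(states):
        for (state, (kind, j, msg)), nxt in delta[i].items():
            if state != local:
                continue
            new_states = states[:i] + (nxt,) + states[i + 1:]
            new_channels = dict(channels)
            if kind == "!":
                # Send: append the message to the tail of channel i -> j.
                new_channels[(i, j)] = channels.get((i, j), ()) + (msg,)
                yield new_states, new_channels
            elif kind == "?":
                # Receive: only enabled if msg is at the head of channel j -> i.
                queue = channels.get((j, i), ())
                if queue and queue[0] == msg:
                    new_channels[(j, i)] = queue[1:]
                    yield new_states, new_channels


def freeze(states, channels):
    """Hashable form of a global state; empty channels are omitted."""
    return states, tuple(sorted((k, q) for k, q in channels.items() if q))


def reachable(delta, initial_states, max_queue=2):
    """Breadth-first search over global states with bounded channel length."""
    start = (tuple(initial_states), {})
    seen = {freeze(*start)}
    frontier = deque([start])
    while frontier:
        states, channels = frontier.popleft()
        for nxt_states, nxt_channels in successors(delta, states, channels):
            if any(len(q) > max_queue for q in nxt_channels.values()):
                continue  # bound exceeded: do not explore further
            key = freeze(nxt_states, nxt_channels)
            if key not in seen:
                seen.add(key)
                frontier.append((nxt_states, nxt_channels))
    return seen


if __name__ == "__main__":
    # Process 0 sends x once; process 1 receives it.
    delta = [{("a", ("!", 1, "x")): "b"}, {("c", ("?", 0, "x")): "d"}]
    for state in sorted(reachable(delta, ("a", "c"))):
        print(state)
```

The bound is a design choice of the sketch, not part of the model: a sender that loops on `!x` would otherwise generate infinitely many global states.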
CFSMs
Expressiveness
  Theorem: CFSMs are Turing-complete
    proof idea (other approaches are possible):
      three processes P1, P2, P3
      simulate the control of the TM in the state machine of P2
      use P1 and the channels c21 and c12 to simulate the left half of the tape, and P3 with c32 and c23 to simulate the right half
      note: all cij have unbounded length
  Consequences
    global state space has unbounded size
    undecidable problems:
      termination
      will some communication event ever be executed?
      is some system state reachable?
      is the protocol deadlock-free?
Fundamental Interaction Model
Performance characteristics of communication channels
  latency: delay between sending and receipt of a message
    network access time (e.g., Ethernet retransmission delay)
    time for the first bit to travel from the sender's network interface to the receiver's network interface
    processing time within the sending and receiving processes
  throughput: number of units (e.g., packets) delivered per time unit
  bandwidth: amount of information (e.g., bits) transmitted per time unit
  delay jitter: variation in delay between different messages of the same type (e.g., video frames in ATM networks)
Fundamental Interaction Model
Synchronous distributed system
  time to execute each step of computation within a process has known lower and upper bounds
  message delivery times are bounded by a known value
  each process has a clock whose drift rate from real time is bounded by a known value
Asynchronous distributed system: no bounds on
  process execution times
  message delivery times
  clock drift rate
Note
  synchronous distributed systems are easier to handle, but determining realistic bounds can be hard or impossible
  asynchronous systems are more abstract and general: a distributed algorithm executing on one such system is likely to also work on another one
Fundamental Interaction Model
Event ordering
  as we will see later, in a distributed system it is impossible for any process to have a view of the current global state of the system
  it is possible to record timing information locally and abstract from real time (logical clocks)
  event ordering rules
    if e1 and e2 happen in the same process, and e2 happens after e1, then e1 → e2
    if e1 is the sending of a message m and e2 is the receiving of the same message m, then e1 → e2
    if e1 → e2 and e2 → e3, then e1 → e3 (transitivity)
  hence, → describes a partial ordering relation on the set of events in the distributed system
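One common way to record the ordering → locally is a Lamport-style logical clock. The slides only mention "logical clocks" at this point, so the following is a sketch of that standard technique, with the class name `Process` and its method names chosen for the example.

```python
class Process:
    """Process with a Lamport-style logical clock (a single counter)."""

    def __init__(self, name):
        self.name = name
        self.clock = 0

    def local_event(self):
        """Any internal event advances the local clock."""
        self.clock += 1
        return self.clock

    def send(self):
        """Sending is an event; its timestamp travels with the message."""
        self.clock += 1
        return self.clock

    def receive(self, timestamp):
        """On receipt, jump past both the local clock and the sender's timestamp."""
        self.clock = max(self.clock, timestamp) + 1
        return self.clock


if __name__ == "__main__":
    p, q = Process("p"), Process("q")
    e1 = p.local_event()
    e2 = p.send()
    e3 = q.receive(e2)
    print(e1, e2, e3)
```

If e1 → e2 then the timestamp of e1 is smaller than that of e2. The converse does not hold: two events in different processes that are unrelated by → may still carry different timestamps.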
Failures
Omission failures
  process omission failures: process crashes
    detection with timeouts
    a crash is fail-stop if other processes can detect with certainty that the process has crashed
  communication omission failures: a message is not delivered (dropping of messages)
    possible causes:
      network transmission error
      overflow of the receiver's incoming message buffer
Arbitrary failures
  process: omits intended processing steps or carries out unwanted ones
  communication channel: e.g., non-delivery, corruption or duplication
Security
Protecting access to objects
  access rights
  in client-server systems: involves authentication of clients
Protecting processes and interactions
  threats to processes: problem of unauthenticated requests / replies
    e.g., "man in the middle"
  threats to communication channels: an enemy may copy, alter or inject messages as they travel across the network
    use of "secure" channels, based on cryptographic methods
Denial of service
  e.g., "pings" to selected web sites
  generating debilitating network or server load so that network services become de facto unavailable
Mobile code
  requires executability privileges on the target machine
  code may be malicious (e.g., mail worms)
Computer Networks
Computer networks
  "interconnected collection of autonomous computers" [Tanenbaum 1996]
Types of networks
  Local Area Networks (LANs)
    high-speed communication on proprietary grounds (on-campus)
    most typical solution: Ethernet with 100 Mbps
  Metropolitan Area Networks (MANs)
    high-speed communication for nodes distributed over medium-range distances, usually belonging to one organization
    providing a "backbone" to interconnect LANs
    technology often based on ATM, FDDI or DSL
    typical example: the university network
      ATM-based
      155 Mbit/s
      transports data and voice (telephony)
Computer Networks
Types of networks
  Wide Area Networks (WANs)
    communication over long distances
    covers computers of different organizations
    high degree of heterogeneity of the underlying computing infrastructure
    involves routers
    speeds up to a few Mbps possible, but around 50-100 Kbps more typical
    most prominent example: the Internet
  Wireless networks
    end-user equipment accesses the network through short- or mid-range radio or infrared signal transmission
    Wireless WANs
      GSM (up to about 20 Kbps)
      UMTS (up to Mbps)
      PCS
    Wireless LANs/MANs
      WaveLAN (2-11 Mbps, radio up to 150 metres)
    Wireless Personal Area Networks
      Bluetooth (up to 2 Mbps on low-power radio signal, < 10 m distance)
Computer Networks
Network type performance characteristics [table not reproduced]
Computer Networks
Network topologies for point-to-point networking
Star
  • short paths (always 2 hops)
  • robust against leaf node failure
  • but: whole network down if central node fails
  • sometimes physical star used to implement logical ring
Ring
  • varying path lengths
  • robust against node failure
  • basis for Token Ring/FDDI LANs
Tree
  • varying, relatively long path lengths
  • robust against leaf node failure
  • sensitive to internal node failure
  • suitable topology for multicast / broadcast
© Prentice-Hall 1996
Computer Networks
Network topologies for point-to-point networking
Mesh
  • completely connected graph
  • short paths (always 1 hop)
  • robust against node failure
  • expensive point-to-point wireline implementation
  • inexpensive shared ether implementation
Intersecting Rings
  • internetworking for token ring networks
  • sensitive to bridge node failure
Irregular
  • most commonly found Wide Area Network topology
Computer Networks
Protocols
  Agreement between two communicating parties on how the communication is to proceed
    syntax
      message formats
      data representation
    semantics
      when to send which message
      appropriate responses
      how to detect and handle failures
Services
  provide functions to the invoker of the service
  use other services while providing abstraction from the particulars of the used services
Computer Networks
Layered protocol architectures
[Figure: two protocol stacks side by side, each with layers n+1, n and n-1 linked by service interfaces; peer layers communicate through the (n+1)-protocol, n-protocol and (n-1)-protocol. Example: file transfer (ftp) over tcp over ip]
Computer Networks
Message formats
  header: sequence numbers, synchronization patterns, message types, etc.
  data: user data
  trailer: end sequence, error checksum
Encapsulation
[Figure: layer n receives the (n+1)-data together with n-interface control information through its service access point (SAP) and wraps it in an n-header and an n-trailer to form the n-data]
Network Architectures
A generic protocol architecture: the ISO Open Systems Interconnection Basic Reference Model (OSI-BRM)
Network Architectures
Protocol suite/stack
  stacked combination of protocol implementations collaborating to provide application services
  there are no efficient implementations of protocol stacks conforming to the OSI-BRM
Most actual protocol implementations follow the Internet Reference Model, forming the Internet Protocol Suite / Stack
OSI layers and the corresponding Internet RM layer, with Internet protocol examples
  Application, Presentation, Session → Application: SMTP (simple mail transfer protocol), FTP (file transfer), telnet (remote login), http (hypertext transfer protocol), XDR (external data representation)
  Transport → Transport: Transmission Control Protocol (TCP), User Datagram Protocol (UDP)
  Network → Internet: Internet Protocol (IP)
  Data Link, Physical → Host-to-network
OSI-BRM
Application layer
  Provides services that support the various types of distributed applications
  OSI protocols
    electronic mail (X.400, almost entirely extinct these days)
    name/directory services (X.500, some residual interest and some implementations)
  Internet protocols
    SMTP (simple mail transfer protocol)
    FTP (file transfer)
    telnet (remote login)
    http (hypertext transfer protocol)
OSI-BRM
Presentation layer
  Problem: different computers represent data in different formats
  Example: representation of the unsigned short integer 1 in 2 bytes (bytes shown in memory order)
    "big-endian" (e.g., Motorola 680x0): 00000000 00000001
    "little-endian" (e.g., Intel 80x86): 00000001 00000000
  In the Internet: XDR (external data representation), fixed conventions for the representation of data
    all integers 4-byte big-endian
    floating point numbers in IEEE format
    texts in ASCII
    all fields aligned on 4-byte word boundaries
  Problem: may require two conversions
  An XDR-to-C compiler exists
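The byte-order problem and the XDR conventions can be tried out with Python's `struct` module. The helper names `xdr_encode_int` and `xdr_encode_string` are made up for this sketch; they cover only two XDR types and are not a complete XDR implementation.

```python
import struct

# The unsigned short integer 1 in two bytes, in both byte orders.
big_endian = struct.pack(">H", 1)      # most significant byte first
little_endian = struct.pack("<H", 1)   # least significant byte first


def xdr_encode_int(value):
    """Encode a signed integer XDR-style: 4 bytes, big-endian."""
    return struct.pack(">i", value)


def xdr_encode_string(text):
    """Encode ASCII text XDR-style: 4-byte length, then the bytes,
    zero-padded so that the field ends on a 4-byte boundary."""
    data = text.encode("ascii")
    padding = (-len(data)) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


if __name__ == "__main__":
    print(big_endian.hex(), little_endian.hex())
    print(xdr_encode_int(1).hex())
    print(xdr_encode_string("abc").hex())
```

A little-endian sender converts to the XDR form and a little-endian receiver converts back, which is the "two conversions" cost noted above.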
OSI-BRM
Session layer
  Supports session-oriented traffic (classical database applications, file transfer, etc.)
  Two main functions
    send-token management
    synchronization/resynchronization after failures
  Non-existent in the Internet RM
    functions are the responsibility of the application or the application-layer protocols
    example: resynchronization in ftp after failure
      maintain a pointer to the last transmitted byte in the source file
OSI-BRM
Transport layer
  Provides services for application message exchanges between peer application entities
  Interfaces with the underlying network
    if application messages are too big for the network layer, segment them and reassemble them at the receiving end
    use multiple network connections for one application connection (if higher bandwidth is needed than one network connection can deliver)
    multiplex multiple application connections over one network connection, if possible, to use network bandwidth efficiently
OSI-BRM
Transport layer
  Provides a connection across the network with well-defined qualities (QoS, quality of service)
    connection establishment delay
    connection establishment failure probability
    throughput
    transit delay
    residual error ratio
    protection
    priority
OSI-BRM
Transport layer
  Provides connection-oriented as well as connectionless services
    connection-oriented:
      1. establish a connection between a well-defined source service access point (or port) p and a destination service access point p'
      2. send messages to ports without providing the target address
[Figure: connection setup between transport entities at ports p and q, with primitives con(hostB), acc(hostB, q), con(hostB, q), acc(hostB, p), followed by the exchange of msg]
OSI-BRM
Transport layer
  Provides connection-oriented as well as connectionless services
    connectionless: send messages, providing the target address for each message sent
  connectionless vs. connection-oriented
    connectionless
      no overhead for connection setup and release
      potential bandwidth loss due to complete address information in every message
      no possibility to perform error correction (pushed into the application)
    connection-oriented
      setup overhead, but no bandwidth loss
      need to reserve network resources
      facilitates ensuring connection properties
        order preservation
        retransmission
[Figure: connectionless exchange with primitives send(hostB, msg) and rec(hostB, msg)]
OSI-BRM
Transport layer
  Ports
    link an application process to a transport connection
    permit identifying a remote application process (or service)
      note: use of the process id in the target node would be unsuitable, since pids are generated and destroyed dynamically in most operating systems
  The Internet protocol architecture defines reserved port numbers, e.g.
    FTP: 21 (ftp connection establishment etc.)
    FTP-DATA: 20 (ftp data transfer)
    TELNET: 23 (terminal connection)
    SMTP: 25 (mail delivery)
    HTTP: 80 (http requests)
OSI-BRM
Transport layer
  Internet protocols
    UDP (User Datagram Protocol)
      provides unreliable, connectionless transport service
        no guarantee of order preservation or delivery
        message duplications are possible
        facilitates multicast
      application areas: context-free protocols, simple client-server applications, i.e., one request, one reply
        Domain Name Server lookup
        SNMP requests
        NFS requests
        multimedia protocols that do not require error correction
      UDP header format: [figure not reproduced]
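The header figure is not reproduced here, so the following sketch assumes the standard UDP layout of four 16-bit fields in network byte order: source port, destination port, length (header plus data) and checksum. The function names are chosen for the example, and the checksum is left at 0, which in UDP over IPv4 means "no checksum computed".

```python
import struct

UDP_HEADER = struct.Struct("!HHHH")  # src port, dst port, length, checksum


def build_udp_datagram(src_port, dst_port, payload, checksum=0):
    """Prefix the payload with an 8-byte UDP header."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_datagram(datagram):
    """Split a datagram into its header fields and payload."""
    src, dst, length, checksum = UDP_HEADER.unpack(datagram[:UDP_HEADER.size])
    return {
        "src_port": src,
        "dst_port": dst,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER.size:length],
    }


if __name__ == "__main__":
    datagram = build_udp_datagram(5353, 53, b"query")
    print(parse_udp_datagram(datagram))
```

Note how little state the header carries: no sequence number and no acknowledgement field, which is why ordering and retransmission are left to the application.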
OSI-BRM
Transport layer
  Internet protocols
    TCP (Transmission Control Protocol)
      provides connection-oriented transport service
        error-correcting
        order-preserving
        segmentation of the application-layer data stream
        duplex communication
      transport connection uniquely identified through
        network (IP) addresses of sender and receiver
        port addresses of sender and receiver
        protocol identifier for TCP (= 6)
      TCP header and pseudo-header: [figure not reproduced]
OSI-BRM
Transport layer
  Internet protocols
    TCP (Transmission Control Protocol)
      despite its complexity, allows for high data rates (experimentally up to 100 Mbit/s)
      usable in LAN/MAN/WAN environments
      typical applications
        email (SMTP)
        file transfer (ftp)
        remote terminal (telnet)
        remote graphics terminal (X11 for X Windows)
        http
OSI-BRM
Network layer
  Central questions:
    addressing: how to identify the target computer
    routing: how to route the message most effectively through the network
    packet switching: will there be a new path for every packet, or will there be predetermined paths from source to destination?
    connection setup and release
    end-to-end error detection, ensuring packet ordering, flow control
  General functionality: network transparency for the transport layer
    provide an end-to-end transport connection independent of actual routing and switching decisions
OSI-BRM
Network layer
  Packet switching
    virtual circuit: a fixed path for all packets of a connection is determined at connection setup time
      facilitates order preservation
      route determination costs are incurred only once per connection
      inflexible in adapting to changing network loads and configurations
    datagram: routing decision for every packet in every node
      full address information in every packet
      less overhead for connection establishment, easier to implement
      more flexible for short-lived connections
OSI-BRM
Network layer
  Routing algorithms
    objectives
      minimize average packet delay
      maximize total throughput
      efficient implementation
    these objectives conflict; therefore often used: minimize the number of hops (visited nodes) per packet
      reduces delay
      reduces needed bandwidth
      increases throughput
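Minimum-hop routes can be computed with a breadth-first search from each node. The sketch below builds the routing table of one node; the adjacency-list representation of the network and the function name `next_hops` are assumptions made for the example.

```python
from collections import deque


def next_hops(graph, source):
    """Return {destination: (next_hop, hop_count)} for minimum-hop routes.

    graph maps each node to the list of its directly connected neighbours.
    """
    table = {}
    visited = {source}
    queue = deque()
    # Direct neighbours are reached in one hop via themselves.
    for neighbour in graph[source]:
        if neighbour not in visited:
            visited.add(neighbour)
            table[neighbour] = (neighbour, 1)
            queue.append(neighbour)
    # Everything further away inherits the first hop of its predecessor.
    while queue:
        node = queue.popleft()
        first_hop, hops = table[node]
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                table[neighbour] = (first_hop, hops + 1)
                queue.append(neighbour)
    return table


if __name__ == "__main__":
    network = {
        "A": ["B", "C"],
        "B": ["A", "D"],
        "C": ["A", "D"],
        "D": ["B", "C", "E"],
        "E": ["D"],
    }
    for destination, route in sorted(next_hops(network, "A").items()):
        print(destination, route)
```

Running this once per node at network setup time yields fixed tables; the hop count ignores link load and capacity, which is the simplification named above.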
OSI-BRM
Network layer
  Routing algorithms
    static (non-adaptive) algorithms
      determination of network routes for every pair of nodes at network setup time
      no consideration of current network status and load (average values used)
      no change of routes during network operation
OSI-BRM
Network layer
  Routing algorithms
    dynamic (adaptive) algorithms
      determination of network routes based on measurement/estimation of current network load and configuration
        centralized: one central node makes routing decisions
        isolated: routing decisions based solely on local traffic and load information (backward learning, routing, delta-routing)
        distributed: nodes exchange routing information (distance vector routing, RIP)
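The distributed variant can be illustrated with a simplified distance vector computation. This sketch makes several assumptions: links are bidirectional with a fixed positive cost, all nodes exchange their vectors in synchronous rounds, and the function name `distance_vector` is chosen for the example. Real protocols such as RIP add timers, route expiry and measures against counting to infinity, none of which are modelled here.

```python
INF = float("inf")


def distance_vector(links):
    """links maps (u, v) to the cost of the bidirectional link u-v.

    Returns {node: {destination: (cost, next_hop)}} once no table changes.
    """
    neighbours = {}
    for (u, v), cost in links.items():
        neighbours.setdefault(u, {})[v] = cost
        neighbours.setdefault(v, {})[u] = cost

    # Initially every node only knows the route to itself.
    tables = {node: {node: (0, node)} for node in neighbours}

    changed = True
    while changed:
        changed = False
        # Each round, every node receives a copy of its neighbours' vectors.
        snapshot = {node: dict(table) for node, table in tables.items()}
        for node, adjacent in neighbours.items():
            for neighbour, link_cost in adjacent.items():
                for destination, (cost, _) in snapshot[neighbour].items():
                    candidate = link_cost + cost
                    known = tables[node].get(destination, (INF, None))[0]
                    if candidate < known:
                        tables[node][destination] = (candidate, neighbour)
                        changed = True
    return tables


if __name__ == "__main__":
    tables = distance_vector({("A", "B"): 1, ("B", "C"): 2, ("A", "C"): 5})
    print(tables["A"])
```

In the example, A learns from B that C is reachable at cost 3 via B, which beats the direct link of cost 5.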
Data Link Layer
Main functions
  error detection and correction
    physical media are prone to signal distortions due to external impulses and material properties
    typical value for the error probability of a 32-bit block over telephone wire: 0.0016
    error detection: using a checksum (e.g., parity bits)
      to detect e bit errors one needs a code with Hamming distance e+1
    error correction: checksum plus exact information about which bit flipped
      to correct e bit errors one needs a code with Hamming distance 2e+1
    often used: checksum generated through a cyclic redundancy check (CRC)
      detects all error bursts with a length of up to 16 bits, and 99.998% of all longer bursts
      based on polynomial division
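Both ideas, Hamming distance and a CRC based on polynomial division, fit in a short sketch. The generator polynomial is an assumption: the slides do not name one, so the example uses the 16-bit CCITT polynomial x^16 + x^12 + x^5 + 1 (0x1021) with an initial value of 0. The function names are likewise chosen for the example.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two code words differ."""
    return bin(a ^ b).count("1")


def crc16(data, poly=0x1021, crc=0x0000):
    """Bitwise CRC-16: remainder of the message polynomial divided by poly."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def make_frame(payload):
    """Append the 16-bit checksum to the payload."""
    return payload + crc16(payload).to_bytes(2, "big")


def frame_ok(frame):
    """With the checksum appended, the remainder of the whole frame is zero."""
    return crc16(frame) == 0


if __name__ == "__main__":
    frame = make_frame(b"hello")
    damaged = bytearray(frame)
    damaged[1] ^= 0xFF            # an 8-bit error burst
    print(frame_ok(frame), frame_ok(bytes(damaged)))
```

The damaged frame is rejected because a burst of 8 bits is shorter than the 16-bit checksum, the case the CRC is guaranteed to detect.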
Data Link Layer
Main functions
  frames: error-detecting and error-correcting codes need frame delimiters
    bit stuffing
  acknowledgement and retransmission of erroneous frames
    sequence numbers
    go-back-n
  flow control: nodes may receive more traffic than they can deliver to adjacent nodes, but have limited buffer capacity, which can lead to buffer overflow
    sliding window protocol (of size n)
      the sending node may race ahead by up to n unacknowledged messages
      if the last acknowledged packet is k and a new acknowledgement l > k arrives, the sender may transmit up to sequence number l+n
      includes acknowledgement/retransmission functionality
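Bit stuffing can be sketched as follows. The slides do not spell out the rule, so the example assumes the HDLC-style convention: frames are delimited by the flag 01111110, and the sender inserts a 0 after every run of five consecutive 1s in the payload so that the flag can never appear inside a frame. Bits are represented as strings of "0" and "1" purely for readability.

```python
FLAG = "01111110"


def stuff(bits):
    """Insert a 0 after every run of five consecutive 1s."""
    out, run = [], 0
    for bit in bits:
        out.append(bit)
        run = run + 1 if bit == "1" else 0
        if run == 5:
            out.append("0")   # stuffed bit
            run = 0
    return "".join(out)


def unstuff(bits):
    """Remove the bit that follows every run of five consecutive 1s."""
    out, run, skip = [], 0, False
    for bit in bits:
        if skip:              # this is the stuffed 0: drop it
            skip = False
            run = 0
            continue
        out.append(bit)
        run = run + 1 if bit == "1" else 0
        if run == 5:
            skip = True
    return "".join(out)


def frame(bits):
    """Delimit the stuffed payload with flags."""
    return FLAG + stuff(bits) + FLAG


if __name__ == "__main__":
    payload = "0111111011111111"
    print(stuff(payload))
    print(unstuff(stuff(payload)) == payload)
```

Since the stuffed payload never contains six 1s in a row, the receiver can treat any occurrence of the flag as a frame boundary.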
Distributed Systems Fall 2009 II 66
Data Link Layer
Examples
for WANs
  HDLC: High-Level Data Link Control (ISO-standardized)
  LAPB: Link Access Procedure, Balanced (CCITT/ITU-T, for X.25)
for LANs
  LLC 2: Logical Link Control Type 2
Physical Layer
Features
defines the physical characteristics of the signal transmission
example: bit encoding mechanisms
© Prentice-Hall 1996
Internet Protocol Architecture
Comparison OSI-BRM vs. Internet
presentation and session layers are not implemented in the Internet architecture; their functions are implemented in the application (e.g., XDR encoding)
IP provides less functionality than the network layer in OSI-BRM
L2 and L1 are not specified in the Internet architecture

OSI-BRM                 Internet
L7 Application          smtp, ftp, telnet, http
L6 Presentation         (part of application)
L5 Session              (part of application)
L4 Transport            TCP
L3 Network              IP
L2 Data Link Control    LAN, (M)WAN, proprietary networks
L1 Physical             LAN, (M)WAN, proprietary networks
Addressing
Addressing in the Internet Protocol
addresses used in source and destination fields of the Internet Protocol
requirements
  define a unique address for any node in the Internet
    no two nodes on the Internet may have the same address
  define a sufficiently large address space
    IPv4 (1982): 32-bit addresses for 2^32 (approx. 4 billion) addresses
    insufficient due to
      unforeseen growth of the Internet
      inefficient use of address space
    IPv6 (1994): 128-bit addresses for 2^128 (approx. 3x10^38) addressable nodes
      max. 7x10^23 IP addresses per m^2 of entire earth surface
      if as inefficiently allocated as phone numbers: 10^3 per m^2
  support a flexible routing scheme, but addresses themselves should not contain routing information
Addressing
Addressing in the Internet Protocol
address class structure
  Class A: for very large networks
  Class B: for smaller networks with more than 255 nodes
  Class C: for all other networks
  Class D: for multicast communication
  Class E: unallocated, for future use
© Pearson Education 2001
Addressing
Addressing in the Internet Protocol
decimal address representation
problems:
  network administrators cannot predict the growth of their subnets
  they tend to apply for class B network addresses, even though subnets then tend to be smaller than 255 nodes (i.e., class C would be sufficient)
  leads to inefficient use of address space
Addressing
Addressing in the Internet Protocol
a typical intranet/subnet architecture
Routing
Principles of Distance Vector routing in WANs
graph of nodes and links, links labeled with link numbers
find cost-optimal path from one node to another, i.e., paths with minimum number of hops
  e.g., from C to D: C-E-D has minimal cost of 2
every node maintains a routing table with the following information

  in node C:
    target  outlink  cost
    D       5        2

  in node E:
    target  outlink  cost
    D       6        1
Routing
Principles of Distance Vector routing in WANs
cost column not used for routing decision, but for construction of routing table
construction of routing tables: Routing Information Protocol (RIP)
  each node specifies single-hop routing information
  use of a distributed algorithm, based on the Ford-Fulkerson shortest path algorithm (a distributed form of the Bellman-Ford algorithm)
  principle
    each node shares its local routing table information with all its direct neighbours: it sends a copy of its local routing table to all adjacent nodes
      periodically, when a timer t expires (in the Internet typically t = 30 sec.)
      when the local routing table changes
    when receiving a routing table from a node R
      update table with new route or with existing route with better cost (compare local value with R's value + 1)
      when received on link n, and received table shows a different value for some local route starting with n, then copy the value from the received table (R is closer to the destination and therefore its table is more authoritative)
Ford and Fulkerson showed that this algorithm converges on best routes whenever there is a change in the network
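The receive rule can be sketched as follows. This is an illustrative reading of the two update rules above, not the protocol's reference implementation: the names `RipTable`, `Route` and `receive` are hypothetical, links are identified by their labels, and link failures (cost ∞) are not modelled.

```java
import java.util.HashMap;
import java.util.Map;

class RipTable {
    // One row of a routing table: outgoing link and cost in hops.
    static final class Route {
        final String link;
        final int cost;
        Route(String link, int cost) {
            this.link = link;
            this.cost = cost;
        }
    }

    // Local routing table Tl: target node -> route.
    final Map<String, Route> routes = new HashMap<>();

    // Processes a table Tr received on link n; returns true if Tl changed
    // (in which case the node would send its table to its neighbours).
    boolean receive(Map<String, Route> received, String n) {
        boolean changed = false;
        for (Map.Entry<String, Route> row : received.entrySet()) {
            Route remote = row.getValue();
            if (remote.link.equals(n)) {
                continue;                      // neighbour routes this target back through us
            }
            Route candidate = new Route(n, remote.cost + 1);   // one extra hop via link n
            Route local = routes.get(row.getKey());
            boolean isNew = local == null;
            boolean isBetter = !isNew && candidate.cost < local.cost;
            // Our route already uses link n, so the neighbour's value is more authoritative.
            boolean isUpdate = !isNew && local.link.equals(n) && local.cost != candidate.cost;
            if (isNew || isBetter || isUpdate) {
                routes.put(row.getKey(), candidate);
                changed = true;
            }
        }
        return changed;
    }
}
```

For example, if node C knows only itself and E (via link 5) and receives E's table on link 5, it learns a route to D via link 5 with cost 2.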
Routing
Principles of Distance Vector routing in WANs
The Routing Information Protocol (RIP) Algorithm
  Tl: local routing table
  Tr: received routing table
Routing
Principles of Distance Vector routing in WANs
The Routing Information Protocol (RIP) Algorithm
dealing with links going down
  set cost value in table to ∞ and perform Send action
  will get propagated until there is a node that has a working link to the target node, which will then eventually get propagated to all other nodes
extensions: RIP-1
  cost represents actual bandwidth
  increased speed of convergence
  avoidance of loops
Routing
Link-state algorithms
every node has knowledge of the complete network topology including link states and link costs
possible since routers have become more powerful
Open Shortest Path First (OSPF) algorithm
  all nodes compute shortest paths locally using Dijkstra's shortest path first algorithm
  when topology changes, nodes exchange change information and update local databases
  advantages
    faster convergence
    no undefined states as in RIP
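The local computation each link-state router performs can be illustrated with a compact version of Dijkstra's algorithm. This is a sketch only: the adjacency-matrix representation (0 meaning "no link") and the names `LinkStateRouter` and `shortestPaths` are assumptions, and OSPF's database exchange is not shown.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

class LinkStateRouter {
    // cost[i][j] > 0 is the cost of the link between nodes i and j; 0 means no link.
    // Returns the cost of the cheapest path from source to every node.
    static int[] shortestPaths(int[][] cost, int source) {
        int n = cost.length;
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);      // MAX_VALUE marks "not reachable yet"
        dist[source] = 0;
        // Queue entries are {node, distance}, ordered by distance.
        PriorityQueue<int[]> queue = new PriorityQueue<>((a, b) -> Integer.compare(a[1], b[1]));
        queue.add(new int[] {source, 0});
        while (!queue.isEmpty()) {
            int[] head = queue.poll();
            int u = head[0];
            if (head[1] > dist[u]) {
                continue;                          // stale entry, a shorter path was found already
            }
            for (int v = 0; v < n; v++) {
                if (cost[u][v] > 0 && dist[u] + cost[u][v] < dist[v]) {
                    dist[v] = dist[u] + cost[u][v];
                    queue.add(new int[] {v, dist[v]});
                }
            }
        }
        return dist;
    }
}
```

Because every node holds the same topology database, all nodes arrive at consistent routes without the iterative table exchange that RIP needs.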
Coexistence of algorithms
different routing algorithms may coexist since routing tables contain identical information for all algorithms
however, for routing table creation and update, the same algorithm needs to be used
often, the network is divided into subnets, and one algorithm is used in every subnet
Routing
Routing table explosion
the need to store routing information from every node in the IP address space to every other node leads to table size explosion
two remedies
  topological/geographic grouping of IP addresses, so that addresses in one topological area are all routed to a central router of that area
    e.g., all addresses 194.0.0.0 to 195.255.255.255 in Europe
    outside Europe, addresses in this range (discernible through the first octet) are routed to the closest European router, which then performs detailed routing
    problem: before 1993, IP addresses were assigned without regard to geographic location, and these addresses are still in use
  usage of default routes
    not all nodes in a subnet need to store complete routing information as long as key routers close to the backbone have complete routing information
Routing
Default routes
  target   outlink  cost
  B        2        1
  C        local    0
  E        5        1
  default  5
Routing
Routing on a local subnet
if subnet is an Ethernet or Token Ring, no routing is necessary
IP layer uses the Address Resolution Protocol (ARP) to determine the physical address of the local host
Routing
Classless Inter-Domain Routing (CIDR), 1996
Scarcity of Class B addresses, while plenty of Class C addresses were available
Subdivide Class C address space by assigning batches of contiguous addresses to subnets of more than 255 nodes
For efficient routing: add mask field to routing table information
  arbitrary, unmasked portion of Class C address can form subnet address
Example:
  net A: 2048 addresses 194.24.0.0 to 194.24.7.255, mask 255.255.248.0
  net B: 4096 addresses 194.24.16.0 to 194.24.31.255 (on 4096 boundary), mask 255.255.240.0
  net C: 1024 addresses 194.24.8.0 to 194.24.11.255, mask 255.255.252.0

  routing table:
  net  address                              mask
  A    11000010 00011000 00000000 00000000  11111111 11111111 11111000 00000000
  B    11000010 00011000 00010000 00000000  11111111 11111111 11110000 00000000
  C    11000010 00011000 00001000 00000000  11111111 11111111 11111100 00000000

  given address 194.24.17.4, bitwise AND with all masks in table
  only the result of ANDing with the net B mask gives a valid address

    11000010 00011000 00010001 00000100   (194.24.17.4)
    11111111 11111111 11110000 00000000   (net B mask)
    11000010 00011000 00010000 00000000   (= 194.24.16.0, net B)

  => route according to the net B line of the routing table
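The masking step can be tried out with a few lines of Java. The table holds the three networks from the example; the class name `CidrLookup`, the `"default"` fallback and the first-match strategy are assumptions of this sketch (real routers typically prefer the longest matching prefix when several entries match).

```java
class CidrLookup {
    // Routing table rows: {net name, network address, mask}, taken from the example.
    static final String[][] TABLE = {
        {"A", "194.24.0.0", "255.255.248.0"},
        {"B", "194.24.16.0", "255.255.240.0"},
        {"C", "194.24.8.0", "255.255.252.0"},
    };

    // Converts dotted decimal notation into a 32-bit integer.
    static int toInt(String dotted) {
        int value = 0;
        for (String part : dotted.split("\\.")) {
            value = (value << 8) | Integer.parseInt(part);
        }
        return value;
    }

    // ANDs the destination with each mask and compares with the network address.
    static String route(String destination) {
        int dest = toInt(destination);
        for (String[] row : TABLE) {
            if ((dest & toInt(row[2])) == toInt(row[1])) {
                return row[0];
            }
        }
        return "default";   // hypothetical fallback when no table entry matches
    }

    public static void main(String[] args) {
        System.out.println("194.24.17.4 -> net " + route("194.24.17.4"));
    }
}
```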
Internet Protocol
IP protocol
network protocol of the Internet protocol stack
provides network service with the following characteristics
  no guarantee of delivery
  duplication possible
  unbounded delay
  no order preservation
address resolution
  IP addresses may need to be mapped to physical addresses (e.g., on Ethernet)
  use Address Resolution Protocol (ARP)
    either direct relation between IP and physical address, or mapping
transport layer service provides data stream transmission service
  broken into network layer datagrams by transport layer
  IP: max. length of datagram 64 Kbytes, usually 1500 bytes, if necessary by fragmenting and reassembling them further
Internet Protocol
IP version 6 (IPv6), 1994
enlarged address space
improved routing speed
  no checksum applied to body, only to header
  no datagram fragmentation occurs inside the network (a mechanism determines the smallest datagram size along the path before a packet is transmitted)
support for real-time traffic
  priority: preferred handling of high-priority packets
  flow labels: resource reservation for specific types of real-time traffic
future evolution of protocol: next header may point to special header inside packet body (e.g., for router information)
support for multicast, anycast, and security
Internet Protocol
Mobile IP
support for roaming of laptop computers, personal digital assistants (PDAs), wearable computing devices, etc.
IP addresses are bound to subnet addresses, but roaming may leave the subnet boundary
Internet Protocol
Mobile IP
every IP address is assigned to a "home" domain
home agent (HA) and foreign agent (FA)
  when device is at home, will behave as local router
  when device leaves home area, it informs HA
    HA behaves as proxy: will answer ARP requests for mobile device with own local network address
  when device arrives at a new site, it informs FA
    FA assigns temporary, local address to mobile device
    FA then contacts HA and gives mobile device's temporary and permanent address
arriving packet for mobile device
  routed to HA
  tunneled to FA and delivered to mobile device
  subsequent packets from same source tunneled directly from sender to FA
Local Area Networks
IEEE 802
Two most important LAN technologies
  Token Ring
  Ethernet
Both technologies share a single medium (ring, bus)
  LLC: error correction etc.
  MAC: mechanisms for synchronizing access to the shared medium
Layer stack (top to bottom): Network, Logical Link Control (LLC), Medium Access Control (MAC), Physical
Local Area Networks
Ethernet
Principle of operation: Carrier Sensing, Multiple Access with Collision Detection (CSMA/CD)
  all stations connected to linear or tree-like branching cable
  frame format
  carrier sensing:
    all stations permanently listen for frames with own address as destination address
  multiple access with collision detection
    an arbitrary number of stations can attempt to send a frame by broadcasting it on the shared medium if they sense the medium is free ("carrier")
    if more than one station sends at the same time, a collision occurs
    if a sending station detects a collision, it will start applying a jamming signal to inform all stations that the currently transmitted data is invalid
    retransmission necessary, either immediately, or after deterministic or random delay
Local Area Networks
Ethernet
Performance
  currently, mostly 100 Mbit/sec, up to 1 Gbit/sec
  nondeterministic delays
    very fast at low to mid utilization
    increasing delays due to collisions and retransmissions at utilizations above 50%
Characteristics
  comparatively low cost
  easy extensibility
Local Area Networks
Wireless LAN (IEEE 802.11)
shared radio medium, but differences from Ethernet
  hidden stations: one station may fail to detect that another transmits
  fading: one station may easily be out of reach of another, although they can both send to some common station in the middle
  collision masking: listening to detect a collision won't work since the own sending signal will be much stronger than the remote signal, and hence a collision may be undetectable by the sending station
Local Area Networks
Wireless LAN (IEEE 802.11)
Carrier Sensing Multiple Access/Collision Avoidance (CSMA/CA)
  sender to receiver: Request To Send (RTS) message, specifying duration of requested send slot
  receiver replies: Clear To Send (CTS) message, repeating duration
  stations within reach of the sender will notice the RTS, and stations within reach of the receiver will notice the CTS, so stations in reach of either sender or receiver will refrain from sending during the slot
  should RTS and CTS collisions occur (which is unlikely), a random delay is used
Interprocess Communication
Distributed Systems rely on exchanging data and achieving synchronization amongst autonomous distributed processes
Interprocess communication (IPC)
  shared variables
  message passing
    message passing in concurrent programming languages
      language extensions
      API calls
Principles of IPC
(see also [Andrews and Schneider 83] G. Andrews and F. Schneider, Concepts and Notations for Concurrent Programming, ACM Computing Surveys, Vol. 15, No. 1, March 1983)
  Concurrent programs: collections of two or more sequential programs executing concurrently
  Concurrent processes: collection of two or more sequential programs in operation, executing concurrently
Interprocess Communication
Synchronization
concurrent processes on different computers execute at different speeds
need for one process to influence computation in another process
  specify constraints on the ordering of events in concurrent processes
  message passing
    send message happens before receive message
Message Passing Primitives
  send expression_list to destination_designator
    evaluates expression_list
    adds a new message instance to channel destination_designator
  receive variable_list from source_designator
    assigns received values to variables in variable_list
    destroys received message
Central questions
  how are destination designators specified?
  how is communication synchronized?
Interprocess Communication
Destination Designation
direct naming: source and destination process names serve as designators (a pair of source and destination designators defines a channel)
  send cur_status to monitor      or   monitor!cur_status
  receive message from handler    or   handler?message
  easy to implement and use
  allows a process easy control when to receive which message from which other process
  use to implement client/server applications
    well suited to implement the client/server paradigm if there is one client and one server
    otherwise: server should be capable of accepting invocations from any client at any time, and a client should be allowed to invoke many services at a time if more than one server is available
Interprocess Communication
Destination Designation
global names or mailboxes: process-name-independent destination designator shared by many processes
  messages sent to a mailbox can be received by any process that executes a receive referring to that mailbox name
  to implement client/server applications
    clients send requests to mailbox, an available server picks them up
  drawback: costly implementation
    message sent to mailbox
    relayed to all other sites that could potentially receive from that mailbox
    if one site decides to receive, inform all other sites that message is no longer available for receipt
    mutual exclusion for concurrent access
ports: mailbox, but only one process is permitted to receive from that mailbox
  easy to implement: receives can occur in only one process, no distribution and coordination necessary
  suitable for multiple clients / single server applications
Message destinations in Internet programming
  hybrid direct naming/port scheme
  (Internet_address, port_number)
    port corresponds to many sender/one receiver concept
Interprocess Communication
Channel Naming
static (at compile time)
  impossible for a program to communicate along a channel not known at compile time
  inflexibility: if a process might ever need to communicate with a recipient, that channel must be available throughout the entire runtime of the sending program
dynamic (at runtime)
  administrative overhead at runtime
  more flexible allocation of communication resources
Interprocess Communication
Semantics of message passing primitives
Blocking
  nonblocking: the execution will never delay the invoking process
  blocking: otherwise
Synchronization
  asynchronous message passing: message passing using buffers with unbounded capacity
    sender may race ahead an unbounded number of steps
    sender never blocks
    receiver blocks on empty queue
  synchronous message passing: no buffering between sender and receiver
    sender blocks until receiver ready to receive
    receiver blocks until sender ready to send
  buffered message passing: buffers with bounded, finite capacity
    sender may race ahead a finite, bounded number of steps
    sender blocks on full buffer
    receiver blocks on empty buffer
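Within a single JVM, the three synchronization styles can be approximated with the standard `java.util.concurrent` queues. This is an analogy under stated assumptions rather than a distributed implementation: a `LinkedBlockingQueue` stands in for the unbounded buffer (it is bounded only by `Integer.MAX_VALUE`), a `SynchronousQueue` for the unbuffered channel, and an `ArrayBlockingQueue` for the bounded buffer. The class `ChannelKinds` is hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;

class ChannelKinds {
    // Asynchronous: put() practically never blocks, take() blocks on an empty queue.
    static BlockingQueue<String> asynchronous() {
        return new LinkedBlockingQueue<>();
    }

    // Synchronous: put() blocks until a receiver calls take(), and vice versa.
    static BlockingQueue<String> synchronous() {
        return new SynchronousQueue<>();
    }

    // Buffered: put() blocks when the buffer is full, take() blocks when it is empty.
    static BlockingQueue<String> buffered(int capacity) {
        return new ArrayBlockingQueue<>(capacity);
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> channel = buffered(2);
        Thread receiver = new Thread(() -> {
            try {
                for (int i = 0; i < 3; i++) {
                    System.out.println("received " + channel.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        receiver.start();
        for (int i = 0; i < 3; i++) {
            channel.put("msg" + i);   // may block once two messages are pending
        }
        receiver.join();
    }
}
```

The nonblocking `offer` method of the same queues returns `false` instead of waiting, which corresponds to a nonblocking send that drops the message.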
Interprocess Communication
Nonblocking primitives for asynchronous or buffered message passing
  receive
    background variant: process continues, receives interrupt upon arrival
      overhead for implementation
    ignoring variant: program polls for availability
  send
    sending process waits for empty buffer or drops message to be sent
Interprocess Communication
Distributed Applications
availability of a set of pre-implemented application services, like email, ftp, http, etc.
what if you want to build your own, customized Internet application?
  access to Transport Layer services
Services provided by Internet Transport Layer
  UDP: message passing (datagram)
  TCP: data stream
Interprocess Communication
Sockets
Internet IPC mechanism of Unix and other operating systems (BSD Unix, Solaris, Linux, Windows NT, Macintosh OS)
processes in these OS can send and receive messages via a socket
sockets are duplex
sockets need to be bound to a port number and an Internet address in order to be usable for sending and receiving messages
each socket has a transport protocol attribute (TCP or UDP)
messages sent to some Internet address and port number can only be received by a process using a socket that is bound to this address and port number
UDP socket can be connected to a remote IP address and port number
processes cannot share ports (exception: IP multicast)
Interprocess Communication
IPC based on UDP datagrams
UDP datagram properties: no guarantee of order preservation, message loss and duplications are possible
necessary steps
  create socket
  bind socket to a port and local Internet address
    client: arbitrary free port
    server: server port
receive method: returns Internet address and port of sender, plus message
message size: IP allows for messages of up to 2^16 bytes
  most implementations restrict this to around 8 kbytes
  larger application messages: application's responsibility to perform fragmentation/reassembly
  if arriving message is too big for the array allocated to receive message content, truncation occurs
send: nonblocking, blocks only until message is handed to UDP/IP
  upon arrival, message is placed in a per-port queue
receive: blocking
  preemption by timeout possible
  if process wishes to continue while waiting for a packet, use a separate thread
Interprocess Communication
Java API for UDP datagrams
Classes
  DatagramPacket
    constructor generating message for sending from array of bytes
      message content (byte array)
      length of message
      Internet address and port number (destination)
    similar constructor for receiving a message
  DatagramSocket
    class for sending and receiving of UDP datagrams
    one constructor with port number as argument, another without
      no-argument constructor to use free local port
    methods
      send and receive
      setSoTimeout
      connect, for connecting a socket to a particular remote Internet address and port
Interprocess Communication
Java API for UDP datagrams
Example
  process creates socket, sends message to server at port 6789, and waits to receive reply

    import java.net.*;
    import java.io.*;

    public class UDPClient {
        public static void main(String args[]) {
            // args give message contents and destination hostname
            try {
                DatagramSocket aSocket = new DatagramSocket();          // create socket
                byte[] m = args[0].getBytes();
                InetAddress aHost = InetAddress.getByName(args[1]);     // DNS lookup
                int serverPort = 6789;
                // use the byte length, which may differ from the string length
                DatagramPacket request = new DatagramPacket(m, m.length, aHost, serverPort);
                aSocket.send(request);                                  // send message
                byte[] buffer = new byte[1000];
                DatagramPacket reply = new DatagramPacket(buffer, buffer.length);
                aSocket.receive(reply);                                 // wait for reply
                // print only the bytes actually received, not the whole buffer
                System.out.println("Reply: " + new String(reply.getData(), 0, reply.getLength()));
                aSocket.close();
            } catch (SocketException e) {
                System.out.println("Socket: " + e.getMessage());
            } catch (IOException e) {                                   // can be caused by send
                System.out.println("IO: " + e.getMessage());
            }
        }
    }
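For completeness, a matching server could look roughly like the sketch below. It is not part of the original slides: it assumes the server simply echoes each request back to its sender, and the class name `UDPEchoServer` and the helper `serveOnce` are hypothetical. The port 6789 is the one the client above uses.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;

class UDPEchoServer {
    // Receives one datagram and sends its content back to the sender.
    static void serveOnce(DatagramSocket socket) throws IOException {
        byte[] buffer = new byte[1000];
        DatagramPacket request = new DatagramPacket(buffer, buffer.length);
        socket.receive(request);                       // blocks until a datagram arrives
        DatagramPacket reply = new DatagramPacket(
                request.getData(), request.getLength(),
                request.getAddress(), request.getPort());   // address and port of the sender
        socket.send(reply);
    }

    public static void main(String[] args) {
        try (DatagramSocket socket = new DatagramSocket(6789)) {   // bind to the server port
            while (true) {
                serveOnce(socket);
            }
        } catch (IOException e) {
            System.out.println("IO: " + e.getMessage());
        }
    }
}
```

Since UDP gives no delivery guarantee, a client that needs a reply would normally call `setSoTimeout` on its socket and resend the request when the timeout expires.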
Interprocess Communication
IPC based on TCP streams
abstract service: stream of bytes to be written to or received from
features
  message size: no constraint, TCP decides when to send a transport layer message consisting of multiple application messages, immediate transmission can be forced
  connection-oriented
  retransmission to recover from message loss (timeout-bounded)
  queue at destination socket
  blocked on receive
  flow control to block sender when overflow might occur
  need to agree on data sent and received
  server generates new thread for new connection
API for streams
  connection establishment using client/server approach, afterwards peer communication
    client: issues connect requests
    server: has listening port to receive connect request messages
    accept of a connection: create new stream socket for new connection
Interprocess Communication
Java API for TCP streams
Classes
  ServerSocket class: create socket at server side to listen for connect requests
  Socket class: for processes with connections
    constructor to create a socket and connect it to remote host and port of a server
    methods for accessing input and output stream
Interprocess Communication
Example
TCP-based server for stream communication

    import java.net.*;
    import java.io.*;

    public class TCPServer {
        public static void main(String args[]) {
            try {
                int serverPort = 7896;                                      // the server port
                ServerSocket listenSocket = new ServerSocket(serverPort);   // new server port generated
                while (true) {
                    Socket clientSocket = listenSocket.accept();            // listen for new connection
                    Connection c = new Connection(clientSocket);            // launch new thread
                }
            } catch (IOException e) {
                System.out.println("Listen socket:" + e.getMessage());
            }
        }
    }
Interprocess Communication
Example
TCP-based server for stream communication

    class Connection extends Thread {
        DataInputStream in;
        DataOutputStream out;
        Socket clientSocket;

        public Connection(Socket aClientSocket) {
            try {
                clientSocket = aClientSocket;
                in = new DataInputStream(clientSocket.getInputStream());
                out = new DataOutputStream(clientSocket.getOutputStream());
                this.start();
            } catch (IOException e) {
                System.out.println("Connection:" + e.getMessage());
            }
        }

        public void run() {
            try {                                       // an echo server
                String data = in.readUTF();             // read a line of data from the stream
                out.writeUTF(data);                     // write a line to the stream
                clientSocket.close();
            } catch (EOFException e) {
                System.out.println("EOF:" + e.getMessage());
            } catch (IOException e) {
                System.out.println("readline:" + e.getMessage());
            }
        }
    }
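A client for the echo server above might look like the following sketch. It is an addition for illustration, not taken from the slides: the class name `TCPEchoClient` and the helper `echo` are hypothetical, and the argument order (message first, then hostname) is an assumption. Port 7896 matches the server example.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.Socket;

class TCPEchoClient {
    // Connects to the server, sends one UTF string and returns the echoed reply.
    static String echo(String host, int port, String message) throws IOException {
        try (Socket socket = new Socket(host, port)) {     // issues the connect request
            DataInputStream in = new DataInputStream(socket.getInputStream());
            DataOutputStream out = new DataOutputStream(socket.getOutputStream());
            out.writeUTF(message);                         // request
            return in.readUTF();                           // blocks until the reply arrives
        }
    }

    public static void main(String[] args) {
        // args[0]: message, args[1]: server hostname (assumed order)
        try {
            System.out.println("Received: " + echo(args[1], 7896, args[0]));
        } catch (IOException e) {
            System.out.println("IO: " + e.getMessage());
        }
    }
}
```

Both sides have to agree on the data format: the client writes with `writeUTF` because the server reads with `readUTF`.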
Interprocess Communication
Data Representation
data representation problem
  use agreed external representation, two conversions necessary
  use sender's or receiver's format and convert at the other end
transmission of structured data types
  data types may not change during transmission
  usage of a commonly understood "flattened" transfer format (structured types are reduced to their primitive components)
data representation formats
  SUN Microsystems XDR
  CORBA CDR
  ASN.1 (OSI layer 6)
marshalling/unmarshalling
  marshalling: assembling a collection of data items in a form suitable for transmission
  unmarshalling: disassembling and recovery of original data items
  usually performed automatically by middleware layer
    hand-programming is error-prone
    use of compilers for programs working directly at transport API
Interprocess Communication
CORBA Common Data Representation (CDR)
CORBA: Common Object Request Broker Architecture
  middleware architecture standardized by the Object Management Group
  see www.omg.org
CORBA CDR
  supports types allowed in CORBA remote object invocations
  primitive types
    little/big endian according to sender's representation format
    primitive values start at indexed byte positions (multiples of 1, 2, 4 or 8)
    floating point according to IEEE standard
    characters using an agreed character set
Interprocess Communication
CORBA Common Data Representation (CDR)
CORBA CDR
  structured types
  definitions
  example: struct with value {'Smith', 'London', 1934}
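To make the idea of a flattened transfer format concrete, the sketch below marshals the example struct by hand into a CDR-like byte sequence. It is a simplification under stated assumptions: each string is written as a 4-byte length followed by its characters, padded to a multiple of 4 bytes, the byte order is fixed to big-endian, and further details of real CDR are not reproduced. The class `FlatMarshaller` is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class FlatMarshaller {
    // Rounds a length up to the next multiple of 4 (alignment of 4-byte values).
    static int padded(int length) {
        return (length + 3) / 4 * 4;
    }

    static void putString(ByteBuffer buf, String s) {
        byte[] chars = s.getBytes(StandardCharsets.US_ASCII);
        buf.putInt(chars.length);                 // 4-byte length
        buf.put(chars);                           // characters
        while (buf.position() % 4 != 0) {
            buf.put((byte) 0);                    // padding up to the next 4-byte boundary
        }
    }

    static String getString(ByteBuffer buf) {
        byte[] chars = new byte[buf.getInt()];
        buf.get(chars);
        while (buf.position() % 4 != 0) {
            buf.get();                            // skip padding
        }
        return new String(chars, StandardCharsets.US_ASCII);
    }

    // Marshalling: flatten {name, place, year} into a byte sequence.
    static byte[] marshal(String name, String place, int year) {
        int size = 4 + padded(name.length()) + 4 + padded(place.length()) + 4;
        ByteBuffer buf = ByteBuffer.allocate(size).order(ByteOrder.BIG_ENDIAN);
        putString(buf, name);
        putString(buf, place);
        buf.putInt(year);
        return buf.array();
    }

    // Unmarshalling: recover the original items; both sides must know the struct layout.
    static Object[] unmarshal(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.BIG_ENDIAN);
        String name = getString(buf);
        String place = getString(buf);
        int year = buf.getInt();
        return new Object[] {name, place, year};
    }
}
```

The message carries no type information, so the receiver can only interpret it because it knows the order and types of the struct's fields in advance.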
Interprocess Communication
CORBA Common Data Representation (CDR)
CORBA marshalling/unmarshalling
  CORBA Interface Definition Language (IDL)
  IDL compilers will generate marshalling and unmarshalling operations ("stubs") that transform data objects into CDR format
Interprocess Communication
Java Object Serialization
example: class Person
  serialization: flattening an object into a linear form such that it can be stored in a file or transmitted in a message
  linear format must be such that the deserialization routine is capable of recovering the complete object structure and state
    inclusion of
      handles (references to other objects)
      name of class that an object belongs to
      version number of class
  note: mark objects as non-serializable ("transient") if they are not supposed to be serialized (e.g., socket references, files, etc.)
    import java.io.Serializable;

    public class Person implements Serializable {
        private String name;
        private String place;
        private int year;

        public Person(String aName, String aPlace, int aYear) {
            name = aName;
            place = aPlace;
            year = aYear;
        }
        ...
    }
Interprocess Communication
Java Object Serialization
example: class Person
  serialization example:
    Person p = new Person("Smith", "London", 1934)
  serialization: create instance of class ObjectOutputStream and invoke writeObject method, passing the object to be serialized as argument
  deserialization: open ObjectInputStream on the serialized structure and use readObject method
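The two steps can be tried out in memory with the sketch below. It uses a byte array instead of a file or socket, and it nests a small copy of the Person class so that the example is self-contained; the wrapper class `SerializationDemo`, the `toString` method and the `serialVersionUID` value are additions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationDemo {
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;   // class version used for compatibility checks
        private final String name;
        private final String place;
        private final int year;

        Person(String aName, String aPlace, int aYear) {
            name = aName;
            place = aPlace;
            year = aYear;
        }

        @Override
        public String toString() {
            return name + ", " + place + ", " + year;
        }
    }

    // Serialization: writeObject flattens the object into a byte sequence.
    static byte[] serialize(Object object) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        }
        return bytes.toByteArray();
    }

    // Deserialization: readObject rebuilds the object from the byte sequence.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Person p = new Person("Smith", "London", 1934);
        Object copy = deserialize(serialize(p));
        System.out.println(copy);
    }
}
```

Replacing the byte array streams with the streams of a socket would transmit the object to another process, provided the Person class is available on both sides.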
Interprocess Communication
Remote Object References
needed when a client invokes an object that is located on a remote server
reference needed that is unique over space and time
  space: where is the object located
  time: correct version of an undeleted object
a generic format proposal
  Internet address/port number: process which created the object
  time: creation time
  object number: local counter, incremented each time an object is created in the creating process
  interface: how to access the remote object (if object reference is passed from one client to another)
extension
  object references that are location-transparent
Interprocess Communication
Client-Server Communication
often built over UDP datagrams
  client-server protocol consists of request/response pairs, hence no acknowledgements at transport layer are necessary
  avoidance of connection establishment overhead
  no need for flow control due to small amounts of data transferred
generic protocol example (for RPC or RMI communication)
Interprocess Communication
Client-Server Communication
format of protocol messages
  message type (request/reply)
  request ID
    sending process identifier (e.g., IP address/port number)
    integer sequence number incremented by sender with every request
  object reference
  method ID and arguments
if implemented over UDP: failure recovery
  omission failures
    use timeout and resend request when timeout expires and reply hasn't arrived
  server receives repeated request
    idempotent operations: same result obtained on every invocation
    non-idempotent operations: resend result stored from previous request, requires maintenance of a history of replies
  loss of replies: request-reply-acknowledge reply protocol
  message duplication: return request ID with reply
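The history of replies can be sketched as a map from request ID to stored reply. This is an illustrative model, not a protocol implementation: network transport is left out, the request ID is formed from the sender identifier and sequence number as described above, and the names `ReplyHistoryServer`, `handle` and `acknowledge` are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class ReplyHistoryServer {
    private final Map<String, String> history = new HashMap<>();   // request ID -> stored reply
    private final Function<String, String> operation;              // the (possibly non-idempotent) operation
    int executions = 0;                                             // how often the operation really ran

    ReplyHistoryServer(Function<String, String> operation) {
        this.operation = operation;
    }

    // Handles a request; a repeated request ID is answered from the history.
    String handle(String senderId, int sequenceNumber, String argument) {
        String requestId = senderId + "#" + sequenceNumber;
        String stored = history.get(requestId);
        if (stored != null) {
            return stored;                      // duplicate: resend reply, do not execute again
        }
        executions++;
        String reply = operation.apply(argument);
        history.put(requestId, reply);
        return reply;
    }

    // An acknowledgement from the client allows the stored reply to be discarded.
    void acknowledge(String senderId, int sequenceNumber) {
        history.remove(senderId + "#" + sequenceNumber);
    }
}
```

When a client's timeout expires and it resends request 1, the server returns the stored reply and the operation runs only once; the acknowledgement keeps the history from growing without bound.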
Interprocess Communication
Client-Server Communication
example: Hypertext Transfer Protocol (HTTP)
  lightweight request-reply protocol for the exchange of network resources between web clients and web servers
  protocol steps
    connection establishment between client and server (likely TCP, but any reliable transport protocol is acceptable)
    client sends request
    server sends reply
    connection closure
  inefficient scheme, therefore HTTP 1.1 allows "persistent transport connections" (remain open for successive request/reply pairs)
  resources can have MIME-type data types, e.g.
    text/plain
    text/html
    image/jpeg
  data is marshalled into ASCII transfer syntax
Interprocess Communication
Client-Server Communication
example: Hypertext Transfer Protocol (HTTP)
request
  GET: request of resource, identified by URL, may refer to
    data: server returns data
    program: server runs program and returns output data
  HEAD: request similar to GET, but only metadata on the resource is returned (like date of last modification)
  POST: specifies resource (for instance, a server program) that can deal with the client data provided with the request
  PUT: supplied data to be stored at the given URL
  DELETE: delete an identified resource on server
Interprocess Communication
Client-Server Communication
example: Hypertext Transfer Protocol (HTTP)
reply