Distributed Software Engineering
Lecture 2
Communication Fundamentals
Middleware Solutions
Sam Malek
SWE 622, Fall 2012
George Mason University
SWE 622 – Distributed Software Engineering
© Malek Lecture 2 – 2
outline
• Networking fundamentals
– OSI layers
– Java sockets
• Middleware solutions
– Remote Procedure Calls (RPC)
– Remote Method Invocation (RMI)
OSI layer 1: physical
7 application
6 presentation
5 session
4 transport
3 network
2 data link
1 physical
specifies: pin layout, voltages, modulation
does: establish & terminate access to medium, flow control, contention resolution
at this level: hubs, repeaters, network adapters
OSI layer 2: data link
specifies: how to transfer data in a LAN
does: detect and correct errors
at this level: MAC addresses (flat, HW-based)
OSI layer 3: network
specifies: how to transfer data sequences across LANs (e.g., IP)
does: routing
at this level: hierarchical address scheme, routers (bridges & ordinary switches operate at layer 2; only routing "layer-3" switches belong here)
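IP Internet
Concatenation of Networks
[Figure: hosts H1 to H8 attached to Network 1 (Ethernet), Network 2 (Ethernet), Network 3 (FDDI), and Network 4 (point-to-point), which are joined by routers R1, R2, and R3]
Protocol Stack
[Figure: path from H1 to H8. H1: TCP / IP / ETH; R1: IP over ETH and FDDI; R2: IP over FDDI and PPP; R3: IP over PPP and ETH; H8: TCP / IP / ETH]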
IP Service Model
Connectionless (datagram/packet-based)
Best-effort delivery (unreliable service)
• packets are lost
• packets are delivered out of order
• duplicate copies of a packet are delivered
• packets can be delayed for a long time
Datagram format (bit offsets 0, 4, 8, 16, 19, 31)
Version | HLen | TOS | Length
Ident | Flags | Offset
TTL | Protocol | Checksum
SourceAddr
DestinationAddr
Options (variable) | Pad (variable)
Data
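The field layout above can be read straight out of a byte array. The following is a minimal sketch, assuming an IPv4 header without options; the class name Ipv4Header and its field names are illustrative and not part of the lecture.

```java
import java.nio.ByteBuffer;

/** Illustrative parser for the fixed 20-byte part of an IPv4 header. */
final class Ipv4Header {
    final int version, headerLenBytes, tos, totalLength;
    final int ident, flags, fragmentOffset;
    final int ttl, protocol, checksum;
    final String source, destination;

    Ipv4Header(byte[] raw) {
        // ByteBuffer is big-endian by default, which matches network byte order.
        ByteBuffer b = ByteBuffer.wrap(raw);
        int versionAndLen = b.get() & 0xFF;
        version = versionAndLen >>> 4;               // bits 0-3
        headerLenBytes = (versionAndLen & 0x0F) * 4; // HLen counts 32-bit words
        tos = b.get() & 0xFF;
        totalLength = b.getShort() & 0xFFFF;
        ident = b.getShort() & 0xFFFF;
        int flagsAndOffset = b.getShort() & 0xFFFF;
        flags = flagsAndOffset >>> 13;               // top 3 bits
        fragmentOffset = flagsAndOffset & 0x1FFF;    // low 13 bits
        ttl = b.get() & 0xFF;
        protocol = b.get() & 0xFF;                   // 6 = TCP, 17 = UDP
        checksum = b.getShort() & 0xFFFF;
        source = dotted(b.getInt());
        destination = dotted(b.getInt());
    }

    /** Formats a 32-bit address in dot notation. */
    static String dotted(int a) {
        return (a >>> 24) + "." + ((a >>> 16) & 0xFF) + "."
                + ((a >>> 8) & 0xFF) + "." + (a & 0xFF);
    }
}
```

The sketch does not verify the checksum or handle options; it only shows how the bit offsets map onto shifts and masks.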
Datagram Forwarding Strategy
• every datagram contains destination’s address
• if directly connected to destination network, then forward to host
• if not directly connected to destination network, then forward to some router
• forwarding table maps network number into next hop
• each host has a default router
• each router maintains a forwarding table
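The forwarding decision can be sketched in a few lines. This is an illustration only: the class Forwarder, its method names, and the use of plain integers for network numbers are assumptions, not something the lecture prescribes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Sketch of the forwarding decision made by a host or router. */
final class Forwarder {
    private final Set<Integer> directlyConnected = new HashSet<>();       // networks on our own links
    private final Map<Integer, String> forwardingTable = new HashMap<>(); // network number -> next hop
    private final String defaultRouter;

    Forwarder(String defaultRouter) {
        this.defaultRouter = defaultRouter;
    }

    void attach(int network) {
        directlyConnected.add(network);
    }

    void addRoute(int network, String nextHop) {
        forwardingTable.put(network, nextHop);
    }

    /** Returns where a datagram addressed to (destNetwork, destHost) is sent next. */
    String nextHop(int destNetwork, String destHost) {
        if (directlyConnected.contains(destNetwork)) {
            return destHost; // same network: deliver straight to the host
        }
        // Otherwise hand it to a router: a table entry if one exists, else the default router.
        return forwardingTable.getOrDefault(destNetwork, defaultRouter);
    }
}
```

A host would typically have an empty table and rely on its default router, while a router would populate the table with one entry per known network.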
Forwarding Tables
Suppose there are n possible destinations. How many bits are needed to represent an address in a routing table?
log2 n
So we need to store and search n * log2 n bits in routing tables?
We’re smarter than that!
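To put a number on the naive scheme, assume (purely for illustration) a flat 32-bit address space, i.e. n = 2^32 destinations:

```latex
\[
\log_2 n = 32 \text{ bits per address}, \qquad
n \cdot \log_2 n = 2^{32} \cdot 2^{5} = 2^{37} \text{ bits} = 2^{34} \text{ bytes} = 16\ \text{GiB}.
\]
```

A table of that size in every router is impractical, which is the motivation for keeping one entry per network rather than one per host.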
Global Addresses
Properties
• globally unique
• hierarchical: network + host
Dot Notation
• 10.3.2.4
• 128.96.33.81
• 192.12.69.77
Address classes (field widths in bits)
Class | Leading bits | Network | Host
A | 0 | 7 | 24
B | 1 0 | 14 | 16
C | 1 1 0 | 21 | 8
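The class of an address follows from the leading bits of its first byte. A minimal sketch, assuming dotted-decimal input and only classes A, B, and C; the class and method names are illustrative.

```java
/** Classifies a dotted-decimal address by its leading bits (classes A, B, C only). */
final class ClassfulAddress {
    static char addressClass(String dotted) {
        int first = Integer.parseInt(dotted.split("\\.")[0]);
        if ((first & 0x80) == 0x00) return 'A'; // leading bit  0
        if ((first & 0xC0) == 0x80) return 'B'; // leading bits 10
        if ((first & 0xE0) == 0xC0) return 'C'; // leading bits 110
        throw new IllegalArgumentException("not a class A, B, or C address: " + dotted);
    }

    /** Width of the host part in bits for the given class. */
    static int hostBits(char addressClass) {
        switch (addressClass) {
            case 'A': return 24;
            case 'B': return 16;
            case 'C': return 8;
            default: throw new IllegalArgumentException("unknown class: " + addressClass);
        }
    }
}
```

Applied to the three dot-notation examples, the first bytes 10, 128, and 192 fall into classes A, B, and C respectively.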
OSI layer 4: transport
specifies: end-to-end transfer of data between hosts (e.g., TCP for reliable delivery, UDP for best-effort delivery)
does: flow control, segmentation, error control, retransmission
UDP vs TCP
UDP (User Datagram Protocol)
• connectionless: sends independent packets of data, called datagrams, from one computer to another with no guarantees about arrival
• each time a datagram is sent, the local and receiving socket addresses need to be sent as well
TCP (Transmission Control Protocol)
• connection-oriented: provides a reliable flow of data between two computers; data sent from one end of the connection gets to the other end in the same order
• in order to communicate using TCP, a connection must first be established between the pair of sockets
• once two sockets have been connected, they can be used to transmit data in both (or either one of the) directions
UDP vs. TCP
UDP header (bit offsets 0, 16, 31)
SrcPort | DstPort
Length (bytes) | Checksum
Data
TCP header (bit offsets 0, 4, 10, 16, 31)
SrcPort | DstPort
SequenceNum
Acknowledgment
HdrLen | 0 | Flags | AdvertisedWindow
Checksum | UrgPtr
Options (variable)
Data
Which protocol to use?
Overhead
• UDP: every time a datagram is sent, the local and receiving socket addresses need to be sent along with it
• TCP: a connection must be established before communication between the pair of sockets starts (i.e., there is a connection setup time in TCP)
Packet Size
• UDP: there is a size limit of 64 kilobytes per datagram
• TCP: there is no limit; the pair of sockets behaves like streams
Reliability
• UDP: there is no guarantee that the sent datagrams will be received in the same order by the receiving socket
• TCP: it is guaranteed that the sent data will be received in the order in which it was sent
Which protocol to use? (cont.)
TCP: useful when an indefinite amount of data needs to be transferred in order and reliably
• otherwise, we end up with jumbled files or invalid information
• examples: HTTP, ftp, telnet, …
UDP: useful when data transfer should not be slowed down by the extra overhead of the reliable connection
• examples: real-time applications
• e.g., consider a clock server that sends the current time to its client
– if the client misses a packet, it doesn't make sense to resend it because the time will be incorrect when the client receives it on the second try
– the reliability of TCP is unnecessary: it might cause performance degradation and hinder the usefulness of the service
Examples
Some Internet Applications and their Underlying Transport Protocols
Application | App. Protocol | Transp. Protocol
e-mail | smtp | TCP
remote access | telnet | TCP
Web | http | TCP
file transfer | ftp | TCP
streaming media | proprietary | TCP or UDP
domain name service | DNS | TCP or UDP
internet telephony | proprietary | UDP
OSI layer 5: session
specifies: establishing long lived connections
does: checkpointing, adjournment, restart
OSI layer 6: presentation
specifies: data formats and transformation (e.g., MIME)
does: serialization, compression, encryption, encoding transformation (EBCDIC/ASCII)
OSI layer 7: application
specifies: application-specific protocols (e.g., http, smtp, ftp, telnet)
does: support app-specific functionality
the Open Systems Interconnection is a reference model
Goal: separation of concerns enables good implementation at each level
each layer is independent of the ones on top
layer n depends on the spec of n-1, but not on its implementation/manufacturer
the OSI reference model is roughly adhered to
upper layers (application, presentation, session): realm of middleware, app-specific (SMTP, http…) or independent (RMI, CORBA…)
lower layers: TCP/IP protocol stack
4 transport: segments
3 network: packets
2 data link: frames
1 physical: bits
outline
• Networking fundamentals
– OSI layers
– Java sockets
• Middleware solutions
– Remote Procedure Calls (RPC)
– Remote Method Invocation (RMI)
What is a port?
Generally, a computer has a single physical connection to the network
• this connection is identified by the computer’s 32-bit (IPv4) address
• all data destined for a particular computer arrives through this connection
TCP and UDP use ports to identify a particular process/application
• port = abstract destination point at a particular host
• each port is identified by an unsigned 16-bit number, in the range 0 - 65,535
• port numbers 0 - 1023 are reserved for well-known services (HTTP - 80, telnet - 23)
What is a socket?
socket = basic abstraction for network communication
“end-point of communication” uniquely identified with IP address and port
• example: Socket MyClient = new Socket("Machine name", PortNumber);
gives a file-system-like abstraction to the capabilities of the network
• two end-points communicate by “writing” into and “reading” out of the socket
there are two types of transport via sockets
• reliable, byte-stream oriented
• unreliable datagram
Socket programming with TCP
Server Side:
• server runs on a specific computer and has a socket bound to a specific port number
• server listens to the socket for a client to make a connection request
Client Side:
• client tries to rendezvous with the server on the server's machine and port
Server Side:
• the server accepts the connection by creating a new socket dedicated to this client; the new socket keeps the listening socket's local port, and the connection is told apart by the client's address and port
Client Side:
• if the connection is accepted, a socket is created on the client side and the client uses it to communicate with the server
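The handshake can be observed with java.net directly. The sketch below is an illustration under a few assumptions: both ends run in one thread on the loopback interface, the port is chosen by the OS, and the class name AcceptDemo is hypothetical. It shows that accept() yields a new Socket object whose local port equals the listening port, while the remote port identifies the client.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Shows which ports are involved once a TCP connection has been accepted. */
final class AcceptDemo {
    /**
     * Returns { listening port, local port of the accepted socket,
     *           remote port seen by the server, local port of the client }.
     */
    static int[] ports() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback);          // port 0: let the OS pick
             Socket client = new Socket(loopback, listener.getLocalPort());     // connection request
             Socket accepted = listener.accept()) {                             // new socket for this client
            return new int[] {
                listener.getLocalPort(), accepted.getLocalPort(),
                accepted.getPort(), client.getLocalPort()
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The first two values are equal, and so are the last two: a connection is identified by both endpoints together, so many accepted sockets can share one server port.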
Socket programming with UDP
All clients use the same socket to communicate with the server
• Packets of data (datagrams) are exchanged
• No new sockets need to be created
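A minimal sketch of such a datagram exchange with java.net.DatagramSocket follows. The uppercase reply is only an example payload, the class name UdpUpperCase is hypothetical, and both ends run in one thread over loopback so that the example stays self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** One request/reply exchange over UDP; the server answers through its single socket. */
final class UdpUpperCase {
    /** Server step: receive one datagram and reply to whoever sent it. */
    static void serveOne(DatagramSocket server) throws IOException {
        byte[] buf = new byte[1024];
        DatagramPacket request = new DatagramPacket(buf, buf.length);
        server.receive(request);
        String line = new String(request.getData(), 0, request.getLength(), StandardCharsets.UTF_8);
        byte[] reply = line.toUpperCase(Locale.ROOT).getBytes(StandardCharsets.UTF_8);
        // The sender's address and port travel with the datagram; no per-client socket is needed.
        server.send(new DatagramPacket(reply, reply.length, request.getAddress(), request.getPort()));
    }

    /** Client step plus one server step, run back to back for demonstration. */
    static String roundTrip(String line) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket server = new DatagramSocket(0, loopback);
             DatagramSocket client = new DatagramSocket()) {
            server.setSoTimeout(2000); // UDP gives no delivery guarantee, so never wait forever
            client.setSoTimeout(2000);
            byte[] out = line.getBytes(StandardCharsets.UTF_8);
            client.send(new DatagramPacket(out, out.length, loopback, server.getLocalPort()));
            serveOne(server);
            byte[] buf = new byte[1024];
            DatagramPacket reply = new DatagramPacket(buf, buf.length);
            client.receive(reply);
            return new String(reply.getData(), 0, reply.getLength(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A real server would call serveOne in a loop from its own process; any number of clients can then be answered through the same DatagramSocket.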
C- vs. Java- socket programming cont.
Java keeps all the socket complexity “under the cover”
• It does not expose the full range of socket possibilities
• But, it enables sockets to be opened/used as easily as a file would be opened/used
By using the java.net.Socket class instead of relying on native code, Java programs can communicate over the network in a platform-independent fashion
Java socket programming
all classes related to sockets are in the java.net package
• Socket class - implements client sockets (also called just "sockets")
• ServerSocket class - implements server sockets
– A server socket waits for requests to come in over the network. It performs some operation based on that request, and then possibly returns a result to the requester.
• DatagramSocket class - socket for sending and receiving datagram packets
• DatagramPacket class - represents a datagram packet
– Datagram packets are used to implement a connectionless packet delivery service. Multiple packets sent from one machine to another might be routed differently, and might arrive in any order.
• InetAddress class - represents an Internet Protocol (IP) address
• MulticastSocket class - useful for sending and receiving IP multicast packets
– A MulticastSocket is a (UDP) DatagramSocket, with additional capabilities for joining "groups" of other multicast hosts on the internet. A multicast group is specified by a class D IP address.
Java client/server example
• A client reads a line from its standard input (keyboard) and sends the line to the server
• The server reads the line
• The server converts the line to uppercase
• The server sends the modified line back to the client
• The client reads the modified line, and prints the line on its standard output
Implement the above client/server scenario using both TCP and UDP!
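One way the TCP variant of this scenario could look is sketched below. It is not the lecture's own solution: the class name TcpUpperCase is hypothetical, the line comes from a method argument rather than the keyboard, and client and server run in one thread over loopback (the short request simply waits in the socket buffer until the server accepts).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Uppercase service over TCP: one line in, one line out. */
final class TcpUpperCase {
    /** Server side: read one line from the connection and send it back in uppercase. */
    static void handle(Socket connection) throws IOException {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8));
        PrintWriter out = new PrintWriter(
                new OutputStreamWriter(connection.getOutputStream(), StandardCharsets.UTF_8), true);
        out.println(in.readLine().toUpperCase(Locale.ROOT));
    }

    /** Client side, with the server step run inline for demonstration. */
    static String roundTrip(String line) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, listener.getLocalPort())) {
            PrintWriter toServer = new PrintWriter(
                    new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8), true);
            toServer.println(line);                       // client sends the line
            try (Socket connection = listener.accept()) { // server accepts and answers
                handle(connection);
            }
            BufferedReader fromServer = new BufferedReader(
                    new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8));
            return fromServer.readLine();                 // client reads the modified line
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real deployment the server would loop on accept() in its own process and the client would wrap System.in in a BufferedReader to obtain the line.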
outline
• Networking fundamentals
– OSI layers
– Java sockets
• Middleware solutions
– Remote Procedure Calls (RPC)
– Remote Method Invocation (RMI)
middleware offers conceptual model for communication
[Figure: conceptual model: components c1 and c2 form a distributed app; under the hood: each component sits on middleware running on its own device, and the devices are joined by a network]
different styles of conceptual model address different problems
[Figure: styles arranged along two axes, data volume and interaction complexity; styles shown: messages, RPC/RMI, streaming, data store; interaction labels: protocols, call/return, read/write]
• middleware is more generic
• app writer works harder
• middleware is more specialized
• app writer is more constrained
different styles have different data sharing assumptions
[Figure, one panel per style: RPC: C and S share an address space (memory); RMI: C and S share object refs (middleware); messages: c1 and c2 share messages; data store: files / objects in a persistent store; data stream: C sends req to S, a data store/source]
different styles have different control flow assumptions
[Figure, one panel per style: RPC/RMI: C calls S, S returns; messages: c1 and c2 exchange m1, m2, m3 under an app-specific protocol; data store: C reads/writes (r/w) S; data stream: C sends req to S, data follows under a stream-control protocol]
outline
• Networking fundamentals
– OSI layers
– Java sockets
• Middleware solutions
– Remote Procedure Calls (RPC)
– Remote Method Invocation (RMI)
lifting the hood on RPC
• putting the R in Procedure Calling
• how to pass parameters?
• how about shared memory?
• handling limitations in practice
[Figure: on a single device, C calls S and S returns]
idea: stubs hide communication
• client stub, aka server proxy, appears to C like a server running on the client device
• server stub, aka client skeleton, appears to S like a client running on the server device
[Figure: on one device, C calls S directly; across two devices, C calls the client stub Cs, the middleware carries the call to the server stub Ss, and Ss calls S; returns travel the reverse path]
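The stub idea can be sketched without any network at all. In the following illustration every name (Shouter, ShouterClientStub, ShouterServerStub) is hypothetical, and a UnaryOperator over byte arrays stands in for the middleware that would normally carry the bytes between devices.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.UnaryOperator;

/** The interface both sides agree on (an invented example service). */
interface Shouter {
    String shout(String text);
}

/** Client stub: looks like a Shouter to the caller, but only ships bytes. */
final class ShouterClientStub implements Shouter {
    private final UnaryOperator<byte[]> transport; // stands in for the network

    ShouterClientStub(UnaryOperator<byte[]> transport) {
        this.transport = transport;
    }

    @Override
    public String shout(String text) {
        byte[] reply = transport.apply(text.getBytes(StandardCharsets.UTF_8)); // marshal and send
        return new String(reply, StandardCharsets.UTF_8);                      // unmarshal the result
    }
}

/** Server stub: looks like an ordinary local caller to the real implementation. */
final class ShouterServerStub {
    private final Shouter target;

    ShouterServerStub(Shouter target) {
        this.target = target;
    }

    byte[] dispatch(byte[] request) {
        String argument = new String(request, StandardCharsets.UTF_8);   // unmarshal
        return target.shout(argument).getBytes(StandardCharsets.UTF_8);  // call, then marshal the result
    }
}
```

Wiring the two together, for example `new ShouterClientStub(new ShouterServerStub(String::toUpperCase)::dispatch)`, gives the caller an object that behaves like a local Shouter although every call passes through a byte-level message.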
app-specific stubs need to be generated
[Figure: contents of the stubs Cs and Ss, annotated "the usual"]
• globally unique interface ID (machine, timestamp)
• procedure signatures
• parameter marshalling
• shipping bits
RPC is implemented by… sending messages
[Figure: C's call goes to Cs on one device and is sent as a message to Ss on the other device, which calls S]
where to send the messages?
• hardwired for fixed deployment
• some RPC environments support dynamic binding (more to come during the lecture on Service Discovery)
[Figure: C calls Cs; the destination device is shown as "?"]
marshalling parameters is type-specific and platform-specific
caller (C): char *myString; … someProc(257, "Fred", myString);
callee (S): void someProc(int d, char *n, char *m) { …
[Figure: Cs writes the call into the OS send buffer, it crosses the wire to the OS receive buffer, and Ss hands it to S; annotations: id, invert big/little endian, copy?]
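The byte-order issue can be made concrete with java.nio.ByteBuffer. The wire format below (4-byte integer, 4-byte length, UTF-8 bytes) is an assumption chosen for illustration, as are the class and method names; it covers only the first two arguments of someProc, since a raw pointer such as myString has no meaning on another machine.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Marshals the arguments (int d, String n) into a flat byte array. */
final class Marshal {
    /** Layout: 4-byte int, 4-byte string length, UTF-8 string bytes, all in the given byte order. */
    static byte[] marshal(int d, String n, ByteOrder order) {
        byte[] text = n.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 + text.length).order(order);
        buf.putInt(d).putInt(text.length).put(text);
        return buf.array();
    }

    /** Reads the leading int back, interpreting the bytes in the given byte order. */
    static int unmarshalInt(byte[] wire, ByteOrder order) {
        return ByteBuffer.wrap(wire).order(order).getInt();
    }
}
```

The value 257 is 00 00 01 01 in big-endian order and 01 01 00 00 in little-endian order; a receiver that assumes the wrong order reads 16842752 instead, which is why stubs must agree on a wire representation or convert.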
simulate shared address space to some extent
• references to simple, small structures resolved by copy/restore
• complex data structures not supported (structure contains pointers, e.g., linked lists)
[Figure: RPC with C and S sharing an address space (memory); on call, Cs copies the contents; on return, Cs restores the contents]
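Copy/restore can be sketched as follows. The names are hypothetical, an int array plays the role of the small referenced structure, and a UnaryOperator stands in for shipping the copy to the server and receiving the possibly modified version back.

```java
import java.util.function.UnaryOperator;

/** Client-stub side of copy/restore for a small, pointer-free structure. */
final class CopyRestore {
    /**
     * Copies the caller's array, lets the remote procedure work on the copy,
     * then restores the returned contents into the caller's array.
     * Assumes the remote side returns an array of the same length.
     */
    static void call(int[] callerData, UnaryOperator<int[]> remoteProcedure) {
        int[] shipped = callerData.clone();              // copy: the server never sees the caller's memory
        int[] returned = remoteProcedure.apply(shipped); // stands in for send, remote execution, reply
        System.arraycopy(returned, 0, callerData, 0, callerData.length); // restore
    }
}
```

To the caller this looks like call-by-reference, but only flat contents were copied; a structure holding pointers (such as a linked list) would need every reachable node copied and re-linked, which is where the approach breaks down.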
outline
• Networking fundamentals
– OSI layers
– Java sockets
• Middleware solutions
– Remote Procedure Calls (RPC)
– Remote Method Invocation (RMI)
solution: increase granularity from bytes to objects
• both local objects and references to remote objects are passed by value (serialization)
• the result of the called method is also serialized and passed back to the caller
[Figure: RMI, with C and S sharing object refs through the middleware]
RMI uses similar ideas to RPC
• communication facilitated by local stubs (proxy/skeleton)
• stubs define/support an interface for method calling
• calling and return implemented by message passing
• separate mechanisms for dynamic binding (object registry)
RMI is different from RPC in a number of ways
• doesn’t try to hide distribution in the language: remote objects are declared “remote”
• marshalling is simplified
– by passing by value only (object references can be used in nested RMIs)
– (in Java) by having JVMs hide platform dependencies in data representation
• serialization could be much heavier by having to pass the code for the objects with every call, but that can be avoided by passing URLs for downloading the code, rather than the code itself
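In Java RMI both points show up in the types. The sketch below uses invented names (Counter, Incrementer, IncrementerImpl, RmiDemo) and makes some simplifying assumptions: no registry is used, the object is exported on an anonymous port, and the stub is called from the same JVM over loopback.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

/** Plain data: Serializable but not Remote, so it is passed by value. */
final class Counter implements Serializable {
    private static final long serialVersionUID = 1L;
    int value;

    Counter(int value) {
        this.value = value;
    }
}

/** Distribution is visible: the interface extends Remote and its methods throw RemoteException. */
interface Incrementer extends Remote {
    Counter increment(Counter c) throws RemoteException;
}

final class IncrementerImpl implements Incrementer {
    @Override
    public Counter increment(Counter c) {
        c.value++; // changes the server's deserialized copy only
        return c;  // the result is serialized back to the caller
    }
}

final class RmiDemo {
    /** Returns { caller's value after the call, value in the returned copy }. */
    static int[] run() {
        // Assumption for a self-contained demo: advertise the loopback address in the stub.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        IncrementerImpl impl = new IncrementerImpl();
        try {
            Incrementer stub = (Incrementer) UnicastRemoteObject.exportObject(impl, 0);
            try {
                Counter mine = new Counter(41);
                Counter result = stub.increment(mine);
                return new int[] { mine.value, result.value };
            } finally {
                UnicastRemoteObject.unexportObject(impl, true); // lets the JVM exit
            }
        } catch (RemoteException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The caller's Counter keeps its value because only a serialized copy travelled to the server; in a real deployment the stub would usually be obtained from an object registry rather than from exportObject directly.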
reasons to escape the call-return style
• no result needs to be returned
• a server may not be available at the time of the request
• make the client more responsive to other events/user
• allow any component to initiate communication
[Figure: RPC/RMI, C calls S and S returns]
some middleware push the envelope
• dealing with errors: idempotent, at-least-once, at-most-once…
• the promised simplicity of procedure calling sometimes hinders more sophisticated solutions
[Figure: RPC/RMI, C calls S and S returns]
• when does it make sense?
• who is responsible for reissuing?
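One common way to obtain at-most-once execution is for the server to remember replies by request ID, so that a reissued request is answered from the cache instead of being executed again. The sketch below is an illustration under that assumption; the class name, the string request IDs, and the unbounded in-memory cache are all simplifications.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

/** Server-side duplicate filter: each request ID is executed at most once. */
final class AtMostOnceServer {
    private final Map<String, Integer> repliesByRequestId = new HashMap<>();
    private final IntUnaryOperator procedure;
    int executions = 0; // exposed only so the effect can be observed

    AtMostOnceServer(IntUnaryOperator procedure) {
        this.procedure = procedure;
    }

    /** A retransmitted request (same ID) gets the cached reply, not a second execution. */
    synchronized int handle(String requestId, int argument) {
        Integer cached = repliesByRequestId.get(requestId);
        if (cached != null) {
            return cached;
        }
        executions++;
        int reply = procedure.applyAsInt(argument);
        repliesByRequestId.put(requestId, reply);
        return reply;
    }
}
```

With such a filter the client can safely reissue a request after a timeout even when the procedure is not idempotent (for example, a withdrawal); without it, retrying gives at-least-once behaviour and is safe only for idempotent procedures.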
when to use the call-return style
• the server is ready to process each request
• components and network are mostly reliable
• not many concurrent events in the caller: it is fine to block the caller
• one component (client) has the initiative, others (servers) wait for requests
[Figure: RPC/RMI, C calls S and S returns]