Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
CS 571 Operating Systems
Angelos Stavrou, George Mason University
Network File Systems: the use of RPC & RMI
GMU CS 571
Network File System (NFS)
Simple idea: access disks attached to other computers Share the disk with other networked machines
NFS is a protocol for remote access to a file system Does not implement an ”actual” file system Remote access is transparent to applications
File system, OS, and architecture independent Originally developed by Sun Although Unix in flavor, explicit goal to work beyond Unix
Client/server architecture Local file system requests are forwarded to a remote server Requests are implemented as remote procedure calls (RPCs)
2
GMU CS 571
Remote Procedure Call
RPC is common means for remote communication It is used both by operating systems and applications
NFS is implemented as a set of RPCs DCOM, CORBA, Java RMI, etc., are all basically just RPC
Someday (soon?) you will most likely have to write an application that uses remote communication (or you already have) You will most likely use some form of RPC for that remote
communication It’s good to know how RPC works in practice
We will focus on RPC before getting back to NFS Once we have RPC, NFS isn’t too hard to build…
3
GMU CS 571
Clients and Servers
The prevalent model for structuring distributed computation is the client/server paradigm�
A server is a program (or collection of programs) that provide a service (file server, name service, etc.) The server may exist on one or more nodes Often the node is called the server, too, which is confusing�
A client is a program that uses the service A client first binds to the server (locates it and establishes a
connection to it) A client then sends requests, with data, to perform actions, and the
servers sends responses, also with data
4
GMU CS 571
Messages
Initially with network programming, people hand-coded messages to send requests and responses
Hand-coding messages gets tiresome Need to worry about message formats Have to pack and unpack data from messages Servers have to decode and dispatch messages to handlers Messages are often asynchronous
Messages are not a very natural programming model Could encapsulate messaging into a library Just invoke library routines to send a message Which leads us to RPC…
5
GMU CS 571
Procedure Calls
Procedure calls are a more natural way to communicate Every language supports them Semantics are well-defined and understood Natural for programmers to use
Idea: Have servers export a set of procedures that can be called by client programs Similar to module interfaces, class definitions, etc. Clients just do a procedure call as it they were directly linked with the
server
Clients just do a procedure call as it they were directly linked with the server Under the covers, the procedure call is converted into a message
exchange with the server
6
GMU CS 571
Remote Procedure Calls
We would like to use procedure call as a model for distributed (remote) communication
Introduces some problems and limitations: How do we make this transparent to the programmer? What are the semantics of parameter passing? How do we bind (locate, connect to) servers? How do we support heterogeneity (OS, arch, language)? How do we make it perform well? Can it scale to many servers? What are the locking and synchronization problems?
7
GMU CS 571
RPC Model
A server defines the server’s interface using an interface definition language (IDL) The IDL specifies the names, parameters, and types for all client-
callable server procedures
A stub compiler reads the IDL and produces two stub procedures for each server procedure (client and server) The server programmer implements the server procedures and
links them with the server-side stubs The client programmer implements the client program and links it
with the client-side stubs The stubs are responsible for managing all details of the remote
communication between client and server
8
GMU CS 571
RPC Stubs
A client-side stub is a procedure that looks to the client as if it were a callable server procedure
A server-side stub looks to the server as if a client called it
The client program thinks it is calling the server In fact, it’s calling the client stub
The server program thinks it is called by the client In fact, it’s called by the server stub
The stubs send messages to each other to make the RPC happen “transparently”
9
GMU CS 571
Implementing RPC
Client stub and server stub communicate over the network to perform RPC
Client Machine
Server Process
Server Stub
Server OS
Client Process
Client Stub
Client OS
ServerMachine
10
GMU CS 571
RPC steps
1. Client procedure calls client stub 2. Client stub builds message, calls local OS 3. Client's OS sends message to remote OS 4. Remote OS gives message to server stub 5. Server stub unpacks parameters, calls server 6. Server does work, returns result to the stub 7. Server stub packs it in message, calls local OS 8. Server's OS sends message to client's OS 9. Client's OS gives message to client stub 10. Stub unpacks result, returns to client
11
GMU CS 571
Remote Procedure Call When a process on the client machine A calls a procedure on the
server machine B, the calling process on A is suspended. Information can be transported from the caller to the callee in
the parameters and can come back in the procedure result. No message passing at all is visible to the programmer.
12
GMU CS 571
RPC: Client and Server Stubs For the client and server sides, the library
procedures will act between the caller/callee and the Operating System.
When a remote procedure is called, the client stub will pack the parameters into a message and call OS (through send primitive). It will then block itself (through receive primitive).
The server stub will unpack the parameters from the message and will call the local procedure in a usual way. When the procedure completes, the server stub will pack the result and return it to the client.
13
GMU CS 571
RPC Example
If the server were just a library, then Add would just be a procedure call
14
GMU CS 571
RPC Marshalling
Marshalling is the packing of procedure parameters into a message packet
The RPC stubs call type-specific procedures to marshal (or unmarshal) the parameters to a call The client stub marshals the parameters into a message
The server stub unmarshals parameters from the message and uses them to call the server procedure
On Return The server stub marshals the return parameters
The client stub unmarshals return parameters and returns them to the client program
17
GMU CS 571
RPC Binding
Binding is the process of connecting the client to the server
The server, when it starts up, exports its interface Identifies itself to a network name server Tells RPC runtime it’s alive and ready to accept calls
The client, before issuing any calls, imports the server RPC runtime uses the name server to find the location of a server
and establish a connection
The import and export operations are explicit in the server and client programs Breakdown of transparency
18
GMU CS 571
Parameter Passing in RPC
Passing value parameters Poses no problems as long as both machines have the
same representation for all data types (numbers, characters, etc.)
May not be taken as granted Floating-point numbers and integers may be
represented in different ways. Some architectures (e.g. Intel Pentium) numbers its
bytes from right to left, while some others (e.g. Sun SPARC) numbers from left to right.
19
GMU CS 571
Parameter Passing in RPC (Cont.)
Passing reference parameters Pointers, and in general, reference parameters are
passed with considerable difficulty.
Solutions Forbid reference parameters Copy the entire data structure (e.g. an entire array
may be sent if the size is known). How to handle complex data structures (e.g. general
graphs) ?
20
GMU CS 571
RPC Transparency
One goal of RPC is to be as transparent as possible Make remote procedure calls look like local procedure calls
We have seen that binding breaks transparency What else?
Failures – remote nodes/networks can fail in more ways than with local procedure calls Need extra support to handle failures well
Performance – remote communication is inherently slower than local communication If program is performance-sensitive, could be a problem
Now we’re ready to continue talking about NFS
21
GMU CS 571
NSF Binding: Mount
Before a client can access files on a server, the client must mount the file system on the server The file system is mounted on an empty local directory Same way that local file systems are attached Can depend on OS (e.g., Unix dirs vs NT drive letters)
Servers maintain ACLs of clients that can mount their directories When mount succeeds, server returns a file handle Clients use this file handle as a capability to do file operations
Mounts can be cascaded Can mount a remote file system on a remote file system
22
GMU CS 571
NSF Protocol
The NFS protocol defines a set of operations that a server must support Reading and writing files Accessing file attributes Searching for a file within a directory Reading a set of directory links Manipulating links and directories
These operations are implemented as RPCs Usually by daemon processes (e.g., nfsd) A local operation is transformed into an RPC to a server Server performs operation on its own file system and returns
23
GMU CS 571
NSF is Stateless
Note that NFS has no open or close operations NFS is stateless
An NFS server does not keep track of which clients have mounted its file systems or are accessing its files
Each RPC has to specify all information in a request Operation, FS handle, file id, offset in file, sequence #
Robust No reconciliation needs to be done on a server crash/reboot Clients detect server reboot, continue to issue requests
Writes must be synchronous to disk, though Clients assume that a write is persistent on return Servers cannot cache writes
24
GMU CS 571
Consistency
Since NFS is stateless, consistency is tough NFS can be (mostly) consistent, but limits performance NFS assumes that if you want consistency, applications will use
higher-level mechanisms to guarantee it
Writes are supposed to be atomic But performed in multiple RPCs (larger than a network packet) Simultaneous writes from clients can interleave RPCs (bad)
Server caching Can cache reads, but we saw that it cannot cache writes
25
GMU CS 571
Consistency (Cont)
Client caching can lead to consistency problems Caching a write on client A will not be seen by other clients Cached writes by clients A and B are unordered at server Since sharing is rare, though, NFS clients usually do cache
NFS statelessness is both its key to success and its Achilles’ heel NFS is straightforward to implement and reason about But limitations on caching can severely limit performance
Dozens of network file system designs and implementations that perform much better than NFS
But note that it is still the most widely used remote file system protocol and implementation
26
GMU CS 571
Summary
NFS allows access to file systems on other machines Implementation uses RPC to access file server
RPC enables transparent distribution of applications Also used on same node between applications
RPC is language support for distributed programming Examples include DCOM, CORBA, Java RMI, etc. What else have we seen use language support?
Stub compiler automatically generates client/server code from the IDL server descriptions These stubs do the marshalling/unmarshalling, message sending/
receiving/replying Programmer just implements functions as before
27
GMU CS 571
Remote Method Invocation (RMI)
RMI is a Java feature similar to RPC: it allows a thread to invoke a method on a remote object.
The main differences with RPC RPC supports procedural programming whereby only
remote procedures or functions may be called. With RMI, it is possible to invoke methods on remote objects.
In RMI, it is possible to pass objects as parameters to remote methods.
28
29
“The network is the computer”
Consider the following program organization:
If the network is the computer, we ought to be able to put the two classes on different computers
SomeClass AnotherClass
method call
returned object
RMI is one technology that makes this possible
computer 1 computer 2
GMU CS 571
30
RMI and other technologies
CORBA (Common Object Request Broker Architecture) was used for a long time CORBA supports object transmission between virtually
any languages Objects have to be described in IDL (Interface Definition
Language), which looks a lot like C++ data definitions CORBA is complex and flaky CORBA has fallen out of favor
Microsoft supported CORBA, then COM, now .NET
RMI is purely Java-specific Java to Java communications only As a result, RMI is much simpler than CORBA
GMU CS 571
31
What is needed for RMI
Java makes RMI (Remote Method Invocation) fairly easy, but there are some extra steps
To send a message to a remote “server object,” The “client object” has to find the object
Do this by looking it up in a registry The client object then has to marshal the parameters
(prepare them for transmission) Java requires Serializable parameters The server object has to unmarshal its parameters, do its
computation, and marshal its response The client object has to unmarshal the response
Much of this is done for you by special software
GMU CS 571
32
Terminology
A remote object is an object on another computer The client object is the object making the request
(sending a message to the other object) The server object is the object receiving the request As usual, “client” and “server” can easily trade
roles (each can make requests of the other) The rmiregistry is a special server that looks up
objects by name Hopefully, the name is unique!
rmic is a special compiler for creating stub (client) and skeleton (server) classes
GMU CS 571
33
Processes
For RMI, you need to be running three processes The Client The Server The Object Registry, rmiregistry, which is like a DNS
service for objects Require network connectivity (TCP/IP)
GMU CS 571
34
Interfaces
Interfaces define behavior Classes define implementation
In order to use a remote object, the client must know its behavior (interface), but does not need to know its implementation (class)
In order to provide an object, the server must know both its interface (behavior) and its class (implementation)
The interface must be available to both client and server The class of any transmitted object must be on both client and
server The class whose method is being used should only be on the
server
GMU CS 571
35
Classes
A Remote class is one whose instances can be accessed remotely On the computer where it is defined, instances of this
class can be accessed just like any other object On other computers, the remote object can be
accessed via object handles A Serializable class is one whose instances can
be marshaled (turned into a linear sequence of bits) Serializable objects can be transmitted from one
computer to another It probably isn’t a good idea for an object to be
both remote and serializable
GMU CS 571
36
Conditions for serializability
If an object is to be serialized: The class must be declared as public The class must implement Serializable
However, Serializable does not declare any methods The class must have a no-argument constructor All fields of the class must be serializable: either
primitive types or Serializable objects Exception: Fields marked transient will be ignored
during serialization
GMU CS 571
37
Remote interfaces and class
A Remote class has two parts: The interface (used by both client and server):
Must be public Must extend the interface java.rmi.Remote Every method in the interface must declare that it throws
java.rmi.RemoteException (other exceptions may also be thrown)
The class itself (used only by the server): Must implement the Remote interface Should extend java.rmi.server.UnicastRemoteObject May have locally accessible methods that are not in its Remote
interface
GMU CS 571
38
Remote vs. Serializable
A Remote object lives on another computer (such as the Server) You can send messages to a Remote object and get responses
back from the object All you need to know about the Remote object is its interface Remote objects don’t pose much of a security issue�
You can transmit a copy of a Serializable object between computers The receiving object needs to know how the object is
implemented; it needs the class as well as the interface There is a way to transmit the class definition Accepting classes does pose a security issue
GMU CS 571
39
Security
It isn’t safe for the client to use somebody else’s code on some random server System.setSecurityManager(new
RMISecurityManager()); The security policy of RMISecurityManager is the same
as that of the default SecurityManager Your client program should use a more conservative
security manager than the default Most discussions of RMI assume you should do
this on both the client and the server Unless your server also acts as a client, it isn’t really
necessary on the server
GMU CS 571
40
The server class
The class that defines the server object should extend UnicastRemoteObject This makes a connection with exactly one other computer If you must extend some other class, you can use exportObject()
instead Sun does not provide a MulticastRemoteObject class
The server class needs to register its server object: String url = "rmi://" + host + ":" + port + "/" + objectName;
The default port is 1099 Naming.rebind(url, object);
Every remotely available method must throw a RemoteException (because connections can fail)
Every remotely available method should be synchronized
GMU CS 571
41
Hello world server: interface
import java.rmi.*;
public interface HelloInterface extends Remote { public String say() throws RemoteException; }
GMU CS 571
42
Hello world server: class
import java.rmi.*; import java.rmi.server.*;
public class Hello extends UnicastRemoteObject implements HelloInterface { private String message; // Strings are serializable
public Hello (String msg) throws RemoteException { message = msg; }
public String say() throws RemoteException { return message; } }
GMU CS 571
43
Registering the hello world server
class HelloServer { public static void main (String[] argv) { try { Naming.rebind("rmi://localhost/HelloServer", new Hello("Hello, world!")); System.out.println("Hello Server is ready."); } catch (Exception e) { System.out.println("Hello Server failed: " + e); } } }
GMU CS 571
44
The hello world client program
class HelloClient { public static void main (String[] args) { HelloInterface hello; String name = "rmi://localhost/HelloServer"; try { hello = (HelloInterface)Naming.lookup(name); System.out.println(hello.say()); } catch (Exception e) { System.out.println("HelloClient exception: " + e); } } }
GMU CS 571
45
rmic
The class that implements the remote object should be compiled as usual
Then, it should be compiled with rmic: rmic Hello
This will generate files Hello_Stub.class and Hello_Skel.class
These classes do the actual communication The “Stub” class must be copied to the client area The “Skel” was needed in SDK 1.1 but is no longer
necessary; it’s still generated, though
GMU CS 571
46
Trying RMI
In three different terminal windows: 1. Run the registry program: • rmiregistry
2. Run the server program: • java HelloServer
3. Run the client program: • java HelloClient
If all goes well, you should get the “Hello, World!” message
GMU CS 571
47
Summary
Start the registry server, rmiregistry Start the object server
The object server registers an object, with a name, with the registry server
Start the client The client looks up the object in the registry server
The client makes a request The request actually goes to the Stub class The Stub classes on client and server talk to each other The client’s Stub class returns the result
GMU CS 571
48
References
Trail: RMI �by Ann Wollrath and Jim Waldo http://java.sun.com/docs/books/tutorial/rmi/index.html
Fundamentals of RMI Short Course �by jGuru http://java.sun.com/developer/onlineTraining/rmi/RMI.html
Java RMI Tutorial �by Ken Baclawski http://www.eg.bucknell.edu/~cs379/DistributedSystems/rmi_tut.html