Network File Systems: the use of RPC & RMI

CS 571 Operating Systems

Angelos Stavrou, George Mason University

Network File Systems: the use of RPC & RMI

GMU CS 571

Network File System (NFS)

  Simple idea: access disks attached to other computers   Share the disk with other networked machines

  NFS is a protocol for remote access to a file system   Does not implement an ”actual” file system   Remote access is transparent to applications

  File system, OS, and architecture independent   Originally developed by Sun   Although Unix in flavor, explicit goal to work beyond Unix

  Client/server architecture   Local file system requests are forwarded to a remote server   Requests are implemented as remote procedure calls (RPCs)

2

GMU CS 571

Remote Procedure Call

  RPC is common means for remote communication   It is used both by operating systems and applications

  NFS is implemented as a set of RPCs   DCOM, CORBA, Java RMI, etc., are all basically just RPC

  Someday (soon?) you will most likely have to write an application that uses remote communication (or you already have)   You will most likely use some form of RPC for that remote

communication   It’s good to know how RPC works in practice

  We will focus on RPC before getting back to NFS   Once we have RPC, NFS isn’t too hard to build…

3

GMU CS 571

Clients and Servers

  The prevalent model for structuring distributed computation is the client/server paradigm�

  A server is a program (or collection of programs) that provide a service (file server, name service, etc.)   The server may exist on one or more nodes   Often the node is called the server, too, which is confusing�

  A client is a program that uses the service   A client first binds to the server (locates it and establishes a

connection to it)   A client then sends requests, with data, to perform actions, and the

servers sends responses, also with data

4

GMU CS 571

Messages

  Initially with network programming, people hand-coded messages to send requests and responses

  Hand-coding messages gets tiresome   Need to worry about message formats   Have to pack and unpack data from messages   Servers have to decode and dispatch messages to handlers   Messages are often asynchronous

  Messages are not a very natural programming model   Could encapsulate messaging into a library   Just invoke library routines to send a message   Which leads us to RPC…

5

GMU CS 571

Procedure Calls

  Procedure calls are a more natural way to communicate   Every language supports them   Semantics are well-defined and understood   Natural for programmers to use

  Idea: Have servers export a set of procedures that can be called by client programs   Similar to module interfaces, class definitions, etc.   Clients just do a procedure call as it they were directly linked with the

server

  Clients just do a procedure call as it they were directly linked with the server   Under the covers, the procedure call is converted into a message

exchange with the server

6

GMU CS 571

Remote Procedure Calls

  We would like to use procedure call as a model for distributed (remote) communication

  Introduces some problems and limitations:   How do we make this transparent to the programmer?   What are the semantics of parameter passing?   How do we bind (locate, connect to) servers?   How do we support heterogeneity (OS, arch, language)?   How do we make it perform well?   Can it scale to many servers?   What are the locking and synchronization problems?

7

GMU CS 571

RPC Model

  A server defines the server’s interface using an interface definition language (IDL)   The IDL specifies the names, parameters, and types for all client-

callable server procedures

  A stub compiler reads the IDL and produces two stub procedures for each server procedure (client and server)   The server programmer implements the server procedures and

links them with the server-side stubs   The client programmer implements the client program and links it

with the client-side stubs   The stubs are responsible for managing all details of the remote

communication between client and server

8

GMU CS 571

RPC Stubs

  A client-side stub is a procedure that looks to the client as if it were a callable server procedure

  A server-side stub looks to the server as if a client called it

  The client program thinks it is calling the server   In fact, it’s calling the client stub

  The server program thinks it is called by the client   In fact, it’s called by the server stub

  The stubs send messages to each other to make the RPC happen “transparently”

9

GMU CS 571

Implementing RPC

  Client stub and server stub communicate over the network to perform RPC

Client Machine

Server Process

Server Stub

Server OS

Client Process

Client Stub

Client OS

ServerMachine

10

GMU CS 571

RPC steps

1.  Client procedure calls client stub 2.  Client stub builds message, calls local OS 3.  Client's OS sends message to remote OS 4.  Remote OS gives message to server stub 5.  Server stub unpacks parameters, calls server 6.  Server does work, returns result to the stub 7.  Server stub packs it in message, calls local OS 8.  Server's OS sends message to client's OS 9.  Client's OS gives message to client stub 10.  Stub unpacks result, returns to client

11

GMU CS 571

Remote Procedure Call   When a process on the client machine A calls a procedure on the

server machine B, the calling process on A is suspended.   Information can be transported from the caller to the callee in

the parameters and can come back in the procedure result.   No message passing at all is visible to the programmer.

12

GMU CS 571

RPC: Client and Server Stubs   For the client and server sides, the library

procedures will act between the caller/callee and the Operating System.

  When a remote procedure is called, the client stub will pack the parameters into a message and call OS (through send primitive). It will then block itself (through receive primitive).

  The server stub will unpack the parameters from the message and will call the local procedure in a usual way. When the procedure completes, the server stub will pack the result and return it to the client.

13

GMU CS 571

RPC Example

  If the server were just a library, then Add would just be a procedure call

14

GMU CS 571

RPC Example: Call 15

GMU CS 571

RPC Example: Return 16

GMU CS 571

RPC Marshalling

  Marshalling is the packing of procedure parameters into a message packet

  The RPC stubs call type-specific procedures to marshal (or unmarshal) the parameters to a call   The client stub marshals the parameters into a message

  The server stub unmarshals parameters from the message and uses them to call the server procedure

  On Return   The server stub marshals the return parameters

  The client stub unmarshals return parameters and returns them to the client program

17

GMU CS 571

RPC Binding

  Binding is the process of connecting the client to the server

  The server, when it starts up, exports its interface   Identifies itself to a network name server   Tells RPC runtime it’s alive and ready to accept calls

  The client, before issuing any calls, imports the server   RPC runtime uses the name server to find the location of a server

and establish a connection

  The import and export operations are explicit in the server and client programs   Breakdown of transparency

18

GMU CS 571

Parameter Passing in RPC

  Passing value parameters   Poses no problems as long as both machines have the

same representation for all data types (numbers, characters, etc.)

  May not be taken as granted   Floating-point numbers and integers may be

represented in different ways.   Some architectures (e.g. Intel Pentium) numbers its

bytes from right to left, while some others (e.g. Sun SPARC) numbers from left to right.

19

GMU CS 571

Parameter Passing in RPC (Cont.)

  Passing reference parameters   Pointers, and in general, reference parameters are

passed with considerable difficulty.

  Solutions   Forbid reference parameters   Copy the entire data structure (e.g. an entire array

may be sent if the size is known).   How to handle complex data structures (e.g. general

graphs) ?

20

GMU CS 571

RPC Transparency

  One goal of RPC is to be as transparent as possible   Make remote procedure calls look like local procedure calls

  We have seen that binding breaks transparency   What else?

  Failures – remote nodes/networks can fail in more ways than with local procedure calls   Need extra support to handle failures well

  Performance – remote communication is inherently slower than local communication   If program is performance-sensitive, could be a problem

  Now we’re ready to continue talking about NFS

21

GMU CS 571

NSF Binding: Mount

  Before a client can access files on a server, the client must mount the file system on the server   The file system is mounted on an empty local directory   Same way that local file systems are attached   Can depend on OS (e.g., Unix dirs vs NT drive letters)

  Servers maintain ACLs of clients that can mount their directories   When mount succeeds, server returns a file handle   Clients use this file handle as a capability to do file operations

  Mounts can be cascaded   Can mount a remote file system on a remote file system

22

GMU CS 571

NSF Protocol

  The NFS protocol defines a set of operations that a server must support   Reading and writing files   Accessing file attributes   Searching for a file within a directory   Reading a set of directory links   Manipulating links and directories

  These operations are implemented as RPCs   Usually by daemon processes (e.g., nfsd)   A local operation is transformed into an RPC to a server   Server performs operation on its own file system and returns

23

GMU CS 571

NSF is Stateless

  Note that NFS has no open or close operations   NFS is stateless

  An NFS server does not keep track of which clients have mounted its file systems or are accessing its files

  Each RPC has to specify all information in a request   Operation, FS handle, file id, offset in file, sequence #

  Robust   No reconciliation needs to be done on a server crash/reboot   Clients detect server reboot, continue to issue requests

  Writes must be synchronous to disk, though   Clients assume that a write is persistent on return   Servers cannot cache writes

24

GMU CS 571

Consistency

  Since NFS is stateless, consistency is tough   NFS can be (mostly) consistent, but limits performance   NFS assumes that if you want consistency, applications will use

higher-level mechanisms to guarantee it

  Writes are supposed to be atomic   But performed in multiple RPCs (larger than a network packet)   Simultaneous writes from clients can interleave RPCs (bad)

  Server caching   Can cache reads, but we saw that it cannot cache writes

25

GMU CS 571

Consistency (Cont)

  Client caching can lead to consistency problems   Caching a write on client A will not be seen by other clients   Cached writes by clients A and B are unordered at server   Since sharing is rare, though, NFS clients usually do cache

  NFS statelessness is both its key to success and its Achilles’ heel   NFS is straightforward to implement and reason about   But limitations on caching can severely limit performance

  Dozens of network file system designs and implementations that perform much better than NFS

  But note that it is still the most widely used remote file system protocol and implementation

26

GMU CS 571

Summary

  NFS allows access to file systems on other machines   Implementation uses RPC to access file server

  RPC enables transparent distribution of applications   Also used on same node between applications

  RPC is language support for distributed programming   Examples include DCOM, CORBA, Java RMI, etc.   What else have we seen use language support?

  Stub compiler automatically generates client/server code from the IDL server descriptions   These stubs do the marshalling/unmarshalling, message sending/

receiving/replying   Programmer just implements functions as before

27

GMU CS 571

Remote Method Invocation (RMI)

  RMI is a Java feature similar to RPC: it allows a thread to invoke a method on a remote object.

  The main differences with RPC   RPC supports procedural programming whereby only

remote procedures or functions may be called. With RMI, it is possible to invoke methods on remote objects.

  In RMI, it is possible to pass objects as parameters to remote methods.

28

29

“The network is the computer”

  Consider the following program organization:

  If the network is the computer, we ought to be able to put the two classes on different computers

SomeClass AnotherClass

method call

returned object

  RMI is one technology that makes this possible

computer 1 computer 2

GMU CS 571

30

RMI and other technologies

  CORBA (Common Object Request Broker Architecture) was used for a long time   CORBA supports object transmission between virtually

any languages   Objects have to be described in IDL (Interface Definition

Language), which looks a lot like C++ data definitions   CORBA is complex and flaky   CORBA has fallen out of favor

  Microsoft supported CORBA, then COM, now .NET

  RMI is purely Java-specific   Java to Java communications only   As a result, RMI is much simpler than CORBA

GMU CS 571

31

What is needed for RMI

  Java makes RMI (Remote Method Invocation) fairly easy, but there are some extra steps

  To send a message to a remote “server object,”   The “client object” has to find the object

  Do this by looking it up in a registry   The client object then has to marshal the parameters

(prepare them for transmission)   Java requires Serializable parameters   The server object has to unmarshal its parameters, do its

computation, and marshal its response   The client object has to unmarshal the response

  Much of this is done for you by special software

GMU CS 571

32

Terminology

  A remote object is an object on another computer   The client object is the object making the request

(sending a message to the other object)   The server object is the object receiving the request   As usual, “client” and “server” can easily trade

roles (each can make requests of the other)   The rmiregistry is a special server that looks up

objects by name  Hopefully, the name is unique!

  rmic is a special compiler for creating stub (client) and skeleton (server) classes

GMU CS 571

33

Processes

  For RMI, you need to be running three processes   The Client   The Server   The Object Registry, rmiregistry, which is like a DNS

service for objects   Require network connectivity (TCP/IP)

GMU CS 571

34

Interfaces

  Interfaces define behavior   Classes define implementation

  In order to use a remote object, the client must know its behavior (interface), but does not need to know its implementation (class)

  In order to provide an object, the server must know both its interface (behavior) and its class (implementation)

  The interface must be available to both client and server   The class of any transmitted object must be on both client and

server   The class whose method is being used should only be on the

server

GMU CS 571

35

Classes

  A Remote class is one whose instances can be accessed remotely   On the computer where it is defined, instances of this

class can be accessed just like any other object   On other computers, the remote object can be

accessed via object handles   A Serializable class is one whose instances can

be marshaled (turned into a linear sequence of bits)   Serializable objects can be transmitted from one

computer to another   It probably isn’t a good idea for an object to be

both remote and serializable

GMU CS 571

36

Conditions for serializability

  If an object is to be serialized:  The class must be declared as public  The class must implement Serializable

 However, Serializable does not declare any methods  The class must have a no-argument constructor  All fields of the class must be serializable: either

primitive types or Serializable objects  Exception: Fields marked transient will be ignored

during serialization

GMU CS 571

37

Remote interfaces and class

  A Remote class has two parts:   The interface (used by both client and server):

  Must be public   Must extend the interface java.rmi.Remote   Every method in the interface must declare that it throws

java.rmi.RemoteException (other exceptions may also be thrown)

  The class itself (used only by the server):   Must implement the Remote interface   Should extend java.rmi.server.UnicastRemoteObject   May have locally accessible methods that are not in its Remote

interface

GMU CS 571

38

Remote vs. Serializable

  A Remote object lives on another computer (such as the Server)   You can send messages to a Remote object and get responses

back from the object   All you need to know about the Remote object is its interface   Remote objects don’t pose much of a security issue�

  You can transmit a copy of a Serializable object between computers   The receiving object needs to know how the object is

implemented; it needs the class as well as the interface   There is a way to transmit the class definition   Accepting classes does pose a security issue

GMU CS 571

39

Security

  It isn’t safe for the client to use somebody else’s code on some random server   System.setSecurityManager(new

RMISecurityManager());   The security policy of RMISecurityManager is the same

as that of the default SecurityManager   Your client program should use a more conservative

security manager than the default   Most discussions of RMI assume you should do

this on both the client and the server   Unless your server also acts as a client, it isn’t really

necessary on the server

GMU CS 571

40

The server class

  The class that defines the server object should extend UnicastRemoteObject   This makes a connection with exactly one other computer   If you must extend some other class, you can use exportObject()

instead   Sun does not provide a MulticastRemoteObject class

  The server class needs to register its server object:   String url = "rmi://" + host + ":" + port + "/" + objectName;

  The default port is 1099   Naming.rebind(url, object);

  Every remotely available method must throw a RemoteException (because connections can fail)

  Every remotely available method should be synchronized

GMU CS 571

41

Hello world server: interface

import java.rmi.*;

public interface HelloInterface extends Remote { public String say() throws RemoteException; }

GMU CS 571

42

Hello world server: class

import java.rmi.*; import java.rmi.server.*;

public class Hello extends UnicastRemoteObject implements HelloInterface { private String message; // Strings are serializable

public Hello (String msg) throws RemoteException { message = msg; }

public String say() throws RemoteException { return message; } }

GMU CS 571

43

Registering the hello world server

class HelloServer { public static void main (String[] argv) { try { Naming.rebind("rmi://localhost/HelloServer", new Hello("Hello, world!")); System.out.println("Hello Server is ready."); } catch (Exception e) { System.out.println("Hello Server failed: " + e); } } }

GMU CS 571

44

The hello world client program

class HelloClient { public static void main (String[] args) { HelloInterface hello; String name = "rmi://localhost/HelloServer"; try { hello = (HelloInterface)Naming.lookup(name); System.out.println(hello.say()); } catch (Exception e) { System.out.println("HelloClient exception: " + e); } } }

GMU CS 571

45

rmic

  The class that implements the remote object should be compiled as usual

  Then, it should be compiled with rmic:   rmic Hello

  This will generate files Hello_Stub.class and Hello_Skel.class

  These classes do the actual communication   The “Stub” class must be copied to the client area   The “Skel” was needed in SDK 1.1 but is no longer

necessary; it’s still generated, though

GMU CS 571

46

Trying RMI

In three different terminal windows: 1.  Run the registry program: •  rmiregistry

2.  Run the server program: •  java HelloServer

3.  Run the client program: •  java HelloClient

  If all goes well, you should get the “Hello, World!” message

GMU CS 571

47

Summary

Start the registry server, rmiregistry Start the object server

The object server registers an object, with a name, with the registry server

Start the client The client looks up the object in the registry server

The client makes a request The request actually goes to the Stub class The Stub classes on client and server talk to each other The client’s Stub class returns the result

GMU CS 571

48

References

  Trail: RMI �by Ann Wollrath and Jim Waldo   http://java.sun.com/docs/books/tutorial/rmi/index.html

  Fundamentals of RMI Short Course �by jGuru   http://java.sun.com/developer/onlineTraining/rmi/RMI.html

  Java RMI Tutorial �by Ken Baclawski   http://www.eg.bucknell.edu/~cs379/DistributedSystems/rmi_tut.html

Documents

Network File Systems: the use of RPC & RMI