
DISTRIBUTED OPERATING SYSTEM

What is a Distributed System? Give some examples of Distributed Systems?

    Distributed System :

A distributed system is a system in which hardware or software components located at different computers communicate with each other by passing messages, which helps them share resources with each other.

In everyday life users mostly share hardware resources such as printers and data resources such as files.

    But the importance of resource sharing increases when the users are able to share the

    higher level resources that play a part in their everyday work and social activities.

    For example, users are concerned with sharing data in the form of a shared database or

    a set of web pages not the disks and processors that those are implemented on.

A Distributed System has the following characteristics associated with it:

Concurrency :

By implementing a distributed system, programs can be executed concurrently on two or more computers, so that users can share resources such as web pages or files whenever necessary.

No global clock :

When users want to share resources with one another, there is no single global clock that all the computers agree on, i.e. there is no fixed common time reference for when a user wants to share a resource.

    Independent failures :

All computer systems can fail, and similarly a distributed system can fail in many ways. Faults in the network result in the disconnection of the computers that are connected to that particular network, but the computers themselves don't stop running. Even though the


    network fails the user can use his computer without any problem.

    What are the different Hardware Concepts involved with Distributed Operating

    Systems? (or)

    Write about Multi-processors and Multi-Computers?

    HARDWARE CONCEPTS :

    Even though all distributed systems consist of multiple CPUs, there are several

    different ways the hardware can be interconnected so that they can communicate with

    each other.

Single Instruction Stream, Single Data Stream :

A computer with a single instruction stream and a single data stream is called SISD. All traditional uni-processor computers (i.e., computers which have only one CPU) are considered to be SISD, such as personal computers.

Single Instruction Stream, Multiple Data Stream :

The next category is SIMD, single instruction stream, multiple data stream. In this type of hardware multiple CPUs will be present. Each CPU will be given the same single instruction, and this instruction will be executed on multiple data items. Such hardware is referred to as SIMD.

Multiple Instruction Stream, Single Data Stream :

The next category is MISD, multiple instruction stream, single data stream; no common computers fit this model.

Multiple Instruction Stream, Multiple Data Stream :

The final category is MIMD, which means a group of independent computers, each with its own program and data. All distributed systems are MIMD.

    We divide all MIMD computers into two groups:

Multiprocessors : MIMD computers which have shared memory are called multiprocessors.

Multi-Computers : MIMD computers that don't have shared memory, where each


computer has its own memory, are called multi-computers.

    Each of these categories can be further divided based on the architecture of the

    interconnection network. We describe these two categories as bus and switched .

    Bus Architecture : By bus we mean that there are a number of computers which are

    connected through only a single cable line. Cable television uses a scheme like this: the

    cable company runs a wire down the street, and all the subscribers have taps running to

    it from their television sets.

    Switch Architecture : Switched systems do not have a single wire connection like cable

    television. Instead, there are individual wires from machine to machine, with many

different wiring patterns in use.

    Draw Diagram HC - 1

    Bus-Based Multiprocessors:

    Bus-based multiprocessors consist of some number of CPUs all connected to a

    common bus, along with a memory module.

To read a word of memory, a CPU puts the address of the word it wants on the bus and asserts the appropriate control signals; the memory then puts the value of the word on the bus, allowing the requesting CPU to read it in.

The problem with this scheme is that with as few as 4 or 5 CPUs, the bus will usually be overloaded and performance will drop drastically. The solution is to add a high-speed cache memory between the CPU and the bus.

    Draw Diagram HC - 2

    Switched Multiprocessors :

    To build a multiprocessor with more than 64 processors, a different method is needed

to connect the CPUs with the memory. One possibility is to divide the memory into modules and connect them to the CPUs with a crossbar switch; at every intersection is a tiny electronic crosspoint switch that can be opened and closed in hardware.

    Draw Diagram HC - 3

The downside of the crossbar switch is that with n CPUs and n memories, n² crosspoint switches are needed; for large n this becomes prohibitive, so switching networks that require fewer switches have been devised. The omega network is one example. This network contains four 2x2 switches. In the above diagram, with proper settings of the switches,


    every CPU can access every memory.

    Bus-Based Multicomputer :

    On the other hand, building a multicomputer (i.e., no shared memory) is easy. Each

CPU has a direct connection to its own local memory. It looks topologically similar to

    the bus-based multiprocessor, but since there will be much less traffic over it, it need

    not be a high-speed bus.

    Draw Diagram HC - 4

    Switched Multicomputer :

Our last category consists of switched multi-computers. In this type of category each and every CPU has its own memory and can directly access that memory.

The two most popular topologies are a grid and a hypercube.

    Draw Diagram HC 5

    A hypercube is an n-dimensional cube where each vertex represents a CPU and each

    edge represents the connection between two CPUs.

SOFTWARE CONCEPTS :

    Although the hardware is important, the software is even more important. The

    operating system software is responsible for the interaction between the user and the

    hardware.

There are two kinds of operating systems for multiple-CPU systems: one is the loosely coupled system and the other is the tightly coupled system.

    Loosely Coupled System : This software allows machines and users of a distributed

    system to be independent of each other, but still they can interact with each other

    whenever necessary. For example a group of personal computers, where each computer

has its own CPU, its own memory, its own hard disk and its own operating system, but which interact with one another over the network when required.

    Tightly Coupled System : This software allows machines to be run by a central

    machine, which controls all the operations of the other machines.

    Network Operating Systems :


    The Network Operating Systems are loosely-coupled software on loosely-coupled

hardware. An example of a Network Operating System is a network of computers or workstations connected by a LAN.

    In this model, each user has a single computer or workstation allocated to him. It may

    or may not have a hard disk. It definitely has its own operating system. All commands

    are normally run locally, right on the workstation.

One approach is to provide a shared, global file system accessible from all the workstations, supported by one or more machines called file servers. Each incoming request is examined and executed, and the reply is sent back.

    Draw Diagram SC 1

    True Distributed Systems :

    A True Distributed System provides tightly-coupled software on the same loosely-

    coupled (i.e., multicomputer) hardware. The goal of such a system is to create the

    illusion in the minds of the users that the entire network of computers is a single

    timesharing system, rather than a collection of different machines.

In this system, global interprocess communication takes place so that each process can communicate with any other process in the network.

Process management is the same everywhere, i.e. how processes are created, destroyed, started and stopped must be the same on every machine.

    Multiprocessor Timesharing Systems :

    Multi-processor Timesharing systems are tightly-coupled software on tightly-coupled

hardware. The important characteristic of this system is the existence of a single run queue in shared memory.

    Draw Diagram SC - 2

    All five processes are located in the shared memory, and three of them are currently

    executing: process A on CPU 1, process B on CPU 2, and process C on CPU 3. The

other two processes, D and E, are also in memory, waiting their turn.


    What are the different design issues involved with Distributed Operating

    Systems?

    Transparency :

A distributed system is made up of independent components, but the user doesn't know about the individual components and feels that it is a single system. Such hiding is referred to as transparency.

Access Transparency : The user can access resources in his system or on any other system by using the same operations. Such transparency is referred to as access transparency.

Location transparency : The user need not know the location of the resources that he

    is accessing.

    Concurrency transparency : The users can share the same resource without any

    interference.

    Failure transparency : It hides the failures that have occurred from the user.

    Flexibility :

The different systems involved in the distributed operating system must be flexible, so that we can add any number of systems without having to change much of the configuration of the hardware or the software.

The structure of the distributed operating system can be arranged in two ways:

Monolithic Kernel : Each machine runs a kernel that provides all the services to the other systems involved in the network. For example, a centralized operating system with networking facilities can be considered a monolithic kernel.

    Micro Kernel : In Micro kernel the kernel provides very little service to the other

    systems, but the bulk of the operating system services are provided by the user-level

    servers.

    Reliability :

    One of the original goals of building distributed systems was to make them more

    reliable than single-processor systems. The idea is that if a machine goes down, some


    other machine takes over the job.

There are two important aspects of reliability:

Availability : A system is said to be reliable if critical components are available to the other systems at the correct time. It is important to distinguish various aspects of

    reliability.

    Security : Another aspect of reliability is security. Many of the information resources

    that are present in the distributed system are very important to the users. Therefore

    security becomes a very important issue.

Security must protect resources against the following problems:

Confidentiality : Resources being accessed by unauthorized individuals.

Integrity : Resources being altered or corrupted by unauthorized users.

In a distributed system, clients send requests to access data managed by servers, which involves sending information in messages over a network.

    Denial of service attacks : Another security problem is that a user may wish to

disrupt a service for some reason. This can be achieved by bombarding the

    service with such a large number of pointless requests that the serious users are

    unable to use it. This is called a denial of service attack.

    Scalability :

Distributed systems must be able to share resources with a group of users. The main challenge is that if the number of users increases, the distributed system must still be able to provide the same service. Such a service is referred to as a scalable service of the distributed system.

Controlling the cost of physical resources :

    As the demand for a resource grows, it should be possible to include the resource at a

    reasonable cost.

    Controlling the performance loss :

Sometimes the number of resources grows in proportion to the number of users. For


example, the number of domain names grows roughly in step with the number of IP addresses. In such

    cases we should see that performance is not lost.

    Preventing software resources running out :

Software resources must not run out. For example, IPv4 can provide only about 4 billion IP addresses, whereas there are many more people and devices that want IP addresses.

    Write about the different Layered Protocols?

    OSI Model :

The ISO has developed a model which covers all the aspects of network communication. This model is referred to as the Open Systems Interconnection (OSI) model.

The OSI model is designed in the form of layers, where each layer has its own well-defined functions.

Layers in the OSI model : The seven layers in the OSI model are

    1. Physical layer :

    The main function of the Physical layer is to carry a sequence of bits over a physical

medium. Here the physical medium refers to cables, wires, etc.

    The main functions of the physical layer are

    Physical characteristics of interface and medium : The physical layer defines the

    characteristics of the interface between the devices and the transmission medium. It

    also defines the type of transmission medium.

    Representation of bits : The physical layer data consists of a stream of bits (sequence of

0s and 1s). These bits must be encoded into signals (electrical or optical) so that they

    can be transmitted.

Data rate : The number of bits sent each second is referred to as the transmission rate or data rate. The physical layer specifies the data rate.

    Physical topology : The physical topology defines how devices are connected to make a

    network.


    Transmission mode : The physical layer also defines the direction of transmission

between two devices, i.e. whether they have simplex, half-duplex or full-duplex

    transmission modes.

    2. Data Link Layer :

The data link layer transforms the physical layer, a raw transmission facility, into a reliable link. It makes the physical layer appear error-free to the upper layer (the network layer).

Framing : The data link layer divides the stream of bits received from the network layer into smaller parts called frames.

    Physical addressing : If frames are to be distributed to different systems on the

network, the data link layer adds a header to the frame to define the sender

    and/or receiver of the frame.

Flow Control : If the rate at which the receiver can absorb data is less than the rate at which the sender sends it, the data link layer imposes flow control to prevent overwhelming the receiver.

Error Control : The data link layer ensures that data is not lost by adding mechanisms to

    detect and retransmit damaged or lost frames.

    ccess control : When two or more devices are connected to the same link, data link layer protocols are necessary to determine which device has control over the link at any

    given time.

    3. Network layer:

    The network layer is responsible for the source-to-destination delivery of a packet,

possibly across multiple networks (links). Whereas the data link layer oversees the

    delivery of the packet between two systems on the same network (links), the network

    layer ensures that each packet gets from its point of origin to its final destination.

Logical Addressing : When a packet passes beyond the boundary of a single network, a header containing the logical addresses of the sender and receiver is added to it. This addressing is referred to as logical addressing.

    Routing : In order to send the packet from one link to another, we will use connecting

    devices called routers or switches which route the packets to their final destination.


    4. Transport Layer :

    The transport layer is responsible for process-to-process delivery of the entire message.

A process is an application program running on a computer. The transport layer ensures that the whole message arrives at the correct process intact and in order.

    Segmentation and reassembly : A message is divided into smaller parts called

    segments, where each segment will have a sequence number. These segments are also

referred to as packets.

    Flow Control : Like the data link layer, the transport layer is responsible for flow

    control.

    Error Control : Like the data link layer, the transport layer is responsible for error

    control. However, error control at this layer is performed process-to-process rather than

    across a single link.

    5. Session Layer :

Dialog Control : The session layer allows two systems to enter into a dialog. It allows

    the communication between two processes to take place in either half duplex (one way

    at a time) or full-duplex (two ways at a time) mode.

Synchronization : The session layer allows a process to add checkpoints, or synchronization points, to a stream of data. For example, if a document of 2,000 pages is being sent and checkpoints are added at every 100th page, then if a crash occurs at the 530th page, only pages 501 to 530 have to be retransmitted instead of all 530 pages.

    6. Presentation layer :

    The presentation layer is concerned with the syntax and semantics of the information

    exchanged between two systems.

Translation : Since data formats differ between different computers, if

    one computer sends data in its format, then it will be translated into a common format,

    which is understood by the destination system, and will be sent to the receiving system.

    Encryption : To carry important information, a system must be able to ensure privacy,

    so that the other users cannot understand the data. Encryption means that the sender


    transforms the original information to another form and sends the resulting message out

    over the network.

    7. Application Layer :

File transfer, access and management : This application allows a user to access files, as well as retrieve files, and also manage and control files present on a remote computer.

    Mail Service : This application provides the basis for e-mail forwarding and storage.

Directory service : This application provides distributed database sources and access for

    global information about various objects and services.

    Draw Diagram LP 1

    Write about Asynchronous Transfer Mode Networks (ATM Networks)?

    ATM Networks :

    When the telephone companies decided to build networks for the 21st century, they

faced a problem: voice traffic requires a low but constant bandwidth, whereas data traffic is bursty, requiring no bandwidth most of the time and a great deal of bandwidth for very short periods of time.

    The solution was to build a system with fixed-size blocks over virtual circuits which

    gave good performance for both types of traffic. Such type of network is called ATM

    (Asynchronous Transfer Mode ). It has become an international standard and is likely

    to play a major role in future distributed systems, both local-area ones and wide-area

    ones.

    In the ATM model the sender first establishes a connection (i.e., a virtual circuit) to the

    receiver. During connection establishment, a route is determined from the sender to the

    receiver(s) and routing information is stored in the switches along the way. Using this

    connection, packets can be sent, but they are chopped up by the hardware into small,

    fixed-sized units called cells.

    Draw Diagram ATM - 1


    The ATM Physical Layer :

    An ATM adaptor board plugged into a computer can put out a stream of cells onto a

    wire or fiber. The transmission stream must be continuous. When there are no data to

    be sent, empty cells are transmitted, which means that in the physical layer, ATM is

    really synchronous, not asynchronous. Within a virtual circuit, however, it is

    asynchronous.

    Alternatively, the adaptor board can use SONET (Synchronous Optical NETwork) in

the physical layer, putting its cells into the payload portion of SONET frames.

In SONET, the basic unit (analogous to a 193-bit T1 frame) is a 9x90 array of bytes of payload. One frame is transmitted every 125 µsec.

    The ATM Layer :

Draw Diagram ATM - 2

In the above diagram, the GFC (Generic Flow Control) field is used for flow control. The VPI (Virtual Path Identifier) and VCI (Virtual Circuit Identifier) fields together identify

    which path and virtual circuit a cell belongs to. Routing tables along the way use this

    information for routing.

    The ATM Adaptation Layer:

    SEAL (Simple and Efficient Adaptation Layer), is a very simple and efficient way

    to identify the starting and ending of a stream of cells.

    It uses only one bit in the ATM header, one of the bits in the payload type field. This

bit is normally 0, but is set to 1 in the last cell of a packet. The last cell contains a trailer

    in the final 8 bytes.

ATM Switching :

    ATM networks are built up of copper or optical cables and switches.

    Draw Diagram ATM 3 (a)

In the above diagram the ATM network consists of four switches. Cells originating at any of the

    eight computers attached to the system can be switched to any of the other computers


    by traversing one or more switches. Each of these switches has four ports, each used

    for both input and output.

    A different switch design copies the cell into a queue associated with the output buffer

    and lets it wait there, instead of keeping it in the input buffer. This approach eliminates

    head-of-line blocking and gives better performance. It is also possible for a switch to

    have a pool of buffers that can be used for both input and output buffering.

    Draw Diagram ATM 3 (b)

    Write about Client Server Communication?

Client server communication :

The Client Server communication implements the request-reply methodology, where the

    client will request for a service and the server will take that request and perform some

    operation and send the result in the form of response to the client.

    In the normal case, request reply communication is synchronous because the client

    process blocks until the reply arrives from the server.

    It can also be reliable because the reply from the server is effectively an

    acknowledgement to the client.

    The request-reply protocol :

The following protocol is based on three communication methods: doOperation,

    getRequest and sendReply. Most RMI and RPC systems are supported by a similar

    protocol.

    doOperation : The doOperation method is used by clients to call methods present on the

server. Its parameters specify the server and the method to invoke, together with

    additional information required by the method. It is assumed that the client calling

    doOperation marshals the arguments into an array of bytes and unmarshals the results

    from the array of bytes that are returned.

getRequest : The getRequest method is used by the server to receive the arguments sent from the client and then unmarshal the arguments into a format which is understandable


by the system, and then invoke the method which will perform the operation on the

    arguments.

sendReply : Once the operation is performed, the sendReply method will send the result in the form of a response.
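As a rough illustration, the three primitives might be sketched over UDP in Python as follows (the message layout, buffer size and marshalling are our own simplifying assumptions, not part of any standard):

import socket, struct

REQUEST, REPLY = 0, 1

def pack(msg_type, request_id, data: bytes) -> bytes:
    # message = type (4 bytes) | requestID (4 bytes) | marshalled arguments
    return struct.pack("!II", msg_type, request_id) + data

def unpack(raw: bytes):
    msg_type, request_id = struct.unpack("!II", raw[:8])
    return msg_type, request_id, raw[8:]

def do_operation(sock, server_addr, request_id, args: bytes) -> bytes:
    # client side: send the request and block until the reply arrives
    sock.sendto(pack(REQUEST, request_id, args), server_addr)
    raw, _ = sock.recvfrom(4096)
    _, rid, result = unpack(raw)
    assert rid == request_id          # match the reply to our request
    return result

def get_request(sock):
    # server side: receive a request and remember where it came from
    raw, client_addr = sock.recvfrom(4096)
    _, request_id, args = unpack(raw)
    return request_id, args, client_addr

def send_reply(sock, client_addr, request_id, result: bytes):
    # server side: send the result back as the reply message
    sock.sendto(pack(REPLY, request_id, result), client_addr)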

    Message identifiers :

Each and every message is attached with a message identifier so that the message can be uniquely identified and the order of the messages can be determined. A message identifier consists of two parts:

1. A requestID, which is taken from an increasing sequence of integers by the sending process.

2. An identifier for the sender process, for example its port and Internet addresses.

    Failure model of the request reply protocol :

If the three methods doOperation, getRequest and sendReply are implemented over UDP datagrams, then they suffer from the following communication failures:

1. They suffer from omission failures.

2. Messages are not guaranteed to be delivered in sender order.

    Timeouts :

Whenever a doOperation times out, there are a number of options for what it can do. One option is to return immediately from the doOperation once the timeout has occurred and indicate to the client that the doOperation has failed. Another option is to retransmit the request and wait again for the reply.

Discarding duplicate request messages :

    In cases when the request message is retransmitted, the server may receive it more than

    once. For example the server may receive the first request message but take longer than

the client's timeout to execute the command and return the reply. Once the timeout

    occurs the sender will retransmit the message again. This causes duplicate messages at


    the receiver end.

    Lost reply message :

    Sometimes the server will send a reply, but that reply will get lost before reaching the

    client. Then the client will send the same request once the timeout occurs.

The server will receive that request, perform the same operation and send the reply

    again.

    History :

In the lost-reply case the server would have to execute the operation again; instead of performing the operation again, the server stores the reply in a history and retransmits the stored reply when it receives a duplicate request.
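A small sketch of this idea, combining duplicate filtering with a history of replies (the data structures are illustrative assumptions; real RPC systems differ in detail):

# history maps (client_id, request_id) -> stored reply
history = {}

def handle_request(client_id, request_id, operation, args):
    key = (client_id, request_id)
    if key in history:
        # duplicate request: retransmit the stored reply
        # instead of executing the operation again
        return history[key]
    result = operation(*args)     # execute the operation exactly once
    history[key] = result         # keep the reply for possible retransmission
    return result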

    RPC exchange protocols :

    Three protocols, which produce differing behaviors in the presence of communication

    failures are

1. The request (R) protocol

    2. The request-reply (RR) protocol

    3. The request-reply-acknowledge reply (RRA) protocol.

    In the request (R) protocol the client will send a request, without any reply from the

    server.

    In the request-reply (RR) protocol the client will send a request, and the reply will be

    sent by the server

In the request-reply-acknowledge reply (RRA) protocol the client will send a request, the server will send a reply, and the client will then send an acknowledgement of the reply.

HTTP: an example of a request-reply protocol :

    HTTP :

    The Hyper Text Transfer Protocol defines the ways in which browsers and other types

    of clients interact with web servers.

    Request Reply interactions :


HTTP is a request-reply protocol. The client sends a request message to the server containing the URL of the required resource.
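For example, such a request-reply exchange can be reproduced with Python's standard http.client module (example.com and /index.html are placeholder values):

import http.client

conn = http.client.HTTPConnection("example.com")
conn.request("GET", "/index.html")      # request message naming the resource
reply = conn.getresponse()              # blocks until the reply arrives
print(reply.status, reply.reason)       # e.g. 200 OK
print(reply.getheader("Content-Type"))  # content type of the returned resource
body = reply.read()
conn.close()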

    Content types :

    Here content type refers to the type of content that is sent to the web server or received

    from the web server by the Browsers. These contents may include text, images, audio,

    video etc.

    One resource per request :

Clients specify one resource per HTTP request. If a web page contains nine images, say,

    then the browser will issue a total of ten separate requests to obtain the entire contents

    of the page.

    Simple access control :

    By default any user with network connectivity to a web server can access any of its

    published resources. If users wish to restrict access to a resource, then they can provide

access control on that resource. For example, the user can set read access on a file if he wants to give access on the file such that others can only read from it but cannot write into it.

    Explain about the implementations of Remote Procedure call?

    Remote procedure call :

A remote procedure call is very similar to a remote method invocation: in RMI the object at the client calls an object on the server to execute a method or a procedure; similarly, in RPC, instead of an object calling another object, a procedure or method at the client calls a procedure at the server to perform an operation.

RPC, like RMI, may be implemented with one of several choices of invocation semantics; at-least-once or at-most-once are generally chosen.

RPC is generally implemented over a request-reply protocol, which is simplified by the omission of object references from request messages, i.e. objects are not used to send or receive requests; instead of objects, simple procedures are used to implement


requests and replies.

    Draw Diagram RPC 1

    In the above diagram

    Client Process :

    The client requests a service by invoking a client procedure, which in turn calls the stub

    procedure.

    Client Stub Procedure : The role of a stub procedure is similar to that of a proxy

    method. It behaves like a local procedure to the client i.e. the client process will feel

    that it is calling a procedure in the system, but instead of executing the call, it passes

    the procedure identifier and the arguments into a request message, which it sends via

    its communication module to the server. When the reply message arrives from the

    server, it un-marshals the results.

    Server Process :

The server process contains a dispatcher together with one server stub procedure and one service procedure for each procedure in the service interface.

Dispatcher : The dispatcher selects one of the server stub procedures according to the procedure identifier in the request message.

Server Stub Procedure : A server stub procedure is like a skeleton method in that it un-

    marshals the arguments in the request message, calls the corresponding service

    procedure and marshals the return values for the reply message.

    Service Procedure : This procedure has the actual body of the procedure which is

    executed at the server and the result is returned to the Server Stub Procedure.

    Case Study: Sun RPC :

    The Sun RPC is designed for client server communication in the Sun NFS network file

    system. Sun RPC is sometimes called ONC (Open Network Computing).

    Interface definition language :

The Sun XDR language, which was originally designed for specifying external data representations, was extended to become an interface definition language. It may be used to define a service interface for Sun RPC. The procedures used in Sun RPC must


    have the following characteristics

    A procedure consists of procedure signature and a procedure number. The

    procedure number is used as a procedure identifier in request messages.

    Only a single input parameter is allowed. Therefore, procedures requiring

multiple parameters must pack them into a structure and then pass the structure as a

    parameter.

    The output parameters of a procedure are returned via a single result.

    The procedure signature consists of the return type, the name of the procedure

    and the type of input parameters. The type of both the result and the input

parameter may specify either a single value or a structure containing several values.

Authentication : Sun RPC request and reply messages provide additional fields for authentication. Only callers which pass authentication have the right to access a procedure at the server. The request message carries the user id of the user, which is passed to the server. The server authenticates the user id, and if it is valid the call to the server procedure is allowed.

    Write about Clock Synchronization?

    Clocks :

Computers each contain their own physical clock. These clocks are electronic devices that count oscillations occurring in a crystal and store the result in a counter register.

    Clock skew and clock drift :

Computer clocks present in different computers don't have the same reading, i.e. the times on their clocks always tend to differ.

The difference between the readings of any two clocks is called their skew. Also, the crystal-based clocks used in computers are, like any other clocks, subject to clock drift, which means that they count time at different rates, and so differences arise between them.


    Clock Synchronization :

In a distributed system such drifts between the clocks of two systems should not occur, therefore we use clock synchronization. In this mechanism the main aim is

    to synchronize the clocks of all the systems that are present in the Distributed System.

There are two ways which can be used to implement clock synchronization:

    1. Logical Clocks

    2. Physical Clocks

    Logical Clocks ::

    From the point of view of any single process, events are ordered uniquely by times

    shown on the local clock. However, as Lamport pointed out, since we cannot

    synchronize clocks perfectly across a distributed system, we cannot in general use

    physical time to find out the order of any pair of events occurring within it.

If two events occurred at the same process Pi (i = 1, 2, ..., N), then they occurred in the order in which Pi observes them.

Whenever a message is sent between processes, the event of sending the message occurred before the event of receiving the message.

    Synchronizing Logical clocks:

    Lamport invented a simple mechanism by which the ordering can be captured

numerically, called a logical clock. A Lamport logical clock is a monotonically increasing software counter. Each process Pi keeps its own logical clock, Li, which it uses to apply so-called

    Lamport timestamps to events.

To capture the happened-before relation, processes update their logical clocks and transmit the values

    of their logical clocks in messages as follows:


LC1: Li is incremented before each event is issued at process Pi: Li := Li + 1.

LC2: (a) When a process Pi sends a message m, it piggybacks on m the value t = Li. (b) On receiving (m, t), a process Pj computes Lj := max(Lj, t) and then applies LC1 before timestamping the event receive(m).

    Draw Diagram from notes.
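A minimal sketch of rules LC1 and LC2 in Python (the class and method names are our own, chosen for illustration):

import threading

class LamportClock:
    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()

    def tick(self):
        # LC1: increment before each local event
        with self.lock:
            self.value += 1
            return self.value

    def send_timestamp(self):
        # LC2(a): piggyback t = Li on an outgoing message
        return self.tick()

    def on_receive(self, t):
        # LC2(b): Lj := max(Lj, t), then apply LC1 for the receive event
        with self.lock:
            self.value = max(self.value, t)
            self.value += 1
            return self.value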

    Physical Clocks :

    Synchronizing physical clocks :

    The physical clocks present in the system can be synchronized in two ways.

    External Synchronization :

    External Synchronization refers to the synchronization of clocks that are maintained

externally by all the computers in a distributed system. UTC (Coordinated Universal Time) provides the external time, which helps the different systems synchronize their events accordingly.

    Internal Synchronization :

    Internal Synchronization refers to the synchronization of clocks that are maintained

    locally by all the computers in a distributed system. Each computer is associated with a

    local clock and if these clocks are synchronized in such a way that events are kept in an

    order then we say that it achieves internal synchronization.

Clock Synchronization Algorithms: Cristian's method for synchronizing clocks :

    Cristian suggested the use of a time server, connected to a device that receives signals

    from a source of UTC, to synchronize computers externally. Upon request, the server

    process S supplies the time according to its clock.

    A process p requests the time in a message mr, and receives the time value t in a

message mt (t is inserted in mt at the last possible point before transmission from S's computer). Process p records the total round-trip time Tround taken to send the request mr and receive the reply mt. It can measure this time with reasonable accuracy if its rate of clock drift is small. Process p then sets its clock to t + Tround/2, assuming the reply took roughly half the round-trip time.

    Draw diagram from notes.
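A sketch of the client side of Cristian's method (the network exchange is replaced by a local stand-in function; a real implementation would send mr and receive mt over sockets):

import time

def request_time_from_server():
    # stand-in for sending mr and receiving mt over the network
    return time.time()            # the server's clock value t

t0 = time.monotonic()
t = request_time_from_server()    # send mr, receive mt carrying t
t_round = time.monotonic() - t0   # total round-trip time Tround
estimated_now = t + t_round / 2   # set the clock to t + Tround/2
print("round trip:", t_round, "estimated server time:", estimated_now)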

    The Berkeley algorithm :

    Gusella and Zatti describe an algorithm for internal synchronization that they


    developed for collections of computers running Berkeley UNIX.

    In it, a coordinator computer is chosen to act as the master. This computer periodically

    checks the other computers whose clocks are to be synchronized, called slaves. The

    slaves send back their clock values to it.

    The master estimates their local clock times by observing the round-trip times

(similarly to Cristian's technique), and it averages the values obtained. The accuracy of

    the protocol depends upon a nominal maximum round-trip time between the master and

    the slaves.
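The master's averaging step can be sketched as follows (clock values are plain numbers here; fault-tolerant averaging and round-trip compensation are omitted):

def berkeley_master(master_time, slave_times):
    # average the master's clock with the slaves' reported clocks
    all_times = [master_time] + list(slave_times)
    avg = sum(all_times) / len(all_times)
    # each machine is sent the adjustment it must apply,
    # rather than an absolute time value
    master_adjust = avg - master_time
    slave_adjusts = [avg - t for t in slave_times]
    return master_adjust, slave_adjusts

# example: master reads 3:00, slaves read 3:25 and 2:50 (in minutes)
print(berkeley_master(180, [205, 170]))   # -> (5.0, [-20.0, 15.0])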

    Write about Mutual Exclusion?

    Mutual Exclusion :

    A Centralized Algorithm :

The most straightforward way to achieve mutual exclusion in a distributed system is to simulate how it is done in a one-processor system. One process is elected as the coordinator (e.g., the one running

    on the machine with the highest network address).

    Whenever a process wants to enter a critical region, it sends a request message to the

    coordinator stating which critical region it wants to enter and asking for permission. If

no other process is currently in that critical region, the coordinator sends back a reply granting permission.

Draw Diagram ME 1 (a)

    Let us consider another process asks for permission to enter the same critical region.

    The coordinator knows that a different process is already in the critical region, so it

    cannot grant permission. The coordinator just does not reply, thus blocking process 2,

    which is waiting for a reply. Alternatively, it could send a reply saying permission

    denied.

    Draw Diagram ME 1 (b)
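A sketch of the coordinator's logic described above (queueing a blocked requester instead of replying; all names are illustrative):

from collections import deque

class Coordinator:
    def __init__(self):
        self.holder = None      # process currently in the critical region
        self.queue = deque()    # blocked requesters

    def request(self, pid):
        if self.holder is None:
            self.holder = pid
            return "OK"         # grant permission immediately
        self.queue.append(pid)  # no reply: the requester stays blocked
        return None

    def release(self, pid):
        assert self.holder == pid
        self.holder = self.queue.popleft() if self.queue else None
        if self.holder is not None:
            return ("OK", self.holder)  # wake up the next waiter
        return None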

    A Distributed Algorithm :

    Having a single point of failure is frequently unacceptable, so researchers have looked

    for distributed mutual exclusion algorithms.

Ricart and Agrawala's algorithm requires that there be a total ordering of all events in the system, so that for any pair of events we can find out which event occurred first.


When a process wants to enter a critical region, it builds a message containing the name of the critical region, its process number, and the current time. It then sends the message to all other processes, conceptually including itself. The

    following three cases can occur.

    If the receiver is not in the critical region and does not want to enter it, it sends

    back an OK message to the sender.

    If the receiver is already in the critical region, it does not reply. Instead, it queues

    the request.

    If the receiver wants to enter the critical region but has not yet done so, it

    compares the timestamp in the incoming message with the one contained in the

message that it has sent to everyone. The lowest one wins. If the incoming message is lower, the receiver sends back an OK message. If its own message has a lower

    timestamp, the receiver queues the incoming request and sends nothing.

    After sending out requests asking permission to enter a critical region, a process sits

    back and waits until everyone else has given permission. As soon as all the permissions

are in, it may enter the critical region. When it exits the critical region, it sends an OK

    message to all processes on its queue and deletes them all from the queue.

    Draw Diagram ME - 2
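The receiver's decision rule in Ricart and Agrawala's algorithm can be sketched as below (state handling is simplified, and timestamps are assumed to be totally ordered pairs of (time, process number)):

def on_request(state, incoming):
    # state: dict with keys 'in_cr', 'wants_cr', 'my_stamp', 'deferred'
    # incoming: (timestamp, pid) of the requesting process
    if not state["in_cr"] and not state["wants_cr"]:
        return "OK"                         # case 1: not interested, reply OK
    if state["in_cr"]:
        state["deferred"].append(incoming)  # case 2: queue the request
        return None
    # case 3: we also want to enter; the lowest timestamp wins
    if incoming < state["my_stamp"]:
        return "OK"                         # the incoming request wins
    state["deferred"].append(incoming)      # our own request wins; defer
    return None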

    A Token Ring Algorithm :

    In software, a logical ring is constructed in which each process is assigned a position in

    the ring. The ring positions may be allocated in numerical order of network addresses

or some other means.

Draw Diagram ME - 3

When the ring is initialized, process 0 is given a token. The token circulates around the ring. It is passed from process k to process k + 1 (modulo the ring size) in point-to-

    point messages.
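A toy simulation of the circulating token (a real implementation would pass the token in point-to-point messages between machines; wants_to_enter is an illustrative stand-in):

ring_size = 5          # processes 0 .. 4 arranged in a logical ring
token_holder = 0       # process 0 is given the token at initialization

def wants_to_enter(k):
    return k % 2 == 0  # stand-in for a process's real demand

for _ in range(10):
    if wants_to_enter(token_holder):
        # only the token holder may enter the critical region
        print("process", token_holder, "enters its critical region")
    # afterwards (or if it does not want to enter) it passes the token on
    token_holder = (token_holder + 1) % ring_size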


    Write about Election Algorithms?

    Election Algorithms :

    Many distributed algorithms require one process to act as coordinator, initiator,

    sequencer, or otherwise perform some special role.

    The Bully Algorithm :

    Consider the bully algorithm devised by Garcia-Molina (1982). When a process

    notices that the coordinator is no longer responding to requests, it initiates an election.

    A process, P, holds an election as follows:

    P sends an ELECTION message to all processes with higher numbers.

If no one responds, P wins the election and becomes coordinator.

If one of the higher-ups answers, it takes over. P's job is done.

At any moment, a process can get an ELECTION message from one of its lower-numbered colleagues. When such a message arrives, the receiver sends an OK message back to the sender to indicate that it is alive and will take over.

If a process that was previously down comes back up, it holds an election. If it happens

    to be the highest-numbered process currently running, it will win the election and take

    over the coordinators job. Thus the biggest guy in town always wins, hence the name

    bully algorithm.

    Draw Diagram ME - 4
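One election round can be sketched from P's point of view (alive is an illustrative stand-in for real failure detection and messaging):

def hold_election(p, processes, alive):
    # P sends ELECTION to every process with a higher number
    higher = [q for q in processes if q > p and alive(q)]
    if not higher:
        return p            # no one answered: P wins and becomes coordinator
    # some higher-numbered process answered OK and takes over;
    # it then holds its own election in turn
    return hold_election(min(higher), processes, alive)

# example: processes 1..7, process 7 has crashed, process 4 notices
print(hold_election(4, range(1, 8), alive=lambda q: q != 7))  # -> 6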

    Ring Algorithm :

Another election algorithm is based on the use of a ring, but without a token. When any process notices that the coordinator is not functioning, it builds an ELECTION message

    containing its own process number and sends the message to its successor.

    If the successor is down, the sender skips over the successor and goes to the next

    member along the ring, or the one after that, until a running process is located. At each

    step, the sender adds its own process number to the list in the message.


When this message has circulated once, it is removed and everyone goes back to

    work.

    Draw Diagram ME - 5
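A sketch of the circulating ELECTION message (dead successors are skipped via the alive test; the highest collected number becomes coordinator):

def ring_election(starter, ring, alive):
    # ring: list of process numbers in ring order
    members = [starter]
    n = len(ring)
    i = (ring.index(starter) + 1) % n
    while ring[i] != starter:
        if alive(ring[i]):
            members.append(ring[i])  # each live node appends its own number
        i = (i + 1) % n              # dead successors are skipped
    return max(members)              # highest number becomes coordinator

print(ring_election(2, [0, 1, 2, 3, 4], alive=lambda q: q != 4))  # -> 3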

    Write about Atomic Transactions?

    Atomic Transactions :

    Clients require a sequence of separate requests to a server to be atomic in the sense

    that:

    They are free from interference by operations being performed on behalf of other

    concurrent clients; and

    Either all of the operations must be completed successfully or they must have no

effect at all in the presence of server crashes.

A client that performs a sequence of operations on a particular bank account will first look up the account by name and then apply the deposit, withdraw and getBalance operations directly to the relevant account.

    Transaction Primitives :

    Whenever transaction takes place the following transaction primitives are applied on

    that transaction

BEGIN_TRANSACTION : This indicates the beginning of the transaction.

READ or WRITE : These specify read or write operations on the database items that

    are executed as a part of a transaction.

    END_TRANSACTION : This specifies that READ and WRITE transaction operations

    have ended and marks the end of transaction execution. However at this point it may be

necessary to check whether the changes introduced by the transaction can be permanently


    applied to the database or whether the transaction has to be aborted because it violates

    serializability.

COMMIT_TRANSACTION : This signals a successful end of the transaction, so that any changes executed by the transaction can be safely committed to the database and will not be undone.

ROLLBACK (Abort) : This signals that the transaction has ended unsuccessfully, so that any changes the transaction has applied to the database must be undone.
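Schematically, a client brackets a sequence of operations with these primitives, as in the bank-transfer sketch below (the db object and its method names are illustrative assumptions, not a specific product's API):

def transfer(db, src, dst, amount):
    tid = db.begin_transaction()              # BEGIN_TRANSACTION
    try:
        balance = db.read(tid, src)           # READ
        db.write(tid, src, balance - amount)  # WRITE
        db.write(tid, dst, db.read(tid, dst) + amount)
        db.commit(tid)      # COMMIT_TRANSACTION: changes become permanent
    except Exception:
        db.rollback(tid)    # ROLLBACK: all changes made so far are undone
        raise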

Properties of a Transaction (ACID) : Transactions are associated with the following properties, which are often referred to as the ACID properties.

Atomicity : A transaction is an atomic unit of processing, i.e. it is performed either in its entirety or not at all. If we have to start a transaction, first we see whether that transaction can be completed; only if we are sure that it will be completed do we start the transaction, otherwise the transaction is not started.

Consistency Preservation : A transaction is consistent if its complete execution takes the database from one consistent state to another consistent state, i.e. if we make changes and even after making those changes the data is correct, then we say that consistency is preserved.

Isolation : A transaction should appear as though it is being executed in isolation from

other transactions, i.e. the execution of a transaction must not affect the execution and output of the other transactions.

Durability or Permanency : Any changes made in the database by a committed transaction must be stored permanently. These changes must not be lost because of any failure.

    Nested transactions:

Nested transactions extend the above transaction model by allowing transactions to be

    composed of other transactions. Thus several transactions may be started from within a

    transaction, allowing transactions to be regarded as modules that can be composed as

    required.

    A subtransaction appears atomic to its parent with respect to transaction failures and to


concurrent access. Subtransactions at the same level, such as T1 and T2, can run concurrently, but their access to common objects is serialized, for example by a locking scheme.

Subtransactions at one level (and their descendants) may run concurrently with other subtransactions at the same level in a hierarchy. This can allow additional concurrency in a transaction.

Subtransactions can commit or abort independently. In comparison with a single transaction, a set of nested subtransactions is potentially more robust.

    Implementation of Transactions :

    The two phase commit protocol :

    A client request to commit (or abort) a transaction is directed to the coordinator. If the

client requests abortTransaction, or if the transaction is aborted by one of the

    participants, the coordinator informs the participants immediately.

If a participant can commit its part of the transaction, it will agree as soon as it has recorded the changes and its status in permanent storage and is prepared to commit.

    The coordinator in a distributed transaction communicates with the participants to carry

    out the two-phase commit protocol.

The two-phase commit protocol consists of a voting phase and a completion phase. By the end of step (2) the coordinator and all the participants that voted Yes are prepared to commit. By the end of step (3) the transaction is effectively completed. At step (3a) the coordinator and the participants are committed, so the coordinator can report a decision to commit to the client. At (3b) the coordinator reports a decision to abort to the client.

    The two Phase commit protocol:

    Phase1 (voting Phase) :

The coordinator sends a canCommit? request to each participant in the


    transaction.

When a participant receives a canCommit? request it replies with its vote (Yes or No) to the coordinator. Before voting Yes, it prepares to commit by saving objects in permanent storage. If the vote is No, the participant aborts

    immediately.

Phase 2 (completion according to outcome of vote) :

    The coordinator collects the votes (including its own).

If there are no failures and all the votes are Yes, the coordinator decides to commit the transaction and sends a doCommit request to each of the

    participants.

    Otherwise the coordinator decides to abort the transaction and sends doAbort

    requests to all participants that voted Yes .

    Participants that voted Yes are waiting for a doCommit or doAbort request from

    the coordinator. When a participant receives one of these messages it acts

    accordingly and in the case of commit, makes a haveCommitted call as

    confirmation to the coordinator.
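The coordinator side of the protocol can be sketched as follows (participants are modelled as objects with can_commit, do_commit and do_abort methods mirroring the messages above):

def two_phase_commit(coordinator_vote, participants):
    # Phase 1 (voting): collect canCommit? votes, including the coordinator's own
    votes = [coordinator_vote] + [p.can_commit() for p in participants]
    # Phase 2 (completion according to the outcome of the vote)
    if all(votes):
        for p in participants:
            p.do_commit()       # doCommit; participant replies haveCommitted
        return "committed"
    for p, vote in zip(participants, votes[1:]):
        if vote:
            p.do_abort()        # doAbort is sent only to those that voted Yes
    return "aborted"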

    Concurrency control :

    When multiple transactions are executing simultaneously in different processes, some

mechanism is needed to keep them out of each other's way. That mechanism is called a

    concurrency control.

    Locking :

    In a distributed transaction, the locks on an object are held locally (in the same server).

    The local lock manager can decide whether to grant a lock or make the requesting

    transaction wait.

    When locking is used for concurrency control, the objects remain locked and are

unavailable for other transactions during the atomic commit protocol, although an


    aborted transaction releases its locks after phase 1 of the protocol.

    Optimistic concurrency control:

    Recall that with optimistic concurrency control, each transaction is validated before it

    is allowed to commit. Transaction numbers are assigned at the start of validation and

transactions are serialized according to the order of their transaction numbers.

    A distributed transaction is validated by a collection of independent servers, each of

    which validates transactions that access its own objects. The validation at all of the

    servers takes place during the first phase of the two phase commit protocol.

    Timestamp concurrency control:

    In a single server transaction, the coordinator issues a unique timestamp to each

    transaction when it starts.

    Serial equivalence is enforced by committing the versions of objects in the order of the

    timestamps of transactions that accessed them. In distributed transactions, we require

    that each coordinator issue globally unique timestamps.

    A globally unique transaction timestamp is issued to the client by the first coordinator

accessed by a transaction. The transaction timestamp is passed to the coordinator at each server whose objects perform an operation in the transaction.

    Explain about Distributed Deadlocks?

    Distributed deadlocks:

    Deadlocks can arise within a single server when locking is used for concurrency

control. Servers must either prevent or detect and resolve deadlocks. Using timeouts to resolve possible deadlocks is a clumsy approach: it is difficult to choose an appropriate timeout interval, and transactions are aborted unnecessarily.

    With deadlock detection schemes, a transaction is aborted only when it is involved in a

deadlock. Most deadlock detection schemes operate by finding cycles in the transaction wait-for graph. There can be a cycle in the global wait-for graph that is not in any single local one; that is, there can be a distributed deadlock. Recall that the wait-for


graph is a directed graph in which nodes represent transactions and edges represent waiting relationships between them. There is a deadlock if and only if there is a cycle in the wait-for graph.

    Definition of deadlock :

    Deadlock is a state in which each member of a group of transactions is waiting for

    some other member to release a lock. A wait for graph can be used to represent the

    waiting relationships between current transactions. In a wait for graph the nodes

    represent transactions and the edges represent wait-for relationships between

transactions: there is an edge from node T to node U when transaction T is waiting for

    transaction U to release a lock.

    Deadlock prevention :

    One solution is to prevent deadlock. An apparently simple but not very good way to

overcome deadlock is to lock all of the objects used by a transaction when it starts. This would need to be done as a single atomic step so as to avoid deadlock at this stage. Such a transaction cannot run into deadlock with other transactions, but it unnecessarily restricts access to shared resources. In addition, it is sometimes impossible to predict at the start of a transaction which objects will be used. Deadlock can also be prevented by requesting locks on objects in a predefined order, but this can result in premature locking and a reduction in concurrency.

    Deadlock detection :

Deadlocks may be detected by finding cycles in the wait-for graph. Having detected a deadlock, a transaction must be selected for abortion to break the cycle.

The software responsible for deadlock detection can be part of the lock manager. It must hold a representation of the wait-for graph so that it can check it for cycles from time to time. Edges are added to the graph and removed from the graph by the lock manager's setLock and unLock operations: an edge T → U is added whenever transaction T starts waiting for a lock that transaction U holds, and the edge is removed when the lock is released.
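Checking the wait-for graph for cycles can be sketched with a depth-first search (the graph is represented as a dict mapping each transaction to the transactions it waits for):

def has_deadlock(wait_for):
    visiting, done = set(), set()

    def dfs(t):
        if t in visiting:
            return True                 # back edge: a cycle, hence a deadlock
        if t in done:
            return False
        visiting.add(t)
        if any(dfs(u) for u in wait_for.get(t, ())):
            return True
        visiting.remove(t)
        done.add(t)
        return False

    return any(dfs(t) for t in wait_for)

# T waits for U, U waits for V, V waits for T: a deadlock
print(has_deadlock({"T": ["U"], "U": ["V"], "V": ["T"]}))  # True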

    Timeouts :

Lock timeouts are a commonly used method for the resolution of deadlocks. Each


    lock is given a limited period in which it is invulnerable. After this time, a lock

    becomes vulnerable. Provided that no other transaction is competing for the object that

is locked, an object with a vulnerable lock remains locked. However, if any other

    transaction is waiting to access the object protected by a vulnerable lock, the lock is

    broken (that is, the object is unlocked) and the waiting transaction resumes.

    Write about Threads in Distributed System?

    THREADS :

    In most traditional operating systems, each process has an address space and a single

    thread of control.

    Introduction to Threads :

    Consider, for example, a file server that occasionally has to block waiting for the disk.

    If the server had multiple threads of control, a second thread could run while the first

    one was sleeping. The net result would be a higher throughput and better performance.

    It is not possible to achieve this goal by creating two independent server processes

because they must share a common buffer cache, which requires them to be in the same address space.

Such a process contains multiple threads of control, usually just called threads, or

    sometimes lightweight processes. Threads share the CPU just as processes do: first

    one thread runs, then another does (timesharing).

Only on a multiprocessor do they actually run in parallel. Threads can create child threads and can block waiting for system calls to complete, just like regular processes.

    While one thread is blocked another thread in the same process can run, in exactly the

    same way that when one process blocks, another process in the same machine can run.

The analogy that a thread is to a process as a process is to a machine holds in many ways.

    Thread usage :


Threads were invented to allow parallelism to be combined with sequential execution and blocking system calls. Here one thread, the dispatcher, reads incoming requests for

    work from the system mailbox.

After examining the request, it chooses an idle (i.e., blocked) worker thread and hands it

    the request, possibly by writing a pointer to the message into a special word associated

    with each thread. The dispatcher then wakes up the sleeping worker (e.g., by doing UP

    on the semaphore on which it is sleeping).

    The scheduler will now be invoked and another thread will be started, possibly the

    dispatcher, in order to acquire more work, or possibly another worker that is now ready

    to run.
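The dispatcher/worker organization can be sketched with Python threads (a Queue stands in for the system mailbox and for waking sleeping workers, rather than an explicit semaphore per thread):

import threading, queue

mailbox = queue.Queue()

def handle(request):
    print("handled", request)     # stand-in for the real work, which may block

def worker():
    while True:
        request = mailbox.get()   # sleep until the dispatcher hands us work
        handle(request)           # may block on I/O; other workers keep running
        mailbox.task_done()

for _ in range(3):                # a pool of worker threads
    threading.Thread(target=worker, daemon=True).start()

# the dispatcher reads incoming requests and queues them for the workers
for req in ["r1", "r2", "r3"]:
    mailbox.put(req)
mailbox.join()                    # wait until all queued requests are handled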

    Design Issues for Threads Package :

    A set of primitives (e.g., library calls) available to the user relating to threads is called a

    threads package .

The first issue we will consider is how threads are managed. Two alternatives are possible here: static threads and dynamic threads.

With a static design, the choice of how many threads there will be is made when the program is written or when it is compiled. Each thread is allocated a fixed stack. This

    approach is simple, but inflexible.

    Implementing a Threads Package :

    Implementing Threads in User Space :

    The first method is to put the threads package entirely in user space. The kernel knows

    nothing about them. As far as the kernel is concerned, it is managing ordinary, single-

    threaded processes.

The first, and most obvious, advantage is that a user-level threads package can be implemented on an operating system that does not support threads. The threads run on top of a runtime system, which is a collection of procedures that manage threads. When a thread executes a system call, goes to sleep, performs an operation on a semaphore or

    mutex, or otherwise does something that may cause it to be suspended, it calls a


    runtime system procedure.

    User-level threads also have other advantages. They allow each process to have its own

    customized scheduling algorithm. For some applications, for example, those with a

    garbage collector thread, not having to worry about a thread being stopped at an

inconvenient moment is a plus.

    Draw Diagram T1

Implementing Threads in the Kernel :

To manage all the threads, the kernel has one table per process with one entry per

thread. Each entry holds the thread's registers, state, priority, and other information. The information is the same as with user-level threads, but it is now in the kernel instead of in

    user space (inside the runtime system). This information is also the same information

    that traditional kernels maintain about each of their single-threaded processes, that is,

    the process state.

    What are the different types of System Models?

    System Model:

There are three types of system models that describe how a distributed system can be organized:

    o Work Station Model

o Processor Pool Model

o Hybrid Model

    1. The Workstation model :

    The workstation model is a system that consists of workstation (personal computers)

    scattered throughout a building or campus and connected by a high speed LAN. Some

    of the workstations may be in offices, and thus dedicated to a single user, whereas

  • 8/7/2019 DIS MATERIAL

    33/49

    Page 33 of 49

others may be in public areas and have several different users during the course of a

    day.

    Draw Diagram WM - 1

Diskless Workstations :

    Workstations that do not have a local hard disk are referred as diskless workstations.

    If the workstations are diskless, the files are stored in a remote file server. Requests to

    read and write files are sent to a file server, which performs the work and sends back

    the replies.

Diskless workstations are popular at universities and companies. Having a large number

    of workstations equipped with small, slow disks is typically much more expensive than

    having one or two file servers equipped with huge, fast disks and accessed over the

    LAN.

Diskful Workstations :

Workstations that have a local hard disk are referred to as diskful or disky workstations. There are a number of different models of diskful workstations.

In the first model all the user files are kept on a central file server, and the local disks are used for paging and for storing temporary files. In a second model, the local disks additionally hold the binary (executable) programs.

A third approach to using local disks is to use them as explicit caches (in addition to using them for paging, temporaries, and binaries). In this mode of operation, users can download

    files from the file servers to their own disks, read and write them locally, and then

    upload the modified ones at the end of the login session.

Fourth, each machine can have its own self-contained file system, with the possibility of mounting or otherwise accessing other machines' file systems. The idea here is that

    each machine is basically self-contained and that contact with the outside world is

    limited.

    Using Idle Workstations :

If the user is not using the workstation, or has not activated it for some time, that workstation is said to be an idle workstation. Much of the time, workstations are


idle, especially at universities, which is a waste of resources; there must therefore be some means of putting idle workstations to use.

Server Driven : When a workstation goes idle, it announces its availability by entering its name, network address, and properties in a registry file (or database).

    An alternative way for the newly idle workstation to announce the fact that it has

    become unemployed is to put a broadcast message onto the network. All other

    workstations then record this fact.

    Client Driven : The other way to locate idle workstations is to use a client-driven

    approach. When remote is invoked, it broadcasts a request saying what program it

    wants to run, how much memory it needs, whether or not floating point is needed, and

    so on. These details are not needed if all the workstations are identical, but if the

    system is heterogeneous and not every program can run on every workstation, they are

    essential.

When the replies come back, the requesting machine picks one of the idle workstations and sets it up.
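A rough sketch of the client-driven approach (the port number and message format are ours, purely illustrative): broadcast what we want to run, collect replies from idle workstations for a short interval, then pick one.

    import socket

    PORT = 9999
    request = b"RUN prog=cc mem=64M fp=yes"     # program, memory, FPU needs

    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    s.settimeout(1.0)                           # stop collecting after 1 second
    s.sendto(request, ("<broadcast>", PORT))

    replies = []
    try:
        while True:
            data, addr = s.recvfrom(1024)       # an idle workstation answers
            replies.append(addr)
    except socket.timeout:
        pass

    if replies:
        chosen = replies[0]                     # pick one and set it up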

    2. The Processor Pool Model :

Although using idle workstations adds a little computing power to the system, it does not address a more fundamental issue: what happens when it is feasible to provide 10 or 100 times as many CPUs as there are active users?

An alternative approach is to construct a processor pool, a rack full of CPUs in the

    machine room, which can be dynamically allocated to users on demand.

    The motivation for the processor pool idea comes from taking the diskless workstation

    idea a step further.

    In effect, we are converting all the computing power into idle workstations that can

    be accessed dynamically. User can be assigned as many CPUs as they need for short

    periods, after which they are returned to the pool so that other users can have them.

    There is no concept of ownership here: all the processors belong equally to everyone.

    Draw Diagram WM - 2


    A Hybrid Model :

    A possible compromise is to provide each user with a personal workstation and to have

    a processor pool in addition. Although this solution is more expensive than either a

    pure workstation model or a pure processor pool model, it combines the advantages of

    both of the others.

    What are the different Processor Allocation Algorithms?

    Processor Allocation :

    A distributed system consists of multiple processors. These may be organized as a

    collection of personal workstations, a public processor pool, or some hybrid form.

Some algorithm is needed for deciding which process should be run on which machine.

Allocation Models :

    Processor allocation strategies can be divided into two broad classes.

    Non-Migratory : In non-migratory allocation algorithms, when a process is created, a

    decision is made about where to put it. Once placed on a machine, the process stays

    there until it terminates.

It may not move, no matter how badly overloaded its machine becomes and no matter

    how many other machines are idle.

    Migratory : With migratory allocation algorithms, a process can be moved even if it has

    already started execution. While migratory strategies allow better load balancing, they

    are more complex and have a major impact on system design.

    Design Issues for Processor Allocation Algorithms :

A large number of processor allocation algorithms have been proposed over the years. They differ in the following ways:

    o Deterministic versus heuristic algorithms.

    o Centralized versus distributed algorithms.

    o Optimal versus suboptimal algorithms.

o Local versus global algorithms.

o Sender-initiated versus receiver-initiated algorithms.


    Deterministic versus Heuristic Algorithms :

    Deterministic algorithms are appropriate when everything about process

    behavior is known in advance. Imagine that you have a complete list of all

processes, their computing requirements, their file requirements, their

    communication requirements, and so on.

Heuristic algorithms are used where the load is completely unpredictable. Requests for work depend on who's doing what, and can change dramatically from hour to hour, or even from minute to minute. Processor allocation in such systems cannot be done in a deterministic, mathematical way, but of necessity

    uses ad hoc techniques called heuristics.

    Centralized versus Distributed Algorithms :

    The second design issue is centralized versus distributed. Collecting all the

    information in one place allows a better decision to be made, but is less robust

    and can put a heavy load on the central machine. Decentralized algorithms are

    usually preferable, but some centralized algorithms have been proposed for lack

    of suitable decentralized alternatives.

Optimal versus Sub-Optimal Solutions :

Optimal solutions can be obtained in both centralized and decentralized systems,

    but are invariably more expensive than suboptimal ones. They involve collecting

    more information and processing it more thoroughly. In practice, most actual

    distributed systems settle for heuristic, distributed, suboptimal solutions because

    it is hard to obtain optimal ones.

    Local versus Global Algorithms :

    When a process is about to be created, a decision has to be made whether or not

    it can be run on the machine where it is being generated. If that machine is too

busy, that new process must be transferred somewhere else. A local algorithm makes this decision using only information available on the local machine, whereas a global algorithm first gathers information about the load on other machines.

Sender-Initiated versus Receiver-Initiated :

If a workstation has too much work (i.e., it is overloaded), it sends requests to other machines


to share its load; this is sender-initiated. But if a workstation is idle, it sends a request to other workstations announcing that it is idle and can take on additional work; this is receiver-initiated.

    Example Processor Allocation Algorithms :

1. A Graph-Theoretic Deterministic Algorithm :

A widely studied class of algorithms is for systems consisting of processes with known CPU and memory requirements, and a known matrix giving the average amount of traffic between each pair of processes.

    Draw Diagram WM 2 (a)

In the above diagram the graph is partitioned with processes A, E, and G on one processor, B, F, and H on a second, and processes C, D, and I on the third. The

    total network traffic is the sum of the arcs intersected by the dotted cut lines, or

    30 units.
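A small sketch of how such a partition is evaluated (process names and traffic values below are illustrative, not taken from the figure): the cost of a partitioning is the total traffic on arcs whose endpoints land on different processors.

    traffic = {                      # hypothetical inter-process traffic matrix
        ("A", "B"): 3, ("A", "E"): 6, ("B", "C"): 2,
        ("C", "D"): 4, ("E", "G"): 1, ("F", "H"): 5,
    }

    def cut_cost(assignment, traffic):
        """Sum the traffic of every arc that crosses a processor boundary."""
        return sum(t for (p, q), t in traffic.items()
                   if assignment[p] != assignment[q])

    assignment = {"A": 0, "E": 0, "G": 0,   # one candidate partitioning
                  "B": 1, "F": 1, "H": 1,
                  "C": 2, "D": 2}
    print(cut_cost(assignment, traffic))    # lower totals mean less communication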

    Draw Diagram WM 2 (b)

    In the above diagram we have a different partitioning that has only 28 units of

network traffic. Assuming that it meets all the memory and CPU constraints, this is a better choice because it requires less communication.

2. A Centralized Algorithm :

    The unusual thing about this algorithm, and the reason that it is centralized, is

    that instead of trying to maximize CPU Utilization, it is concerned with giving

    each workstation owner a fair share of the computing power.

    Whereas other algorithms will happily let one user take over all the machines if

    he promises to keep them all busy (i.e., achieve a high CPU utilization), this

    algorithm is designed to prevent precisely that.

    Usage table entries can be positive, zero, or negative. A positive score indicates

    that the workstation is a net user of system resources, whereas a negative score

    means that it needs resources. A zero score is neutral.

    3. A Hierarchical Algorithm :


    One approach that has been proposed for keeping tabs on a collection of

    processors is to organize them in a logical hierarchy independent of the physical

    structure of the network.

    Draw Diagram WM - 3

For each group of k workers, one manager machine (the department head) is

    assigned the task of keeping track of who is busy and who is idle. If the system

    is large, there will be an unwieldy number of department heads, so some

    machines will function as deans, each riding herd on some number of

department heads. If there are many deans, they too can be organized

    hierarchically, with a big cheese keeping tabs on a collection of deans.

    This hierarchy can be extended ad infinitum, with the number of levels needed

    growing logarithmically with the number of workers.
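As a rough illustration of that logarithmic growth (the numbers are ours, not from the text): with a fan-out of k machines per manager, the number of manager levels needed for n workers is about log base k of n.

    import math

    def levels_needed(n_workers, fan_out):
        return math.ceil(math.log(n_workers) / math.log(fan_out))

    print(levels_needed(1_000_000, 10))  # -> 6 levels for a million workers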

    4. A Receiver-Initiated Distributed Heuristic Algorithm :

With this algorithm, whenever a process finishes, the system checks to see if it

    has enough work. If not, it picks some machine at random and asks it for work.

    If that machine has nothing to offer, a second, and then a third machine is asked.

If no work is found with N probes, the receiver temporarily stops asking, does any work it has queued up, and tries again when the next process finishes. If no

    work is available, the machine goes idle. After some fixed time interval, it

    begins probing again.
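A minimal sketch of the probing step (machine list, probe limit, and the remote-request stub are illustrative): an underloaded machine asks up to N randomly chosen machines for surplus work.

    import random

    N_PROBES = 3

    def ask_for_work(machine):
        """Stand-in for a real remote request; returns a work item or None."""
        return None

    def find_work(machines):
        for _ in range(N_PROBES):
            candidate = random.choice(machines)
            work = ask_for_work(candidate)
            if work is not None:
                return work
        return None   # stop probing for now; retry when the next process finishes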

    5. A Bidding Algorithm :

    Each processor advertises its approximate price by putting it in a publicly

    readable file. This price is not guaranteed, but gives an indication of what the

    service is worth (actually, it is the price that the last customer paid).

    Different processors may have different prices, depending on their speed,

    memory size, presence of floating-point hardware, and other features. An

    indication of the service provided, such as expected response time, can also be

    published.

When a process wants to start up a child process, it goes around and checks out who


is currently offering the service that it needs. It then determines the set of processors whose services it can afford.
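A tiny sketch of that selection step (prices, names, and budget are illustrative): read the advertised prices, keep the affordable processors, and pick one.

    advertised = {"cpu-1": 12.0, "cpu-2": 5.5, "cpu-3": 9.0}  # the public "file"

    def affordable(prices, budget):
        return {name: price for name, price in prices.items() if price <= budget}

    candidates = affordable(advertised, budget=10.0)
    best = min(candidates, key=candidates.get)   # e.g., simply take the cheapest bid
    print(best)                                   # -> cpu-2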

    Write about Real Time Distributed Systems?

    Real Time Distributed Systems:

    Real-time programs (and systems) interact with the external world in a way

that involves time. When a stimulus appears, the system must respond to it in a certain way and before a certain deadline. If it delivers the correct answer, but after the deadline, the system is regarded as having failed. When the answer is produced is as important as which answer is produced.

    Real-time systems are generally split into two types depending on how serious

    their deadlines are and the consequences of missing one. These are:

o Soft real-time systems.

o Hard real-time systems.

    Soft Real Time System : Soft real-time means that missing an occasional

    deadline is all right. For example, a telephone switch might be permitted to lose

or misroute one call in 10^5 under overload conditions and still be within

    specification.

    Hard Real Time System : In contrast, even a single missed deadline in a hard

    real-time system is unacceptable, as this might lead to loss of life or an

    environmental catastrophe. In practice, there are also intermediate systems

where missing a deadline means you have to kill off the current activity, but the consequence is not fatal. For example, if a soda bottle on a conveyor belt has passed by the nozzle, there is no point in continuing to squirt soda at it, but the results are not fatal. Also, in some real-time systems, some subsystems are hard real time whereas others are soft real time.

    Design Issues :


    Event- Triggered versus Time-Triggered Systems :

    In an event-triggered real-time system , when a significant event in the outside

    world happens, it is detected by some sensor, which then causes the attached

    CPU to get an interrupt. Event-triggered systems are thus interrupt driven.

Most real-time systems work this way. For a soft real-time system with lots of computing power to spare, this approach is simple, works well, and is still widely used. Even for more complex systems, it works well if the compiler can analyze the program and know all there is to know about the system behavior once an event happens, even if it cannot tell when the event will happen.

    The main problem with event-triggered systems is that they can fail under

    conditions of heavy load, that is, when many events are happening at once.

Consider, for example, what happens when a pipe ruptures in a computer-controlled nuclear reactor. Temperature alarms, pressure alarms, radioactivity alarms, and other alarms will all go off at once, causing massive interrupts. This event shower may overwhelm the computing system and bring it down, potentially causing problems far more serious than the rupture of a single pipe.

    Finally, some events may be shorter than a clock tick, so they must be saved to

    avoid losing them. They can be preserved electrically by latch circuits or by

    microprocessors embedded in the external devices.

    Predictability :

    One of the most important properties of any real-time system is that its behavior

    be predictable. Ideally, it should be clear at design time that the system can meet

    all of its deadlines, even at peak load.

    Fault Tolerance :

    In a safety-critical system, it is especially important that the system be able to

    handle the worst-case scenario. It is not enough to say that the probability of

    three components failing at once is so low that it can be ignored. Failures are not

    always independent.


    During a sudden electric power failure, everyone grabs the telephone, possibly

    causing the phone system to overload, even though it has its own independent

    power generation system.

    Language support :

    The language should be designed so that the maximum execution time of every

task can be computed at compile time. This requirement means that the language cannot support general while loops. Iteration must be done using for

    loops with constant parameters.

    Real-time languages need a way to deal with time itself. To start with, a special

    variable, clock, should be available, containing the current time in ticks.

    However, one has to be careful about the unit that time is expressed in. The finer

    the resolution, the faster clock will overflow.
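A rough illustration of that trade-off (our own numbers): how long an unsigned 32-bit tick counter lasts before it overflows, at several resolutions.

    for tick_seconds, label in [(1e-3, "1 ms"), (1e-6, "1 us"), (1e-9, "1 ns")]:
        horizon_days = (2**32 * tick_seconds) / 86_400
        print(f"{label} ticks overflow after about {horizon_days:,.3f} days")
    # 1 ms -> ~49.7 days; 1 us -> ~0.05 days (about 72 minutes); 1 ns -> ~4.3 seconds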

    Real-Time Communication :

    Communication in real-time distributed systems is different from

    communication in other distributed systems.

Consider a token ring LAN. Whenever a processor has a packet to send, it waits for the circulating token to pass by, then it captures the token, sends its packet, and puts the token back on the ring so that the next machine downstream gets the opportunity to seize it.

    Draw Diagram WM - 4

    Write about Distributed File System Design?

    Distributed File System Design:

    File Service :

    The file service is the specification of what the file system offers to its clients. It

    describes the primitives available, what parameters they take, and what actions

    they perform.

    File Server :

    A file server, in contrast, is a process that runs on some machine and helps


    implement the file service.

    Distributed File System :

A distributed file system typically has two reasonably distinct components: the

    file service and the directory service. The File Service is concerned with the

    operations on individual files, such as reading, writing, and appending, whereas

    Directory service is concerned with creating and managing directories, adding

    and deleting files from directories, and so on.

    The File Service Interface :

    A file is an un-interpreted sequence of bytes. The meaning and structure of the

    information in the files is entirely up to the application programs; the operating

    system is not interested.

    A file can be structured as a sequence of records, for example, with operating

    systems calls to read or write a particular record. The record can usually be

    specified by giving either its record number (i.e., position within the file) or the

    value of some field. In the latter case, the operating system either maintains the

file as a B-tree or other suitable data structure, or uses hash tables to locate records quickly.

A file can have attributes, which are pieces of information about the file but which are not part of the file itself. Typical attributes are the owner, size, creation date, and access permissions.

    File services can be split into two types, depending on whether they support an

    upload/download model or a remote access model. In the upload/download

    model, the file service providers only two major operations: read file and write

    file.

    The former operation transfers an entire file from one of the file servers to the

    requesting client. The latter operation transfers an entire file the other way, from

    client to server. Thus the conceptual model is moving whole files in either

direction. The files can be stored in memory or on a local disk, as needed. In the remote access model, by contrast, the file service provides many operations on parts of files, and the files themselves remain on the server.
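A minimal sketch of the upload/download model (function names are ours, not a real API): the only operations move whole files, and all edits happen on the local copy.

    def read_file(server, name):
        """Transfer an entire file from the server to the client."""
        return server.fetch(name)          # stand-in: returns the full byte sequence

    def write_file(server, name, data):
        """Transfer an entire file from the client back to the server."""
        server.store(name, data)           # stand-in for the upload

    # typical session: download, work on the local copy, upload the result
    # data = read_file(server, "prog.c")
    # data = data.replace(b"old", b"new")  # all edits happen locally
    # write_file(server, "prog.c", data)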

    Inter Directory Server Interface :


    The other part of the file service is the directory service, which provides

    operations for creating and deleting directories, naming and renaming files, and

    moving them from one directory to another. The nature of the directory service

does not depend on whether individual files are transferred in their entirety or

    accessed remotely.

    The directory service defines some alphabet and syntax for composing file (and

directory) names. File names can typically be from 1 to some maximum number of letters, numbers, and certain special characters. Some systems divide file

    names into two parts, usually separated by a period, such as prog.c for a C

    program or man.txt for a text file. The second part of the name, called the file

    extension, identifies the file type.

    All distributed systems allow directories to contain subdirectories, to make it

    possible for users to group related files together.

    Accordingly, operations are provided for creating and deleting directories as well

    as entering, removing and looking up files in them. Normally, each subdirectory

    contains all the files for one project, such as a large program or document (e.g., a

book).

When the (sub)directory is listed, only the relevant files are shown; unrelated

    files are in other (sub) directories and do not clutter the listing. Subdirectories

    can contain their own subdirectories and so on, leading to a tree of directories,

    often called a hierarchical file system.

    Naming Transparency :

    The principal problem with this form of naming is that it is not fully transparent.

    Two forms of transparency are relevant in this context and are worth

distinguishing. The first one, location transparency, means that the path name gives no hint as to where the file (or other object) is located. The second one, location independence, means that a file can be moved without its name changing.

    Write about the Sun Network File System?

    Case Study: Sun Network file system


All implementations of NFS support the NFS protocol: a set of remote procedure calls that provide the means for clients to perform operations on a remote file store.

The NFS server module resides in the kernel (operating system) on each

    computer that acts as an NFS server.

When a request is made for a file by a client, the request is first handled by the NFS client module and then translated into NFS protocol operations sent to the NFS server module on the computer holding the actual file.

    Virtual file system :

    Draw diagram: FS 4

    The integration is achieved by a virtual file system (VFS) module, which has

    been added to the UNIX kernel to distinguish between local and remote files.

In addition, VFS keeps track of the file systems that are currently available both

    locally and remotely, and it passes each request to the appropriate local system

    module (the UNIX file system, the NFS client module or the service module for

    another file system).

The file identifiers used in NFS are called file handles. A file handle is opaque to clients and contains whatever information the server needs to distinguish an individual file.
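As a sketch of the kind of information a UNIX NFS implementation typically packs into a handle (the field names are ours; a real handle is an opaque byte string to the client):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class FileHandle:
        filesystem_id: int    # identifies the exported file system
        inode_number: int     # identifies the file within that file system
        generation: int       # bumped when an i-node is reused, so stale
                              # handles to a deleted file can be detected

    fh = FileHandle(filesystem_id=3, inode_number=491825, generation=7)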

    Client integration :

The NFS client module is responsible for providing an interface that can be used by

    conventional application programs.

The NFS client module cooperates with the Virtual File System in each client

    machine. It operates in a similar manner to the conventional UNIX file system,

    transferring blocks of files to and from the server and caching the blocks in the

    local memory.

It shares the same buffer cache that is used by the local input/output system. But since several clients in different host machines may simultaneously access the same remote file, this sharing introduces a cache consistency problem.


    Access control and authentication :

Whenever a user tries to access a file, the NFS server checks the user's identity against the file's access permissions, to see whether the user is permitted to access the file in the manner requested.

But unlike a conventional UNIX server, the NFS server has to check the user's identity each time a request is made, because it does not retain state about its clients between requests.

    NFS server interface :

The NFS server interface is a simplified representation of the RPC interface provided by an NFS version 3 server.

    Write method description from diagram FS 5

    The NFS file access operations read , write , getattr and setattr are almost

identical to the Read, Write, GetAttributes and SetAttributes operations defined

    for our flat file service.

    The lookup operation and most of the other directory operations defined in the

    above description are similar to those in our directory service model.
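A rough sketch of the file access part of that interface (the signatures are ours, modeled on the operations named above, not the actual NFSv3 wire protocol). Note that every call takes a file handle, since the server keeps no per-client state:

    class NfsServerStub:
        def lookup(self, dir_handle, name):
            """Return the file handle and attributes for `name` in a directory."""

        def read(self, file_handle, offset, count):
            """Return up to `count` bytes of file data starting at `offset`."""

        def write(self, file_handle, offset, data):
            """Write `data` at `offset`; returns the new file attributes."""

        def getattr(self, file_handle):
            """Return the attributes of the file."""

        def setattr(self, file_handle, attributes):
            """Set owner, permissions, size, and similar attributes."""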

Access transparency : The NFS client module provides an application programming interface to local processes that is identical to the local operating system's interface.

    Location transparency : Each client establishes a file name space by adding

mounted directories in remote file systems to its local name space.

Mobility transparency : Not fully achieved; when a remote file system is moved to another server, the mount tables in each client must be updated separately.

Scalability : The published performance figures show that NFS servers can be

    built to handle very large real-world loads in an efficient and cost-effective

    manner.

    File replication : Read-only file stores can be replicated on several NFS servers,

    but NFS does not support file replication with updates.

    Security : The need for security in NFS emerged with the connection of most


    intranets to the Internet.

    Write about Distributed Shared Memory?

    In 1986, Li proposed a different scheme, now known under the name

    distributed shared memory (DSM). A collection of workstations connected by

    a LAN share a single paged, virtual address space. A reference to local pages is

    done in hardware.

    An attempt to reference a page on a different machine causes a hardware page

    fault, which traps to the operating system. The operating system then sends a

message to the remote machine, which finds the needed page and sends it to the requesting processor. The faulting instruction is then restarted and can now complete.
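A minimal sketch of that mechanism (all names are ours): a fault on a remote page fetches the page over the network, installs it locally, and lets the faulting access restart.

    page_table = {}    # page number -> (owner_machine, local_copy or None)

    def on_page_fault(page_no, network):
        owner, local = page_table[page_no]
        if local is None:                       # page lives on another machine
            network.send(owner, ("fetch", page_no))   # stand-in for real messaging
            local = network.receive()           # the owner ships the page back
            page_table[page_no] = (owner, local)
        return local                            # faulting instruction can restart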

    On-chip Memory :

    Although most computers have an external memory, self-contained chips

    containing a CPU and all the memory also exist. Such chips are produced by the

    millions, and are widely used in cars, appliances, and even toys. In this design,

    the CPU portion of the chip has address and data lines that directly connect to

    the memory portion.

    One could imagine a simple extension of this chip to have multiple CPUs

    directly sharing the same memory. While it is possible to construct a chip like

    this, it would be complicated, expensive, and highly unusual.

    Draw Diagram SM - 1

    Bus-Based Multiprocessors :

    We see that the connection between the CPU and the memory is a collection of

    parallel wires, some holding the address the CPU wants to read or write, some

    for sending or receiving data, and the rest for controlling the transfer. Such a

    collection of wires is called a bus.

Draw Diagram SM - 2


    On a desktop computer, the bus is typically etched onto the main board

    (the parent-board), which holds the CPU and some of the memory, and into

    which I/O cards are plugged. On minicomputers the bus is sometimes a flat cable

that wends its way among the processors, memories, and I/O controllers.

    The disadvantage of having a single bus is that with as few as three or four CPUs

    the bus is likely to become overloaded. The usual approach taken to reduce the

    bus load is to equip each CPU with a snooping cache (sometimes called a

    snoopy cache), so called because it snoops on the bus.

    One particularly simple and common protocol is called write through. When a

    CPU first reads a word from memory, that word is fetched over the bus and is

stored in the cache of the CPU making the request. If that word is needed again later, the CPU can take it from the cache without making a memory request,

    thus reducing bus traffic.

    Each CPU does its caching independent of the others. Consequently, it is

possible for a particular word to be cached at two or more CPUs at the same time.

Our protocol manages cache blocks, each of which can be in one of the following three states:

o INVALID  This cache block does not contain valid data.

o CLEAN  Memory is up-to-date; the block may be in other caches.

o DIRTY  Memory is incorrect; no other cache holds the block.
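A small sketch (our own encoding, a simplification of the full protocol) of the transitions a snooping write-through cache makes on the events described above:

    INVALID, CLEAN, DIRTY = "INVALID", "CLEAN", "DIRTY"

    def on_local_read_miss(state):
        # the word is fetched over the bus and cached; the block becomes CLEAN
        return CLEAN

    def on_local_write(state, memory):
        memory.write_through()      # write through: memory is always updated
        return CLEAN if state != INVALID else INVALID   # no allocation on a miss

    def on_snooped_remote_write(state):
        # another CPU wrote this block on the bus: our cached copy is now stale
        return INVALID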

    Ring-Based Multiprocessors :

The next step along the path toward distributed shared memory systems is ring-based multiprocessors. In Memnet, a single address space is divided into a

    private part and a shared part.

    Draw Diagram SM - 3


The ring consists of 20 parallel wires, which together allow 16 data bits and 4 control bits to be sent.