Chapter 16-2 Distributed System Structures. 17.2 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Chapter 16 Distributed System Structures


Chapter 16.1

Background

Motivation

Types of Distributed Operating Systems

Network Structure

Chapter 16.2

Network Topology

Communication Structure

Communication Protocols

Robustness

Design Issues


Network Topology

When we speak of topology, we are speaking of physical connections.

The types I will present differ in

installation cost (the cost of linking up the sites),

communication cost (the time and money it takes to send a message from node A to node B), and

availability, essentially the ability to use the topology in the face of downed links or sites.

Some topologies have all nodes directly connected to every other node, some have only ‘some’ nodes directly connected and others indirectly connected, some topologies look like trees, stars, and rings.

Each has advantages and disadvantages.


Fully & Partially Connected Networks and Trees

Fully Connected: Here, every node is connected to every other node. Adv: no switching or broadcasting is needed. Dis: as the number of nodes increases, cost rises dramatically (the number of links grows with the square of the number of nodes). Good for small networks.

Partially Connected: Adv: Clearly, installation costs are lower since not all nodes are connected to every node, only some. Dis: for nodes that wish to communicate and are not directly connected, messages must be routed through intermediate sites and links, which, of course, raises the communication cost.

Trees: Adv: installation and communication costs are low. Dis: the very nature of a tree implies that there is only one path to a node. If this path ‘goes down’, the network is ‘partitioned.’ Partitioning refers to the situation where the network is broken into two (or more) subsystems that cannot communicate with each other.
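To make the installation-cost difference concrete, here is a minimal sketch that counts the links each layout needs. The function names are made up for illustration, and the sketch assumes one link per directly connected pair of sites.

```python
def fully_connected_links(n: int) -> int:
    """Every pair of sites has its own direct link: n(n-1)/2 links."""
    return n * (n - 1) // 2

def tree_links(n: int) -> int:
    """Every site except the root has exactly one link to its parent: n-1 links."""
    return n - 1

for n in (5, 10, 100):
    print(f"{n} sites: fully connected = {fully_connected_links(n)}, tree = {tree_links(n)}")
```

A partially connected network falls somewhere between these two counts, depending on how many direct links are installed.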


Rings and Star Network Topologies

Rings:

Adv: higher degree of reliability,

Dis: but communication costs are high because a message may need to travel through a number of links before it arrives at its destination.

Adv: Better availability than the tree – not likely to result in a partition…

Adv: At least two links must go down for a partition to occur.

Star:

Failure of any link results in a partition, but a partition may be only a single site.

Adv: low communication costs, because every node is at most two links away from the target node, but

Dis: the central site is critical. If it goes down, the entire network is down.
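The partition claims for rings and stars can be checked with a small connectivity test. This is only a sketch: the five site names and the helper `without` are illustrative assumptions, and a network counts as partitioned when a breadth-first search from one site cannot reach all the others.

```python
from collections import deque

def is_partitioned(sites, links):
    """Return True if some site cannot reach some other site over the given links."""
    neighbors = {s: set() for s in sites}
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)
    start = next(iter(sites))
    seen = {start}
    queue = deque([start])
    while queue:
        for nxt in neighbors[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) < len(sites)

def without(links, *failed):
    """Return the links that are still up after the listed links fail."""
    return [link for link in links if link not in failed]

sites = ["A", "B", "C", "D", "E"]
ring = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A")]
star = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E")]  # A is the central site

print(is_partitioned(sites, without(ring, ("A", "B"))))              # one ring link down
print(is_partitioned(sites, without(ring, ("A", "B"), ("C", "D"))))  # two ring links down
print(is_partitioned(sites, without(star, ("A", "C"))))              # one star link down
```

The ring survives one failed link but not two, while a single failed link in the star cuts off one site.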


Communication Structure

Need to look away from some of the physical aspects of networking to the internal workings of communications.

While this might appear to have become a course in communications, understanding of these topics is absolutely essential to understanding how distributed operating systems work.

So, we will look at common issues that a communications network must address:

Naming and name resolution

Routing Strategies

Packet Strategies

Connection Strategies, and

Contention


Naming and Name Resolution - DNS

How do two processes locate each other in order to communicate?

Processes need to be able to reference each other by a name.

So, within a computer system, each process has a process identifier.

Processes on remote systems are identified by a <host name, identifier> pair.

‘host name’ is unique within a network – usually alphanumeric; ‘identifier’ may be a process id or other unique number at host site.

But computers like numbers, so we try to bind names to a host-id that describes the destination system to the networking hardware.

Nowadays we distribute names among systems on the network, and the network must use a protocol to distribute and receive the information.

We call this the domain-name system (DNS).


DNS Name Resolution

DNS specifies the naming structure of hosts as well as name-to-address resolution.

Hosts on the Internet are logically addressed with a multi-part name whose components are separated by periods.

The components run from more specific to more general. There are several popular top-level domains: .com, .org, .mil, .gov, … and country domains.

Each component has a name server, which is simply a process on a system. A name server accepts a name and returns the address of the name server responsible for that name.

To resolve broggio.cs.csuchico.edu:

The location of the name server for the .edu domain is known; it is sent a request for the address of the name server for csuchico.edu.

The .edu name server returns the address of the host on which the csuchico.edu name server resides.

This name server is sent a request for the address of the name server for cs.csuchico.edu, and that address is returned.

Finally, a request to this name server for broggio.cs.csuchico.edu returns an Internet address (host-id) for that host, such as 137.62.37.20.

In practice, local caches make this process quick. For example, the .edu name server might have csuchico.edu in its cache; it would inform the requesting process that it could resolve two parts of the name and return a pointer to the cs.csuchico.edu name server.
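The walk from the most general component to the most specific one can be sketched as follows. The `NAME_SERVERS` table and the server labels in it are hypothetical stand-ins for real name servers, and the cache here lives at the requesting side only, which is a simplification.

```python
# Hypothetical tables: each name server maps a name to the server (or address) responsible for it.
NAME_SERVERS = {
    "root": {"edu": "edu-server"},
    "edu-server": {"csuchico.edu": "csuchico-server"},
    "csuchico-server": {"cs.csuchico.edu": "cs-server"},
    "cs-server": {"broggio.cs.csuchico.edu": "137.62.37.20"},
}

def resolve(host, cache=None):
    """Resolve host one component at a time; return (address, number of queries sent)."""
    cache = {} if cache is None else cache
    if host in cache:
        return cache[host], 0          # answered locally, no name server contacted
    parts = host.split(".")
    server, queries = "root", 0
    # Ask for "edu", then "csuchico.edu", then "cs.csuchico.edu", and so on.
    for i in range(len(parts) - 1, -1, -1):
        suffix = ".".join(parts[i:])
        server = NAME_SERVERS[server][suffix]
        queries += 1
    cache[host] = server
    return server, queries

cache = {}
print(resolve("broggio.cs.csuchico.edu", cache))  # full walk through the name servers
print(resolve("broggio.cs.csuchico.edu", cache))  # second lookup is served from the cache
```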


Routing Strategies

How are messages sent? If there’s only one path, then the path is clear. But this is often not the case, and we have many options.

Normally, each site has a routing table, which points to sites ‘along the way’ that may be used in transmitting a message.

These tables are often updated, as sites go down and are changed.

Fixed routing: a specific path is fixed ahead of time;

generally a shortest path is preferred.

Cannot change the path despite potentially better ones.

If path goes down, communication is lost.

Virtual routing: a specific path is established for a session.

Later sessions would likely have a different path.

Since the path is determined and maintained per session, different routes may be selected for different sessions.

This is a more reliable routing mechanism than fixed routing.

Dynamic routing: Path determined only when a message is sent.

Message may go from site to site to site usually advancing to site with least traffic.

Path may not be direct, but it’ll get there!

Unix uses fixed routing for simple networks; dynamic for the rest.
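The difference between choosing a path ahead of time and choosing it at send time can be sketched like this. The network and the function names are illustrative assumptions, and the sketch picks paths by hop count rather than by traffic, which simplifies the dynamic case described above.

```python
from collections import deque

def shortest_path(links, src, dst):
    """Breadth-first search for a path with the fewest hops; None if unreachable."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    parents = {src: None}
    queue = deque([src])
    while queue:
        site = queue.popleft()
        if site == dst:
            path = []
            while site is not None:
                path.append(site)
                site = parents[site]
            return path[::-1]
        for nxt in neighbors.get(site, []):
            if nxt not in parents:
                parents[nxt] = site
                queue.append(nxt)
    return None

def path_is_up(path, links):
    """True if every hop of the path is still an available link."""
    up = {frozenset(link) for link in links}
    return all(frozenset(hop) in up for hop in zip(path, path[1:]))

links = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "E"), ("E", "D")]
fixed_route = shortest_path(links, "A", "D")            # fixed routing: chosen once
links_now = [link for link in links if link != ("B", "D")]  # link B-D goes down later
dynamic_route = shortest_path(links_now, "A", "D")      # dynamic routing: chosen at send time

print(fixed_route, path_is_up(fixed_route, links_now))
print(dynamic_route, path_is_up(dynamic_route, links_now))
```

Once link B-D fails, the fixed route is unusable, while the route computed at send time goes around the failure.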


Routing Strategies – Comparisons

Gateways: Sometimes we only need to know how to route to a gateway.

Gateways usually connect a local network to other networks and the Internet.

So, here we might use a fixed route to the gateway server realizing that the gateway will use dynamic routing from there on out.

Router: a host computer with routing software or a special purpose device that has at least two network connections. (A simple PC can serve as a router)

Router has special software that enables it to access routing tables to decide whether a received message on one network needs to be passed to any other network connected to the router.

Router checks its tables to determine location of the destination host or at least of the network to which it will send the message toward the destination host.

(Gateways and routers are normally dedicated hardware devices that run code out of firmware.)
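A router's table lookup, including a fixed default route to the gateway, might look roughly like this. The networks and next-hop labels in `ROUTES` are invented for illustration; the lookup uses the standard-library `ipaddress` module and picks the most specific matching entry.

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop). 0.0.0.0/0 is the default route.
ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), "deliver directly on the local network"),
    (ipaddress.ip_network("10.0.0.0/8"), "router 192.168.1.2"),
    (ipaddress.ip_network("0.0.0.0/0"), "gateway 192.168.1.1"),
]

def next_hop(destination: str) -> str:
    """Pick the most specific (longest-prefix) route that contains the destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen)[1]

for dest in ("192.168.1.7", "10.2.3.4", "137.62.37.20"):
    print(dest, "->", next_hop(dest))
```

Anything the table does not know about specifically falls through to the gateway, which is then free to route dynamically.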

So, without getting into too much detail, what is sent?


Packet Strategies

We know messages may clearly be of variable size. But this can make things difficult.

In practice, messages are usually broken into fixed-length units called packets, frames, or datagrams, and these are transmitted.

What is sent, however, is determined by the connection strategy.
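Breaking a variable-size message into fixed-length packets can be sketched as below. The packet layout (a sequence number, the real length, and a zero-padded payload) is an assumption for illustration, not a real wire format.

```python
def to_packets(message: bytes, payload_size: int):
    """Split a message into numbered, fixed-length packets (the last one is padded)."""
    packets = []
    for seq, start in enumerate(range(0, len(message), payload_size)):
        chunk = message[start:start + payload_size]
        packets.append({
            "seq": seq,
            "length": len(chunk),                          # real bytes before padding
            "payload": chunk.ljust(payload_size, b"\0"),   # pad to the fixed size
        })
    return packets

packets = to_packets(b"variable-size message", 8)
for p in packets:
    print(p)
```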

Let’s turn our attention to Connection Strategies…


Connection Strategies

Processes that need to communicate do so via communication sessions, using one of a number of mechanisms, the most popular of which are:

Circuit Switching

Message Switching, and

Packet Switching

These are quite different and each may be appropriate under specific circumstances.


Circuit, Message, Packet Switching

Circuit switching uses a fixed physical link, as in the telephone system.

The connection exists for the length of the session, and no other process can use this particular connection.

Circuit switching requires a lot of setup and may waste bandwidth during idle periods, but it incurs less overhead per message sent.

Message switching uses a temporary communications pathway that exists only for the time it takes to transfer the message.

Physical links are dynamically established for the short transfer.

The message itself consists of a number of parts, including source and destination addresses, error-correction codes, start- and end-of-text indicators, and a number of other items used for management and control of the transmission.

Message switching requires little setup time but incurs more overhead per message in establishing a communications pathway.

Packet switching transmits the packets of a message, possibly over different routes.

The paths are dynamically determined (as previously discussed), and the packets are reassembled at the destination.

There is a lot of overhead here due to breaking up messages, adding management and control items to each packet (necessary to ensure packets are delivered and reassembled properly), and, of course, the cost of reassembly.

Packet Switching makes the best use of available bandwidth and is the most commonly used (save the telephone) switching strategy for data transmission.
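Reassembly at the destination can be sketched as follows. The control fields `seq` and `total` are assumed here for illustration, and a seeded shuffle stands in for packets arriving out of order over different routes.

```python
import random

def reassemble(packets):
    """Rebuild the message from packets that may have arrived in any order."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    expected = ordered[0]["total"]
    if [p["seq"] for p in ordered] != list(range(expected)):
        raise ValueError("missing or duplicate packet")
    return b"".join(p["payload"] for p in ordered)

message = b"packets may take different routes"
size = 6
chunks = [message[i:i + size] for i in range(0, len(message), size)]
packets = [{"seq": n, "total": len(chunks), "payload": c} for n, c in enumerate(chunks)]
random.Random(0).shuffle(packets)   # simulate out-of-order arrival

print(reassemble(packets))
```

The sequence numbers are exactly the kind of management and control item that adds per-packet overhead.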


Contention

Realities of transmission imply that there is the real possibility that more than one source may transmit on common links at the same time.

This occurs often in a ring or multi-access bus topology.

Clearly, the transmissions can be garbled and sites must retransmit.

This problem must be addressed to avoid significantly degraded service.

Two techniques are in wide use:

CSMA/CD – Carrier Sense Multiple Access with Collision Detection, and

Token Passing

Both are used in different topologies and with good success.


Contention using CSMA/CD

In this approach, a node wishing to transmit ‘listens’ on the line for traffic.

If link is free, node (site) will start transmitting; otherwise it waits and listens before trying again.

But if two sites – detecting no network traffic – start to transmit at the same time, we have a collision.

If so, both sites must stop transmitting. Wait a while, then try again.

But if the system is loaded, performance may be seriously degraded!

This has been a successful contention strategy for quite some time.

Big advantage: we can add more hosts (nodes) to a network – as long as collisions don’t become too frequent.

All in all, this approach has become a standard one, and it is in widespread use.
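The listen, transmit, collide, and retry cycle can be simulated in discrete time slots. This is a rough sketch under stated assumptions: every frame takes one slot, and the "wait a while" step uses binary exponential backoff as Ethernet does, which the slides themselves do not specify.

```python
import random

def simulate_csma_cd(stations, seed=0, max_slots=10_000):
    """Slotted simulation: return the order in which stations transmit successfully."""
    rng = random.Random(seed)
    wait = {s: 0 for s in stations}        # slots to wait before the next attempt
    collisions = {s: 0 for s in stations}  # collisions suffered so far
    done = []
    for _ in range(max_slots):
        if not wait:
            break
        ready = [s for s in wait if wait[s] == 0]
        if len(ready) == 1:
            # Only one sender on the link: the transmission succeeds.
            done.append(ready[0])
            del wait[ready[0]]
        elif len(ready) > 1:
            # Collision: every sender stops and backs off for a random number of slots.
            for s in ready:
                collisions[s] += 1
                wait[s] = rng.randrange(2 ** min(collisions[s], 10))
        # Stations that were already waiting count down by one slot.
        for s in wait:
            if s not in ready and wait[s] > 0:
                wait[s] -= 1
    return done

print(simulate_csma_cd(["A", "B", "C"]))
```

With more stations contending, more slots are lost to collisions, which is the degradation under load noted above.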


Contention using Token Passing

This approach is used a lot in a ring topology, where a token (small packet of information) circulates within the ring from node to node to node...

A site that wishes to transmit must wait for the token, remove it from the ring, transmit the data, and then retransmit the token to continue its journey around the ring.

Token Lost. If somehow a token gets lost (for example, a site goes down while it holds the token), the network must first detect the loss and then generate a new token.

Election. It does this by declaring an election that results in one site generating and propagating a new token.

Token passing compared with Ethernet:

Ethernet (multi-access bus architectures) can experience serious performance degradation if too many nodes are busy and transmitting.

In the ring approach, adding new sites may increase wait time, but it will normally not result in any serious performance losses.

For networks with limited or modest transmission requirements, Ethernet is more efficient, because sites can transmit at any time.
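Token circulation can be sketched with a rotating queue. The site names and message labels are illustrative, and the sketch assumes a site sends at most one message each time it holds the token.

```python
from collections import deque

def token_ring(sites, pending, max_hops=100):
    """Pass a token around the ring; a site transmits only while it holds the token."""
    ring = deque(sites)
    log = []
    for _ in range(max_hops):
        if not any(pending.values()):
            break
        holder = ring[0]                                   # site currently holding the token
        if pending.get(holder):
            log.append((holder, pending[holder].pop(0)))   # transmit one message
        ring.rotate(-1)                                    # pass the token to the next site
    return log

pending = {"A": ["a1", "a2"], "B": [], "C": ["c1"]}
log = token_ring(["A", "B", "C"], pending)
print(log)
```

Because only the token holder transmits, there are no collisions; the price is that A must wait a full trip around the ring before sending its second message. Note that the function consumes the lists in `pending`.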


Ethernet

Ethernet is a family of frame-based computer networking technologies for local area networks (LANs).

It defines a number of wiring and signaling standards for the Physical Layer of the OSI networking model, through means of network access at the Media Access Control (MAC)/Data Link Layer, and a common addressing format.

Ethernet is standardized as IEEE 802.3.

The combination of the twisted pair versions of Ethernet for connecting end systems to the network, along with the fiber optic versions for site backbones, is the most widespread wired LAN technology.

It has been in use from around 1980 to the present, largely replacing competing LAN standards such as token ring, FDDI, and ARCNET.


Communication Protocols

There are a great many activities involved in a communications network, where so much communication is asynchronous.

Systems on a network agree on a set of protocols such that

each layer of the protocols or set of protocols has specific responsibilities in overall communication and

each layer of the protocol at the sending site communicates with its corresponding layer at the receiving site.

Again, each layer has things it expects to ‘get’ and things it will provide.

i.e., it looks for all kinds of communication parameters appropriate at that level of the protocol stack, so to speak.

Implementation:

Some of the layers are implemented in hardware (lower three levels)

Mid and upper layers are implemented in software.

Let’s look at the ISO network model – it is formal and provides a great framework for discussion – although the TCP/IP protocol has now largely replaced the ISO model…


Layers in Hardware – ISO Model

Three lowest levels of the protocol accommodated in Hardware

Physical Layer – here we are concerned with agreement on electrical representations of bit stream signals consisting of 1s and 0s, and how the sites are able to interpret these.

All 1s and 0s are not ‘created equal.’ We have different code sets, not just the most familiar ones!

Data Link Layer – data link control is responsible for

handling packets and also for

detecting and recovering from errors that occurred in the physical layer.

Network Layer – This layer is responsible for

routing packets within the communications network, decoding addresses of incoming packets, and

maintaining routing tables, etc.

Routers play the key role here.


Layers in Software

Transport Layer – This layer is responsible for the

transfer of messages between clients,

partitioning the messages into packets,

maintaining packet order, controlling flow, and

generating physical addresses.

Session Layer –

Responsible for implementing sessions or

process to process communication protocols, such as communications via remote log ins and for file and mail.

Presentation Layer – Here we are looking for responsibilities for

resolving format differences between various sites in the network, such as necessary for character conversions, full and half duplex lines, etc.

Application Layer – Responsible for

interacting directly with the user to accommodate file transfer, remote log-in protocols, email, etc.


ISO Protocol Stack

All of these layers constitute what we refer to as the ISO Protocol Stack.

They represent a set of cooperating protocols such that each layer in the protocol stack communicates with its peer at the other end.

Each layer may modify the message – adding attributes, computing items, adding headers or trailers or other ‘indicators’, etc.

At the receiving end, the data reaches the data link layer and moves up through the protocol stack, where it is acted upon by each respective layer and ultimately presented to the user at the application level.
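The way each layer wraps the message on the way down and its peer unwraps it on the way up can be sketched as follows. The bracketed text headers are purely illustrative, and the physical layer is left out because it only moves bits.

```python
LAYERS = ["application", "presentation", "session", "transport", "network", "data link"]

def send(data: str) -> str:
    """Going down the stack, each layer wraps what it receives in its own header."""
    for layer in LAYERS:
        data = f"[{layer}]{data}"
    return data

def receive(frame: str) -> str:
    """Going up the stack, each layer strips the header added by its peer."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        if not frame.startswith(header):
            raise ValueError(f"{layer} layer: unexpected header")
        frame = frame[len(header):]
    return frame

frame = send("hello")
print(frame)
print(receive(frame))
```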

TCP/IP Protocol Stack – the most widely adopted set of protocols.

TCP, as its name (Transmission Control Protocol) implies, is the transport protocol. Almost all Internet sites now use this.

It has fewer layers because it combines some activities of the ISO layers.

It is more difficult to implement but more efficient than the ISO model.

The model identifies a number of widely used application-level protocols, including HTTP, FTP, Telnet, DNS, SMTP, and more.

The transport layer identifies both the unreliable, connectionless User Datagram Protocol (UDP) and the reliable, connection-oriented Transmission Control Protocol (TCP). See figure 16.9, p. 632 in your textbook. This corresponds to the Transport layer in the ISO model.

IP (the Internet Protocol) is responsible for routing IP datagrams through the Internet. This corresponds to the Network layer in the ISO model.

In the TCP/IP protocol model, we do not formally identify a link or physical layer as we have in the ISO model; this allows TCP/IP traffic to run across any physical network.


Robustness

We can have all kinds of hardware failure.

Such networks must be able to

Detect a failure

Reconfigure the network so that operation can proceed, and

Recover from the failure


Failures – Detection, Reconfiguration, Recovery.

Detecting the Failure

Detecting that something has failed is the easy part; determining the cause is oftentimes difficult.

With no shared memory, we are not able to determine easily whether the failure is due to a link, a site, or a message loss. We simply observe a failure, and it is difficult to ascertain the exact cause.

We typically use handshaking to detect link and site failures.

Sites send ‘I am up’ messages periodically. If no message is received, it may be that:

– the link has failed, or

– the site is down.

The host can resend and wait, but it can only declare that a failure has occurred.

Now, the site might try sending a message over a different route, if available.

If after ‘some time’ the message is delivered and a response is received, then it is likely that the direct link is down.

If the message still does not reach the target node over the different route, the sending site can only assert that one of the following has occurred:

– Receiving site is down

– Direct link (if available) from sender to receiver is down

– Alternate path from sender to receiver is down, or

– Message has been lost.

It is very difficult for the sending site to clearly determine the cause of the failure.
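The ‘I am up’ handshake can be sketched as a timeout-based detector. The class name, the timeout value, and the explicit `now` timestamps are assumptions made so the example is deterministic; a real system would read a clock.

```python
class FailureDetector:
    """Suspect a site when no 'I am up' message has arrived within the timeout."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.last_heard = {}   # site -> time of the most recent 'I am up' message

    def heartbeat(self, site: str, now: float) -> None:
        self.last_heard[site] = now

    def suspected(self, now: float):
        # The detector cannot tell a failed site from a failed link or a lost message.
        return sorted(s for s, t in self.last_heard.items() if now - t > self.timeout)

detector = FailureDetector(timeout=3.0)
detector.heartbeat("B", now=0.0)
detector.heartbeat("C", now=0.0)
detector.heartbeat("B", now=2.5)
print(detector.suspected(now=4.0))
```

The detector only reports that C has gone quiet; it has no way of saying whether C, the link to C, or a message was lost.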


Failures – Detection, Reconfiguration, Recovery.

Reconfiguring: site down or link down…

Procedures must be invoked to inform the network to reconfigure and take a node or a link out of service so that normal operations may ensue.

The Data Link Layer is responsible for detection in most cases.

– Recall: it is normally responsible for handling packets and error recovery.

Direct Link Down – actions required:

If this is the case, the fact that this link is down must be broadcast.

Routing tables will need to be updated at the Network Layer.

It takes time for nodes on the net to modify their routing tables so that, when packets are received from the Transport Layer, the Network Layer can route them correctly.

Site Down:

Here again, all sites in the system must be informed so that they will not try to use the services available at the downed site.

If the downed site had special functions on the net, such as central coordination (see the book) for, say, deadlock detection, then another site must be elected to become the new coordinator of this activity.

If the failed node is in a token ring topology, then a new logical ring must be built with the failed node taken out of it.
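The two site-down actions, rebuilding the ring and electing a new coordinator, can be sketched together. The election rule used here (the surviving site with the highest identifier wins) is an assumption for illustration; the slides do not say how the election is decided.

```python
def reconfigure(ring, coordinator, failed):
    """Drop the failed site from the logical ring and replace the coordinator if needed."""
    new_ring = [site for site in ring if site != failed]
    if coordinator == failed:
        # Assumed election rule: the surviving site with the highest identifier wins.
        coordinator = max(new_ring)
    return new_ring, coordinator

print(reconfigure([1, 2, 3, 4], coordinator=4, failed=4))  # coordinator itself failed
print(reconfigure([1, 2, 3, 4], coordinator=4, failed=2))  # ordinary site failed
```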


Failures – Detection, Reconfiguration, Recovery.

Recovery from Failure – link and site

After repair, the link or site must be integrated into the network seamlessly, if possible.

If the link was down but is now repaired, and the link was only between two sites, repeated handshaking can accommodate notification. This is the rather simple case.

If a site has failed and has now recovered:

All other sites must receive this information to update their routing tables,

they must now ‘know’ that the facilities at ‘that site’ are available again, and

they can perhaps ‘press on’ with undelivered messages, and more.


Design Issues

There are several very key issues here.

I will divide these into three key issues:

Transparency

Fault Tolerance

Scalability


Design Issues – Transparency

In a distributed system, a site or link must be totally transparent to a user.

Making this so has challenged the brightest designers

A distributed system should appear to be a centralized system. Users should not be able to distinguish between local and remote services.

Another element of transparency is user mobility. Here, conceivably, a user could log onto any system and obtain his/her environment wherever logged on.

These would be nice – but very difficult to obtain.


Design Issues – Fault Tolerance

Here, we must be mindful that sites can go down, links can go down, but the network must be maintained.

The question is simply how many faults, and what kinds of faults, can the network tolerate and still provide services?

Continued performance should be proportional, of course, to the magnitude of the faults. If few faults grind the system to a halt, this system is not very fault tolerant.

Most commercial systems have only limited fault tolerance. Many scientific systems have much more tolerance, such as redundant units (with voters) in space probes. Explain…
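The idea of redundant units with a voter can be sketched as a majority vote over the units' outputs. The function name and the sample values are illustrative; with three units, one faulty output is outvoted by the other two.

```python
from collections import Counter

def vote(outputs):
    """Return the value produced by a strict majority of redundant units."""
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise RuntimeError("no majority: too many units disagree")
    return value

print(vote([42, 42, 17]))   # the faulty unit that answered 17 is outvoted
```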

Fault tolerance can address processor issues, storage issues, link issues, and so very many factors.

Unfortunately, fault tolerant computing is difficult to accomplish.


Design Issues – Scalability

A biggie!

Network resources and their concomitant communications can become saturated with very heavy workloads.

Systems have bounded resources.

They are finite state machines with finite resources.

A scalable system should react more gracefully to increased load than a non-scalable one; its performance should degrade more moderately.

Faced with the same difficulties, the resources of a scalable system should be affected less than those of a non-scalable system.

But there are bounds. Adding more and more sites to a network can bog down the best of systems.


Design Issues – Scalability - more

Scalability is not simply a matter of shifting a load from one component to another that might be serving as backup.

In truth, a distributed system must have spares for ensuring reliability and for handling peak loads gracefully.

Distributed systems should have the potential for both fault tolerance and scalability. But a poor design can kill this potential.

Very large distributed systems are, for the most part, theoretical. We look at and talk about scalability as a forerunner for large-scale distributed systems.


Design Issues – Scalability – still more

Principle of Central Control: central control schemes and central resources should not be used to build scalable systems.

Examples of centralized control include centralized authentication servers, central file services, etc. Each of these brings us to centralization – and we don’t want this.

We want a functionally symmetric configuration where

all components have an equal role in operation of the system and

each machine has some degree of autonomy.

But full symmetry and autonomy are darn near impossible to obtain in practice.

There are many examples where we simply cannot have this symmetric configuration.


Design Issues – Scalability – Clustering

One approach to symmetry and autonomy is clustering. Here the system is partitioned into collections of semi-autonomous clusters.

A cluster consists of a set of machines and a dedicated cluster server.

We want cross-cluster references to be very infrequent, and so each cluster should satisfy all requests of units in that cluster most of the time.

Clearly, this requires careful design and selection of machines with appropriate resources to meet these needs.

If a cluster can do this, it may be used as a modular building block to scale up the system.

This is easier said than done, for servers must operate efficiently during peak load and provide for all clients simultaneously.

Thus, a single-process server is not a good choice, since a disk request would block the entire service.

Assigning a process for each client is better, but frequent context switches can be a negative factor.

Then too, all server processes often need to share information.
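One common middle ground between a single-process server and a process per client is a pool of threads inside one server process. This is an assumption beyond what the slides state, offered only as a sketch: the `handle` function and its sleep stand in for a request that blocks on disk.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle(request: str) -> str:
    time.sleep(0.05)            # stands in for a blocking disk request
    return request.upper()

requests = ["read a", "read b", "read c", "read d"]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    replies = list(pool.map(handle, requests))   # the blocking waits overlap
elapsed = time.perf_counter() - start

print(replies)
print(f"{elapsed:.2f} s for {len(requests)} requests")
```

Threads in one process can share server state directly, and switching between them is typically cheaper than switching between full processes.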


End Chapter 16.2
