Introduction to Computer Networks

Phillip Musumeci

April 14, 2002

http://mirriwinni.cs.jcu.edu.au/~phillip

JCU School of InfTech

Contents

Part I

1 Introduction
    1.1 Background
    1.2 Network Aims
    1.3 Network Use
    1.4 Network Hardware
    1.5 Alternative classification criterion
    1.6 LANs
    1.7 MANs
    1.8 WANs
    1.9 Internetworks
    1.10 Network organisation
    1.11 Example: Message Transfer
    1.12 Network Design Issues
    1.13 Interfaces and Services
    1.14 Types of service
    1.15 Service Primitives

2 Reference Models — OSI
    2.1 Introduction
    2.2 Physical Layer
    2.3 Data Link Layer
    2.4 Network Layer
    2.5 Transport Layer
    2.6 Session Layer
    2.7 Presentation Layer
    2.8 Application Layer
    2.9 Data Transmission in the OSI Model

3 TCP/IP Reference Model
    3.1 Introduction
    3.2 The Internet Layer
    3.3 The Transport Layer
    3.4 The Application Layer
    3.5 Host–to–Network Layer
    3.6 OSI versus TCP Reference Models
    3.7 Example Networks
        3.7.1 Internet Services
    3.8 Data Communications Services

4 Physical, Data Link, and Network Layers
    4.1 Physical Layer Issues
    4.2 Data Link Layer Issues
        4.2.1 Framing
        4.2.2 Error Control Overview

© Phillip Musumeci 2002


        4.2.3 Flow Control Overview
    4.3 Network Layer in Internet
        4.3.1 IP Header
    4.4 IP Addresses
    4.5 Subnets
    4.6 IP Router Operation
    4.7 Internet Control Protocols
        4.7.1 Internet Control Message Protocol (ICMP)
        4.7.2 Address Resolution Protocol (ARP)
        4.7.3 Reverse Address Resolution Protocol
        4.7.4 Interior Gateway Routing Protocol
        4.7.5 Exterior Gateway Routing Protocol
    4.8 Internet Multicasting
    4.9 Classless InterDomain Routing
    4.10 IPv6

5 Transport and Session Layers
    5.1 Transport Protocols
        5.1.1 Introduction
        5.1.2 Types of Service
        5.1.3 Qualities of Service
        5.1.4 Transport Service Primitives
        5.1.5 Berkeley Sockets
        5.1.6 Addressing
        5.1.7 Multiplexing
        5.1.8 Example Transport Protocol – TCP
    5.2 TCP/IP demonstration client
        5.2.1 Introduction
        5.2.2 Privilege and Complexity
        5.2.3 Standard versus nonstandard clients
        5.2.4 Connectionless v connection–oriented SVRs
        5.2.5 Program Interface to Protocols
        5.2.6 Interface Functionality
        5.2.7 System Calls
        5.2.8 BSD Tutorial

Part II

6 TCP/IP Protocols
    6.1 Introduction
    6.2 Review of TCP/IP Layering
    6.3 User Datagram Protocol
    6.4 UDP Multiplexing
    6.5 UDP Port Number Allocation
    6.6 Reliable Stream Transport Service (TCP)
    6.7 Providing Reliability
    6.8 What does TCP provide?



    6.9 TCP Connections
    6.10 Segments and Streams
    6.11 Variable Window Size and Flow Control
    6.12 TCP Segment Format
        6.12.1 Out of Band Data
        6.12.2 Maximum Segment Size
        6.12.3 TCP Checksum Computation
    6.13 Acknowledgements and Retransmission
    6.14 TCP Timeouts and Retransmission
    6.15 TCP Links with High Variance in Delay
    6.16 Response to Congestion
    6.17 Open and Close of TCP Connections
    6.18 Reset of TCP Connections
    6.19 TCP Protocol FSM
    6.20 Forced Data Delivery
    6.21 Reserved TCP Port Numbers
    6.22 TCP Summary
    6.23 Further Information

7 Introduction to Socket Programming
    7.1 Background
    7.2 Creating a socket
    7.3 Closing a socket
    7.4 Binding
    7.5 Server: Listen and Accept
    7.6 Client: Connect
    7.7 Sending and Receiving Data
    7.8 Flexible use of read() and write()
    7.9 Servers for Multiple Services
    7.10 Network Byte Order
    7.11 Some Other Related Functions
    7.12 BSD internet super-server inetd
    7.13 Additional References

8 IP Router Operation
    8.1 Datagram Delivery
    8.2 Route Table Completeness
    8.3 Route Optimisation Algorithms
    8.4 Interior Gateway Routing Protocol
        8.4.1 Routing Information Protocol (RIP)
        8.4.2 Open Shortest Path First
    8.5 Exterior Gateway Routing Protocol

9 Internet Control Protocols
    9.1 Internet Control Message Protocol (ICMP)
    9.2 Address Resolution Protocol (ARP)
    9.3 Reverse Address Resolution Protocol
    9.4 Domain Name System



10 Application Layer
    10.1 Introduction
    10.2 Email
    10.3 Network News
    10.4 Other Applications

A PPP — Point–to–Point Protocol

Notes:

1. These lecture notes use diagrams from “Computer Networks” by Andrew Tanenbaum. They are the result of my teaching at RMIT and JCU.

2. Texts:

• “Computer Networks”, 3rd edition, Andrew Tanenbaum, Prentice-Hall, 1996. ISBN 0-13-394248-1. See also URL http://www.cs.vu.nl/~ast .

• “Internetworking With TCP/IP Vol. 1”, D.E. Comer, 2nd edition, Prentice-Hall, 1991. ISBN 0-13-468505-9. (reference)

• “Computer Networks and Internets”, D.E. Comer, Prentice-Hall, 1997. ISBN 0-13-599010-6. (reference)

• “Advanced Programming in the UNIX Environment”, W. Richard Stevens, Addison-Wesley, 1992. ISBN 0-201-56317-7. (coding)



1 Introduction

We consider the following topics:

• Purpose of networks;

• Network structure and organisation;

• Reference models and service types;

• Internet protocols IP, TCP.

Text:

Andrew Tanenbaum, “Computer Networks”, 3rd edition, Prentice-Hall, 1996. ISBN 0-13-394248-1. See URL http://www.cs.vu.nl/~ast

1.1 Background

• Recent technological developments: 18th century for mechanical machines; 19th century for steam power; 20th century for electric power and information technology.

• Differences between collecting, transporting, storing, and processing information are rapidly disappearing.

• The switch from analogue signal communications systems to digital communications systems means (digital) computer data, documents, speech, images, image sequences, etc. will eventually be indistinguishable.

• The merging of computers and communications has affected the organisation of computer systems — a single large computer has evolved into interconnected smaller [1] computers usually called a “computer network”.

• The way of doing business is changing, in terms of collaboration and in terms of commerce itself.

[1] At least in physical terms.



• The distribution of resources in a computer network is still evolving, driven by maintenance costs per desktop and the availability of higher bandwidth at reducing cost.

1.2 Network Aims

• Resource sharing — printing, disk storage, mail, etc.

• Robustness — a network allows users to switch between servers to obtain higher reliability [2].

• Economy — distributed systems can have centralised management, with (expensive) shared software resources cleanly connected to user-interface (UI) systems in a client/server arrangement.

[Figure: a client process on the client machine sends a request across the network to a server process on the server machine, which sends back a reply.]

Fig. 1-1. The client-server model.
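The request/reply exchange of Fig. 1-1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: both "machines" are threads in one program joined by a local socket pair, and the names serve_one and request_reply, as well as the upper-casing "service", are hypothetical choices made for the example.

```python
import socket
import threading

def serve_one(conn):
    """Server process: read one request and send back one reply."""
    with conn:
        request = conn.recv(1024)
        # Hypothetical service: echo the request in upper case.
        conn.sendall(b"REPLY:" + request.upper())

def request_reply(payload):
    """Client process: send a request and wait for the reply."""
    # A connected socket pair stands in for the network between the machines.
    client_end, server_end = socket.socketpair()
    server = threading.Thread(target=serve_one, args=(server_end,))
    server.start()
    with client_end:
        client_end.sendall(payload)
        reply = client_end.recv(1024)
    server.join()
    return reply
```

Calling request_reply(b"hello") sends the request and blocks until the server's reply comes back, which is the pattern the figure describes.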

1.3 Network Use

• Communication:

– remote information access e.g. world wide web, banking and airline systems;

– person–to–person communication e.g. Internet telephony (“i–phone”), email;

– interactive entertainment (emerging).

[2] Assuming the network is more reliable than the source of the resource!



1.4 Network Hardware

Two network types:

• Broadcast networks

– contain a single communications channel shared by all machines,

– messages are broken into small “packets” and sent by one machine to all other machines (which re–assemble the message),

– an address field specifies the recipient(s).

Communication between pairs of machines is common, but it is possible to have 1 bit in the address field indicate a “group” address for transmission to multiple recipients (important for digital HDTV and related pay–on–demand media distribution).

• Point–to–point networks

– consist of many connections between individual pairs of machines,

– a link between any two machines will often have to pass through intermediate machines so “routing” of packets is an important issue.

General rule: geographically localised networks tend to use broadcast structures while geographically spread networks tend to use point–to–point structures.
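The address filtering that each machine performs on a broadcast network, including the single "group" bit mentioned above, can be sketched as follows. This is a minimal illustration only: the 8-bit address width, the use of the top bit as the group bit, and the names GROUP_BIT, accepts, and broadcast are assumptions made for the example, not taken from any particular network standard.

```python
GROUP_BIT = 0x80  # assumed: top bit of an 8-bit address marks a group address

def accepts(station_address, station_groups, destination):
    """Return True if a station should keep a packet seen on the shared channel."""
    if destination & GROUP_BIT:
        # Group address: keep the packet only if this station joined the group.
        return destination in station_groups
    # Individual address: keep the packet only if it is addressed to this station.
    return destination == station_address

def broadcast(stations, destination):
    """Every station sees the packet; return the addresses of those that keep it."""
    return [address for address, groups in stations.items()
            if accepts(address, groups, destination)]
```

Here stations is a dictionary mapping each station address to the set of group addresses it listens to, so one packet sent to a group address is kept by several machines.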

1.5 Alternative classification criterion

The scale of a network can also be used to classify a network. Consider AST Figure 1.2:



Interprocessor distance   Processors located in same   Example
0.1 m                     Circuit board                Data flow machine
1 m                       System                       Multicomputer
10 m                      Room                         Local area network
100 m                     Building                     Local area network
1 km                      Campus                       Local area network
10 km                     City                         Metropolitan area network
100 km                    Country                      Wide area network
1,000 km                  Continent                    Wide area network
10,000 km                 Planet                       The internet

Fig. 1-2. Classification of interconnected processors by scale.

• Data flow machines are highly parallel with many functional units e.g. Thinking Machines CM5 and Fujitsu VP2200 systems [3], Texas Instruments TMS320 C40/C6x DSP devices.

• Multicomputers use short, fast buses to pass messages e.g. some DSP chipsets.

• Next are “our” networks operating over longer distances, divided into local, metropolitan, and wide–area networks.

• Such networks may be interconnected e.g. the Internet.

1.6 LANs

• Distinguished from other networks by size, transmission technology, and topology.

• Range in size from single building to a few kilometers in size — restricted size means worst–case transmission time is bounded (used in design).

[3] T.I. make high speed cross bar switches for variable topology machines and HP make high speed optical links for PCB use.



• Common transmission technology is a single cable to which all machines are attached, with speeds of 10 Mbps to 100 Mbps. Note: for transmission rates, 1 Mbps = 1 megabit/sec = 10^6 bps = 1,000,000 bps (not 2^20 = 1,048,576 bps).

• Common topologies include bus and ring e.g.

[Figure: (a) computers attached to a single shared cable; (b) computers connected in a ring.]

Fig. 1-3. Two broadcast networks. (a) Bus. (b) Ring.

• BUS LAN: arbitration is required to handle two or more machines transmitting simultaneously (a collision). Solutions involve passing tokens to avoid collisions, or mechanisms to handle a collision.

• RING LAN: each bit propagates around the ring (but not more than once!).

• Channel allocation: static or dynamic.

1.7 MANs

• Like a bigger LAN.

• Can support both data and voice (i.e. it is possible to meet the delivery time requirements of speech).

• A standard called Distributed Queue Dual Bus (DQDB, IEEE 802.6), developed in Australia, has been agreed upon.



[Figure: computers 1, 2, 3, ..., N are each attached to two buses, Bus A and Bus B, with a head end; the direction of flow on bus A is opposite to the direction of flow on bus B.]

Fig. 1-4. Architecture of the DQDB metropolitan area network.

1.8 WANs

• Spans a large geographical area.

• Connects machines (hosts, end systems) running user programs.

• Hosts are connected by a communication subnet, or subnet for short, used in the context of the top–level view in Figure 1-5.

[Figure: hosts on LANs attached to routers, which together form the subnet.]

Fig. 1-5. Relation between hosts and the subnet.

• A subnet consists of transmission lines and switching elements (switching computers, i.e. routers).

• Routers figure out which transmission link to use, and perform store–and–forward operations.

• Subnet ambiguity: the term subnet also has a meaning in terms of addressing.

• For point–to–point subnets, an important design consideration is topology:

[Figure: six example topologies, (a) to (f).]

Fig. 1-6. Some possible topologies for a point-to-point subnet. (a) Star. (b) Ring. (c) Tree. (d) Complete. (e) Intersecting rings. (f) Irregular.

Exercise: Postulate how (or whether) routing issues might be handled here.

• For WANs with small packet sizes (such as digital telephones), the packets may be called cells.

1.9 Internetworks

• Gateways are used to interconnect networks.

• Any necessary communications conversions are performed e.g. the networks might use different rules (protocols) for operation, or data representation might be different.



• An example of a large internetwork is The Internet.

1.10 Network organisation

• Network hardware and software is highly structured.

• Most networks are organised as a series of layers or levels — one reason for these layers is to help humans control complexity in design and implementation of the networks.

• The purpose of each layer is to offer services to higher layers, shielding those layers from the details of how the services are implemented.

• Layer n on one machine carries on a conversation with layer n on the other machine e.g.

[Figure: Host 1 and Host 2 each contain layers 1 to 5. Layer n on Host 1 talks to layer n on Host 2 using the layer n protocol (drawn as a dashed line). On each host, adjacent layers are separated by the layer 1/2, 2/3, 3/4, and 4/5 interfaces. The physical medium lies below layer 1.]

Fig. 1-9. Layers, protocols, and interfaces.



• Note: The dashed lines indicate a conversation between peers. The data bits involved circulate down, across the physical medium, and then back up between the corresponding levels.

• Between each adjacent layer is an interface which defines the primitive operations and services the lower layer offers to the upper layer.

• Design concerns the number and purpose of the layers, and a (well–understood) interface between them.

• A set of layers and protocols is called a Network Architecture — this is enough information for someone to build hardware and write software to implement each layer.

• Organisation is usually a series of layers or levels. Each layer offers services to higher layers, shielding those layers from the details of how the services are implemented.

• Dashed lines indicate a conversation between peers (local layer n talks to remote layer n).

• A list of protocols used by a certain system, one protocol per layer, may be called a protocol stack.

1.11 Example: Message Transfer

Communication is to occur from a process running in the top layer shown in AST Figure 1-11.



[Figure: the units handled at each layer, identical on the source machine and the destination machine.]

Layer 5:  M
Layer 4:  H4 M
Layer 3:  H3 H4 M1  |  H3 M2
Layer 2:  H2 H3 H4 M1 T2  |  H2 H3 M2 T2
Layer 1:  physical transmission

Peer layers 2 to 5 on the two machines communicate using the layer 2 to layer 5 protocols.

Fig. 1-11. Example information flow supporting virtual communication in layer 5.

• Message M is created and passed to layer 4 for transmission.

• Layer 4 prepends a header to identify the message (sequence no., size, time, etc.) and passes it to layer 3.

• While messages usually have no size limit, the layer 3 protocol will impose a limit so incoming messages are broken up into packets which have headers prepended and are passed to layer 2.

• Layer 2 prepends a header and appends a trailer and passes each packet to layer 1 for physical transmission.

• As information is passed down, headers and trailers are added. As packets are passed up, headers and trailers are removed (and used).

• These ideas are not limited to just networking — even hardware designers may implement header and trailer handling functions in hardware for high speed communications on multi-CPU DSP systems (but still with some software component).

• Peer processes think of the communication (and coordination) as being horizontal.

• Access will be via functions such as SendMessage and ReceiveMessage at the top level, and similar packet oriented functions at lower levels.
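The steps above can be sketched in Python as a pair of functions, one passing a message down through layers 4, 3, and 2 and one passing the received frames back up. This is a minimal sketch under stated assumptions: the literal header and trailer strings, the 8-byte layer 3 limit (MAX_PAYLOAD), and the function names send_down and pass_up are all hypothetical, and real headers carry fields such as sequence numbers rather than fixed labels.

```python
H4, H3, H2, T2 = b"H4|", b"H3|", b"H2|", b"|T2"  # placeholder headers and trailer
MAX_PAYLOAD = 8  # assumed layer 3 size limit, in bytes

def send_down(message):
    """Source machine: add headers (and a trailer) on the way down."""
    segment = H4 + message                       # layer 4 prepends its header
    packets = [H3 + segment[i:i + MAX_PAYLOAD]   # layer 3 splits and prepends
               for i in range(0, len(segment), MAX_PAYLOAD)]
    return [H2 + packet + T2 for packet in packets]  # layer 2 header and trailer

def pass_up(frames):
    """Destination machine: remove headers and trailers on the way up."""
    packets = [frame[len(H2):-len(T2)] for frame in frames]      # layer 2
    segment = b"".join(packet[len(H3):] for packet in packets)   # layer 3
    if not segment.startswith(H4):
        raise ValueError("layer 4 header missing")
    return segment[len(H4):]                                     # layer 4
```

With the 12-byte message b"hello world!", send_down produces two frames, matching the M1 and M2 pieces in Fig. 1-11, and pass_up recovers the original message.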

1.12 Network Design Issues

• Addressing: each host must be identified so messages can pass between pairs of hosts; hosts have multiple users so there must be addressing in the context of each host.

• Data transfer rules:

– simplex (uni–directional);

– half–duplex (either direction, but only one at a time);

– full–duplex (both directions simultaneously).

• Error handling: detection & correction (per packet); agreement on methods used.

• Packet handling: packetising large messages; resequencing of out–of–order packets; avoiding mostly empty packets.

• Flow control (protect slow data destinations from fast sources).

• Multiplexing: for multiple connections between two peer layers, a lower layer may choose to multiplex these connections (may reduce costs or delays).

• Routes: given multiple paths between source and destination, a route must be chosen.
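One of the packet handling issues listed above, resequencing of out-of-order packets, can be sketched as follows. The sketch assumes each packet carries a sequence number starting at 0 and that no packet is duplicated; the function name resequence and the (sequence number, payload) tuple format are hypothetical.

```python
def resequence(arrivals):
    """Return payloads in sequence order, given (seq, payload) pairs in arrival order.

    A packet that arrives early is held in a buffer until every packet
    with a lower sequence number has been delivered.
    """
    waiting = {}      # early packets, keyed by sequence number
    next_seq = 0      # next sequence number that may be delivered
    delivered = []
    for seq, payload in arrivals:
        waiting[seq] = payload
        # Deliver as many consecutive packets as are now available.
        while next_seq in waiting:
            delivered.append(waiting.pop(next_seq))
            next_seq += 1
    return delivered
```

If a packet never arrives, everything after it stays in the buffer, which is one reason error handling and resequencing are usually designed together.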

1.13 Interfaces and Services

Terminology:

• Entities are the active elements in a layer, software or hardware.

• Peer entities are entities in the same layer on different machines.

• Layer n is a service provider to layer n+1, while layer n+1 is a service user of layer n.

• Service Access Points are the access points in layer n where layer n+1 can access the services provided [4].

[Figure: layer N+1 passes an IDU (ICI plus SDU) across the interface, through a SAP, to layer N. Layer N entities exchange N-PDUs, each with a header, in their layer N protocol.]

SAP = Service Access Point
IDU = Interface Data Unit
SDU = Service Data Unit
PDU = Protocol Data Unit
ICI = Interface Control Information

Fig. 1-12. Relation between layers at an interface.

• Between layers, an Interface Data Unit (IDU) is used to exchange information. An IDU comprises a Service Data Unit (SDU) holding the data plus some control information.

• An SDU is transferred by fragmenting it into small parts called Protocol Data Units (PDUs) e.g. packets.

1.14 Types of service

A layer can offer two types of service to the layers above. Connection–oriented implies a link is established and used as follows:

• Connection setup e.g. telephone dialing;

• Data is exchanged e.g. talking; and

• Connection tear down e.g. hanging up.

[4] Some services may be more efficient if they “tap” into the protocol at a middle layer, so an SAP may be of use to a user service.



Connectionless implies:

• each element of data contains a full address;

• each element is sent independently of others, meaning that

• messages are not guaranteed to arrive in order (in contrast to connection–oriented services).

Note: connection–oriented services may be built upon lower layers that are connectionless, and vice–versa.

Type                  Service                   Example
Connection-oriented   Reliable message stream   Sequence of pages
Connection-oriented   Reliable byte stream      Remote login
Connection-oriented   Unreliable connection     Digitized voice
Connectionless        Unreliable datagram       Electronic junk mail
Connectionless        Acknowledged datagram     Registered mail
Connectionless        Request-reply             Database query

Fig. 1-13. Six different types of service.

Quality of service relates to aspects such as speed, delays, reliability, data loss:

• data loss can be avoided with acknowledgements, at the cost of extra processing and delay;

• reliable connection–oriented service can be implemented as message sequence or byte stream;

• unreliable (i.e. not acknowledged) connectionless service is often called a datagram service;

• for higher reliability, the datagram service can become acknowledged — the acknowledged datagram service still avoids connection establishment overheads;

• in a request–reply service, datagrams are exchanged (fast, able to handle packet loss etc., common in client–server computer systems).
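The trade-off between an unreliable and an acknowledged datagram service can be illustrated with a small simulation. This is a sketch under simplifying assumptions: the channel is modelled as a list of booleans (True means one transmission and its acknowledgement both get through), a lost acknowledgement is treated the same as a lost datagram, and the function names are hypothetical.

```python
def send_unreliable(datagrams, outcomes):
    """Each datagram is sent once; lost datagrams are simply missing."""
    return [d for d, got_through in zip(datagrams, outcomes) if got_through]

def send_acknowledged(datagrams, outcomes):
    """Each datagram is retransmitted until it is acknowledged.

    Returns the delivered datagrams and the number of transmissions used.
    The outcomes sequence must contain enough True values to finish.
    """
    channel = iter(outcomes)
    delivered = []
    transmissions = 0
    for datagram in datagrams:
        while True:
            transmissions += 1
            if next(channel):        # acknowledgement received
                delivered.append(datagram)
                break                # otherwise time out and retransmit
    return delivered, transmissions
```

With the same lossy channel, the unreliable service drops data while the acknowledged service delivers everything but needs extra transmissions, which is the processing and delay cost noted above.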



1.15 Service Primitives

• A service is specified by a set of primitive operations;

• Primitives tell the service to perform an action or report on an action.

Primitive    Meaning
Request      An entity wants the service to do some work
Indication   An entity is to be informed about an event
Response     An entity wants to respond to an event
Confirm      The response to an earlier request has come back

Fig. 1-14. Four classes of service primitives.

Services between entity a and entity b can be:

• confirmed involving request sent by a, indication received by b, response sent by b, confirm received by a; and

• unconfirmed involving request sent by a and indication received by b.

Example — a simple connection oriented service could be built on 8 primitives:

1. CONNECT.request

2. CONNECT.indication

3. CONNECT.response

4. CONNECT.confirm

5. DATA.request

6. DATA.indication

7. DISCONNECT.request

8. DISCONNECT.indication
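The way these eight primitives combine into a session can be sketched as a short trace generator. This is an illustrative sketch only: it assumes CONNECT is a confirmed service while DATA and DISCONNECT are unconfirmed, as the list above suggests, and the entity labels "a" and "b" and the function names are hypothetical.

```python
def confirmed(service, initiator, responder):
    """Confirmed service: request, indication, response, confirm."""
    return [(initiator, service + ".request"),
            (responder, service + ".indication"),
            (responder, service + ".response"),
            (initiator, service + ".confirm")]

def unconfirmed(service, sender, receiver):
    """Unconfirmed service: request, then indication at the other end."""
    return [(sender, service + ".request"),
            (receiver, service + ".indication")]

def simple_session():
    """Connect, send data in each direction, then disconnect."""
    trace = confirmed("CONNECT", "a", "b")
    trace += unconfirmed("DATA", "a", "b")
    trace += unconfirmed("DATA", "b", "a")
    trace += unconfirmed("DISCONNECT", "a", "b")
    return trace
```

Each entry in the returned list names the entity at which the primitive occurs, so the trace reads as a time-ordered sequence of ten events built from the eight primitives.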



AST Figure 1-15 shows typical use over time of these primitives (ignore Millie):

[Figure: time sequence (steps 1 to 10) of the primitives passing between Layer N + 1 and Layer N on Computer 1 and on Computer 2.]

Fig. 1-15. How a computer would invite its Aunt Millie to tea. The numbers near the tail end of each arrow refer to the eight service primitives discussed in this section.

Review

A service is a set of primitives that a layer provides to the layer above. A protocol is a set of rules governing the format and meaning of frames, packets, and messages exchanged by peer entities. An entity uses protocols to implement a service.



2 Reference Models — OSI

2.1 Introduction

The ISO Open Systems Interconnection Reference Model has 7 layers chosen according to the principles:

1. Layers only created when a different level of abstraction needed;

2. Layers perform well defined function;

3. Layer functions chosen with international standards in mind;

4. Layer boundaries chosen to minimise information flow across boundaries;

5. Number of layers chosen to handle the various distinct functions, one per layer, but without having too many layers.

There exists the OSI model specification document and also ISO layer standards.



[Figure: the seven OSI layers on Host A and Host B, with interfaces between adjacent layers. Routers inside the communication subnet boundary implement only the network, data link, and physical layers. These three layers use host-router protocols and an internal subnet protocol, while the transport, session, presentation, and application protocols run between the two hosts.]

Layer   Name           Name of unit exchanged
7       Application    APDU
6       Presentation   PPDU
5       Session        SPDU
4       Transport      TPDU
3       Network        Packet
2       Data link      Frame
1       Physical       Bit

Fig. 1-16. The OSI reference model.

2.2 Physical Layer

• Concerned with transmission of unstructured bit stream over physical link;

• Involves parameters such as signal voltage and bit time durations;

• Handles transmission which is uni–directional or bi–directional;

• Deals with mechanical, electrical, optical, and procedural aspects of establishing the link and moving data bits and disestablishing the link.

2.3 Data Link Layer

• Main task is to take a raw transmission facility and transform it into a line that appears free of undetected transmission errors;

• This is done by breaking input data into data frames, sending them sequentially, and processing acknowledgement frames sent back by the receiver;

• Since the physical layer appears to be a “bit conduit”, it is up to the data link layer to create and recognise frame boundaries — this is done by attaching special bit patterns to the beginning and end of each frame (and handling the case where this bit pattern needs to be represented within the frame);

• Handles data errors e.g. frame retransmission when frames are lost or corrupted, also handles duplicate frames;

• Handles flow control;

• Medium access in broadcast networks is handled by the medium access sublayer.
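The framing idea described above, marking frame boundaries with a special pattern and handling the case where that pattern occurs inside the data, can be sketched with byte stuffing. This is a byte-oriented illustration rather than the bit-level scheme a given link may use; the flag value 0x7E, the escape value 0x7D, the XOR with 0x20, and the function names are choices assumed for the example.

```python
FLAG = 0x7E  # assumed frame delimiter
ESC = 0x7D   # assumed escape byte

def frame(payload):
    """Wrap payload in FLAG bytes, escaping any FLAG or ESC inside it."""
    out = bytearray([FLAG])
    for byte in payload:
        if byte in (FLAG, ESC):
            out += bytes([ESC, byte ^ 0x20])  # escape, then a modified copy
        else:
            out.append(byte)
    out.append(FLAG)
    return bytes(out)

def deframe(framed):
    """Recover the payload from one complete frame."""
    if len(framed) < 2 or framed[0] != FLAG or framed[-1] != FLAG:
        raise ValueError("missing frame delimiters")
    out = bytearray()
    escaped = False
    for byte in framed[1:-1]:
        if escaped:
            out.append(byte ^ 0x20)  # undo the modification
            escaped = False
        elif byte == ESC:
            escaped = True
        else:
            out.append(byte)
    return bytes(out)
```

Because every FLAG byte inside the data is replaced before transmission, the receiver can treat any FLAG it sees as a genuine frame boundary.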

2.4 Network Layer

• Provides upper layers with independence from the data transmission and switching technologies used;

• Concerned with controlling the operation of the subnet, including routing which can be static (“wired in”) or dynamic (varied to improve performance e.g. avoid congestion);

• Handles accounting;

• Converts between the different addressing schemes and packet sizes of different networks;

• Broadcast networks do not have a routing problem.

2.5 Transport Layer

• Provide reliable transparent transport of data between end points;

• Basic function is to accept data from the session layer, split it into smaller units if need be, pass these to the network layer, and ensure all pieces arrive correctly;

• Must be efficient, and must isolate upper layers from changes in hardware technology;

• May create multiple network connections in order to achieve high throughput, or may multiplex several transport connections onto the same network connection to reduce cost;

• The type of service is determined at the transport layer (error–free point–to–point channel, or isolated messages with no guarantee of delivery, or “multi–cast”);

• Is a true end–to–end layer, from source to destination (contrast with lower layers of Figure 1-16);

• Handles establishing and deleting connections, and the naming of the end point users;

• Handles flow control.
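Multiplexing several transport connections onto one network connection, as mentioned in the list above, can be sketched by tagging each data unit with a connection identifier. This is a minimal illustration: the round-robin interleaving, the integer connection identifiers, and the function names multiplex and demultiplex are assumptions made for the example.

```python
from itertools import zip_longest

def multiplex(connections):
    """Interleave units from several connections onto one shared stream.

    connections maps a connection identifier to its list of data units.
    Each unit is tagged with its identifier so the far end can sort it out.
    """
    tagged_streams = [[(conn_id, unit) for unit in units]
                      for conn_id, units in connections.items()]
    shared = []
    for round_of_units in zip_longest(*tagged_streams):
        shared.extend(item for item in round_of_units if item is not None)
    return shared

def demultiplex(shared):
    """Rebuild the per-connection lists from the tagged shared stream."""
    connections = {}
    for conn_id, unit in shared:
        connections.setdefault(conn_id, []).append(unit)
    return connections
```

The tag plays the same role as the addressing information in a transport header: without it, the receiver could not tell which connection a unit belongs to.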

2.6 Session Layer

• Provides the control structure for communication (sessions) between applications, and establishes, manages, and terminates these sessions;

• Can provide dialogue control (e.g. manage one–way traffic), manage tokens (tokens may be exchanged in protocols where only one end at a time may attempt critical operations), provide synchronisation (inserts checkpoints in operations so that restarts are possible e.g. a file transfer restart).

2.7 Presentation Layer

• Can perform generally useful transformations and functions on data in a way that supports different users’ needs and avoids each user programming their own solution;

• Concerned with syntax & semantics of the information transmitted;

• Common services include encryption, text compression, reformatting;

• Data conversion from host–specific to network–oriented back to (a different) host–specific form [5].

[5] Different computers have different numeric representations while different data bases might have different data representation.



2.8 Application Layer

• Provides a variety of other protocols (in the OSI environment);

• Might provide a “network virtual terminal” service, network management service, transaction server, file transfer protocol (handling different naming conventions and different text representations), mail, etc.

2.9 Data Transmission in the OSI Model

[Figure 1-17: two seven-layer OSI stacks, one for the sending process and one for the receiving process, joined by a peer protocol at each layer (application, presentation, session, transport, network). On the way down the sending stack the data gains the headers AH, PH, SH, TH, NH and then DH plus the trailer DT, and is finally sent as bits along the actual data transmission path.]

Fig. 1-17. An example of how the OSI model is used. Some of the headers may be null. (Source: H.C. Folts. Used with permission.)

• Actual data transmission is vertical (apart from the lower physical link);

• Each layer is programmed as if it is transferring data with a horizontal peer.
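The two points above can be illustrated with a toy sketch. The header names AH, PH, SH, TH, NH, DH and the trailer DT come from Fig. 1-17; the text labels used as headers, the "|" separator, and the function names are invented here purely for illustration and do not correspond to any real protocol.

```python
# Toy model of vertical data flow: each layer adds its own header on the
# way down and removes it on the way up. Labels follow Fig. 1-17; the
# byte layout is an assumption made only for this sketch.
LAYERS = ["AH", "PH", "SH", "TH", "NH", "DH"]  # application ... data link


def send_down(data: bytes) -> bytes:
    """Wrap data with one header per layer, highest layer first."""
    frame = data
    for name in LAYERS:
        frame = name.encode() + b"|" + frame
    return frame + b"|DT"  # the data link layer also appends a trailer


def pass_up(frame: bytes) -> bytes:
    """Strip the trailer and headers in reverse order on the receiving host."""
    assert frame.endswith(b"|DT")
    frame = frame[:-3]
    for name in reversed(LAYERS):
        prefix = name.encode() + b"|"
        assert frame.startswith(prefix)
        frame = frame[len(prefix):]
    return frame


wire = send_down(b"hello")
print(wire)           # b'DH|NH|TH|SH|PH|AH|hello|DT'
print(pass_up(wire))  # b'hello'
```

Each layer only looks at its own header, which is why it can be written as if it were talking directly to its horizontal peer.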



3 TCP/IP Reference Model

3.1 Introduction

• The Internet was originally developed using leased telephone lines and later satellite and radio links;

• It had to handle connection of multiple networks in a seamless way;

• Defined in 1974, design predates OSI.

      OSI             TCP/IP
 7    Application     Application
 6    Presentation    (not present in the model)
 5    Session         (not present in the model)
 4    Transport       Transport
 3    Network         Internet
 2    Data link       Host-to-network
 1    Physical        Host-to-network

Fig. 1-18. The TCP/IP reference model.

3.2 The Internet Layer

• It allows hosts to inject packets into any network and have them travel independently to the destination;

• Packets may take different routes;

• Packets may arrive out of order (in which case higher layers must reorder them).

• The internet layer defines an official packet format and protocol called IP (Internet Protocol);

• The purpose of this layer is to deliver IP packets, hence the major issues are routing and congestion.

3.3 The Transport Layer

• It allows peer entities on the source and destination hosts to carry on a conversation (similar to OSI transport layer);

• There are two end–to–end protocols defined:

– TCP (Transmission Control Protocol) is a reliable connection–oriented protocol: it transfers a byte stream from one machine to another across the internet without error, breaks the message into fragments and reassembles them, and handles flow control.

– UDP (User Datagram Protocol) is an unreliable, connectionless protocol: it avoids TCP’s overheads, is used in client–server request–reply applications (where errors etc. are usually handled directly), and is also suitable where speed is more important than error avoidance, e.g. speech, video.

• Relationship of IP, TCP, UDP:

Layer (OSI names)      Protocols / Networks
Application            TELNET, FTP, SMTP, DNS
Transport              TCP, UDP
Network                IP
Physical + data link   ARPANET, SATNET, Packet radio, LAN

Fig. 1-19. Protocols and networks in the TCP/IP model initially.

3.4 The Application Layer

• The TCP/IP model does not have a session or presentation layer: the need was not perceived, and they are now considered of little use;



• Application layer includes higher level protocols such as:

– virtual terminal (TELNET);

– file transfer (FTP);

– mail transfer (SMTP);

– domain name service (DNS);

– network news transfer (NNTP);

– hyper–text transfer (HTTP).

3.5 Host–to–Network Layer

• Not described in the TCP/IP reference model;

• Usually not described in texts;

• Read the sources! I.e. look at 4.4BSD Lite/Lite2 source distributions or subsequent OS sources such as the *BSD family. Useful network sites include:

http://www.freebsd.org

http://www.au.freebsd.org

http://www4.au.freebsd.org

3.6 OSI versus TCP Reference Models

• Layers up through and including the transport layer provide an end–to–end network–independent transport service;

• OSI model makes the distinctions between services and interfaces and protocols explicit;

• OSI reference model was devised before protocols were invented, while TCP/IP was given a model to describe the existing protocols (the protocols fit their model very well indeed!);

• Separation of interfaces and implementation ties in well with modern OO design techniques;

• Different number of layers;



• Network layer: OSI provides connectionless and connection–oriented communications while TCP/IP has only connectionless communications;

• Transport layer: OSI has connection–oriented communications while TCP/IP has connectionless and connection–oriented communications.

Reading: AST 1.4.4 “Critique of the OSI Models and Protocols” and AST 1.4.5 “Critique of the TCP/IP Reference Model”.

3.7 Example Networks

• Internet in Australia: there are a number of backbone networks across the country with more expected, e.g. Optus, Telstra.

• On a local scale, the pathways between JCU/SIT hosts and remote hosts can be described with the use of the traceroute UNIX command.

• Novell Netware is a very popular PC networking system: it is based on Xerox Network System (XNS), predates OSI, and appears similar to TCP/IP. It uses a proprietary protocol stack shown in AST Figure 1-22:

Layer         Protocols
Application   SAP, File server, . . .
Transport     NCP, SPX
Network       IPX
Data link     Ethernet, Token ring, ARCnet
Physical      Ethernet, Token ring, ARCnet

Fig. 1-22. The Novell NetWare reference model.

• Physical and data link layers can be chosen from various industry standards including ethernet, IBM token ring, etc.

• Has an unreliable connectionless internetwork protocol called IPX, like IP but with 10 byte addresses instead of 4 bytes.

• Has a connection–oriented protocol called NCP (network core protocol) providing user data transport and other services (a second protocol SPX provides only transport).

• Servers regularly advertise services (SAP).

• PC network protocols are starting to be based on TCP/IP.

Field                 Bytes
Checksum              2
Packet length         2
Transport control     1
Packet type           1
Destination address   12
Source address        12
Data                  (rest of packet)

Fig. 1-23. A Novell NetWare IPX packet.

3.7.1 Internet Services

• Email: basic service allowing messages to be composed, sent, and received. Usually, a mail client handles email composition and reading while an operating system service handles email transfer via the Simple Mail Transfer Protocol (e.g. BSD Unix sendmail handles SMTP).

• News: message transfer system allowing individuals to communicate to groups. An application program handles news composition & reading while a network service handles news propagation via the Network News Transfer Protocol (NNTP).

• File Transfer (FTP): a user client program communicates with a remote application to provide file transfer.

• Remote Procedure Call (RPC): a local program communicates a request to a remote “service provider” asking for a remote procedure (program) to run and return the results.

• Remote Login: a user can run a remote shell (CLI session or other task) via tools such as telnet, rlogin, and rsh.



3.8 Data Communications Services

There exist quite a few other network standards but they are outside the scope of this subject (see AST for further information). Some additional services of interest are now mentioned.

• In the 1980’s/1990’s, Bellcore introduced Switched Multimegabit Data Service (SMDS) which was aimed at linking remote networks. Basic SMDS operates at 45Mbps, providing a simple connectionless packet delivery service.

• The CCITT developed the X.25 standard in the 1970’s. It provides an interface between public packet-switched networks and customers. Can be switched virtual circuit (setup, use, teardown) or permanent virtual circuit. Packets are ordered. Capacity has been sold in 2Mbps increments.

• Frame Relay provides a “bare bones” connection oriented bit transport with the user responsible for errors and flow control. Typical speed = 1.5 Mbps.

• Broadband Integrated Services Digital Network (B-ISDN) can offer television (various image sizes), telephony and high quality audio, other multimedia services, LAN interconnections, etc. The underlying technology is Asynchronous Transfer Mode (ATM) using a 53 byte packet called a cell (5 byte header, 48 byte payload).

Field       Bytes
Header      5
User data   48

Fig. 1-29. An ATM cell.

• ATM cell switching is flexible:

– Handles constant rate traffic (audio, video; see footnote 6) and variable rate traffic (data).

– High speed: Gbps possible.

– Multicasting possible (telephone companies can become broadcasters).

Is connection–oriented, with current speeds of 155Mbps (≈ 3 × T3 links) and 622Mbps. Video coding developers have been pushing for priority packets to be allowed.

Footnote 6: Compressed video can be bursty.



4 Physical, Data Link, and Network Layers

4.1 Physical Layer Issues

• Use communications ideas to move data bits: deal with modulation schemes, transmitters & receivers.

• Systems may use time division multiplexing, frequency division multiplexing, wavelength division multiplexing.

• Media can be wire link, fibre link, or radio (wireless).

• Must deal with signal attenuation and interference. Optical systems also suffer phase distortion and signal leakage.

• For further information: see AST.

4.2 Data Link Layer Issues

• Deals with algorithms that achieve reliable efficient communication between two adjacent machines.

• Machines are linked by a communications channel e.g. coax wire.

• Non–ideal channel characteristics: circuit errors; finite data rates; nonzero propagation times.

• DLL design issues: group bits into frames; handle transmission errors; regulate flow.

• DLL provides a well–defined interface to the network layer.

4.2.1 Framing

• Framing refers to the technique of identifying the start and end of each frame. One technique is to use a special flag (bit pattern).

• Using bit stuffing for framing allows data frames to contain an arbitrary number of bits and frees the system from any concept of character size. Each frame begins and ends with the special bit pattern 01111110. A sender DLL encountering 5 consecutive 1 bits in the data inserts a 0 in the bit stream, and the receiver DLL removes the inserted 0.

(a) 0110 1111 1111 1111 1111 0010

(b) 0110 11111[0] 11111[0] 11111[0] 1 0010      (stuffed bits shown in brackets)

(c) 0110 1111 1111 1111 1111 0010

Fig. 3-5. Bit stuffing. (a) The original data. (b) The data as they appear on the line. (c) The data as they are stored in the receiver’s memory after destuffing.
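The stuffing rule can be sketched in a few lines. This is a minimal illustration that works on strings of "0" and "1" characters rather than real bit streams; the function names are hypothetical, and the flag pattern is shown only to make the point that it can never appear inside stuffed data.

```python
FLAG = "01111110"  # frame delimiter from the notes


def stuff(bits: str) -> str:
    """Insert a 0 after every run of five consecutive 1 bits."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            out.append("0")  # stuffed bit
            run = 0
    return "".join(out)


def destuff(bits: str) -> str:
    """Remove the 0 that follows every run of five consecutive 1 bits."""
    out, run, i = [], 0, 0
    while i < len(bits):
        b = bits[i]
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            i += 1  # skip the stuffed 0 that the sender inserted
            run = 0
        i += 1
    return "".join(out)


data = "011011111111111111110010"      # the data of Fig. 3-5(a)
on_the_line = stuff(data)
print(on_the_line)                     # 011011111011111011111010010
print(destuff(on_the_line) == data)    # True
print(FLAG in on_the_line)             # False
```

Because the stuffed stream never contains six 1 bits in a row, the receiver can treat any occurrence of the flag as a genuine frame boundary.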

4.2.2 Error Control Overview

• To ensure delivery of a frame, we need some feedback from the receiver to the sender to indicate success or failure; this will handle errors within a frame.

• What if a frame is completely lost (perhaps due to a noise burst)? Start a timer after each frame is sent and resend if no acknowledgement is received within some time limit.

• What if frames arrive twice (ack. was lost) or out of order? Give each frame an ID number.

• DLL’s duties include management of timers & frame sequence ID numbers.

• Error detection: general idea is to have the TX end append extra bits to the message in such a way that the RX end can detect illegal bit combinations. Mathematically, it is possible to define a measure of distance between valid message+check bit patterns (the number of bit positions in which they differ); if any two valid patterns differ in at least d bits, then up to d−1 bit errors can be detected.

• Error correction: suppose valid combinations of message+check bits differ in at least 3 bits and we receive a message+check sequence that differs from an allowed pattern by only 1 bit. If we think only 1 bit is in error, we can choose the nearest allowed pattern and fix the error.



• A Hamming Code is a method of determining the smallest number of check bits to achieve a desired detection and correction capability.

• In a Cyclic Redundancy Check, the basic idea is to:

– regard the message m as a long binary number;

– divide m by a carefully chosen divisor g (the generator), using modulo–2 arithmetic in which subtraction has no borrows and is simply XOR;

– use the remainder after division as a check;

– the sender and receiver both do this calculation; for an error to go undetected, the pattern of bit errors must itself be a multiple of g so that the remainder is still OK.

– The choice of g is special!
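The CRC steps above can be sketched as follows. This is a minimal illustration on strings of "0" and "1" characters; it assumes the usual convention that the division is done with XOR (modulo-2) arithmetic and that the sender first appends len(g)−1 zero bits to the message. The function names and the 5-bit generator 10011 are chosen for illustration only; real links use longer standardised generators.

```python
def mod2_remainder(bits: str, gen: str) -> str:
    """Remainder of bits divided by gen using XOR (modulo-2) long division."""
    work = list(bits)
    for i in range(len(bits) - len(gen) + 1):
        if work[i] == "1":  # divisor "goes into" the current position
            for j, g in enumerate(gen):
                work[i + j] = "0" if work[i + j] == g else "1"
    return "".join(work[-(len(gen) - 1):])


def crc_append(message: str, gen: str) -> str:
    """Sender: append the check bits so the whole frame divides evenly by gen."""
    padded = message + "0" * (len(gen) - 1)
    return message + mod2_remainder(padded, gen)


def crc_ok(frame: str, gen: str) -> bool:
    """Receiver: the frame is accepted only if the remainder is all zeros."""
    return set(mod2_remainder(frame, gen)) <= {"0"}


GEN = "10011"
frame = crc_append("1101011011", GEN)
print(frame)                 # 11010110111110
print(crc_ok(frame, GEN))    # True

damaged = frame[:3] + ("1" if frame[3] == "0" else "0") + frame[4:]
print(crc_ok(damaged, GEN))  # False
```

The damaged frame is rejected because a single flipped bit is never a multiple of a generator with more than one 1 bit.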

4.2.3 Flow Control Overview

• The sender and receiver of frames may operate at different maximum rates due to CPU power available, CPU loadings, etc.

• Higher speed senders must be prevented from swamping lower speed receivers in order to prevent frame losses.

• Again, a feedback mechanism is employed from receiver to sender.

• Can be explicit requests to send n frames, or can be handled by the receiver slowing acknowledgements.

• For further information: see AST.

4.3 Network Layer in Internet

• The internet can be viewed as a group of subnets joined together.

• The “glue” is the network layer protocol called IP (Internet Protocol) which was designed with internetworking in mind.

• Above the network layer is the transport layer which takes data streams and breaks them up into datagrams, of size up to 64K byte but usually 1500 bytes, which are handled by the network layer.

• These datagrams may be fragmented into smaller units.



• At the destination, the pieces are reassembled into the original datagrams and passed to the transport layer.

4.3.1 IP Header

One row per 32 bit word:

Version | IHL | Type of service | Total length
Identification | DF | MF | Fragment offset
Time to live | Protocol | Header checksum
Source address
Destination address
Options (0 or more words)

Fig. 5-45. The IP (Internet Protocol) header.

• An IP datagram contains a header part and a data part, with fields stored most significant bit first (big–endian).

• Version identifies which version of the protocol is being used; this allows protocol changes to be supported.

• IHL specifies the header length.

• Type of service allows the host to tell the subnet what type of service it wants: it can choose from combinations of reliability versus speed.

Current use:

– bits 7,6,5 = 3bit priority;

– bits 4,3,2 = DTR (what is most important out of delay, throughput, reliability);

– bits 1,0 = unused.

At present, routers tend to ignore this field!

• Total length = datagram length (header and data).



• Identification field allows a destination host that is reassembling a datagram to determine which datagram an arriving fragment belongs to.

• DF indicates a don’t fragment request and routers should not fragment this datagram (useful if the destination cannot reassemble it, e.g. when a PC is booting and needs to receive its OS as a single datagram). All systems must be able to accept fragments of 576 bytes or less.

• MF indicates there are more fragments to come.

• Fragment Offset says where this fragment belongs in the current datagram. The offset is counted in units of 8 bytes, so its 13 bit size gives a maximum datagram size of 2^13 × 8 = 64K bytes.

• Time to live limits packet lifetimes, which prevents packets wandering around forever. Usually decremented on each hop; the packet is discarded when it reaches 0 and the source is warned.

• Protocol field identifies the protocol: TCP, UDP, others.

• Header checksum covers the header only. It must be recomputed on each hop because time to live is updated (a CPU task).

• Addresses specify source and destination.

• Options:

Option                  Description
Security                Specifies how secret the datagram is
Strict source routing   Gives the complete path to be followed
Loose source routing    Gives a list of routers not to be missed
Record route            Makes each router append its IP address
Timestamp               Makes each router append its address and timestamp

Fig. 5-46. IP options.

• IP addresses are 32 bits long and come in five classes (A to E), of which classes A, B and C are used for ordinary host addressing:



Class   Leading bits   Remaining fields          Range of host addresses
A       0              Network, Host             1.0.0.0 to 127.255.255.255
B       10             Network, Host             128.0.0.0 to 191.255.255.255
C       110            Network, Host             192.0.0.0 to 223.255.255.255
D       1110           Multicast address         224.0.0.0 to 239.255.255.255
E       11110          Reserved for future use   240.0.0.0 to 247.255.255.255

Fig. 5-47. IP address formats.
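The header fields of Fig. 5-45 can be unpacked from raw bytes with the standard struct module. The sketch below assumes the standard IPv4 field widths (4 bit version and IHL, 8 bit type of service, 16 bit lengths, 3 flag bits followed by a 13 bit fragment offset counted in 8 byte units); the function name, dictionary keys, and sample addresses are made up for illustration.

```python
import socket
import struct


def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the 20 fixed bytes of an IPv4 header (options are ignored)."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,
        "ihl_words": ver_ihl & 0x0F,          # header length in 32 bit words
        "type_of_service": tos,
        "total_length": total_len,
        "identification": ident,
        "df": bool(flags_frag & 0x4000),      # don't fragment
        "mf": bool(flags_frag & 0x2000),      # more fragments
        "fragment_offset_bytes": (flags_frag & 0x1FFF) * 8,
        "time_to_live": ttl,
        "protocol": proto,                    # 6 = TCP, 17 = UDP
        "header_checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


# A hand-built sample header: version 4, IHL 5, DF set, TTL 64, protocol TCP.
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0x1234, 0x4000, 64, 6, 0,
                     socket.inet_aton("10.0.0.1"),
                     socket.inet_aton("192.168.1.2"))
fields = parse_ipv4_header(sample)
print(fields["version"], fields["ihl_words"], fields["df"], fields["source"])
```

The "!" in the format string selects network (big–endian) byte order, matching the way the fields are stored on the wire.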

4.4 IP Addresses

• Are 32 bits long with the leading bits indicating the class of address.

• Class A addresses: bit 31=0, a 7 bit network part, and a 24 bit host part, yielding an address range 1.0.0.0–127.255.255.255.

• Class B addresses: bits 31,30=10, a 14 bit network part, and a 16 bit host part, yielding an address range 128.0.0.0–191.255.255.255.

• Class C addresses: bits 31–29=110, a 21 bit network part, and an 8 bit host part, yielding an address range 192.0.0.0–223.255.255.255.

• Class D addresses: bits 31–28=1110 and a 28 bit multicast part, yielding an address range 224.0.0.0–239.255.255.255.

• Class E addresses are reserved for future use: bits 31–27=11110, yielding an address range 240.0.0.0–247.255.255.255.

• Address space is used more efficiently if class A networks migrate to class B or C (where possible).



• Machines that are to be connected to the internet have to obtain a registered IP address. An Internet Service Provider and/or ftp://rs.internic.net has details on registration of official IP addresses.

Address pattern (32 bits)                    Meaning
All 0s                                       This host
0s in the network part, then Host            A host on this network
All 1s                                       Broadcast on the local network
Network, then all 1s in the host part        Broadcast on a distant network
127, then (Anything)                         Loopback

Fig. 5-48. Special IP addresses.

• There are also a number of special IP addresses:

– using 32 ones is a broadcast address to all hosts on the local network; note that this means a host can talk to its neighbours before knowing its own address;

– using all ones in only the host field is a broadcast address to all hosts on the specified network;

– using 32 zeros means this host;

– using all zeros in the leading address bits up to and including the network field is an address to a host on the local network; note that this means a host can talk to its neighbours without knowing what network they are on;

– a 7 bit network address of 127 (with any 24 bit host value) allows software to “talk” to/with the network interface without any packets going onto the wire; this allows what is called loopback testing to occur with no address information at all;

– according to RFC 1597, you can use the following IP networks for private nets which will never be connected to the Internet:

10.0.0.0–10.255.255.255
172.16.0.0–172.31.255.255
192.168.0.0–192.168.255.255

◦ This means you could have one registered internet system and then private networks (using parts of the above IP address ranges) attached to it. A BSD UNIX kernel built with “firewall” and “ipdivert” support could also handle Network Address Translation.
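The class rules and the private ranges listed in this section can be checked mechanically. The sketch below uses Python's standard ipaddress module; the function names are hypothetical, and the three private blocks are written in prefix notation (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), which covers exactly the ranges quoted from RFC 1597.

```python
import ipaddress

PRIVATE_NETS = [ipaddress.ip_network(n)
                for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def address_class(addr: str) -> str:
    """Return the class letter implied by the leading bits of the address."""
    first_octet = int(ipaddress.IPv4Address(addr)) >> 24
    if first_octet < 128:
        return "A"   # leading bit 0
    if first_octet < 192:
        return "B"   # leading bits 10
    if first_octet < 224:
        return "C"   # leading bits 110
    if first_octet < 240:
        return "D"   # leading bits 1110 (multicast)
    return "E"       # remaining addresses, reserved


def is_private(addr: str) -> bool:
    """True if the address lies in one of the private ranges listed above."""
    ip = ipaddress.IPv4Address(addr)
    return any(ip in net for net in PRIVATE_NETS)


for a in ("10.1.2.3", "137.219.16.1", "192.168.0.1", "224.0.0.5"):
    print(a, address_class(a), is_private(a))
```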

4.5 Subnets

• Networking requirements can change, especially in growing organisations.

• A range of IP addresses can be broken up into “subnets” e.g. Fig. 5-49 shows a class B address in which the original 16 bit host part has been reallocated into a 6 bit subnet part and a 10 bit host part.

• This split looks the same from the outside world, so no registrations change.

• However, internally, the network has been divided up into smaller subnets (fewer collisions, greater total distance can be covered, etc.).

• The routers internal to the organisation are simply given new details.

Address (32 bits):   10 | Network | Subnet | Host

Subnet mask:         1111 1111 1111 1111 1111 1100 0000 0000      (22 ones, 10 zeros)

Fig. 5-49. One of the ways to subnet a class B network.
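The 16/6/10 bit split of Fig. 5-49 can be worked through numerically. The sketch below is illustrative only: the address 130.50.15.6 is a made-up class B example and the names are hypothetical.

```python
import ipaddress

# Field widths for the class B split described above.
NETWORK_BITS, SUBNET_BITS, HOST_BITS = 16, 6, 10

# 22 ones followed by 10 zeros.
subnet_mask = (0xFFFFFFFF << HOST_BITS) & 0xFFFFFFFF


def split(addr: str):
    """Return (network, subnet, host) as integers for a subnetted class B address."""
    value = int(ipaddress.IPv4Address(addr))
    network = value >> (SUBNET_BITS + HOST_BITS)  # includes the leading "10" bits
    subnet = (value >> HOST_BITS) & ((1 << SUBNET_BITS) - 1)
    host = value & ((1 << HOST_BITS) - 1)
    return network, subnet, host


print(ipaddress.IPv4Address(subnet_mask))   # 255.255.252.0
print(split("130.50.15.6"))                 # (33330, 3, 774)
print(2 ** SUBNET_BITS, "subnets of", 2 ** HOST_BITS, "addresses each")
```

Here 33330 is 130 × 256 + 50, i.e. the two network octets, so the outside world still sees the single class B network 130.50.0.0.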

4.6 IP Router Operation

• Each router maintains lists of network addresses [network,0] and local host addresses [0,host].

• The first [network,0] list tells how to get to remote networks.

• The second [0,host] list says which hosts are known on the local network.

• When an IP packet arrives, its destination address is looked up in the routing table.

• If the network part identifies a remote network, the packet is forwarded out the appropriate interface to the next router.

• If the network part identifies the local network, it is sent directly to the host (network=0 packets are ignored).

• If the network is not found in the tables, the packet is forwarded to a default router with more extensive tables.

• When subnets are added, the router now maintains tables of [network,0], [this-network,subnet,0] and [this-network,this-subnet,host].

• ANDing the IP address with the subnet mask (Fig. 5-49) gives the particular subnet. ANDing the IP address with the netmask identifies the network. The result is a 3–level hierarchy.
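The lookup procedure with subnets can be sketched as three small tables and two AND operations. Everything concrete here is an assumption made for illustration: the router is taken to sit on subnet 130.50.12.0 of class B network 130.50.0.0, the table contents and next-hop names are invented, and remote networks are assumed to be class B so that a single netmask suffices.

```python
import ipaddress


def ip(text: str) -> int:
    return int(ipaddress.IPv4Address(text))


NETMASK = ip("255.255.0.0")          # class B network part
SUBNET_MASK = ip("255.255.252.0")    # network + 6 bit subnet (Fig. 5-49)
THIS_NETWORK = ip("130.50.0.0")
THIS_SUBNET = ip("130.50.12.0")

remote_networks = {ip("140.10.0.0"): "via router R1"}     # [network,0]
other_subnets = {ip("130.50.4.0"): "via router R2"}       # [this-network,subnet,0]
local_hosts = {ip("130.50.15.6"): "deliver directly"}     # [this-network,this-subnet,host]
DEFAULT = "via default router"


def next_hop(dest: str) -> str:
    """Decide where to send a packet using the 3-level hierarchy."""
    d = ip(dest)
    if d & NETMASK != THIS_NETWORK:
        return remote_networks.get(d & NETMASK, DEFAULT)
    if d & SUBNET_MASK != THIS_SUBNET:
        return other_subnets.get(d & SUBNET_MASK, DEFAULT)
    return local_hosts.get(d, "host unknown")


for dest in ("140.10.3.4", "130.50.5.9", "130.50.15.6", "8.8.8.8"):
    print(dest, "->", next_hop(dest))
```

The point of the hierarchy is that the router needs one entry per remote network and per local subnet, plus entries only for the hosts on its own subnet.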

4.7 Internet Control Protocols

4.7.1 Internet Control Message Protocol (ICMP)

Message type              Description
Destination unreachable   Packet could not be delivered
Time exceeded             Time to live field hit 0
Parameter problem         Invalid header field
Source quench             Choke packet
Redirect                  Teach a router about geography
Echo request              Ask a machine if it is alive
Echo reply                Yes, I am alive
Timestamp request         Same as Echo request, but with timestamp
Timestamp reply           Same as Echo reply, but with timestamp

Fig. 5-50. The principal ICMP message types.

• Destination unreachable: as it says, or a packet that would need to be fragmented has the DF bit set, which prevents delivery.

• Time exceeded — packet is looping, or congestion or timeout problems.



• Parameter problem — invalid IP packet found.

• Source quench — was used for flow control.

• Redirect — allows network knowledge to propagate.

• Echo allows destinations to be checked for reachability and timestamping allows performance measurement.
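As an illustration of the echo message, the sketch below builds the bytes of an ICMP echo request. It relies on details from the ICMP and Internet checksum specifications rather than from these notes: message type 8 with code 0 for an echo request, followed by a 16 bit checksum, identifier and sequence number, with the checksum being the one's complement of the one's complement sum of all 16 bit words. The function names are hypothetical, and nothing is actually sent (sending raw ICMP normally needs administrator rights).

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's complement of the one's complement sum of 16 bit words."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def echo_request(ident: int, seq: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request (type 8, code 0) with a valid checksum."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)  # checksum field = 0
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload


packet = echo_request(ident=1, seq=1, payload=b"ping")
print(packet.hex())
print(internet_checksum(packet) == 0)   # a correct packet checks to zero
```

A receiver verifies the message the same way: recomputing the checksum over the whole packet, including the checksum field, must give zero.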

4.7.2 Address Resolution Protocol (ARP)

• The interface board only knows about 48 bit LAN addresses (each board is manufactured with a unique 48 bit address).

• Each interface has an IP address.

• ARP is a mechanism that allows a host to find out what 48 bit LAN address belongs to an IP address. A system outputs a broadcast packet to every machine on the network in question, asking “who owns this IP address”. The owner replies with their LAN address.

• ARP reduces the need for configuration files.

• By having hosts cache the results, ARP requests are reduced. However, cache entries are discarded after a few minutes so that systems that have their LAN cards replaced due to failure get operating quickly.

• It is also possible for hosts to broadcast their mapping when they boot up. No response is expected. However, a machine with the same IP address should respond in order to prevent the second machine coming on-line and creating chaos!

• It is also possible for routers to react to ARP requests for IP information belonging to remote networks. In proxy ARP, routers cooperate by forwarding the ARP request to the appropriate network for a response to be generated (and returned).

• Note: given just an internet name e.g. marlin.jcu.edu.au, another service called the domain name service can be used to obtain the IP address corresponding to the name.
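The caching behaviour described above can be sketched as a small table with a timeout. This is a toy model, not how any particular operating system implements it: the class and method names are invented, the 300 second lifetime is an assumed value for "a few minutes", and the current time is passed in explicitly so that the behaviour is easy to follow.

```python
class ArpCache:
    """Toy IP-to-LAN-address cache whose entries expire after a fixed time."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self.entries = {}          # IP address -> (LAN address, time stored)

    def learn(self, ip: str, lan_address: str, now: float) -> None:
        """Record a mapping, e.g. from an ARP reply or a boot-time broadcast."""
        self.entries[ip] = (lan_address, now)

    def lookup(self, ip: str, now: float):
        """Return the cached LAN address, or None if an ARP request is needed."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        lan_address, stored = entry
        if now - stored > self.ttl:
            del self.entries[ip]   # stale: the LAN card may have been replaced
            return None
        return lan_address


cache = ArpCache()
cache.learn("10.0.0.7", "00:11:22:33:44:55", now=0)
print(cache.lookup("10.0.0.7", now=100))   # 00:11:22:33:44:55
print(cache.lookup("10.0.0.7", now=301))   # None, so broadcast a new request
```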



4.7.3 Reverse Address Resolution Protocol

• RARP does the reverse of ARP.

• A diskless machine about to boot up will know its 48 bit LAN address but will not know its IP address.

• A machine using RARP sends out a broadcast packet (with all 32 address bits = 1) saying what its LAN address is, and an RARP server responds with the IP address.

• This allows the diskless machine to share boot files with other machines while retaining its unique identity.

• Broadcast packets with all address bits = 1 are not propagated by routers to avoid unwanted traffic, & RARP servers must exist on any subnet needing them.

• On UNIX, RARP can be handled by daemons such as rarpd and bootp which are started at boot time.

• Note: In some circumstances, it is desired to allocate IP numbers dynamically to hosts. One solution is to have a UNIX system run a Dynamic Host Configuration Protocol Server (the dhcpd daemon) which also supports the bootp protocol.

4.7.4 Interior Gateway Routing Protocol

• As the internet connects many different organisations, these organisations have been free to develop their own internal routing methods.

• Early algorithms suffered as networks grew so the Internet Engineering Task Force developed the OSPF (Open Shortest Path First) standard in 1990.

• It is published hence “open”.

• It can handle metrics such as physical distance, time delay, and others.

• It is dynamic hence can adapt to changes in topology automatically.

• Supports routing based on service i.e. the type of service field is now inspected so that it is possible to handle real–time traffic (multi–media), etc.

• It can do load balancing hence routers connected by multiple pathways can have their traffic spread across the pathways to maximise performance (previously, routers used the best single link and ignored the others). Load spreading is important when routers are connected by multiple PPP links.

• OSPF works by having adjacent routers exchange information with acknowledgement and timestamping; hence, routers have up-to-date knowledge of costs etc.

• In normal operation, a router floods link state update messages to its neighbours.

• To minimise overall coordination traffic, one router is elected to be the designated router and it is considered to be “adjacent” to all other routers.

• As the routers all belong to a single organisation, they can trust one another!

Message type           Description
Hello                  Used to discover who the neighbors are
Link state update      Provides the sender’s costs to its neighbors
Link state ack         Acknowledges link state update
Database description   Announces which updates the sender has
Link state request     Requests information from the partner

Fig. 5-54. The five types of OSPF messages.
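Once the link state updates have given every router the same map of link costs, each router computes its own shortest paths. The notes do not name the algorithm; the sketch below uses Dijkstra's shortest path algorithm, which is the conventional choice for link state protocols such as OSPF. The four-router topology and its costs are invented for illustration.

```python
import heapq


def shortest_paths(graph: dict, source: str) -> dict:
    """Dijkstra's algorithm: lowest total cost from source to every router."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue                          # stale queue entry
        for neighbour, cost in graph[node].items():
            candidate = d + cost
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return dist


# Link costs as they might be learned from link state update messages.
links = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 2},
    "C": {"A": 5, "B": 2, "D": 1},
    "D": {"C": 1},
}
print(shortest_paths(links, "A"))   # {'A': 0, 'B': 1, 'C': 3, 'D': 4}
```

Router A reaches C more cheaply through B (cost 3) than over its direct link (cost 5), which is the kind of decision a metric-based protocol makes automatically when costs change.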

4.7.5 Exterior Gateway Routing Protocol

• Border Gateway Protocol (BGP) acts as a routing protocol between organisations according to policies chosen by the owners.

• Policies can be based on politics, commercial considerations, costs, services to customers (and rejection of traffic of non–customers), etc.

• BGP is very general.

• Pairs of BGP routers communicate by establishing a TCP connection.

4.8 Internet Multicasting

• IP supports multicasting using class D addresses.



• The group ID has 28 bits, allowing more than 250,000,000 groups.

• Packets addressed to multicast addresses get “best effort” delivery with no guarantees.

• Addresses can be permanent, e.g.

◦ 224.0.0.1 = all systems on a LAN
◦ 224.0.0.2 = all routers on a LAN
◦ 224.0.0.5 = all OSPF routers on a LAN
◦ 224.0.0.6 = all designated OSPF routers on a LAN

• Temporary addresses are available to processes running on the computers. A process can ask to join a group and leave a group.

• A host must therefore handle traffic to group address(es) as well as its own IP address(es), and keep track of which groups it has processes belonging to.

• Multicasting is supported by special multicast routing.

• Internet Group Management Protocol (IGMP) allows routers to track which groups are active on their subnet.

4.9 Classless InterDomain Routing

• The granularity of class based addressing limits the efficiency of address use: class B says you get 2^16 addresses while class C says you get 2^8 addresses, but what if you want 2,001?

• In CIDR, the address range 204.0.0.0–223.255.255.255 is allocated for future use, and the following ranges are allocated by region:

194.0.0.0–195.255.255.255: Europe
198.0.0.0–199.255.255.255: North America
200.0.0.0–201.255.255.255: Central and South America
202.0.0.0–203.255.255.255: Asia/Pacific

• The network/host bit split in the IP address is variable, allowing almost complete flexibility and high efficiency.

• Studies showed that the previous class based addressing was using around 50% of available IP addresses.

• Router work is easy with respect to identifying the region on Earth, but then a large database of information must be (quickly) accessed to determine final packet routes. This requires a router to have higher computational power and more internal storage (the usual ...).
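The "what if you want 2,001?" question can be answered with a short calculation. The sketch below finds the smallest power-of-two block that fits a request and expresses it as a prefix length; the function name is hypothetical and the block 194.24.0.0/21 is an arbitrary example taken from the European range listed above.

```python
import ipaddress


def prefix_for(addresses_needed: int) -> int:
    """Smallest prefix length whose block holds at least this many addresses."""
    host_bits = (addresses_needed - 1).bit_length()
    return 32 - host_bits


prefix = prefix_for(2001)
block = ipaddress.ip_network("194.24.0.0/%d" % prefix)
print(prefix)                        # 21
print(block.num_addresses)           # 2048
print(block[0], "to", block[-1])     # 194.24.0.0 to 194.24.7.255
```

A /21 block wastes only 47 of its 2048 addresses for this request, whereas a class B allocation would leave more than 63,000 unused and a single class C would be far too small.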

4.10 IPv6

• Development of IPv6 started in 1990.

• Aims: support of billions of hosts; simplify routing tables; simplify protocols; better authentication and privacy; be more responsive to type of service; allow scope of multicasting to be specified; support mobile IP addresses; allow future protocol evolution; and have old and new protocols coexist.

• Multimedia support was important.

• IPv6 is compatible with TCP, UDP, ICMP, IGMP, OSPF, BGP, and DNS.

• It is not directly compatible with IPv4 since it has a different header with fewer fields to simplify processing (and bigger addresses too).

• Fewer fields simplify router work, although routers will have to handle both versions for maybe a decade (probably not a problem given VLSI advances).

• The version field is 6 for IPv6.

• Priority values 0 . . . 7 are for traffic that can be slowed down given congestion, while 8 . . . 15 are for real–time traffic.

• Addresses are 16 bytes long, with the first group of reserved addresses allocated to IPv4.

• Address use will not be efficient, but will allow fast routing.



One row per 32 bit word:

Version | Priority | Flow label
Payload length | Next header | Hop limit
Source address (16 bytes)
Destination address (16 bytes)

Fig. 5-56. The IPv6 fixed header (required).

Prefix (binary)   Usage                          Fraction
0000 0000         Reserved (including IPv4)      1/256
0000 0001         Unassigned                     1/256
0000 001          OSI NSAP addresses             1/128
0000 010          Novell NetWare IPX addresses   1/128
0000 011          Unassigned                     1/128
0000 1            Unassigned                     1/32
0001              Unassigned                     1/16
001               Unassigned                     1/8
010               Provider-based addresses       1/8
011               Unassigned                     1/8
100               Geographic-based addresses     1/8
101               Unassigned                     1/8
110               Unassigned                     1/8
1110              Unassigned                     1/16
1111 0            Unassigned                     1/32
1111 10           Unassigned                     1/64
1111 110          Unassigned                     1/128
1111 1110 0       Unassigned                     1/512
1111 1110 10      Link local use addresses       1/1024
1111 1110 11      Site local use addresses       1/1024
1111 1111         Multicast                      1/256

Fig. 5-57. IPv6 addresses
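The fixed header of Fig. 5-56 is small enough to pack by hand. The sketch below assumes the field widths of the original IPv6 design that the figure depicts (4 bit version, 4 bit priority, 24 bit flow label, 16 bit payload length, 8 bit next header, 8 bit hop limit, then two 16 byte addresses); later revisions of the standard reallocated some bits of the first word. The function name and the sample addresses are illustrative only.

```python
import ipaddress
import struct


def pack_ipv6_header(priority: int, flow_label: int, payload_len: int,
                     next_header: int, hop_limit: int,
                     src: str, dst: str) -> bytes:
    """Pack the 40 byte fixed header using the layout of Fig. 5-56."""
    first_word = (6 << 28) | ((priority & 0xF) << 24) | (flow_label & 0xFFFFFF)
    return (struct.pack("!IHBB", first_word, payload_len, next_header, hop_limit)
            + ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed)


header = pack_ipv6_header(priority=8, flow_label=0, payload_len=20,
                          next_header=6, hop_limit=64,
                          src="::1", dst="::ffff:10.0.0.1")
print(len(header))          # 40
print(header[0] >> 4)       # 6, the version field
print(header[0] & 0x0F)     # 8, a real-time priority value
```

The destination used here is an IPv4 address carried inside the reserved 0000 0000 prefix, which is how the first group of addresses in Fig. 5-57 accommodates IPv4.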



5 Transport and Session Layers

5.1 Transport Protocols

5.1.1 Introduction

• Transport protocols (TPs) are the most important part of a communications system (AST says “the heart” of the protocol hierarchy).

• The task of a TP is to provide reliable, cost-effective data transport from the source machine to the destination machine, independent of the physical network(s) in use.

• A TP shields the user from the details and characteristics of the layers below it. The TP is complex.

• Transport protocols provide the basic end-to-end service of transferring data between users. This is achieved by communication with a remote peer, and using the services of the network layer.

• The hardware and/or software within the transport layer that does the work is called the transport entity.

[Figure 6-1: on each of Host 1 and Host 2, an application (or session) layer sits above a transport entity, which sits above the network layer. The application/transport interface is reached through a transport address and the transport/network interface through a network address. The two transport entities exchange TPDUs using the transport protocol.]

Fig. 6-1. The network, transport, and application layers.

• The existence of the transport layer makes it possible for the transport service to be more reliable than the underlying network service.

• The transport layer offers independence from the actual network layer available. This leads to a view of the lower 4 layers as being the transport service provider and the upper layers as the transport service user.

• The transport layer can also be viewed as a way of enhancing quality of service(QoS).

5.1.2 Types of Service

• Connection or connectionless.

• Connection-oriented provides the establishment, maintenance and termination of a logical connection between transport users. Allows the use of flow control, error control and sequenced delivery.

• A connectionless service can be more appropriate, for example:

– Inward data collection;

– Outward data dissemination;

– Message passing;

– Remote procedure calls; and

– Real-time applications.

5.1.3 Qualities of Service

• Not easily defined by users.

• Users may be able to request a particular QoS (the transport protocol can use thisto determine how best to use the network layer).

• Example QoS are:

– Connection establishment delay: elapsed time between requesting a transport connection and the user receiving confirmation (includes remote time delays);

– Connection establishment failure probability: chance of no connection being established within a maximum establishment delay time;

– Throughput: number of user bytes transferred per unit time over a test interval (bi–directional);

– Transit Delay: delay between a message being sent by the transport user at the source and it being received by the transport user at the destination machine (bi–directional);

– Residual error ratio: fraction of messages that were lost or garbled and not fixed;

– Priority levels: allow higher priority links to be serviced in the event of congestion;

• A transport protocol is limited by the nature of the underlying network, and not all QoS may be possible.

• There is also a trade-off between reliability, delay, throughput and service cost.

• Sample applications’ QoSs:

– File transfer — low errors and high throughput;

– Remote procedure calls — low delay;

– Email transfer — priority levels.

• Either a transport protocol provides different QoS via option negotiation, or there are different TPs for different classes of traffic.

5.1.4 Transport Service Primitives

• Provides user access to the transport service.

• While the network service tends to model the (possibly unreliable) network that it is implemented on, the transport service provides the user with access to a somewhat idealised service where acknowledgements, lost packets, congestion, etc. are not directly visible.

• The transport service is therefore easier to use, and we describe access primitives.




Primitive     TPDU sent             Meaning
----------    ------------------    ------------------------------------------
LISTEN        (none)                Block until some process tries to connect
CONNECT       CONNECTION REQ.       Actively attempt to establish a connection
SEND          DATA                  Send information
RECEIVE       (none)                Block until a DATA TPDU arrives
DISCONNECT    DISCONNECTION REQ.    This side wants to release the connection

Fig. 6-3. The primitives for a simple transport service.

• LISTEN: a server executes a LISTEN primitive and waits (is blocked) until a connection is established.

• CONNECT: a client executes a CONNECT primitive when it wants to talk to the server. The transport entity carries out this primitive by making the caller wait (blocking) and sending a packet containing a transport layer message to the server’s transport entity.

• The CONNECTION REQUEST arrives at the server and the transport entity checks to see if there is a server blocked on a LISTEN. If there is, it is unblocked and the transport entity sends a CONNECTION ACCEPTED message back to the client.

• Data can now be exchanged between the client and the server.

• A TPDU (transport protocol data unit) is a message sent from transport entity to transport entity.

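[Figure omitted: a frame consists of a frame header and a frame payload. The frame payload holds a packet (packet header plus packet payload), and the packet payload holds a TPDU (TPDU header plus TPDU payload).]

Fig. 6-4. Nesting of TPDUs, packets, and frames.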



• Note: transport service users do not need to know about the packet nesting and the associated error detection/correction mechanisms. They simply see a reliable link.

• DISCONNECT can be asymmetric, where either transport user can issue a DISCONNECT primitive and the whole connection is released, or symmetric, where each direction is closed separately.

• Termination may be abrupt (with loss of data), or graceful. Some TPs allow full-duplex and half-duplex connections.

5.1.5 Berkeley Sockets


Primitive    Meaning
---------    -----------------------------------------------------------
SOCKET       Create a new communication end point
BIND         Attach a local address to a socket
LISTEN       Announce willingness to accept connections; give queue size
ACCEPT       Block the caller until a connection attempt arrives
CONNECT      Actively attempt to establish a connection
SEND         Send some data over the connection
RECEIVE      Receive some data from the connection
CLOSE        Release the connection

Fig. 6-6. The socket primitives for TCP.

• Berkeley UNIX (BSD) developed the sockets interface for users to program at the transport service level.

• A successful socket call returns an ordinary file descriptor for use in following calls (i.e. network access is “file system mapped”).

• A server would execute the first 4 calls in the order shown, starting with a SOCKET create call. The BIND call assigns an address for the socket.

• A client also creates a socket and then does a CONNECT call in an attempt to access a remote server (of known address).

• Once a link is established, the client and server exchange data.



• When finished, both must execute a CLOSE to release the symmetric connection.
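The call sequence above can be sketched with Python's standard socket module, which wraps the BSD socket calls. This is a minimal illustration, not the course's demonstration code: the helper names run_echo_server and echo_once, the loopback address and the buffer size of 1024 bytes are all assumptions made for the example.

```python
import socket
import threading


def run_echo_server(listener: socket.socket) -> None:
    """Accept one connection and echo data back until the peer closes."""
    conn, _peer = listener.accept()          # ACCEPT: block until a client connects
    with conn:
        while True:
            data = conn.recv(1024)           # RECEIVE
            if not data:                     # empty read: the peer closed its side
                break
            conn.sendall(data)               # SEND


def echo_once(message: bytes) -> bytes:
    """Run a one-shot echo server on the loopback interface and talk to it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # SOCKET
    listener.bind(("127.0.0.1", 0))          # BIND: port 0 lets the OS pick a free port
    listener.listen(1)                       # LISTEN: queue size 1
    port = listener.getsockname()[1]
    server = threading.Thread(target=run_echo_server, args=(listener,))
    server.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)    # SOCKET
    client.connect(("127.0.0.1", port))      # CONNECT
    client.sendall(message)                  # SEND
    client.shutdown(socket.SHUT_WR)          # tell the server no more data is coming
    reply = b""
    while True:
        chunk = client.recv(1024)            # RECEIVE
        if not chunk:
            break
        reply += chunk
    client.close()                           # CLOSE (client side)
    server.join()
    listener.close()                         # CLOSE (server side)
    return reply
```

Calling echo_once(b"hello") should return the same bytes. The server side uses SOCKET, BIND, LISTEN and ACCEPT in that order, while the client only needs SOCKET and CONNECT.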

5.1.6 Addressing

• To transfer data, we need the address information of the destination:

– User identification;

– Transport protocol identification;

– Network address of destination station.

• The source usually calls a certain TP with a (Station, Port) pair, e.g. the Internet uses (IP address, local port) [7].

• We use the architecture-neutral term Transport Service Access Point (TSAP). There is also an underlying Network Service Access Point (NSAP).

[Figure omitted: Host 1 and Host 2 are each drawn as a stack of application, transport, network, data link and physical layers. An application process on one host and a server on the other attach to TSAPs (TSAP 6 and TSAP 122 in the figure), where the transport connection starts. The network connection starts at the NSAPs below them.]

Fig. 6-8. TSAPs, NSAPs, and connections.

[7] ATM uses (AAL, SAP); AAL = ATM adaptation layer, SAP = Service Access Point.



• How does the source know the address of the destination?

– The user knows the address ahead of time;

– The destination has a “well-known” address;

– A name server is used (it must be listening on a well-known address);

– The target is created at request time. The request goes to a well-known server, which creates the target and passes it the connection.

• In BSD UNIX, the inetd daemon is started at boot-time and it listens for incoming requests (see Fig. 6-9). Also see the man page for inetd.

[Figure omitted: two panels, (a) and (b), each showing Host 1 and Host 2 at layer 4 with their TSAPs, a user process, a process server and a time-of-day server.]

Fig. 6-9. How a user process in host 1 establishes a connection with a time-of-day server in host 2.

• If a name server is used, then new services will require a means of registering their availability to the name server.

5.1.7 Multiplexing

• If there are multiple users on one station, the TP must differentiate data to/from each user and each connection.

• Thus, the TP must multiplex/demultiplex data to/from the network layer.

• Aside: an example of the use of multiplexing also appears in networks with high delays such as satellite links, where a user with high throughput needs could multiplex data across multiple open network connections.

5.1.8 Example Transport Protocol – TCP

• TCP stands for Transmission Control Protocol.

• Used on top of DoD’s Internet network, which is unreliable, non-sequencing and connectionless.

• Provides a full-duplex reliable connection between two transport users.

• Also provides “out-of-band” (urgent) data transfer between users.

• A user can “push” any data waiting to be transmitted at the transport layer.

• TCP uses a sliding window method to provide both flow and error control.

• 3-way connection handshake, exchanging window sizes, sequence numbers, and other connection parameters.

• Other QoS such as delay, precedence etc. are handled at the network layer.

• TCP fragments and reassembles data to fit the MTU (maximum transmission unit).

• TCP reorders arriving data according to the sequence numbers.

• The TCP header is 20 bytes long when no options are present:



The header is 32 bits wide; each row below is one 32-bit word:

 Source port | Destination port
 Sequence number
 Acknowledgement number
 TCP header length | URG ACK PSH RST SYN FIN | Window size
 Checksum | Urgent pointer
 Options (0 or more 32-bit words)
 Data (optional)

Fig. 6-24. The TCP header.

• As TCP uses IP, which also has a 20 byte header, the header overhead per packet is 40 bytes, not including data link layer headers.

5.2 TCP/IP demonstration client

5.2.1 Introduction

• From “Internetworking With TCP/IP” Vol. 3 (BSD socket version) by Douglas Comer & David Stevens, Prentice Hall.

• Provides peer–to–peer communication.

• Common approach is to use client–server paradigm:

– Because TCP/IP does not provide any mechanisms that automatically create running programs when a message arrives, a program must be waiting to accept communication before any requests arrive.

– Here, an application program that initiates peer–to–peer communication is called a client.

– Similarly, a server is any program that waits for an incoming communication request from a client, and then performs any necessary computation in order to send a reply.

5.2.2 Privilege and Complexity

• Servers may provide controlled access to information that belongs to the operating system or users, hence protection of data is essential.

• Authentication: verifying the identity of the client.

• Authorisation: determining whether a particular client is permitted access to a particular service.

• Data security — do not want data unintentionally revealed or compromised.

• Privacy.

• Protection — guarantee network applications cannot abuse system resources.

5.2.3 Standard versus nonstandard clients

• Standard application services consist of those services that are defined by TCP/IP and are assigned well–known, universally recognised protocol port identifiers.

• All other services may be considered to be locally–defined application services or nonstandard services.

• See definitions in file /etc/services on BSD UNIX hosts (sample also available at

http://mirriwinni.cs.edu.au/˜phillip/intro2cn/services ).

5.2.4 Connectionless v connection–oriented SVRs

• If the client and server communicate using UDP (User Datagram Protocol), the interaction is connectionless.

• If the client and server communicate using TCP (Transmission Control Protocol), the interaction is connection–oriented and error detection and correction is handled.



5.2.5 Program Interface to Protocols

• Loosely specified protocol software interface: details of how application software should interface with TCP/IP protocol software are not specified, but the required functionality is suggested.

• Only a few interfaces exist. One is the socket interface (or sockets) defined for the Berkeley UNIX Operating System (BSD), and another is the TLI (Transport Layer Interface) defined by AT&T.

• In the PC world, the winsock library originally provided access to Berkeley style sockets.

5.2.6 Interface Functionality

• Allocate local resources for communication.

• Specify local and remote communications endpoints.

• Initiate a connection (client side).

• Wait for an incoming connection (server side).

• Send or receive data.

• Determine when data arrives.

• Generate urgent data.

• Handle incoming urgent data.

• Terminate a connection gracefully.

• Handle connection termination from the remote side.

• Abort communication.

• Handle error conditions or a connection abort.

• Release local resources when communication finished.



5.2.7 System Calls

• A system call can be thought of as a function call made to a subroutine supplied as part of the operating system.

• The system subroutine was written as part of the OS so it should be able to protect itself and the network from incorrect calls.

• Design approaches: create new system calls, or extend existing system calls to handle networking.

• The BSD sockets interface extends the file system handling system calls to also handle networking in an open–read–write–close paradigm.

• Sample source code, based on examples given in the text of Douglas Comer & David Stevens, is available at URL

http://mirriwinni.cs.edu.au/˜phillip/intro2cn/tcpip-demo

5.2.8 BSD Tutorial

• An Advanced 4.4BSD Interprocess Communication Tutorial is available on-line at URL

http://mirriwinni.cs.edu.au/˜phillip/intro2cn/BSD



6 TCP/IP Protocols

6.1 Introduction

We consider the following topics:

• Review of TCP/IP Layering;

• Overview of User Datagram Protocol (UDP);

• Reliable Stream Transport Service (TCP).

Additional References:

• “Internetworking With TCP/IP Vol. 1”, D.E. Comer, 2nd edition, Prentice-Hall, 1991. ISBN 0-13-468505-9.

• “Computer Networks and Internets”, D.E. Comer, Prentice-Hall, 1997. ISBN 0-13-599010-6.

6.2 Review of TCP/IP Layering

• TCP/IP protocols are organised into 5 conceptual layers:

◦ Layer 1: Physical. Basic network hardware.

◦ Layer 2: Network Interface. Organises data into frames and transmits them.

◦ Layer 3: Internet. Specifies the packet format and handles packet forwarding through one or more routers to the final destination.

◦ Layer 4: Transport. Handles reliable transfer.

◦ Layer 5: Applications.



6.3 User Datagram Protocol

• TCP/IP is capable of transferring IP datagrams amongst host computers.

• At the IP layer, no further distinction other than IP address specifies the user or application to receive the datagram.

• How can multiple destinations at a host be specified?

◦ The mechanism that was developed came from the BSD world.

◦ BSD UNIX is a multiprocessing operating system where executing programs, referred to as processes or tasks, can be part of the OS or can be user level tasks.

• A task was chosen as the ultimate destination on the host computer. However, this was not the total answer:

◦ Tasks are dynamic, i.e. they are continuously created and destroyed, so the sender is unlikely to know the ID of the recipient task;

◦ We would prefer to be able to change the recipient task without informing the sender (maybe the recipient task needs restarting);

◦ We would prefer to specify a destination task according to a service rather than its ID number.

• The solution developed uses a set of abstract destination points called protocol ports [8].

◦ A destination can therefore be specified as (host name, port number).

◦ Observe that we have said nothing about a user. Any further specification of the recipient will require an additional authentication service. This approach keeps things simple.

◦ The solution makes use of the underlying IP for basic (unreliable) packet delivery.

• Operating systems provide synchronous access to ports.

◦ If a task attempts to access data before it arrives, that task is blocked (put to sleep) and when data becomes available, the OS restarts it.

◦ Ports are buffered by the OS (finite size buffer).

[8] These are the service access points mentioned earlier in semester 1.



• When communicating with a remote port, each message carries a destination port number on the foreign machine and a source port number on the source machine to which replies are to be addressed.

◦ Thus, any task that receives a message can also reply.

• The User Datagram Protocol (UDP) provides unreliable connectionless delivery service using IP to transfer messages between hosts. It adds the ability to distinguish among multiple destinations within a single host.

◦ An application that uses UDP must handle the problem of reliability, including message loss, duplication, delay, out-of-order delivery, etc.

◦ These problems are often underestimated by developers who prototype on highly reliable, low delay private LANs.

• Conceptual layering:

 Application
 User Datagram (UDP)
 Internet (IP)
 Network Interface

• Format of UDP Messages:

 Bits 0-15              | Bits 16-31
 UDP Source Port        | UDP Destination Port
 UDP Message Length     | UDP Checksum
 Data (bits 0-31) ...

◦ A 64-bit header is followed by data.

◦ Four 16-bit fields specify source and destination port numbers, the message length (bytes in header+data), and checksum.

◦ The source port is optional; it is used only if replies are needed.

◦ The checksum is optional. However, as IP does not compute a checksum on its data payload, this checksum should be used if data integrity is to be checked.

◦ A UDP datagram is encapsulated inside an IP datagram for transmission, i.e. UDP prepends a header to user data and passes this new packet to the IP layer, which prepends an IP header and passes this new(er) packet to the network interface layer.
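The 64-bit header layout described above can be illustrated with Python's struct module. This is only a sketch: the function names build_udp_datagram and parse_udp_datagram are invented for the example, and the checksum is left at zero (meaning "not computed") rather than calculated.

```python
import struct

# Four 16-bit fields in network (big-endian) byte order:
# source port, destination port, message length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes,
                       checksum: int = 0) -> bytes:
    """Prepend a UDP header to the payload (checksum 0 = not computed)."""
    length = UDP_HEADER.size + len(payload)      # bytes in header + data
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_datagram(datagram: bytes) -> dict:
    """Split a UDP datagram into its header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(datagram)
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER.size:length],
    }
```

For example, a 2-byte payload gives a message length of 10, because the header itself always contributes 8 bytes.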

• As usual, the layered structure provides a (fairly) clear separation of duties:



◦ IP is responsible for transferring data between a pair of hosts;

◦ UDP is responsible for differentiating among multiple sources and destinations within one host.

◦ There is a minor violation of the layering principle: UDP must know the IP addresses in order to fully specify a source and destination, which means that the UDP layer must interact with the IP layer to a very limited extent.

6.4 UDP Multiplexing

• A common feature of layered protocol structures is that a layer can multiplex its services to multiple users at the next layer up.

• In UDP, an obvious way to multiplex is via port numbers.

◦ In practice, an application program must negotiate with the OS to obtain a protocol port and an associated port number before it can send a UDP datagram.

◦ The Socket Abstraction developed in BSD provides the programmer with functions needed to send and receive data.

• When processing input, UDP accepts incoming datagrams from the IP layer and demultiplexes according to UDP destination port.

• If local host software has negotiated with the OS to accept incoming UDP datagrams on a given port, the OS will have set up queues to hold this data.

◦ Demultiplexed UDP packets are placed in their corresponding queues.

◦ If no local task has requested that it be passed UDP packets on a certain port, demultiplexing cannot occur and the UDP layer returns a message (an ICMP “port unreachable” message) to say that the port is unreachable.

• How are port numbers assigned?

6.5 UDP Port Number Allocation

• Port numbers could be permanently assigned to a particular service by a central authority, or they could be organised dynamically.



• The BSD people chose a hybrid solution.

• The 16-bit numbers have some constant assignments for services for the very small values 0-1023. These are known as the well-known port assignments.

• The Registered Ports are those from 1024 through 49151.

• The Dynamic and/or Private Ports are those from 49152 through 65535.

• Some common UDP services are:

tcpmux        1/udp                    #TCP Port Service Multiplexer
echo          7/udp
discard       9/udp     sink null
systat        11/udp    users          #Active Users
daytime       13/udp
ftp-data      20/udp                   #File Transfer [Default Data]
ftp           21/udp                   #File Transfer [Control]
ssh           22/udp                   #Secure Shell Login
telnet        23/udp
#             24/udp    any private mail system
smtp          25/udp    mail           #Simple Mail Transfer
time          37/udp    timserver
nameserver    42/udp    name           #Host Name Server
nicname       43/udp    whois
xns-time      52/udp                   #XNS Time Protocol
domain        53/udp                   #Domain Name Server
whois++       63/udp
sql*net       66/udp                   #Oracle SQL*NET
bootps        67/udp    dhcps          #Bootstrap Protocol Server
bootpc        68/udp    dhcpc          #Bootstrap Protocol Client
tftp          69/udp                   #Trivial File Transfer
finger        79/udp
http          80/udp    www www-http   #World Wide Web HTTP
kerberos-sec  88/udp    kerberos       # krb5 # Kerberos (v5)
hostname      101/udp   hostnames      #NIC Host Name Server
pop2          109/udp   postoffice     #Post Office Protocol - Version 2
pop3          110/udp                  #Post Office Protocol - Version 3
sunrpc        111/udp   rpcbind        #SUN Remote Procedure Call
sqlserv       118/udp                  #SQL Services
nntp          119/udp   usenet         #Network News Transfer Protocol
#x11          6000-6063/udp  X Window System

• You can list port numbers on BSD (and compatible) operating systems with a command such as fgrep udp /etc/services . See also http://www.isi.edu/in-notes/iana/assignments/port-numbers

◦ Notice UDP port 1 is titled TCP Port Service Multiplexer. Connections on this fixed port number are used to organise dynamic port numbers.

◦ Notice the UDP port range 6000-6063. This entry is commented out but it reminds users that the X Window System is allocated TCP ports in this range. As we shall see, TCP provides similar port multiplexing and the convention is that UDP and TCP port numbers are allocated the same.
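The three port ranges given in this section can be captured in a few lines of Python. This is an illustrative sketch only; the function name classify_port and the returned labels are made up for the example.

```python
def classify_port(port: int) -> str:
    """Return the allocation class of a 16-bit UDP or TCP port number."""
    if not 0 <= port <= 65535:
        raise ValueError("port numbers are 16-bit values in the range 0-65535")
    if port <= 1023:
        return "well-known"        # fixed assignments such as 53 (domain)
    if port <= 49151:
        return "registered"        # e.g. the X Window System range 6000-6063
    return "dynamic/private"       # handed out on demand by the OS
```

For instance, classify_port(53) gives "well-known" and classify_port(6000) gives "registered".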

6.6 Reliable Stream Transport Service (TCP)

• A reliable stream transport service is provided so that applications need not handle error detection and correction. Problems addressed: packet loss; out-of-order delivery; delay; duplication; optimal packet size.

◦ The Transmission Control Protocol (TCP) defines this Internet service.

◦ The protocol designers have attempted to find a general purpose solution to the problems of providing reliable stream delivery. This gives a cleaner separation of applications software from underlying networking software, and assists in debugging.

• Properties of a reliable delivery service:

◦ Stream orientation: an application program can send a stream of bits, with exactly these data bits being received by the recipient.

◦ Virtual Circuit Connection: the sender can “place a call” to a possible recipient, with the protocol handling the call setup. After data is transferred, the protocol also handles the disestablishment of the link or call teardown.

◦ Buffered Transfer: the sender can inject data into the link in any size that it finds convenient, and the receiver can extract data from the link in its preferred size.

To make transfer more efficient, the protocol collects enough data from a stream to fill a reasonably large datagram before transmitting it.

For applications where data must be delivered even though it does not fill a buffer, TCP provides a push facility.

◦ Unstructured Stream: application programs must agree on the structure of the data transferred.

◦ Full Duplex Connection: concurrent data transfer can occur in both directions.

6.7 Providing Reliability

• Most reliable protocols use a single fundamental technique known as positive acknowledgement with retransmission.

• The technique requires that a recipient sends an acknowledgement back to the sender as it receives data.

• The sender keeps a record of each packet sent, and waits for an acknowledgement while running a wait timer.

• A successful transaction, in its simplest form, has the sender and recipient exchanging packets.

◦ Packet loss is detected by a timeout, and results in a retransmission.

◦ Packet duplication [9] can be handled by including a sequence number in each packet transferred.

• If the sender and recipient communicate in turn, the link is effectively half duplex and much bandwidth is lost.

• The idea of sliding windows is that the sender can transmit a number of packets before waiting for acknowledgement.

◦ This can be thought of as a system where a number of packets are ready for transmission, and any window of 8 of them may be accessed.

◦ Only after a packet is acknowledged can the window be moved further along to uncover the next packets for transmission.

Initial window:             [1 2 3 4 5 6 7 8] 9 10

After packet 1 is ACKed:     1 [2 3 4 5 6 7 8 9] 10     (window slides →)

[9] Varying delays might mean a packet could be resent and two copies be received.



• Network utilisation is much improved as the protocol can essentially keep the network saturated with packets.
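To see why a window helps, the following Python sketch simulates positive acknowledgement with retransmission in a deliberately simplified, round-based model. The model is an assumption made for illustration: in each round (one timeout interval) the sender transmits every unacknowledged packet in its window, and a packet listed in lost_first_try is lost the first time it is sent and delivered on the retry. The function name and parameters are hypothetical.

```python
def sliding_window_send(num_packets, window, lost_first_try=frozenset()):
    """Return the transmission log as a list of (round, packet) pairs."""
    if window < 1:
        raise ValueError("window must hold at least one packet")
    base = 1              # oldest packet not yet acknowledged (left window edge)
    acked = set()         # packets for which an acknowledgement has arrived
    attempts = {}         # packet -> number of times it has been transmitted
    log = []
    round_no = 0
    while base <= num_packets:
        round_no += 1
        last = min(base + window - 1, num_packets)
        for pkt in range(base, last + 1):
            if pkt in acked:
                continue                       # already delivered; do not resend
            attempts[pkt] = attempts.get(pkt, 0) + 1
            log.append((round_no, pkt))
            delivered = not (pkt in lost_first_try and attempts[pkt] == 1)
            if delivered:
                acked.add(pkt)                 # the recipient acknowledges it
        while base in acked:                   # slide past acknowledged packets
            base += 1
    return log
```

With 10 packets and a window of 8, the log finishes in round 2, whereas a window of 1 (sender and recipient communicating in turn) needs 10 rounds.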

6.8 What does TCP provide?

• The protocol specifies the format of the data and acknowledgements that two computers exchange to achieve reliable transfer, and the procedures to ensure that data arrives correctly.

• The protocol does not prescribe how application programs use TCP, and does not prescribe how TCP uses the underlying services.

◦ This allows TCP to be implemented on a variety of underlying communications systems.

• Like UDP, TCP resides above the IP layer.

• TCP allows multiple applications to communicate simultaneously.

• Like UDP, TCP uses protocol port numbers to identify the ultimate destination within a machine.

• Conceptual layering:

 Application
 Reliable Stream (TCP) | User Datagram (UDP)
 Internet (IP)
 Network Interface

• However, TCP is built on a connection abstraction in which objects are identified as virtual circuit connections, and not just end points.

◦ Since TCP identifies a connection by a pair of endpoints, a given TCP port number can be shared by multiple connections on the same machine.

6.9 TCP Connections

• Both end points must agree to cooperate.

• There are Passive and Active Opens.



• An application at one end performs a passive open by informing its OS that it will accept an incoming connection. The OS assigns a TCP port number (and allocates necessary resources).

• An application at the other end can then perform an active open request to establish the connection.

6.10 Segments and Streams

• TCP views its data stream as a sequence of octets or bytes which it divides into segments for transmission. A segment will usually travel in a single IP datagram.

• TCP uses a sliding window mechanism to provide efficient transmission and flow control.

◦ A sliding window mechanism makes it possible to send multiple segments before an acknowledgement is received → keeps network busy so throughput is increased.

◦ The use of acknowledgements still allows a receiver to restrict data flow as required by its needs (buffers, data use).

◦ The TCP sliding window mechanism operates at the octet/byte level and not the segment or packet level. The following diagram shows bytes as Bn:

Current Window

B1 B2 B3 B4 B5 B6 B7 B8 B9 B10

↑ ↑ ↑

The pointer at the left separates the bytes to its left, which have been sent and acknowledged, from those still awaiting acknowledgement. The pointer at the right marks the right edge of the sliding window, indicating the highest byte in the window that could be sent. The middle pointer separates bytes sent from those not yet sent.

◦ The protocol sends bytes in the window as soon as possible, so the window shown here moves rapidly to the right.

◦ Recall that acknowledgements travel back from the receiver to the transmitter, in a reverse link. In effect, each end must manage two sets of sliding windows: one set for its outgoing data (where it determines what to send or resend next), and one set for its incoming data (where at a minimum it determines what to acknowledge).

6.11 Variable Window Size and Flow Control

• TCP supports a time varying window size.

• Each acknowledgement contains a window advertisement that specifies how many additional octets of data the receiver is prepared to receive.

◦ We now have a feedback mechanism which assists the sender in not exceeding the capacity of the receiver.

◦ In response to an increase in window size, a sender now sees more bytes in its window and proceeds to transmit them. In response to a decrease in window size, a sender may need to temporarily stop sending if its window shrinks back past its middle pointer.

• This adaptive windowing mechanism helps the internet handle hosts and gateways of various speeds and capacities.

◦ The flow control ensures that the end point hosts do not have buffer overflows etc.

◦ The intermediate gateways must also avoid data loss. If this happens, it is called congestion. A TCP implementation must be tuned so that it can detect and recover from congestion, as an inappropriate retransmission scheme can make losses worse.
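The three window pointers and the effect of a window advertisement can be sketched as a small Python class. This is a simplified model under stated assumptions: byte positions count from 0, the class name SendWindow and its methods are hypothetical, and retransmission is ignored.

```python
class SendWindow:
    """Byte-level send window with left, middle and right pointers."""

    def __init__(self, window: int):
        self.acked = 0          # left pointer: bytes [0, acked) sent and acknowledged
        self.sent = 0           # middle pointer: bytes [acked, sent) sent, awaiting ACK
        self.window = window    # right pointer is acked + window

    def sendable(self) -> int:
        """Bytes that may still be sent before the right pointer is reached."""
        return max(0, self.acked + self.window - self.sent)

    def send(self, nbytes: int) -> int:
        """Send as much of nbytes as the window allows; return the count sent."""
        n = min(nbytes, self.sendable())
        self.sent += n
        return n

    def on_ack(self, ack_number: int, advertised_window: int) -> None:
        """Process an acknowledgement carrying a new window advertisement."""
        if ack_number > self.sent:
            raise ValueError("acknowledgement for bytes that were never sent")
        self.acked = max(self.acked, ack_number)   # the left pointer only moves right
        self.window = advertised_window            # the receiver may grow or shrink it
```

If the advertised window shrinks so that the right pointer falls behind the middle pointer, sendable() returns 0 and the sender must pause, as described above.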

6.12 TCP Segment Format

• The basic unit of data exchanged in TCP is the segment.

◦ Segments are exchanged to establish connections, transfer data, carry acknowledgements, advertise window sizes, and close connections.

• A segment contains a TCP header followed by data:



 SOURCE PORT | DESTINATION PORT
 SEQUENCE NUMBER
 ACKNOWLEDGEMENT NUMBER
 HLEN / RESERVED / CODE BITS | WINDOW
 CHECKSUM | URGENT POINTER
 OPTIONS (if any) | PADDING
 DATA

◦ Fields SOURCE PORT and DESTINATION PORT contain TCP port numbers which identify the application programs at the end points.

◦ The SEQUENCE NUMBER identifies the position in the sender’s byte stream of the data in this segment.

◦ The ACKNOWLEDGEMENT NUMBER identifies the number of the octet that the sender of this segment expects to receive next (i.e. earlier bytes are OK).

◦ The HLEN integer specifies the segment header length in units of 32 bits. It is needed because the OPTIONS field (and hence the header) has variable length.

◦ The 6-bit CODE BITS specify the purpose and contents of the segment. From left to right, these bits are:

1. URG - urgent pointer field is valid

2. ACK - acknowledgement field is valid

3. PSH - this segment requests a push

4. RST - reset the connection

5. SYN - synchronise sequence numbers

6. FIN - sender has reached end of its byte stream

◦ The WINDOW field advertises how much data the sender of the packet is willing to receive.
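The field layout above can be decoded with Python's struct module. The sketch below is illustrative only: the function name parse_tcp_header and the dictionary keys are invented, and it assumes the standard layout in which HLEN occupies the top 4 bits and the six code bits the bottom 6 bits of the fourth 32-bit word's first half.

```python
import struct

# Code bits from left (most significant) to right, as listed in the notes.
FLAG_NAMES = ("URG", "ACK", "PSH", "RST", "SYN", "FIN")


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header and split off options and data."""
    (src_port, dst_port, seq, ack,
     hlen_and_bits, window, checksum, urgent) = struct.unpack_from("!HHIIHHHH", segment)
    hlen_words = hlen_and_bits >> 12          # header length in 32-bit words
    code_bits = hlen_and_bits & 0x3F          # the low 6 bits are the code bits
    flags = [name for i, name in enumerate(FLAG_NAMES) if code_bits & (0x20 >> i)]
    header_len = hlen_words * 4               # header length in bytes
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": flags,
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
        "options": segment[20:header_len],
        "data": segment[header_len:],
    }
```

A segment whose HLEN is 5 has no options, so its data begins immediately after the 20-byte fixed header.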

6.12.1 Out of Band Data

• Out of band data is data that the sender wishes to have handled as soon as possible, i.e. out of order.

◦ Example use: typing an interrupt command in a telnet window linked to a remote Unix host.

• The URG bit is used to indicate that urgent data is present, with its location in the window given by the URGENT POINTER.



6.12.2 Maximum Segment Size

• Both ends need to agree on the maximum segment size they will transfer (this impacts on resource allocation at the end points).

• The OPTIONS header field is used in negotiations, where each end specifies the maximum segment size (MSS) that it is willing to receive.

• For better efficiency, an MSS should be chosen so that the resulting IP datagram matches the MTU of the underlying network.

◦ If an MSS is chosen too small: network utilisation suffers (a datagram of 41 bytes, of which 40 bytes are TCP and IP headers, is very inefficient).

◦ If an MSS is chosen too large: the IP datagram that encapsulates the segment (and which is itself encapsulated in a network frame) may need to be fragmented in order to fit into the frame. This results in extra fragmentation and reassembly overheads, and a single lost fragment means the whole segment must be resent (IP fragments are not acknowledged or retransmitted individually).

• An optimum segment size S occurs when the IP datagrams carrying the segments are as large as possible without requiring fragmentation anywhere along the path from the source to the destination.

• Can an optimal S be found?

◦ There is no probing mechanism built into TCP.

◦ Also, the network routes can be time varying, in response to topology changes (nodes may fail or come on-line) or congestion avoidance (time outs cause alternative routes to be used).

◦ There is currently no standard way to find S.
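The trade-off can be made concrete with a little arithmetic in Python. This sketch assumes the 20-byte TCP and 20-byte IP headers mentioned earlier (no options), and the function names are hypothetical. The MTU of 1500 bytes used in the example is the usual Ethernet value, which is an assumption not stated in these notes.

```python
IP_HEADER = 20     # bytes, IP header without options
TCP_HEADER = 20    # bytes, TCP header without options


def mss_for_mtu(mtu: int) -> int:
    """Largest segment data size whose IP datagram still fits in one frame."""
    return mtu - IP_HEADER - TCP_HEADER


def efficiency(data_bytes: int) -> float:
    """Fraction of each IP datagram that is user data."""
    return data_bytes / (data_bytes + IP_HEADER + TCP_HEADER)
```

For an MTU of 1500 bytes, mss_for_mtu(1500) is 1460 and the efficiency is about 97%, whereas a segment carrying a single data byte has an efficiency of only 1/41.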

6.12.3 TCP Checksum Computation

• A 16-bit arithmetic checksum of a segment allows the receiver to verify that the TCP header and data have been received without error.

• To allow the TCP protocol to verify the correct source and destination host, the following information is prepended to the segment for the purposes of checksum calculations:



 SOURCE IP ADDRESS
 DESTINATION IP ADDRESS
 ZERO | PROTOCOL | TCP LENGTH
 SOURCE PORT | DESTINATION PORT
 SEQUENCE NUMBER
 ACKNOWLEDGEMENT NUMBER
 remainder of segment follows
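A sketch of this calculation in Python follows. It assumes the standard Internet checksum (the 16-bit one's complement of the one's complement sum of all 16-bit words) and IP protocol number 6 for TCP, neither of which is spelled out in these notes. The function names are hypothetical, and the segment passed to tcp_checksum is expected to have its CHECKSUM field set to zero.

```python
import struct


def ones_complement_sum16(data: bytes) -> int:
    """One's complement sum of the 16-bit words in data (zero-padded if odd)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return total


def pseudo_header(src_ip: bytes, dst_ip: bytes, segment: bytes) -> bytes:
    """Source IP, destination IP, zero byte, protocol (6 = TCP), TCP length."""
    return src_ip + dst_ip + struct.pack("!BBH", 0, 6, len(segment))


def tcp_checksum(src_ip: bytes, dst_ip: bytes, segment: bytes) -> int:
    """Checksum over pseudo-header plus segment (CHECKSUM field must be zero)."""
    total = ones_complement_sum16(pseudo_header(src_ip, dst_ip, segment) + segment)
    return ~total & 0xFFFF


def verify_tcp_checksum(src_ip: bytes, dst_ip: bytes, segment: bytes) -> bool:
    """A received segment is accepted when the sum over everything is all ones."""
    total = ones_complement_sum16(pseudo_header(src_ip, dst_ip, segment) + segment)
    return total == 0xFFFF
```

The sender stores the result in bytes 16 and 17 of the TCP header. Because the IP addresses are part of the sum, a segment delivered to the wrong host fails verification even if its own bytes are intact.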

6.13 Acknowledgements and Retransmission

• TCP acknowledgements specify the sequence number of the next octet that the receiver expects to receive.

◦ Counting from 0, this is in effect the number of contiguous correct bytes received. This is sometimes called a cumulative scheme.

• Advantages: easy to compute and unambiguous; any loss does not automatically force retransmission.

• Disadvantages: the sender does not receive information about all successful transmission. If the first of n segments was lost, the sender will eventually retransmit the lost segment but if the receiver does not then acknowledge all n segments quickly enough, further unnecessary retransmission is not avoided.

◦ An attempt to avoid this extra retransmission by waiting for an acknowledgement in order to decide how much to retransmit is similar to discarding the moving window mechanism and its advantages. Timing is clearly critical.
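The cumulative scheme can be sketched as a small Python function. The representation is an assumption made for illustration: each received segment is described by a (first byte, one past last byte) pair, counting from 0, and the function name cumulative_ack is hypothetical.

```python
def cumulative_ack(received_ranges) -> int:
    """Return the next octet expected, given (start, end) byte ranges received.

    Each range covers bytes start .. end-1. Only the contiguous run starting
    at byte 0 counts; data beyond a gap is held but not acknowledged.
    """
    next_expected = 0
    for start, end in sorted(received_ranges):
        if start > next_expected:
            break                      # gap: an earlier segment is still missing
        next_expected = max(next_expected, end)
    return next_expected
```

If the first of three 100-byte segments is lost, the acknowledgement stays at 0 even though 200 bytes have arrived, which is the disadvantage noted above. Once the missing segment is retransmitted, a single acknowledgement of 300 covers all three.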

6.14 TCP Timeouts and Retransmission

• TCP is intended for use in an internet environment so it must accommodate time varying delays in segment delivery.

◦ A segment may traverse a single low-delay link (e.g. local network) or it may traverse a cascade of interconnected links. It is impossible to know the characteristics of the link beforehand.

• This adverse environment is handled by using timers on each segment, coupled to adaptive timeout mechanisms.



◦ TCP monitors the performance of each connection and deduces reasonable values for timeouts. As the characteristics of the link change, TCP updates its timeout values.

◦ By measuring the time between segment transmission and segment acknowledgement, TCP obtains a sample round trip time or round trip sample. A new sample value allows the estimate of round trip time (RTT) to be updated.

• An early method to update RTT was to use an exponential forgetting factor, which gives a response curve like a first order RC circuit discharge:

RTT_{i+1} = α RTT_i + (1 − α) RTT_sample

◦ i.e. the α applied to the i-th RTT “discharges” it (scales it down) to give the historical contribution to the (i + 1)-th RTT, for α < 1; and

◦ (1 − α) scales the contribution of the sample measurement to RTT_{i+1}.

• An early method to determine the timeout used a scaled value of RTT, i.e. Timeout = β RTT. Originally, β was set to 2 (too close to 1 and retransmission might occur too easily).

◦ From a control theory point of view, these techniques are very simple but they do allow a system to operate reasonably well in the face of random noise and other unmodelled disturbances.

• RTT sample measurements can be ambiguous: when an acknowledgement arrives after a timeout-triggered retransmission, does it acknowledge the retransmission, or the original segment which just took a very long time to arrive?

◦ If the acknowledgement is for the original segment (long transit time), but the sender assumes it is for the retransmission (shorter measured time), then the updated RTT will be smaller (even though RTT is actually large).

◦ If the acknowledgement is assumed to be for the original segment but is really for the retransmission, the sample is too large and the RTT update is again incorrect.

◦ How can we avoid incorrect updates to RTT?

• Karn’s algorithm allows the previous RTT updates to occur only on timing data from unambiguous acknowledgements [10].

[10] Phil Karn is an amateur radio enthusiast who developed this algorithm to allow TCP operation across a high loss packet radio link.



◦ To handle sharp increases in actual round trip time, Karn’s algorithm also has a timer backoff strategy whereby, if a retransmission occurs:

new timeout = γ × old timeout

where γ is typically 2.

◦ The system’s updates are now chosen according to the internet’s behaviour.

• When the internet is well behaved, the terms α and β control updates of RTT and Timeout.

• When the internet misbehaves, Karn’s algorithm uses γ to control timeout updates. This decouples the estimate of timeout from round trip travel time. Only when an acknowledgement arrives does the update of timeout become linked to RTT estimates.
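The interplay of α, β and γ can be sketched in a few lines of C. This is only an illustration under stated assumptions: the structure `rtt_state`, the function names, and the value α = 0.875 are hypothetical choices, not taken from any particular TCP implementation.

```c
#include <stdbool.h>

/* Hypothetical per-connection timer state; all times in milliseconds. */
struct rtt_state {
    double rtt;      /* smoothed round trip time estimate */
    double timeout;  /* current retransmission timeout    */
};

#define ALPHA 0.875  /* weight of the old estimate (assumed value) */
#define BETA  2.0    /* Timeout = BETA * RTT                       */
#define GAMMA 2.0    /* timer backoff factor                       */

/* An acknowledgement arrived; sample_ms is the measured round trip time. */
void on_ack(struct rtt_state *s, double sample_ms, bool was_retransmitted)
{
    if (was_retransmitted)
        return;  /* Karn: ambiguous sample, keep the backed-off timeout */
    s->rtt = ALPHA * s->rtt + (1.0 - ALPHA) * sample_ms;
    s->timeout = BETA * s->rtt;
}

/* The retransmission timer expired: back off without touching the RTT. */
void on_timeout(struct rtt_state *s)
{
    s->timeout = GAMMA * s->timeout;
}
```

Starting from RTT = 100 ms and Timeout = 200 ms, a timeout doubles the timer to 400 ms; an acknowledgement for a retransmitted segment changes nothing; a clean 200 ms sample then gives RTT = 112.5 ms and Timeout = 225 ms.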

6.15 TCP Links with High Variance in Delay

• What happens if round trip times possess a high variance?

• Research has shown that the previous algorithms do not adapt well in the face of wide variations in delay.

◦ Queueing theory suggests that the variation in RTT, σ, varies in proportion to the inverse of remaining network capacity i.e.

$\sigma \propto \dfrac{1}{1-L}, \quad 0 \le L < 1$

where L is network load.

• Studies suggest that, for our network’s model, the above relationship is an equality and RTT varies by ±2σ.

◦ For L = 0.5, ±2σ = ±2(1/0.5) = ±4 i.e. RTT can vary by a factor of 8.

◦ For L = 0.75, ±2σ = ±2(1/0.25) = ±8 i.e. RTT varies by 16.

◦ A highly loaded network will have widely varying RTT.

• For β = 2, it can be shown that the RTT estimate can only adapt to loads of at most 30%.

• In 1989, TCP implementations were required to estimate both the average and variance of RTT, and to use the estimated variance in place of β, as described by the following recursions:

$RTT_{i+1} = \alpha\, RTT_i + (1-\alpha)\, RTT_{sample}$

$DEV_{i+1} = \alpha\, DEV_i + (1-\alpha)\, |RTT_i - RTT_{sample}|$

◦ where α is a smoothing factor that controls how much weight is given to an old ith value when producing a new (i+1)th value, and the (1 − α) factor controls the weight of new information;

◦ DEV is an estimate of standard deviation which works well (it is actually the smoothed mean deviation).

• Implementation efficiency: choose α as a fraction built from powers of two (1/2, 1/4, 1/8 etc.) so that integer arithmetic can be used.
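As a rough sketch of the two recursions in integer arithmetic, the following C fragment uses α = 7/8 so that (1 − α) becomes a division by 8. The names `rtt_est`, `rtt_est_update` and `rtt_est_timeout` are hypothetical, and the timeout rule RTT + 4·DEV is an assumption borrowed from common practice; the notes above do not give the timeout formula.

```c
/* Hypothetical estimator state; times in microseconds. */
struct rtt_est {
    long rtt;  /* smoothed round trip time   */
    long dev;  /* smoothed mean deviation    */
};

/* Apply both recursions with alpha = 7/8, i.e. x += (new - x) / 8.
 * Integer division truncates toward zero, so small errors are dropped. */
void rtt_est_update(struct rtt_est *e, long sample)
{
    long err = sample - e->rtt;            /* uses the old RTT_i */
    long abs_err = err < 0 ? -err : err;

    e->rtt += err / 8;                     /* RTT_{i+1} */
    e->dev += (abs_err - e->dev) / 8;      /* DEV_{i+1} */
}

/* Assumed timeout rule (not from the notes): mean plus four deviations. */
long rtt_est_timeout(const struct rtt_est *e)
{
    return e->rtt + 4 * e->dev;
}
```

With RTT = 800 and DEV = 80, a sample of 1600 gives RTT = 900 and DEV = 170, so the assumed timeout becomes 1580.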

• This discussion has focussed on how to determine RTT and timeout values in a real network. Question: what should be done when congestion occurs?

6.16 Response to Congestion

• Congestion is a form of severe delay caused by an overload of datagrams at one or more switching points.

◦ When congestion occurs, delays increase and the gateways begin to enqueue datagrams until they can be sent. If buffer capacity is exceeded, packets are dropped.

• Endpoints do not know the details of where or how congestion occurs: they see only increased delay.

• Retransmission in response to increased delay leads to more network traffic, which aggravates congestion and leads to congestion collapse.

• The network gateways can monitor the buffer sizes and signal hosts that congestion has occurred [11].

• To avoid congestion collapse, TCP must reduce transmission rates and it does this by adaptively adjusting its effective window size.

◦ We have stated that the transmit window size depends on the advertised window size sent by the other end in acknowledgements.

[11] Internet Control Message Protocol (ICMP) allows gateways and hosts to exchange error and control messages.



◦ In practice, TCP maintains a second limit called the congestion window limit.

◦ TCP operates with an Allowed window = min{ advertised window, congestion window }. The congestion window is shrunk during times of congestion.

• Multiplicative decrease Congestion Avoidance: Upon loss of a segment, reduce the congestion window by half (minimum 1). For segments remaining in the allowed window, back off the retransmission timer exponentially.

◦ If congestion is assumed, TCP reduces traffic volume exponentially. If loss continues, the result is TCP attempting to send a single datagram (with exponentially increasing timeout).

◦ The idea is to provide significant and fast load reductions to the gateways.

• Slow-Start Recovery: Whenever starting traffic on a new connection or increasing traffic after a period of congestion, start the congestion window at the size of a single segment and increase the window by 1 segment after each acknowledgement arrives.

◦ Once the window reaches half of its size before the congestion (the threshold), a congestion avoidance phase is entered and subsequent increases in window size occur only if all segments in the window are acknowledged.

• AST Figure 6-32 shows a system whose congestion window size was 64K when a timeout occurred. The threshold is set to half this value (32K) while the congestion window size shrinks to 1K at transmission number 0.

◦ The window then grows to 2K, 4K, 8K, 16K, 32K until at transmission 5 congestion avoidance mode starts.

◦ Window size then grows linearly up to 40K when a timeout occurs after transmission 13. The threshold is then set to 40K/2 = 20K, the congestion window returns to 1K, and slow start begins again. If no more problems occur, the window size can grow as large as the advertised window size.



[Figure: congestion window (kilobytes, 0 to 44) plotted against transmission number (0 to 24), with the timeout and the threshold levels marked.]

Fig. 6-32. An example of the Internet congestion algorithm.
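The window sizes in this example can be reproduced with a small C model. It is a sketch under several assumptions: segments are 1K, one call to `on_window_acked` stands for one "transmission" in the figure, and a timeout sets the threshold to half the current window and restarts the window at one segment (the behaviour drawn in Tanenbaum's figure). The names `cong`, `on_window_acked` and `on_loss` are hypothetical.

```c
/* Hypothetical congestion state; window sizes in kilobytes (1K segments). */
struct cong {
    int cwnd;       /* congestion window            */
    int threshold;  /* slow-start threshold         */
};

/* Every segment of the current window was acknowledged. */
void on_window_acked(struct cong *c, int advertised)
{
    if (c->cwnd < c->threshold) {
        c->cwnd *= 2;                   /* slow start: doubles per window */
        if (c->cwnd > c->threshold)
            c->cwnd = c->threshold;
    } else {
        c->cwnd += 1;                   /* congestion avoidance: +1 segment */
    }
    if (c->cwnd > advertised)
        c->cwnd = advertised;           /* never exceed the advertised window */
}

/* A timeout occurred: halve the threshold and restart from one segment. */
void on_loss(struct cong *c)
{
    c->threshold = c->cwnd / 2;
    if (c->threshold < 1)
        c->threshold = 1;
    c->cwnd = 1;
}
```

Starting from a 64K window, `on_loss` gives threshold 32K and window 1K; five acknowledged windows reach 32K, eight more reach 40K, and a second loss gives threshold 20K with the window back at 1K.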

6.17 Open and Close of TCP Connections

• TCP uses a three-way handshake to establish connections and a modified three-way handshake to close them.

• TCP Establish:

◦ The first message is sent with the SYN code bit set, and some sequence number x.

◦ This message is received and a reply (second message) is sent with the SYN code bit set, some sequence number y, and containing acknowledgement x+1.

◦ Finally, the end point setting up the link replies (third message) with a segment containing an acknowledgement.

TCP Open
Events at site 1                      Events at site 2

Send SYN seq=x                   ↘
                                      Receive SYN segment
                                 ↙    Send SYN seq=y, ACK x+1
Receive SYN+ACK segment
Send ACK y+1                     ↘
                                      Receive ACK segment



• TCP Close:

◦ Note that TCP connections are full duplex and we can think of them as two independent unidirectional streams. Once a link is closed, data can no longer travel in that direction (but control segments can still travel in the opposite direction, as can data until that link is also closed).

◦ One further point is that the end which is told to close sends back two messages, one to acknowledge the request, and one to confirm that there is no more data present once the application has been informed.

TCP Close
Events at site 1                      Events at site 2

Send FIN seq=x                   ↘
                                      Receive FIN segment
                                 ↙    Send ACK x+1 (app. informed)
Receive ACK segment
                                 ↙    Send FIN seq=y, ACK x+1 (app. informed)
Receive FIN+ACK segment
Send ACK y+1                     ↘
                                      Receive ACK segment

6.18 Reset of TCP Connections

• TCP must be able to handle exception events where the link must be shut down. Example: an application program may have entered an abnormal state and is being aborted.

◦ To reset a connection, one side sends a segment with the RST bit set in the code word.

◦ The other side responds by aborting.

◦ Data transfer ceases immediately, and all resources are deallocated.

6.19 TCP Protocol FSM

• The operation of TCP can best be explained by a finite state machine model (see Comer Figure 12.13 or Tanenbaum Figure 6-28).

◦ An application starts in the closed state, where it can issue a passive or active open and progress to a new state.

◦ Normal operation from these states leads to the established state.

◦ When the link is shut down, the FSM waits (for longer than twice the segment lifetime) to avoid interference between links.

[Figure: TCP connection management state diagram with states CLOSED, LISTEN, SYN RCVD, SYN SENT, ESTABLISHED, FIN WAIT 1, FIN WAIT 2, CLOSING, TIMED WAIT, CLOSE WAIT and LAST ACK. Transitions are labelled event/action, e.g. CONNECT/SYN, SYN/SYN + ACK, CLOSE/FIN, FIN/ACK.]

Fig. 6-28. TCP connection management finite state machine. The heavy solid line is the normal path for a client. The heavy dashed line is the normal path for a server. The light lines are unusual events.
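The state machine can be written down as a transition function. The C sketch below covers the normal client and server paths plus a few of the unusual transitions; RST handling, the SEND transition out of LISTEN and the combined FIN + ACK transition are left out for brevity. The enum and function names are hypothetical, and the actions (segments sent) appear only as comments.

```c
enum tcp_state { CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED,
                 FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIMED_WAIT,
                 CLOSE_WAIT, LAST_ACK };

enum tcp_event { EV_LISTEN, EV_CONNECT, EV_CLOSE,     /* application calls */
                 EV_SYN, EV_SYN_ACK, EV_ACK, EV_FIN,  /* segments received */
                 EV_TIMEOUT };

/* Return the state that follows s when event e occurs. */
enum tcp_state tcp_next(enum tcp_state s, enum tcp_event e)
{
    switch (s) {
    case CLOSED:
        if (e == EV_LISTEN)  return LISTEN;       /* passive open          */
        if (e == EV_CONNECT) return SYN_SENT;     /* active open: send SYN */
        break;
    case LISTEN:
        if (e == EV_SYN)   return SYN_RCVD;       /* send SYN + ACK */
        if (e == EV_CLOSE) return CLOSED;
        break;
    case SYN_SENT:
        if (e == EV_SYN_ACK) return ESTABLISHED;  /* send ACK (step 3)  */
        if (e == EV_SYN)     return SYN_RCVD;     /* simultaneous open  */
        if (e == EV_CLOSE)   return CLOSED;
        break;
    case SYN_RCVD:
        if (e == EV_ACK)   return ESTABLISHED;
        if (e == EV_CLOSE) return FIN_WAIT_1;     /* send FIN */
        break;
    case ESTABLISHED:
        if (e == EV_CLOSE) return FIN_WAIT_1;     /* active close: send FIN  */
        if (e == EV_FIN)   return CLOSE_WAIT;     /* passive close: send ACK */
        break;
    case FIN_WAIT_1:
        if (e == EV_ACK) return FIN_WAIT_2;
        if (e == EV_FIN) return CLOSING;          /* send ACK */
        break;
    case FIN_WAIT_2:
        if (e == EV_FIN) return TIMED_WAIT;       /* send ACK */
        break;
    case CLOSING:
        if (e == EV_ACK) return TIMED_WAIT;
        break;
    case TIMED_WAIT:
        if (e == EV_TIMEOUT) return CLOSED;
        break;
    case CLOSE_WAIT:
        if (e == EV_CLOSE) return LAST_ACK;       /* send FIN */
        break;
    case LAST_ACK:
        if (e == EV_ACK) return CLOSED;
        break;
    }
    return s;  /* events not modelled here leave the state unchanged */
}
```

Feeding the events CONNECT, SYN + ACK, CLOSE, ACK, FIN, timeout through `tcp_next` from CLOSED traces the normal client path; LISTEN, SYN, ACK, FIN, CLOSE, ACK traces the normal server path.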

6.20 Forced Data Delivery

• The push operation allows an application to force delivery of data in the stream, even if the utilisation of the underlying datagram will be inefficient.

• This allows interactive applications to achieve a better response.



6.21 Reserved TCP Port Numbers

• TCP combines static and dynamic port binding, using a set of well known port assignments for commonly invoked programs.

◦ As mentioned for UDP, port numbers of UDP and TCP are usually the same even though they are independently assigned.

◦ Some services may be connected to via either UDP or TCP e.g. domain name server (DNS).

• Some common TCP services are:

tcpmux          1/tcp                      #TCP Port Service Multiplexer
echo            7/tcp
discard         9/tcp      sink null
systat          11/tcp     users           #Active Users
daytime         13/tcp
chargen         19/tcp     ttytst source   #Character Generator
ftp-data        20/tcp                     #File Transfer [Default Data]
ftp             21/tcp                     #File Transfer [Control]
ssh             22/tcp                     #Secure Shell Login
telnet          23/tcp
smtp            25/tcp     mail            #Simple Mail Transfer
nameserver      42/tcp     name            #Host Name Server
nicname         43/tcp     whois
domain          53/tcp                     #Domain Name Server
bootps          67/tcp     dhcps           #Bootstrap Protocol Server
bootpc          68/tcp     dhcpc           #Bootstrap Protocol Client
tftp            69/tcp                     #Trivial File Transfer
gopher          70/tcp
finger          79/tcp
http            80/tcp     www www-http    #World Wide Web HTTP
kerberos-sec    88/tcp     kerberos        # krb5 # Kerberos (v5)
hostname        101/tcp    hostnames       #NIC Host Name Server
pop2            109/tcp    postoffice      #Post Office Protocol - Version 2
pop3            110/tcp                    #Post Office Protocol - Version 3
sunrpc          111/tcp    rpcbind         #SUN Remote Procedure Call
audionews       114/tcp                    #Audio News Multicast
nntp            119/tcp    usenet          #Network News Transfer Protocol
ntp             123/tcp                    #Network Time Protocol
imap            143/tcp    imap2 imap4     #Interim Mail Access Protocol v2
snmptrap        162/tcp    snmp-trap
xdmcp           177/tcp                    #X Display Manager Control Protocol
bgp             179/tcp                    #Border Gateway Protocol
ris             180/tcp                    #Intergraph
appleqtc        458/tcp                    #apple quick time
kpasswd5        464/tcp                    # Kerberos (v5)
klogin          543/tcp                    # Kerberos (v4/v5)
kshell          544/tcp    krcmd           # Kerberos (v4/v5)
dhcpv6-client   546/tcp                    #DHCPv6 Client
dhcpv6-server   547/tcp                    #DHCPv6 Server
imap4-ssl       585/tcp                    #IMAP4+SSL (use of 585 is not recommended,
nfsd            2049/tcp   nfs             # NFS server daemon
hylafax         4559/tcp                   #HylaFAX client-server protocol

• A note on some TCP services:

◦ ftp (file transfer protocol) operates with two connections: ftp for control and ftp-data for data.

◦ ssh (secure shell) and telnet provide remote login to UNIX systems; ssh is preferred as it supports encryption and can also provide additional “pipes” for remote X-windows communication, file transfer, etc.

◦ smtp (simple mail transfer protocol) allows UNIX systems to exchange email, while email delivery to a user’s desktop system from the UNIX server might employ pop (post office protocol v2 or v3) or imap.

◦ bootp (bootstrap protocol) and tftp (trivial file transfer protocol) support the boot up of diskless systems by providing a means to download an OS.

◦ dhcp (dynamic host configuration protocol) and bootp provide network hosts with IP configuration data.

◦ nfs (network file system) is a file networking system originating from Sun Microsystems.

◦ telnet provides connection to remote systems, and it also allows some access to remote servers on specific port numbers via a command format of telnet host service, e.g. telnet cay.cs.jcu.edu.au daytime, telnet cay.cs.jcu.edu.au smtp.

• Further port assignments can be obtained from http://www.isi.edu/in-notes/iana/assignments/port-numbers .

◦ Well Known Ports: 0 through 1023.

◦ Registered Ports: 1024 through 49151.

◦ Dynamic and/or Private Ports: 49152 through 65535



6.22 TCP Summary

• Provides reliable stream delivery.

• Provides a full duplex connection.

• Sliding windows allow efficient use of network.

• TCP makes few assumptions about the underlying delivery system so it can operate on a variety of such systems.

• Provides flow control with the receiver stating how much data it can receive.

• Supports out of band messages and push.

• Features an adaptive retransmission mechanism with slow-start, multiplicative decrease, additive increase, and also a congestion avoidance mode.

6.23 Further Information

• The internet RFCs (requests for comments) are documents detailing discussions leading to standards adopted by the internet. RFCs are on-line at http://www.faqs.org/ .

• Some important RFCs:

◦ J. Postel, “User Datagram Protocol”, RFC 768, USC/Information Sciences Institute, August 1980.

◦ J. Postel, “Transmission Control Protocol - DARPA Internet Program Protocol Specification”, STD 7, RFC 793, USC/Information Sciences Institute, September 1981.

◦ TCP RFCs: updates in 1122, window management 813, fault isolation and recovery 816, maximum segment sizes 879, congestion 896.



7 Introduction to Socket Programming

7.1 Background

• We know that servers can perform passive opens and wait for a client to perform an active open, thus creating the TCP link between two particular applications on two hosts.

• In UNIX, file IO employs an open-read-write-close paradigm i.e. a file is opened (the user is provided with an integer file descriptor ID), some data reads and/or writes occur, and then the file is closed.

◦ The socket abstraction allows a programmer to access TCP/UDP in a way that is similar to file IO (whenever it makes sense). In fact, 4.4BSD uses sockets for local interprocess communication (UNIX domain) and interhost/interprocess communication (INET domain).

◦ The API design provides access similar to file IO although opening a socket may require more information than opening a file (instead of a file name argument, we need the transport protocol name, a remote machine address, a client/server flag, etc.).

7.2 Creating a socket

result = socket( af, type, protocol );

• The system call socket() creates a socket and returns the socket ID number.

◦ Argument af specifies the protocol family, which is AF_INET for internet protocol (others may include protocols from Xerox, Apple, CCITT, ISO).

◦ Argument type specifies the type of communication desired e.g. SOCK_STREAM is reliable stream service, SOCK_DGRAM is connectionless datagram delivery, and there is also SOCK_RAW for privileged programs to access low-level protocols or network interfaces.

◦ Argument protocol selects a particular protocol when more than one is available for an af/type combination. Example: the TCP/IP protocol suite includes the protocol TCP.

◦ Note: no local or remote address information has yet been supplied.

◦ Example: To create a stream socket in the Internet domain, the following call might be used: s = socket(AF_INET, SOCK_STREAM, 0); . This call would result in a stream socket being created with the TCP protocol providing the underlying communication support.

◦ Example: To create a datagram socket for local use, the call might be: s = socket(AF_UNIX, SOCK_DGRAM, 0); .

7.3 Closing a socket

status = close( socket );

• The system call close() deletes a descriptor and returns 0 if successful.

◦ Closing a socket immediately terminates data transfer.

7.4 Binding

• Communicating processes are bound by an association. In the Internet domain, an association is composed of local and foreign addresses, and local and foreign ports.

◦ The bind() function call allows a process to specify half of an association i.e. the local address and local port.

◦ The general form of bind() is bind( socket, localaddr, addrlen ) where socket is a descriptor previously created (but not bound), localaddr is a structure specifying the local address to be assigned to the socket, and addrlen is an integer specifying the length of the address structure.

• The generic format of the address structure is:

    /* Generic socket address */
    struct sockaddr {
        u_char sa_len;
        u_char sa_family;
        char   sa_data[14];
    };

where sa_len specifies the length of the address structure, sa_family specifies the family to which the address belongs (e.g. AF_INET), and sa_data contains the address.

• The internet version of this address structure is:

    /* Socket address, internet style */
    struct sockaddr_in {
        u_char  sin_len;
        u_char  sin_family;
        u_short sin_port;
        struct  in_addr sin_addr;
        char    sin_zero[8];
    };

where sin_port is a port number (2 bytes), sin_addr (a struct in_addr) is an IP address (4 bytes), and sin_zero pads the structure so that these fields fill the 14 bytes of sa_data.

◦ For machines with multiple IP addresses (i.e. “multi-homed”), a symbolic address value INADDR_ANY may be used if the binding is to be allowed on any of the machine’s IP addresses.
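As an illustration of filling in a sockaddr_in and binding, the following sketch creates a TCP socket bound to INADDR_ANY. The helper names `bound_tcp_socket` and `local_port` are hypothetical. The sin_len field is not set explicitly because it exists only on BSD-derived systems; clearing the structure with memset() is assumed to be sufficient there.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Create a TCP socket bound to the given local port on any local address.
 * Passing port 0 asks the system to choose a free port.
 * Returns the descriptor, or -1 on error. */
int bound_tcp_socket(unsigned short port)
{
    struct sockaddr_in addr;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0)
        return -1;

    memset(&addr, 0, sizeof addr);             /* also clears sin_zero  */
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);               /* network byte order    */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* any local IP address  */

    if (bind(s, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(s);
        return -1;
    }
    return s;
}

/* Report the local port bound to socket s, or -1 on error. */
int local_port(int s)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    if (getsockname(s, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

A server would typically call `bound_tcp_socket()` with its well known port; the `local_port()` helper is mainly useful when port 0 was requested.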

7.5 Server: Listen and Accept

• After calling socket() and bind(), a connectionless transport protocol server is ready to accept messages.

• A connection oriented server must call listen() to place the socket in passive mode, and then call accept() to accept a connection request.

• The function call

listen( socket, queuesize );

asks the operating system to build a separate request queue for the previously bound socket.



• The function call

newsock = accept( socket, caddress, caddresslen );

asks the operating system to return with a socket associated with the next request in the queue.

◦ The accept() call blocks (waits) until a request arrives.

◦ Variable newsock is in fact a descriptor of a new socket that was created by accept() and bound in the same way as socket (in effect, duplicated). The server task now uses newsock for its work while socket remains available for accepting new requests.

• The following extract is from BSD rlogind:

    . . .
    f = socket(AF_INET, SOCK_STREAM, 0);
    . . .
    if (bind(f, (struct sockaddr *) &sin, sizeof (sin)) < 0) {
        . . .
    }
    . . .
    listen(f, 5);   /* wait for incoming on socket f */
    for (;;) {
        int g, len = sizeof (from);

        /* take incoming on socket g */
        g = accept(f, (struct sockaddr *) &from, &len);
        if (g < 0) {
            if (errno != EINTR)
                syslog(LOG_ERR, "rlogind: accept: %m");
            continue;
        }
        if (fork() == 0) {
            close(f);   /* service routine uses g */
            doit(g, &from);
        }
        close(g);       /* listener keeps using f */
    }



7.6 Client: Connect

• Clients use function call connect() to establish connection to a specific server:

connect( socket, saddress, saddresslen );

◦ In effect, connect() is the function call that a client uses to connect to a server that has called accept().

• Note: connect() may also be used for connectionless protocols as it records the server’s address in the socket, thereby allowing the client to send many messages to the same server without having to specify the destination address with each message.

7.7 Sending and Receiving Data

• The functions send() and write() can send data via a connected socket. Example: write( socket, buffer, length ) where buffer = address of data to send and length = number of bytes to send via socket.

• Function sendto() takes additional arguments for the destination address (IP, port), so it can be used without the connection that send() assumes already exists.

• Function sendmsg() is similar to sendto() but uses a structure to hold its arguments.

• Function writev() is similar to write() but it can gather the data to send from a buffer list. This avoids copying the transmit data into a contiguous buffer.

• The corresponding receive functions are recv(), read(), recvfrom(), recvmsg(), and readv().
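The calls from the last few sections can be exercised together inside a single process by connecting to a listening socket over the loopback interface. The sketch below is an illustration only: the function name `loopback_roundtrip` is hypothetical, it assumes a working loopback interface, and it relies on the kernel completing the handshake against the listen queue so that connect() returns before accept() is called.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Send msg over a loopback TCP connection and read it back on the
 * accepted socket.  Returns the number of bytes read, or -1 on error. */
ssize_t loopback_roundtrip(const char *msg, char *buf, size_t buflen)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    ssize_t n = -1;
    int lsock, csock, ssock = -1;

    lsock = socket(AF_INET, SOCK_STREAM, 0);   /* listening socket */
    csock = socket(AF_INET, SOCK_STREAM, 0);   /* client socket    */
    if (lsock < 0 || csock < 0)
        goto done;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);                  /* system chooses a port */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* Server side: bind, listen, then discover the chosen port. */
    if (bind(lsock, (struct sockaddr *)&addr, sizeof addr) < 0) goto done;
    if (listen(lsock, 1) < 0) goto done;
    if (getsockname(lsock, (struct sockaddr *)&addr, &len) < 0) goto done;

    /* Client side: active open to the listening address. */
    if (connect(csock, (struct sockaddr *)&addr, sizeof addr) < 0) goto done;

    /* Server side: take the queued connection. */
    ssock = accept(lsock, NULL, NULL);
    if (ssock < 0) goto done;

    if (write(csock, msg, strlen(msg)) < 0) goto done;
    n = read(ssock, buf, buflen);  /* may return fewer bytes than were sent */

done:
    if (ssock >= 0) close(ssock);
    if (csock >= 0) close(csock);
    if (lsock >= 0) close(lsock);
    return n;
}
```

A real server would normally loop on read() until all expected data has arrived, since TCP delivers a byte stream rather than messages.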

7.8 Flexible use of read() and write()

• The UNIX read() and write() functions use descriptors to identify data sources and sinks.

• As these are (by design) compatible with the descriptors used for sockets, a single application program can process local data (files) or remote data (data accessed via sockets).

7.9 Servers for Multiple Services

• The UNIX function select() allows a task to wait for connections on multiple sockets.

◦ select() applies to IO in general, on UNIX.

• The call form is:

nready = select( ndesc, indesc, outdesc, excdesc, timeout );

where ndesc = number of descriptors to examine (descriptors 0 to ndesc−1); pointers indesc, outdesc and excdesc point to bit masks which are set to identify descriptors to test for input ready, output ready, and exceptional conditions respectively; and timeout is a timeout value.

• Note that the user sets the bit masks before each call, and the select() function returns information on active sockets by setting/clearing the same bit masks.

• The only exception source for a socket is out-of-band data.

• For BSD, the following macros assist in bit mask handling: FD_SET(fd, &fdset), FD_CLR(fd, &fdset), FD_ISSET(fd, &fdset), FD_ZERO(&fdset).

• Example of multiplexing reads:

    #include <sys/time.h>
    #include <sys/types.h>
    . . .
    fd_set read_template;
    struct timeval wait;
    . . .
    for (;;) {
        wait.tv_sec = 1;    /* one second */
        wait.tv_usec = 0;

        FD_ZERO(&read_template);

        FD_SET(s1, &read_template);
        FD_SET(s2, &read_template);

        nb = select(FD_SETSIZE, &read_template,
                    (fd_set *) 0, (fd_set *) 0, &wait);
        if (nb <= 0) {
            /* An error occurred during the select,
               or the select timed out. */
        }

        if (FD_ISSET(s1, &read_template)) {
            /* Socket #1 is ready to be read from */
        }

        if (FD_ISSET(s2, &read_template)) {
            /* Socket #2 is ready to be read from */
        }
    }

• Note: read_template is cleared and re-initialised at the beginning of every traversal of the main for(;;) loop.

7.10 Network Byte Order

• The network has its own definition of byte order (most significant byte first) for multibyte quantities short (2 bytes) and long (4 bytes).

• On each host, functions (or macros) are defined to convert between host byte order and network byte order.

◦ Network to host conversions:

localshort = ntohs( netshort );

locallong = ntohl( netlong );

◦ Host to network conversions:

netshort = htons( localshort );

netlong = htonl( locallong );
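A short sketch of how these conversions are typically used when placing a 16-bit quantity into a packet buffer. The helper names `put_port` and `get_port` are hypothetical; memcpy() is used so that the buffer need not be aligned.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Store a 16-bit port number into buf in network byte order. */
void put_port(unsigned char *buf, uint16_t port)
{
    uint16_t net = htons(port);
    memcpy(buf, &net, sizeof net);
}

/* Read a 16-bit port number stored in network byte order. */
uint16_t get_port(const unsigned char *buf)
{
    uint16_t net;
    memcpy(&net, buf, sizeof net);
    return ntohs(net);
}
```

Whatever the host byte order, `put_port(buf, 0x1234)` leaves 0x12 in buf[0] and 0x34 in buf[1], because network byte order puts the most significant byte first.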



7.11 Some Other Related Functions

• After a server successfully returns from accept(), function getpeername() can be used to obtain the address of the remote end of the connection.

• gethostname() provides the local host name.

• gethostbyname() returns the IP address given the hostname.

• gethostbyaddr() returns the hostname given the IP address.

• Functions getsockopt()/setsockopt() allow socket options to be read and set.

7.12 BSD internet super-server inetd

• inetd is a 4.4BSD daemon that listens for requests for many daemons instead of having each task (daemon) listening for its own requests.

◦ This reduces the number of idle daemons and simplifies implementation.

• inetd handles two types of services: standard and TCPMUX.

◦ A standard service has a well-known port assigned to it, as listed in BSD system file /etc/services (see also man services), and defined by IANA.

◦ A TCPMUX service is non-standard, has no well-known port assigned, and is invoked by inetd when a client connects to the tcpmux well-known port.

• On BSD, inetd starts at boot time and determines from file /etc/inetd.conf the servers for which it is to listen (it creates a socket for each service and then calls select()).

◦ When inetd accepts a connection, it does a fork(), duplicates (dup()) the new socket to file descriptors {0,1} (stdin and stdout), closes other open file descriptors, and execs the appropriate server.

◦ The server code is then a program that runs with stdin and stdout already set up. In fact, the server can be written using stdio IO (with appropriate flushing).

◦ In 4.4BSD, the TCPMUX service is built into inetd by listening to TCP port 1.



• The TCPMUX service allows a user to add locally developed protocols without needing an official TCP port assignment. The TCPMUX protocol is described in RFC-1078:

A TCP client connects to a foreign host on TCP port 1. It sends the service name followed by a carriage-return line-feed (CRLF). The service name is never case sensitive. The server replies with a single character indicating positive (“+”) or negative (“-”) acknowledgment, immediately followed by an optional message of explanation, terminated with a CRLF. If the reply was positive, the selected protocol begins; otherwise the connection is closed.
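Because an inetd-style server sees the connection as stdin and stdout, the request/reply step quoted above can be sketched with ordinary stdio calls. The function name `tcpmux_reply`, the service list and the explanatory messages after “+” and “-” are illustrative assumptions; this is not how inetd itself is implemented.

```c
#include <stdio.h>
#include <string.h>
#include <strings.h>   /* strcasecmp() */

/* Read one TCPMUX request line from `in` and write the reply to `out`.
 * Returns 1 if the requested service is in the list, 0 otherwise. */
int tcpmux_reply(FILE *in, FILE *out, const char *services[], int n)
{
    char line[256];

    if (fgets(line, sizeof line, in) == NULL)
        return 0;
    line[strcspn(line, "\r\n")] = '\0';          /* strip the CRLF */

    for (int i = 0; i < n; i++) {
        if (strcasecmp(line, services[i]) == 0) { /* names ignore case */
            fputs("+Go\r\n", out);
            fflush(out);                          /* reply must not sit in a buffer */
            return 1;
        }
    }
    fputs("-Service not available\r\n", out);
    fflush(out);
    return 0;
}
```

Under inetd the call would be `tcpmux_reply(stdin, stdout, services, n)`; after a positive reply the selected protocol continues on the same two streams.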

7.13 Additional References

• “An Advanced 4.4BSD Interprocess Communication Tutorial”, Samuel J. Leffler, Robert S. Fabry, William N. Joy, Phil Lapsley, from Computer Systems Research Group (CSRG), Department of Electrical Engineering and Computer Science, University of California Berkeley; Steve Miller, Chris Torek, from Heterogeneous Systems Laboratory, Department of Computer Science, University of Maryland.

◦ On *BSD, see file /usr/share/doc/psd/21.ipc/paper.ascii.gz.

• “An Introductory 4.4BSD Interprocess Communication Tutorial”, Stuart Sechrest, CSRG, Computer Science Division, Department of Electrical Engineering and Computer Science, University of California Berkeley.

◦ On *BSD, see file /usr/share/doc/psd/20.ipctut/paper.ascii.gz.



8 IP Router Operation

• We now consider higher level operation of the Internet.

8.1 Datagram Delivery

• A node on a given physical network can send a physical frame directly to another node on the same network. A 48-bit hardware address is used in the destination part of the frame’s header.

• To transfer an IP datagram, the sender encapsulates the datagram in a physical frame with a 48-bit hardware address obtained by mapping [12] the destination 32-bit IP address. Network hardware then delivers the frame.

• IP routing consists of deciding where to send a datagram based on its destination 32-bit IP (v4) address.

◦ In direct delivery, the destination IP address is identified as being on the same subnet i.e. there is a direct physical connection. Delivery is achieved by encapsulating the datagram into a frame with the destination host’s 48-bit address for hardware transmission.

◦ In indirect delivery, the destination IP address is determined as being on a remote system for which direct delivery is not possible, so external information is required to find a 48-bit hardware address for encapsulation.

• Consider a large Internet with many networks interconnected by gateways.

◦ A host that cannot achieve direct delivery sends a datagram to a directly connected gateway where software extracts the encapsulated datagram and IP routing routines select the next destination.

◦ The datagram is passed between the gateways in the Internet until it arrives at a gateway that can perform direct delivery.

◦ The total path provided to the IP datagram is determined by the gateways which form a cooperative interconnected structure in order to maintain route information.

[12] Address Resolution Protocol (ARP) allows a host to find a hardware address from an IP address.



• An IP routing algorithm employs an IP routing table on each node to hold information about possible destinations and how to reach them. A typical routing table contains pairs (N,G) where N is the IP address of the destination network and G is the IP address of the gateway along the path to network N.

◦ Based on the network portion of its own address, a node can easily identify datagrams for which indirect delivery is necessary. The table provides an IP address for the gateway node so the datagram, with its IP addresses unchanged, is encapsulated in a new frame with a 48-bit hardware address chosen for the next gateway.

◦ If the destination network is not found in the routing table, the packet is forwarded to a default router with more extensive tables. If no default route is defined, a routing error has occurred.

• In some instances, particular routes may be defined for some nodes.

• We can summarise the IP routing algorithm Route_IP_Datagram( datagram, routing_table ) as:

◦ Extract destination IP address, $I_D$, from datagram;

◦ Compute IP address of destination network, $I_N$;

◦ if $I_N$ matches any directly connected network address, send datagram to destination over that network (this involves resolving $I_D$ to a physical address, encapsulating the datagram, and sending the frame);

◦ else if $I_D$ appears as a host-specific route, use that;

◦ else if $I_N$ appears in routing table, use the table route;

◦ else if a default route has been specified, use that;

◦ else declare a routing error.
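The decision order of this algorithm can be sketched in C as a lookup that returns the next-hop IP address. Several things here are assumptions made for illustration: the table layout (`struct route` with a kind tag), the use of a netmask per entry to compute the destination network, addresses held in host byte order, and the names `route_ip_datagram` and `IP()`. Resolving the next hop to a hardware address and sending the frame are not shown.

```c
#include <stdint.h>
#include <stddef.h>

/* Build a 32-bit address from dotted-quad components (host byte order). */
#define IP(a, b, c, d) (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
                        ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Kinds are declared in the order in which the algorithm tries them. */
enum route_kind { ROUTE_DIRECT, ROUTE_HOST, ROUTE_NET, ROUTE_DEFAULT };

struct route {
    enum route_kind kind;
    uint32_t dest;      /* network or host address                     */
    uint32_t mask;      /* netmask; unused for ROUTE_HOST/ROUTE_DEFAULT */
    uint32_t next_hop;  /* gateway address; unused for ROUTE_DIRECT     */
};

/* Choose the next hop for destination dst.  On success stores the next-hop
 * IP address in *hop and returns 0; returns -1 for a routing error. */
int route_ip_datagram(uint32_t dst, const struct route *tab, size_t n,
                      uint32_t *hop)
{
    for (int kind = ROUTE_DIRECT; kind <= ROUTE_DEFAULT; kind++) {
        for (size_t i = 0; i < n; i++) {
            const struct route *r = &tab[i];
            if ((int)r->kind != kind)
                continue;
            switch (r->kind) {
            case ROUTE_DIRECT:   /* destination is on a connected network */
                if ((dst & r->mask) == r->dest) { *hop = dst; return 0; }
                break;
            case ROUTE_HOST:     /* host-specific route */
                if (dst == r->dest) { *hop = r->next_hop; return 0; }
                break;
            case ROUTE_NET:      /* ordinary (N,G) table entry */
                if ((dst & r->mask) == r->dest) { *hop = r->next_hop; return 0; }
                break;
            case ROUTE_DEFAULT:  /* default route matches everything */
                *hop = r->next_hop;
                return 0;
            }
        }
    }
    return -1;  /* no match and no default route */
}
```

For direct delivery the next hop is the destination itself; in every other case it is a gateway, and the datagram's own IP addresses stay unchanged.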

• How is the routing table determined?

◦ For simple networks, a static routing table is feasible.

◦ In large networks, a dynamic routing table is required in order that the network may adapt to changes e.g. gateway failures can be worked around.



8.2 Route Table Completeness

• Is full routing information necessary for an individual host on the network?

◦ Hosts can route datagrams successfully even if they only have partial routing information because they can rely on gateways. Recall the default route choice that was mentioned at the end of the routing algorithm. If a host has a gateway as its default route and it does not know how to handle a particular datagram, it offloads the datagram (and the problem) to the gateway which has more complete information.

◦ The routing tables in a given gateway contain partial information about possible destinations. An advantage of having partial information present is that it allows the administrators at the remote site to make decisions about their local routing. The local gateway passes the datagram through the internet until it arrives at the remote gateway which uses its routing tables to perform delivery.

• Recall that the term “internet” was originally coined to refer to a set of autonomous networks that were interconnected to form a single internet.

• We can divide the routing problem into two tasks in a similar way:

◦ Routing is performed on a “local” scale, perhaps involving a number of “small” networks connected to a gateway: this is the interior gateway task;

◦ Routing is performed on a “global” scale where we are interested in moving datagrams across one (or more) backbones that interconnect these gateways: this is the exterior gateway task.

• First, we briefly discuss how routes can be optimised.

8.3 Route Optimisation Algorithms

• In Dijkstra’s algorithm, each edge in a graph represents a link between nodes andis assigned a non-negative “cost” or weight. The “shortest” path in the graphis obtained by finding the minimum sum of weights between the two nodes ofinterest. This optimisation can be done off-line.

• In vector distance routing, a router periodically sends routing information acrossthe network to neighbours. Each message contains {destination, distance } values.



◦ A recipient compares this information to its own tables:

– it adds entries for which no routes were previously known;

– it updates its distance value for any destination for which the message senderis the next hop, and

– it updates its next hop and distance entries for destinations for which its dis-tance was higher when using a different next hop.

◦ As networks grow, this traffic becomes significant.

• An alternative is the Shortest Path First (SPF) algorithm which requires each gateway to have complete topology information. The gateways periodically probe each other so they know the status (“up” or “down”) of their neighbouring gateways — this information is periodically broadcast so that each gateway can update its optimal route information via Dijkstra’s algorithm.

• More complex techniques are required to handle economic as well as technical costs. Also, network traffic now includes growing “real-time” data.
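As an illustration of the shortest path calculation that an SPF gateway performs on its topology information, here is a minimal Python sketch of Dijkstra's algorithm using a priority queue. The four-node graph and its link costs are invented for the example.

```python
import heapq


def dijkstra(graph, source):
    """Return the minimum total cost from source to every reachable node.

    graph maps node -> {neighbour: non-negative link cost}.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for neighbour, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                heapq.heappush(heap, (nd, neighbour))
    return dist


# Hypothetical topology: each key is a gateway, each value its links and costs.
graph = {
    "A": {"B": 2, "C": 5},
    "B": {"A": 2, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}
costs = dijkstra(graph, "A")
print(costs)
```

The cheapest route from A to D goes via B and C (total cost 4) rather than over the direct B to D link.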

8.4 Interior Gateway Routing Protocol

• As the Internet connects many different organisations, these organisations have been free to develop their own internal routing methods. We consider two protocols.

8.4.1 Routing Information Protocol (RIP)

• Routing Information Protocol implements vector-distance routing for local networks.

◦ RIP is a widely used protocol due to its distribution in the form of a BSD daemon routed.

• It partitions participants into active and passive (silent) machines.

◦ A gateway running active RIP broadcasts information in the form of {IP network address, integer distance to destination} from the gateway’s current routing database every 30 seconds.



◦ RIP uses a hop count metric. A datagram direct delivery corresponds to one hop.

◦ A hop count does not take into account link speed e.g. 3 hops across fast networks would most likely be faster than 2 hops via PPP links. RIP implementations may advertise artificially high hop counts for slow links in order to optimise routing.

• Both active and passive RIP participants listen to all broadcast messages and update their tables when better routes are found. Each route in the table has associated with it a timer so that it is automatically dropped should a gateway providing the route fail. Routes become invalid after 180 seconds.

◦ Note that this means that good news (fast routes) travels quickly while bad news (failed routes) travels slowly. This slow convergence problem is addressed by techniques such as triggered updates which force a gateway to immediately broadcast bad news. An avalanche of updates can also cause problems.

8.4.2 Open Shortest Path First

• The early algorithms suffered as networks grew so the Internet Engineering Task Force developed the OSPF (Open Shortest Path First) standard in 1990. The specification is published, hence “open”.

• It can handle metrics such as physical distance, time delay, and others.

• It is dynamic hence can adapt to changes in topology automatically.

• It supports routing based on service, i.e. the type of service field is inspected so that it is possible to handle real-time traffic (multimedia), etc.

• It can do load balancing, hence routers connected by multiple pathways can have their traffic spread across the pathways to maximise performance (previously, routers used the best single link and ignored the others). Example: a router can balance traffic on multiple PPP pathways forming a link to maximise performance.

• OSPF works by having adjacent routers exchange information with acknowledgement and timestamping — hence, routers have up-to-date knowledge of costs etc.

• In normal operation, a router floods link state update messages to its neighbours.

• To minimise overall coordination traffic, one router is elected to be the designated router and it is considered to be “adjacent” to all other routers.

• As the routers all belong to a single organisation, they can trust one another!





Message type          Description
Hello                 Used to discover who the neighbors are
Link state update     Provides the sender’s costs to its neighbors
Link state ack        Acknowledges link state update
Database description  Announces which updates the sender has
Link state request    Requests information from the partner

Fig. 5-54. The five types of OSPF messages.

8.5 Exterior Gateway Routing Protocol

• Border Gateway Protocol (BGP) acts as a routing protocol between organisations according to policies chosen by the owners.

• Policies can be based on politics, commercial considerations, costs, services to customers (and rejection of traffic of non-customers), etc.

• BGP is very general.

• Pairs of BGP routers communicate by establishing a TCP connection.



9 Internet Control Protocols

9.1 Internet Control Message Protocol (ICMP)

• The Internet Control Message Protocol allows gateways to send error or control messages to other gateways or hosts; ICMP provides communication between the Internet Protocol software on one machine and the Internet Protocol software on another.

◦ ICMP can only report error conditions to the original source machine — it cannot correct the error. The IP software on the source machine must relate the error to the individual application program concerned, and action must then be taken to correct the problem.

• An ICMP message travels across the Internet in the data portion of an IP datagram.

◦ An IP datagram carrying an ICMP message has the value 1 in the PROTOCOL field.

◦ Each ICMP message has its own format with a common initial 32 bits:

Bits    0-7     8-15    16-31
Field   TYPE    CODE    CHECKSUM
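The common 32-bit header can be illustrated by packing it in Python. This is a sketch under stated assumptions: the notes do not describe how the CHECKSUM is computed, so the standard Internet checksum (16-bit one's complement sum) is used here, and the TYPE value 8 for an echo request and the bytes following the header are taken from the ICMP specification rather than from these notes.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum (assumed algorithm)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def build_icmp(icmp_type, code, rest=b""):
    """Pack TYPE (8 bits), CODE (8 bits), CHECKSUM (16 bits), then the rest."""
    header = struct.pack("!BBH", icmp_type, code, 0)  # checksum zero while summing
    checksum = internet_checksum(header + rest)
    return struct.pack("!BBH", icmp_type, code, checksum) + rest


# Illustrative echo request: the bytes after the header are example data only.
packet = build_icmp(8, 0, b"\x00\x01\x00\x01hello")
icmp_type, code, checksum = struct.unpack("!BBH", packet[:4])
print(icmp_type, code, hex(checksum))
```

A receiver can validate the message by running the same checksum over the whole packet, which gives zero when nothing has been corrupted.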

Further ICMP Information

• IP datagrams carrying an ICMP message are not allowed to trigger error report messages (as this could cause error messages about error messages about ...).

• Principal ICMP message types:





Message type             Description
Destination unreachable  Packet could not be delivered
Time exceeded            Time to live field hit 0
Parameter problem        Invalid header field
Source quench            Choke packet
Redirect                 Teach a router about geography
Echo request             Ask a machine if it is alive
Echo reply               Yes, I am alive
Timestamp request        Same as Echo request, but with timestamp
Timestamp reply          Same as Echo reply, but with timestamp

Fig. 5-50. The principal ICMP message types.

• When a gateway cannot deliver an IP datagram, a destination unreachable message is sent back to the source.

◦ The CODE field integer further describes the problem. Some values are: 0=network unreachable; 1=host unreachable; 2=protocol unreachable; 3=port unreachable; 4=fragmentation needed and DF (don’t fragment) set.

• Time exceeded — packet could be looping, or congestion or timeout problems. Whenever a gateway processes a datagram, it decrements the time-to-live counter (or hop count) and discards the datagram when the count reaches zero. This ensures that datagrams cannot loop indefinitely.

◦ Whenever a gateway discards a datagram because its hop count reaches zero, or a timeout occurs while waiting for fragments of a datagram, it sends a time exceeded message.
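The hop count rule can be sketched as follows. The dictionary layout, the function name forward and the gateway names are hypothetical; real gateways operate on the TTL field of the IP header.

```python
def forward(datagram, gateway_name):
    """Decrement the hop count; return (datagram, None) or (None, report)."""
    ttl = datagram["ttl"] - 1
    if ttl <= 0:
        # Count reached zero: discard and report back to the original source.
        report = {"type": "time exceeded",
                  "to": datagram["source"],
                  "from": gateway_name}
        return None, report
    return dict(datagram, ttl=ttl), None


datagram = {"source": "hostA", "destination": "hostB", "ttl": 3}
report = None
for gateway in ["gw1", "gw2", "gw3", "gw4"]:
    datagram, report = forward(datagram, gateway)
    if datagram is None:
        break
print(report)
```

With an initial count of 3, the datagram is discarded by the third gateway, which sends the time exceeded message; it never reaches the fourth.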

• Parameter problem — invalid IP packet found.

• Source quench — used for flow control. Usually, a congested gateway sends one source quench message for every datagram that it discards due to buffer overflows. Once the datagram source stops receiving quench messages, it gradually increases transmission rate.

• Redirect — allows network knowledge to propagate. In particular, when a gateway detects a datagram from a host is using a non-optimal route, it forwards the datagram and also sends a redirect message to the host. Thus, a host can boot up knowing only a default gateway and then optimise its routing.

• Echo Request and Echo Reply allow destinations to be checked for reachability and timestamping allows performance measurement. On UNIX hosts, the ping command uses these messages to display network performance while the traceroute command can identify the route in use.

• A Timestamp Request leads to a Timestamp Reply to allow datagram transit times to be computed.

9.2 Address Resolution Protocol (ARP)

• As mentioned earlier, each interface has a unique 48-bit hardware LAN address and a 32-bit IP address.

• ARP is a mechanism that allows a host to find out what 48-bit LAN address belongs to an IP address. A system outputs a broadcast packet to every machine on the network in question, asking “who owns this IP address”. The owner replies with their LAN address.

• ARP reduces the need for configuration files.

• By having hosts cache the results, ARP requests are reduced. However, cache entries are discarded after a few minutes so that systems that have their LAN cards replaced due to failure get operating quickly.

• It is also possible for hosts to broadcast their mapping when they boot up. No response is expected. However, a machine with the same IP address should respond in order to prevent the second machine coming on-line and creating chaos!

• It is also possible for routers to react to ARP requests for IP information belonging to remote networks. In proxy ARP, routers cooperate by forwarding the ARP request to the appropriate network for a response to be generated (and returned).
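The caching behaviour described above can be sketched with a small Python class. The class name ArpCache, the 300 second lifetime (standing in for “a few minutes”) and the example addresses are assumptions for illustration only.

```python
import time


class ArpCache:
    """Maps IP address -> LAN address and forgets entries after a lifetime."""

    def __init__(self, lifetime=300.0, clock=time.monotonic):
        self.lifetime = lifetime  # assumed value for "a few minutes"
        self.clock = clock        # injectable so the example can fake time
        self.entries = {}         # ip -> (lan_address, time stored)

    def store(self, ip, lan_address):
        self.entries[ip] = (lan_address, self.clock())

    def lookup(self, ip):
        entry = self.entries.get(ip)
        if entry is None:
            return None  # caller would now broadcast an ARP request
        lan_address, stored = entry
        if self.clock() - stored > self.lifetime:
            del self.entries[ip]  # stale: the LAN card may have been replaced
            return None
        return lan_address


now = [0.0]  # fake clock so that expiry can be shown without waiting
cache = ArpCache(lifetime=300.0, clock=lambda: now[0])
cache.store("192.168.1.7", "00:11:22:33:44:55")
first = cache.lookup("192.168.1.7")
now[0] = 301.0
second = cache.lookup("192.168.1.7")
print(first, second)
```

The first lookup is answered from the cache; the second returns nothing because the entry has expired, so a fresh ARP request would be needed.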

9.3 Reverse Address Resolution Protocol

• RARP does the reverse of ARP.



• A diskless machine about to boot up will know its 48-bit LAN address but will not know its IP address.

• A RARP client sends out a broadcast packet (with all 32 address bits = 1) saying what its LAN address is, and a RARP server responds with the IP address.

• This allows the diskless machine to share boot files with other machines while retaining its unique identity.

• Broadcast packets with all address bits = 1 are not propagated by routers to avoid unwanted traffic, and RARP servers must exist on any subnet needing them.

• On UNIX, RARP can be handled by the rarpd daemon, which starts at boot time.

9.4 Domain Name System

• The Domain Name System implements a machine name hierarchy for TCP/IP Internets.

• Example: mirriwinni.cs.jcu.edu.au contains a hostname mirriwinni and a domain name cs.jcu.edu.au.

◦ This name is a part of the cs.jcu.edu.au domain, which is part of the jcu.edu.au domain, which is part of the edu.au domain, which is part of the au domain.
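The nesting of domains in the example name can be reproduced with a few lines of Python. The function name enclosing_domains is made up for this sketch.

```python
def enclosing_domains(name):
    """List the domains that contain a fully qualified name, innermost first."""
    labels = name.split(".")
    return [".".join(labels[i:]) for i in range(1, len(labels))]


host, _, domain = "mirriwinni.cs.jcu.edu.au".partition(".")
print(host, domain)
print(enclosing_domains("mirriwinni.cs.jcu.edu.au"))
```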

• Domain name servers provide a mapping between meaningful host names and actual IP numbers.

◦ An advantage of a hierarchical naming system is that name servers can be distributed, and also organised in a hierarchical manner e.g. a small number of servers might handle .au names, and be in contact with a slightly larger number of servers handling edu.au queries, etc.

◦ The authority in the hierarchical naming system is also structured. At the top, some servers are in charge of .au. Beneath these are other servers in charge of edu.au. Down a few more layers, we might see a CS machine such as cay acting as a DNS server for the cs.jcu.edu.au domain.

• Nameservers may map a given name to more than one item in the domain system. The client specifies the type of object desired when resolving a name, and the server returns objects of that type.



◦ Examples on Unix:

nslookup cay.cs.jcu.edu.au

returns the IP address of cay;

nslookup -q=mx cay.cs.jcu.edu.au

returns the Mail eXchange records for email addressed to cay (this MX mapping allows us to direct email to a main server and backup servers);

nslookup -q=mx cs.jcu.edu.au

returns the Mail eXchange records for host independent CSE email.
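To show how a client specifies the type of object it wants, the sketch below builds the bytes of a DNS query in Python without sending it. The header layout and the numeric type codes come from the DNS specification (RFC 1035), not from these notes, and the function name build_query and the query identifier are arbitrary choices for the example.

```python
import struct

# Numeric record type codes as defined by the DNS specification.
QTYPES = {"A": 1, "NS": 2, "CNAME": 5, "SOA": 6, "MX": 15}


def build_query(name, qtype, query_id=0x1234):
    """Return the bytes of a single-question DNS query (not sent anywhere)."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is sent as length-prefixed labels ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", QTYPES[qtype], 1)  # class 1 = Internet
    return header + question


a_query = build_query("cay.cs.jcu.edu.au", "A")
mx_query = build_query("cay.cs.jcu.edu.au", "MX")
print(len(a_query), a_query[-4:], mx_query[-4:])
```

The two queries differ only in the type code near the end, which mirrors the difference between the plain nslookup call and the one using -q=mx.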

• Some domain name system record types:

◦ type A consists of a hostname and its IP address;

◦ type CNAME gives the canonical hostname for an alias;

◦ type MX gives a 16-bit preference and name of host that acts as a mail exchanger for the domain;

◦ type NS is the name of the authoritative server for the domain;

◦ type SOA is the start of authority record, which describes which parts of the naming hierarchy a server implements.

• The cost of lookup for non-local names can be high so nameservers will maintain a cache of recently used names — when queried, a reply from a remote server will be marked authoritative while an answer from a locally cached (previous) query will be marked non-authoritative.

◦ Each response from a remote server will include a time to live value set by the authority at the remote site — this means that server lookups for hosts whose IP address does not change can be minimised, while improved correctness can be obtained for entries that are expected to change by assigning them short TTL values.

• Before an organisation is granted authority for an official domain, it must agree to operate a domain name server that meets Internet standards.

◦ For robustness, a site must also find a separate non-dependent site to act as a backup server. A backup server is best kept physically separate and running on a different power supply.

◦ Administration information for the .au domain may be found at http://www.auda.org.au.



• On Unix, see also man nslookup.

• On BSD operating systems, see also man dig which describes how to obtain information about DNS.



10 Application Layer

10.1 Introduction

• Recall that UNIX Internet daemons such as inetd can simplify the setup of service provision.

◦ In particular, a server task can be written with its client communication mapped to stdin/stdout IO.

◦ In practice, lightly loaded services are started by inetd as needed while more heavily used server tasks may be run permanently as daemons in their own right.
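A server task written in this style only reads and writes text streams, as in the Python sketch below. The trivial echo service and the name serve are invented for illustration; the in-memory streams stand in for the client connection.

```python
import io


def serve(inp, out):
    """A trivial line-based service: answer each request line with an echo."""
    for line in inp:
        out.write("echo: " + line)
    out.flush()


# When started by inetd the call would be serve(sys.stdin, sys.stdout), since
# inetd connects those streams to the client. Here in-memory streams are used
# so that the example can run on its own.
client_input = io.StringIO("hello\nworld\n")
client_output = io.StringIO()
serve(client_input, client_output)
reply = client_output.getvalue()
print(reply, end="")
```

The service code contains no socket handling at all, which is the simplification that inetd offers.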

10.2 Email

• Simple Mail Transfer Protocol (SMTP) is used to transfer email.

◦ On UNIX, the mail delivery daemon is sendmail (or one of a number of new alternatives such as vmail) which listens on TCP port 25 for connections from remote machines. On Unix, see telnet <host> 25.

◦ SMTP is a simple ASCII (text) transmission protocol described by RFC 821. Recently, extended SMTP (ESMTP) has been defined in RFC 1425 to handle issues such as larger message length, different timeouts, and prevention of infinite mailstorms (email loops).

◦ Typical RFC 822 header fields (AST):

Header        Meaning
To:           Email address(es) of primary recipient(s)
Cc:           Email address(es) of secondary recipient(s)
Bcc:          Email address(es) for blind carbon copies
From:         Person or people who created the message
Sender:       Email address of the actual sender
Received:     Line added by each transfer agent along the route
Return-Path:  Can be used to identify a path back to the sender

Fig. 7-42. RFC 822 header fields related to message transport.



◦ RFC 822 message format (AST):

Header        Meaning
Date:         The date and time the message was sent
Reply-To:     Email address to which replies should be sent
Message-Id:   Unique number for referencing this message later
In-Reply-To:  Message-Id of the message to which this is a reply
References:   Other relevant Message-Ids
Keywords:     User chosen keywords
Subject:      Short summary of the message for the one-line display

Fig. 7-43. Some fields used in the RFC 822 message header.

◦ RFCs 1341 and 1521 have added language extensions and MIME (Multipurpose Internet Mail Extensions). In effect, RFC 822 header types have been extended to include MIME-Version:, Content-Type:, Content-Transfer-Encoding:, etc.

◦ Other recent developments have included PGP (Pretty Good Privacy) which supports text compression, secrecy, and digital signatures.
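The header fields listed above, together with the MIME extensions, can be seen by composing a message with Python's standard email package. The addresses, subject and Message-Id below are made-up examples; nothing is sent.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "phillip@example.org"       # hypothetical addresses throughout
msg["To"] = "student@example.org"
msg["Cc"] = "tutor@example.org"
msg["Subject"] = "Lecture notes"
msg["Message-Id"] = "<1234@example.org>"
# Setting the body makes the library add the MIME headers for us.
msg.set_content("The notes for chapter 8 are now available.\n")

text = msg.as_string()
print(text)
```

The printed text shows the RFC 822 headers followed by the MIME-Version, Content-Type and Content-Transfer-Encoding headers, then a blank line and the body.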

10.3 Network News

• News reading, like email, is one of the original “killer apps” that led to early spread of the internet.

• RFC 977 describes the Network News Transfer Protocol (NNTP) which is used to propagate news articles from one machine to another.

◦ Two methods of requesting news transfers are supported in NNTP: in news pull, a host contacts one of its newsfeeds and asks for new news; in news push, the newsfeed calls the client and announces that it has new news.

◦ TCP port 119 is reserved for NNTP.



10.4 Other Applications

• File Transfer Protocol (FTP) — defines a client/server method of transferring files.

• Hyper Text Transfer Protocol (HTTP) — defines a client/server method of transferring data based on URLs. The client browser is structured so that different types of data can be interpreted (displayed, played, etc.) as appropriate.

◦ On Unix, you can use telnet to retrieve HTTP data via port 80 e.g.

telnet mirriwinni.cs.jcu.edu.au 80 ←↩

GET /~phillip/index.html HTTP/1.0 ←↩

Accept: text/plain,text/html ←↩

←↩
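The same exchange can be sketched in Python with a plain TCP socket. The function names build_request and fetch are hypothetical, and fetch is defined but not called here because it needs a reachable web server.

```python
import socket


def build_request(path, accept="text/plain,text/html"):
    """Return the request typed in the telnet session, as bytes."""
    lines = [f"GET {path} HTTP/1.0", f"Accept: {accept}", "", ""]
    return "\r\n".join(lines).encode("ascii")  # empty line ends the request


def fetch(host, path, port=80, timeout=10.0):
    """Send the request to host:port and return everything the server sends."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break  # HTTP/1.0 servers close the connection when done
            chunks.append(data)
    return b"".join(chunks)


request = build_request("/~phillip/index.html")
print(request.decode("ascii"))
```

Each line ends with a carriage return and line feed, corresponding to the return key presses in the telnet example.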

• Simple Network Management Protocol (SNMP) — defines a basic client/server method of interrogating and setting network configuration attributes. This is outside the scope of this course.



A PPP — Point–to–Point Protocol

• This is not examinable!

• The Internet Engineering Task Force developed RFC 1661/1662/1663 leading to PPP.

• Handles error detection, multiple protocols, connect time IP address negotiation.

• PPP provides:

– unambiguous framing method which also handles error detection;

– Link Control Protocol (LCP) dealing with bringing lines up, testing them, negotiating options, and taking them down;

– Network Control Protocol (NCP) dealing with network layer options in a way that is independent of the network layer protocol to be used.

• Example of PPP use:

– user calls the modem attached to a router (or terminal server or equivalent) of the Internet service provider (ISP);

– after a physical connection is established, modern V.34/V.90 modems undergo a channel equalisation training stage (so that their digital filters can undo channel distortion);

– the user sends the router a series of LCP packets in the payload field of one or more PPP frames;

– these packets and their responses select PPP parameters to be used;

– a series of NCP packets are then sent to configure the network layer — for a user wanting to use TCP/IP, an IP address is needed;

– the NCP for IP will have the router provide an IP address to the user for use during the connection (ISPs will “own” a number of addresses which are shared amongst their clients dialing in);

– when finished, the NCP is used to tear down the network layer connection and free up the IP address;

– finally, the LCP shuts down the data link layer connection and a disconnect occurs.

• The PPP frame is similar to an HDLC frame:

• Character (byte) oriented so all frames are an integral number of bytes.

• Delimiter of 01111110.

• Address field is always set to 11111111.

• Control field has the default value 00000011 for an unnumbered frame (i.e. sequence numbers not used by default). RFC 1663 describes the use of numbered frames in a noisy environment.

• Because the address and control fields can be used with a constant value, the LCP provides a mechanism for them to be deleted by negotiation.

• The next field is the protocol field — identifies one of LCP, NCP, IP, IPX, AppleTalk, others.

• The data field holds data up to some negotiated maximum length (if no LCP negotiation occurs, 1500 bytes is used).

• The checksum field is usually 2 bytes but can be negotiated to 4 bytes.
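The frame layout described above can be sketched in Python. Several details here are assumptions that go beyond these notes: the 2-byte checksum is computed with the 16-bit CRC used by HDLC-like framing and sent low byte first, the protocol value 0x0021 is the PPP number for IP, and byte stuffing of delimiter bytes inside the payload is omitted to keep the example short.

```python
import struct

FLAG = 0x7E      # delimiter 01111110
ADDRESS = 0xFF   # always 11111111
CONTROL = 0x03   # default 00000011: unnumbered frame


def fcs16(data):
    """Assumed 16-bit frame check sequence (reflected CRC, polynomial 0x8408)."""
    fcs = 0xFFFF
    for byte in data:
        fcs ^= byte
        for _ in range(8):
            fcs = (fcs >> 1) ^ 0x8408 if fcs & 1 else fcs >> 1
    return fcs ^ 0xFFFF


def build_frame(protocol, payload):
    """Delimiter, address, control, protocol, data, checksum, delimiter."""
    body = bytes([ADDRESS, CONTROL]) + struct.pack("!H", protocol) + payload
    checksum = struct.pack("<H", fcs16(body))  # assumed: low byte first
    return bytes([FLAG]) + body + checksum + bytes([FLAG])


def parse_frame(frame):
    """Check delimiters and checksum, then split out the fields."""
    if frame[0] != FLAG or frame[-1] != FLAG:
        raise ValueError("missing delimiter")
    body, checksum = frame[1:-3], frame[-3:-1]
    if struct.pack("<H", fcs16(body)) != checksum:
        raise ValueError("checksum mismatch")
    address, control, protocol = struct.unpack("!BBH", body[:4])
    return address, control, protocol, body[4:]


frame = build_frame(0x0021, b"datagram")
fields = parse_frame(frame)
print(frame.hex(), fields)
```

A receiver that finds a checksum mismatch would discard the frame; that is the error detection role of the framing method.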

[Figure: phase diagram with the phases Dead, Establish, Authenticate, Network, Open and Terminate. Transition labels: carrier detected; both sides agree on options; authentication successful; NCP configuration; done; carrier dropped; failed (two transitions).]

Fig. 3-28. A simplified phase diagram for bringing a line up and down.

• The handling of lines:

– DEAD (no physical layer present) which changes to ESTABLISHED (once carrier detection occurs);

– LCP option negotiation occurs (may include authentication);



– NCP protocol then invoked;

– Data transfer occurs.
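The sequence of phases can be written as a small transition table in Python. The phase and event names follow the labels of Fig. 3-28; which phase each “failed” transition leaves from and leads to is an assumption, as are the names TRANSITIONS and run.

```python
# (current phase, event) -> next phase
TRANSITIONS = {
    ("Dead", "carrier detected"): "Establish",
    ("Establish", "both sides agree on options"): "Authenticate",
    ("Establish", "failed"): "Dead",               # assumed target
    ("Authenticate", "authentication successful"): "Network",
    ("Authenticate", "failed"): "Terminate",       # assumed target
    ("Network", "NCP configuration"): "Open",
    ("Open", "done"): "Terminate",
    ("Terminate", "carrier dropped"): "Dead",
}


def run(events, phase="Dead"):
    """Follow a list of events and return every phase visited."""
    history = [phase]
    for event in events:
        phase = TRANSITIONS[(phase, event)]
        history.append(phase)
    return history


history = run([
    "carrier detected",
    "both sides agree on options",
    "authentication successful",
    "NCP configuration",
    "done",
    "carrier dropped",
])
print(" -> ".join(history))
```

A successful session therefore passes through every phase once and ends where it started, with the line dead.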

• Note: the LCP protocol only defines how the negotiation is conducted, not what is negotiated. See Fig. 3-29.


Name               Direction  Description
Configure-request  I → R      List of proposed options and values
Configure-ack      I ← R      All options are accepted
Configure-nak      I ← R      Some options are not accepted
Configure-reject   I ← R      Some options are not negotiable
Terminate-request  I → R      Request to shut the line down
Terminate-ack      I ← R      OK, line shut down
Code-reject        I ← R      Unknown request received
Protocol-reject    I ← R      Unknown protocol requested
Echo-request       I → R      Please send this frame back
Echo-reply         I ← R      Here is the frame back
Discard-request    I → R      Just discard this frame (for testing)

Fig. 3-29. The LCP packet types.

