Tcp Udp Notes

DESCRIPTION

Unit 4 Of ACN

Page 1: Tcp Udp Notes

Protocols running at the transport layer are charged with providing several important services to enable software applications in higher layers to work over an internetwork. They are typically responsible for allowing connections to be established and maintained between software services on possibly distant machines. Perhaps most importantly, they serve as the bridge between the needs of many higher-layer applications to send data in a reliable way without needing to worry about error correction, lost data or flow management, and network-layer protocols, which are often unreliable and unacknowledged. Transport layer protocols are often very tightly-tied to the network layer protocols directly below them, and designed specifically to take care of functions that they do not deal with.

Internet Protocol (IP) addresses are the universally-used main form of addressing on a TCP/IP network. These network-layer addresses uniquely identify each network interface, and as such, serve as the mechanism by which data is routed to the correct network on the internetwork, and then the correct device on that network. What some people don't realize, however, is that there is an additional level of addressing that occurs at the transport layer in TCP/IP, above that of the IP address. Both of the TCP/IP transport protocols, TCP and UDP, use the concepts of ports and sockets for virtual software addressing.

Multiplexing and Demultiplexing Using Ports

The question is: how do we demultiplex a sequence of IP datagrams that need to go to many different application processes? Let's consider a particular host with a single network interface bearing the IP address 24.156.79.20. Normally, every datagram received by the IP layer will have this value in the IP Destination Address field. Consecutive datagrams received by IP may contain a piece of a file you are downloading with your Web browser, an e-mail sent to you by your brother, and a line of text a buddy wrote in an IRC chat channel. How does the IP layer know which datagrams go where, if they all have the same IP address?

The first part of the answer lies in the Protocol field included in the header of each IP datagram. This field carries a code that identifies the protocol that sent the data in the datagram to IP. Since most end-user applications use TCP or UDP at the transport layer, the Protocol field in a received datagram tells IP to pass data to either TCP or UDP as appropriate.

In both UDP and TCP messages two addressing fields appear, for a Source Port and a Destination Port. These are analogous to the fields for source address and destination address at the IP level, but at a higher level of detail. They identify the originating process on the source machine, and the destination process on the destination machine. They are filled in by the TCP or UDP software before transmission, and used to direct the data to the correct process on the destination device.

TCP and UDP port numbers are 16 bits in length, so valid port numbers can theoretically take on values from 0 to 65,535. As we will see in the next topic, these values are divided into ranges for different purposes, with certain ports reserved for particular uses.

One fact that is sometimes a bit confusing is that both UDP and TCP use the same range of port numbers, and they are independent. So, in theory, it is possible for UDP port number 77 to refer to one application process and TCP port number 77 to refer to an entirely different one. There is no ambiguity, at least to the computers, because as mentioned above, each IP datagram contains a Protocol field that specifies whether it is carrying a TCP message or a UDP message. IP passes the datagram to either TCP or UDP, which then sends the message on to the right process using the port number in the TCP or UDP header. This mechanism is illustrated in Figure 198.

In practice, having TCP and UDP use different port numbers is confusing, especially for the reserved port numbers used by common applications. For this reason, by convention, most reserved port numbers are reserved for both TCP and UDP. For example, port #80 is reserved for the Hypertext Transfer Protocol (HTTP) for both TCP and UDP, even though HTTP only uses TCP.

Application process multiplexing and demultiplexing in TCP/IP is implemented using the IP Protocol field and the UDP/TCP Source Port and Destination Port fields. Upon transmission, the Protocol field is given a number to indicate whether TCP or UDP was used, and the port numbers are filled in to indicate the sending and receiving software process. The device receiving the datagram uses the Protocol field to determine whether TCP or UDP was used, and then passes the data to the software process indicated by the Destination Port number.
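The two-step lookup described above can be sketched in a few lines of Python. This is only an illustration, assuming an IPv4 datagram held in memory as bytes; the function names, the hand-built test datagram and the "process A" / "process B" labels are all made up for the example. The protocol numbers 6 (TCP) and 17 (UDP) are the values assigned by IANA.

```python
import struct

PROTOCOL_NAMES = {6: "TCP", 17: "UDP"}   # IANA-assigned IP protocol numbers

def demultiplex(datagram: bytes):
    """Return (transport protocol, source port, destination port)."""
    ihl = (datagram[0] & 0x0F) * 4             # IP header length in bytes
    protocol = PROTOCOL_NAMES.get(datagram[9])  # byte 9 is the IP Protocol field
    if protocol is None:
        raise ValueError("not a TCP or UDP datagram")
    # TCP and UDP both start with 16-bit Source Port and Destination Port fields.
    src_port, dst_port = struct.unpack_from("!HH", datagram, ihl)
    return protocol, src_port, dst_port

def make_datagram(protocol_number: int, src_port: int, dst_port: int) -> bytes:
    """Build a bare-bones datagram for the demo (most IP fields left as zero)."""
    header = bytearray(20)
    header[0] = 0x45                 # version 4, header length 5 words (20 bytes)
    header[9] = protocol_number
    return bytes(header) + struct.pack("!HH", src_port, dst_port)

# Hypothetical processes: port 77 means different things under TCP and UDP.
listeners = {("TCP", 77): "process A", ("UDP", 77): "process B"}
protocol, _, dst_port = demultiplex(make_datagram(17, 3022, 77))
print(listeners[(protocol, dst_port)])   # prints "process B"
```

Keying the table on the protocol as well as the port is what keeps TCP port 77 and UDP port 77 independent.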

To allow client devices to more easily establish connections to TCP/IP servers, server processes for common applications use universal server port numbers that clients are pre-programmed to know to use by default.

Port number assignments are managed by IANA to ensure universal compatibility around the global Internet. The numbers are divided into three ranges: well-known port numbers (0–1023) used for the most common applications, registered port numbers (1024–49151) for other applications, and private/dynamic port numbers (49152–65535) that can be used without IANA registration.
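As a quick illustration of the three ranges, here is a small Python helper. The function name is made up for this example; the boundaries are the ones given above.

```python
def port_range(port: int) -> str:
    """Name the IANA range that a TCP or UDP port number falls into."""
    if not 0 <= port <= 65535:
        raise ValueError("port numbers are 16 bits: 0 to 65535")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "private/dynamic"

print(port_range(80))     # prints "well-known"
print(port_range(8080))   # prints "registered"
```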

To know where to send the reply, the server must know the port number the client is using. This is supplied by the client as the Source Port in the request, and then used by the server as the destination port to send the reply. Client processes don't use well-known or registered ports. Instead, each client process is assigned a temporary port number for its use. This is commonly called an ephemeral port number.

Well-known and registered port numbers are needed for server processes since a client must know the server’s port number to initiate contact. In contrast, client processes can use any port number. Each time a client process initiates a UDP or TCP communication it is assigned a temporary, or ephemeral, port number to use for that conversation. These port numbers are assigned in a pseudo-random way, since the exact number used is not important, as long as each process has a different number.

Just as well-known and registered port numbers are used for server processes, ephemeral port numbers are for client processes only. This means that drawing ephemeral port numbers from a range such as 1,024 to 4,999 does not conflict with the use of that same range for registered port numbers as seen in the previous topic.
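On most systems you can watch an ephemeral port being handed out by binding a socket to port 0, which asks the operating system to pick any free port. The sketch below uses Python's standard socket module; the helper name is made up, and the range the number comes from depends on the operating system, so no particular value should be expected.

```python
import socket

def ephemeral_udp_port() -> int:
    """Ask the operating system for a temporary port and return its number."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("127.0.0.1", 0))    # port 0 means "assign any free port"
        return sock.getsockname()[1]   # the port the system actually chose

print(ephemeral_udp_port())
```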

Sockets: Process Identification

What this all means is that the overall identification of an application process actually uses the combination of the IP address of the host it runs on—or the network interface over which it is talking, to be more precise—and the port number which has been assigned to it. This combined address is called a socket. Sockets are specified using the following notation:

<IP Address>:<Port Number>

So, for example, if we have a Web site running on IP address 41.199.222.3, the socket corresponding to the HTTP server for that site would be 41.199.222.3:80.

The overall identifier of a TCP/IP application process on a device is the combination of its IP address and port number, which is called a socket.

You will also sometimes see a socket specified using a host name instead of an IP address, like this:

<Host Name>:<Port Number>

For example, you might find a Web site URL like this: “http://www.thisisagreatsite.com:8080”. This tells the Web browser to first resolve the name “www.thisisagreatsite.com” to an IP address using DNS, and then send a request to that address using the non-standard server port 8080, which is occasionally used instead of port 80 since it resembles it. (See the discussion of application layer addressing using URLs for much more.)
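To make the host-name form concrete, here is a small Python sketch that pulls the host name and port out of a URL using the standard urllib.parse module. The helper name and the DEFAULT_PORTS table are assumptions for this example. The DNS step is left out because the host name above is only an illustration.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80}   # used when the URL does not name a port

def host_and_port(url: str):
    """Return the (host name, port number) that a URL refers to."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port

print(host_and_port("http://www.thisisagreatsite.com:8080"))
# prints ('www.thisisagreatsite.com', 8080)
```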

The socket is a very fundamental concept to the operation of TCP/IP application software. In fact, it is the basis for an important TCP/IP application program interface (API) with the same name: sockets. A version of this API for Windows is called Windows Sockets or WinSock, which you may have heard of before. These APIs allow application programs to easily use TCP/IP to communicate.

Socket Pairs: Connection Identification

So, the exchange of data between a pair of devices consists of a series of messages sent from a socket on one device to a socket on the other. Each device will normally have multiple such simultaneous conversations going on. In the case of TCP, a connection is established for each pair of devices for the duration of the communication session. These connections must be managed, and this requires that they be uniquely identified. This is done using the pair of socket identifiers for each of the two devices that are connected.

Each device may have multiple TCP connections active at any given time. Each connection is uniquely identified using the combination of the client socket and server socket, which in turn contains four elements: the client IP address and port, and the server IP address and port.

Let's return to the example we used in the previous topic (Figure 199). We are sending an HTTP request from our client at 177.41.72.6 to the Web site at 41.199.222.3. The server for that Web site will use well-known port number 80, so its socket is 41.199.222.3:80, as we saw before. We have been assigned ephemeral port number 3,022 for our Web browser, so the client socket is 177.41.72.6:3022. The overall connection between these devices can be described using this socket pair:

(41.199.222.3:80, 177.41.72.6:3022)

For much more on how TCP identifies connections, see the topic on TCP ports and connection identification in the section on TCP fundamentals.

Unlike TCP, UDP is a connectionless protocol, so it obviously doesn't use connections. The pair of sockets on the sending and receiving devices can still be used to identify the two processes exchanging data, but since there are no connections the socket pair doesn't have the significance that it does in TCP.

Service     Port/Protocol   Description
ftp         21/tcp          File Transfer [Control]
ftp         21/udp          File Transfer [Control]
ssh         22/tcp          SSH Remote Login Protocol
ssh         22/udp          SSH Remote Login Protocol
telnet      23/tcp          Telnet
telnet      23/udp          Telnet
tftp        69/tcp          Trivial File Transfer
tftp        69/udp          Trivial File Transfer
http        80/tcp          World Wide Web HTTP
http        80/udp          World Wide Web HTTP
www         80/tcp          World Wide Web HTTP
www         80/udp          World Wide Web HTTP
www-http    80/tcp          World Wide Web HTTP
www-http    80/udp          World Wide Web HTTP
kerberos    88/tcp          Kerberos
kerberos    88/udp          Kerberos
pop3        110/tcp         Post Office Protocol - Version 3
pop3        110/udp         Post Office Protocol - Version 3
snmp        161/tcp         SNMP
snmp        161/udp         SNMP

UDP

The User Datagram Protocol (UDP) was developed for use by application protocols that do not require reliability, acknowledgment or flow control features at the transport layer. It is designed to be simple and fast, providing only transport layer addressing in the form of UDP ports and an optional checksum capability, and little else.

What UDP Does

UDP's only real task is to take data from higher-layer protocols and place it in UDP messages, which are then passed down to the Internet Protocol for transmission. The basic steps for transmission using UDP are:

1. Higher-Layer Data Transfer: An application sends a message to the UDP software.

2. UDP Message Encapsulation: The higher-layer message is encapsulated into the Data field of a UDP message. The headers of the UDP message are filled in, including the Source Port of the application that sent the data to UDP, and the Destination Port of the intended recipient. The checksum value may also be calculated.

3. Transfer Message To IP: The UDP message is passed to IP for transmission.

And that's about it. Of course, on reception at the destination device this short procedure is reversed.
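The encapsulation step can be sketched in Python as follows. This is a minimal illustration, not a working UDP implementation: the function names are made up, the checksum is simply left at zero (meaning "not computed"), and the resulting bytes are not actually handed to IP. The four header fields are Source Port, Destination Port, Length and Checksum, each 16 bits.

```python
import struct

# Source Port, Destination Port, Length, Checksum: four 16-bit fields.
UDP_HEADER = struct.Struct("!HHHH")

def encapsulate(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Wrap higher-layer data in a UDP message."""
    length = UDP_HEADER.size + len(payload)   # Length covers header plus data
    checksum = 0                              # 0 here means no checksum was computed
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload

def decapsulate(message: bytes):
    """Reverse the procedure on reception."""
    src_port, dst_port, length, _checksum = UDP_HEADER.unpack_from(message)
    return src_port, dst_port, message[UDP_HEADER.size:length]

message = encapsulate(3022, 53, b"hello")
print(len(message))          # prints 13 (8 header bytes + 5 data bytes)
print(decapsulate(message))  # prints (3022, 53, b'hello')
```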

What UDP Does Not

In fact, UDP is so simple, that its operation is very often described in terms of what it does not do, instead of what it does. As a transport protocol, some of the most important things UDP does not do include the following:

o UDP does not establish connections before sending data. It just packages it and… off it goes.

o UDP does not provide acknowledgments to show that data was received.

o UDP does not provide any guarantees that its messages will arrive.

o UDP does not detect lost messages and retransmit them.

o UDP does not ensure that data is received in the same order in which it was sent.

o UDP does not provide any mechanism to manage the flow of data between devices, or handle congestion.

UDP packages application layer data into a very simple message format that includes only four header fields. One of these is an optional checksum field; when used, the checksum is computed over the entire UDP message (header and data) together with a “pseudo header” of fields taken from the IP header and the UDP length, in a manner very similar to how the TCP checksum is calculated.
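Here is a rough Python sketch of the checksum calculation over IPv4, as I understand it from the UDP specification. It assumes the pseudo header holds the source address, destination address, a zero byte, the protocol number (17 for UDP) and the UDP length. The function names are made up for this example, and the message passed in must have its Checksum field set to zero.

```python
import socket
import struct

def ones_complement_sum(data: bytes) -> int:
    """Add up 16-bit words, folding any carry back into the low 16 bits."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_length))

def udp_checksum(src_ip: str, dst_ip: str, udp_message: bytes) -> int:
    """Checksum for a UDP message whose Checksum field is currently zero."""
    covered = pseudo_header(src_ip, dst_ip, len(udp_message)) + udp_message
    checksum = ~ones_complement_sum(covered) & 0xFFFF
    return checksum or 0xFFFF    # a computed zero is sent as all ones
```

A receiver can check a message by running the same sum over the pseudo header and the message as received, checksum included; if nothing was damaged, the result is 0xFFFF.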

UDP is most often used by a protocol instead of TCP in two situations. The first is when an application values timely delivery over reliable delivery, and where TCP’s retransmission of lost data would be of limited or even no value. The second is when a simple protocol is able to handle the potential loss of an IP datagram itself at the application layer using a timer/retransmit strategy, and where the other features of TCP are not required. UDP is also used for applications that require multicast or broadcast transmissions, since these are not supported by TCP.

If an application needs to multicast or broadcast data, it must use UDP, because TCP is only supported for unicast communication between two devices.

TCP

The primary transport layer protocol in the TCP/IP suite is the Transmission Control Protocol (TCP). TCP is a connection-oriented, acknowledged, reliable, fully-featured protocol designed to provide applications with a reliable way to send data using the unreliable Internet Protocol. It allows applications to send bytes of data as a stream of bytes, and automatically packages them into appropriately-sized segments for transmission. It uses a special sliding window acknowledgment system to ensure that all data is received by its recipient, to handle necessary retransmissions, and to provide flow control so each device in a connection can manage the rate at which it is sent data.

Functions Performed By TCP

Despite the complexity of TCP, its basic operation can be reasonably simplified by describing its primary functions. The following are what I believe to be the six main tasks that TCP performs:

o Addressing/Multiplexing: TCP is used by many different applications for their transport protocol. Therefore, like its simpler sibling UDP, an important job for TCP is multiplexing the data received from these different processes so they can be sent out using the underlying network-layer protocol. At the same time, these higher-layer application processes are identified using TCP ports. The section on TCP/IP transport layer addressing contains a great deal of detail on how this addressing works.

o Connection Establishment, Management and Termination: TCP provides a set of procedures that devices follow to negotiate and establish a TCP connection over which data can travel. Once opened, TCP includes logic for managing connections and handling problems that may result with them. When a device is done with a TCP connection, a special process is followed to terminate it.

o Data Handling and Packaging: TCP defines a mechanism by which applications are able to send data to it from higher layers. This data is then packaged into messages to be sent to the destination TCP software. The destination software unpackages the data and gives it to the application on the destination machine.

o Data Transfer: Conceptually, the TCP implementation on a transmitting device is responsible for the transfer of packaged data to the TCP process on the other device. Following the principle of layering, this is done by having the TCP software on the sending machine pass the data packets to the underlying network-layer protocol, which again normally means IP.

o Providing Reliability and Transmission Quality Services: TCP includes a set of services and features that allow an application to consider the sending of data using the protocol to be “reliable”. This means that normally, a TCP application doesn't have to worry about data being sent and never showing up, or arriving in the wrong order. It also means other common problems that might arise if IP were used directly are avoided.

o Providing Flow Control and Congestion Avoidance Features: TCP allows the flow of data between two devices to be controlled and managed. It also includes features to deal with congestion that may be experienced during communication between devices.

Functions Not Performed By TCP

TCP does so much that sometimes it is described as doing “everything” an application needs to use an internetwork. I may even have been guilty of this myself. However, the protocol doesn't do everything. It has limitations and certain areas that its designers specifically did not address. Notable functions that TCP does not perform include:

o Specifying Application Use: TCP defines the transport protocol. It does not describe specifically how applications are to use TCP.

o Providing Security: TCP does not provide any mechanism for ensuring the authenticity or privacy of data it transmits. If these are needed they must be accomplished using some other means, such as IPSec, for example.

o Maintaining Message Boundaries: TCP sends data as a continuous stream, not as discrete messages. It is up to the application to specify where one message ends and the next begins.

o Guaranteeing Communication: Wait a minute… isn't the whole point of TCP supposed to be that it guarantees data will get to its destination? Well, yes and no. TCP will detect unacknowledged transmissions and re-send them if needed. However, in the event of some sort of problem that prevents reliable communication, all TCP can do is “keep trying”. It can't make any guarantees because there are too many things out of its control. Similarly, it can attempt to manage the flow of data, but cannot resolve every problem.

TCP Characteristics

The following are the ways that I would best describe the Transmission Control Protocol and how it performs the functions described in the preceding topic:

o Connection-Oriented: TCP requires that devices first establish a connection with each other before they send data. The connection creates the equivalent of a circuit between the units, and is analogous to a telephone call. A process of negotiation occurs to establish the connection, ensuring that both devices agree on how data is to be exchanged.

o Bidirectional: Once a connection is established, TCP devices send data bidirectionally. Both devices on the connection can send and receive, regardless of which of them initiated the connection.

o Multiply-Connected and Endpoint-Identified: TCP connections are identified by the pair of sockets used by the two devices in the connection. This allows each device to have multiple connections opened, either to the same IP device or different IP devices, and to handle each connection independently without conflicts.

o Reliable: Communication using TCP is said to be reliable because TCP keeps track of data that has been sent and received to ensure it all gets to its destination. As we saw in the previous topic, TCP can't really “guarantee” that data will always be received. However, it can guarantee that all data sent will be checked for reception, and checked for data integrity, and then retransmitted when needed. So, while IP uses “best effort” transmissions, you could say TCP tries harder, as the old rent-a-car commercial goes.

o Acknowledged: A key to providing reliability is that all transmissions in TCP are acknowledged (at the TCP layer—TCP cannot guarantee that all such transmissions are received by the remote application). The recipient must tell the sender “yes, I got that” for each piece of data transferred. This is in stark contrast to typical messaging protocols where the sender never knows what happened to its transmission. As we will see, this is fundamental to the operation of TCP as a whole.

o Stream-Oriented: Most lower-layer protocols are designed so that to use them, higher-layer protocols must send them data in blocks. IP is the best example of this; you send it a message to be formatted and it puts that message into a datagram. UDP is the same. In contrast, TCP allows applications to send it a continuous stream of data for transmission. Applications don't need to worry about making this into chunks for transmission; TCP does it.

o Data-Unstructured: An important consequence of TCP's stream orientation is that there are no natural divisions between data elements in the application's data stream. When multiple messages are sent over TCP, applications must provide a way of differentiating one message (data element, record, etc.) from the next.

o Data-Flow-Managed: TCP does more than just package data and send it as fast as possible. A TCP connection is managed to ensure that data flows evenly and smoothly, with means included to deal with problems that arise along the way.

TCP is designed to have applications send data to it as a stream of bytes, rather than requiring fixed-size messages to be used. This provides maximum flexibility for a wide variety of uses, because applications don’t need to worry about data packaging, and can send files or messages of any size. TCP takes care of packaging these bytes into messages called segments.

Consider for example an application that is sending database records. It needs to transmit record #579 from the Employees database table, followed by record #581 and record #611. It sends these records to TCP, which treats them all collectively as a stream of bytes. TCP will package these bytes into segments, but in a manner the application cannot predict. It is possible that each will end up in a different segment, but more likely they will all be in one segment, or part of each will end up in different segments, depending on their length. The records themselves must have some sort of explicit markers so the receiving device can tell where one record ends and the next starts. Since applications send data to TCP as a stream of bytes and not prepackaged messages, each application must use its own scheme to determine where one application data element ends and the next begins.
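One common way for an application to mark where each record ends is to send the record's length ahead of the record itself. The Python sketch below shows that idea. It is only one possible scheme, not something TCP defines: the 4-byte length prefix, the function name and the RecordReader class are all choices made for this example.

```python
import struct

def frame(record: bytes) -> bytes:
    """Prefix a record with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(record)) + record

class RecordReader:
    """Rebuild records from a byte stream that arrives in arbitrary pieces."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes):
        """Add newly received bytes; return every record now complete."""
        self._buffer += chunk
        records = []
        while len(self._buffer) >= 4:
            (length,) = struct.unpack_from("!I", self._buffer)
            if len(self._buffer) < 4 + length:
                break                          # rest of the record not here yet
            records.append(bytes(self._buffer[4:4 + length]))
            del self._buffer[:4 + length]
        return records

stream = frame(b"record 579") + frame(b"record 581") + frame(b"record 611")
reader = RecordReader()
received = []
for i in range(0, len(stream), 7):             # pretend data arrives 7 bytes at a time
    received += reader.feed(stream[i:i + 7])
print(received)   # prints [b'record 579', b'record 581', b'record 611']
```

The 7-byte pieces stand in for whatever way TCP happens to divide the stream; the receiver gets the same three records however the bytes are split.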

TCP is said to treat data coming from an application as a stream; thus, the description of TCP as stream-oriented. Each application sends the data it wishes to transmit as a steady stream of octets (bytes). It doesn't need to carve them into blocks, or worry about how lengthy streams will get across the internetwork. It just “pumps bytes” to TCP.

Since TCP works with individual bytes of data rather than discrete messages, it must use an identification scheme that works at the byte level to implement its data transmission and tracking system. This is accomplished by assigning each byte TCP processes a sequence number.

TCP carefully keeps track of the data it sends and what happens to it. This management of data is required to facilitate two key requirements of the protocol:

o Reliability: Ensuring that data that is sent actually arrives at its destination, and if not, detecting this and re-sending the data.

o Data Flow Control: Managing the rate at which data is sent so that it does not overwhelm the device that is receiving it.

A basic technique for ensuring reliability in communications uses a rule that requires a device to send back an acknowledgment each time it successfully receives a transmission. If a transmission is not acknowledged after a period of time, it is retransmitted by its sender. This system is called positive acknowledgment with retransmission (PAR). One drawback with this basic scheme is that the transmitter cannot send a second message until the first has been acknowledged.
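The basic PAR rule can be simulated in a few lines of Python. This is a toy model only: there is no real network or timer, and the delivery_plan argument (a list of True/False values saying whether each transmission is acknowledged in time) is an invention for this example, as are the function and parameter names.

```python
def send_with_par(messages, delivery_plan, max_attempts=5):
    """Simulate PAR; return a list of (message, number of attempts needed)."""
    outcomes = iter(delivery_plan)
    log = []
    for message in messages:
        for attempt in range(1, max_attempts + 1):
            # True: the acknowledgment arrived before the timer expired.
            # When the plan runs out, later transmissions are assumed to succeed.
            acknowledged = next(outcomes, True)
            if acknowledged:
                log.append((message, attempt))
                break            # only now may the next message be sent
        else:
            raise TimeoutError(f"gave up on {message!r}")
    return log

print(send_with_par(["a", "b", "c"], [True, False, False, True, True]))
# prints [('a', 1), ('b', 3), ('c', 1)]
```

Note how message "c" cannot even start until "b" has finally been acknowledged, which is exactly the drawback mentioned above.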

The basic PAR reliability scheme can be enhanced by identifying each message to be sent, so multiple messages can be in transit at once. The use of a send limit allows the mechanism to also provide flow control capabilities, by allowing each device to control the rate at which it is sent data.

The TCP sliding window system is a variation on the enhanced PAR system, with changes made to support TCP’s stream orientation. Each device keeps track of the status of the byte stream it needs to transmit by dividing it into four conceptual categories: bytes sent and acknowledged, bytes sent but not yet acknowledged, bytes not yet sent but that can be sent immediately, and bytes not yet sent that cannot be sent until the recipient signals that it is ready for them.

The Send Window and Usable Window

The send window is the key to the entire TCP sliding window system: it represents the maximum number of unacknowledged bytes a device is allowed to have outstanding at once. The usable window is the amount of the send window that the sender is still allowed to send at any point in time; it is equal to the size of the send window less the number of unacknowledged bytes already transmitted.

When a device gets an acknowledgment for a range of bytes, it knows they have been successfully received by their destination. It moves them from the “sent but unacknowledged” to the “sent and acknowledged” category. This causes the send window to slide to the right, allowing the device to send more data.

TCP acknowledgments are cumulative, and tell a transmitter that all the bytes up to the sequence number indicated in the acknowledgment were received successfully. Thus, if bytes are received out of order, they cannot be acknowledged until all the preceding bytes are received. TCP includes a method for timing transmissions and retransmitting lost segments if necessary.
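The effect of cumulative acknowledgment can be shown with a short Python sketch. It assumes the acknowledgment carries the sequence number of the next byte the receiver expects, which is how the TCP Acknowledgment Number field is normally used. Sequence number wrap-around is ignored, and the function name and segment representation are made up for the example.

```python
def cumulative_ack(next_expected: int, received_segments) -> int:
    """Return the acknowledgment number a receiver could send.

    received_segments is an iterable of (first sequence number, length).
    """
    for start, length in sorted(received_segments):
        if start > next_expected:
            break        # a gap: bytes beyond it cannot be acknowledged yet
        next_expected = max(next_expected, start + length)
    return next_expected

# Bytes 1-100 and 201-300 have arrived, but 101-200 are missing.
print(cumulative_ack(1, [(1, 100), (201, 100)]))               # prints 101
# Once the missing bytes arrive, everything can be acknowledged at once.
print(cumulative_ack(1, [(1, 100), (201, 100), (101, 100)]))   # prints 301
```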

The TCP sliding windows scheme uses three pointers that keep track of which bytes are in each of the four transmit categories. SND.UNA points to the first unacknowledged byte and indicates the start of Transmit Category #2; SND.NXT points to the next byte of data to be sent and marks the start of Transmit Category #3. SND.WND contains the size of the send window; it is added to SND.UNA to mark the start of Transmit Category #4. Adding SND.WND to SND.UNA and then subtracting SND.NXT yields the current size of the usable transmit window.
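The pointer arithmetic is easy to try out in Python. In this sketch the class and method names are made up and sequence number wrap-around is ignored. Category #4 is taken to begin at SND.UNA + SND.WND, which is what the usable window formula (SND.UNA + SND.WND - SND.NXT) implies, since the send window starts at the first unacknowledged byte.

```python
from dataclasses import dataclass

@dataclass
class SendPointers:
    snd_una: int   # first byte sent but not yet acknowledged
    snd_nxt: int   # next byte to be sent
    snd_wnd: int   # size of the send window

    def usable_window(self) -> int:
        return self.snd_una + self.snd_wnd - self.snd_nxt

    def category(self, seq: int) -> int:
        """Return the transmit category (1 to 4) of the byte numbered seq."""
        if seq < self.snd_una:
            return 1     # sent and acknowledged
        if seq < self.snd_nxt:
            return 2     # sent but not yet acknowledged
        if seq < self.snd_una + self.snd_wnd:
            return 3     # not yet sent, but may be sent immediately
        return 4         # may not be sent until the window slides

pointers = SendPointers(snd_una=32, snd_nxt=46, snd_wnd=20)
print(pointers.usable_window())                        # prints 6
print([pointers.category(s) for s in (31, 32, 46, 52)])  # prints [1, 2, 3, 4]
```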

Three essential fields in the TCP segment format are used to implement the sliding windows system. The Sequence Number field indicates the number of the first byte of data being transmitted. The Acknowledgment Number is used to acknowledge data received by the device sending this segment. The Window field tells the recipient of the segment the size to which it should set its send window.
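For illustration, the following Python sketch pulls these three fields out of the start of a TCP header using the standard struct module. It assumes the usual header layout (two 16-bit ports, 32-bit Sequence Number, 32-bit Acknowledgment Number, 16 bits of data offset and flags, then the 16-bit Window); the function name and the sample values are made up.

```python
import struct

def sliding_window_fields(segment: bytes) -> dict:
    """Extract the fields used by the sliding window system."""
    (_src_port, _dst_port, sequence, acknowledgment,
     _offset_and_flags, window) = struct.unpack_from("!HHIIHH", segment)
    return {"sequence": sequence,
            "acknowledgment": acknowledgment,
            "window": window}

# A hand-built 20-byte header: ports, seq, ack, offset/flags, window, checksum, urgent.
sample = struct.pack("!HHIIHHHH", 3022, 80, 1000, 5000, (5 << 12) | 0x10, 360, 0, 0)
print(sliding_window_fields(sample))
# prints {'sequence': 1000, 'acknowledgment': 5000, 'window': 360}
```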

Connection-Handling Responsibilities

Since TCP is connection-oriented, it has many more responsibilities than a connectionless protocol such as UDP. Each TCP software layer needs to be able to support connections to several other TCPs simultaneously. The operation of each connection is separate from that of every other connection, and the TCP software must manage each independently. It must be sure not only that data is routed to the right process, but that data transmitted on each connection is managed without any overlap or confusion.

Connection Identification

The first consequence of this is that each connection must be uniquely identified. This is done by using the pair of socket identifiers corresponding to the two endpoints of the connection, where a socket is simply the combination of the IP address and the port number of each process. This means a socket pair contains four pieces of information: source address, source port, destination address and destination port. Thus, TCP connections are sometimes said to be described by this addressing quadruple.

I introduced this in the general topic on TCP/IP sockets, where I gave the example of an HTTP request sent from a client at 177.41.72.6 to a Web site at 41.199.222.3. The server for that Web site will use well-known port number 80, so its socket is 41.199.222.3:80. If the client has been assigned ephemeral port number 3,022 for the Web browser, the client socket is 177.41.72.6:3022. The overall connection between these devices can be described using this socket pair:

(41.199.222.3:80, 177.41.72.6:3022)

Each device can handle simultaneous TCP connections to many different processes on one or more devices. Each connection is identified by the socket numbers of the devices in the connection, called the connection’s endpoints. Each endpoint consists of the device’s IP address and port number, so each connection is identified by the quadruple of client IP address and port number, and server IP address and port number.
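To illustrate why the full quadruple is needed, here is a minimal sketch of a connection table keyed by it. The `ConnectionId` type, the table and the second ephemeral port (3023) are hypothetical names and values chosen for the example; real TCP implementations organize this lookup in their own ways.

```python
from typing import Dict, NamedTuple


class ConnectionId(NamedTuple):
    """The addressing quadruple that identifies one TCP connection."""
    local_ip: str
    local_port: int
    remote_ip: str
    remote_port: int


# Hypothetical connection table on the server at 41.199.222.3.
connections: Dict[ConnectionId, str] = {}

# Two connections to the same server socket, from the same client address,
# stay distinct because the client's ephemeral port differs.
first = ConnectionId("41.199.222.3", 80, "177.41.72.6", 3022)
second = ConnectionId("41.199.222.3", 80, "177.41.72.6", 3023)
connections[first] = "browser tab A"
connections[second] = "browser tab B"

print(len(connections))    # 2
print(connections[first])  # browser tab A
```

If only the server socket were used as the key, the second connection would overwrite the first; all four values together keep the two apart.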

The Simplified TCP Finite State Machine

The TCP finite state machine describes the sequence of steps taken by both devices in a TCP session as they establish, manage and close the connection. Each device may take a different path through the states, since under normal circumstances the operation of the protocol is not symmetric: one device initiates either connection establishment or termination, and the other responds.

Three types of message control transitions between states. They correspond to the TCP header flags that are set to indicate that a message is serving that function. These are:

o SYN: A synchronize message, used to initiate and establish a connection. It is so named since one of its functions is to synchronize sequence numbers between devices.

o FIN: A finish message, which is a TCP segment with the FIN bit set, indicating that a device wants to terminate the connection.

o ACK: An acknowledgment, indicating receipt of a message such as a SYN or a FIN.
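Since each of these message types is just a bit in the TCP header, recognizing them comes down to testing bits. The sketch below uses the standard bit values of the TCP control flags; the table `FLAGS` and the helper `decode_flags` are hypothetical names for this example. RST, PSH and URG are included for completeness.

```python
# Bit values of the control flags in the TCP header flags byte.
FLAGS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
}


def decode_flags(flag_byte: int) -> list:
    """Return the names of the control bits set in flag_byte."""
    return [name for name, bit in FLAGS.items() if flag_byte & bit]


print(decode_flags(0x02))  # ['SYN']
print(decode_flags(0x12))  # ['SYN', 'ACK']
print(decode_flags(0x11))  # ['FIN', 'ACK']
```

A single segment can serve more than one function at once, which is why the second and third examples report two flags.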

TCP Connection Establishment Process: The "Three-Way Handshake"

The Problem With Starting Every Connection Using the Same Sequence Number

To avoid the problems that arise when every connection starts with the same sequence number, each TCP device, at the time a connection is initiated, chooses a 32-bit initial sequence number (ISN) for the connection. Each device has its own ISN, and they will normally not be the same.

As part of the process of connection establishment, each of the two devices in a TCP connection informs the other of the sequence number it plans to use for its first data transmission by putting the preceding sequence number in the Sequence Number field of its SYN message. The other device confirms this by incrementing that value and putting it into the Acknowledgment Number field of its ACK, telling the first device that this is the sequence number it expects for the first data transmission. This process is called sequence number synchronization.
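Sequence number synchronization can be sketched as a small simulation. The function `handshake`, the tuple layout and the random choice of ISNs are assumptions made for illustration; the sketch also assumes the usual arrangement in which the server combines its SYN and its ACK in one segment.

```python
import random

SEQ_MOD = 2 ** 32  # sequence numbers are 32-bit values that wrap around


def handshake(client_isn: int, server_isn: int) -> list:
    """Return the three handshake messages as (sender, flags, seq, ack)."""
    syn = ("client", "SYN", client_isn, None)
    # The server acknowledges the client's ISN + 1 and announces its own ISN.
    syn_ack = ("server", "SYN+ACK", server_isn, (client_isn + 1) % SEQ_MOD)
    # The client acknowledges the server's ISN + 1.
    ack = ("client", "ACK", (client_isn + 1) % SEQ_MOD,
           (server_isn + 1) % SEQ_MOD)
    return [syn, syn_ack, ack]


# Each side picks its own ISN; random choice is an assumption for this sketch.
for message in handshake(random.randrange(SEQ_MOD), random.randrange(SEQ_MOD)):
    print(message)
```

After the third message, each device knows the sequence number the other will use for its first byte of data.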

Additional parameters may also be exchanged during connection setup. Some of these include:

o Window Scale Factor: Allows a pair of devices to specify larger window sizes than would normally be possible given the 16-bit size of the TCP Window field.

o Selective Acknowledgment Permitted: Allows a pair of devices to use the optional selective acknowledgment feature to allow only certain lost segments to be retransmitted.

o Alternate Checksum Method: Lets devices specify a method of performing checksums other than the standard TCP mechanism.
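As an example of how one of these parameters works, the Window Scale option carries a shift count that both devices then apply to the 16-bit Window field. The sketch below assumes the limit of 14 on the shift count set by the TCP window scaling specification; the function name `effective_window` is hypothetical.

```python
WINDOW_FIELD_MAX = 0xFFFF  # largest value the 16-bit Window field can hold
MAX_SHIFT = 14             # largest shift count the Window Scale option allows


def effective_window(window_field: int, shift: int) -> int:
    """Scale the 16-bit Window field by the negotiated shift count."""
    if not 0 <= window_field <= WINDOW_FIELD_MAX:
        raise ValueError("Window field must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return window_field << shift


print(effective_window(0xFFFF, 0))  # 65535
print(effective_window(0xFFFF, 7))  # 8388480
```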

TCP includes a special connection reset feature to allow devices to deal with problem situations, such as half-open connections or the receipt of unexpected message types. To use the feature, the device detecting the problem sends a TCP segment with the RST (reset) flag set to 1. The receiving device either returns to the LISTEN state, if it was in the process of connection establishment, or closes the connection and returns to the CLOSED state pending a new session negotiation.

Normal Connection Termination

A TCP connection is normally terminated using a special procedure where each side independently closes its end of the link. It normally begins with one of the application processes signalling to its TCP layer that the session is no longer needed. That device sends a FIN message to tell the other device that it wants to end the connection, which is acknowledged. When the responding device is ready, it too sends a FIN that is acknowledged; after waiting a period of time for the ACK to be received, the session is closed.

The TIME-WAIT State

The TIME-WAIT state is required for two main reasons. The first is to provide enough time to ensure that the ACK is received by the other device, and to retransmit it if it is lost. The second is to provide a “buffering period” between the end of this connection and any subsequent ones. If not for this period, it is possible that packets from different connections could be mixed, creating confusion.

The standard specifies that the client should wait double a particular length of time called the maximum segment lifetime (MSL) before finishing the close of the connection. The TCP standard defines MSL as being a value of 120 seconds (2 minutes). In modern networks this is an eternity, so TCP allows implementations to choose a lower value if it is believed that will lead to better operation.
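The resulting wait is simple arithmetic, shown here as a tiny sketch. The function name `time_wait_seconds` and the 30-second alternative are assumptions used only to illustrate an implementation choosing a lower MSL.

```python
MSL_SECONDS = 120  # value given by the TCP standard, per the text above


def time_wait_seconds(msl_seconds: float = MSL_SECONDS) -> float:
    """TIME-WAIT lasts twice the maximum segment lifetime."""
    return 2 * msl_seconds


print(time_wait_seconds())    # 240
print(time_wait_seconds(30))  # 60, for an implementation that picks MSL = 30 s
```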

Simultaneous Close

Just as two devices can simultaneously open a TCP session, they can terminate it simultaneously as well. In this case a different state sequence is followed, with each device responding to the other’s FIN with an ACK, waiting for receipt of its own ACK, and pausing for a period of time to ensure that its ACK is received by the other device before ending the connection.
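The termination sequences can be compared side by side with a small transition table. This is a simplified sketch covering connection termination only. The state names are those of the standard TCP finite state machine, most of which are not named in these notes, and the event labels are hypothetical strings invented for the example.

```python
# (current state, event) -> next state, for connection termination only.
TRANSITIONS = {
    ("ESTABLISHED", "app_close/send_FIN"): "FIN-WAIT-1",
    ("ESTABLISHED", "recv_FIN/send_ACK"): "CLOSE-WAIT",
    ("FIN-WAIT-1", "recv_ACK"): "FIN-WAIT-2",
    ("FIN-WAIT-1", "recv_FIN/send_ACK"): "CLOSING",  # simultaneous close
    ("FIN-WAIT-2", "recv_FIN/send_ACK"): "TIME-WAIT",
    ("CLOSING", "recv_ACK"): "TIME-WAIT",
    ("CLOSE-WAIT", "app_close/send_FIN"): "LAST-ACK",
    ("LAST-ACK", "recv_ACK"): "CLOSED",
    ("TIME-WAIT", "2MSL_timeout"): "CLOSED",
}


def walk(events, state="ESTABLISHED"):
    """Follow a list of events and return the sequence of states visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


initiator = walk(["app_close/send_FIN", "recv_ACK",
                  "recv_FIN/send_ACK", "2MSL_timeout"])
responder = walk(["recv_FIN/send_ACK", "app_close/send_FIN", "recv_ACK"])
simultaneous = walk(["app_close/send_FIN", "recv_FIN/send_ACK",
                     "recv_ACK", "2MSL_timeout"])
print(" -> ".join(initiator))
print(" -> ".join(responder))
print(" -> ".join(simultaneous))
```

In a normal close only the initiating device passes through TIME-WAIT, while in a simultaneous close both devices follow the third path and both wait before reaching CLOSED.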

TCP Message (Segment) Format

TCP Header Field Functions

The price we pay for this flexibility is that the TCP header is large: 20 bytes for regular segments and more for those carrying options. This is one of the reasons why some protocols prefer to use UDP if they don't need TCP's features. The TCP header fields are used for the following general purposes:

o Process Addressing: The processes on the source and destination devices are identified using port numbers.

o Sliding Window System Implementation: Sequence numbers, acknowledgment numbers and window size fields implement the TCP sliding window system.

o Control Bits and Fields: Special bits that implement various control functions, and fields that carry pointers and other data needed for them.

o Carrying Data: The Data field carries the actual bytes of data being sent between devices.

o Miscellaneous Functions: A checksum for data protection and options for connection setup.
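To make these groups of fields concrete, the following sketch unpacks the fixed 20-byte part of a TCP header using Python's standard `struct` module. The function `parse_tcp_header`, the dictionary keys and the hand-built sample segment are assumptions for illustration; the sample's checksum is left at zero and would not pass verification on a real network.

```python
import struct


def parse_tcp_header(segment: bytes) -> dict:
    """Unpack the fixed 20-byte part of a TCP header (network byte order)."""
    (src_port, dst_port, seq, ack, offset_byte, flags,
     window, checksum, urgent_ptr) = struct.unpack("!HHIIBBHHH", segment[:20])
    header_len = (offset_byte >> 4) * 4  # Data Offset counts 32-bit words
    return {
        "src_port": src_port,      # process addressing
        "dst_port": dst_port,
        "seq": seq,                # sliding window system
        "ack": ack,
        "window": window,
        "header_len": header_len,
        "flags": flags & 0x3F,     # URG, ACK, PSH, RST, SYN, FIN control bits
        "checksum": checksum,
        "urgent_ptr": urgent_ptr,
        "options": segment[20:header_len],
        "data": segment[header_len:],
    }


# Hand-built sample: port 3022 -> 80, SYN set, no options, no data.
sample = struct.pack("!HHIIBBHHH", 3022, 80, 1000, 0, 5 << 4, 0x02, 8192, 0, 0)
fields = parse_tcp_header(sample)
print(fields["src_port"], fields["dst_port"], fields["header_len"])  # 3022 80 20
```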

TCP is designed to restrict the size of the segments it sends to a certain maximum limit, to cut down on the likelihood that segments will need to be fragmented for transmission at the IP level. The TCP maximum segment size (MSS) specifies the maximum number of bytes in the TCP segment’s Data field, regardless of any other factors that influence segment size. The default MSS for TCP is 536 bytes, which results from taking the minimum IP MTU of 576 bytes and subtracting 20 bytes each for the IP and TCP headers.

Devices can indicate that they wish to use a different MSS value from the default by including a Maximum Segment Size option in the SYN message they use to establish a connection. Each device in the connection may use a different MSS value.
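The MSS arithmetic is easy to check with a short sketch. The function name `mss_for_mtu` is hypothetical, both headers are assumed to carry no options, and the 1500-byte value is the typical Ethernet MTU, added here as an example that does not appear in the text above.

```python
IP_HEADER_BYTES = 20   # IP header without options
TCP_HEADER_BYTES = 20  # TCP header without options


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP Data field that fits in one IP datagram of size mtu."""
    return mtu - IP_HEADER_BYTES - TCP_HEADER_BYTES


print(mss_for_mtu(576))   # 536, the default MSS
print(mss_for_mtu(1500))  # 1460, for a typical Ethernet MTU
```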

TCP Immediate Data Transfer: "Push" Function

TCP includes a special “push” function to handle cases where data given to TCP needs to be sent immediately. An application can send data to its TCP software and indicate that it should be pushed. The segment will be sent right away rather than being buffered. The pushed segment’s PSH control bit will be set to one to tell the receiving TCP that it should immediately pass the data up to the receiving application.

TCP Priority Data Transfer: "Urgent" Function

To deal with situations where a certain part of a data stream needs to be sent with a higher priority than the rest, TCP incorporates an “urgent” function. When critical data needs to be sent, the application signals this to its TCP layer, which transmits it with the URG bit set in the TCP segment, bypassing any lower-priority data that may have already been queued for transmission.
