1
TCP
2
Contents
• TCP
• TCP connection
• TCP flow control
• TCP congestion control
• TCP timer
• UDP
3
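Transport Layer
• End-to-end data transfer
  (cf) DLP (data link protocol) – data transfer between adjacent nodes
[Figure: two hosts connected through intermediate nodes. A DLP runs on each hop between adjacent nodes; the transport layer protocol runs end-to-end between the hosts.]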
4
Transport Layer services
• Addressing the application process and delivering data between processes
• What else should the transport layer do for application?
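[Figure: two end hosts, each running application processes AP1, AP2, AP3 over transport / IP / network access layers. An intermediate node (IP over network access 1 and network access 2) joins subnet 1 and subnet 2. The transport layer operates end-to-end.]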
5
What the transport layer should do in the Internet(1)
• IP provides unreliable services to the upper layers.
  – no error control
    • IP only verifies the header checksum; it neither sends ACKs nor retransmits.
  – no flow control/no congestion control
    • IP has no function to control the transmission rate according to the state of the receiver or the network.
  – no duplicate packet detection
    • When packets are delayed beyond the sender’s time limit because of network congestion or a detour, the sender retransmits them even though they were not lost on the way.
    • Likewise, when ACK packets are not delivered to the sender within the time limit, the sender times out and retransmits the same packets.
    • The receiver’s IP cannot detect these duplicate packets and delivers them to the upper layers.
  – out-of-order packet delivery
    • Because IP uses the datagram mode, packets can take different paths; consequently they might arrive out of order.
6
What the transport layer should do in the Internet(2)
• The application data delivered by IP might
  – be lost due to errors or congestion, or
  – arrive at the destination out of order, or
  – be duplicated at the destination.
• Thus, the transport layer protocol in the Internet should
  – provide reliable service to the application layer if the application requires it. Otherwise all the dirty work has to be done by the application itself.
  – There are two transport protocols in the Internet.
    • TCP – provides reliable service.
    • UDP – provides simple, streamlined delivery service to applications that do not need reliable service.
7
Internet transport layer protocols
• TCP (Transmission Control Protocol)
  – provides reliable services to the application layer.
    • Multiplexing (addressing the application services)
    • Error control (error detection and retransmission)
    • Flow control
    • Congestion control
    • In-order delivery (no out-of-sequence data)
• UDP (User Datagram Protocol)
  – provides unreliable services.
  – UDP performs very simple functions compared to TCP.
    • Multiplexing (addressing the application services)
    • Error detection (optional)
8
TCP service characteristics
• End-to-end reliable service
  – guarantees reliable data transfer between application processes
  – No error, no loss, no out-of-sequence delivery
• Connection-oriented service
  – Consists of three steps: connection setup, data transfer, connection release
• Full duplex transmission
  – TCP connection setup enables two-way connections.
• Stream-oriented transmission
  – TCP views messages from application processes as a continuous byte stream, not as separate packets.
• Graceful connection release
  – When the connection terminates, TCP releases it only after data transfer is completed.
9
How to provide reliable services(1)
• The transmission unit is the segment.
  – Data handed to TCP by an application process is divided into pieces of a size suitable for transmission. Each piece is called a segment, and it is the unit in which TCP sends application data.
  – UDP, by contrast, does not divide the application data; it sends the data as it was given by the application process.
• Management of the segment sequence
  – Each segment is given a sequence number (counted in bytes of the stream), so the receiver TCP can recognize lost segments and segments arriving out of sequence.
• ACK transmission
  – When TCP receives correct segments, it replies with an ACK segment.
  – To enhance performance, it uses cumulative ACKs.
• Timer management
  – When TCP sends a segment, it starts a timer. If the ACK for that segment does not arrive before the timer expires, it resends the same segment.
10
How to provide reliable services(2)
• Error control (checksum)
  – TCP checks every received segment for errors using the checksum field in the header. If it finds an error, it discards the segment.
  – Using the sequence numbers, it also detects lost and out-of-sequence segments.
• Order control
  – The receiver stores arriving segments in its buffer, puts them back in order, and then delivers the data to the application process.
• Detection and discard of duplicate segments
  – When the same segment arrives again, the receiver discards it.
11
How to provide reliable services(3)
• Clear connection management
  – Clear connection setup using the 3 way handshake
  – Also, clear connection release using the 3 way handshake
  – When an end station reboots, it may set up a new TCP connection while the other end still holds the old one. In this case, TCP can distinguish segments of the previous connection from those of the newly established connection.
• Flow control
  – TCP uses a receive buffer and notifies the other TCP on the connection of the space currently available in it. The other TCP may send only that much data and must then stop.
• Congestion control
  – TCP controls its transmission rate according to the congestion state of the network.
12
[Figure: an IP datagram consists of a 20-octet IP header followed by the TCP segment; the TCP segment consists of a 20-octet TCP header followed by the TCP data.]
13
TCP Header (one bullet per 32-bit row)
• 16-bit source port number | 16-bit destination port number
• 32-bit sequence number
• 32-bit acknowledgement number
• 4-bit header length | Reserved (6 bits) | URG | ACK | PSH | RST | SYN | FIN | 16-bit window size
• 16-bit TCP checksum | 16-bit urgent pointer
• Options (if any) | Padding (if any)
• Data (if any)
14
TCP Segment Format (code Bits)
Bit position   Name   Function
11             URG    urgent pointer field valid
12             ACK    acknowledgment field valid
13             PSH    deliver data on receipt of this segment
14             RST    reset the connection
15             SYN    synchronize sequence numbers
16             FIN    end of byte stream from sender
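The header layout and code bits above can be decoded with a few lines of Python. This is only a sketch: the function name `parse_tcp_header`, the dictionary keys, and the sample SYN segment (ports 49152 and 80, sequence number 1000) are made up for illustration, and options are ignored.

```python
import struct

# Code-bit names ordered from the least significant bit upward.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed 20-octet part of a TCP header (options are ignored)."""
    if len(raw) < 20:
        raise ValueError("a TCP header is at least 20 octets long")
    (src_port, dst_port, seq, ack,
     offset_and_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        # The top 4 bits give the header length in 32-bit words.
        "header_len": (offset_and_flags >> 12) * 4,
        # The low 6 bits are the code bits URG..FIN.
        "flags": [name for bit, name in enumerate(FLAG_NAMES)
                  if offset_and_flags & (1 << bit)],
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


# Hypothetical SYN segment: header length 5 words, only the SYN bit set.
sample = struct.pack("!HHIIHHHH", 49152, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
header = parse_tcp_header(sample)
```

Network byte order (`!`) matters here: all multi-octet TCP header fields are transmitted most significant octet first.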
15
Port number: addressing application
• A connection is uniquely identified by 5 elements.
  – (sender IP address, receiver IP address, protocol number, sender application process port number, receiver application process port number)
  – The combination of an IP address and a port number is sometimes called a socket.
[Figure: two hosts on separate subnets, each with a network access / IP / TCP-UDP / AP stack. Labels mark the identifier used at each layer: H/W addr at network access, IP addr at IP, protocol at TCP/UDP, and port at the application process (AP). The TCP connection runs between the two TCP entities.]
16
Connection Identification addresses
• IP address
  – identifies a specific host in the Internet.
  – has a 1:1 mapping to the physical address on the subnet that the host is connected to.
• Protocol number
  – identifies the upper layer protocol to which IP in the destination host should deliver the data.
• Port number
  – identifies the application process to which the receiving host should deliver the data.
  – Well-known port numbers
    • port numbers whose use has already been assigned by ICANN (IANA), e.g. 21 for the FTP server and 23 for the Telnet server.
  – Ephemeral port numbers
    • port numbers that are assigned temporarily to currently running application processes.
17
Well Known TCP Ports (/etc/services)
Decimal  Keyword     UNIX keyword  Description
0        -           -             Reserved
1        TCPMUX      -             TCP Multiplexor
5        RJE         -             Remote Job Entry
7        ECHO        echo          Echo
9        DISCARD     discard       Discard
11       USERS       systat        Active Users
13       DAYTIME     daytime       Daytime
15       -           netstat       Network status program
17       QUOTE       qotd          Quote of the day
19       CHARGEN     chargen       Character Generator
20       FTP-DATA    ftp-data      File Transfer Protocol (data)
21       FTP         ftp           File Transfer Protocol
23       TELNET      telnet        Terminal Connection
25       SMTP        smtp          Simple Mail Transport Protocol
37       TIME        time          Time
42       NAMESERVER  name          Host Name Server
43       NICNAME     whois         Who Is
53       DOMAIN      nameserver    Domain Name Server
77       -           rje           any private RJE service
79       FINGER      finger        Finger
93       DCP         -             Device Control Protocol
95       SUPDUP      supdup        SUPDUP Protocol
18
Sequence Number
• The sequence number identifies a position in the byte stream from the sending TCP to the receiving TCP: it is the number of the first byte of data in the segment.
• The unit is bytes, not segments.
• The sequence number space is 2^32, large enough to detect duplicate segments.
[Figure: the TCP user issues SEND requests of 200, 100 and 150 bytes of data; TCP transmits segments labelled [seq=300, data], [seq=500, data] and [seq=650, data].]
19
Acknowledgement Number
• Cumulative ACK
• By convention, the ACK number is the sequence number of the next byte that the receiver expects to receive.
Sender TCP Receiver TCP
[seq=1000, 100 byte data]
[seq=1100, 200 byte data]
[seq=1300, 100 byte data]
[ACK=1400]
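The cumulative ACK in the exchange above can be reproduced with a small sketch. The function name `cumulative_ack` and the `(seq, length)` tuple representation are assumptions made for illustration; the sketch also assumes segments never partially overlap.

```python
def cumulative_ack(expected: int, segments: list) -> int:
    """Return the ACK number after receiving `segments`, given as (seq, length) pairs.

    `expected` is the next byte the receiver is waiting for before any of
    the segments arrive.
    """
    buffered = {}  # out-of-order segments waiting for the gap to fill: seq -> length
    for seq, length in segments:
        if seq + length <= expected:
            continue  # entirely old data: a duplicate, nothing new to acknowledge
        buffered[seq] = length
        # Advance over every segment that is now contiguous with the stream.
        while expected in buffered:
            expected += buffered.pop(expected)
    return expected


# The three segments from the figure: 100, 200 and 100 bytes starting at 1000.
ack = cumulative_ack(1000, [(1000, 100), (1100, 200), (1300, 100)])
```

If the middle segment were missing, the ACK number would stay at 1100 no matter how much later data arrived, which is what lets the sender notice the gap.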
20
Duplicate segments in the same connection
[Figure: time-sequence diagram between Transport Entity A and Transport Entity B. Assumptions: sequence numbers are taken mod 8 and cumulative ACKs are used. A sends SN0 to SN7 and B returns ACKs; A times out and retransmits SN0, then times out and retransmits SN1. After the numbering wraps around, an obsolete SN0 arrives at B, and later the new SN0 arrives; B cannot tell the two apart.]
• Solution: the sequence number space should be large enough.
21
Duplicate segments in different connections(1)
[Figure: transport entity A sends SN 0, SN 1, SN 2 to transport entity B, and one copy of SN 2 is delayed in the network. The old connection is closed and a new connection is opened with a SYN exchange. The obsolete segment SN = 2 is accepted; the valid segment SN = 2 is discarded as a duplicate.]
22
Duplicate segments in different connections(2)
• Global numbering
  – If the sequence number of the last segment of the previous connection is N, the new connection starts with a sequence number that is far from N.
  – TCP should remember the sequence number that was used in the last segment.
• 2 MSL Timer
  – When a TCP connection closes, a new TCP connection is not allowed to open immediately. It can open only after a certain amount of time has passed.
  – Each TCP implementation chooses a value for the maximum segment lifetime (MSL), the maximum amount of time any segment can exist in the network before being discarded.
  – The TCP connection can be reused after the 2MSL wait is over.
23
Window Field
• This field is used for TCP flow control (often called the “credit technique”).
• The receiver uses it to notify the sender of the amount of empty space in the receiver’s TCP buffer.
• The unit is bytes.
• If the buffer is larger than 2^16 bytes, the window can be extended using an option field (the window scale option).
• Its use is independent of the acknowledgement number field, which reports the success or failure of segment transmission.
24
PUSH
• Background
  – Normally, when the sending TCP receives data from the sending application process, it does not send the data immediately. Instead it stores the data in its buffer and waits for additional data to arrive, in order to prevent Silly Window Syndrome.
  – In an interactive application, however, the sending TCP is required to send data immediately.
• PSH flag
  – The sending application process tells its TCP when to set the PUSH flag.
  – It notifies the sending TCP that the sending application process does not want the data to hang around in the TCP buffer, waiting for additional data to fill the buffer.
  – When the receiver TCP receives a segment with the PUSH flag, it passes the data to the receiver application process without waiting for any additional data.
  – The Socket API does not provide a way for the application to tell its TCP to set the PUSH flag. Setting this flag is up to the TCP implementation.
  – BSD implementations ignore a received PUSH flag because they normally never delay the delivery of received data to the application.
25
URGENT Bit & Urgent Pointer
• Urgent mode
  – The sending TCP tells the other TCP that urgent data of some form has been placed into the normal stream of data.
  – The receiving TCP notifies the receiving application of the arrival of urgent data. The application process decides what to do with it.
  – The URG bit is turned on and the urgent pointer is set to a positive offset that must be added to the sequence number field in the TCP header to obtain the sequence number of the last byte of urgent data.
  – In the socket API, the sending application process can request this by sending the data with the MSG_OOB flag (out-of-band data).
• What is urgent mode used for?
  – The two most common uses are Telnet and Rlogin, when interactive users type the interrupt key (e.g., ^C). Another is FTP, when interactive users abort a file transfer.
26
TCP Option Fields
• MSS (Maximum Segment Size) option
  – The maximum size of the data carried in one segment
  – When a connection is established, each end can announce the MSS it expects to receive. An MSS option can only appear in a SYN segment. If one end does not receive an MSS option from the other end, a default of 536 bytes is assumed.
    • 576 (IP datagram default size) - 40 (IP/TCP header fixed size)
  – In general, the larger the MSS the better, until fragmentation occurs.
• Window Scale Option
  – It increases the window size. The scale factor is a shift count of at most 14, so the maximum window size is 2^16 x 2^14 = 2^30 bytes.
    • New window size = window size defined in the header x 2^(window scale factor)
  – The window scale factor can be determined only during the connection setup phase.
• Time stamp option
  – The sender places a time stamp value in the segment when it leaves. When the receiver sends an ACK for this segment, it echoes the time stamp value received from the sender. When the sender receives the ACK, it can calculate the round trip time for this segment.
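The window scale and default MSS arithmetic can be checked with a short sketch. The names `scaled_window` and `DEFAULT_MSS` are illustrative only; the limit of 14 on the shift count is taken from the TCP extensions specification (RFC 7323) rather than from the slides.

```python
DEFAULT_IP_DATAGRAM = 576   # default IP datagram size in octets
FIXED_HEADERS = 40          # 20-octet IP header + 20-octet TCP header
DEFAULT_MSS = DEFAULT_IP_DATAGRAM - FIXED_HEADERS

MAX_WINDOW_SHIFT = 14       # largest shift count allowed by RFC 7323


def scaled_window(header_window: int, shift: int) -> int:
    """Effective window = 16-bit header value x 2^shift."""
    if not 0 <= header_window <= 0xFFFF:
        raise ValueError("the window field is 16 bits wide")
    if not 0 <= shift <= MAX_WINDOW_SHIFT:
        raise ValueError("the window scale shift count must be between 0 and 14")
    return header_window << shift


largest_window = scaled_window(0xFFFF, MAX_WINDOW_SHIFT)  # just under 2^30 bytes
```

Because the scale factor is fixed during connection setup, a receiver that wants a large window later has to negotiate the shift in its SYN.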
27
Checksum
• The checksum covers three parts: the pseudo-header, the TCP header, and the data coming from the application process.
• Checking the pseudo-header prevents packets from being delivered to the wrong host due to corruption of the IP header.
• Divide the covered bits into 16-bit words and add all the 16-bit words using one’s complement arithmetic.
[Figure: checksum scope. The pseudo-header (32-bit source IP address, 32-bit destination IP address, a zero octet, the protocol id, and the 16-bit segment length) is followed by the TCP segment (TCP header and user data).]
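The checksum procedure can be sketched in Python as follows. The function names and the example addresses are assumptions for illustration; the sketch handles IPv4 only and uses protocol id 6 for TCP.

```python
import ipaddress
import struct

TCP_PROTOCOL_ID = 6


def ones_complement_sum(data: bytes) -> int:
    """Add 16-bit words with end-around carry (one's complement addition)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero octet
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over pseudo-header + TCP header + data.

    The caller must zero the checksum field of `segment` before computing
    the value to transmit.
    """
    pseudo_header = (
        ipaddress.IPv4Address(src_ip).packed
        + ipaddress.IPv4Address(dst_ip).packed
        + struct.pack("!BBH", 0, TCP_PROTOCOL_ID, len(segment))
    )
    return ~ones_complement_sum(pseudo_header + segment) & 0xFFFF


# Hypothetical segment: 20-octet header with a zero checksum field plus 5 data octets.
header = struct.pack("!HHIIHHHH", 49152, 80, 1000, 0, (5 << 12) | 0x18, 4096, 0, 0)
segment = header + b"hello"
checksum = tcp_checksum("192.0.2.1", "192.0.2.2", segment)

# The receiver recomputes over the segment with the checksum filled in.
filled = segment[:16] + struct.pack("!H", checksum) + segment[18:]
verification = tcp_checksum("192.0.2.1", "192.0.2.2", filled)
```

A receiver accepts the segment when the recomputed value is zero. Changing either address in the pseudo-header makes the verification fail, which is how misdelivered segments are caught.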
28
TCP summary
• Connection establishment
  – 3 way handshake
• Connection termination
  – supports graceful close using the 3 way handshake.
  – also supports abrupt close using the ABORT primitive.
• Data transfer
  – Each segment is assigned a sequence number, counted in bytes.
  – Error control by retransmission: selective repeat
  – Flow control by credit allocation
  – PUSH
  – URGENT POINTER
• Reset service
  – RST
29
TCP Primitives
Primitive | Type | Client/Server | Parameters
UNSPECIFIED_PASSIVE_OPEN | Request | S | Source port, timeout, timeout-action, precedence, security range
FULL_PASSIVE_OPEN | Request | S | Source port, destination port, destination address, timeout, timeout-action, precedence, security range
ACTIVE_OPEN | Request | C | Source port, destination port, destination address, timeout, timeout-action, precedence, security range
ACTIVE_OPEN_WITH_DATA | Request | C | Source port, destination port, destination address, data, data length, push flag, urgent flag, timeout, timeout-action, precedence, security range
OPEN_ID | Local response | C | Local connection name, source port, destination port, destination address
OPEN_SUCCESS | Confirm | C | Local connection name
OPEN_FAILURE | Confirm | C | Local connection name
SEND | Request | C/S | Local connection name, data, data length, push flag, urgent flag, timeout, timeout-action
DELIVER | Indication | C/S | Local connection name, data, data length, push flag, urgent flag
ALLOCATE | Request | C/S | Local connection name, data length
CLOSE | Request | C/S | Local connection name
CLOSING | Indication | C/S | Local connection name
TERMINATE | Request | C/S | Local connection name, reason code
ABORT | Request | C/S | Local connection name
STATUS | Request | C/S | Local connection name
STATUS_RESPONSE | Local response | C/S | Local connection name, source port, source address, destination address, connection state, receive window, send window, waiting ack, waiting accept, urgent, precedence, security, timeout
ERROR | Indication | C/S | Local connection name, reason code
30
Usage of TCP Service Primitives
[Figure: primitives exchanged between the initiating (client) protocol, TCP, and the responding (server) protocol, grouped by phase.]
• Connection establishment
  – Client: ACTIVE_OPEN, ACTIVE_OPEN_WITH_DATA, OPEN_ID, OPEN_SUCCESS, OPEN_FAILURE
  – Server: UNSPECIFIED_PASSIVE_OPEN, FULL_PASSIVE_OPEN, OPEN_RECEIVED
• Data transfer
  – Client: SEND, DELIVER, ALLOCATE
  – Server: DELIVER, SEND, ALLOCATE
• Status/error reporting
  – Client: STATUS, STATUS_REPORT, ERROR
  – Server: STATUS, STATUS_REPORT, ERROR
• Connection clearing
  – Client: CLOSE, TERMINATE, ABORT
  – Server: CLOSING, CLOSE, TERMINATE
32
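TCP Connection setup and release
[Figure: time-sequence diagram between client and server.
 Setup: the client sends SYN; the server replies SYN, ACK; the client sends ACK.
 Release: on application close the client sends FIN; the server replies ACK and delivers EOF to its application; on application close the server sends FIN; the client replies ACK.]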
33
TCP Connection Setup : 3 Way Handshake
• Client-Server model
[Figure: time-sequence diagram between the client TCP and the TCP server.
 0. Server: PASSIVE_OPEN.
 1. Client: ACTIVE_OPEN, send SYN (SYN=1, Seq=X).
 2. Server: OPEN_RECEIVED, send SYN (SYN=1, ACK=1, Seq=Y, Ack=X+1).
 3. Client: OPEN_SUCCESS, send ACK (ACK=1, Ack=Y+1).]
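The numbering rule of the handshake (each side acknowledges the other side’s initial sequence number plus one) can be traced with a small sketch. The function name, the dictionary layout, and the example initial sequence numbers are assumptions for illustration; no real network traffic is involved.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers wrap around modulo 2^32


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments of a client-server connection setup."""
    syn = {"from": "client", "SYN": 1, "ACK": 0, "seq": client_isn}
    syn_ack = {
        "from": "server", "SYN": 1, "ACK": 1,
        "seq": server_isn,
        "ack": (syn["seq"] + 1) % SEQ_SPACE,      # Ack = X + 1
    }
    ack = {
        "from": "client", "SYN": 0, "ACK": 1,
        "seq": (syn["seq"] + 1) % SEQ_SPACE,      # the SYN consumed one number
        "ack": (syn_ack["seq"] + 1) % SEQ_SPACE,  # Ack = Y + 1
    }
    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=100, server_isn=300)
```

The SYN carries no data but still consumes one sequence number, which is why both acknowledgements are "plus one".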
34
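TCP Connection Setup : 3 Way Handshake
• Simultaneous open
[Figure: both TCP entities issue ACTIVE_OPEN and send SYN at about the same time (SYN=1, Seq=X from one side; SYN=1, Seq=Y from the other). Each side answers the other’s SYN with an ACK (ACK=1, Ack=Y+1 and ACK=1, Ack=X+1), and each side reports OPEN_SUCCESS.]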
35
Robustness of 3 Way Handshake
[Figure: two time-sequence diagrams between A and B.
 (a) Delayed SYN: an obsolete SYN i arrives at B; B accepts and acknowledges it with SYN j, ACK i; A rejects B’s connection with RST, ACK j.
 (b) Delayed (SYN, ACK): A initiates a connection with SYN i; an old SYN k, ACK p arrives at A and A rejects it with RST, ACK k; B accepts and acknowledges with SYN j, ACK i; A acknowledges with SN i+1, ACK j and begins transmission.]
36
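TCP Half-Close: Graceful Disconnection
[Figure: time-sequence diagram between client and server. Application shutdown at the client sends FIN; the server replies ACK of FIN and delivers EOF to its application. The server application still writes data, which the client acknowledges (ACK of data) and the client application reads. On application close the server sends FIN; the client delivers EOF to its application and replies ACK of FIN.]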
37
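TCP Connection Release: 3 Way Handshake
[Figure: time-sequence diagrams between the client side (client, TCP) and the server side (TCP, server).
 (a) Graceful release: the client issues CLOSE and its TCP sends FIN (FIN = 1, seq = X); the server TCP sends ACK (ACK = 1, ack = X+1) and signals CLOSING; the server issues CLOSE and its TCP sends FIN (FIN=1, seq=Y); the client TCP sends ACK (ACK=1, ack=Y+1); TERMINATE is reported on both sides.
 (b) Abrupt release: the client issues ABORT and its TCP sends RST (RST = 1); TERMINATE is reported on the server side.]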
38
Connection Release: 3 Way Handshake
• Graceful disconnection – 3 way handshake
  – Since the TCP connection is full-duplex, when one end requests termination, only that direction of the connection is terminated. The other direction can be maintained while the other end keeps sending data.
• Abrupt disconnection
  – One-sided termination because of network failure, etc. In this case data can be lost.
39
Connection Release: 3 Way Handshake
• Graceful disconnection – 3 way handshake
  – Problem due to out-of-sequence delivery
    • One end sends FIN after sending the last segment, but the FIN segment arrives ahead of the last segment.
    • In this case, if the receiver TCP terminates as soon as it receives the FIN, it loses the segment that arrives after connection closure.
    • To prevent this kind of loss, TCP assigns a sequence number to the FIN segment, following the sequence number of the last data segment.
• When the other end does not cooperate with the termination request,
  – The requesting end terminates the connection when its timer times out.
40
Crash & Connection Release
• A half-open connection can arise when one end of the connection breaks down, since the other end cannot know about the failure.
• In the half-open state, the surviving end keeps retransmitting segments as long as it is allowed to. If no reply arrives before the keepalive timer expires, it terminates the connection.
• The TCP end that has broken down can terminate the connection using RST segments after rebooting.
  – Since the rebooted TCP has lost all state information, it should send an RST segment for every segment it receives, and the other end must terminate the connection immediately on receiving an RST segment.
41
TCP Entity State Diagram
[Figure: state transition diagram of a TCP entity.
 States: CLOSED, LISTEN, SYN SENT, SYN RECEIVE, ESTAB, FIN WAIT, FIN WAIT 2, CLOSING, TIME WAIT, CLOSE WAIT, LAST ACK.
 Transition labels (event / action): Active Open or Active Open with Data / Initialize SV; Send SYN. Unspecified Passive Open or Fully Specified Passive Open / Initialize SV. Receive SYN / Send SYN, ACK. Receive SYN, ACK / Send ACK. Receive ACK of SYN. Receive FIN, ACK of SYN / Send ACK. Close / Send FIN. Close / Clear SV. Receive FIN / Send ACK. Receive FIN, ACK / Send ACK. Receive ACK of FIN. Timeout (2MSL).]
LEGEND
SV = state vector
MSL = maximum segment lifetime
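A state diagram like this one maps naturally onto a transition table. The sketch below uses the state names from the slide, but the event names (`active_open`, `recv_syn_ack`, and so on) are invented for illustration, and the arcs follow the commonly described TCP state machine, so they may not match every arc of the original figure.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("CLOSED", "active_open"): "SYN SENT",
    ("CLOSED", "passive_open"): "LISTEN",
    ("LISTEN", "recv_syn"): "SYN RECEIVE",
    ("SYN SENT", "recv_syn_ack"): "ESTAB",
    ("SYN SENT", "recv_syn"): "SYN RECEIVE",      # simultaneous open
    ("SYN RECEIVE", "recv_ack"): "ESTAB",
    ("ESTAB", "close"): "FIN WAIT",               # active close
    ("ESTAB", "recv_fin"): "CLOSE WAIT",          # passive close
    ("FIN WAIT", "recv_ack"): "FIN WAIT 2",
    ("FIN WAIT", "recv_fin"): "CLOSING",          # simultaneous close
    ("FIN WAIT 2", "recv_fin"): "TIME WAIT",
    ("CLOSING", "recv_ack"): "TIME WAIT",
    ("TIME WAIT", "timeout_2msl"): "CLOSED",
    ("CLOSE WAIT", "close"): "LAST ACK",
    ("LAST ACK", "recv_ack"): "CLOSED",
}


def run(events, state="CLOSED"):
    """Feed a sequence of events to the state machine and return the final state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not valid in state {state!r}")
        state = TRANSITIONS[key]
    return state


client_path = ["active_open", "recv_syn_ack", "close", "recv_ack", "recv_fin", "timeout_2msl"]
server_path = ["passive_open", "recv_syn", "recv_ack", "recv_fin", "close", "recv_ack"]
```

Only the side that closes first passes through TIME WAIT, which is where the 2MSL timer applies.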
43
TCP Traffic Control
• Traffic control
  – There are two reasons for a sender to reduce its sending rate.
  – When the receiver’s buffer space is not enough: flow control
  – When the network is congested: congestion control
[Figure: a small-capacity receiver (flow control) versus network congestion (congestion control).]
44
Sliding Window Flow Control
[Figure: windows drawn over the sequence numbers 0 1 2 3 0 1 2 3.
 (a) Sender’s window: it follows the last segment that was acked and covers the segments that can be sent; segments sent but not acknowledged lie before it. The window shrinks as segments are sent and expands as ACKs are received.
 (b) Receiver’s window: it separates the segments that were received from the segments that will be received. The window shrinks as segments are received and expands as ACKs are sent.]
45
Is the sliding window scheme enough?
[Figure: sliding window exchange with window size = 3 over sequence numbers 0..3. The sender transmits I frames until its window is closed. The busy receiver (window closed, BUSY CONDITION) does not send ACKs, so the sender’s timer expires (TIMEOUT) and it retransmits I(3), I(0), I(1), which makes the receiver’s state worse.]
46
What is wrong with the sliding window?
• There is no distinction between the ACK and the currently available buffer size.
  – Suppose the receiver TCP has received segments uncorrupted and stored them in its buffer, but has not finished processing them.
    • If the TCP does not send any ACK, the sender’s timer expires and it retransmits the segments. ==> This puts unnecessary load on the network!
    • If the TCP does send ACKs, the sender transmits new segments, which may eventually be discarded. ==> This aggravates the receiver’s condition!
• Solution: credit allocation protocol
  – It separates the ACK from the credit information. The ACK informs the sender of successful transmission, while the credit notifies the sender of the receiver’s currently empty buffer size.
47
Credit Allocation Protocol
[Figure: the same scenario with window size = 3 under credit allocation. While the receiver is busy (BUSY CONDITION) it replies with ACK 1, CDT 0: the data is acknowledged but no credit is granted, so the sender’s window closes and the sender does not retransmit I(3), I(0), I(1) when its timer would otherwise expire. When the receiver becomes idle (IDLE CONDITION) it grants credit again with ACK 1, CDT 2 and later ACK 2, CDT 3, which opens the sender’s window.]
48
Example of TCP Credit Allocation Mechanism.
[Figure: time-sequence diagram between Transport Entity A and Transport Entity B, with A’s and B’s windows drawn over the octet numbers.]
1. B is prepared to receive 1400 octets, beginning with 1001. A may send 1400 octets (1001 to 2400).
2. A sends SN = 1001, SN = 1201, SN = 1401 (200 octets each). A shrinks its transmit window with each transmission.
3. B acknowledges 3 segments (600 octets) but is only prepared to receive 200 additional octets beyond the original budget (i.e., B will accept octets 1601 through 2600): A = 1601, W = 1000.
4. A adjusts its window with each credit. A sends SN = 1601, SN = 1801, SN = 2001, SN = 2201, SN = 2401 and exhausts its credit.
5. B acknowledges 5 segments (1000 octets) and restores the original amount of credit: A = 2601, W = 1400. A receives new credit and may send octets 2601 through 4000.
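The numbers in this example can be replayed with a small sender model. The class name `CreditSender` and its methods are assumptions for illustration; only the octet numbers come from the example.

```python
class CreditSender:
    """Sender side of credit-based flow control, counted in octets."""

    def __init__(self, next_seq: int, credit: int):
        self.next_seq = next_seq           # next octet to send
        self.limit = next_seq + credit     # first octet the sender may NOT send

    def remaining_credit(self) -> int:
        return self.limit - self.next_seq

    def send(self, length: int):
        """Send `length` octets if credit allows; return the segment's SN or None."""
        if length > self.remaining_credit():
            return None                    # credit exhausted: must wait
        seq = self.next_seq
        self.next_seq += length            # window shrinks with each transmission
        return seq

    def on_ack(self, ack: int, window: int):
        """Process A = ack, W = window: octets ack .. ack + window - 1 are allowed."""
        self.limit = ack + window


a = CreditSender(next_seq=1001, credit=1400)
first_burst = [a.send(200) for _ in range(3)]    # SN = 1001, 1201, 1401
a.on_ack(ack=1601, window=1000)                  # B: A = 1601, W = 1000
second_burst = [a.send(200) for _ in range(5)]   # SN = 1601 .. 2401
blocked = a.send(200)                            # no credit left
a.on_ack(ack=2601, window=1400)                  # B: A = 2601, W = 1400
```

Note that the acknowledgement and the credit are separate fields: B could acknowledge data while granting no new credit at all.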
49
Too Small Data & Immediate Window Update
• Example of TELNET
  – If the sender TCP transmits data the moment it arrives from the application, or the receiver TCP sends a window update right after its buffer changes, the two ends exchange segments frequently while carrying very little data.
[Figure: a keystroke arrives and is sent in a 41-byte IP packet; the receiver returns a 40-byte ACK; the application reads the 1-byte keystroke and a 40-byte window update is sent; the application echoes the keystroke in another 41-byte IP packet.]
50
Silly Window Syndrome (caused by Receiver)
[Figure: the receiver’s buffer is full; the application reads 1 byte, leaving room for one more byte; a window update segment (header only) is sent; a new byte arrives in a segment carrying a header and 1 byte; the receiver’s buffer is full again.]
51
Silly Window Syndrome (caused by Sender)
[Figure: the sender’s buffer is empty; the application writes 1 byte, so the sender’s buffer has 1 byte; TCP sends a segment carrying a header and that 1 byte.]
52
Avoiding SWS from the sender
• Background
  – Suppose data from the application arrives at TCP 1 byte at a time. TCP does not need to send a small segment immediately every time it receives data.
• Nagle’s algorithm
  – If data arrives 1 byte at a time, TCP sends the first byte in a small segment and collects the following bytes in its buffer.
  – TCP sends the data in the buffer as a single segment when the ACK for the first segment arrives.
  – TCP then stores the next bytes in the buffer again until it receives the ACK for that segment.
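A minimal sketch of the behaviour described for Nagle’s algorithm follows. The class `NagleSender` and its methods are hypothetical; the sketch also sends immediately whenever a full MSS of data has accumulated, which is part of the usual statement of the algorithm but is not spelled out on the slide.

```python
class NagleSender:
    """Buffers small writes while an earlier segment is still unacknowledged."""

    def __init__(self, mss: int):
        self.mss = mss
        self.buffer = b""
        self.unacked = False   # True while a sent segment awaits its ACK
        self.sent = []         # segments handed to the network, in order

    def write(self, data: bytes):
        """Called when the application hands data to TCP."""
        self.buffer += data
        self._try_send()

    def on_ack(self):
        """Called when the ACK for the outstanding segment arrives."""
        self.unacked = False
        self._try_send()

    def _try_send(self):
        # Send if nothing is outstanding, or if a full-sized segment is ready.
        while self.buffer and (not self.unacked or len(self.buffer) >= self.mss):
            segment, self.buffer = self.buffer[:self.mss], self.buffer[self.mss:]
            self.sent.append(segment)
            self.unacked = True


sender = NagleSender(mss=100)
for byte in b"hello":
    sender.write(bytes([byte]))   # the application writes 1 byte at a time
after_writes = list(sender.sent)  # only the first byte has gone out
sender.on_ack()                   # the ACK releases the buffered bytes together
```

Five one-byte writes therefore cost two segments instead of five.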
53
Avoiding SWS from the receiver
• Clark’s solution
  – The receiver TCP does not send a window update until either its buffer is half empty or the free space is large enough for a full-sized (MSS) segment.
• Delayed ACK
  – TCP does not send an ACK the moment it receives a segment. Instead, it delays the ACK, hoping to have data going to the same host so that the ACK can be piggybacked on it.
  – Most implementations use a 200 ms delay.
55
Congestion Control
• Background
  – Too much traffic has been injected into the network: the traffic inflow exceeds the capacity that the network can accommodate.
    • So, the solution is simple. The traffic influx should be pulled down below the network capacity level. But the rate should be reduced well before the full capacity level is reached. (need very early action!!)
• How can the network detect the early symptoms of congestion?
  – Monitoring the buffer occupancy of network nodes (e.g., routers)
  – Keeping track of the round-trip time of packets
56
TCP and congestion control
• In the Internet, TCP is responsible for congestion control. (It is somewhat odd!)
• Then, how does TCP detect congestion?
  – Timeout: no ACK has arrived by the time the timer expires.
  – A timeout can be triggered by two causes: a transmission error, or packet loss due to congestion. In current networks transmission errors are very rare, so a timeout is attributed to congestion.
• TCP congestion control methods
  – Slow start
  – Congestion avoidance
  – Fast retransmit
  – Fast recovery
57
Slow Start
• Control parameters
  – Awnd (window advertised by the receiver)
    • At the initial setup, the receiver informs the sender of its maximum buffer size, which is the initial value of awnd.
    • Every time the receiver transmits an ACK, it advertises its currently available buffer size.
  – Cwnd (congestion window)
    • Determines how many segments can be sent without receiving ACKs.
• Slow Start
  Initialize: cwnd = 1 MSS (max. segment size);
  Every time an ACK arrives:
    cwnd = cwnd + 1 MSS /* exponential growth per round trip */
  The sender may have at most min(cwnd, awnd) outstanding.
  The initial rate is slow, but it ramps up exponentially fast.
58
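Effect of Slow Start
[Figure: time-sequence diagram between sender and receiver. Cwnd = 1: Segment 1 is sent and ACK 2 returns. Cwnd = 2: Segments 2 and 3 are sent; ACK 3 and ACK 4 return, raising Cwnd to 3 and then 4. Cwnd = 4: Segments 4 to 7 are sent and ACK 5 to ACK 8 return, giving Cwnd = 8.]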
59
Congestion Avoidance
• If no ACK arrives before the timeout, TCP starts the Congestion Avoidance algorithm.
• Congestion Avoidance algorithm
  If (segment timeout) {
    1. Set ssthresh = cwnd / 2 /* slow start threshold */
    2. Set cwnd = 1 MSS
       Restart “slow-start” until (cwnd >= ssthresh)
    3. If (cwnd >= ssthresh)
       cwnd = cwnd + 1 MSS every roundtrip time /* linear growth */
  }
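The interplay of slow start and congestion avoidance can be traced per round trip with a short simulation. This is a sketch under simplifying assumptions: cwnd is counted in whole segments, it doubles once per round trip during slow start (rather than growing by one segment per ACK), and the function name, the lower bound of 2 on ssthresh, and the timeout rounds are illustrative choices.

```python
def cwnd_trace(rounds: int, ssthresh: int, timeout_at=()) -> list:
    """Return cwnd (in segments) at the start of each round trip."""
    cwnd = 1
    trace = []
    for rtt in range(rounds):
        trace.append(cwnd)
        if rtt in timeout_at:
            ssthresh = max(cwnd // 2, 2)       # 1. ssthresh = cwnd / 2
            cwnd = 1                           # 2. restart slow start
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)     # slow start: exponential growth
        else:
            cwnd += 1                          # 3. congestion avoidance: linear growth
    return trace


no_loss = cwnd_trace(rounds=8, ssthresh=8)
with_timeout = cwnd_trace(rounds=8, ssthresh=16, timeout_at={4})
```

In the second trace the window collapses to one segment after the timeout and then climbs back toward the halved threshold.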
60
Slow Start and Congestion Avoidance
[Figure: two time-sequence diagrams between A and B.
 (a) Slow start, ending with a time out: CWND grows from 1 through 2, 3, 4, ... up to 16 as ACKs return.
 (b) Slow start followed by congestion avoidance: CWND grows from 1 to 8 during slow start, then to 9 and 10 during congestion avoidance.]
61
Slow Start and Congestion Avoidance
[Figure: plot of cwnd (y-axis, 0 to about 20) against round-trip times (x-axis, 1 to 16), marking the point where the time out occurs and the threshold.]
62
Fast Retransmission and Fast Recovery
• Background
  – TCP is required to generate an immediate acknowledgement (a duplicate ACK) when an out-of-order segment is received.
  – We don’t know whether a duplicate ACK is caused by a lost segment or just a reordering of segments.
  – If three or more duplicate ACKs are received in a row, it is a strong indication that a segment has been lost. Three or more duplicate ACKs also imply that segments are still flowing through the network.
  – Therefore falling back to slow start, as Congestion Avoidance does after a timeout, is too conservative an approach in this case.
• Fast retransmission
  – If 4 consecutive ACKs with the same number (the original plus three duplicates) are received before the timeout, TCP does not wait for the timeout and retransmits the missing segment immediately.
63
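Fast Retransmit
[Figure: time-sequence diagram between A and B. A sends SN=1001, SN=1201 (lost), SN=1401, SN=1601, SN=1801, SN=2001, SN=2201, SN=2401, SN=2601. B’s acknowledgements (A=801, A=1001, then A=1201 repeated as duplicate ACKs) stay at 1201 while the gap remains. A retransmits SN=1201 after an elapsed time less than the current RTO, continues with SN=2801, SN=3001, SN=3201, and B’s acknowledgements advance to A=2601 and A=2801.]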
64
Fast Recovery
• Fast recovery algorithm (avoiding initial slow start phase)
1. When the third duplicate ACK is received,
   Set ssthresh = cwnd / 2;
   Retransmit the missing segment;
   cwnd = ssthresh + 3 MSS;
2. Each time another duplicate ACK arrives,
   Increment cwnd by the segment size;
   Transmit a new segment (if allowed by the new cwnd value);
3. When the next ACK arrives that acknowledges new data,
   cwnd = ssthresh;
   cwnd = cwnd + 1 MSS every roundtrip time;
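The three steps of fast recovery can be followed in a compact model. The class `FastRecovery`, its attribute names, and the starting values are assumptions for illustration; cwnd and ssthresh are counted in whole segments and actual transmission is left out.

```python
class FastRecovery:
    """Tracks cwnd and ssthresh (in segments) through duplicate ACKs."""

    def __init__(self, cwnd: int, ssthresh: int):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.dup_acks = 0
        self.in_recovery = False
        self.retransmissions = 0

    def on_duplicate_ack(self):
        self.dup_acks += 1
        if self.dup_acks == 3 and not self.in_recovery:
            # Step 1: halve the threshold, retransmit, inflate the window by 3.
            self.ssthresh = max(self.cwnd // 2, 2)
            self.retransmissions += 1
            self.cwnd = self.ssthresh + 3
            self.in_recovery = True
        elif self.in_recovery:
            # Step 2: each further duplicate means one more segment left the network.
            self.cwnd += 1

    def on_new_ack(self):
        if self.in_recovery:
            # Step 3: deflate the window and continue in congestion avoidance.
            self.cwnd = self.ssthresh
            self.in_recovery = False
        self.dup_acks = 0


tcp = FastRecovery(cwnd=16, ssthresh=32)
for _ in range(3):
    tcp.on_duplicate_ack()
after_third = (tcp.cwnd, tcp.ssthresh, tcp.retransmissions)
tcp.on_duplicate_ack()
tcp.on_duplicate_ack()
inflated = tcp.cwnd
tcp.on_new_ack()
```

Unlike the timeout case, cwnd never drops to one segment here, which is the point of avoiding the initial slow start phase.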
65
Fast Retransmission and Fast Recovery
[Figure: plot of sequence number and cwnd (y-axis) against send time in seconds (x-axis).]
67
Round Trip Time & Timeout
• RTT is important because the timeout value is determined based on RTT.
• RTT can change over time as route might change and as network traffic changes.
• So, TCP should track these changes and modify its timeout accordingly.
68
Round Trip Time & Timeout
• Original TCP specification
  RTT(n+1) = a * RTT(n) + (1-a) * RTT_SAMPLE(n) /* recommendation : a=0.9 */
  RTO = b * RTT(n+1) /* recommendation : b = 2 */
  RTO: Retransmission Timeout value
  RTT_SAMPLE : measured RTT
• Karn’s algorithm
  – We cannot update the RTT estimate when an ACK for a retransmitted segment arrives, because we don’t know which transmission the ACK corresponds to: the original one or the retransmitted one.
  – Don’t calculate a new RTO until an acknowledgement is received for a segment that was not retransmitted.
  – Set the timeout after a retransmission as follows: RTO = 2 * RTO /* exponential backoff */
  – After the ACK for the retransmitted segment arrives, restart the calculation with new RTT_SAMPLE values.
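The original estimator together with Karn’s rule can be sketched as follows. The class name `RttEstimator`, the method names, and the example values are assumptions; the backoff shown doubles the current RTO on each timeout, which is the usual reading of "exponential" here.

```python
class RttEstimator:
    """Smoothed RTT with Karn's rule for retransmitted segments."""

    def __init__(self, initial_rtt: float, a: float = 0.9, b: float = 2.0):
        self.a = a
        self.b = b
        self.rtt = initial_rtt
        self.rto = b * initial_rtt

    def on_ack(self, sample: float, retransmitted: bool):
        """Process an ACK whose segment took `sample` seconds to be acknowledged."""
        if retransmitted:
            return  # Karn: the sample is ambiguous, so RTT and RTO are left alone
        self.rtt = self.a * self.rtt + (1 - self.a) * sample
        self.rto = self.b * self.rtt

    def on_timeout(self):
        """Back off exponentially after a retransmission timeout."""
        self.rto *= 2


est = RttEstimator(initial_rtt=1.0)
est.on_ack(sample=2.0, retransmitted=False)   # RTT moves 10% of the way toward 2.0
est.on_timeout()                              # the segment was lost: RTO doubles
est.on_ack(sample=5.0, retransmitted=True)    # ignored by Karn's rule
```

With a = 0.9 the estimate reacts slowly to a single sample, which is why one late ACK barely moves it.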
69
Jacobson’s Algorithm
• Background
  – We get better performance when we take the variance of the RTT into account, rather than using the simple RTT average alone.
• Jacobson’s algorithm
  DIFF(n+1) = RTT_SAMPLE(n+1) - RTT(n)
  DEV(n+1) = DEV(n) + h * (|DIFF(n+1)| - DEV(n)) /* typically h = 1/4 */
  RTT(n+1) = RTT(n) + g * DIFF(n+1) /* typically g = 1/8 */
  Timeout(n+1) = RTT(n+1) + 4 * DEV(n+1)
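One step of these update equations translates directly into code. The function name `jacobson_update` and the sample values are illustrative; the gains g = 1/8 for the RTT and h = 1/4 for the deviation are the values used in the standard RTO computation (RFC 6298).

```python
def jacobson_update(rtt: float, dev: float, sample: float,
                    g: float = 1 / 8, h: float = 1 / 4):
    """Apply one RTT sample; return (new_rtt, new_dev, timeout)."""
    diff = sample - rtt                    # DIFF(n+1) = RTT_SAMPLE(n+1) - RTT(n)
    new_dev = dev + h * (abs(diff) - dev)  # smoothed mean deviation
    new_rtt = rtt + g * diff               # smoothed round trip time
    timeout = new_rtt + 4 * new_dev        # Timeout(n+1)
    return new_rtt, new_dev, timeout


rtt, dev, timeout = jacobson_update(rtt=1.0, dev=0.0, sample=2.0)
```

Because the deviation term is multiplied by 4, a sudden jump in measured RTT raises the timeout much faster than the smoothed average alone would.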
70
TCP Timers
• Retransmission timer
• Persist timer
• Keepalive timer
• 2MSL timer
71
Retransmission Timer
• It determines how long the TCP sender waits before retransmitting (timeout).
• In real implementations, there is not a separate timer for each segment. There is only one retransmission timer per connection.
72
TCP Persist Timer
• Background
  – When the TCP receiver advertises window = 0, the TCP sender stops sending temporarily. Afterwards, the receiver lets the sender know it can receive segments again by sending a new window advertisement. But if this new window advertisement is lost, the sender will wait for it forever. (Deadlock!!)
• Solution
  – After the sender learns that window = 0, it transmits a window probe segment periodically to check whether the receiver is ready to accept data. The window probes are sent according to the persist timer.
  – A window probe is a segment with 1 byte of data.
  – TCP allows the sender to transmit one byte even if the receiver’s window is closed.
  – The persist timer interval increases exponentially.
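The exponentially increasing probe schedule can be sketched in a few lines. The function name and the base interval and cap used in the example (5 and 60 seconds) are assumptions chosen for illustration, not values stated on the slide; real implementations pick their own bounds.

```python
def persist_intervals(base: float, cap: float, probes: int) -> list:
    """Return the waiting time before each window probe, doubling up to `cap`."""
    intervals = []
    interval = base
    for _ in range(probes):
        intervals.append(min(interval, cap))
        interval *= 2  # exponential backoff between successive probes
    return intervals


schedule = persist_intervals(base=5, cap=60, probes=6)
```

The cap keeps the sender probing at a bounded rate, so a lost window update is eventually discovered however long the receiver stays closed.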
73
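TCP Persist Timer
[Figure: the receiver advertises win=0; its later win=256 update is lost, which would cause deadlock. Instead the sender sends window probes at intervals set by the persist timer (normal TCP exponential backoff), and the receiver answers each probe with ACK(win=0) while its window stays closed.]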
74
TCP Keepalive Timer
• If there is no activity on a given connection for a period of time, the server sends a probe segment to see if the client is still alive.
• The keepalive timer specifies the interval after which the server wants to check whether the client’s host has crashed or is down. The interval is normally 2 hours.
• When the keepalive timer expires, the server sends a probe segment:
  – (1) if the client is still alive,
    • It responds, and there will be no more probes for the next 2 hours.
  – (2) if the client is down,
    • The probe times out after 75 seconds. The server sends a total of 10 probes, 75 seconds apart, and if there is no response it terminates the connection.
  – (3) if the client has rebooted,
    • There is a response to the probe, but it is a reset, which terminates the connection.
  – (4) if the client is alive but unreachable,
    • same as in case (2)
75
2 MSL Timer
• 2 MSL (Maximum Segment Lifetime)
  – The MSL is the maximum amount of time any segment can exist in the network before being discarded.
  – When TCP performs an active close and sends the final ACK (the response to the FIN), the connection must stay in the TIME_WAIT state for twice the MSL.
  – If the final ACK is lost, the other TCP can resend the FIN segment.
  – A new TCP connection can therefore open only after 2 MSL. (Some systems prevent reuse of the connection’s port numbers during the 2 MSL wait.)
• Quiet time concept
  – Suppose that a host crashes while it is in the 2 MSL wait state, before the timeout, and then reboots immediately. If the host opens a new TCP connection as soon as it reboots, old segments from the previous connection cannot be distinguished from new segments of the new connection.
  – To avoid this confusion, TCP is not allowed to open a new connection for 1 MSL right after rebooting. This 1 MSL period is called the quiet time.
76
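2 MSL Timer
[Figure: two time-sequence diagrams.
 (a) Closing connections sequentially: one side sends FIN m and receives ACK m+1; the other side, after SN n-1, sends FIN n and is answered with ACK n+1. The side that sent the final ACK waits for 2 MSL and then terminates; both ends finish in the closed state.
 (b) Closing simultaneously: both sides send FIN (FIN m and FIN n) and answer with ACK n+1 and ACK m+1; each waits for 2 MSL and then terminates.]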
78
UDP
• Addressing and checksum
• Providing unreliable service to application
• Datagram-oriented
  – one application data unit -> one UDP datagram
79
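UDP Header (8 octets)
• 16-bit source port number | 16-bit destination port number
• 16-bit Length | 16-bit Checksum
• Data (if any)
[Figure: an IP datagram consists of a 20-octet IP header followed by the UDP datagram; the UDP datagram consists of an 8-octet UDP header followed by the UDP data.]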
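The 8-octet UDP header is simple enough to build and parse by hand. The function names and the example ports and payload below are made up for illustration; the checksum is left at 0, which in UDP over IPv4 means that no checksum was computed.

```python
import struct

UDP_HEADER_LEN = 8


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes,
                       checksum: int = 0) -> bytes:
    """Prefix `payload` with a UDP header; Length covers header plus data."""
    length = UDP_HEADER_LEN + len(payload)
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def parse_udp_datagram(datagram: bytes) -> dict:
    """Split a UDP datagram into its header fields and payload."""
    if len(datagram) < UDP_HEADER_LEN:
        raise ValueError("a UDP datagram is at least 8 octets long")
    src_port, dst_port, length, checksum = struct.unpack(
        "!HHHH", datagram[:UDP_HEADER_LEN])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER_LEN:length],
    }


datagram = build_udp_datagram(49152, 53, b"query")
fields = parse_udp_datagram(datagram)
```

One application write becomes exactly one datagram: there is no sequence number, acknowledgement, or window field to carry state between datagrams.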
80
A Few Well-known UDP Ports
Decimal  Keyword     UNIX Keyword  Description
0        -           -             Reserved
7        ECHO        echo          Echo
9        DISCARD     discard       Discard
11       USERS       systat        Active Users
13       DAYTIME     daytime       Daytime
15       -           netstat       Who is up or NETSTAT
17       QUOTE       qotd          Quote of the Day
19       CHARGEN     chargen       Character Generator
37       TIME        time          Time
42       NAMESERVER  name          Host Name Server
43       NICNAME     whois         Who is
53       DOMAIN      nameserver    Domain Name Server
67       BOOTPS      bootps        Bootstrap Protocol Server
68       BOOTPC      bootpc        Bootstrap Protocol Client
69       TFTP        tftp          Trivial File Transfer
111      SUNRPC      sunrpc        Sun Microsystems RPC
123      NTP         ntp           Network Time Protocol
161      -           snmp          SNMP net monitor
162      -           snmp-trap     SNMP traps
512      -           biff          UNIX comsat
513      -           who           UNIX rwho daemon
514      -           syslog        system log
525      -           timed         Time daemon
81
TCP and UDP
• TCP
  – connection-oriented
  – Reliable service provisioning
  – Error control and flow control
  – stream-oriented
  – Good for stable transmission of long, persistent data
• UDP
  – connectionless
  – Unreliable services
  – No error control (no retransmission) and no flow control
  – datagram-oriented
  – Good for short data or data that can tolerate errors