CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion


CompSci 514: Computer Networks
Lecture 4: End-to-end Congestion Control

Xiaowei Yang


Outline

• TCP congestion collapse

• TCP congestion control algorithm

• Analysis

• Throughput and loss rate modeling


A fundamental networking problem

• Packet switching
– Statistical multiplexing

• Q: N users, and one network
– How fast should each user send?
– Or when should a packet be sent?
– Why is it a difficult problem?

...

• A → C, B → D
• How fast should A and B send?
• Don’t know the number of users
• Don’t know the bottleneck capacity

[Figure: example topology with hosts A, B, C, and D; link capacities labeled 1 Gbps, 1 Gbps, 10 Mbps, 1 Gbps, and 1 Mbps]

The TCP Congestion Collapse Incident

• In October of 1986, the Internet had the first of what became a series of “congestion collapses”: increased load leads to decreased throughput

• The throughput of the NSFnet phase-I backbone dropped three orders of magnitude, from its capacity of 32 kbit/s to 40 bit/s. The collapses continued until end nodes started implementing Van Jacobson's congestion control between 1987 and 1988.


http://en.wikipedia.org/wiki/NSFnet

http://en.wikipedia.org/wiki/Congestion_control

What caused the TCP congestion collapse?

• Original TCP (Cerf74)
– Static window W
– W = # of unacknowledged bytes


Send 1, 2, 3, …, W bytes
ACK 1, 2, 3, …, W
Send W+1, W+2, …, 2W bytes
ACK W+1, W+2, …, 2W bytes

Review of TCP’s sliding window algorithm

• A well-known algorithm in networking
• Used for
– Reliable transmission
– Flow control
– Congestion control


Definitions

• Round Trip Time (RTT)
– The time it takes from sending a packet to receiving an ACK for that packet
– RTTmin = minimum RTT observed during a connection

• Bandwidth Delay Product (BDP)
– Bottleneck Capacity * RTTmin
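As a quick illustration of the BDP definition, the sketch below multiplies a bottleneck capacity by RTTmin. The function name and the example path (10 Mbit/s, 100 ms) are hypothetical, chosen only to make the arithmetic concrete.

```python
def bdp_bytes(bottleneck_bps: float, rtt_min_s: float) -> float:
    """Bandwidth-delay product in bytes: capacity (bit/s) * RTTmin (s) / 8."""
    return bottleneck_bps * rtt_min_s / 8


# Hypothetical path: 10 Mbit/s bottleneck and a 100 ms minimum RTT.
bdp = bdp_bytes(10e6, 0.100)
print(f"BDP = {bdp:.0f} bytes, or about {bdp / 1460:.0f} segments of 1460 bytes")
```

The BDP is the amount of data that must be in flight to keep the bottleneck busy without building a queue.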


Drawbacks of static window sizes

• One TCP flow
– MSS = 512 bytes, Window size = 10 MSS, RTTmin = 100ms
• 10 TCP flows, 100 TCP flows, …
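The numbers on this slide can be turned into a small calculation. The sketch below computes the load offered by N static-window flows at RTTmin; the 10 Mbit/s bottleneck is an assumption added for comparison and does not come from the slide.

```python
MSS_BYTES = 512
WINDOW_MSS = 10
RTT_MIN_S = 0.100
BOTTLENECK_BPS = 10e6  # assumed bottleneck capacity, for illustration only


def offered_load_bps(n_flows: int) -> float:
    """Aggregate sending rate of n static-window flows, each sending W / RTTmin."""
    per_flow_bps = WINDOW_MSS * MSS_BYTES * 8 / RTT_MIN_S
    return n_flows * per_flow_bps


for n in (1, 10, 100):
    load = offered_load_bps(n)
    share = load / BOTTLENECK_BPS
    print(f"{n:3d} flows: {load / 1e6:6.2f} Mbit/s offered ({share:.0%} of the bottleneck)")
```

With a fixed window, a single flow leaves most of the assumed link idle, while 100 flows offer several times its capacity. The window does not adapt in either direction.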


What caused the TCP congestion collapse?

• Original TCP (Cerf74)
– Static window W
– W = # of unacknowledged bytes

• When the number of users increases, queueing delay increases
• TCP times out and retransmits the whole window
• TCP retransmits too early, wasting the network’s bandwidth on packets that are already in transit and reducing useful throughput (goodput)


Retransmitting packets that are not lost → congestion collapse


How can we prevent this problem?


Design Goals

• Congestion avoidance: making the system operate around the knee to obtain low latency and high throughput

• Congestion control: making the system operate to the left of the cliff to avoid congestion collapse

Design goals

• More concretely
– Efficiency
  • High throughput/utilization
– Low latency
– Fairness
– Fast convergence


Key Improvements from the TCP88 paper

• Revised RTO computation
– Prevent spurious retransmissions

• ACK self-clocking
– Determines when to send a packet

• Dynamic window sizing
– Determines how fast a sender can send a packet

Revised RTO estimates

• Old design:
$RTT_{n+1} = a \cdot RTT + (1 - a) \cdot RTT_n$
$RTO = \beta \cdot RTT_{n+1}$

• Improved design:
$srtt_{n+1} = a \cdot RTT + (1 - a) \cdot srtt_n$
$rttvar_{n+1} = b \cdot |RTT - srtt_n| + (1 - b) \cdot rttvar_n$
$RTO_{n+1} = srtt_{n+1} + 4 \cdot rttvar_{n+1}$
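The improved estimator can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the class name, the gains a = 1/8 and b = 1/4, and the initialization of rttvar to half the first sample are assumptions (they match common practice but are not stated on the slide).

```python
class RtoEstimator:
    """Smoothed RTT and RTT variance, combined as RTO = srtt + 4 * rttvar."""

    def __init__(self, first_rtt_s: float, a: float = 0.125, b: float = 0.25):
        # a and b are the gains from the slide; 1/8 and 1/4 are assumed values.
        self.a = a
        self.b = b
        self.srtt = first_rtt_s
        self.rttvar = first_rtt_s / 2  # assumed initialization

    def rto(self) -> float:
        return self.srtt + 4 * self.rttvar

    def update(self, rtt_s: float) -> float:
        """Fold in one RTT sample and return the new RTO."""
        # rttvar is computed from the previous srtt (srtt_n), so update it first.
        self.rttvar = self.b * abs(rtt_s - self.srtt) + (1 - self.b) * self.rttvar
        self.srtt = self.a * rtt_s + (1 - self.a) * self.srtt
        return self.rto()


est = RtoEstimator(first_rtt_s=0.100)
for sample in (0.100, 0.300, 0.110):
    print(f"sample {sample:.3f} s -> RTO {est.update(sample):.4f} s")
```

Because the variance term reacts to a delay spike much faster than the smoothed mean does, the RTO rises quickly when queueing delay grows, which is what prevents spurious retransmissions.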


Exponential backoff

• When a timeout occurs, the RTO value is doubled, up to a cap of 64 seconds:
$RTO_{n+1} = \min(2 \cdot RTO_n, 64)$ seconds

• Can save the day in the worst case
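A short sketch of the backoff rule, treating 64 seconds as an upper bound on the RTO. The function names and the 1-second starting value are illustrative choices.

```python
MAX_RTO_S = 64.0  # upper bound on the retransmission timeout


def backoff(rto_s: float) -> float:
    """Double the RTO after a timeout, never exceeding MAX_RTO_S."""
    return min(2 * rto_s, MAX_RTO_S)


def backoff_schedule(initial_rto_s: float, timeouts: int) -> list:
    """RTO values seen across a run of consecutive timeouts."""
    schedule = [initial_rto_s]
    for _ in range(timeouts):
        schedule.append(backoff(schedule[-1]))
    return schedule


print(backoff_schedule(1.0, 8))
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 64.0, 64.0]
```

Each consecutive timeout halves the retransmission rate, so even a badly overloaded path eventually sees the offered load drop.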


ACK self-clocking

• When the pipe is full, the rate at which ACKs return equals the rate at which new packets should be injected into the network

Why is self-clocking important

• Determines when a packet should be sent
– A flow should send no faster than what its bottleneck allows
– ACK spacing correlates with bottleneck capacity
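To see why ACK spacing reflects the bottleneck, the sketch below pushes a burst of packets through a single FIFO bottleneck and reports when each one leaves. It assumes the receiver ACKs every packet immediately and that the return path preserves the spacing; the function name and the numbers are hypothetical.

```python
def bottleneck_departure_times(arrival_times_s, packet_bytes, bottleneck_bps):
    """Departure time of each packet from a FIFO link with the given capacity."""
    service_s = packet_bytes * 8 / bottleneck_bps  # time to transmit one packet
    departures = []
    link_free_at = 0.0
    for t in arrival_times_s:
        start = max(t, link_free_at)  # wait if the link is still busy
        link_free_at = start + service_s
        departures.append(link_free_at)
    return departures


# Four 1250-byte packets arrive back to back at a 10 Mbit/s bottleneck.
deps = bottleneck_departure_times([0.0, 0.0, 0.0, 0.0], 1250, 10e6)
gaps = [later - earlier for earlier, later in zip(deps, deps[1:])]
print([f"{g * 1000:.2f} ms" for g in gaps])
```

However tightly the sender packs the burst, packets leave the bottleneck one transmission time apart, so ACKs that come back with that spacing pace the sender at the bottleneck rate.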


Does ACK clocking solve it all?

• No

• A sender will not send faster than the bottleneck allows, but it may still under-utilize bandwidth or cause packets to be buffered at the bottleneck


Dynamic window sizing

• Sending speed: W / RTT

• Ideally, W / RTT = available bandwidth at the bottleneck

• → Adjust W based on the available bandwidth
1. Increase W when there is no congestion
2. Decrease W when there is congestion

How to detect congestion?

• Packet loss
– Timeout
– Duplicate ACKs

• Latency
– Current RTT − RTTmin

• Sending rate / ACKing rate

• ML?

TCP Reno: Two Modes of Congestion Control

1. Probing for the available bandwidth
– slow start (W < ssthresh)

2. Avoid overloading the network
– congestion avoidance (W >= ssthresh)

Slow Start

• Initial value: Set W = 1 MSS

• Modern TCP implementations may set the initial window size to a much larger value

• When receiving an ACK, W += 1 MSS

• W doubles every RTT
– Exponential increase
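The per-ACK rule can be simulated round by round to see the doubling. This sketch assumes one ACK per segment, no losses, and no delayed ACKs; the function name is made up for illustration.

```python
def slow_start_rounds(ssthresh_mss: int, initial_w_mss: int = 1) -> list:
    """Window (in MSS) at the start of each RTT until W reaches ssthresh."""
    w = initial_w_mss
    history = [w]
    while w < ssthresh_mss:
        acks_this_rtt = w  # one ACK returns for every segment sent this RTT
        for _ in range(acks_this_rtt):
            if w >= ssthresh_mss:
                break  # leave slow start as soon as W reaches ssthresh
            w += 1  # W += 1 MSS per ACK
        history.append(w)
    return history


print(slow_start_rounds(16))  # [1, 2, 4, 8, 16]
```

Adding one MSS per ACK sounds gentle, but since the number of ACKs per RTT equals the window, the window doubles each round.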

Congestion Avoidance

• If W >= ssthresh, then each time an ACK is received, increase W as follows:

• W += 1 / W (with W measured in MSS)

• As a result, W increases by about one MSS per RTT
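The claim that W += 1/W per ACK adds up to roughly one MSS per RTT can be checked numerically. The sketch below assumes W is measured in MSS and that one ACK returns per segment in the window; the function name is hypothetical.

```python
def congestion_avoidance_rtt(w_mss: float) -> float:
    """Apply W += 1/W once per ACK for one RTT's worth of ACKs."""
    acks_this_rtt = int(w_mss)  # one ACK per segment currently in the window
    for _ in range(acks_this_rtt):
        w_mss += 1.0 / w_mss
    return w_mss


w = 10.0
for rtt in range(1, 4):
    w = congestion_avoidance_rtt(w)
    print(f"after RTT {rtt}: W = {w:.2f} MSS")
```

The growth per RTT comes out slightly below one MSS because W rises during the round, which makes each later increment a little smaller.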

Reaction to Congestion

• Reduce W

• Timeout: severe congestion
– ssthresh is set to half of the current size of the congestion window: ssthresh = W / 2
– W is then reset to one MSS
– Enter slow start

Reaction to Congestion

• Duplicate ACKs: not so congested (why?)
• Fast retransmit
– Three duplicate ACKs indicate a packet loss
– Retransmit without timeout

Reaction to congestion: Fast Recovery

• Avoiding slow start (changed from TCP88)
– ssthresh = W / 2
– W = ssthresh + 3 MSS
– Increase W by one MSS for each additional duplicate ACK

• When an ACK arrives that acknowledges “new data,” set:
W = ssthresh (half of the window at the time of the loss)
and enter congestion avoidance, i.e., increase W by 1 MSS per RTT
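The reactions to timeouts and duplicate ACKs fit together as a small state machine. The sketch below is a toy model with W and ssthresh in units of MSS. It assumes the RFC 5681 form of window inflation (W = ssthresh + 3 MSS) and a 2 MSS floor on ssthresh; the class and method names are invented for illustration.

```python
class RenoWindow:
    """Toy Reno congestion window; W and ssthresh are in units of MSS."""

    def __init__(self, w: float = 1.0, ssthresh: float = 64.0):
        self.w = w
        self.ssthresh = ssthresh
        self.dup_acks = 0
        self.in_fast_recovery = False

    def on_new_ack(self) -> None:
        if self.in_fast_recovery:
            # New data acknowledged: deflate W and resume congestion avoidance.
            self.w = self.ssthresh
            self.in_fast_recovery = False
        elif self.w < self.ssthresh:
            self.w += 1.0  # slow start
        else:
            self.w += 1.0 / self.w  # congestion avoidance
        self.dup_acks = 0

    def on_dup_ack(self) -> bool:
        """Return True when the caller should fast-retransmit the missing segment."""
        self.dup_acks += 1
        if self.in_fast_recovery:
            self.w += 1.0  # another packet has left the network
            return False
        if self.dup_acks == 3:
            self.ssthresh = max(self.w / 2, 2.0)  # 2 MSS floor is an assumption
            self.w = self.ssthresh + 3.0
            self.in_fast_recovery = True
            return True
        return False

    def on_timeout(self) -> None:
        # Severe congestion: remember half the window, then restart from 1 MSS.
        self.ssthresh = max(self.w / 2, 2.0)
        self.w = 1.0
        self.dup_acks = 0
        self.in_fast_recovery = False


reno = RenoWindow(w=20.0, ssthresh=10.0)
for _ in range(3):
    retransmit = reno.on_dup_ack()
print(retransmit, reno.w, reno.ssthresh)
reno.on_new_ack()
print(reno.w, reno.in_fast_recovery)
```

Duplicate ACKs show that packets are still being delivered, so the window is only halved. A timeout means the ACK clock has stopped, so the sender falls back to slow start.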


The Sawtooth behavior of TCP

• For every ACK received
– W += 1/W

• For every packet lost
– W /= 2

• Only true in units of RTT

[Figure: sawtooth plot of the window size W over time, in units of RTT]

Why does it work? [Chiu-Jain]

• A feedback control system
• The network uses feedback y to adjust users’ load $\sum x_i$


Goals of Congestion Avoidance

– Efficiency: the closeness of the total load on the resource to its knee: $\sum x_i \approx$ capacity

– Fairness: $F(x) = \frac{(\sum x_i)^2}{n \sum x_i^2}$
• When all $x_i$ are equal, F(x) = 1
• When all $x_i$ are zero except $x_j = 1$, F(x) = 1/n

– Distributedness
• A centralized scheme requires complete knowledge of the state of the system

– Convergence
• The system approaches the goal state from any starting state
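The two fairness properties listed above (F = 1 for equal shares, F = 1/n when a single user holds everything) are those of Jain's fairness index. A minimal sketch, with a hypothetical function name, makes them easy to check:

```python
def jain_fairness(allocations) -> float:
    """Jain's fairness index: (sum of x_i)^2 / (n * sum of x_i^2)."""
    n = len(allocations)
    total = sum(allocations)
    sum_of_squares = sum(x * x for x in allocations)
    if n == 0 or sum_of_squares == 0:
        raise ValueError("need at least one non-zero allocation")
    return total * total / (n * sum_of_squares)


print(jain_fairness([5, 5, 5, 5]))  # 1.0
print(jain_fairness([0, 0, 1, 0]))  # 0.25
```

The index does not depend on the unit of the allocations, so the same function works for rates in bit/s or windows in MSS.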


Metrics to measure convergence

• Responsiveness
• Smoothness


Model the system as a linear control system

• Four sample types of controls
• AIAD, AIMD, MIAD, MIMD
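Of the four controls, AIMD is the one that drives two users toward equal shares. The sketch below simulates two flows that receive the same binary feedback each round, in the spirit of the Chiu-Jain model. The parameter values and starting allocations are arbitrary choices for illustration.

```python
def aimd_two_flows(x1, x2, capacity, rounds, add=1.0, mult=0.5):
    """Trace of (x1, x2) under additive increase and multiplicative decrease."""
    trace = [(x1, x2)]
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Overload feedback: both flows decrease multiplicatively.
            x1, x2 = x1 * mult, x2 * mult
        else:
            # Underload feedback: both flows increase additively.
            x1, x2 = x1 + add, x2 + add
        trace.append((x1, x2))
    return trace


trace = aimd_two_flows(x1=1.0, x2=80.0, capacity=100.0, rounds=500)
start_gap = abs(trace[0][0] - trace[0][1])
final_gap = abs(trace[-1][0] - trace[-1][1])
print(f"gap between the flows: {start_gap:.1f} at the start, {final_gap:.4f} at the end")
```

Additive increase leaves the gap between the two flows unchanged, while each multiplicative decrease halves it, so the allocations move toward the fairness line while the total load oscillates around capacity.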


Phase space

[Figure: phase-space plot with axes X1 = W1/RTT and X2 = W2/RTT]

TCP congestion control is AIMD

• Problems:
– Each source has to probe for its bandwidth
– Congestion occurs first before TCP backs off
– Unfair: long RTT flows obtain smaller bandwidth shares

[Figure: window size W over time, in units of RTT]

Macroscopic behavior of TCP

• Throughput is inversely proportional to RTT: $\text{Throughput} = \sqrt{1.5} \cdot \frac{MSS}{RTT \cdot \sqrt{p}}$

• In a steady state, total packets sent in one sawtooth cycle:
– $S = w + (w+1) + \dots + (w+w) \approx \frac{3}{2} w^2$

• The maximum window size is determined by the loss rate
– $1/S = p$
– $w = \sqrt{\frac{1}{1.5\,p}}$

• The length of one cycle: $w \cdot RTT$
• Average throughput: $\frac{3}{2} w \cdot MSS / RTT$
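The steady-state model can be evaluated directly: with w = sqrt(1 / (1.5 p)), the average throughput (3/2) w MSS / RTT simplifies to sqrt(1.5 / p) MSS / RTT. The sketch below computes both forms; the function names and the example path (MSS 1460 bytes, RTT 100 ms, 1% loss) are assumptions for illustration.

```python
import math


def window_from_loss(loss_rate: float) -> float:
    """w = sqrt(1 / (1.5 p)), from 1/S = p with S = (3/2) w^2."""
    return math.sqrt(1.0 / (1.5 * loss_rate))


def tcp_throughput_bps(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Average throughput (3/2) * w * MSS / RTT, in bit/s."""
    if not 0.0 < loss_rate <= 1.0:
        raise ValueError("loss_rate must be in (0, 1]")
    return 1.5 * window_from_loss(loss_rate) * mss_bytes * 8 / rtt_s


# Hypothetical path: MSS 1460 bytes, RTT 100 ms, 1% packet loss.
rate = tcp_throughput_bps(1460, 0.100, 0.01)
print(f"{rate / 1e6:.2f} Mbit/s")  # about 1.43 Mbit/s
```

Doubling the RTT halves the modeled throughput, while the loss rate has to fall by a factor of four to double it.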

Why is congestion control still relevant

• The Mice & Elephant phenomenon of Internet traffic
– Many small flows
– But a few large flows send most of the bytes


Conclusion

• Congestion control is one of the fundamental issues in networking

• TCP congestion control algorithm
• The AIMD algorithm
• TCP macroscopic behavior model
