1-1
Chapter 3: Transport Layer
OBJECTIVES:
understand principles behind transport layer services:
  multiplexing/demultiplexing
  reliable data transfer
  flow control
  congestion control
learn about transport layer protocols in the Internet:
  UDP: connectionless transport
  TCP: connection-oriented transport
  TCP congestion control
1-2
Transport services and protocols
provide logical communication between app processes running on different hosts
transport protocols run in end systems
  send side: breaks app messages into segments, passes to network layer
  rcv side: reassembles segments into messages, passes to app layer
more than one transport protocol available to apps
  Internet: TCP and UDP
[Figure: two end hosts, each with the application, transport, network, data link, and physical layers, connected through routers that carry only the network, data link, and physical layers; the transport layer provides logical end-end transport.]
1-3
Transport vs. Network Layer
network layer: logical communication between hosts
  PDU: datagram
  packets may be lost, duplicated, reordered in the Internet – “best effort” service
transport layer: logical communication between processes
  relies on, enhances, network layer services
  PDU: segment
  extends “host-to-host” communication to “process-to-process” communication
1-4
TCP/IP Transport Layer Protocols
reliable, in-order delivery: TCP
  congestion control
  flow control
  connection setup
unreliable, unordered delivery: UDP
  no-frills extension of “best-effort” IP
What does UDP provide in addition to IP?
  • process-to-process delivery
  • error checking
1-5
Multiplexing/Demultiplexing
Use the same communication channel between hosts for several logical communication processes
How does Mux/DeMux work?
  Sockets: doors between process & host
  UDP socket: (dest. IP, dest. port)
  TCP socket: (src. IP, src. port, dest. IP, dest. port)
[Figure: HTTP, FTP, and Telnet processes on each host share one transport layer and one network layer.]
1-6
Connectionless demux
UDP socket identified by two-tuple: (dest IP address, dest port number)
When host receives UDP segment:
  checks destination port number in segment
  directs UDP segment to socket with that port number
IP datagrams with different source IP addresses and/or source port numbers are directed to the same socket
1-7
Connection-oriented demux
TCP socket identified by 4-tuple:
  source IP address
  source port number
  dest IP address
  dest port number
recv host uses all four values to direct segment to appropriate socket
Server host may support many simultaneous TCP sockets:
  each socket identified by its own 4-tuple
Web servers have different sockets for each connecting client
  non-persistent HTTP will have different socket for each request
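The difference between the two demultiplexing rules can be sketched with two lookup tables. This is only an illustration: the dictionaries, function names, IP addresses, and socket labels below are hypothetical and do not correspond to any real socket API.

```python
# Hypothetical demultiplexing tables; all names and addresses are illustrative.
udp_sockets = {}  # (dest_ip, dest_port) -> socket label
tcp_sockets = {}  # (src_ip, src_port, dest_ip, dest_port) -> socket label


def udp_demux(src_ip, src_port, dest_ip, dest_port):
    """UDP looks only at the destination pair; the source fields are ignored."""
    return udp_sockets.get((dest_ip, dest_port))


def tcp_demux(src_ip, src_port, dest_ip, dest_port):
    """TCP uses all four values, so every client connection has its own socket."""
    return tcp_sockets.get((src_ip, src_port, dest_ip, dest_port))


udp_sockets[("10.0.0.1", 53)] = "dns_socket"
tcp_sockets[("10.0.0.2", 5000, "10.0.0.1", 80)] = "web_conn_A"
tcp_sockets[("10.0.0.3", 5000, "10.0.0.1", 80)] = "web_conn_B"
```

Two UDP senders with different source addresses reach the same "dns_socket", while the two TCP clients reach different sockets even though both target port 80.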
1-8
UDP: User Datagram Protocol [RFC 768]
“bare bones” Internet transport protocol
“best effort” service, UDP segments may be:
  lost
  delivered out of order to app
Why use UDP?
  No connection establishment cost (critical for some applications, e.g., DNS)
  No connection state
  Small segment headers (only 8 bytes)
  Finer application control over data transmission
1-9
UDP Segment Structure
often used for streaming multimedia apps
  loss tolerant
  rate sensitive
Other appl. protocols using UDP
  DNS
  SNMP
reliable transfer over UDP: add reliability at application layer
  application-specific error recovery!
UDP segment format (each row is 32 bits wide):
  source port # | dest port #
  length        | checksum
  application data (message)
length: length, in bytes, of UDP segment, including header
1-10
UDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment
Sender:
  treat segment contents as sequence of 16-bit integers
  checksum: addition (1’s complement sum) of segment contents
  sender puts checksum value into UDP checksum field
Receiver:
  compute checksum of received segment
  check if computed checksum equals checksum field value:
    NO - error detected
    YES - no error detected
1-11
Internet Checksum Example
Note: When adding numbers, a carryout from the most significant bit needs to be added to the result
Example: add two 16-bit integers
                1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
                1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
  wraparound  1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
  sum           1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
  checksum      0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
Weak error protection? Why is it useful?
1-12
Checksum Calculation
At the sender
  Set the value of the checksum field to 0.
  Add all 16-bit words of the segment using one’s complement arithmetic.
  The final result is complemented to obtain the checksum.
At the receiver
  Add all 16-bit words, including the checksum, and complement the result.
  All zeros => accept datagram, else reject.
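The sender and receiver steps can be sketched in a few lines of Python. This is a minimal sketch that works on a list of 16-bit words and leaves out padding and the parsing of a real UDP segment; the function names are illustrative.

```python
def ones_complement_sum(words):
    """Add 16-bit words, wrapping any carry out of bit 15 back into the sum."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # wraparound of the carry
    return total


def internet_checksum(words):
    """Sender side: complement of the one's complement sum."""
    return ~ones_complement_sum(words) & 0xFFFF


def accept(words, checksum):
    """Receiver side: sum everything including the checksum, then complement.

    A result of all zeros means no error was detected.
    """
    return (~ones_complement_sum(list(words) + [checksum]) & 0xFFFF) == 0


# The two 16-bit integers 1110011001100110 and 1101010101010101.
words = [0b1110011001100110, 0b1101010101010101]
checksum = internet_checksum(words)  # 0b0100010001000011
```

Flipping a single bit in either word makes `accept` return False, which is the error detection the checksum is meant to provide.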
1-13
UDP Checksum
Fill checksum with 0’s
Divide into 16-bit words (adding padding if required)
Add words using 1’s complement arithmetic
Complement the result and put in checksum field
Deliver UDP segment to IP
[Figure: UDP segment format, 32 bits wide: source port #, dest port #, length, checksum, then data (add padding to make data a multiple of 16 bits).]
1-14
Principles of Reliable Data Transfer
important in application, transport, and link layers
fundamentally important problem in networking
characteristics of unreliable channel will determine complexity of reliable data transfer (rdt) protocol
[Figure: the sending process and receiving process in the application layer see a reliable channel; in the transport layer, the RDT protocol (sending side) and RDT protocol (receiving side) implement it over the unreliable channel offered by the network layer.]
1-15
Reliable Data Transfer: FSMs
We’ll:
  incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
  consider only unidirectional data transfer
    but control info will flow in both directions!
  use finite state machines (FSM) to specify sender, receiver
state: when in this “state”, next state uniquely determined by next event
[Figure: FSM notation: an arrow from state1 to state2, labeled with the event causing the state transition and the actions taken on the state transition.]
1-16
Rdt1.0: Data Transfer over a Perfect Channel
underlying channel perfectly reliable
  no bit errors
  packets received in the order sent
  no loss of packets
separate FSMs for sender, receiver:
  sender sends data into underlying channel
  receiver reads data from underlying channel
sender (single state: Wait for call from above)
  rdt_send(data)
    packet = make_pkt(data); udt_send(packet)
receiver (single state: Wait for call from below)
  rdt_rcv(packet)
    extract(packet,data); deliver_data(data)
1-17
Rdt2.0: channel with bit errors [stop & wait protocol]
Assumptions
  All packets are received
  Packets may be corrupted (i.e., bits may be flipped)
  Checksum to detect bit errors
How to recover from errors? Use Automatic Repeat reQuest (ARQ) mechanism
  acknowledgements (ACKs): receiver explicitly tells sender that the packet is received correctly
  negative acknowledgements (NAKs): receiver explicitly tells sender that the packet had errors
  sender retransmits pkt on receipt of NAK
What about error correcting codes?
  ARQ needs error detection capability
1-18
rdt2.0: FSM specification
sender
  state: Wait for call from above
    rdt_send(data)
      sndpkt = make_pkt(data, checksum); udt_send(sndpkt)
      -> Wait for ACK or NAK
  state: Wait for ACK or NAK
    rdt_rcv(rcvpkt) && isNAK(rcvpkt)
      udt_send(sndpkt)
    rdt_rcv(rcvpkt) && isACK(rcvpkt)
      (no action)
      -> Wait for call from above
receiver
  state: Wait for call from below
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
      udt_send(NAK)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
      extract(rcvpkt,data); deliver_data(data); udt_send(ACK)
1-19
rdt2.0: Observations
1. A stop-and-wait protocol
2. What happens when ACK or NAK has bit errors?
  Approach 1: resend the current data packet?
    Duplicate packets
[Figure: rdt2.0 sender FSM, with states Wait for call from above and Wait for ACK or NAK.]
1-20
Handling Duplicate Packets
sender adds sequence number to each packet
sender retransmits current packet if ACK/NAK garbled
receiver discards (doesn’t deliver up) duplicate packet
1-21
rdt2.1: sender, handles garbled ACK/NAKs
state: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt)
    -> Wait for ACK or NAK 0
state: Wait for ACK or NAK 0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    (no action)
    -> Wait for call 1 from above
state: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt)
    -> Wait for ACK or NAK 1
state: Wait for ACK or NAK 1
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    (no action)
    -> Wait for call 0 from above
1-22
rdt2.1: receiver, handles garbled ACK/NAKs
state: Wait for 0 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    extract(rcvpkt,data); deliver_data(data)
    sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
    -> Wait for 1 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
state: Wait for 1 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt,data); deliver_data(data)
    sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
    -> Wait for 0 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
1-23
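rdt2.1: examples
[Figure: two sender/receiver timelines. In the first, an ACK is garbled (x), the sender resends PKT(0), and the receiver, which expects a pkt with seq. # 1, treats it as a duplicate pkt and ACKs it again before PKT(1) is sent. In the second, the messages shown are PKT(0), ACK, PKT(1), NAK, PKT(1), ACK, PKT(0), with two transmissions garbled (x).]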
1-24
rdt2.1: summary
Sender:
  seq # added to pkt
  two seq. #’s (0,1) will suffice. Why?
  must check if received ACK/NAK corrupted
  twice as many states
    state must “remember” whether “current” pkt has 0 or 1 seq. #
Receiver:
  must check if received packet is duplicate
    state indicates whether 0 or 1 is expected pkt seq #
  note: receiver cannot know if its last ACK/NAK was received OK at sender
1-25
rdt2.2: a NAK-free protocol
same functionality as rdt2.1, using ACKs only
instead of NAK, receiver sends ACK for last pkt received OK
  receiver must explicitly include seq # of pkt being ACKed
duplicate ACK at sender results in same action as NAK: retransmit current pkt
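A NAK-free receiver can be sketched as a small class. This is a minimal sketch under simplifying assumptions: corruption is passed in as a flag rather than detected with a checksum, and the class and method names are hypothetical.

```python
class Rdt22Receiver:
    """Alternating-bit receiver that answers with ACKs only (no NAKs)."""

    def __init__(self):
        self.expected = 0      # sequence number the receiver is waiting for
        self.delivered = []    # data passed up to the application

    def receive(self, seq, data, corrupt=False):
        """Process one packet and return the seq # carried in the ACK."""
        if not corrupt and seq == self.expected:
            self.delivered.append(data)
            self.expected = 1 - self.expected
            return seq
        # Corrupt or duplicate packet: ACK the last packet received OK.
        return 1 - self.expected


receiver = Rdt22Receiver()
acks = [
    receiver.receive(0, "a"),                # new packet, ACK 0
    receiver.receive(0, "a"),                # duplicate, ACK 0 again
    receiver.receive(1, "b", corrupt=True),  # garbled, ACK 0 again
    receiver.receive(1, "b"),                # new packet, ACK 1
]
```

A sender waiting for ACK 1 that sees ACK 0 again treats it like a NAK and retransmits the current packet.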
1-26
rdt2.2: sender, receiver fragments
sender FSM fragment
  state: Wait for call 0 from above
    rdt_send(data)
      sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt)
      -> Wait for ACK 0
  state: Wait for ACK 0
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,1))
      udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
      (no action)
receiver FSM fragment
  transition into Wait for 0 from below
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
      extract(rcvpkt,data); deliver_data(data)
      sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)
  state: Wait for 0 from below
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
      udt_send(sndpkt)
1-27
rdt3.0: The case of “Lossy” Channels
Assumption: underlying channel can also lose packets (data or ACKs)
Approach: sender waits “reasonable” amount of time for ACK (a Time-Out)
  Time-out value?
  Possibility of duplicate packets/ACKs?
if pkt (or ACK) just delayed (not lost):
  retransmission will be duplicate, but use of seq. #’s already handles this
  receiver must specify seq # of pkt being ACKed
1-28
rdt3.0 sender
state: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer
    -> Wait for ACK0
  rdt_rcv(rcvpkt)
    (no action)
state: Wait for ACK0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,1))
    (no action)
  timeout
    udt_send(sndpkt); start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
    stop_timer
    -> Wait for call 1 from above
state: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer
    -> Wait for ACK1
  rdt_rcv(rcvpkt)
    (no action)
state: Wait for ACK1
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,0))
    (no action)
  timeout
    udt_send(sndpkt); start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
    stop_timer
    -> Wait for call 0 from above
1-29
rdt3.0 in action
1-30
rdt3.0 in action
1-31
stop-and-wait operation
[Figure: sender/receiver timeline; D is the round trip propagation delay]
  first packet bit transmitted, t = 0
  last packet bit transmitted, t = L / R
  first packet bit arrives
  last packet bit arrives, send ACK
  ACK arrives, send next packet, t = D + L / R
Can we do better???
1-32
Pipelining: Motivation
Stop-and-wait allows the sender to only have a single unACKed packet at any time
example: 1 Mbps link (R), end-2-end round trip propagation delay (D) of 92 ms, 1 KB packet (L):

  T_transmit = L (packet length in bits) / R (transmission rate, bps) = (8 kb/pkt) / (10**3 kb/sec) = 8 ms

  U_sender = (L / R) / (D + L / R) = 8 ms / 100 ms = 0.08

1 KB pkt every 100 ms -> 80 Kbps throughput on a 1 Mbps link
What does bandwidth x delay product tell us?
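The stop-and-wait numbers can be checked with a short calculation. This sketch assumes the 1 KB packet is counted as 8 kb, that is 8000 bits, as the slide does; the function name is illustrative.

```python
def stop_and_wait(L_bits, R_bps, D_sec):
    """Return (transmit time, sender utilization, throughput in bps).

    One packet is sent per cycle of D + L/R seconds, where D is the
    round trip propagation delay.
    """
    t_transmit = L_bits / R_bps
    cycle = D_sec + t_transmit
    return t_transmit, t_transmit / cycle, L_bits / cycle


t_transmit, utilization, throughput = stop_and_wait(
    L_bits=8_000, R_bps=1_000_000, D_sec=0.092
)
print(f"T_transmit = {t_transmit * 1000:.1f} ms")
print(f"U_sender   = {utilization:.2f}")
print(f"throughput = {throughput / 1000:.0f} Kbps")
```

The sender is busy only 8 ms out of every 100 ms, so the 1 Mbps link carries about 80 Kbps.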
1-33
Pipelining: Motivation
1-34
Pipelined protocols
Pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver
Two generic forms of pipelined protocols
  go-Back-N
  selective repeat
1-35
Pipelining: increased utilization
[Figure: sender/receiver timeline with three packets in flight]
  first packet bit transmitted, t = 0
  last bit transmitted, t = L / R
  first packet bit arrives
  last packet bit arrives, send ACK
  last bit of 2nd packet arrives, send ACK
  last bit of 3rd packet arrives, send ACK
  ACK arrives, send next packet, t = D + L / R

  U_sender = (3 * L / R) / (D + L / R) = 24 ms / 100 ms = 0.24

Increase utilization by a factor of 3!
1-36
Go-Back-N
Allow up to N unACKed pkts in the network
  N is the window size
Sender operation:
  If window not full, transmit
  ACKs are cumulative
  On timeout, send all packets previously sent but not yet ACKed
  Uses a single timer – represents the oldest transmitted, but not yet ACKed pkt
Why limit the unACKed pkts to N???
1-37
Go-Back-N
1-38
GBN: sender extended FSM
initially: base=1; nextseqnum=1
state: Wait
  rdt_send(data)
    if (nextseqnum < base+N) {
      sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum)
        start_timer
      nextseqnum++
    }
    else
      refuse_data(data)
  timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    …
    udt_send(sndpkt[nextseqnum-1])
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
      stop_timer
    else
      start_timer
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    (no action)
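The sender FSM can be sketched as a small Python class. This is a simplified sketch: the channel is replaced by a list that logs transmitted sequence numbers, the timer is a boolean flag, and ACKs older than base are ignored, which is a guard added here and not part of the FSM. The class and attribute names are hypothetical.

```python
class GBNSender:
    """Go-Back-N sender with window size N and a single timer."""

    def __init__(self, window_size):
        self.N = window_size
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}            # seq # -> data, kept until ACKed
        self.timer_running = False
        self.sent = []              # log of transmitted seq #s (stand-in for udt_send)

    def rdt_send(self, data):
        """Send data if the window is not full; return False to refuse it."""
        if self.nextseqnum >= self.base + self.N:
            return False
        self.sndpkt[self.nextseqnum] = data
        self.sent.append(self.nextseqnum)
        if self.base == self.nextseqnum:
            self.timer_running = True
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        """Cumulative ACK: everything up to and including acknum is ACKed."""
        if acknum < self.base:
            return                  # assumption: stale ACKs are ignored
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = acknum + 1
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        """Resend every packet that was sent but not yet ACKed."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.sent.append(seq)


sender = GBNSender(window_size=3)
accepted = [sender.rdt_send(d) for d in "abcd"]  # the 4th call is refused
sender.on_ack(2)                                 # window slides to base = 3
sender.rdt_send("d")                             # now accepted as seq # 4
sender.on_timeout()                              # resends 3 and 4
```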
1-39
GBN: receiver extended FSM
ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
  may generate duplicate ACKs
  need only remember expectedseqnum
out-of-order pkt:
  discard (don’t buffer) -> no receiver buffering!
  Re-ACK pkt with highest in-order seq #
initially: expectedseqnum=1; sndpkt = make_pkt(expectedseqnum,ACK,chksum)
state: Wait
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum)
    extract(rcvpkt,data); deliver_data(data)
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)
    udt_send(sndpkt)
    expectedseqnum++
  default
    udt_send(sndpkt)
1-40
GBN in action
Is there a performance issue with GBN?
1-41
Selective Repeat
receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer
sender only resends pkts for which ACK not received
  sender timer for each unACKed pkt
sender window
  N consecutive seq #’s
  again limits seq #s of sent, unACKed pkts
1-42
Selective repeat: sender, receiver windows
1-43
Selective repeat
sender
  data from above:
    if next available seq # in window, send pkt
  timeout(n):
    resend pkt n, restart timer
  ACK(n) in [sendbase,sendbase+N-1]:
    mark pkt n as received
    if n smallest unACKed pkt, advance window base to next unACKed seq #
receiver
  pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
  pkt n in [rcvbase-N,rcvbase-1]:
    ACK(n)
  otherwise:
    ignore
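The receiver rules can be sketched as follows. This is a minimal sketch that assumes sequence numbers grow without wrapping around and that packets arrive uncorrupted; the class and attribute names are hypothetical.

```python
class SRReceiver:
    """Selective repeat receiver with window [rcvbase, rcvbase+N-1]."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}      # out-of-order packets waiting for delivery
        self.delivered = []   # data passed up in order

    def receive(self, n, data):
        """Return the ACK number that is sent, or None if the pkt is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver the in-order run starting at rcvbase and slide the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n          # already delivered: ACK again, do not redeliver
        return None


rcv = SRReceiver(window_size=4)
acks = [rcv.receive(n, f"p{n}") for n in (0, 2, 3, 1)]
```

Packets 2 and 3 are buffered until packet 1 arrives, at which point all three are delivered and the window advances to 4.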
1-44
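Selective Repeat Example
[Figure: sender/receiver timeline over sequence numbers 0 to 9 with a window of 4. The sender transmits PKT0 to PKT3 and the receiver returns ACK0 to ACK3; the sender then sends PKT4, a time-out leads to PKT1 being resent, and ACK1 and ACK4 are returned as the windows slide.]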
1-45
Another Example
1-46
Selective repeat: dilemma
Example: seq #’s: 0, 1, 2, 3 window size=3
receiver sees no difference in two scenarios!
incorrectly passes duplicate data as new in (a)
Q: what is the relationship between seq # size and window size?
1-47
Transmission Control Protocol
1-48
TCP segment structure
[Figure: TCP segment format, 32 bits wide]
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)
field notes:
  sequence number, acknowledgement number: counting by bytes of data (not segments!)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
1-49
Sequence and Acknowledgement Number
TCP views data as unstructured, but ordered stream of bytes.
Sequence numbers are over bytes, not segments
Initial sequence number is chosen randomly
TCP is full duplex – numbering of data is independent in each direction
Acknowledgement number – sequence number of the next byte expected from the sender
ACKs are cumulative
1-50
TCP seq. #’s and ACKs
Seq. #’s:
  byte stream “number” of first byte in segment’s data
ACKs:
  seq # of next byte expected from other side
  cumulative ACK
Q: how receiver handles out-of-order segments
  A: TCP spec doesn’t say, up to implementor
Example (time runs downward):
  Host A -> Host B: Seq=42, ACK=79, data (1000 bytes)
  Host B -> Host A: Seq=79, ACK=1042, no data (host ACKs receipt of data)
  Host A -> Host B: Seq=1042, ACK=79, data (host sends another 500 bytes)
  Host B -> Host A: Seq=79, ACK=1542, no data
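Since the ACK number is the sequence number of the next byte expected, a segment with Seq=42 carrying 1000 bytes covers bytes 42 through 1041 and is acknowledged with 1042. The sketch below computes a cumulative ACK from the segments received so far; the function name and the (seq, length) representation are illustrative.

```python
def cumulative_ack(expected, segments):
    """Return the next byte expected after accounting for received segments.

    expected: sequence number of the next in-order byte before these segments.
    segments: iterable of (seq, length) pairs for data segments received.
    """
    received = {seq: length for seq, length in segments if length > 0}
    while expected in received:
        expected += received[expected]
    return expected


ack_after_first = cumulative_ack(42, [(42, 1000)])
ack_after_second = cumulative_ack(42, [(42, 1000), (1042, 500)])
ack_with_gap = cumulative_ack(42, [(1042, 500)])  # first segment still missing
```

With a gap, the ACK number stays at 42 no matter how much later data has arrived, which is what makes the ACKs cumulative.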
1-51
TCP reliable data transfer
TCP creates rdt service on top of IP’s unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer
Retransmissions are triggered by:
  timeout events
  duplicate acks
Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control
1-52
TCP sender events:
data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval
timeout:
  retransmit segment that caused timeout
  restart timer
Ack rcvd:
  If acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments
1-53
TCP sender (simplified)
NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
} /* end of loop forever */

Comment:
  • SendBase-1: last cumulatively ack’ed byte
Example:
  • SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
1-54
TCP Flow Control
receive side of TCP connection has a receive buffer
  app process may be slow at reading from buffer
flow control: sender won’t overflow receiver’s buffer by transmitting too much, too fast
speed-matching service: matching the send rate to the receiving app’s drain rate
1-55
TCP Flow control: how it works
(Assumption: TCP receiver discards out-of-order segments)
spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn’t overflow
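The window computation and the sender-side limit can be sketched as two small functions. The sender-side names `last_byte_sent` and `last_byte_acked` are assumptions introduced for this sketch (they do not appear on the slide), and the byte counts in the example are made up.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if sending nbytes more keeps unACKed data within the window.

    last_byte_sent and last_byte_acked are hypothetical sender-side counters.
    """
    return (last_byte_sent - last_byte_acked) + nbytes <= window


# 4096-byte buffer holding 2000 bytes the app has not read yet.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
ok_small = sender_may_send(5000, 4000, window, nbytes=1000)
ok_large = sender_may_send(5000, 4000, window, nbytes=1200)
```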
1-56
Principles of Congestion Control
Congestion: informally: “too many sources sending too much data too fast for the network to handle”
Different from flow control!
Manifestations:
  Packet loss (buffer overflow at routers)
  Increased end-to-end delays (queuing in router buffers)
Congestion results in unfairness and poor utilization of network resources
  Resources used by dropped packets (before they were lost)
  Retransmissions
  Poor resource allocation at high load
1-57
Congestion Control: Approaches
Goal: Throttle senders as needed to ensure load on the network is “reasonable”
End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP
Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (e.g., ECN)
    explicit rate at which sender should send
1-58
TCP Congestion Control: Overview
end-end control (no network assistance)
Limit the number of packets in the network to window W
Roughly,
  rate = W / RTT  Bytes/sec
W is dynamic, function of perceived network congestion
1-59
TCP Congestion Controls
Tahoe (Jacobson 1988)
  Slow Start
  Congestion Avoidance
  Fast Retransmit
Reno (Jacobson 1990)
  Fast Recovery
SACK
Vegas (Brakmo & Peterson 1994)
  Delay and loss as indicators of congestion
1-60
Slow Start
“Slow Start” is used to reach the equilibrium state
Initially: W = 1 (slow start)
On each successful ACK: W ← W + 1
Exponential growth of W each RTT: W ← 2 x W
Enter CA when W >= ssthresh
ssthresh: window size after which TCP cautiously probes for bandwidth
[Figure: sender/receiver timeline of data segments and ACKs, with cwnd values 1 through 8 marked on the sender side as the window grows.]
1-61
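Congestion Avoidance
Starts when W >= ssthresh
On each successful ACK: W ← W + 1/W
Linear growth of W each RTT: W ← W + 1
[Figure: sender/receiver timeline of data segments and ACKs, with window values 1 through 4 marked as the window grows by one per RTT.]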
1-62
CA: Additive Increase, Multiplicative Decrease
We have “additive increase” in the absence of loss events
After loss event, decrease congestion window by half – “multiplicative decrease”
  ssthresh = W/2
  Enter Slow Start
1-63
Detecting Packet Loss
Assumption: loss indicates congestion
Option 1: time-out
  Waiting for a time-out can be long!
Option 2: duplicate ACKs
  How many? At least 3.
[Figure: sender/receiver timeline: the sender transmits segments 10 to 17, one segment is lost (X), and the receiver returns repeated (duplicate) ACKs.]
1-64
Fast Retransmit
Waiting for a timeout takes quite long
Immediately retransmits after 3 dupACKs without waiting for timeout
Adjusts ssthresh: ssthresh ← W/2
Enter Slow Start: W = 1
1-65
How to Set TCP Timeout Value?
longer than RTT
  but RTT varies
too short: premature timeout
  unnecessary retransmissions
too long: slow reaction to segment loss
1-66
How to Estimate RTT?
SampleRTT: measured time from segment transmission until ACK receipt
  ignore retransmissions
SampleRTT will vary, want estimated RTT “smoother”
  average several recent measurements, not just current SampleRTT
1-67
TCP Round-Trip Time and Timeout
EstimatedRTT = (1 - α)*EstimatedRTT + α*SampleRTT
EWMA (exponentially weighted moving average)
influence of past sample decreases exponentially fast
typical value: α = 0.125
[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; x-axis: time (seconds), y-axis: RTT (milliseconds), 100 to 350; curves: SampleRTT and Estimated RTT.]
1-68
TCP Round Trip Time and Timeout
Setting the timeout
  EstimatedRTT plus “safety margin”
    large variation in EstimatedRTT -> larger safety margin
  first estimate of how much SampleRTT deviates from EstimatedRTT:
    DevRTT = (1 - β)*DevRTT + β*|SampleRTT - EstimatedRTT|
    (typically, β = 0.25)
  Then set timeout interval:
    TimeoutInterval = µ*EstimatedRTT + Ø*DevRTT
    Typically, µ = 1 and Ø = 4.
[Jacobson/Karels Algorithm]
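The estimator can be sketched as a small class. Several choices here are assumptions rather than something the slides specify: the first sample initializes EstimatedRTT, DevRTT starts at 0, and EstimatedRTT is updated before DevRTT (real TCP stacks may order and initialize these differently). The class name and the sample values are illustrative.

```python
class RttEstimator:
    """EWMA round trip time estimator with a deviation-based timeout."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25, mu=1.0, phi=4.0):
        self.alpha, self.beta, self.mu, self.phi = alpha, beta, mu, phi
        self.estimated_rtt = first_sample  # assumption: seeded by first sample
        self.dev_rtt = 0.0                 # assumption: no deviation yet

    def update(self, sample_rtt):
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        self.estimated_rtt = (
            (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        )
        self.dev_rtt = (
            (1 - self.beta) * self.dev_rtt
            + self.beta * abs(sample_rtt - self.estimated_rtt)
        )
        return self.timeout_interval()

    def timeout_interval(self):
        return self.mu * self.estimated_rtt + self.phi * self.dev_rtt


estimator = RttEstimator(first_sample=100.0)  # milliseconds
timeout = estimator.update(120.0)
```

A single sample of 120 ms moves the estimate only to 102.5 ms, while the deviation term widens the timeout to 120 ms.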
1-69
TCP Tahoe: Summary
Basic ideas
  Gently probe network for spare capacity
  Drastically reduce rate on congestion
  Windowing: self-clocking
  Other functions: round trip time estimation, error recovery

for every ACK {
  if (W < ssthresh) then W++   (SS)
  else W += 1/W                (CA)
}
for every loss {
  ssthresh = W/2
  W = 1
}
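The two update rules translate directly into Python. This sketch tracks only W and ssthresh, measured in segments; timers, retransmission, and duplicate-ACK counting are left out, and the starting ssthresh of 8 is an arbitrary value chosen for the example.

```python
def tahoe_on_ack(W, ssthresh):
    """Return the new window after one successful ACK."""
    if W < ssthresh:
        return W + 1        # slow start (SS)
    return W + 1.0 / W      # congestion avoidance (CA)


def tahoe_on_loss(W):
    """Return (new W, new ssthresh) after a loss event."""
    return 1, W / 2


W, ssthresh = 1, 8          # ssthresh = 8 is an illustrative starting value
trace = [W]
for _ in range(8):          # eight ACKs: seven in SS, one in CA
    W = tahoe_on_ack(W, ssthresh)
    trace.append(W)
W, ssthresh = tahoe_on_loss(W)
```

The trace climbs 1, 2, ..., 8 in slow start, grows by 1/8 on the next ACK, and the loss then resets W to 1 with ssthresh at half the window reached.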
1-70
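TCP Tahoe
[Figure: Window vs. Time. Slow Start raises the window until it reaches the initial ssthresh value, then TCP switches to CA mode. A loss at window W1 sets ssthresh=W1/2 and restarts Slow Start; a later loss at window W2 sets ssthresh=W2/2.]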
1-71
Questions?
Q. 1. To what value is ssthresh initialized at the start of the algorithm?
Q. 2. Why is “Fast Retransmit” triggered on receiving 3 duplicate ACKs (i.e., why isn’t it triggered on receiving a single duplicate ACK)?
Q. 3. Can we do better than TCP Tahoe?
1-72
TCP Reno
[Figure: Window vs. Time. Slow Start until the initial ssthresh value is reached, then switch to CA mode. Note how there is “Fast Recovery” after cutting Window in half.]
1-73
TCP Reno: Fast Recovery
Objective: prevent `pipe’ from emptying after fast retransmit
  each dup ACK represents a packet having left the pipe (successfully received)
  Let’s enter the “FR/FR” mode on 3 dup ACKs
    ssthresh ← W/2
    retransmit lost packet
    W ← ssthresh + ndup (window inflation)
    Wait till W is large enough; transmit new packet(s)
    On non-dup ACK (1 RTT later)
      W ← ssthresh (window deflation)
      enter CA mode
1-74
TCP Fairness
Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R.]
1-75
Chapter 3: Summary
principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control
instantiation and implementation in the Internet
  UDP
  TCP
Next Lecture:
  leaving the network “edge” (application, transport layers)
  into the network “core”