
Transport Layer


3rd Edition: Chapter 3

Our goals:
- understand principles behind transport layer services:
  - multiplexing/demultiplexing
  - reliable data transfer
  - flow control
  - congestion control
- learn about transport layer protocols in the Internet:
  - UDP: connectionless transport
  - TCP: connection-oriented transport
  - TCP congestion control

Transport Layer Topics

- Review: multiplexing, connection-oriented and connectionless transport, services provided by a transport layer
- UDP
- Reliable transport
  - Tools for a reliable transport layer: error detection, ACK/NACK, ARQ
  - Approaches to reliable transport: Go-Back-N, selective repeat
- TCP
  - Services
  - Connection setup, ACKs and sequence numbers, timeout and triple-duplicate ACK, slow start, congestion avoidance

[Figure: protocol stacks (application, transport, network, link, physical) on the end hosts, with a web browser app on one host and a Google server app on the other, exchanging messages.]

Key transport layer service: send messages between apps. The sender just specifies the destination and the message, and that's it.

Key service the transport layer requires: the network should attempt to deliver segments.

Transport layer

- Transfers messages between applications in hosts
  - For FTP, you exchange files and directory information.
  - For HTTP, you exchange requests and replies/files.
  - For SMTP, messages are exchanged.
- Services possibly provided
  - Reliability
  - Error detection/correction
  - Flow/congestion control
  - Multiplexing (support several messages being transported simultaneously)

Connection-oriented / connectionless

- TCP supports the idea of a connection
  - Once listen and connect complete, there is a logical connection between the hosts.
  - One can determine if the message was sent
- UDP is connectionless
  - Packets are just sent. There is no concept (supported by the transport layer) of a connection.
  - But the application can make a connection over UDP. In that case the application in each host handles the handshaking and monitors the state of the connection.

There are other transport layer protocols besides TCP and UDP, such as SCTP, but TCP and UDP are the most popular.

TCP vs. UDP

TCP:
- Connection-oriented
  - Connections must be set up
  - The state of the connection can be determined
- Flow/congestion control
  - Limits congestion in the network and end hosts
  - Controls how fast data can be sent
- Larger packet header
- Automatically retransmits lost packets and reports if the message was not successfully transmitted
- Checksum for error detection

UDP:
- Connectionless
  - Connections do not need to be set up
  - No feedback provided as to whether packets were successfully delivered
- No flow/congestion control
  - Could cause excessive congestion and unfair usage
  - Data can be sent exactly when it needs to be
- Low overhead
- Checksum for error detection

Applications and Transport Protocols

  Application                        TCP or UDP?
  SMTP                               TCP
  Telnet                             TCP
  HTTP                               TCP
  FTP                                TCP
  Multimedia streaming via YouTube   TCP
  VoIP via Skype                     UDP
  DNS                                UDP
  NFS                                TCP or UDP

Multiplexing with ports

[Figure: clients with IP addresses A and B and a server with IP address C, running processes P1 to P6. Client A sends a segment with SP: 9157, DP: 80 (S-IP: A, D-IP: C). Client B sends segments with SP: 9157, DP: 80 and with SP: 5775, DP: 80 (S-IP: B, D-IP: C).]

- Transport layer packet headers always contain the source and destination ports
- IP headers have the source and destination IP addresses
- When a message is sent, the destination port must be known. However, the source port can be selected by the OS.

About multiplexing

- HTTP usually has port 80 as the destination, but you can make a web server listen on any port that is not already used by another application
- Well-known ports (0-1023), assigned by IANA:
  - HTTP: 80
  - HTTP over SSL: 443
  - FTP: 21
  - Telnet: 23
  - DNS: 53
- Microsoft server (Remote Desktop): 3389, which lies outside the well-known range
- Typically, only one application can listen on a port at a time (packet-capture tools built on PCAP, such as Wireshark, can still observe traffic on ports that are already in use)
- For TCP, the source port is normally chosen by the OS when the client connects (an application can request a specific one by binding first, but rarely does). For UDP, applications commonly set the source port themselves.
- A connection is defined by a 5-tuple: source IP, source port, destination IP, destination port, and transport protocol.
- NATs make use of these five pieces of information. NATs are discussed in detail in Chapter 4, but they depend on the transport layer.
- Since connections are defined by ports and addresses, there are cross-layer dependencies (the transport layer cannot demultiplex without knowledge of the IP addresses, which is a concept of a different layer).
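
The 5-tuple lookup described above can be sketched as a dictionary keyed by the five fields. This is a toy illustration, not how any particular OS implements it: `FiveTuple`, `Demultiplexer`, and the mapping of segments to processes P4 to P6 are hypothetical names chosen for the example, and a real stack also has to match listening sockets that only know the destination port.

```python
from typing import Dict, NamedTuple, Optional


class FiveTuple(NamedTuple):
    """Hypothetical key identifying one connection."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str  # "TCP" or "UDP"


class Demultiplexer:
    """Toy demultiplexer: maps a 5-tuple to the name of the owning process."""

    def __init__(self) -> None:
        self._table: Dict[FiveTuple, str] = {}

    def register(self, key: FiveTuple, process: str) -> None:
        self._table[key] = process

    def deliver(self, key: FiveTuple) -> Optional[str]:
        # Returns None when no connection matches the segment.
        return self._table.get(key)


# Assumed assignment of processes, for illustration only.
demux = Demultiplexer()
demux.register(FiveTuple("A", 9157, "C", 80, "TCP"), "P4")
demux.register(FiveTuple("B", 9157, "C", 80, "TCP"), "P5")
demux.register(FiveTuple("B", 5775, "C", 80, "TCP"), "P6")
```

Two segments that share source port 9157 and destination port 80 still reach different processes because their source IP addresses differ, which is the cross-layer dependency noted above.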

Chapter 3 outline

- 3.1 Transport-layer services
- 3.2 Multiplexing and demultiplexing
- 3.3 Connectionless transport: UDP
- 3.4 Principles of reliable data transfer
- 3.5 Connection-oriented transport: TCP
  - segment structure
  - reliable data transfer
  - flow control
  - connection management
- 3.6 Principles of congestion control
- 3.7 TCP congestion control

UDP: User Datagram Protocol [RFC 768]

- "no frills", "bare bones" Internet transport protocol
- "best effort" service, UDP segments may be:
  - lost
  - delivered out of order to app
- connectionless:
  - no handshaking between UDP sender, receiver
  - each UDP segment handled independently of others

Why is there a UDP?

- no connection establishment (which can add delay)
- simple: no connection state at sender, receiver
- small segment header
- no congestion control: UDP can blast away as fast as desired

UDP: more

- often used for streaming multimedia apps
  - loss tolerant
  - rate sensitive
- other UDP uses
  - DNS
  - SNMP
- reliable transfer over UDP: add reliability at application layer
  - application-specific error recovery!

UDP segment format (each row is 32 bits wide):

  source port #        | dest port #
  length               | checksum
  application data (message)

Length is the length, in bytes, of the UDP segment, including the header.

UDP checksum

Goal: detect errors (e.g., flipped bits) in transmitted segment

Sender:
- treat segment contents as sequence of 16-bit integers
- checksum: 1's complement of the 1's complement sum of segment contents
- sender puts checksum value into UDP checksum field

Receiver:
- compute checksum of received segment
- check if computed checksum equals checksum field value:
  - NO: error detected
  - YES: no error detected. But maybe errors nonetheless? More later.
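
The sender and receiver steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names are hypothetical, the input is already a list of 16-bit integers rather than raw bytes, and the UDP pseudo-header is ignored.

```python
from typing import Iterable


def ones_complement_sum16(words: Iterable[int]) -> int:
    """Add 16-bit words, wrapping any carry out of bit 15 back into bit 0."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # wraparound of the carry
    return total


def internet_checksum(words: Iterable[int]) -> int:
    """Sender side: 1's complement of the 1's complement sum."""
    return ~ones_complement_sum16(words) & 0xFFFF


def looks_intact(words: Iterable[int], checksum: int) -> bool:
    """Receiver side: data plus checksum must sum to all ones."""
    return ones_complement_sum16(list(words) + [checksum]) == 0xFFFF
```

For the two words 1110011001100110 and 1101010101010101 (0xE666 and 0xD555), `internet_checksum` yields 0x4443, that is 0100010001000011. A passing check does not prove the segment is error free, since some combinations of bit flips cancel out in the sum.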

Internet Checksum Example

Note: when adding numbers, a carryout from the most significant bit needs to be added to the result. (Kurose and Ross do not say anything about wrapping the carry and adding it to the low-order bit.)

Example: add two 16-bit integers

                  1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
                  1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
                  -------------------------------
  wraparound    1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1

  sum             1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
  checksum        0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1

Principles of reliable data transfer


Reliable data transfer: getting started

[Figure: send side and receive side of the rdt protocol, connected by an unreliable channel.]

- rdt_send(): called from above (e.g., by app). Passes data to deliver to receiver upper layer
- udt_send(): called by rdt to transfer packet over unreliable channel to receiver
- rdt_rcv(): called when packet arrives on rcv-side of channel
- deliver_data(): called by rdt to deliver data to upper layer

Application-implemented reliable data transfer

[Figure: in each host, the main app uses a communication component inside the application that provides a reliable channel on top of UDP, the unreliable channel offered by the transport layer.]

Pros and cons of implementing a reliable transport protocol in the application

- Cons
  - It is already done by the OS; why reinvent the wheel?
  - The OS might have higher priority than the application.
- Pros
  - The OS's TCP is designed to work in every scenario, but your app might only exist in specific scenarios:
    - Network storage device
    - Mobile phone
    - Cloud app

Reliable data transfer: getting started

We'll:
- incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
- consider only unidirectional data transfer
  - but control info will flow in both directions!
- use finite state machines (FSM) to specify sender, receiver

FSM notation: an arrow from state 1 to state 2 is labeled with the event causing the state transition and the actions taken on the transition. State: when in this "state", the next state is uniquely determined by the next event.

Rdt1.0: reliable transfer over a reliable channel

- Assume that the underlying channel is perfectly reliable
  - no bit errors
  - no loss of packets
- Make separate FSMs for sender, receiver:
  - sender sends data into underlying channel
  - receiver reads data from underlying channel

Sender (single state "Wait for call from above"):
  on rdt_send(data):
    segment = make_pkt(data)
    udt_send(segment)

Receiver (single state "Wait for call from below"):
  on rdt_rcv(segment):
    data = extract(segment)
    deliver_data(data)

Rdt2.0: channel with bit errors

- underlying channel may flip bits in packets
  - checksum to detect bit errors
- the question: how to recover from errors:
  - acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  - negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  - sender retransmits pkt on receipt of NAK
- new mechanisms in rdt2.0 (beyond rdt1.0):
  - error detection
  - receiver feedback: control msgs (ACK, NAK) rcvr->sender

rdt2.0: FSM specification

Sender:
  State "Wait for call from above":
    on rdt_send(data):
      sndpkt = make_pkt(data, checksum)
      udt_send(sndpkt)
      go to "Wait for ACK or NAK"
  State "Wait for ACK or NAK":
    on rdt_rcv(rcvpkt) && isNAK(rcvpkt):
      udt_send(sndpkt)
    on rdt_rcv(rcvpkt) && isACK(rcvpkt):
      go to "Wait for call from above"

Receiver:
  State "Wait for call from below":
    on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
      data = extract(rcvpkt)
      deliver_data(data)
      udt_send(ACK)
    on rdt_rcv(rcvpkt) && corrupt(rcvpkt):
      udt_send(NAK)

rdt2.0 has a fatal flaw!

What happens if ACK/NAK is corrupted?
- sender doesn't know what happened at receiver!
- can't just retransmit: possible duplicate

Handling duplicates:
- sender retransmits current pkt if ACK/NAK garbled
- sender adds sequence number to each pkt
- receiver discards (doesn't deliver up) duplicate pkt

Stop and wait: sender sends one packet, then waits for receiver response.

rdt2.1: sender, handles garbled ACK/NAKs

  State "Wait for call 0 from above":
    on rdt_send(data):
      sndpkt = make_pkt(0, data, checksum)
      udt_send(sndpkt)
      go to "Wait for ACK or NAK 0"
  State "Wait for ACK or NAK 0":
    on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)):
      udt_send(sndpkt)
    on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt):
      go to "Wait for call 1 from above"
  State "Wait for call 1 from above":
    on rdt_send(data):
      sndpkt = make_pkt(1, data, checksum)
      udt_send(sndpkt)
      go to "Wait for ACK or NAK 1"
  State "Wait for ACK or NAK 1":
    on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)):
      udt_send(sndpkt)
    on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt):
      go to "Wait for call 0 from above"

rdt2.1: receiver, handles garbled ACK/NAKs

  State "Wait for 0 from below":
    on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt):
      extract(rcvpkt, data)
      deliver_data(data)
      sndpkt = make_pkt(ACK, chksum)
      udt_send(sndpkt)
      go to "Wait for 1 from below"
    on rdt_rcv(rcvpkt) && corrupt(rcvpkt):
      sndpkt = make_pkt(NAK, chksum)
      udt_send(sndpkt)
    on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt):
      sndpkt = make_pkt(ACK, chksum)
      udt_send(sndpkt)
  State "Wait for 1 from below":
    on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt):
      extract(rcvpkt, data)
      deliver_data(data)
      sndpkt = make_pkt(ACK, chksum)
      udt_send(sndpkt)
      go to "Wait for 0 from below"
    on rdt_rcv(rcvpkt) && corrupt(rcvpkt):
      sndpkt = make_pkt(NAK, chksum)
      udt_send(sndpkt)
    on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt):
      sndpkt = make_pkt(ACK, chksum)
      udt_send(sndpkt)

rdt2.1: discussion

Sender:
- seq # added to pkt
- two seq. #s (0, 1) will suffice. Why?
- must check if received ACK/NAK corrupted
- twice as many states
  - state must "remember" whether "current" pkt has 0 or 1 seq. #

Receiver:
- must check if received packet is duplicate
  - state indicates whether 0 or 1 is expected pkt seq #
- note: receiver cannot know if its last ACK/NAK was received OK at sender

rdt2.2: a NAK-free protocol

- same functionality as rdt2.1, using ACKs only
- instead of NAK, receiver sends ACK for last pkt received OK
  - receiver must explicitly include seq # of pkt being ACKed
- duplicate ACK at sender results in same action as NAK: retransmit current pkt

rdt2.2: sender, receiver fragments

Sender FSM fragment:
  State "Wait for call 0 from above":
    on rdt_send(data):
      sndpkt = make_pkt(0, data, checksum)
      udt_send(sndpkt)
      go to "Wait for ACK0"
  State "Wait for ACK0":
    on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,1)):
      udt_send(sndpkt)
    on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0):
      (no action; leave this state)

Receiver FSM fragment:
  Entering "Wait for 0 from below", on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt):
      extract(rcvpkt, data)
      deliver_data(data)
      sndpkt = make_pkt(ACK1, chksum)
      udt_send(sndpkt)
  State "Wait for 0 from below":
    on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt)):
      udt_send(sndpkt)

What happens if a pkt is duplicated?

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
- checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Approach: sender waits "reasonable" amount of time for ACK
- retransmits if no ACK received in this time
- if pkt (or ACK) just delayed (not lost):
  - retransmission will be duplicate, but use of seq. #s already handles this
  - receiver must specify seq # of pkt being ACKed
- requires countdown timer

rdt3.0 sender

  State "Wait for call 0 from above":
    on rdt_send(data):
      sndpkt = make_pkt(0, data, checksum)
      udt_send(sndpkt)
      start_timer
      go to "Wait for ACK0"
    on rdt_rcv(rcvpkt):
      (no action)
  State "Wait for ACK0":
    on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,1)):
      (no action)
    on timeout:
      udt_send(sndpkt)
      start_timer
    on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0):
      stop_timer
      go to "Wait for call 1 from above"
  State "Wait for call 1 from above":
    on rdt_send(data):
      sndpkt = make_pkt(1, data, checksum)
      udt_send(sndpkt)
      start_timer
      go to "Wait for ACK1"
    on rdt_rcv(rcvpkt):
      (no action)
  State "Wait for ACK1":
    on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,0)):
      (no action)
    on timeout:
      udt_send(sndpkt)
      start_timer
    on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1):
      stop_timer
      go to "Wait for call 0 from above"

rdt3.0 in action

[Figure: sender/receiver timelines exchanging pkt0/ack0, pkt1/ack1 and pkt2/ack2. The scenarios include a timeout (TO) after which the sender resends pkt1, and a duplicate ack1 in response to which the sender sends no pkt.]

Performance of rdt3.0

- rdt3.0 works, but performance stinks
- ex: 1 Gbps link, 15 ms prop. delay, 8000 bit packet and 100 bit ACK. What is the total delay?
  - Data transmission delay: 8000 / 10^9 = 8*10^-6 sec
  - ACK transmission delay: 100 / 10^9 = 10^-7 sec
  - Total delay: 2*15 ms + 0.008 ms + 0.0001 ms = 30.0081 ms
- Utilization: time transmitting / total time = 0.008 / 30.0081 = 0.00027

This is one pkt every 30msec or 33 kB/sec over a 1 Gbps link!

Is this only a problem on fast links? That is, was this a problem in 1974 when data rates were very low?
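
The arithmetic above is easy to replay for other link speeds, which helps with the question about slow links. The sketch below is only an illustration; the function name and parameter names are made up for this example, and it assumes the same delay model as the slide (two propagation delays plus the data and ACK transmission delays).

```python
from typing import Tuple


def stop_and_wait(link_bps: float, prop_delay_s: float,
                  packet_bits: int, ack_bits: int) -> Tuple[float, float, float]:
    """Return (total delay per packet in s, utilization, throughput in bit/s)."""
    t_data = packet_bits / link_bps   # data transmission delay
    t_ack = ack_bits / link_bps       # ACK transmission delay
    total = 2 * prop_delay_s + t_data + t_ack
    utilization = t_data / total      # fraction of time spent sending data
    throughput = packet_bits / total  # one packet per total delay
    return total, utilization, throughput


# Values from the example: 1 Gbps, 15 ms propagation delay.
fast = stop_and_wait(1e9, 0.015, 8000, 100)
# Same path at 56 kbps (an assumed "slow link" for comparison).
slow = stop_and_wait(56e3, 0.015, 8000, 100)
```

On the slow link the transmission delay dominates the propagation delay, so the computed utilization is far higher than on the 1 Gbps link.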

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timeline.]
- first packet bit transmitted, t = 0
- last packet bit transmitted, t = L / R
- first packet bit arrives
- last packet bit arrives, send ACK
- ACK arrives, send next packet, t = RTT + L / R

Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
- range of sequence numbers must be increased
- buffering at sender and/or receiver

Two generic forms of pipelined protocols: Go-Back-N, selective repeat

Pipelining: increased utilization

[Figure: sender/receiver timeline with three packets in flight.]
- first packet bit transmitted, t = 0
- last bit transmitted, t = L / R
- first packet bit arrives
- last packet bit arrives, send ACK
- last bit of 2nd packet arrives, send ACK
- last bit of 3rd packet arrives, send ACK
- ACK arrives, send next packet, t = RTT + L / R

Increase utilization by a factor of 3!

Pipelining Protocols

Go-Back-N: big picture
- Sender can have up to N unacked packets in pipeline
- Rcvr only sends cumulative ACKs
  - Doesn't ACK packet if there's a gap
- Sender has timer for oldest unacked packet
  - If timer expires, retransmit all unacked packets

Selective Repeat: big picture
- Sender can have up to N unacked packets in pipeline
- Rcvr ACKs individual packets
- Sender maintains timer for each unacked packet
  - When timer expires, retransmit only the unacked packet

Go-Back-N

Sender:
- k-bit seq # in pkt header
- "window" of up to N unacked pkts allowed
- ACK(n): ACKs all pkts up to, including seq # n ("cumulative ACK")
  - may receive duplicate ACKs (see receiver)
- timer for each in-flight pkt
- timeout(n): retransmit pkt n and all higher seq # pkts in window

[Figure: Go-Back-N sliding window with N = 12. Each pkt is in one of four states: ACKed, unACKed, could be sent, or unused. Sending a pkt raises the number of unACKed pkts, up to N. When an ACK arrives, the window slides and the next pkt can be sent. When no ACK arrives, a timeout occurs.]

GBN: sender extended FSM

  Start:
    base = 1
    nextseqnum = 1

  State "Wait":
    on rdt_send(data):
      if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
        udt_send(sndpkt[nextseqnum])
        startTimer(nextseqnum)
        nextseqnum++
      }
      else
        refuse_data(data)
    on rdt_rcv(rcvpkt) && !corrupt(rcvpkt):
      for i = base to getacknum(rcvpkt) {
        stop_timer(i)
      }
      base = getacknum(rcvpkt)+1
    on rdt_rcv(rcvpkt) && corrupt(rcvpkt):
      (no action)
    on timeout:
      udt_send(sndpkt[base])
      startTimer(base)
      udt_send(sndpkt[base+1])
      startTimer(base+1)
      (and so on for every pkt in the window, up to)
      udt_send(sndpkt[nextseqnum-1])
      startTimer(nextseqnum-1)

GBN: sender extended Activity Diagram

[Figure: activity diagram. Waiting for file; set N; set NextPktToSend=0; set LastACKed=-1. The remaining steps, which compare NextPktToSend with LastACKed, are not recoverable from the source.]

GBN: receiver

- no receiver buffering!
- re-ACK pkt with highest in-order seq #

  Start up:
    expectedSeqNum = 1

  State "Wait":
    on rdt_rcv(rcvpkt) && !corrupt(rcvpkt) && seqNum(rcvpkt)==expectedSeqNum:
      extract(rcvpkt, data)
      deliver_data(data)
      sndpkt = make_pkt(expectedSeqNum, ACK, chksum)
      udt_send(sndpkt)
      expectedSeqNum++
    on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || seqNum(rcvpkt)!=expectedSeqNum):
      sndpkt = make_pkt(expectedSeqNum-1, ACK, chksum)
      udt_send(sndpkt)
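
To see the sender and receiver rules interact, here is a small round-based simulation. It is a sketch under simplifying assumptions that are not in the slides: sequence numbers start at 0, ACKs are never lost, a data packet can be lost only on its first transmission, and the sender handles ACKs once per round rather than one at a time. The function name `go_back_n` and its parameters are hypothetical.

```python
from typing import Iterable, List, Tuple


def go_back_n(num_pkts: int, window: int,
              lose_first_tx: Iterable[int]) -> Tuple[List[int], List[int]]:
    """Return (sequence numbers in the order sent, sequence numbers delivered)."""
    base = 0              # oldest unACKed packet
    next_seq = 0          # next packet to send
    expected = 0          # receiver state: next in-order packet
    to_lose = set(lose_first_tx)
    sent: List[int] = []
    delivered: List[int] = []

    while base < num_pkts:
        # Sender: fill the window.
        in_flight = []
        while next_seq < base + window and next_seq < num_pkts:
            sent.append(next_seq)
            in_flight.append(next_seq)
            next_seq += 1

        # Channel and receiver: deliver in-order packets, discard the rest.
        last_ack = expected - 1
        for seq in in_flight:
            if seq in to_lose:
                to_lose.discard(seq)   # lost once, arrives next time
                continue
            if seq == expected:
                delivered.append(seq)
                expected += 1
            last_ack = expected - 1    # cumulative ACK (re-ACK on a gap)

        # Sender: slide the window on the cumulative ACK.
        if last_ack >= base:
            base = last_ack + 1
        # Timeout: anything still unACKed is sent again, starting at base.
        if base < next_seq:
            next_seq = base

    return sent, delivered
```

Calling `go_back_n(10, 10, {6})` sends packets 0 to 9, then resends 6, 7, 8 and 9 after the timeout, even though only packet 6 was lost.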

GBN in Action

Sender:
- Send pkt0, pkt1, pkt2, pkt3, pkt4, pkt5, pkt6, pkt7, pkt8, pkt9
- TO (timeout)
- Send pkt6, pkt7, pkt8, pkt9

Receiver:
- Rec 0, give to app, and send ACK=0
- Rec 1, give to app, and send ACK=1
- Rec 2, give to app, and send ACK=2
- Rec 3, give to app, and send ACK=3
- Rec 4, give to app, and send ACK=4
- Rec 5, give to app, and send ACK=5
- Rec 7, discard, and send ACK=5
- Rec 8, discard, and send ACK=5
- Rec 9, discard, and send ACK=5
- Rec 6, give to app, and send ACK=6
- Rec 7, give to app, and send ACK=7
- Rec 8, give to app, and send ACK=8
- Rec 9, give to app, and send ACK=9

Optimal size of N in GBN (or selective repeat)

[Figure: the sender transmits pkt0, pkt1, pkt2, ... back to back during one RTT.]

Q: How large should N be?
A: Large enough so that the transmitter is constantly transmitting.

How many pkts can be transmitted before the first ACK arrives?
== How many pkts can be transmitted in one RTT?

  N = RTT / (L/R)

This is only a first crack at the size of N:
- What if there are other data transfers sharing the link?
- What if the receiver has a slower link than the transmitter?
- What if some intermediate link is the slowest?
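
A quick way to get a feel for N = RTT / (L/R) is to evaluate it. The sketch below is illustrative only: the function name is hypothetical, and taking R as the slowest link rate on the path is an assumption made here as one possible first answer to the questions above, not something the slides state.

```python
from typing import Sequence


def window_to_fill_pipe(rtt_s: float, packet_bits: int,
                        link_rates_bps: Sequence[float]) -> float:
    """Packets that fit in one RTT: N = RTT / (L/R)."""
    # Assumption: the slowest link on the path limits the sending rate.
    bottleneck_bps = min(link_rates_bps)
    return rtt_s * bottleneck_bps / packet_bits


# 30 ms RTT, 8000-bit packets, a single 1 Gbps link.
n_fast = window_to_fill_pipe(0.030, 8000, [1e9])
# Same path with a 1 Mbps link in the middle.
n_slow = window_to_fill_pipe(0.030, 8000, [1e9, 1e6, 1e9])
```

With a single 1 Gbps link the window is about 3750 packets; with a 1 Mbps link on the path it drops to about 3.75, so a window sized for the fast link would badly overrun the slow one.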

[Figure: sender and receiver connected over links of 1 Gbps and 1 Mbps.]

Selective Repeat

- receiver individually acknowledges all correctly received pkts
  - buffers pkts, as needed, for eventual in-order delivery to upper layer
- sender only resends pkts for which ACK is not received
  - sender timer for each unACKed pkt
- sender window
  - N consecutive seq #s
  - again limits seq #s of sent, unACKed pkts

Selective repeat in action

[Figure: sequence of sliding-window diagrams with N = 6 at sender and receiver. Pkt states: ACKed, unACKed, could be sent, unused; at the receiver, delivered to app or ACKed + buffered. A timeout (TO) appears in the last sequence.]

Selective repeat

Sender:
- data from above: if next available seq # in window, send pkt
- timeout(n): resend pkt n, restart timer
- ACK(n) in [sendbase, sendbase+N-1]:
  - mark pkt n as received
  - if n smallest unACKed pkt, advance window base to next unACKed seq #

Receiver:
- pkt n in [rcvbase, rcvbase+N-1]:
  - send ACK(n)
  - out-of-order: buffer
  - in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
- pkt n in [rcvbase-N, rcvbase-1]:
  - ACK(n)
- otherwise: ignore

Summary of transport layer tools used so far

- ACK and NACK
- Sequence numbers (and no NACK)
- Timeout
- Sliding window
  - Optimal size = ?
- Cumulative ACK
  - Buffer at the receiver is optional
- Selective ACK
  - Requires buffering at the receiver

TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

- point-to-point: one sender, one receiver
- reliable, in-order byte stream
- pipelined and time-varying window size: TCP congestion and flow control set window size
- send & receive buffers
- full duplex data:
  - bi-directional data flow in same connection
  - MSS: maximum segment size
- connection-oriented: handshaking (exchange of control msgs) inits sender, receiver state before data exchange
- flow controlled: sender will not overwhelm receiver

TCP segment structure (each row is 32 bits wide):

  source port #                        | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F    | receive window
  checksum                             | urg data pointer
  options (variable length)
  application data (variable length)

- URG: urgent data (generally not used)
- ACK: ACK # valid
- PSH: push data now (generally not used)
- RST, SYN, FIN: connection estab (setup, teardown commands)
- checksum: Internet checksum (as in UDP)
- receive window: # bytes rcvr willing to accept
- sequence and acknowledgement numbers: counting by bytes of data (not segments!)

TCP seq. #s and ACKs

Seq. #s:
- byte stream "number" of first byte in segment's data
- It can be used as a pointer for placing the received data in the receiver buffer

ACKs:
- seq # of next byte expected from other side
- cumulative ACK
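
Because sequence numbers count bytes, the ACK for a segment is simply its sequence number plus its payload length. The sketch below illustrates only that bookkeeping; `segment_stream` is a hypothetical helper, and it ignores loss, reordering and the SYN/FIN flags, which also consume sequence numbers in real TCP.

```python
from typing import List, Sequence, Tuple


def segment_stream(data: bytes, first_seq: int,
                   sizes: Sequence[int]) -> List[Tuple[int, bytes, int]]:
    """Split data into segments; return (seq no, payload, ACK no expected back)."""
    segments = []
    seq = first_seq
    offset = 0
    for size in sizes:
        payload = data[offset:offset + size]
        # The receiver ACKs the next byte it expects: seq + payload length.
        segments.append((seq, payload, seq + len(payload)))
        seq += len(payload)
        offset += len(payload)
    return segments


segments = segment_stream(b"HELLO WORLD", 101, [3, 4, 4])
```

With the first byte numbered 101, the segment carrying "HEL" is ACKed with 104 and the segment carrying "LO W" with 108.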

Simple telnet scenario (Host A, Host B):

1. User types "C". Host A sends Seq=42, ACK=79, data = "C"
2. Host B ACKs receipt of "C", echoes back "C": Seq=79, ACK=43, data = "C"
3. Host A ACKs receipt of echoed "C": Seq=43, ACK=80

Seq no and ACKs

The data "HELLO WORLD" occupies byte numbers 101 to 111.

  Sender:    Seq no: 101, ACK no: 12,  Data: HEL,    Length: 3
  Receiver:  Seq no: 12,  ACK no: 104, Data: (none), Length: 0
  Sender:    Seq no: 104, ACK no: 12,  Data: LO W,   Length: 4
  Receiver:  Seq no: 12,  ACK no: 108, Data: (none), Length: 0

Seq no and ACKs - bidirectional

One side sends "HELLO WORLD" (byte numbers 101 to 111); the other side sends "GOODBUY" (byte numbers 12 to 18).

  Seq no: 101, ACK no: 12,  Data: HEL,  Length: 3
  Seq no: 12,  ACK no: 104, Data: GOOD, Length: 4
  Seq no: 104, ACK no: 16,  Data: LO W, Length: 4
  Seq no: 16,  ACK no: 108, Data: BU,   Length: 2

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value (RTO)?
- If RTO is too short: premature timeout
  - unnecessary retransmissions
- If RTO is too long: slow reaction to segment loss

Can RTT be used?
- No, RTT varies; there is no single RTT
- Why does RTT vary?
  - Because statistical multiplexing results in queuing
- How about using the average RTT?
  - The average is too small, since roughly half of the RTTs are larger than the average

Q: how to estimate RTT?
- SampleRTT: measured time from segment transmission until ACK receipt
  - ignore retransmissions
- SampleRTT will vary, want estimated RTT "smoother"
  - average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

  EstimatedRTT = (1 - α)*EstimatedRTT + α*SampleRTT

- Exponential weighted moving average
- influence of past sample decreases exponentially fast
- typical value: α = 0.125

[Figure: example RTT estimation.]

TCP Round Trip Time and Timeout

Setting the timeout (RTO)
- RTO = EstimatedRTT plus "safety margin"
  - large variation in EstimatedRTT -> larger safety margin
- first estimate how much SampleRTT deviates from EstimatedRTT:

  DevRTT = (1 - β)*DevRTT + β*|SampleRTT - EstimatedRTT|

  (typically, β = 0.25)

- then set timeout interval:

  RTO = EstimatedRTT + 4*DevRTT

TCP Round Trip Time and Timeout

- RTO = EstimatedRTT + 4*DevRTT might not always work, so:

  RTO = max(MinRTO, EstimatedRTT + 4*DevRTT)

- MinRTO = 250 ms for Linux, 500 ms for Windows, 1 sec for BSD
- So in most cases RTO = MinRTO
- Actually, when RTO > MinRTO, the performance is quite bad; there are many spurious timeouts.
- Note that RTO was computed in an ad hoc way. It is really a signal processing and queuing theory question.

RTO details

- When a pkt is sent, the timer is started, unless it is already running.
- When a new ACK is received, the timer is restarted
- Thus, the timer is for the oldest unACKed pkt

Q: if RTO = RTT + ε, are there many spurious timeouts?
A: Not necessarily
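
The EstimatedRTT, DevRTT and RTO formulas above can be combined into a small estimator. This is a sketch, not the code of any real TCP stack: the class name, the initial values (first sample as EstimatedRTT, zero DevRTT) and the choice to update DevRTT with the previous EstimatedRTT (the order RFC 6298 uses) are assumptions, since the slides do not specify them.

```python
class RttEstimator:
    """Hypothetical EWMA-based RTT estimator with a floor on the RTO."""

    def __init__(self, first_sample: float, alpha: float = 0.125,
                 beta: float = 0.25, min_rto: float = 0.25) -> None:
        self.alpha = alpha
        self.beta = beta
        self.min_rto = min_rto              # 250 ms, the Linux value quoted above
        self.estimated_rtt = first_sample   # assumed initialisation
        self.dev_rtt = 0.0                  # assumed initialisation

    def update(self, sample_rtt: float) -> None:
        # Deviation first, using the EstimatedRTT from before this sample.
        deviation = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    def rto(self) -> float:
        return max(self.min_rto, self.estimated_rtt + 4 * self.dev_rtt)


est = RttEstimator(first_sample=0.100)
est.update(0.200)   # one slow sample: 200 ms after a 100 ms start
```

After that single slow sample, EstimatedRTT + 4*DevRTT is still below 250 ms, so `est.rto()` returns MinRTO, in line with the remark that in most cases RTO = MinRTO.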

[Figure: timeline of successive RTO intervals. An ACK arrives, and so the RTO timer is restarted.]

This shifting of the RTO means that even if RTO