CSS432: End-to-End Protocols 1 CSS432 End-to-End Protocols Textbook Ch5.1 – 5.2 Professor: Munehiro Fukuda



IP Protocols

• Underlying best-effort network
  - drops messages
  - re-orders messages
  - delivers duplicate copies of a given message
  - limits messages to some finite size
  - delivers messages after an arbitrarily long delay

[Figure: messages M1 to M5 cross the network; some are lost, reordered, or duplicated along the way. Figure caption: "Request a retransmission of M3 and M5".]

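Simple Demultiplexer (UDP)

• Unreliable and unordered datagram service
• Adds multiplexing
• No flow control
• Endpoints identified by ports
  - servers have well-known ports
  - see /etc/services on Unix
• Header format

    0                               16                             31
    +-------------------------------+-------------------------------+
    |            SrcPort            |            DstPort            |
    +-------------------------------+-------------------------------+
    |            Length             |           Checksum            |
    +-------------------------------+-------------------------------+
    |                             Data                              |
    +-------------------------------+-------------------------------+

• Optional checksum
  - computed over pseudo header + UDP header + data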
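The checksum bullet above can be made concrete with a minimal sketch of the 16-bit one's-complement Internet checksum. The function name `internet_checksum` is hypothetical, and the sketch assumes the caller has already laid out the pseudo header, UDP header, and data contiguously in one buffer of at most 64 KB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit one's-complement checksum over `len` bytes (len <= 64 KB assumed,
 * so the 32-bit accumulator cannot overflow). For UDP, the caller would pass
 * pseudo header + UDP header + data, with the checksum field set to zero. */
uint16_t internet_checksum(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint32_t sum = 0;

    /* Add the buffer as big-endian 16-bit words. */
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    /* An odd trailing byte is padded with a zero byte on the right. */
    if (len == 1)
        sum += (uint32_t)data[0] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A receiver can run the same routine over the received bytes including the checksum field; a result of zero indicates that no error was detected.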

UDP: Simple Demultiplexer

Receiver (port 12345):

    // needs <sys/socket.h>, <netinet/in.h>, <netdb.h>, <strings.h>
    int sd;
    sd = socket( AF_INET, SOCK_DGRAM, 0 );

    struct sockaddr_in server;
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl( INADDR_ANY );
    server.sin_port = htons( 12345 );
    bind( sd, (struct sockaddr *)&server, sizeof( server ) );

    recv( sd, buf, sizeof( buf ), 0 );

Sender:

    struct sockaddr_in server;
    struct hostent *hp;
    server.sin_family = AF_INET;
    hp = gethostbyname( );    // host name argument not shown on the slide
    bcopy( hp->h_addr, &( server.sin_addr.s_addr ), hp->h_length );
    server.sin_port = htons( 12345 );

    sendto( sd, buf, sizeof( buf ), 0, (struct sockaddr *)&server, sizeof( server ) );

[Figure: a sender calls socket() and sendto(); two receivers each call socket(), bind(), and recv(), one on port 12345 and one on port 13579. UDP demultiplexes the arriving packets (M1 to M5) to a receiver by destination port.]

End-to-End Protocols

• Common end-to-end services
  - guarantee message delivery
  - deliver messages in FIFO order
  - deliver at most one copy of each message
  - support arbitrarily large messages
  - support synchronization
  - allow the receiver to flow control the sender
  - support multiple application processes on each host

[Figure: application processes P1 to P4 on a host share the network, which carries messages M1 to M5 and m1 to m5.]

TCP Overview

• Connection-oriented
• Byte-stream
  - app writes bytes
  - TCP sends segments
  - app reads bytes
• Full duplex
• Flow control: keep sender from overrunning receiver
• Congestion control: keep sender from overrunning network

[Figure: the sending application process writes bytes into the TCP send buffer; TCP transmits segments; the receiving application process reads bytes from the TCP receive buffer.]
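Because TCP delivers a byte stream rather than messages, a single read() may return fewer bytes than the sender passed to one write(). A minimal sketch of a loop that collects exactly n bytes follows; `read_fully` is a hypothetical helper name, not part of the sockets API, and it works on any file descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly `n` bytes from `fd` unless EOF or an error occurs first.
 * Returns the number of bytes read (less than n only at EOF), or -1 on error. */
ssize_t read_fully(int fd, void *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    size_t got = 0;

    while (got < n) {
        ssize_t r = read(fd, (char *)buf + got, n - got);
        if (r == 0)
            break;                 /* peer closed the connection */
        if (r < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        got += (size_t)r;
    }
    return (ssize_t)got;
}
```

An application that needs message boundaries on top of TCP typically sends a length first and then calls a helper like this for the body.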

Sockets (Code Example)

Server:

    int sd, newSd;
    sd = socket( AF_INET, SOCK_STREAM, 0 );

    sockaddr_in server;
    bzero( (char *)&server, sizeof( server ) );
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl( INADDR_ANY );
    server.sin_port = htons( 12345 );
    bind( sd, (sockaddr *)&server, sizeof( server ) );

    listen( sd, 5 );

    sockaddr_in client;
    socklen_t len = sizeof( client );
    while ( true ) {
        newSd = accept( sd, (sockaddr *)&client, &len );
        if ( fork( ) == 0 ) {            // child serves this connection
            close( sd );
            read( newSd, buf1, sizeof( buf1 ) );
            read( newSd, buf2, sizeof( buf2 ) );
            close( newSd );
            exit( 0 );
        }
        close( newSd );                  // parent keeps only the listening socket
    }

Client:

    int sd = socket( AF_INET, SOCK_STREAM, 0 );

    struct hostent *host = gethostbyname( arg[0] );
    sockaddr_in server;
    bzero( (char *)&server, sizeof( server ) );
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = inet_addr( inet_ntoa( *(struct in_addr*)*host->h_addr_list ) );
    server.sin_port = htons( 12345 );

    connect( sd, (sockaddr *)&server, sizeof( server ) );

    write( sd, buf1, sizeof( buf1 ) );
    write( sd, buf2, sizeof( buf2 ) );

[Figure: two clients each call socket(), connect(), and write(); the server calls socket(), bind(), listen(), and accept(), and each forked child calls read() twice to receive buf1 and buf2.]

Data Link Versus Transport

• Potentially connects many different hosts
  - need explicit connection establishment and termination
• Potentially different RTT (Round Trip Time)
  - need adaptive timeout mechanism
• Potentially long delay in network
  - need to be prepared for arrival of very old packets
  - need to set MSL (Maximum Segment Lifetime)
• Potentially different capacity at destination
  - need to accommodate different node capacity
• Potentially different network capacity
  - need to be prepared for network congestion

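Segment Format

    0       4           10          16                             31
    +-------------------------------+-------------------------------+
    |            SrcPort            |            DstPort            |
    +-------------------------------+-------------------------------+
    |                          SequenceNum                          |
    +-------------------------------+-------------------------------+
    |                        Acknowledgment                         |
    +-------+-----------+-----------+-------------------------------+
    |HdrLen |     0     |   Flags   |       AdvertisedWindow        |
    +-------+-----------+-----------+-------------------------------+
    |           Checksum            |            UrgPtr             |
    +-------------------------------+-------------------------------+
    |                      Options (variable)                       |
    +-------------------------------+-------------------------------+
    |                             Data                              |
    +-------------------------------+-------------------------------+

• Each connection identified with 4-tuple:
  - (SrcPort, SrcIPAddr, DstPort, DstIPAddr)
• Sliding window + flow control
  - Acknowledgment, SequenceNum, AdvertisedWindow
• Flags
  - SYN, FIN, RESET, PUSH, URG, ACK
  - SYN: establishing a connection
  - FIN: terminating a connection
  - RESET: confused and terminating
  - PUSH: Section 5.2.7
  - URG: sending urgent data
  - ACK: validating the Acknowledgment field
  - SYN and FIN each consume one sequence number; a pure ACK does not.

[Figure: the sender sends Data (SequenceNum); the receiver returns Acknowledgment + AdvertisedWindow.]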
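To make the segment layout concrete, here is a minimal sketch that pulls the fixed fields out of the first 20 bytes of a TCP header. The struct `tcp_fields`, the function `tcp_parse`, and the enum names are illustrative and do not come from any system header; the flag bit values follow the standard TCP header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits as they appear in the low 6 bits of header byte 13. */
enum tcp_flag {
    TCP_FIN = 0x01, TCP_SYN = 0x02, TCP_RST = 0x04,
    TCP_PSH = 0x08, TCP_ACK = 0x10, TCP_URG = 0x20
};

struct tcp_fields {
    uint16_t src_port, dst_port;
    uint32_t seq, ack;
    unsigned hdr_len_bytes;   /* HdrLen is stored in 32-bit words */
    unsigned flags;
    uint16_t adv_window;
};

/* Header fields are carried in network (big-endian) byte order. */
static uint16_t be16(const uint8_t *p) { return (uint16_t)((p[0] << 8) | p[1]); }
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is shorter than a minimal header. */
int tcp_parse(const uint8_t *seg, size_t len, struct tcp_fields *out)
{
    assert(seg != NULL && out != NULL);
    if (len < 20)
        return -1;
    out->src_port      = be16(seg);
    out->dst_port      = be16(seg + 2);
    out->seq           = be32(seg + 4);
    out->ack           = be32(seg + 8);
    out->hdr_len_bytes = (unsigned)(seg[12] >> 4) * 4;
    out->flags         = seg[13] & 0x3F;
    out->adv_window    = be16(seg + 14);
    return 0;
}
```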

Connection Establishment and Termination

[Figure: three-way handshake between the active participant (client) and the passive participant (server)
   client to server: SYN, SequenceNum = x
   server to client: SYN + ACK, SequenceNum = y, Acknowledgment = x + 1
   client to server: ACK, Acknowledgment = y + 1]

Three-Way Handshake

• Client
  - Initiate a connection to a server with seq=x
  - Set a timer and retransmit the request upon expiration
• Server
  - Acknowledge the client request with ack=x+1
  - Initiate a reverse connection with seq=y
  - Set a timer and retransmit the request upon expiration
• Client
  - Acknowledge the server request with ack=y+1
• x and y are chosen at random
  - This protects against two incarnations of the same connection using the same sequence number.

State Transition Diagram

[Figure: TCP state transition diagram. States: CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT, LAST_ACK. Edges are labeled event/action: Passive open, Active open/SYN, Send/SYN, SYN/SYN + ACK, SYN + ACK/ACK, ACK, Close, Close/FIN, FIN/ACK, ACK + FIN/ACK, and Timeout after two segment lifetimes (2 * MSL).]

• Active open
  - A client calls connect( )
• Passive open
  - A server calls listen( )
  - Can change to an active open
• Close
  - Active close
    A client or a server issues the first close( ); both sides can be active
  - Passive close
    A client or server calls close( ) in response to the first close( )
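The diagram can also be read as a lookup table from (state, event) to next state. The sketch below encodes the standard TCP transitions that way; the event names are invented for illustration, the actions (which segment to send) are omitted, and an event that is not valid in the current state simply leaves the state unchanged, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>

enum tcp_state {
    CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED,
    FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT, LAST_ACK
};

/* Hypothetical event names: application calls and arriving segments. */
enum tcp_event {
    EV_PASSIVE_OPEN, EV_ACTIVE_OPEN, EV_SEND, EV_CLOSE,
    EV_RCV_SYN, EV_RCV_SYN_ACK, EV_RCV_ACK, EV_RCV_FIN, EV_RCV_FIN_ACK,
    EV_TIMEOUT_2MSL
};

struct transition { enum tcp_state from; enum tcp_event event; enum tcp_state to; };

static const struct transition transitions[] = {
    { CLOSED,      EV_PASSIVE_OPEN, LISTEN      },
    { CLOSED,      EV_ACTIVE_OPEN,  SYN_SENT    },
    { LISTEN,      EV_RCV_SYN,      SYN_RCVD    },
    { LISTEN,      EV_SEND,         SYN_SENT    },
    { LISTEN,      EV_CLOSE,        CLOSED      },
    { SYN_SENT,    EV_RCV_SYN_ACK,  ESTABLISHED },
    { SYN_SENT,    EV_RCV_SYN,      SYN_RCVD    },
    { SYN_SENT,    EV_CLOSE,        CLOSED      },
    { SYN_RCVD,    EV_RCV_ACK,      ESTABLISHED },
    { SYN_RCVD,    EV_CLOSE,        FIN_WAIT_1  },
    { ESTABLISHED, EV_CLOSE,        FIN_WAIT_1  },  /* active close  */
    { ESTABLISHED, EV_RCV_FIN,      CLOSE_WAIT  },  /* passive close */
    { FIN_WAIT_1,  EV_RCV_ACK,      FIN_WAIT_2  },
    { FIN_WAIT_1,  EV_RCV_FIN,      CLOSING     },
    { FIN_WAIT_1,  EV_RCV_FIN_ACK,  TIME_WAIT   },
    { FIN_WAIT_2,  EV_RCV_FIN,      TIME_WAIT   },
    { CLOSING,     EV_RCV_ACK,      TIME_WAIT   },
    { TIME_WAIT,   EV_TIMEOUT_2MSL, CLOSED      },
    { CLOSE_WAIT,  EV_CLOSE,        LAST_ACK    },
    { LAST_ACK,    EV_RCV_ACK,      CLOSED      },
};

/* Look up the next state; unknown (state, event) pairs keep the state. */
enum tcp_state tcp_next_state(enum tcp_state s, enum tcp_event e)
{
    assert(s <= LAST_ACK);
    for (size_t i = 0; i < sizeof transitions / sizeof transitions[0]; i++)
        if (transitions[i].from == s && transitions[i].event == e)
            return transitions[i].to;
    return s;
}
```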

State Transition Diagram

• Under what condition does the state change directly from FIN_WAIT_1 to TIME_WAIT?
• What is the purpose of the TIME_WAIT state?

Timing Chart

Phase          Client                     Segment                     Server
                                                                      LISTEN ( listen( ) )
Establishment  ( connect( ) ) SYN_SENT    SYN seq=x              -->  SYN_RCVD
               ESTABLISHED           <--  SYN seq=y, ACK=x+1
                                          ACK=y+1                -->  ESTABLISHED
Data transfer  ( write( ) )               seq=x+1, ACK=y+1       -->  ( read( ) )
                                     <--  ACK x+2
Termination    ( close( ) ) FIN_WAIT_1    FIN seq=x+2, ACK=y+1   -->  CLOSE_WAIT
               FIN_WAIT_2            <--  ACK x+3
               TIME_WAIT             <--  FIN seq=y+1                 LAST_ACK ( close( ) )
                                          ACK=y+2                -->

Peek at such a flow with tcpdump in assignment 3.

Sliding Window Revisited

• Sending side
  - LastByteAcked ≤ LastByteSent
  - LastByteSent ≤ LastByteWritten
  - buffer bytes between [LastByteAcked, LastByteWritten]
• Receiving side
  - LastByteRead < NextByteExpected
  - NextByteExpected ≤ LastByteRcvd + 1
  - buffer bytes between [NextByteRead, LastByteRcvd], where NextByteRead = LastByteRead + 1

[Figure: the sending application writes into the TCP send buffer, marked by LastByteAcked, LastByteSent, and LastByteWritten; the receiving application reads from the TCP receive buffer, marked by LastByteRead, NextByteExpected, and LastByteRcvd.]

Flow Control

[Figure: TCP send buffer (LastByteAcked, LastByteSent, LastByteWritten) and TCP receive buffer (LastByteRead, NextByteExpected, LastByteRcvd); the sending application tries to write y more bytes.]

• Receiving side
  - LastByteRcvd – LastByteRead ≤ MaxRcvBuffer
  - AdvertisedWindow = MaxRcvBuffer – ((NextByteExpected – 1) – LastByteRead)
• Sending side
  - LastByteSent – LastByteAcked ≤ AdvertisedWindow
  - EffectiveWindow = AdvertisedWindow – (LastByteSent – LastByteAcked)
  - LastByteWritten – LastByteAcked ≤ MaxSendBuffer
  - block sender if (LastByteWritten – LastByteAcked) + y > MaxSendBuffer, where y is the number of bytes the application tries to write
• Send an ACK with an advertised window in response to arriving data segments
  - as long as all the preceding bytes have also arrived, and
  - until the advertised window reaches 0 (an ACK is still returned the first time it reaches 0).
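The window arithmetic on this slide can be sketched directly in C. The struct and function names below are made up for illustration; the sketch assumes byte counters that do not wrap and counts the buffered bytes on the receiving side as (NextByteExpected – 1) – LastByteRead, that is, data that arrived in order but has not yet been read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct recv_side {
    uint32_t last_byte_read, next_byte_expected, max_rcv_buffer;
};

struct send_side {
    uint32_t last_byte_acked, last_byte_sent, last_byte_written, max_send_buffer;
};

/* Free space the receiver advertises back to the sender. */
uint32_t advertised_window(const struct recv_side *r)
{
    uint32_t buffered = (r->next_byte_expected - 1) - r->last_byte_read;
    assert(buffered <= r->max_rcv_buffer);
    return r->max_rcv_buffer - buffered;
}

/* How much more the sender may transmit right now. */
uint32_t effective_window(const struct send_side *s, uint32_t adv_window)
{
    uint32_t in_flight = s->last_byte_sent - s->last_byte_acked;
    return in_flight >= adv_window ? 0 : adv_window - in_flight;
}

/* True if writing y more bytes would overflow the send buffer,
 * in which case TCP blocks the sending application. */
bool write_would_block(const struct send_side *s, uint32_t y)
{
    return (s->last_byte_written - s->last_byte_acked) + y > s->max_send_buffer;
}
```

When `effective_window` returns 0 the sender has used up the receiver's advertisement and must wait for an ACK that opens the window again.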

Flow Control with a Slower Receiver

[Figure: same send and receive buffers as on the previous slide, annotated "Read slow." on the receiving application; the window quantities are annotated with the value 0.]

• Receiving side (the application reads slowly)
  - LastByteRcvd – LastByteRead ≤ MaxRcvBuffer
  - AdvertisedWindow = MaxRcvBuffer – ((NextByteExpected – 1) – LastByteRead), which shrinks to 0
• Sending side
  - LastByteSent – LastByteAcked ≤ AdvertisedWindow
  - EffectiveWindow = AdvertisedWindow – (LastByteSent – LastByteAcked), which becomes 0
  - No more sends and no more ACKs, so the window stays at the same value
  - LastByteWritten – LastByteAcked ≤ MaxSendBuffer
  - block sender since (LastByteWritten – LastByteAcked) + y > MaxSendBuffer

Flow Control

• The sender won't send any more data.
• The receiver won't send a new advertised window on its own initiative.
• Then, how can the sender find out when the receiver can receive more data?

Protection Against Wrap Around

• 32-bit SequenceNum
• MSL (Maximum Segment Lifetime) = 120 sec
• SequenceNum should not wrap around within 120 seconds.

Bandwidth             Time Until Wrap Around
T1 (1.5 Mbps)         6.4 hours
Ethernet (10 Mbps)    57 minutes
T3 (45 Mbps)          13 minutes
FDDI (100 Mbps)       6 minutes
STS-3 (155 Mbps)      4 minutes
STS-12 (622 Mbps)     55 seconds
STS-24 (1.2 Gbps)     28 seconds
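The table entries follow from dividing the 32-bit sequence space (2^32 bytes, one sequence number per byte) by the link bandwidth. A minimal sketch of that arithmetic, with hypothetical function names and integer seconds rounded down:

```c
#include <assert.h>
#include <stdint.h>

#define MSL_SECONDS 120u   /* Maximum Segment Lifetime used on the slide */

/* Seconds until a sender running at full link speed reuses a sequence number. */
uint64_t seconds_until_wrap(uint64_t bits_per_second)
{
    assert(bits_per_second > 0);
    const uint64_t sequence_space_bits = (UINT64_C(1) << 32) * 8;
    return sequence_space_bits / bits_per_second;
}

/* Nonzero if the sequence number can wrap while old segments may still be alive. */
int wraps_within_msl(uint64_t bits_per_second)
{
    return seconds_until_wrap(bits_per_second) < MSL_SECONDS;
}
```

For example, `seconds_until_wrap(10000000)` gives 3435 seconds, about 57 minutes, matching the Ethernet row; the STS-12 and STS-24 rows fall below the 120-second MSL.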

Keeping the Pipe Full

• 16-bit AdvertisedWindow = 64KB
• Assuming that RTT = 100 msec, the advertised window must be at least RTT x network bandwidth to keep the pipe full

Bandwidth             RTT (100 msec) x Bandwidth Product
T1 (1.5 Mbps)         18KB
Ethernet (10 Mbps)    122KB
T3 (45 Mbps)          549KB
FDDI (100 Mbps)       1.2MB
STS-3 (155 Mbps)      1.8MB
STS-12 (622 Mbps)     7.4MB
STS-24 (1.2 Gbps)     14.8MB
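A short sketch of the delay x bandwidth arithmetic behind the table, compared against the largest value a 16-bit AdvertisedWindow can carry (65535 bytes). The function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes in flight needed to keep the pipe full:
 * bits/s * ms / (8 bits per byte * 1000 ms per s). */
uint64_t bdp_bytes(uint64_t bits_per_second, uint64_t rtt_ms)
{
    assert(rtt_ms > 0);
    return bits_per_second * rtt_ms / 8000;
}

/* Nonzero if an unscaled 16-bit advertised window can cover that many bytes. */
int window_fills_pipe(uint64_t bdp)
{
    return bdp <= 65535;
}
```

With RTT = 100 msec, a T1 link needs 18750 bytes (about 18KB), which fits, while 10 Mbps Ethernet already needs 125000 bytes (about 122KB), which does not.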

Segment Transmission

A segment is transmitted:
1. When the data to send reaches the maximum segment size (MSS), which is normally the maximum transmission unit (MTU) minus the IP and TCP headers
2. When TCP receives a push operation that flushes the unsent data (peek with tcpdump in programming assignment 3)
3. When a timer expires
   - A trigger by a timer may cause the silly window syndrome.

Silly Window Syndrome

• If a sender aggressively takes advantage of any available window,
  - the receiver empties every window regardless of its size, and thus small windows never disappear.
• The problem occurs only when either the sender transmits a small segment or the receiver opens the window by a small amount.
• The receiver can delay ACKs to open a larger window
  - How long does it wait?
• The sender should make the decision
  - Nagle's Algorithm (programming assignment 3)

[Figure: sender and receiver exchange full-size segments (MSS 1, MSS 2) and advertised windows; once a small advertised window appears, it keeps circulating.]

Nagle's Algorithm

    When the application produces data to send {
        if both the available data and the window ≥ MSS
            send a full segment                     // no problem, send it right now
        else if there is unACKed data in transit    // an ACK will return soon
            buffer the new data until an ACK arrives
        else                                        // the receiver won't reply; can't avoid the silly window syndrome
            send all the new data now
    }

• An ACK works as a timer that fires a new segment transmission.
• The algorithm may not respond well to interactive or real-time applications
  - TCP_NODELAY option: transmit data as soon as possible
    setsockopt( sockfd, IPPROTO_TCP, TCP_NODELAY, &intFlag, sizeof( intFlag ) )    (SOL_TCP on Linux)
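The pseudocode above maps onto a small decision function. This is only a sketch of the decision step: the enum and parameter names are invented here, and a real TCP would also handle the buffering and the transmission itself.

```c
#include <assert.h>
#include <stddef.h>

enum nagle_action { SEND_FULL_SEGMENT, BUFFER_UNTIL_ACK, SEND_ALL_NOW };

/* available: bytes the application has queued for sending
 * window:    bytes the receiver currently allows
 * mss:       maximum segment size
 * unacked:   bytes already sent but not yet acknowledged */
enum nagle_action nagle_decide(size_t available, size_t window,
                               size_t mss, size_t unacked)
{
    assert(mss > 0);
    if (available >= mss && window >= mss)
        return SEND_FULL_SEGMENT;   /* a full segment fits: send it right now */
    if (unacked > 0)
        return BUFFER_UNTIL_ACK;    /* the returning ACK will trigger the send */
    return SEND_ALL_NOW;            /* nothing in flight: no ACK is coming */
}
```

Setting TCP_NODELAY corresponds to skipping the middle branch, so small writes go out immediately.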

How to Estimate RTT

• A timeout is necessary
  - to retransmit a lost packet
  - to detect congestion and invoke AIMD
• RTT estimation algorithms
  - Original Algorithm
  - Karn/Partridge Algorithm
  - Jacobson/Karels Algorithm

Original Algorithm

• Measure SampleRTT for each segment/ACK pair
• Compute weighted average of RTT
  - $\mathrm{EstRTT} = \alpha \times \mathrm{EstRTT} + \beta \times \mathrm{SampleRTT}$, where $\alpha + \beta = 1$
  - $\alpha$ between 0.8 and 0.9
  - $\beta$ between 0.1 and 0.2
• Set timeout based on EstRTT
  - $\mathrm{TimeOut} = 2 \times \mathrm{EstRTT}$
  - Why double? EstRTT cannot respond quickly to a SampleRTT that deviates from the average.
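A minimal sketch of the original algorithm as a running weighted average. The struct and function names are hypothetical; the weight for the old estimate is stored as `alpha`, and the weight for the new sample is taken as 1 - alpha.

```c
#include <assert.h>

struct rtt_estimator {
    double est_rtt;   /* current EstRTT, e.g. in milliseconds */
    double alpha;     /* weight of the old estimate, 0.8 to 0.9 on the slide */
};

/* Fold one SampleRTT into the estimate. */
void rtt_update(struct rtt_estimator *e, double sample_rtt)
{
    assert(e->alpha > 0.0 && e->alpha < 1.0);
    e->est_rtt = e->alpha * e->est_rtt + (1.0 - e->alpha) * sample_rtt;
}

/* TimeOut = 2 x EstRTT */
double rtt_timeout(const struct rtt_estimator *e)
{
    return 2.0 * e->est_rtt;
}
```

With alpha = 0.875, an estimate of 100 ms and a sample of 200 ms give a new estimate of 112.5 ms, so a single outlier moves the estimate only slightly.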

Karn/Partridge Algorithm

• Do not sample RTT when retransmitting
  - Can't figure out which transmission the latest ACK corresponds to.
• Double timeout after each retransmission
  - Congestion causes this retransmission.
  - Back off and retransmit segments less aggressively.

[Figure: two sender/receiver timelines, each showing an original transmission, a retransmission, and an ACK. In one, SampleRTT is measured from the original transmission; in the other, from the retransmission. The sender cannot tell which case applies.]
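The two rules can be sketched as a pair of event handlers. The names are hypothetical, and the estimate update reuses the original weighted average with an assumed weight of 0.875 for the old estimate.

```c
#include <assert.h>
#include <stdbool.h>

struct karn_timer {
    double est_rtt;   /* current EstRTT */
    double timeout;   /* current retransmission timeout */
};

/* Retransmission timer expired: back off exponentially. */
void karn_on_timeout(struct karn_timer *t)
{
    t->timeout *= 2.0;
}

/* An ACK arrived. was_retransmitted says whether the acknowledged
 * segment was sent more than once. */
void karn_on_ack(struct karn_timer *t, double sample_rtt, bool was_retransmitted)
{
    assert(sample_rtt >= 0.0);
    if (was_retransmitted)
        return;   /* ambiguous sample: keep the estimate and the backed-off timeout */
    t->est_rtt = 0.875 * t->est_rtt + 0.125 * sample_rtt;   /* assumed weights */
    t->timeout = 2.0 * t->est_rtt;
}
```

The backed-off timeout therefore stays in effect until an ACK arrives for a segment that was transmitted only once.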

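Jacobson/Karels Algorithm

• Original Algorithm
  - $\mathrm{EstRTT} = \alpha \times \mathrm{EstRTT} + \beta \times \mathrm{SampleRTT}$, with $\alpha$ between 0.8 and 0.9 and $\beta$ between 0.1 and 0.2
  - $\mathrm{TimeOut} = 2 \times \mathrm{EstRTT}$
• New algorithm that takes into consideration whether the variation among SampleRTTs is large
  - $\mathrm{Diff} = \mathrm{SampleRTT} - \mathrm{EstRTT}$
  - $\mathrm{EstRTT} = \mathrm{EstRTT} + \delta \times \mathrm{Diff}$, with $\delta = 0.125$
  - $\mathrm{Dev} = \mathrm{Dev} + \delta \times (|\mathrm{Diff}| - \mathrm{Dev})$
  - $\mathrm{TimeOut} = \mu \times \mathrm{EstRTT} + \phi \times \mathrm{Dev}$, with $\mu = 1$ and $\phi = 4$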
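The formulas above translate almost line for line into a floating-point sketch. The struct and function names are illustrative; the constants are the ones given on the slide (0.125 for the gain, 1 and 4 for the timeout weights).

```c
#include <assert.h>

struct jk_estimator {
    double est_rtt;   /* EstRTT */
    double dev;       /* Dev, the smoothed mean deviation */
};

/* Fold one SampleRTT into the estimator and return the new TimeOut. */
double jk_update(struct jk_estimator *e, double sample_rtt)
{
    const double delta = 0.125, mu = 1.0, phi = 4.0;
    assert(sample_rtt >= 0.0);

    double diff = sample_rtt - e->est_rtt;           /* Diff uses the old EstRTT */
    double abs_diff = diff < 0.0 ? -diff : diff;

    e->est_rtt += delta * diff;
    e->dev += delta * (abs_diff - e->dev);
    return mu * e->est_rtt + phi * e->dev;
}
```

Starting from EstRTT = 100 and Dev = 10, a sample of 140 gives EstRTT = 105, Dev = 13.75, and TimeOut = 160: the timeout widens when samples vary and stays close to EstRTT when they do not.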

Jacobson/Karels Algorithm

EstimatedRTT and Deviation are stored scaled by 8: EstimatedRTT = 8 EstRTT and Deviation = 8 Dev.

    SampleRTT -= (EstimatedRTT >> 3);   // SampleRTT – 8 EstRTT / 8 = SampleRTT – EstRTT = diff
    EstimatedRTT += SampleRTT;          // 8 EstRTT = 8 EstRTT + (SampleRTT – EstRTT),
                                        // i.e. EstRTT = EstRTT + 1/8 (SampleRTT – EstRTT)
    if (SampleRTT < 0)
        SampleRTT = -SampleRTT;         // |diff|
    SampleRTT -= (Deviation >> 3);      // |diff| – 8 Dev / 8 = |SampleRTT – EstRTT| – Dev
    Deviation += SampleRTT;             // 8 Dev = 8 Dev + |SampleRTT – EstRTT| – Dev,
                                        // i.e. Dev = Dev + 1/8 (|SampleRTT – EstRTT| – Dev)
    TimeOut = (EstimatedRTT >> 3) + (Deviation >> 1);
                                        // TimeOut = 8 EstRTT / 8 + 8 Dev / 2 = EstRTT + 4 Dev

Notes

• The algorithm does not work accurately with a coarse clock granularity (500 ms on Unix)
• An accurate timeout mechanism is important to congestion control

TCP Extensions

• RTT measurement
  - Store a 32-bit timestamp in the header option of outgoing segments
  - Receive an ACK that echoes the original timestamp
  - Sample the RTT by subtracting the timestamp from the current time
• Resolving the quick wrap-around of the sequence number
  - The 32-bit timestamp and the 32-bit sequence number together give a 64-bit sequence space
• Extending the advertised window
  - Scale up the advertised window
  - How many bytes to send → how many 16-byte units to send, for example

Reviews

• UDP
• TCP: three-way handshake and state transition
• Sliding window and flow control
• Segment transmission: silly window syndrome and Nagle's algorithm
• Adaptive retransmission: original, Karn/Partridge, and Jacobson/Karels
• Exercises in Chapter 5
  - Ex. 5, 14, 22, and 39 (TCP state transition)
  - Ex. 9(a) (Sliding window)
  - Ex. 20 (Nagle's algorithm)