4/19/05 CS1181

What we covered last week

How to calculate the delay in packet delivery
• Queueing delay and congestion losses
• Transmission delay
• Propagation delay
• Round Trip Time
• Bandwidth-delay product
• Delay in delivering multiple packets across multiple hops

A few application protocols
• HTTP
  • HTTP 1.0
  • HTTP 1.1: pipelining, without pipelining
• FTP
• Email and SMTP

4/19/05 CS1182

Why we need a naming database for the Internet

People: many identifiers: SSN, name, passport #

Internet hosts, routers:
• IP address (32 bit) - used for addressing datagrams
• “name”, e.g., www.yahoo.com - used by humans

Domain Name System
• Hostname to IP address translation, and vice versa
• Has also been used for other purposes, e.g., load distribution among replicated Web servers: one website name maps to a set of IP addresses

4/19/05 CS1183

DNS: Domain Name System

DNS consists of
• A hierarchical name space
• A distributed database implemented in a hierarchy of many name servers
• An application-layer protocol used by hosts, routers, and name servers to communicate to resolve names (address/name translation)
A core Internet function, implemented as an application-layer protocol

Why not a centralized DNS?
• single point of failure
• traffic volume
• distant centralized database
• maintenance
• The most important factor: the need for distributed management

4/19/05 CS1184

A hierarchical name space

each non-leaf node in the tree is a domain
• Each domain belongs to an administrative authority
• any domain can assign sub-domains below it
• no limit on the depth along any branch

DNS name hierarchy is completely independent from the Internet's topological structure

[Figure: DNS name tree. The root is at the top; the next level holds the TLDs (top-level domains) edu, com, gov, org, us, uk, fr; below them are mit, ucla, xerox, dec, nasa, nsf, acm, ieee; then cs, seas, cad; then Foo, Bar.]

4/19/05 CS1185

DNS: Implemented as a distributed database

[Figure: hierarchy of DNS servers. Root DNS servers at the top; below them the .com, .org, and .edu DNS servers; below those the yahoo.com, amazon.com, pbs.org, ucla.edu, and umass.edu DNS servers. Under ucla.edu the figure marks art.ucla.edu, cs.ucla.edu, and netsec.cs.ucla.edu.]

The entire DNS name space is divided into a hierarchy of zones
• a zone: a contiguous sub-space in the DNS name tree
• a zone may contain domains at different levels

4/19/05 CS1186

What makes a zone
• each zone is controlled by its own administrator, served by its own name server(s)
• One master server keeps a master zone file, distributes it to multiple secondary servers
• Both are called authoritative servers for the zone

Each server must be able to
• Resolve all the names in its own zone
• Know where to direct queries for names belonging to its sub-zones

[Figure: zones cs.ucla.edu and netsec.cs.ucla.edu]

4/19/05 CS1187

What's in the zone's master file:

• data that defines the top node of the zone, including a list of all the servers for the zone
• authoritative data for all nodes in the zone: all of the nodes from the top node to leaf nodes (that are outside of any sub-zone)
• data that describes delegated sub-zones: domain name, owner, etc.
• “glue data”: IP address(es) for each sub-zone's name server(s)

[Figure: zones cs.ucla.edu and netsec.cs.ucla.edu]

4/19/05 CS1188

How to resolve a DNS name?

EX: your browser needs the IP address for www.amazon.com:
1. Your host sends a query to a local DNS server
2. The local server either finds the answer in its cache, or otherwise sends a query to a root server
3. The root server replies with pointers to .com DNS servers
4. The local server queries a .com DNS server, which replies with a pointer to the amazon.com DNS server
5. The local server queries the amazon.com DNS server to get the IP address for www.amazon.com, and sends the answer back to your host
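The resolution steps above can be sketched as a toy simulation. This is only an illustration under stated assumptions: the server labels (`root`, `com-tld`, `amazon-auth`), the tables, and the address `192.0.2.10` are made up for the example and are not real DNS data or a real resolver API.

```python
# Toy model of how a local DNS server resolves a name by following referrals.
# Server labels and the IP address are hypothetical placeholders.
REFERRALS = {
    "root": {"com": "com-tld"},
    "com-tld": {"amazon.com": "amazon-auth"},
}
ANSWERS = {
    "amazon-auth": {"www.amazon.com": "192.0.2.10"},
}

def query(server, name):
    """Ask one server: returns ("answer", ip) or ("referral", next_server)."""
    if name in ANSWERS.get(server, {}):
        return "answer", ANSWERS[server][name]
    for suffix, next_server in REFERRALS.get(server, {}).items():
        if name == suffix or name.endswith("." + suffix):
            return "referral", next_server
    raise LookupError(f"{server} cannot help with {name}")

def resolve(name, cache):
    """Local server behavior: use the cache, else walk down from the root."""
    if name in cache:
        return cache[name], []
    server, path = "root", []
    while True:
        path.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            cache[name] = value
            return value, path
        server = value  # follow the referral to the next server

cache = {}
ip, path = resolve("www.amazon.com", cache)
print(ip, path)
ip2, path2 = resolve("www.amazon.com", cache)  # second lookup hits the cache
print(ip2, path2)
```

The second call contacts no server at all, which is the effect of the local server's cache.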

4/19/05 CS1189

Local Name Server

Each ISP (residential ISP, company, university) has one.
Also called “default name server” or “local cache server”

Every host knows the IP address(es) of its local DNS server(s)

When a host makes a DNS query, query is sent to its local DNS server

4/19/05 CS11810

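Example

A host at cs.ucla.edu wants the IP address for gaia.umass.edu

[Figure: requesting host lixia.cs.ucla.edu queries its local DNS server Toucan.CS.UCLA.EDU (step 1); the local server exchanges query and reply with the root DNS server (2, 3), the .edu DNS server (4, 5), and the authoritative DNS server dns.umass.edu (6, 7), then answers the host (8).]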

4/19/05 CS11811

Recursive queries

recursive query:
• puts burden of name resolution on contacted name server
• heavy load?

iterated query:
• contacted server replies with name of server to contact
• “I don’t know this name, but ask this server”

[Figure: recursive resolution of gaia.umass.edu with steps 1 to 8 among requesting host lixia.cs.ucla.edu, local DNS server Toucan.CS.UCLA.EDU, the root DNS server, the .edu DNS server, and authoritative DNS server dns.umass.edu.]

4/19/05 CS11812

DNS: caching and replication

Virtually all Internet applications invoke DNS lookups

Redundant servers for each zone
• “13” root servers

Once a name server learns a DNS name to IP address mapping, it caches the mapping
• cache entries time out (are deleted) after some time (specified in the DNS query reply)
• TLD servers typically cached in local name servers
  • Thus root name servers not often visited

4/19/05 CS11813

DNS records

DNS: all DNS servers store Resource Records (RR)

RR format: (name, value, type, ttl)

Type=A
• name is hostname
• value is IP address

Type=NS
• name is a domain (e.g. foo.com)
• value is hostname of authoritative name server for this domain

Type=CNAME
• name is an alias name for some “canonical” (the real) name; www.ibm.com is really servereast.backup2.ibm.com
• value is canonical name

Type=MX
• value is name of mail server associated with name
• e.g. name = cs.ucla.edu, value = mailman.cs.ucla.edu, type = MX, ttl = 172800
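As a rough illustration of the (name, value, type, ttl) format, the sketch below stores a few records in a list and follows a CNAME to an A record. The MX record is the one from the slide; the other TTLs and the address `192.0.2.7` are assumptions invented for the example, and `lookup` / `resolve_a` are hypothetical helper names.

```python
from collections import namedtuple

# One resource record, in the slide's field order.
RR = namedtuple("RR", "name value type ttl")

RECORDS = [
    RR("cs.ucla.edu", "mailman.cs.ucla.edu", "MX", 172800),          # from the slide
    RR("www.ibm.com", "servereast.backup2.ibm.com", "CNAME", 3600),  # ttl assumed
    RR("servereast.backup2.ibm.com", "192.0.2.7", "A", 3600),        # address assumed
]

def lookup(name, rtype):
    """Return the values of all records matching (name, type)."""
    return [rr.value for rr in RECORDS if rr.name == name and rr.type == rtype]

def resolve_a(name, max_hops=8):
    """Find A records for name, following CNAME aliases up to max_hops times."""
    for _ in range(max_hops):
        addrs = lookup(name, "A")
        if addrs:
            return addrs
        aliases = lookup(name, "CNAME")
        if not aliases:
            return []
        name = aliases[0]  # continue with the canonical name
    return []

print(lookup("cs.ucla.edu", "MX"))
print(resolve_a("www.ibm.com"))
```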

4/19/05 CS11814

DNS protocol, messages

DNS protocol: query and reply messages, with same message format

msg header
• identification: 16 bit # for query, reply to query uses same #
• flags:
  • query or reply
  • recursion desired
  • recursion available
  • reply is authoritative

4/19/05 CS11815

DNS protocol, messages

[Figure: DNS message format, with these annotations on the message sections:]
• Name, type fields for a query
• RRs in response to query
• records for authoritative servers
• additional “helpful” info that may be used

4/19/05 CS11816

Inserting records into DNS

Example: just created startup “Network Utopia”
• Register name networkutopia.com at a registrar (e.g., Network Solutions)
  • Need to provide registrar with names and IP addresses of your authoritative name servers (primary and secondary)
  • Registrar inserts two RRs into the com TLD server:
    (networkutopia.com, dns1.networkutopia.com, NS)
    (dns1.networkutopia.com, 212.212.212.1, A)
• Put in authoritative server a Type A record for www.networkutopia.com and a Type MX record for networkutopia.com
• How do people get the IP address of Web site www.networkutopia.com?

4/19/05 CS11817

How to use DNS in practice?

Two popular programs you can try on a Unix system:
• “host” – look up host names using domain servers
  Command: host [-l] [-v] [-w] [-r] [-d] [-t query type] host [server]
  Manual page: man host
• “nslookup” – query Internet name servers interactively
  Command: nslookup [-options…] [host-to-find | –[server] ]
  Manual page: man nslookup

> nslookup cs.ucla.edu
Server: Toucan.CS.UCLA.EDU
Address: 131.179.96.16

Name: cs.ucla.edu
Address: 131.179.128.22

> nslookup -q=MX cs.ucla.edu
Server: Toucan.CS.UCLA.EDU
Address: 131.179.96.16

cs.ucla.edu pref. = 3, mail exchanger = Mailman.cs.ucla.edu
cs.ucla.edu pref. = 3, mail exchanger = Toucan.cs.ucla.edu
cs.ucla.edu nameserver = NS0.cs.ucla.edu
cs.ucla.edu nameserver = NS1.cs.ucla.edu
cs.ucla.edu nameserver = NS2.cs.ucla.edu
cs.ucla.edu nameserver = NS3.cs.ucla.edu
Mailman.cs.ucla.edu internet address = 131.179.128.30
Toucan.cs.ucla.edu internet address = 131.179.128.16
NS0.cs.ucla.edu internet address = 131.179.128.30
NS1.cs.ucla.edu internet address = 131.179.128.16
NS2.cs.ucla.edu internet address = 131.179.128.17
NS3.cs.ucla.edu internet address = 131.179.128.18

4/19/05 CS11819

Chapter 2: Application layer

2.1 Principles of network applications
• app architectures
• app requirements
2.2 Web and HTTP
2.4 Electronic Mail
• SMTP, POP3, IMAP
2.5 DNS
2.6 P2P file sharing
2.7 Socket programming with TCP
2.8 Socket programming with UDP
2.9 Building a Web server

4/19/05 CS11820

P2P file sharing

Example
• Alice runs P2P client application on her notebook computer
• Intermittently connects to Internet; gets new IP address for each connection
• Asks for “Hey Jude”
• Application displays other peers that have a copy of Hey Jude.
• Alice chooses one of the peers, Bob.
• File is copied from Bob’s PC to Alice’s notebook: HTTP
• While Alice downloads, other users are uploading from Alice.
• Alice’s peer is both a Web client and a transient Web server.

All peers are servers = highly scalable!

4/19/05 CS11821

P2P: centralized directory

original “Napster” design

1) when peer connects, it informs central server:
   • IP address
   • content

2) Alice queries for “Hey Jude”

3) Alice requests file from Bob

[Figure: peers, including Alice and Bob, around a centralized directory server; arrows are labeled with steps 1, 2, and 3.]

4/19/05 CS11822

P2P: problems with centralized directory

• Single point of failure
• Performance bottleneck
• Copyright infringement

File transfer is decentralized, but locating content is highly centralized

4/19/05 CS11823

Query flooding: Gnutella

• fully distributed: no central server
• public domain protocol
• many Gnutella clients implementing protocol

Overlay network: graph
• edge between peer X and Y if there’s a TCP connection
• all active peers and edges form the overlay net
• Edge is not a physical link
• Given peer will typically be connected with < 10 overlay neighbors

4/19/05 CS11824

Gnutella: protocol

• Query message sent over existing TCP connections
• peers forward Query message
• QueryHit sent over reverse path
• File transfer: HTTP

Scalability: limited scope flooding

[Figure: Query messages spreading from peer to peer across the overlay, with QueryHit messages returning along the reverse path.]
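Limited scope flooding can be sketched as a breadth-first search with a hop budget. This is a simplified model rather than the Gnutella wire protocol: the overlay graph, the peer names other than Alice and Bob, and the `ttl` parameter name are assumptions made for illustration.

```python
from collections import deque

def flood_query(overlay, files, start, wanted, ttl):
    """Return peers within `ttl` overlay hops of `start` that hold `wanted`."""
    seen = {start}                    # peers that already saw this Query
    hits = []
    frontier = deque([(start, ttl)])
    while frontier:
        peer, hops_left = frontier.popleft()
        if peer != start and wanted in files.get(peer, ()):
            hits.append(peer)         # this peer would send a QueryHit back
        if hops_left == 0:
            continue                  # scope exhausted: do not forward further
        for neighbor in overlay.get(peer, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, hops_left - 1))
    return hits

# Hypothetical overlay: each edge stands for an open TCP connection.
overlay = {
    "alice": ["p1", "p2"],
    "p1": ["alice", "bob"],
    "p2": ["alice"],
    "bob": ["p1", "carol"],
    "carol": ["bob"],
}
files = {"bob": {"Hey Jude"}, "carol": {"Hey Jude"}}

print(flood_query(overlay, files, "alice", "Hey Jude", ttl=2))
print(flood_query(overlay, files, "alice", "Hey Jude", ttl=3))
```

A smaller hop budget reduces query traffic but can miss peers that hold the file, which is the trade-off behind limited scope flooding.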

4/19/05 CS11825

Gnutella: Peer joining

1. Joining peer X must find some other peer in Gnutella network: use list of candidate peers

2. X sequentially attempts to make TCP connections with peers on the list until a connection is set up with Y

3. X sends Ping message to Y; Y forwards Ping message.

4. All peers receiving Ping message respond with Pong message

5. X receives many Pong messages. It can then set up additional TCP connections

4/19/05 CS11826

Exploiting heterogeneity: KaZaA

Each peer is either a group leader or assigned to a group leader.
• TCP connection between peer and its group leader.
• TCP connections between some pairs of group leaders.

Group leader tracks the content in all its children.

[Figure: overlay network of ordinary peers and group-leader peers, with lines showing neighboring relationships in the overlay network.]

4/19/05 CS11827

KaZaA: Querying

• Each file has a hash and a descriptor
• Client sends keyword query to its group leader
• Group leader responds with matches:
  • For each match: metadata, hash, IP address
• If group leader forwards query to other group leaders, they respond with matches
• Client then selects files for downloading
  • HTTP requests using hash as identifier sent to peers holding desired file

4/19/05 CS11828

KaZaA tricks

• Limitations on simultaneous uploads
• Request queuing
• Incentive priorities
• Parallel downloading

4/19/05 CS11829

Chapter 2: Application layer

2.1 Principles of network applications
2.2 Web and HTTP
2.3 FTP
2.4 Electronic Mail
• SMTP, POP3, IMAP
2.5 DNS
2.6 P2P file sharing
2.7 Socket programming with TCP
2.8 Socket programming with UDP
2.9 Building a Web server

4/19/05 CS11830

Socket-programming using TCP

Socket: a door between application process and end-to-end transport protocol (UDP or TCP)

TCP service: reliable transfer of bytes from one process to another

[Figure: two hosts (host or server) connected through the internet. On each host, a process (controlled by application developer) talks through a socket to TCP with buffers and variables (controlled by operating system).]

4/19/05 CS11831

Socket programming with TCP

Client must contact server
• server process must first be running
• server must have created socket (door) that welcomes client’s contact

Client contacts server by:
• creating client-local TCP socket
• specifying IP address, port number of server process
• When client creates socket: client TCP establishes connection to server TCP

When contacted by client, server TCP creates new socket for server process to communicate with client
• allows server to talk with multiple clients
• source port numbers used to distinguish clients (more in Chap 3)

application viewpoint: TCP provides reliable, in-order transfer of bytes (“pipe”) between client and server

4/19/05 CS11832

Client/server socket interaction: TCP

Server (running on hostid)
1. create socket, port=x, for incoming request: welcomeSocket = ServerSocket()
2. wait for incoming connection request: connectionSocket = welcomeSocket.accept()
3. read request from connectionSocket
4. write reply to connectionSocket
5. close connectionSocket

Client
1. create socket, connect to hostid, port=x: clientSocket = Socket()
2. send request using clientSocket
3. read reply from clientSocket
4. close clientSocket

TCP connection setup takes place between the client's socket creation and the server's accept().
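The slide's socket names come from the Java API; the same interaction can be sketched with Python's standard `socket` module. This is a minimal single-request example on the loopback interface. The uppercase "service", the use of a thread to host the server, and letting the OS pick the port are choices made for the illustration only.

```python
import socket
import threading

def serve_once(welcome_socket):
    """Accept one client, read its request, reply with the uppercased bytes."""
    connection_socket, _client_addr = welcome_socket.accept()
    with connection_socket:
        request = connection_socket.recv(1024)
        connection_socket.sendall(request.upper())

# Server side: the welcoming socket that listens for connection requests.
welcome_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
welcome_socket.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
welcome_socket.listen(1)
server_port = welcome_socket.getsockname()[1]
server_thread = threading.Thread(target=serve_once, args=(welcome_socket,))
server_thread.start()

# Client side: creating the connection triggers TCP connection setup.
with socket.create_connection(("127.0.0.1", server_port)) as client_socket:
    client_socket.sendall(b"hello")
    tcp_reply = client_socket.recv(1024)

server_thread.join()
welcome_socket.close()
print(tcp_reply)
```

Note that `accept()` returns a new socket dedicated to this client, while the welcoming socket stays available for further connection requests.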

4/19/05 CS11833

Socket programming with UDP

UDP: no “connection” between client and server
• no handshaking
• sender explicitly attaches IP address and port of destination to each packet
• server must extract IP address, port of sender from received packet

UDP: transmitted data may be lost, or received out of order

application viewpoint: UDP provides unreliable transfer of chunks of bytes (“datagrams”) between client and server

4/19/05 CS11834

Client/server socket interaction: UDP

Server (running on hostid)
1. create socket, port=x, for incoming request: serverSocket = DatagramSocket()
2. read request from serverSocket
3. write reply to serverSocket, specifying client host address, port number

Client
1. create socket: clientSocket = DatagramSocket()
2. create datagram addressed to (hostid, port=x), send datagram request using clientSocket
3. read reply from clientSocket
4. close clientSocket
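A comparable sketch for UDP, again using Python's standard `socket` module on the loopback interface. There is no connection: the client names the destination on each send, and the server learns the client's address from the received datagram. The uppercase reply and the 2-second timeout are arbitrary choices for the example.

```python
import socket
import threading

# Server side: one datagram socket bound to a port; no listen/accept.
server_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server_socket.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
udp_port = server_socket.getsockname()[1]

def serve_one_datagram():
    """Read one request and reply to whatever address it came from."""
    request, client_addr = server_socket.recvfrom(2048)
    server_socket.sendto(request.upper(), client_addr)

server_thread = threading.Thread(target=serve_one_datagram)
server_thread.start()

# Client side: the destination address is attached to every datagram sent.
client_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client_socket.settimeout(2.0)              # UDP may lose data; do not wait forever
client_socket.sendto(b"hello", ("127.0.0.1", udp_port))
udp_reply, server_addr = client_socket.recvfrom(2048)
client_socket.close()

server_thread.join()
server_socket.close()
print(udp_reply, server_addr)
```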

4/19/05 CS11835

Chapter 2: Summary

Application architectures
• client-server
• P2P

Application service requirements:
• reliability, bandwidth, delay

Internet transport service model
• connection-oriented, reliable: TCP
• unreliable, datagrams: UDP

Specific application protocols
• HTTP, FTP, SMTP, POP, IMAP, DNS

Learned about protocols
• typical request/reply message exchange:
  • client sends request
  • server responds with status code, data
• Typical message formats:
  • headers: fields giving info about data
  • data: info being communicated
• In-band vs. out-of-band control messages
• Stateless vs. stateful protocols

4/19/05 CS11836

identifying servers and services

Each service is assigned a unique well-known port #
• HTTP: TCP/80, FTP: TCP/21, SMTP: TCP/25, DNS: UDP/53

server application process registers with local protocol software with that port #

a client requests a service by sending request to a specific server host with the well-known port #

server handles multiple requests concurrently
• master process accepts incoming requests and creates a child server process for each client, then goes back to wait for future requests
• the child server process handles all msgs from the same client process; each incoming msg identifies its server process by (source addr + port, destination addr + port)

4/19/05 CS11837

Transport services and protocols

• provide logical communication between app processes running on different hosts
• transport protocols run in end systems
  • send side: breaks app messages into segments, passes to network layer
  • rcv side: reassembles segments into messages, passes to app layer
• more than one transport protocol available to apps
  • Internet: TCP and UDP

[Figure: two end systems with full stacks (application, transport, network, data link, physical) connected through routers that implement only network, data link, physical; a "logical end-end transport" arrow joins the two transport layers.]

4/19/05 CS11838

Transport vs. network layer

network layer: logical communication between hosts

transport layer: logical communication between processes
• relies on, enhances, network layer services

Household analogy: 12 kids sending letters to 12 kids
• processes = kids
• app messages = letters in envelopes
• hosts = houses
• transport protocol = Ann and Bill
• network-layer protocol = postal service

4/19/05 CS11839

Internet transport-layer protocols

reliable, in-order delivery (TCP)
• congestion control
• flow control
• connection setup

unreliable, unordered delivery: UDP
• no-frills extension of “best-effort” IP

services not available:
• delay guarantees
• bandwidth guarantees

[Figure: same end-system and router protocol stacks as on the "Transport services and protocols" slide, with the logical end-end transport arrow.]

4/19/05 CS11840

Multiplexing/demultiplexing

Multiplexing at send host: gathering data from multiple sockets, enveloping data with header (later used for demultiplexing)

Demultiplexing at rcv host: delivering received segments to correct socket

• Each process is identified by IP address and port #
• A transport association is identified by [source addr, port #; destination addr, port #]

[Figure: host 1, host 2, and host 3, each with application, transport, network, link, and physical layers; processes P1 to P4 attach to the transport layer through sockets.]

4/19/05 CS11841

Multiplexing/demultiplexing: examples

[Figure: Web client host A and Web clients on host C talk to Web server B (port use: Web server). Segments shown:]
• Source IP: A, Dest IP: B, source port: 1180, dest. port: 80
• Source IP: C, Dest IP: B, source port: 1180, dest. port: 80
• Source IP: C, Dest IP: B, source port: 2211, dest. port: 80

host receives IP datagrams
• each datagram has source IP address, destination IP address
• each datagram carries 1 transport-layer segment
• each segment has source, destination port number

host uses IP addresses & port numbers to direct segment to appropriate socket
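The demultiplexing rule in this example can be sketched as a table lookup keyed by the four values a segment carries. This is a toy model, not an operating-system API: the dictionary `sockets`, the function names, and the payload strings are hypothetical, while the addresses and ports are the ones from the example.

```python
# Server B's connection sockets, keyed by
# (source IP, source port, destination IP, destination port).
sockets = {}

def open_connection(src_ip, src_port, dst_ip, dst_port):
    """Create a receive queue (standing in for a socket) for one association."""
    key = (src_ip, src_port, dst_ip, dst_port)
    sockets[key] = []
    return key

def demultiplex(segment):
    """Deliver a segment's data to the socket whose 4-tuple matches, if any."""
    key = (segment["src_ip"], segment["src_port"],
           segment["dst_ip"], segment["dst_port"])
    queue = sockets.get(key)
    if queue is None:
        return None            # no matching socket: segment is not delivered
    queue.append(segment["data"])
    return key

# The three associations from the example: all target B port 80.
k1 = open_connection("A", 1180, "B", 80)
k2 = open_connection("C", 1180, "B", 80)
k3 = open_connection("C", 2211, "B", 80)

demultiplex({"src_ip": "C", "src_port": 2211, "dst_ip": "B", "dst_port": 80,
             "data": "GET /c2"})
demultiplex({"src_ip": "A", "src_port": 1180, "dst_ip": "B", "dst_port": 80,
             "data": "GET /a"})
print(sockets)
```

Hosts A and C both use source port 1180, yet their segments land in different sockets because the source IP address is part of the key.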

4/19/05 CS11842

UDP: User Datagram Protocol [RFC 768]

“best effort” service, UDP segments may be:
• lost
• delivered out of order to application processes

connectionless:
• no prior handshaking between UDP sender, receiver
• each UDP segment handled independently of others

Why is there a UDP?
• no connection establishment (which can add delay)
• simple: no connection state at sender, receiver
• small segment header
• no congestion control: UDP can blast away as fast as desired

4/19/05 CS11843

UDP: more

often used for streaming multimedia apps
• loss tolerant
• rate sensitive

other UDP uses
• DNS
• SNMP

reliable transfer over UDP: add reliability at application layer
• application-specific error recovery!

UDP segment format (each row is 32 bits wide):
  source port # | dest port #
  length        | checksum
  Application data (message)

Length: in bytes, of the UDP segment, including header

4/19/05 CS11844

UDP checksum

Goal: detect “errors” (e.g., flipped bits) in transmitted segment

Sender:
• treat segment contents as sequence of 16-bit integers
• checksum: 1’s complement of the 1’s complement sum of segment contents
• sender puts checksum value into UDP checksum field

Receiver:
• compute checksum of received segment
• check if computed checksum equals checksum field value:
  • NO - error detected
  • YES - no error detected. But maybe errors nonetheless? More later ….

4/19/05 CS11845

Internet Checksum Example

Note: When adding numbers, a carryout from the most significant bit needs to be added to the result

Example: add two 16-bit integers

                1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
                1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
              ---------------------------------
wraparound    1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1

sum             1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum        0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
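The wraparound addition in this example can be checked with a few lines of Python. The function names are hypothetical; the two 16-bit inputs are the ones from the example (0xE666 and 0xD555).

```python
def ones_complement_sum(words):
    """Add 16-bit words, folding any carry out of bit 15 back into the sum."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # wraparound carry
    return total

def internet_checksum(words):
    """Checksum = bitwise complement of the one's complement sum."""
    return ~ones_complement_sum(words) & 0xFFFF

a = 0b1110011001100110
b = 0b1101010101010101
wrapped_sum = ones_complement_sum([a, b])
checksum = internet_checksum([a, b])
print(f"{wrapped_sum:016b}")   # the "sum" row
print(f"{checksum:016b}")      # the "checksum" row

# Receiver-side check: summing the data words plus the checksum gives all ones.
receiver_total = ones_complement_sum([a, b, checksum])
print(f"{receiver_total:016b}")
```

A single flipped bit changes `receiver_total`, so it is detected; errors that cancel each other out in the sum are not.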

4/19/05 CS11846

How to Calculate UDP Checksum

Pseudo header (each row is 32 bits wide):
  source IP address
  destination IP address
  zero | protocol | UDP length

UDP header format (each row is 32 bits wide):
  source port # | dest port #
  length        | checksum
  Application data (message)

UDP header
• Length: # of bytes (including both header & data)
• checksum: computed over
  • the pseudo header, and
  • UDP header and data.
  • if the field is 0, no checksum

pseudo header: UDP's self-protection against misdelivered IP packets
• pseudo header is not carried in UDP packet, nor counted in the length field

4/19/05 CS11847

Chapter 3 outline

3.1 Transport-layer services

3.2 Multiplexing and demultiplexing

3.3 Connectionless transport: UDP

3.4 Principles of reliable data transfer

3.5 Connection-oriented transport: TCP
• segment structure
• reliable data transfer
• flow control
• connection management

3.6 Principles of congestion control

3.7 TCP congestion control

4/19/05 CS11848

Principles of Reliable data transfer

characteristics of unreliable channel determine the complexity of the reliable data transfer protocol (rdt)

We’ll:
• incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
• consider only unidirectional data transfer
  • but control info will flow in both directions!

4/19/05 CS11849

Reliable data transfer: getting started

send side
• rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
• udt_send(): called by rdt, to transfer packet over unreliable channel to receiver

receive side
• rdt_rcv(): called when packet arrives on rcv-side of channel
• deliver_data(): called by rdt to deliver data to upper layer

4/19/05 CS11850

Reliable data transfer: getting started

use finite state machines (FSM) to specify sender, receiver

state: when in this “state”, the next state is uniquely determined by next event

[Figure: state 1 and state 2 joined by an arrow. The label above the line is the event causing the state transition; the label below the line lists the actions taken on the state transition.]

4/19/05 CS11851

Rdt1.0: reliable transfer over a reliable channel

underlying channel perfectly reliable
• no bit errors
• no loss of packets

separate FSMs for sender, receiver:
• sender sends data into underlying channel
• receiver reads data from underlying channel

sender FSM, state "Wait for call from above":
• rdt_send(data): packet = make_pkt(data); udt_send(packet)

receiver FSM, state "Wait for call from below":
• rdt_rcv(packet): extract(packet,data); deliver_data(data)

4/19/05 CS11852

Rdt2.0: channel with bit errors

underlying channel may flip bits in packet
• checksum to detect bit errors

the question: how to recover from errors:
• acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
• negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
• sender retransmits pkt on receipt of NAK

new mechanisms in rdt2.0 (beyond rdt1.0):
• error detection
• receiver feedback: control msgs (ACK,NAK) rcvr->sender

4/19/05 CS11853

rdt2.0: FSM specification

sender FSM

State "Wait for call from above":
• rdt_send(data): sndpkt = make_pkt(data, checksum); udt_send(sndpkt) -> Wait for ACK or NAK

State "Wait for ACK or NAK":
• rdt_rcv(rcvpkt) && isNAK(rcvpkt): udt_send(sndpkt) (stay in this state)
• rdt_rcv(rcvpkt) && isACK(rcvpkt): no action -> Wait for call from above

receiver FSM

State "Wait for call from below":
• rdt_rcv(rcvpkt) && corrupt(rcvpkt): udt_send(NAK)
• rdt_rcv(rcvpkt) && notcorrupt(rcvpkt): extract(rcvpkt,data); deliver_data(data); udt_send(ACK)

4/19/05 CS11854

rdt2.0: operation with no errors

[Figure: the rdt2.0 sender and receiver FSMs from the "rdt2.0: FSM specification" slide, repeated to trace operation with no errors.]

4/19/05 CS11855

rdt2.0: error scenario

[Figure: the rdt2.0 sender and receiver FSMs from the "rdt2.0: FSM specification" slide, repeated to trace the error scenario.]

4/19/05 CS11856

rdt2.0 has a fatal flaw!

What happens if ACK/NAK corrupted?
• sender doesn’t know what happened at receiver!
• can’t just retransmit: possible duplicate

Handling duplicates:
• sender retransmits current pkt if ACK/NAK garbled
• sender adds sequence number to each pkt
• receiver discards (doesn’t deliver up) duplicate pkt

stop and wait: Sender sends one packet, then waits for receiver response

4/19/05 CS11857

rdt2.1: sender, handles garbled ACK/NAKs

State "Wait for call 0 from above":
• rdt_send(data): sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt) -> Wait for ACK or NAK 0

State "Wait for ACK or NAK 0":
• rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ): udt_send(sndpkt)
• rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt): no action -> Wait for call 1 from above

State "Wait for call 1 from above":
• rdt_send(data): sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt) -> Wait for ACK or NAK 1

State "Wait for ACK or NAK 1":
• rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ): udt_send(sndpkt)
• rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt): no action -> Wait for call 0 from above

4/19/05 CS11858

rdt2.1: receiver, handles garbled ACK/NAKs

State "Wait for 0 from below":
• rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt): extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> Wait for 1 from below
• rdt_rcv(rcvpkt) && corrupt(rcvpkt): sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
• rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt): sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)

State "Wait for 1 from below":
• rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt): extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> Wait for 0 from below
• rdt_rcv(rcvpkt) && corrupt(rcvpkt): sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
• rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt): sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)

4/19/05 CS11859

rdt2.1: discussion

Sender:
• seq # added to pkt
• two seq. #’s (0,1) will suffice. Why?
• must check if received ACK/NAK corrupted
• twice as many states
  • state must “remember” whether “current” pkt has 0 or 1 seq. #

Receiver:
• must check if received packet is duplicate
  • state indicates whether 0 or 1 is expected pkt seq #
• note: receiver cannot know if its last ACK/NAK received OK at sender

4/19/05 CS11860

rdt2.2: a NAK-free protocol

• same functionality as rdt2.1, using ACKs only
• instead of NAK, receiver sends ACK for last pkt received OK
  • receiver must explicitly include seq # of pkt being ACKed
• duplicate ACK at sender results in same action as NAK: retransmit current pkt

4/19/05 CS11861

rdt2.2: sender, receiver fragments

sender FSM fragment

State "Wait for call 0 from above":
• rdt_send(data): sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt) -> Wait for ACK 0

State "Wait for ACK 0":
• rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) ): udt_send(sndpkt)
• rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0): no action, move on to the next state

receiver FSM fragment

Transition into "Wait for 0 from below":
• rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt): extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)

State "Wait for 0 from below":
• rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt)): udt_send(sndpkt)

4/19/05 CS11862

rdt3.0: channels with bit errors and packet loss

New assumption: underlying channel can also lose packets (data or ACKs)
• checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Approach: sender waits “reasonable” amount of time for ACK
• retransmits if no ACK received in this time
• if pkt (or ACK) just delayed (not lost):
  • retransmission will be duplicate, but use of seq. #’s already handles this
  • receiver must specify seq # of pkt being ACKed
• requires countdown timer
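The timeout-and-retransmit approach above can be sketched as a small deterministic simulation. It is a simplification under stated assumptions: losses are scripted by transmission index instead of being random, bit errors are left out, a lost packet or lost ACK is modeled as an immediate timeout, and `run_stop_and_wait` is a hypothetical name.

```python
def run_stop_and_wait(messages, drop_data=(), drop_ack=()):
    """Simulate alternating-bit stop-and-wait over a lossy channel.

    drop_data / drop_ack hold indices of data transmissions (0-based, counting
    every transmission including retransmissions) whose packet or ACK is lost.
    Returns (data delivered to the receiver's upper layer, transmissions used).
    """
    delivered = []
    expected = 0          # receiver: sequence number it is waiting for
    seq = 0               # sender: sequence number of the current packet
    transmissions = 0
    for data in messages:
        while True:
            attempt = transmissions
            transmissions += 1
            if attempt in drop_data:
                continue              # packet lost: timer expires, retransmit
            if seq == expected:       # new packet: deliver it exactly once
                delivered.append(data)
                expected ^= 1
            # A duplicate is not delivered again, but it is still ACKed.
            if attempt in drop_ack:
                continue              # ACK lost: timer expires, retransmit
            break                     # ACK received: stop timer, next message
        seq ^= 1
    return delivered, transmissions

result = run_stop_and_wait(["a", "b", "c"], drop_data={1}, drop_ack={3})
print(result)
```

In this run the lost ACK causes packet "c" to be sent twice, and the sequence number lets the receiver recognize the second copy as a duplicate.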

4/19/05 CS11863

rdt3.0 sender

State "Wait for call 0 from above":
• rdt_send(data): sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer -> Wait for ACK0
• rdt_rcv(rcvpkt): no action

State "Wait for ACK0":
• rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) ): no action
• timeout: udt_send(sndpkt); start_timer
• rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0): stop_timer -> Wait for call 1 from above

State "Wait for call 1 from above":
• rdt_send(data): sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer -> Wait for ACK1
• rdt_rcv(rcvpkt): no action

State "Wait for ACK1":
• rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) ): no action
• timeout: udt_send(sndpkt); start_timer
• rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1): stop_timer -> Wait for call 0 from above

4/19/05 CS11864

rdt3.0 in action

4/19/05 CS11865

rdt3.0 in action

4/19/05 CS11866

Performance of rdt3.0

example: 1 Gbps link, 15 ms prop. delay, 1KB packet:

$$T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec}$$

U sender: utilization, the fraction of time sender is busy sending (times below in msec, with RTT = 30 msec)

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

• 1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
• network protocol limits use of physical resources!

4/19/05 CS11867

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timeline for stop-and-wait.]
• first packet bit transmitted, t = 0
• last packet bit transmitted, t = L / R
• first packet bit arrives
• last packet bit arrives, send ACK
• ACK arrives, send next packet, t = RTT + L / R

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

4/19/05 CS11868

Pipelined protocols

Pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts
• range of sequence numbers must be increased
• buffering at sender and/or receiver

4/19/05 CS11869

Pipelining: increased utilization

[Figure: sender/receiver timeline with three packets in flight.]
• first packet bit transmitted, t = 0
• last bit transmitted, t = L / R
• first packet bit arrives
• last packet bit arrives, send ACK
• last bit of 2nd packet arrives, send ACK
• last bit of 3rd packet arrives, send ACK
• ACK arrives, send next packet, t = RTT + L / R

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increase utilization by a factor of 3!
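The utilization figures for stop-and-wait and for three packets in flight can be reproduced numerically. The function name and the `window` parameter are assumptions for this sketch; the inputs are the ones from the example (8000-bit packet, 1 Gbps link, 30 msec RTT).

```python
def sender_utilization(packet_bits, rate_bps, rtt_seconds, window=1):
    """Fraction of time the sender is busy when `window` packets are sent per RTT."""
    transmit_time = packet_bits / rate_bps            # L / R, in seconds
    busy = window * transmit_time
    return min(1.0, busy / (rtt_seconds + transmit_time))

L_BITS = 8000        # 1 KB packet
R_BPS = 1e9          # 1 Gbps link
RTT = 0.030          # 2 x 15 ms propagation delay

u_stop_and_wait = sender_utilization(L_BITS, R_BPS, RTT)
u_pipelined = sender_utilization(L_BITS, R_BPS, RTT, window=3)
print(round(u_stop_and_wait, 5), round(u_pipelined, 4))
```

With these numbers the link stays almost idle in both cases; the window would have to grow to a few thousand packets before the sender is busy most of the time.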

4/19/05 CS11870

What if some packets get lost?

Two generic forms of pipelined protocols: go-Back-N, selective repeat

4/19/05 CS11871

Go-Back-N

Sender:
• k-bit seq # in pkt header
• “window” of up to N consecutive unack’ed pkts allowed
• ACK(n): ACKs all pkts up to, including seq # n (cumulative ACK)
  • may receive duplicate ACKs (see receiver)
• timer for each in-flight pkt
• timeout(n): retransmit pkt n and all higher seq # pkts in window

4/19/05 CS11872

GBN: sender extended FSM

Single state "Wait"; initially base=1, nextseqnum=1

rdt_send(data) (call from application):
  if (nextseqnum < base+N) {
    sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
    udt_send(sndpkt[nextseqnum])
    if (base == nextseqnum)
      start_timer
    nextseqnum++
  }
  else
    refuse_data(data)

timeout:
  start_timer
  udt_send(sndpkt[base])
  udt_send(sndpkt[base+1])
  …
  udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) (call from network):
  base = getacknum(rcvpkt)+1
  if (base == nextseqnum)
    stop_timer
  else
    start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt) (call from network):
  no action

4/19/05 CS11873

GBN: receiver extended FSM

ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
• may generate duplicate ACKs
• need only remember expectedseqnum

out-of-order pkt:
• discard (don’t buffer) -> no receiver buffering!
• Re-ACK pkt with highest in-order seq #

Single state "Wait"; initially expectedseqnum=1, sndpkt = make_pkt(expectedseqnum,ACK,chksum)

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum):
  extract(rcvpkt,data)
  deliver_data(data)
  sndpkt = make_pkt(expectedseqnum,ACK,chksum)
  udt_send(sndpkt)
  expectedseqnum++

default:
  udt_send(sndpkt)
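The Go-Back-N sender's bookkeeping (base, nextseqnum, one timer, retransmit everything outstanding on timeout) can be sketched as a small class. This models only the sender's state, with no real channel or clock; the class name, the `sent` log, and the `timer_running` flag are assumptions added for illustration.

```python
class GBNSender:
    """Go-Back-N sender state with window size N (no real network or timer)."""

    def __init__(self, window):
        self.N = window
        self.base = 1                 # oldest unacknowledged sequence number
        self.nextseqnum = 1           # next sequence number to use
        self.sndpkt = {}              # buffered packets, keyed by seq #
        self.timer_running = False
        self.sent = []                # log of seq #s handed to the channel

    def rdt_send(self, data):
        """Send data if the window has room; return False to refuse it."""
        if self.nextseqnum >= self.base + self.N:
            return False              # window full
        self.sndpkt[self.nextseqnum] = data
        self.sent.append(self.nextseqnum)
        if self.base == self.nextseqnum:
            self.timer_running = True # first outstanding packet starts the timer
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        """Cumulative ACK: everything up to and including acknum is confirmed."""
        self.base = max(self.base, acknum + 1)   # max() keeps stale ACKs harmless
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        """Go back N: resend every packet from base up to nextseqnum - 1."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.sent.append(seq)

sender = GBNSender(window=4)
accepted = [sender.rdt_send(ch) for ch in "abcde"]   # fifth send is refused
sender.on_ack(1)                                     # packet 1 acknowledged
sender.on_timeout()                                  # packets 2..4 resent
print(accepted, sender.sent, sender.base)
```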

4/19/05 CS11874

GBN in action

4/19/05 CS11875

Selective Repeat

• receiver individually acknowledges all correctly received pkts
  • buffers pkts (which may have arrived out-of-order), as needed, for eventual in-order delivery to upper layer
• sender only resends pkts for which ACK not received
  • sender timer for each unACKed pkt
• sender window
  • N consecutive seq #’s
  • again limits seq #s of sent, unACKed pkts

4/19/05 CS11876

Selective repeat: sender, receiver windows

4/19/05 CS11877

Selective repeat

sender

data from above:
• if next available seq # in window, send pkt

timeout(n):
• resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N-1]:
• mark pkt n as received
• if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
• send ACK(n)
• out-of-order: buffer
• in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
• ACK(n)

otherwise:
• ignore

4/19/05 CS11878

Selective repeat in action

4/19/05 CS11879

Selective repeat: dilemma

Example: seq #’s: 0, 1, 2, 3 window size=3

receiver sees no difference in two scenarios!

incorrectly passes duplicate data as new in (a)

Q: what relationship between seq # size and window size?

4/19/05 CS11880

Sequence number: how many bits needed?

[Figure: sender and receiver each shown with the cyclic sequence-number line 1 2 3 0 1 2 3 0.]

(Max. seq # + 1) / 2 ≥ window-size

Example: Window size = 4, is 2 bits enough?
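The question can be answered by applying the rule that the window may cover at most half of the sequence-number space. The helper names below are hypothetical; the rule is the one stated on this slide for selective repeat.

```python
def sr_window_ok(seq_bits, window):
    """True if `window` fits in half of the 2**seq_bits sequence-number space."""
    return window <= (2 ** seq_bits) // 2

def min_seq_bits(window):
    """Smallest number of sequence-number bits that is safe for this window."""
    bits = 1
    while not sr_window_ok(bits, window):
        bits += 1
    return bits

print(sr_window_ok(2, 4))   # 2 bits, window 4
print(sr_window_ok(2, 3))   # the dilemma case: seq #s 0..3, window 3
print(min_seq_bits(4))
```

With 2 bits there are only 4 sequence numbers, so a window of 4 (or even 3) lets old and new packets share a number; a window of 4 needs 3 bits.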

4/19/05 CS11881

Three basic components in reliable data delivery by retransmission

• sequence number: used by both sender and receiver to uniquely identify individual frames
• Acknowledgment (ACK): reception report sent by receiver
• Retransmission by the sender upon TIMEOUT
  • must know how long to wait before retry

4/19/05 CS11882

Always Keep the Big Picture in Mind

[Figure: protocol stack (application, transport, network, link, physical) with message M gaining headers Ht, Hn, and Hl as it moves down the layers. A Web browser and a Web server each run HTTP over TCP through the socket interface, on top of unreliable network data packet delivery. On the sending side the application process writes bytes into the TCP send buffer; TCP carries them as segments; on the receiving side the application process reads bytes from the TCP receive buffer.]
