
Data Networks

Lecture 1

Introduction

Eytan Modiano


6.263: Data Networks

• Fundamental aspects of network Design and Analysis:

– Architecture Layering Topology design

– Protocols Pt.-to-Pt. Multiple access End-to-end

– Algorithms Error recovery Routing Flow Control

– Analysis tools Probabilistic modeling Queueing Theory

Course Information

• Lecturer: Professor Eytan Modiano

• Requirements & Grading
– About one problem set per week (10% of grade)
– Project (5% of grade)
– Midterm exam (35%)
– Final Exam during finals week (50%)

• Prerequisite Policy: 6.041, or an equivalent class in probability

• Textbook: Bertsekas & Gallager, Data Networks (2nd Edition)

Tentative syllabus

LEC # TOPICS
1 Introduction, OSI 7-layer architecture
2 Data Link Layers, Framing, error detection
3 Retransmission Algorithms
4 Retransmission Algorithms
5 Queueing Models - Introduction & Little's theorem
6 M/M/1, M/M/m, queues etc.
7 Networks of queues
8 M/G/1 queues, M/G/1 w/ vacations
9 M/G/1 queues and reservations, priority queues
10 Stability of queueing systems
11 M/G/1 queue occupancy distribution
12 Quiz

Tentative syllabus, continued

LEC # TOPICS
13 Multiple access & Aloha
14 Stabilized Aloha, Tree Algorithms
15 CSMA, CSMA/CD and Ethernet
16 High-speed LANs, Token rings, Satellite reservations
17 Introduction to switch architecture
18 High Speed Switch Scheduling
19 Broadcast routing & Spanning trees
20 Shortest path routing
21 Distributed routing algorithms, optimal routing
22 Flow Control - Window/Credit Schemes
23 Flow Control - Rate Based Schemes
24 Transport layer and TCP/IP
25 ATM Networks
26 Special topic: Optical Networks, Wireless networks

Final Exam during final exam week. Date and time to be announced.

Network Applications

• Resource sharing
– Computing
  Mainframe computer (old days)
  Today, computers cheaper than comm (except LANs)
  Printers, peripherals
– Information
  DB access and updates
  E.g., financial, airline reservations, etc.

• Services
– Email, FTP, Telnet, Web access
– Video conferencing
– DB access
– Client/server applications

Network coverage areas

• Wide Area Networks (WANs)
– Span large areas (countries, continents, world)
– Use leased phone lines (expensive!)
  1980’s: 10 Kbps, 2000’s: 2.5 Gbps
  User access rates: 56 Kbps – 155 Mbps typical
– Shared comm links: switches and routers
  E.g., IBM SNA, X.25 networks, Internet

• Local Area Networks (LANs)
– Span office or building
– Single hop (shared channel) (cheap!)
– User rates: 10 Mbps – 1 Gbps
  E.g., Ethernet, Token rings, AppleTalk

• Metro Area Networks (MANs)
• Storage area networks

Network services

• Synchronous
– Session appears as a continuous stream of traffic (e.g., voice)
– Usually requires fixed and limited delays

• Asynchronous
– Session appears as a sequence of messages
– Typically bursty
– E.g., interactive sessions, file transfers, email

• Connection oriented services
– Long sustained session
– Orderly and timely delivery of packets
– E.g., Telnet, FTP

• Connectionless services
– One time transaction (e.g., email)

• QoS

Switching Techniques

• Circuit Switching
– Dedicated resources

• Packet Switching
– Shared resources
– Virtual Circuits
– Datagrams

Circuit Switching

• Each session is allocated a fixed fraction of the capacity on each link along its path
– Dedicated resources
– Fixed path
– If capacity is used, calls are blocked
  E.g., telephone network

• Advantages of circuit switching
– Fixed delays
– Guaranteed continuous delivery

• Disadvantages
– Circuits are not used when session is idle
– Inefficient for bursty traffic
– Circuit switching usually done using a fixed rate stream (e.g., 64 Kbps)
  Difficult to support variable data rates

Problems with circuit switching

• Many data sessions are low duty factor (bursty):
  (message transmission time)/(message interarrival time) << 1
  Same as: (message arrival rate) * (message transmission time) << 1

• The rate allocated to the session must be large enough to meet the delay requirement. This allocated capacity is idle when the session has nothing to send

• If communication is expensive, then circuit switching is uneconomical for meeting the delay requirements of bursty traffic

• Also, circuit switching requires a call set-up during which resources are not utilized. If messages are much shorter than the call set-up time then circuit switching is not economical (or even practical)
– More of a problem in high-speed networks

Circuit Switching Example

L = message length (bits)
λ = arrival rate of messages
R = channel rate in bits per second
X = message transmission delay = L/R

– R must be large enough to keep X small
– Bursty traffic => λX << 1 => low utilization

• Example
– L = 1000 bytes (8000 bits)
– λ = 1 message per second
– X < 0.1 seconds (delay requirement)
– => R > 8000/0.1 = 80,000 bps

Utilization = 8000/80000 = 10%

• With packet switching, the channel can be shared among many sessions to achieve higher utilization
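The arithmetic in the example above can be reproduced with a short Python sketch. The function name `circuit_utilization` and its parameter names are illustrative choices, not part of the lecture; the formulas are X = L/R and utilization = λX.

```python
def circuit_utilization(message_bits, arrival_rate, max_delay):
    """Return (minimum channel rate in bps, utilization) for one dedicated circuit.

    Hypothetical helper for illustration: it sizes the circuit so that the
    transmission delay X = L/R just meets the delay requirement.
    """
    # Smallest rate R that keeps X = L / R within max_delay.
    min_rate = message_bits / max_delay
    # Fraction of time the circuit is busy: lambda * X, with X = L / R.
    utilization = arrival_rate * message_bits / min_rate
    return min_rate, utilization


rate, util = circuit_utilization(message_bits=8000, arrival_rate=1.0, max_delay=0.1)
print(f"R = {rate:.0f} bps, utilization = {util:.0%}")
```

With the numbers from the slide this gives a rate of about 80,000 bps and a utilization of about 10%, which is why a dedicated circuit is a poor fit for bursty sessions.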

Packet Switched Networks

[Figure: a packet network made of packet switches (PS), each with a buffer. Messages are broken into packets that are routed to their destination.]

Packet Switching

• Datagram packet switching
– Route chosen on packet-by-packet basis
– Different packets may follow different routes
– Packets may arrive out of order at the destination
– E.g., IP (the Internet Protocol)

• Virtual Circuit packet switching
– All packets associated with a session follow the same path
– Route is chosen at start of session
– Packets are labeled with a VC# designating the route
– The VC number must be unique on a given link but can change from link to link
  Imagine having to set up connections between 1000 nodes in a mesh
  Unique VC numbers imply 1 million VC numbers that must be represented and stored at each node
– E.g., ATM (Asynchronous Transfer Mode)

Virtual Circuits Packet Switching

• For datagrams, addressing information must uniquely distinguish each network node and session

– Need unique source and destination addresses

• For virtual circuits, only the virtual circuits on a link need be distinguished by addressing

– Global address needed to set up virtual circuit
– Once established, local virtual circuit numbers can then be used to represent the virtual circuits on a given link: VC number changes from link to link

• Merits of virtual circuits
– Save on route computation
  Need only be done once at start of session
– Save on header size
– Facilitate QoS provisioning
– More complex
– Less flexible

Node 5 table
(3,5) VC13 -> (5,8) VC3
(3,5) VC7 -> (5,8) VC4
(6,5) VC3 -> (5,8) VC7
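The Node 5 table can be read as a lookup from (incoming link, incoming VC number) to (outgoing link, outgoing VC number). The following is a minimal Python sketch of that lookup; the dictionary layout and the name `forward` are assumptions made for illustration, not a description of any real switch.

```python
# Node 5 table from the slide:
# (incoming link, incoming VC number) -> (outgoing link, outgoing VC number)
NODE5_TABLE = {
    ((3, 5), 13): ((5, 8), 3),
    ((3, 5), 7): ((5, 8), 4),
    ((6, 5), 3): ((5, 8), 7),
}


def forward(table, in_link, in_vc):
    """Return the (outgoing link, outgoing VC) for a packet arriving at the node."""
    return table[(in_link, in_vc)]


# VC numbers only need to be unique per link: VC7 on link (3,5) and
# VC7 on link (5,8) belong to different sessions.
print(forward(NODE5_TABLE, (3, 5), 7))
print(forward(NODE5_TABLE, (6, 5), 3))
```

Because the key includes the incoming link, the same VC number can be reused on different links, which is what keeps the header field small.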

Circuit vs packet switching

• Advantages of packet switching
– Efficient for bursty data
– Easy to provide bandwidth on demand with variable rates

• Disadvantages of packet switching
– Variable delays
– Difficult to provide QoS assurances (best-effort service)
– Packets can arrive out of order

Switching technique => Network service
Circuit switching => Synchronous (e.g., voice)
Packet switching => Asynchronous (e.g., data)
Virtual circuits => Connection oriented
Datagram => Connectionless

Circuit vs Packet Switching

• Can circuit switched network be used to support data traffic?

• Can packet switched network be used for connection oriented traffic (e.g., voice)?

• Need for Quality of service (QoS) mechanisms in packet networks

– Guaranteed bandwidth
– Guaranteed delays
– Guaranteed delay variations
– Packet loss rate
– Etc...

7 Layer OSI Reference Model

[Figure: the OSI layers (Application, Presentation, Session, Transport, Network, Data link control, physical interface) at two external sites, with Network, DLC, and physical interface modules at the intermediate subnet nodes. Peer modules are joined by: virtual network service, virtual session, virtual link for end to end messages, virtual link for end to end packets, virtual link for reliable packets, and virtual bit pipe, all carried over the physical link.]

Layers

• Presentation layer
– Provides character code conversion, data encryption, data compression, etc.

• Session layer
– Obtains virtual end to end message service from transport layer
– Provides directory assistance, access rights, billing functions, etc.

• Standardization has not proceeded well here, since transport to application are all in the operating system and don't really need standard interfaces

• Focus: Transport layer and lower

Transport Layer

• The network layer provides a virtual end to end packet pipe to the transport layer.

• The transport layer provides a virtual end to end message service to the higher layers.

• The functions of the transport layer are:
1) Break messages into packets and reassemble packets of size suitable to network layer
2) Multiplex sessions with same source/destination nodes
3) Resequence packets at destination
4) Recover from residual errors and failures
5) Provide end-to-end flow control

Network layer

• The network layer module accepts incoming packets from the transport layer and transit packets from the DLC layer

• It routes each packet to the proper outgoing DLC or (at the destination) to the transport layer

• Typically, the network layer adds its own header to the packets received from the transport layer. This header provides the information needed for routing (e.g., destination address)

[Figure: a transport layer module above a network layer module, which sits above DLC layer modules for links 1, 2, and 3.]

Each node contains one network layer module plus one link layer module per link

Link Layer

• Responsible for error-free transmission of packets across a single link

– Framing Determine the start and end of packets

– Error detection Determine which packets contain transmission errors

– Error correction Retransmission schemes (Automatic Repeat Request (ARQ))

Physical Layer

• Responsible for transmission of bits over a link

• Propagation delays
– Time it takes the signal to travel from the source to the destination
  Signal travels approximately at the speed of light, $C = 3 \times 10^8$ meters/second
– E.g.,
  LEO satellite: d = 1000 km => 3.3 ms prop. delay
  GEO satellite: d = 40,000 km => 1/8 sec prop. delay
  Ethernet cable: d = 1 km => 3 µs prop. delay

• Transmission errors
– Signals experience power loss due to attenuation
– Transmission is impaired by noise
– Simple channel model: Binary Symmetric Channel
  P = bit error probability
  Independent from bit to bit
– In reality channel errors are often bursty
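The propagation delay figures quoted above follow from delay = distance / speed. A small Python sketch, assuming the slide's round value of 3 x 10^8 m/s for the signal speed (the helper name `propagation_delay` is illustrative):

```python
SPEED_OF_LIGHT = 3e8  # meters/second, the rounded value used on the slide


def propagation_delay(distance_m, speed=SPEED_OF_LIGHT):
    """Time in seconds for a signal to travel distance_m meters."""
    return distance_m / speed


for name, distance_m in [
    ("LEO satellite", 1_000e3),
    ("GEO satellite", 40_000e3),
    ("Ethernet cable", 1e3),
]:
    print(f"{name}: {propagation_delay(distance_m):.3g} s")
```

The results are roughly 3.3 ms, 0.13 s (close to the slide's 1/8 s), and 3.3 µs.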

Internet Sub-layer

• A sublayer between the transport and network layers is required when various incompatible networks are joined together

• This sublayer is used at gateways between the different networks

• It looks like a transport layer to the networks being joined

• It is responsible for routing and flow control between networks, so looks like a network layer to the end-to-end transport layer

• In the internet this function is accomplished using the Internet Protocol (IP)

– Often IP is also used as the network layer protocol, hence only one protocol is needed

Internetworking with TCP/IP

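[Figure: an FTP client and an FTP server on different networks (an Ethernet and a token ring) joined by an IP router. Peer modules communicate using the FTP protocol (FTP client and FTP server), the TCP protocol (TCP), the IP protocol (IP, on each side of the router), and the Ethernet and token ring protocols (Ethernet driver and token ring driver).]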

Encapsulation

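[Figure: encapsulation down the protocol stack. The application adds an application header to the user data. TCP adds a TCP header (20 bytes) to the application data to form a TCP segment. IP adds an IP header (20 bytes) to form an IP datagram. The Ethernet driver adds an Ethernet header (14 bytes) and an Ethernet trailer (4 bytes) to form an Ethernet frame, whose payload is 46 to 1500 bytes.]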

Lecture 2

6.263/16.37

The Data Link Layer: Framing and Error Detection

Eytan Modiano MIT, LIDS

Data Link Layer (DLC)

• Responsible for reliable transmission of packets over a link

– Framing: Determine the start and end of packets (sec 2.5)

– Error Detection: Determine when a packet contains errors (sec 2.3)

– Error recovery: Retransmission of packets containing errors (sec 2.4)

DLC layer recovery

May be done at higher layer

Framing

_____________________________________ 010100111010100100101010100111000100

Where is the DATA??

• Three approaches to find frame and idle fill boundaries:

1) Character oriented framing

2) Length counts - fixed length

3) Bit oriented protocols (flags)

Character Based Framing

Frame: SYN SYN STX Header Packet ETX CRC SYN SYN

SYN is synchronous idle
STX is start of text
ETX is end of text

• Standard character codes such as ASCII and EBCDIC contain special communication characters that cannot appear in data

• Entire transmission is based on a character code

Issues With Character Based Framing

• Character code dependent

– How do you send binary data?

• Frames must be integer number of characters

• Errors in control characters are messy

NOTE: Primary Framing method from 1960 to ~1975

Length field approach (DECNET)

• Use a header field to give the length of the frame (in bits or bytes)
– Receiver can count until the end of the frame to find the start of the next frame
– Receiver looks at the respective length field in the next packet header to find that packet’s length

• Length field must be log2(Max_Size_Packet) + 1 bits long
– This restricts the packet size to be used

• Issues with length counts
– Difficult to recover from errors
– Resynchronization is needed after an error in the length count

Fixed Length Packets (e.g., ATM)

• All packets are of the same size
– In ATM networks all packets are 53 bytes

• Requires synchronization upon initialization

• Issues:
– Message lengths are not multiples of packet size
  Last packet of a message must contain idle fill (efficiency)
– Synchronization issues
– Fragmentation and re-assembly is complicated at high rates

Bit Oriented Framing (Flags)

• A flag is some fixed string of bits to indicate the start and end of a packet
– A single flag can be used to indicate both the start and the end of a packet

• In principle, any string could be used, but appearance of the flag must be prevented somehow in data
– Standard protocols use the 8-bit string 01111110 as a flag
– Use 01111111..1110 (fewer than 16 bits) as abort under error conditions
– Constant flags or 1's are considered an idle state

• Thus 0111111 is the actual bit string that must not appear in data

• Invented ~1970 by IBM for SDLC (Synchronous Data Link Control)

BIT STUFFING (Transmitter)

• Used to remove flag from original data

• A 0 is stuffed after each run of five consecutive 1's in the original frame

Original frame: 1111110111111111110111110
(four 0's are stuffed, one after each run of five consecutive 1's)

• Why is it necessary to stuff a 0 in 0111110?
– If not, then two different original strings are transmitted as the same string:
  0111110111 -> 0111110111
  011111111 -> 0111110111
– How do you differentiate at the receiver?
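The transmitter rule (stuff a 0 after every run of five consecutive 1's) can be sketched in a few lines of Python. Representing the frame as a string of '0' and '1' characters and the name `bit_stuff` are choices made here for readability, not part of any standard.

```python
def bit_stuff(bits):
    """Insert a 0 after every run of five consecutive 1's.

    `bits` is a string of '0'/'1' characters (an illustrative representation).
    """
    out = []
    ones = 0  # length of the current run of 1's
    for b in bits:
        out.append(b)
        if b == "1":
            ones += 1
            if ones == 5:
                out.append("0")  # stuffed bit
                ones = 0
        else:
            ones = 0
    return "".join(out)


original = "1111110111111111110111110"
stuffed = bit_stuff(original)
print(stuffed)
print(len(stuffed) - len(original), "stuffed bits")
```

For the 25-bit original frame shown on the slide, four bits are stuffed and the flag pattern 01111110 cannot appear in the result.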

DESTUFFING (Receiver)

• If 0 is preceded by 011111 in bit stream, remove it

• If 0 is preceded by 0111111, it is the final bit of the flag.

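Example: in the received string

1001111101100111011111011001111110

the 0's at bit positions 9 and 23 (each preceded by 011111) are stuffed bits to be removed, and the final 01111110 is the flag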
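The two receiver rules can be sketched as a single pass over the incoming bits. This is a minimal Python illustration under simplifying assumptions: the stream is a string of '0'/'1' characters, it starts at the first data bit, and abort sequences (seven or more 1's) are not handled. The name `destuff` is hypothetical.

```python
def destuff(stream):
    """Return the data bits that precede the closing flag 01111110.

    A 0 that follows exactly five 1's is a stuffed bit and is dropped;
    a 0 that follows six 1's is the final bit of the flag.
    """
    out = []
    ones = 0  # length of the current run of 1's
    for b in stream:
        if b == "1":
            ones += 1
            out.append(b)
            continue
        if ones == 6:
            # End of flag: discard the 0111111 already copied to the output.
            return "".join(out[:-7])
        if ones != 5:
            out.append(b)  # ordinary data 0 (a 0 after five 1's is dropped)
        ones = 0
    raise ValueError("no closing flag found")


print(destuff("1001111101100111011111011001111110"))
```

Applied to the example string, the two stuffed 0's and the 8-bit flag are removed, leaving 24 data bits.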

Overhead

• In general, with a flag $01^k0$, bit stuffing is required whenever $01^{k-1}$ appears in the original data stream
• For a packet of length L this will happen about $L/2^k$ times

$E\{OH\} = L/2^k + (k+2)$ bits

• For the 8-bit flag (k = 6), OH ≈ 8 + L/64
– For large packets efficiency ≈ 1 - 1/64 ≈ 98.5% (about 1.5% overhead)

• Optimal flag length
– If packets are long, want a longer flag (less stuffing)
– If packets are short, want a short flag (reduce overhead due to flag)

$k_{opt} \approx \log_2 L$
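The overhead formula can be evaluated numerically to see the trade-off between flag length and stuffing. The sketch below assumes the expression E{OH} = L/2^k + (k+2) from the slide; the function names and the search limit `k_max` are illustrative.

```python
import math


def expected_overhead_bits(packet_bits, k):
    """Expected framing overhead for a flag of the form 0, k ones, 0."""
    return packet_bits / 2**k + (k + 2)


def best_flag_parameter(packet_bits, k_max=32):
    """Value of k (1..k_max) that minimizes the expected overhead."""
    return min(range(1, k_max + 1),
               key=lambda k: expected_overhead_bits(packet_bits, k))


L = 1024
print(expected_overhead_bits(L, 6))            # standard 8-bit flag
print(best_flag_parameter(L), math.log2(L))    # numeric optimum vs. log2(L)
```

The numeric minimum lands within one of log2(L), consistent with the approximation above.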

Framing Errors

• All framing techniques are sensitive to errors

– An error in a length count field causes the frame to be terminated at the wrong point (and makes it tricky to find the beginning of the next frame)

– An error in DLE, STX, or ETX causes the same problems

– An error in a flag, or a flag created by an error, causes a frame to disappear or an extra frame to appear

• Flag approach is least sensitive to errors because a flag will eventually appear again to indicate the end of the next packet
– The only thing that happens is that an erroneous packet is created
– This erroneous packet can be removed through an error detection technique

Error detection techniques

• Used by the receiver to determine if a packet contains errors
• If a packet is found to contain errors the receiver requests the transmitter to re-send the packet

• Error detection techniques
– Parity check
  Single bit
  Horizontal and vertical redundancy check
– Cyclic redundancy check (CRC)

Effectiveness of error detection technique

• Effectiveness of a code for error detection is usually measured by three parameters:

1) Minimum distance of code (d) (min # of bit errors that can go undetected)
   The minimum distance of a code is the smallest number of errors that can map one codeword onto another. If fewer than d errors occur they will always be detected. Even d or more errors will often be detected (but not always!)

2) Burst detecting ability (B) (max burst length always detected)

3) Probability that a random bit pattern is mistaken as error free (good estimate if # errors in a frame >> d or B)
– Useful when framing is lost
– k info bits => $2^k$ valid codewords
– With r check bits the probability that a random string of length k+r maps onto one of the $2^k$ valid codewords is $2^k/2^{k+r} = 2^{-r}$

Parity check codes

k Data bits r Check bits

• Each parity check is a modulo 2 sum of some of the data bits

Example:

c1 = x1 + x2 + x3 c2 = x2 + x3 + x4 c3 = x1 + x2 + x4

Single Parity Check Code

• The check bit is 1 if frame contains odd number of 1's; otherwise it is 0

1011011 -> 1011011 1
1100110 -> 1100110 0

• Thus, encoded frame contains even number of 1's
• Receiver counts number of ones in frame
– An even number of 1’s is interpreted as no errors
– An odd number of 1’s means that an error must have occurred
  A single error (or an odd number of errors) can be detected
  An even number of errors cannot be detected
  Nothing can be corrected

• Probability of undetected error (independent errors)

$P(\text{undetected}) = \sum_{i\ \text{even},\ i \ge 2} \binom{N}{i} p^i (1-p)^{N-i}$

N = packet size, p = error prob.
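The sum over even numbers of errors is easy to evaluate directly. The Python sketch below assumes independent bit errors with probability p and counts only nonzero even error counts, since zero errors is not an undetected error; the function name is illustrative.

```python
from math import comb


def p_undetected_single_parity(n_bits, p):
    """Probability that a frame of n_bits (including the check bit) has a
    nonzero even number of bit errors, which a single parity check misses."""
    return sum(
        comb(n_bits, i) * p**i * (1 - p) ** (n_bits - i)
        for i in range(2, n_bits + 1, 2)
    )


print(p_undetected_single_parity(8, 0.01))
```

For small p the result is dominated by the two-error term, roughly C(N,2) p^2.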

Horizontal and Vertical Parity

                Horizontal
                checks
1 0 0 1 0 1 0     1
0 1 1 1 0 1 0     0
1 1 1 0 0 0 1     0
1 0 0 0 1 1 1     0
0 0 1 1 0 0 1     1

1 0 1 1 1 1 1     0     Vertical checks

• The data is viewed as a rectangular array (i.e., a sequence of words)

• Minimum distance = 4; any 4 errors in a rectangular configuration are undetectable
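The array on this slide can be regenerated from its 5 x 7 block of data bits, and the rectangular four-error pattern can be checked directly. This is a minimal Python sketch; the list-of-lists layout and the function names are assumptions made for illustration.

```python
def add_2d_parity(rows):
    """Append a horizontal check bit to each row, then a row of vertical checks."""
    with_row_parity = [r + [sum(r) % 2] for r in rows]
    column_parity = [sum(col) % 2 for col in zip(*with_row_parity)]
    return with_row_parity + [column_parity]


def passes_2d_parity(array):
    """True if every row and every column has an even number of 1's."""
    rows_ok = all(sum(r) % 2 == 0 for r in array)
    cols_ok = all(sum(c) % 2 == 0 for c in zip(*array))
    return rows_ok and cols_ok


data = [
    [1, 0, 0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [1, 1, 1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 1],
]
encoded = add_2d_parity(data)

# Flip four bits at the corners of a rectangle: rows 0 and 2, columns 1 and 4.
corrupted = [row[:] for row in encoded]
for r in (0, 2):
    for c in (1, 4):
        corrupted[r][c] ^= 1
print(passes_2d_parity(encoded), passes_2d_parity(corrupted))
```

Each affected row and column sees exactly two flips, so all parities still check and the error goes undetected, while any single flipped bit is caught.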

Cyclic Redundancy Checks (CRC)

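[Figure: codeword T formed from the k data bits M followed by the r check bits R.]

M = info bits (k data bits)
R = check bits (r check bits)
T = codeword

$T = M \cdot 2^r + R$

• A CRC is implemented using a feedback shift register

[Figure: feedback shift register with "Bits in" and "Bits out".]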

Cyclic redundancy checks

$T = M \cdot 2^r + R$

• How do we compute R (the check bits)?
– Choose a generator string G of length r+1 bits
– Choose R such that T is a multiple of G (T = A*G, for some A)
– Now when T is divided by G there will be no remainder => no errors
– All done using mod 2 arithmetic

$T = M \cdot 2^r + R = A \cdot G \Rightarrow M \cdot 2^r = A \cdot G + R$ (mod 2 arithmetic)

Let R = remainder of $M \cdot 2^r / G$ and T will be a multiple of G

• Choice of G is a critical parameter for the performance of a CRC

Example

r = 3, G = 1001
M = 110101 => $M \cdot 2^r$ = 110101000

Modulo 2 division of 110101000 by 1001:
quotient = 110011
remainder = 011 = R (3 bits)

Checking for errors

• Let T’ be the received sequence
• Divide T’ by G
– If remainder = 0 assume no errors
– If remainder is nonzero errors must have occurred
  No way of knowing how many errors occurred or which bits are in error

Example: G = 1001
Send T = 110101011
Receive T’ = 110101011 (no errors)
Modulo 2 division of 110101011 by 1001 leaves remainder 000 => No errors
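The modulo 2 long division used to compute and to check the CRC can be sketched in Python as repeated XOR of the generator against the working bits. Bit strings are used here for readability, and the names `mod2_remainder` and `crc_encode` are illustrative; real implementations typically use a shift register or table lookup.

```python
def mod2_remainder(bits, generator):
    """Remainder (r bits, as a string) of bits divided by generator, modulo 2."""
    r = len(generator) - 1
    work = [int(b) for b in bits]
    gen = [int(b) for b in generator]
    for i in range(len(work) - r):
        if work[i]:  # leading bit is 1: subtract (XOR) the generator here
            for j, g in enumerate(gen):
                work[i + j] ^= g
    return "".join(str(b) for b in work[-r:])


def crc_encode(message, generator):
    """Return T = M * 2^r + R, i.e. the message followed by its check bits."""
    r = len(generator) - 1
    return message + mod2_remainder(message + "0" * r, generator)


codeword = crc_encode("110101", "1001")
print(codeword)                              # message plus 3 check bits
print(mod2_remainder(codeword, "1001"))      # receiver check on T
print(mod2_remainder("110101111", "1001"))   # one bit flipped in transit
```

With M = 110101 and G = 1001 this reproduces the check bits 011 from the worked example, and the receiver's division of the unmodified codeword leaves a zero remainder.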

Mod 2 division as polynomial division

Implementing a CRC

Performance of CRC

• For r check bits per frame and a frame length less than $2^{r-1}$, the following can be detected
1) All patterns of 1, 2, or 3 errors (d > 3)
2) All bursts of errors of r or fewer bits
3) Random large numbers of errors with prob. $1 - 2^{-r}$

• Standard DLC's use a CRC with r=16 with option of r=32
– CRC-16, $G = X^{16} + X^{15} + X^2 + 1$ = 11000000000000101

Physical Layer Error Characteristics

• Most Physical Layers ( communications channels) are not well described by a simple BER parameter

• Most physical error processes tend to create a mix of random & bursts of errors

• A channel with a BER of $10^{-7}$ and an average burst size of 1000 bits is very different from one with independent random errors

• Example: For an average frame length of $10^4$ bits
– random channel: E[Frame error rate] ~ $10^{-3}$
– burst channel: E[Frame error rate] ~ $10^{-6}$
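One way to see where the two frame error rates come from is to count error events per frame: with bursts averaging 1000 bit errors each, the same BER corresponds to 1000 times fewer error events. The Python sketch below uses that rough model, which is an assumption made here (valid only when the result is much smaller than 1), not a formula stated on the slide.

```python
def frame_error_rate(ber, frame_bits, mean_burst_bits=1):
    """Rough frame error rate: expected number of error events per frame.

    Assumes error events (single errors or bursts) start independently at a
    rate of ber / mean_burst_bits per bit, and that the result is << 1.
    """
    return frame_bits * ber / mean_burst_bits


print(frame_error_rate(1e-7, 10**4))                        # random errors
print(frame_error_rate(1e-7, 10**4, mean_burst_bits=1000))  # bursty errors
```

Under this model the random channel gives about 10^-3 and the burst channel about 10^-6, matching the example.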

• Best to characterize a channel by its Frame Error Rate

• This is a difficult problem for real systems

Lectures 3 & 4

6.263/16.37

The Data Link Layer: ARQ Protocols

Eytan Modiano

Automatic Repeat ReQuest (ARQ)

• When the receiver detects errors in a packet, how does it let the transmitter know to re-send the corresponding packet?

• Systems which automatically request the retransmission of missing packets or packets with errors are called ARQ systems.

• Three common schemes – Stop & Wait – Go Back N – Selective Repeat


Pure Stop and Wait Protocol

[Figure: timing diagram of transmitter departure times at A (packet 0, packet 1, packet 1, each followed by its CRC) and arrival times at the receiver, which accepts packet 0 and then packet 1.]

• Problem: Lost Packets
– Sender will wait forever for an acknowledgement

• Packet may be lost due to framing errors

• Solution: Use time-out (TO)
– Sender retransmits the packet after a timeout

The Use Of Timeouts For Lost Packets Requires Sequence Numbers

[Figure: timing diagram. A sends packet 0, times out, and sends packet 0 again. The receiver accepts the first copy as packet 0 but cannot tell whether the second frame is packet 0 or packet 1.]

• Problem: Unless packets are numbered the receiver cannot tell which packet it received

• Solution: Use packet numbers (sequence numbers)

Request Numbers Are Required On ACKs To Distinguish Packet ACKed

[Figure: timing diagram. A sends packet 0, times out and sends packet 0 again, then sends packet 1. B accepts packet 0 and returns ACKs; without a number on each ACK, A cannot tell which packet is being acknowledged.]

• REQUEST NUMBERS:
– Instead of sending "ack" or "nak", the receiver sends the number of the packet currently awaited.
– Sequence numbers and request numbers can be sent modulo 2.

This works correctly assuming that
1) Frames travel in order (FCFS) on links
2) The CRC never fails to detect errors
3) The system is correctly initialized.

Stop and Wait Protocol: Algorithm at sender (node A)

(with initial condition SN=0)

1) Accept packet from higher layer when available; assign number SN to it

2) Transmit packet SN in frame with sequence # SN

3) Wait for an error free frame from B

i. if received and it contains RN>SN in the request # field, set SN to RN and go to 1

ii. if not received within given time, go to 2


Stop and Wait: Algorithm at receiver (node B)

(with initial condition RN=0)

1) Whenever an error-free frame is received from A with a sequence # equal to RN, release received packet to higher layer and increment RN.

2) At arbitrary times, but within bounded delay after receiving any error free frame from A, transmit a frame to A containing RN in the request # field.
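The sender and receiver algorithms can be exercised together in a small simulation. The Python sketch below makes several simplifying assumptions that are not in the lecture: time advances in rounds, each frame or feedback frame is lost independently with probability `loss_prob` (a corrupted frame is treated the same as a lost one), B replies once per frame it receives, and a missing reply is treated as a timeout.

```python
import random


def stop_and_wait(packets, loss_prob=0.3, seed=0):
    """Simulate stop and wait with integer SN and RN over a lossy link.

    Returns (packets released at B, number of frames A transmitted).
    """
    rng = random.Random(seed)
    sn = 0  # sender: number of the packet currently being sent
    rn = 0  # receiver: number of the packet currently awaited
    delivered = []
    transmissions = 0
    while sn < len(packets):
        # Sender step 2: transmit packet SN in a frame with sequence number SN.
        transmissions += 1
        if rng.random() >= loss_prob:  # frame reaches B error free
            # Receiver step 1: release only the awaited packet, then increment RN.
            if sn == rn:
                delivered.append(packets[sn])
                rn += 1
            # Receiver step 2: send RN back; the feedback frame may be lost too.
            if rng.random() >= loss_prob and rn > sn:
                sn = rn  # Sender step 3(i): move on to the next packet
        # Otherwise nothing usable came back: timeout, retransmit (step 3(ii)).
    return delivered, transmissions


released, sent = stop_and_wait(list("abcde"))
print(released, sent)
```

Whatever the loss pattern, the packets are released in order and exactly once, at the cost of extra transmissions; this is the safety property argued in the slides that follow.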


Correctness of stop & wait with integer SN, RN

• Assume, for A to (from) B transmission, that

– All errors are detected as errors
– Initially no frames are on link, SN=0, RN=0
– Frames may be arbitrarily delayed or lost
– Each frame is correctly received with at least some probability q>0.

• Split proof of correctness into two parts:

– SAFETY: show that no packet is ever released out of order or more than once

– LIVENESS: show that every packet is eventually released


Safety

• No frames on link initially, packet 0 is first packet accepted at A, it is the only packet assigned SN=0, and must be the packet released by B if B ever releases a packet

• Subsequently (using induction) if B has released packets up to and including n-1, then RN is updated to n when n-1 is released, and only n can be released next


LIVENESS

[Figure: timing diagram between Node A and Node B showing packet i sent by A and released ("Packets out") at B.]

t1 = time at which A first starts to transmit packet i

t2 = time at which B correctly receives & releases i, and increases RN to i+1

t3 = time at which SN is increased to i+1

Will prove that t1 < t2 < t3 < ∞ => Liveness

Liveness Argument

• Let SN(t), RN(t) be values of SN and RN at time t

From the algorithm,

(1) SN(t) and RN(t) are nondecreasing in t and SN(t) ≤ RN(t) for all t
(2) From safety (since i has not been sent before t1), RN(t1) ≤ i and SN(t1) = i

• From (1) and (2), RN(t1) = SN(t1) = i • RN is incremented at t2 and SN at t3, so t2 < t3

• A transmits i repeatedly up to t3, and thus to t2 when it is correctly received. Since q>0, t2 is finite

• B transmits RN=i+1 repeatedly until correctly received at t3, and q>0 implies that t3 is finite.


Correctness of Stop & Wait with binary (finite) SN, RN

• Assume that frames travel on link in order

Note that with integer SN, RN, either
SN = RN (from t1 to t2) (3)
or
SN = RN-1 (from t2 to t3) (4)

Since frames travel in order, the sequence numbers arriving at B and the request numbers arriving at A are increasing, so a single bit can resolve the ambiguity between (3) and (4)

– RN = 0 and SN = 1 or RN =1 and SN = 0 => received packet is an old packet

– RN = 0 and SN = 0 or RN = 1 and SN = 1 => received packet is new


Efficiency of stop and wait

Let S = total time between the transmission of a packet and reception of its ACK

DTP = packet transmission time
DP = propagation delay
DTA = ACK transmission time

[Figure: timeline of one cycle: packet transmission (DTP), propagation (DP), ACK transmission (DTA), propagation (DP).]

S = DTP + 2DP + DTA

Efficiency (no errors): E = DTP/S = DTP/(DTP + 2DP + DTA)

Stop and wait in the presence of errors

Let P = the probability of an error in the transmission of a packet or in its acknowledgment

S = DTP + 2DP + DTA

TO = the timeout interval
X = the amount of time that it takes to transmit a packet and receive its ACK. This time accounts for retransmissions due to errors

E[X] = S + TO*P/(1-P), Efficiency = DTP/E[X]

Where,
TO = DTP in a full duplex system
TO = S in a half duplex system
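The efficiency expressions can be evaluated with a short Python sketch. The function name and parameter names are illustrative, and the default timeout of TO = S (the half duplex case above) is a choice made here for the example.

```python
def stop_and_wait_efficiency(d_tp, d_p, d_ta, p_error=0.0, timeout=None):
    """Efficiency DTP / E[X], with E[X] = S + TO * P / (1 - P).

    d_tp: packet transmission time, d_p: propagation delay,
    d_ta: ACK transmission time, p_error: P, timeout: TO (defaults to S).
    """
    s = d_tp + 2 * d_p + d_ta
    if timeout is None:
        timeout = s  # half duplex assumption
    expected_time = s + timeout * p_error / (1 - p_error)
    return d_tp / expected_time


print(stop_and_wait_efficiency(1.0, 1.0, 1.0))               # no errors
print(stop_and_wait_efficiency(1.0, 1.0, 1.0, p_error=0.5))  # P = 0.5, TO = S
```

With DTP = DP = DTA = 1 the error-free efficiency is 1/4, and it halves when P = 0.5 and TO = S.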


Go Back N ARQ (Sliding Window)

• Stop and Wait is inefficient when propagation delay is larger than the packet transmission time
– Can only send one packet per round-trip time

• Go Back N allows the transmission of new packets before earlier ones are acknowledged

• Go Back N uses a window mechanism where the sender can send packets that are within a “window” (range) of packets
– The window advances as acknowledgements for earlier packets are received

[Figure: packets PKT-0 through PKT-9 and returning acknowledgements ACK-0 through ACK-8, with the sender's window marked.]

Features of Go Back N

• Window size = N
– Sender cannot send packet i+N until it has received the ACK for packet i

• Receiver operates just like in Stop and Wait
– Receive packets in order
– Receiver cannot accept packet out of sequence
– Send RN = i + 1 => ACK for all packets up to and including i

• Use of piggybacking
– When traffic is bi-directional, RN’s are piggybacked on packets going in the other direction
  Each packet contains a SN field indicating that packet’s sequence number and a RN field acknowledging packets in the other direction

Frame format: SN | RN | Packet | CRC (SN and RN form the frame header)

Go Back N ARQ

• The transmitter has a "window" of N packets that can be sent without acknowledgements

• This window ranges from the last value of RN obtained from the receiver (denoted SNmin) to SNmin+N-1

• When the transmitter reaches the end of its window, or times out, it goes back and retransmits packet SNmin

Let SNmin be the smallest number packet not yet ACKed

Let SNmax be the number of the next packet to be accepted from the higher layer (I.e., the next new packet to be transmitted)


Go Back N Sender Rules

• SNmin = 0; SNmax = 0
• Repeat
– If SNmax < SNmin + N (entire window not yet sent)
  Send packet SNmax; SNmax = SNmax + 1;
– If packet arrives from receiver with RN > SNmin
  SNmin = RN;
– If SNmin < SNmax (there are still some unacknowledged packets) and sender cannot send any new packets
  Choose some packet between SNmin and SNmax and re-send it

• The last rule says that when you cannot send any new packets you should re-send an old (not yet ACKed) packet
– There may be two reasons for not being able to send a new packet
  Nothing new from higher layer
  Window expired (SNmax = SNmin + N)
– No set rule on which packet to re-send
  Least recently sent

Receiver Rules

• RN = 0;
• Repeat
– When a good packet arrives, if SN = RN
  Accept packet
  Increment RN = RN + 1

• At regular intervals send an ACK packet with RN
– Most DLCs send an ACK whenever they receive a packet from the other direction
  Delayed ACK for piggybacking

• Receiver rejects all packets with SN not equal to RN
– However, those packets may still contain useful RN numbers (see homework assignment)
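The sender and receiver rules can be sketched as two small Python classes. This is an illustration under stated assumptions rather than a full protocol: sequence numbers are plain integers (no modulus), there are no timers, and the class and method names are hypothetical. The demo loses packet 1 once and then goes back to resend everything from SNmin.

```python
class GoBackNSender:
    """Window bookkeeping for the sender rules (integer sequence numbers)."""

    def __init__(self, window_size):
        self.n = window_size
        self.sn_min = 0  # smallest-numbered packet not yet ACKed
        self.sn_max = 0  # number of the next new packet to send

    def can_send_new(self):
        # First rule: the entire window has not been sent yet.
        return self.sn_max < self.sn_min + self.n

    def send_new(self):
        if not self.can_send_new():
            raise RuntimeError("window is full")
        sn = self.sn_max
        self.sn_max += 1
        return sn

    def on_request_number(self, rn):
        # Second rule: RN acknowledges every packet numbered below RN.
        if rn > self.sn_min:
            self.sn_min = rn

    def unacked(self):
        # Third rule: candidates for retransmission.
        return list(range(self.sn_min, self.sn_max))


class GoBackNReceiver:
    """Accepts packets only in order and reports the next awaited number."""

    def __init__(self):
        self.rn = 0
        self.delivered = []

    def on_frame(self, sn):
        if sn == self.rn:
            self.delivered.append(sn)
            self.rn += 1
        return self.rn  # request number to send back


sender, receiver = GoBackNSender(window_size=4), GoBackNReceiver()
first_pass = [sender.send_new() for _ in range(4)]  # packets 0..3
for sn in first_pass:
    if sn != 1:  # pretend packet 1 is lost on the link
        rn = receiver.on_frame(sn)
sender.on_request_number(rn)  # feedback carries RN = 1
for sn in sender.unacked():  # go back: resend 1, 2, 3
    rn = receiver.on_frame(sn)
sender.on_request_number(rn)
print(receiver.delivered, sender.sn_min, sender.sn_max)
```

Packets 2 and 3 arrive correctly the first time but are rejected because the receiver is still waiting for packet 1, which is why the whole window is resent.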


Example of Go Back 7 ARQ

[Figure: timing diagram between Node A and Node B for Go Back 7. Sender windows: (0,6), (1,7), (2,8), (3,9), (5,11). Request numbers returned by B: RN = 0, 1, 2, 3, 5. Packets delivered at B: 0 1 2 3 4 5.]

• Note that packet RN-1 must be accepted at B before a frame containing request RN can start transmission at B

RETRANSMISSION BECAUSE OF ERRORS FOR GO BACK 4 ARQ

[Figure: timing diagram between Node A and Node B for Go Back 4 with a transmission error. Sender windows: (0,3), (1,4), (2,5). Request numbers returned by B: RN = 0, 1, 1, 1, 1, 2. Packets delivered at B: 0 1 2 3.]

• Note that the timeout value here is taken to be the time to send a full window of packets

• Note that the entire window has to be retransmitted after an error

RETRANSMISSION DUE TO FEEDBACK ERRORS FOR GO BACK 4 ARQ

[Figure: timing diagram between Node A and Node B for Go Back 4 with errors on the feedback (B to A) frames. Sender windows: (0,3), (2,5), (4,7), (5,8). Packets delivered at B: 0 1 2 3 4.]

• When an error occurs in the reverse direction the ACK may still arrive in time. This is the case here where the packet from B to A with RN=2 arrives in time to prevent retransmission of packet 0

• Packet 2 is retransmitted because RN = 4 did not arrive in time, however it did arrive in time to prevent retransmission of packet 3
– Was retransmission of packets 4 and 5 really necessary?
  Strictly no, because the window allows transmission of packets 6 and 7 before further retransmissions. However, this is implementation dependent

EFFECT OF LONG FRAMES

[Figure: timing diagram between Node A and Node B for Go Back 4 with long frames in the feedback direction. Sender windows: (0,3), (1,4), (3,6), (4,7). Request numbers returned by B: RN = 0, 1, 3, 4.]

• Long frames in feedback direction slow down the ACKs
– This causes a transmitter with short frames to wait or go back

• Notice again that the retransmission of packets 3 and 4 was not strictly required because the sender could have sent new packets within the window
– Again, this is implementation dependent

Efficiency of Go Back N

[Figure: timeline of one cycle: packet transmission (DTP), propagation (DP), ACK transmission (DTA), with further packets sent back to back while waiting for the first ACK.]

S = DTP + 2DP + DTA

• We want to choose N large enough to allow continuous transmission while waiting for an ACK for the first packet of the window,

N > S/DTP

• Without errors the efficiency of Go Back N is,

E = min{1, N*DTP/S}

Efficiency of Go Back N with transmission errors: Approximate analysis

Assume: N =

TO = N*DTP

• When an error occurs the entire window of N packets must be retransmitted

Let X = the number of packets sent per successful transmission

E[X] = 1*(1-P) + (E[X]+N)*P

= 1 + N*P/(1-P)

Efficiency = 1/E[X]
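Both Go Back N efficiency expressions can be checked numerically. The Python sketch below uses illustrative function names; the second function relies on the approximate analysis above, in which each error costs a full window of N packets and N is large enough for continuous transmission.

```python
def gbn_efficiency_no_errors(n, d_tp, s):
    """E = min{1, N * DTP / S}: fraction of time spent sending new packets."""
    return min(1.0, n * d_tp / s)


def gbn_expected_packets(n, p):
    """E[X] = 1 + N * P / (1 - P): packets sent per successful transmission."""
    return 1 + n * p / (1 - p)


def gbn_efficiency_with_errors(n, p):
    """Efficiency = 1 / E[X] under the approximate analysis."""
    return 1 / gbn_expected_packets(n, p)


print(gbn_efficiency_no_errors(n=4, d_tp=1.0, s=8.0))  # window too small
print(gbn_efficiency_no_errors(n=10, d_tp=1.0, s=8.0))  # window large enough
print(gbn_efficiency_with_errors(n=4, p=0.2))
```

The closed form for E[X] can also be confirmed by substituting it back into the recursion E[X] = (1-P) + (E[X]+N)P.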


Go Back N Requirements

• Go Back N is guaranteed to work correctly, independent of the detailed choice of which packets to repeat, if

1) System is correctly initialized
2) No failures in detecting errors
3) Packets travel in FCFS order
4) Positive probability of correct reception
5) Transmitter occasionally resends SNmin (e.g., upon timeout)
6) Receiver occasionally sends RN

Notes on Go Back N

• Requires no buffering of packets at the receiver
• Sender must buffer up to N packets while waiting for their ACK
• Sender must re-send entire window in the event of an error
• Packets can be numbered modulo M where M > N
– Because at most N packets can be sent simultaneously
• Receiver can only accept packets in order
– Receiver must deliver packets in order to higher layer
– Cannot accept packet i+1 before packet i
– This removes the need for buffering
– This introduces the need to re-send the entire window upon error

• The major problem with Go Back N is this need to re-send the entire window when an error occurs. This is due to the fact that the receiver can only accept packets in order

Selective Repeat Protocol (SRP)

• Selective Repeat attempts to retransmit only those packets that are actually lost (due to errors)
– Receiver must be able to accept packets out of order
– Since receiver must release packets to higher layer in order, the receiver must be able to buffer some packets

• Retransmission requests
– Implicit: the receiver acknowledges every good packet; packets that are not ACKed before a time-out are assumed lost or in error. Notice that this approach must be used to be sure that every packet is eventually received
– Explicit: an explicit NAK (selective reject) can request retransmission of just one packet. This approach can expedite the retransmission but is not strictly needed
– One or both approaches are used in practice

SRP Rules

• Window protocol just like Go Back N
– Window size W
• Packets are numbered Mod M where M >= 2W
• Sender can transmit new packets as long as their number is within W of all un-ACKed packets
• Sender retransmits un-ACKed packets after a timeout
– Or upon a NAK if NAK is employed
• Receiver ACKs all correct packets
• Receiver stores correct packets until they can be delivered in order to the higher layer

Need for buffering

• Sender must buffer all packets until they are ACKed
– Up to W un-ACKed packets are possible

• Receiver must buffer packets until they can be delivered in order
– I.e., until all lower numbered packets have been received
– Needed for orderly delivery of packets to the higher layer
– Up to W packets may have to be buffered (in the event that the first packet of a window is lost)

• Implication of buffer size = W
– Number of un-ACKed packets at sender <= W (buffer limit at sender)
– Sequence numbers of un-ACKed packets at sender cannot differ by more than W (buffer limit at the receiver, which needs to deliver packets in order)
– Packets must be numbered modulo M >= 2W (using log2(M) bits)

EFFICIENCY

• For ideal SRP, only packets containing errors will be retransmitted
– Ideal is not realistic because sometimes packets may have to be retransmitted because their window expired. However, if the window size is set to be much larger than the timeout value then this is unlikely

• With ideal SRP, efficiency = 1 - P
– P = probability of a packet error

• Notice the difference with Go Back N where

efficiency (Go Back N) = 1/(1 + N*P/(1-P))

• When the window size is small performance is about the same, however with a large window SRP is much better
– As transmission rates increase we need larger windows and hence the increased use of SRP
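The two efficiency expressions can be compared numerically. The sketch below simply evaluates the formulas from the slides; the function names and the sample values of N and P are illustrative assumptions.

```python
def srp_efficiency(p):
    """Ideal Selective Repeat: only errored packets are resent, so efficiency = 1 - P."""
    return 1.0 - p


def gbn_efficiency(n, p):
    """Go Back N (approximate analysis): efficiency = 1 / (1 + N*P/(1-P))."""
    return 1.0 / (1.0 + n * p / (1.0 - p))


# Hypothetical packet error probability of 1%
p = 0.01
for n in (1, 10, 100):
    print(n, round(gbn_efficiency(n, p), 4), round(srp_efficiency(p), 4))
```

For N = 1 the two expressions coincide; as N grows, each error costs Go Back N a whole window while ideal SRP still loses only the errored packet.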


Why are packets numbered Modulo 2W?

• Let's consider the range of packets that may follow packet i at the receiver

[Figure: two receiver timelines, one in which packet i is followed by packet i-W+1 and one in which packet i is followed by packet i+W]

– Packet i may be followed by the first packet of the window (i-W+1) if it requires retransmission
– Packet i may be followed by the last packet of the window (i+W) if all of the ACKs between i and i+W are lost

• Receiver must differentiate between packets i-W+1 ... i+W
– These 2W packets can be differentiated using Mod 2W numbering
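The counting argument can be checked directly: the 2W candidate packets i-W+1 ... i+W receive distinct labels exactly when the modulus is at least 2W. The helper below is a hypothetical illustration, not part of any protocol implementation.

```python
def labels_are_distinct(i, w, modulus):
    """True if packets i-W+1 .. i+W all get different sequence numbers mod `modulus`."""
    candidates = range(i - w + 1, i + w + 1)      # the 2W packets that may follow packet i
    labels = [k % modulus for k in candidates]
    return len(set(labels)) == len(labels)


# Window W = 4: modulus 8 = 2W is enough, modulus 7 is ambiguous
print(labels_are_distinct(10, 4, 8))
print(labels_are_distinct(10, 4, 7))
```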


STANDARD DLC's

• HDLC, LAPB (X.25), and SDLC are almost the same
– HDLC/SDLC developed by IBM for IBM SNA networks
– LAPB developed for X.25 networks

• They all use bit-oriented framing with flag = 01111110
• They all use a 16-bit CRC for error detection
• They all use Go Back N ARQ with N = 7 or 127 (optional)

SDLC packet: flag | address | control | data | CRC | flag
– address: multipoint communication
– control: SN, RN

• Older protocols (used for modems, e.g., xmodem) used stop and wait and simple checksums

Optimal packet size based on pipelining effect

[Figure: source and destination connected by a path of M links]

• Packet must be completely received before being forwarded to next node
• Delay for sending N packets over M links (pipelining delay)

D = N*DTP + (M-1)*DTP

• Each packet contains K bits of data and a header of size H bits
– CRC, flags, SN's, etc.
– Total packet size K+H bits

• In order to transmit a message of L bits we need L/K packets
• Time to transmit message over M links (R = data rate),

$$D = \frac{L}{K}\cdot\frac{K+H}{R} + (M-1)\,\frac{K+H}{R}$$

Optimal packet size

$$D = \underbrace{\frac{L}{K}\cdot\frac{K+H}{R}}_{\text{transmission delay}} + \underbrace{(M-1)\,\frac{K+H}{R}}_{\text{pipelining delay}}$$

• Small packets reduce the pipelining delay but increase the transmission delay due to additional headers

• Large packets reduce header overhead but increase the pipelining delay

• Optimal packet size,

$$K_{opt} = \sqrt{\frac{LH}{M-1}}$$

• Approach may be appropriate for high-speed multi-hop networks

• Alternative approach may optimize the packet size to minimize link layer retransmissions due to errors

– Large packets are more likely to contain transmission errors
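The trade-off between header overhead and pipelining delay can be explored numerically. The sketch below evaluates D = (L/K)(K+H)/R + (M-1)(K+H)/R and the optimum K = sqrt(LH/(M-1)); the function names and the message, header, and link parameters are assumptions made up for the example.

```python
import math


def message_delay(l_bits, k_bits, h_bits, m_links, rate):
    """Delay to send an L-bit message in K-bit payloads with H-bit headers over M links."""
    packet_time = (k_bits + h_bits) / rate
    return (l_bits / k_bits) * packet_time + (m_links - 1) * packet_time


def optimal_payload(l_bits, h_bits, m_links):
    """Payload size that minimizes message_delay (requires M > 1)."""
    return math.sqrt(l_bits * h_bits / (m_links - 1))


# Hypothetical numbers: 10,000-bit message, 100-bit header, 5 links, 1 Mbit/s
k_opt = optimal_payload(10_000, 100, 5)
for k in (400, k_opt, 600):
    print(round(k), message_delay(10_000, k, 100, 5, 1e6))
```

Setting dD/dK = 0 in the delay expression gives the square-root formula, so the delay printed for the middle value is the smallest of the three.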

Lectures 5 & 6

6.263/16.37

Introduction to Queueing Theory

Packet Switched Networks

[Figure: packet network of packet switches (PS), each with a buffer]

Messages are broken into packets that are routed to their destination

Queueing Systems

• Used for analyzing network performance

• In packet networks, events are random
– Random packet arrivals
– Random packet lengths

• While at the physical layer we were concerned with bit-error-rate, at the network layer we care about delays
– How long does a packet spend waiting in buffers?
– How large are the buffers?

• In circuit switched networks we want to know the call blocking probability
– How many circuits do we need to limit the blocking probability?

Random events

• Arrival process
– Packets arrive according to a random process
– Typically the arrival process is modeled as Poisson

• The Poisson process
– Arrival rate of λ packets per second
– Over a small interval δ,

P(exactly one arrival) = λδ + o(δ)
P(0 arrivals) = 1 - λδ + o(δ)
P(more than one arrival) = o(δ)

where o(δ)/δ → 0 as δ → 0.

– It can be shown that:

$$P(n \text{ arrivals in interval } T) = \frac{(\lambda T)^n e^{-\lambda T}}{n!}$$

The Poisson Process

$$P(n \text{ arrivals in interval } T) = \frac{(\lambda T)^n e^{-\lambda T}}{n!}$$

n = number of arrivals in T

It can be shown that,

$$E[n] = \lambda T$$

$$E[n^2] = \lambda T + (\lambda T)^2$$

$$\sigma^2 = E[(n - E[n])^2] = E[n^2] - E[n]^2 = \lambda T$$
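The stated moments can be verified numerically by summing over the Poisson probabilities. This is a rough check under the assumption that truncating the sum at a few hundred terms is adequate for the chosen λT; the function name and parameters are illustrative.

```python
import math


def poisson_moments(lam, t, n_max=200):
    """Return (E[n], E[n^2], variance) for the number of arrivals in an interval of length t."""
    a = lam * t
    p = math.exp(-a)          # P(0 arrivals)
    mean = 0.0
    second = 0.0
    for n in range(1, n_max + 1):
        p *= a / n            # P(n) = P(n-1) * (lam*t) / n
        mean += n * p
        second += n * n * p
    return mean, second, second - mean ** 2


# Hypothetical rate of 2 arrivals/sec observed for 3 sec, so lam*t = 6
print(poisson_moments(2.0, 3.0))
```

The probabilities are built up iteratively rather than from (λT)^n/n! directly, which avoids very large intermediate values.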

Inter-arrival times

• Time that elapses between arrivals (IA)

$$P(IA \le t) = 1 - P(IA > t) = 1 - P(0 \text{ arrivals in time } t) = 1 - e^{-\lambda t}$$

• This is known as the exponential distribution
– Inter-arrival CDF: $F_{IA}(t) = 1 - e^{-\lambda t}$
– Inter-arrival PDF: $\frac{d}{dt}F_{IA}(t) = \lambda e^{-\lambda t}$

• The exponential distribution is often used to model the service times (i.e., the packet length distribution)

Markov property (Memoryless)

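$$P(T \le t_0 + t \mid T > t_0) = P(T \le t)$$

Proof:

$$P(T \le t_0 + t \mid T > t_0) = \frac{P(t_0 < T \le t_0 + t)}{P(T > t_0)} = \frac{\int_{t_0}^{t_0+t} \lambda e^{-\lambda\tau}\,d\tau}{\int_{t_0}^{\infty} \lambda e^{-\lambda\tau}\,d\tau} = \frac{-e^{-\lambda\tau}\Big|_{t_0}^{t_0+t}}{-e^{-\lambda\tau}\Big|_{t_0}^{\infty}}$$

$$= \frac{-e^{-\lambda(t+t_0)} + e^{-\lambda t_0}}{e^{-\lambda t_0}} = 1 - e^{-\lambda t} = P(T \le t)$$

• Previous history does not help in predicting the future!

• Distribution of the time until the next arrival is independent of when the last arrival occurred!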

Example

• Suppose a train arrives at a station according to a Poisson process with average inter-arrival time of 20 minutes

• When a customer arrives at the station the average amount of time until the next arrival is 20 minutes

– Regardless of when the previous train arrived

• The average amount of time since the last departure is 20 minutes!

• Paradox: If an average of 20 minutes passed since the last train arrived and an average of 20 minutes until the next train, then an average of 40 minutes will elapse between trains

– But we assumed an average inter-arrival time of 20 minutes!
– What happened?
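One way to explore the paradox is by simulation. The sketch below generates train arrivals with exponential gaps of mean 20 minutes, drops customers at uniformly random times, and measures the gap each customer lands in. The standard explanation (length-biased sampling: a random instant is more likely to fall in a long gap) is a known result, but the simulation parameters and function name here are assumptions for illustration.

```python
import bisect
import random


def gaps_seen(mean_gap=20.0, n_gaps=200_000, n_observers=20_000, seed=1):
    """Return (average gap between trains, average gap containing a random observer)."""
    rng = random.Random(seed)
    arrivals = []
    t = 0.0
    for _ in range(n_gaps):
        t += rng.expovariate(1.0 / mean_gap)   # exponential inter-arrival time
        arrivals.append(t)
    average_gap = (arrivals[-1] - arrivals[0]) / (n_gaps - 1)

    total = 0.0
    for _ in range(n_observers):
        u = rng.uniform(arrivals[0], arrivals[-1])
        # index of the first train arriving after the observer (clamped to the last train)
        k = min(bisect.bisect_right(arrivals, u), n_gaps - 1)
        total += arrivals[k] - arrivals[k - 1]
    return average_gap, total / n_observers


print(gaps_seen())
```

The gaps themselves average about 20 minutes, while the gap that a randomly arriving customer falls into averages about 40 minutes, so both statements on the slide hold without contradiction.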

Properties of the Poisson process

• Merging property
– Let A1, A2, …, Ak be independent Poisson processes of rates λ1, λ2, …, λk
– A = ∑Ai is also Poisson, of rate ∑λi

• Splitting property
– Suppose that every arrival is randomly routed with probability P to stream 1 and (1-P) to stream 2
– Streams 1 and 2 are Poisson of rates Pλ and (1-P)λ respectively

[Figures: streams of rates λ1, λ2, …, λk merging into one stream of rate ∑λi; one stream of rate λ splitting into streams of rates λP and λ(1-P)]

Queueing Models

[Figure: customers arriving to a queue/buffer followed by a server]

• Model for
– Customers waiting in line
– Assembly line
– Packets in a network (transmission line)

• Want to know
– Average number of customers in the system
– Average delay experienced by a customer

• Quantities obtained in terms of
– Arrival rate of customers (average number of customers per unit time)
– Service rate (average number of customers that the server can serve per unit time)

Little’s theorem

[Figure: network (system) with arrivals at λ packets per second, holding N packets with delay T]

• N = average number of packets in system
• T = average amount of time a packet spends in the system
• λ = arrival rate of packets into the system (not necessarily Poisson)

• Little's theorem: N = λT
– Can be applied to entire system or any part of it
– Crowded system -> long delays
– On a rainy day people drive slowly and roads are more congested!

Proof of Little’s Theorem

[Figure: α(t) and β(t) plotted against time, with arrival times t1, t2, t3, t4 marked]

• α(t) = number of arrivals by time t
• β(t) = number of departures by time t
• ti = arrival time of ith customer
• Ti = amount of time ith customer spends in the system
• N(t) = number of customers in system at time t = α(t) - β(t)

• Similar proof for non first-come-first-serve

Proof of Little’s Theorem

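$$N_t = \frac{1}{t}\int_0^t N(\tau)\,d\tau = \text{time average number of customers in the system}, \qquad N = \lim_{t\to\infty} N_t = \text{steady-state time average}$$

$$\lambda_t = \frac{\alpha(t)}{t}, \qquad \lambda = \lim_{t\to\infty} \lambda_t = \text{arrival rate}$$

$$T_t = \frac{\sum_{i=1}^{\alpha(t)} T_i}{\alpha(t)} = \text{time average system delay}, \qquad T = \lim_{t\to\infty} T_t$$

• Assume the above limits exist; assume an ergodic system

$$N(t) = \alpha(t) - \beta(t) \;\Rightarrow\; N_t = \frac{1}{t}\sum_{i=1}^{\alpha(t)} T_i$$

$$N = \lim_{t\to\infty} \frac{1}{t}\sum_{i=1}^{\alpha(t)} T_i, \qquad T = \lim_{t\to\infty} \frac{\sum_{i=1}^{\alpha(t)} T_i}{\alpha(t)} \;\Rightarrow\; \sum_{i=1}^{\alpha(t)} T_i = \alpha(t)\,T$$

$$N = \frac{1}{t}\sum_{i=1}^{\alpha(t)} T_i = \frac{\alpha(t)}{t}\cdot\frac{\sum_{i=1}^{\alpha(t)} T_i}{\alpha(t)} = \lambda T$$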
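The identity in the proof can be checked on a small sample path. The sketch below integrates N(t) = α(t) - β(t) by sweeping over arrival and departure events and compares the result with λT over an interval that starts and ends with an empty system. The function name and the sample arrival/departure times are hypothetical.

```python
def little_quantities(arrivals, departures):
    """Return (N, lam, T) measured over [0, last departure] for one sample path."""
    t_end = max(departures)
    events = [(a, +1) for a in arrivals] + [(d, -1) for d in departures]
    events.sort()

    area = 0.0        # integral of N(t) dt
    in_system = 0     # N(t) = alpha(t) - beta(t)
    last = 0.0
    for time, step in events:
        area += in_system * (time - last)
        in_system += step
        last = time

    n_avg = area / t_end
    lam = len(arrivals) / t_end
    t_avg = sum(d - a for a, d in zip(arrivals, departures)) / len(arrivals)
    return n_avg, lam, t_avg


# Hypothetical sample path: customer i arrives at arrivals[i] and leaves at departures[i]
n_avg, lam, t_avg = little_quantities([0, 1, 2, 6], [3, 4, 5, 8])
print(n_avg, lam * t_avg)
```

Because the system is empty at the end of the interval, the area under N(t) equals the sum of the individual times in system, which is exactly the step used in the proof.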

Application of Little's Theorem

• Little's Theorem can be applied to almost any system or part of it

• Example: [Figure: customers arriving to a queue/buffer followed by a server]

1) The transmitter: DTP = packet transmission time
– Average number of packets at transmitter = λDTP = ρ = link utilization

2) The transmission line: Dp = propagation delay
– Average number of packets in flight = λDp

3) The buffer: Dq = average queueing delay
– Average number of packets in buffer = Nq = λDq

4) Transmitter + buffer
– Average number of packets = ρ + Nq

Application to complex system

[Figure: network carrying several traffic streams]

• We have a complex network with several traffic streams moving through it and interacting arbitrarily

• For each stream i individually, Little says Ni = λiTi

• For the streams collectively, Little says N = λT where

• N = ∑i Ni and λ = ∑i λi

• From Little's Theorem:

$$T = \frac{\sum_{i=1}^{k} \lambda_i T_i}{\sum_{i=1}^{k} \lambda_i}$$

Single server queues

[Figure: buffer fed at λ packets per second, followed by a server of rate µ packets per second; service time = 1/µ]

• M/M/1
– Poisson arrivals, exponential service times

• M/G/1
– Poisson arrivals, general service times

• M/D/1
– Poisson arrivals, deterministic service times (fixed)

Markov Chain for M/M/1 system

[Figure: birth-death chain; each state moves up with probability λδ and down with probability µδ; state 0 has a self-loop of probability 1-λδ]

• State k => k customers in the system

• P(i,j) = probability of transition from state i to state j
– As δ => 0, we get:

P(0,0) = 1 - λδ
P(j,j+1) = λδ
P(j,j) = 1 - λδ - µδ
P(j,j-1) = µδ
P(i,j) = 0 for all other values of i, j.

• Birth-death chain: Transitions exist only between adjacent states
– λδ, µδ are flow rates between states

Equilibrium analysis

• We want to obtain P(n) = the probability of being in state n

• At equilibrium λP(n) = µP(n+1) for all n
– Local balance equations between two states (n, n+1)
– P(n+1) = (λ/µ)P(n) = ρP(n), ρ = λ/µ

• It follows: $P(n) = \rho^n P(0)$

• Now by axiom of probability:

$$\sum_{n=0}^{\infty} P(n) = 1 \;\Rightarrow\; \sum_{n=0}^{\infty} \rho^n P(0) = \frac{P(0)}{1-\rho} = 1 \;\Rightarrow\; P(0) = 1 - \rho$$

$$P(n) = \rho^n (1-\rho)$$

Average queue size

$$N = \sum_{n=0}^{\infty} nP(n) = \sum_{n=0}^{\infty} n\rho^n(1-\rho) = \frac{\rho}{1-\rho} = \frac{\lambda/\mu}{1-\lambda/\mu} = \frac{\lambda}{\mu-\lambda}$$

• N = average number of customers in the system
• The average amount of time that a customer spends in the system can be obtained from Little's formula (N = λT => T = N/λ):

$$T = \frac{1}{\mu-\lambda}$$

• T includes the queueing delay plus the service time (service time = DTP = 1/µ)
– W = amount of time spent in queue = T - 1/µ =>

$$W = \frac{1}{\mu-\lambda} - \frac{1}{\mu}$$

• Finally, the average number of customers in the buffer can be obtained from Little's formula

$$N_Q = \lambda W = \frac{\lambda}{\mu-\lambda} - \frac{\lambda}{\mu} = N - \rho$$

Example (fast food restaurant)

• Customers arrive at a fast food restaurant at a rate of 100 per hour and take 30 seconds to be served.

• How much time do they spend in the restaurant?
– Service rate = µ = 60/0.5 = 120 customers per hour
– T = 1/(µ-λ) = 1/(120-100) = 1/20 hrs = 3 minutes

• How much time waiting in line?
– W = T - 1/µ = 2.5 minutes

• How many customers in the restaurant?
– N = λT = 5

• What is the server utilization?
– ρ = λ/µ = 5/6
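The M/M/1 formulas and the restaurant numbers can be reproduced with a short helper. This is a sketch of the formulas above; the function name and the dictionary keys are assumptions made for the example.

```python
def mm1_metrics(lam, mu):
    """Steady-state M/M/1 quantities for arrival rate lam and service rate mu (lam < mu)."""
    if lam >= mu:
        raise ValueError("the queue is stable only when lam < mu")
    rho = lam / mu
    t = 1.0 / (mu - lam)      # average time in system
    w = t - 1.0 / mu          # average time waiting in queue
    return {"rho": rho, "T": t, "W": w, "N": lam * t, "NQ": lam * w}


# Fast food example: 100 customers/hour arrive, 120 customers/hour can be served
m = mm1_metrics(100.0, 120.0)
print(m["T"] * 60, m["W"] * 60, m["N"], m["rho"])   # times converted from hours to minutes
```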

Packet switching vs. Circuit switching

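[Figure: M users, each generating packets at rate λ/M, sharing a link of rate µ packets/sec]

• TDM (Time Division Multiplexing): each user can send µ/M packets/sec and has packets arriving at rate λ/M packets/sec

$$D = \frac{M}{\mu} + \frac{M(\lambda/\mu)}{\mu-\lambda} \qquad \text{(M/M/1 formula)}$$

• Statistical multiplexer: packets are generated at random times and share one buffer served at µ packets/sec

$$D = \frac{1}{\mu} + \frac{\lambda/\mu}{\mu-\lambda} \qquad \text{(M/M/1 formula)}$$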
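The two delay expressions differ by exactly a factor of M, which a quick calculation makes visible. The sketch assumes the M/M/1 model from the slides; the function names and sample rates are illustrative.

```python
def tdm_delay(lam, mu, m):
    """Each of m users gets a dedicated M/M/1 queue with rates lam/m and mu/m."""
    return m / mu + m * (lam / mu) / (mu - lam)


def stat_mux_delay(lam, mu):
    """All traffic shares one M/M/1 queue with rates lam and mu."""
    return 1.0 / mu + (lam / mu) / (mu - lam)


# Hypothetical link: 120 packets/sec capacity, 100 packets/sec total load, 10 users
print(tdm_delay(100.0, 120.0, 10), stat_mux_delay(100.0, 120.0))
```

Both expressions simplify to 1/(µ-λ) scaled by the number of users, so dedicating 1/M of the link to each user multiplies the average delay by M under this model.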

Circuit (TDM/FDM) vs. Packet switching

[Figure: average packet service time (slots) versus total traffic load (packets per slot, 0 to 1), comparing TDM with 20 sources against ideal statistical multiplexing (M/D/1)]

M server systems: M/M/m

[Figure: buffer fed at λ packets per second, served by m servers, each of rate µ packets per second]

• Departure rate is proportional to the number of servers in use

• Similar Markov chain:

[Figure: birth-death chain with arrival transitions λδ and departure transitions µδ, 2µδ, 3µδ, …, mµδ, mµδ, …]

M/M/m queue

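• Balance equations:

$$\lambda P(n-1) = n\mu P(n), \quad n \le m \qquad\qquad \lambda P(n-1) = m\mu P(n), \quad n > m$$

$$P(n) = \begin{cases} P(0)\,\dfrac{(m\rho)^n}{n!}, & n \le m \\[2mm] P(0)\,\dfrac{m^m\rho^n}{m!}, & n > m \end{cases} \qquad \rho = \frac{\lambda}{m\mu} < 1$$

• Again, solve for P(0):

$$P(0) = \left[\sum_{n=0}^{m-1} \frac{(m\rho)^n}{n!} + \frac{(m\rho)^m}{m!(1-\rho)}\right]^{-1}$$

$$P_Q = \sum_{n=m}^{\infty} P(n) = \frac{P(0)(m\rho)^m}{m!(1-\rho)}$$

$$N_Q = \sum_{n=0}^{\infty} nP(n+m) = \sum_{n=0}^{\infty} nP(0)\,\frac{m^m\rho^{m+n}}{m!} = P_Q\,\frac{\rho}{1-\rho}$$

$$W = \frac{N_Q}{\lambda}, \qquad T = W + \frac{1}{\mu}, \qquad N = \lambda T = \frac{\lambda}{\mu} + N_Q$$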
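The M/M/m expressions can be evaluated directly. The sketch below follows the formulas above term by term; the function name, the dictionary keys, and the sample rates are assumptions for illustration.

```python
import math


def mmm_metrics(lam, mu, m):
    """Steady-state M/M/m quantities: m servers of rate mu each, arrival rate lam."""
    rho = lam / (m * mu)
    if rho >= 1:
        raise ValueError("the queue is stable only when lam < m*mu")
    a = m * rho                                   # equals lam/mu
    tail = a ** m / (math.factorial(m) * (1 - rho))
    p0 = 1.0 / (sum(a ** n / math.factorial(n) for n in range(m)) + tail)
    p_q = p0 * tail                               # probability that an arrival must queue
    n_q = p_q * rho / (1 - rho)
    w = n_q / lam
    t = w + 1.0 / mu
    return {"P0": p0, "PQ": p_q, "NQ": n_q, "W": w, "T": t, "N": lam * t}


# Hypothetical comparison: 4 lines of rate 30 versus the single-server case m = 1
print(mmm_metrics(100.0, 30.0, 4))
print(mmm_metrics(100.0, 120.0, 1))
```

With m = 1 the expressions collapse to the M/M/1 results (PQ = ρ and T = 1/(µ-λ)), which is a convenient sanity check.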

Applications of M/M/m

• Bank with m tellers
• Network with parallel transmission lines

[Figure: Node A to Node B over m lines, each of rate µ (use M/M/m formula), versus Node A to Node B over one line of rate mµ (use M/M/1 formula)]

• When the system is lightly loaded, PQ ≈ 0, and the single server is m times faster
• When the system is heavily loaded, queueing delay dominates and the systems are roughly the same

M/M/Infinity

• Unlimited servers => customers experience no queueing delay
• The number of customers in the system represents the number of customers presently being served

[Figure: birth-death chain with arrival transitions λδ and departure transitions µδ, 2µδ, 3µδ, …, nµδ, (n+1)µδ, …]

$$\lambda P(n-1) = n\mu P(n), \ \forall n \ge 1 \;\Rightarrow\; P(n) = \frac{P(0)(\lambda/\mu)^n}{n!}$$

$$P(0) = \left[1 + \sum_{n=1}^{\infty} \frac{(\lambda/\mu)^n}{n!}\right]^{-1} = e^{-\lambda/\mu}$$

$$P(n) = \frac{(\lambda/\mu)^n e^{-\lambda/\mu}}{n!} \quad \Rightarrow \text{Poisson distribution!}$$

N = average number in system = λ/µ, T = N/λ = 1/µ = service time

Blocking Probability

• A circuit switched network can be viewed as a multi-server queueing system
– Calls are blocked when no servers are available - “busy signal”
– For a circuit switched network we are interested in the call blocking probability

• M/M/m/m system
– m servers => m circuits
– Last m indicates that the system can hold no more than m users

• Erlang B formula
– Gives the probability that a caller finds all circuits busy
– Holds for a general call holding-time distribution (although we prove the Markov case only)

$$P_B = \frac{(\lambda/\mu)^m / m!}{\sum_{n=0}^{m} (\lambda/\mu)^n / n!}$$

M/M/m/m system: Erlang B formula

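[Figure: birth-death chain with states 0 through m, arrival transitions λδ and departure transitions µδ, 2µδ, 3µδ, …, mµδ]

$$\lambda P(n-1) = n\mu P(n), \ 1 \le n \le m \;\Rightarrow\; P(n) = \frac{P(0)(\lambda/\mu)^n}{n!}$$

$$P(0) = \left[\sum_{n=0}^{m} \frac{(\lambda/\mu)^n}{n!}\right]^{-1}$$

$$P_B = P(\text{Blocking}) = P(m) = \frac{(\lambda/\mu)^m / m!}{\sum_{n=0}^{m} (\lambda/\mu)^n / n!}$$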

Erlang B formula

• System load usually expressed in Erlangs
– A = λ/µ = (arrival rate)*(ave call duration) = average load
– Formula depends on λ and µ only through their ratio

$$P_B = \frac{A^m / m!}{\sum_{n=0}^{m} A^n / n!}$$

• Used for sizing transmission line
– How many circuits does the satellite need to support?
– The number of circuits is a function of the blocking probability that we can tolerate
– Systems are designed for given load predictions and blocking probabilities (typically small)

• Example
– Arrival rate = 4 calls per minute, average 3 minutes per call => A = 12
– How many circuits do we need to provision? Depends on the blocking probability that we can tolerate

Circuits    PB
20          1%
15          8%
10          30%
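A provisioning calculation of this kind can be sketched as follows. The code evaluates the Erlang B formula directly and searches for the smallest number of circuits that meets a target blocking probability; the function names and the search loop are illustrative assumptions, not a procedure prescribed by the lecture.

```python
import math


def erlang_b(m, a):
    """Blocking probability for m circuits offered a load of a Erlangs."""
    numerator = a ** m / math.factorial(m)
    denominator = sum(a ** n / math.factorial(n) for n in range(m + 1))
    return numerator / denominator


def circuits_needed(a, target):
    """Smallest m whose blocking probability does not exceed `target`."""
    m = 0
    while erlang_b(m, a) > target:
        m += 1
    return m


# Example load from the slide: 4 calls/min * 3 min/call = 12 Erlangs
for m in (20, 15):
    print(m, round(erlang_b(m, 12.0), 4))
print(circuits_needed(12.0, 0.01))
```

Since a load of 12 Erlangs needs at least 12 busy circuits on average to be carried without loss, blocking falls off quickly only once m is well above A.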

Multi-dimensional Markov Chains

• K classes of customers
– Class j: arrival rate λj; service rate µj

• State of system: n = (n1, n2, …, nK); nj = number of class j customers in the system

• If detailed balance equations hold for adjacent states, then a product form solution exists, where:

– P(n1, n2, …, nK) = P1(n1)*P2(n2)*…*PK(nK)

• Example: K independent M/M/1 systems

$$P_i(n_i) = \rho_i^{n_i}(1-\rho_i), \qquad \rho_i = \lambda_i/\mu_i$$

• Same holds for other independent birth-death chains

– E.g., M/M/m, M/M/Inf, M/M/m/m

Truncation

• Eliminate some of the states
– E.g., for the K M/M/1 queues, eliminate all states where n1+n2+…+nK > K1 (some constant)

• Resulting chain must remain irreducible
– All states must communicate

Product form for stationary distribution of the truncated system

• E.g., K independent M/M/1 queues

$$P(n_1, n_2, \dots, n_K) = \frac{\rho_1^{n_1}\rho_2^{n_2}\cdots\rho_K^{n_K}}{G}, \qquad G = \sum_{n \in S} \rho_1^{n_1}\rho_2^{n_2}\cdots\rho_K^{n_K}$$

• E.g., K independent M/M/inf queues

$$P(n_1, n_2, \dots, n_K) = \frac{(\rho_1^{n_1}/n_1!)(\rho_2^{n_2}/n_2!)\cdots(\rho_K^{n_K}/n_K!)}{G}, \qquad G = \sum_{n \in S} (\rho_1^{n_1}/n_1!)(\rho_2^{n_2}/n_2!)\cdots(\rho_K^{n_K}/n_K!)$$

– G is a normalization constant that makes P(n) a distribution
– S is the set of states in the truncated system

Example

• Two session classes in a circuit switched system
– m channels of equal capacity
– Two session types:
Type 1: arrival rate λ1 and service rate µ1
Type 2: arrival rate λ2 and service rate µ2

• System can support up to m sessions of either class
– If µ1 = µ2, treat system as an M/M/m/m queue with arrival rate λ1 + λ2
– When µ1 ≠ µ2 we need to know the number of calls in progress of each session type
– Two-dimensional Markov chain, state = (n1, n2)
– Want P(n1, n2): n1 + n2 <= m

• Can be viewed as truncated M/M/Inf queues
– Notice that the transition rates in the M/M/Inf queue are the same as those in a truncated M/M/m/m queue

$$P(n_1, n_2) = \frac{(\rho_1^{n_1}/n_1!)(\rho_2^{n_2}/n_2!)}{G}, \quad n_1 + n_2 \le m, \qquad G = \sum_{i=0}^{m}\sum_{j=0}^{m-i} (\rho_1^{i}/i!)(\rho_2^{j}/j!)$$

– Notice that the double sum counts only states for which j+i <= m
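The truncated product form can be tabulated for a small system. The sketch below enumerates the states with n1 + n2 <= m, normalizes by G, and adds up the probability of the full states; the function names and the sample loads are assumptions for the example.

```python
import math


def two_class_distribution(rho1, rho2, m):
    """P(n1, n2) for two session types sharing m channels (truncated M/M/inf product form)."""
    weights = {
        (i, j): (rho1 ** i / math.factorial(i)) * (rho2 ** j / math.factorial(j))
        for i in range(m + 1)
        for j in range(m + 1 - i)          # only states with i + j <= m
    }
    g = sum(weights.values())              # normalization constant G
    return {state: w / g for state, w in weights.items()}


def all_channels_busy(dist, m):
    """Probability that the system is full, i.e. n1 + n2 = m."""
    return sum(p for (i, j), p in dist.items() if i + j == m)


# Hypothetical loads: rho1 = 2 Erlangs, rho2 = 1.5 Erlangs, m = 5 channels
dist = two_class_distribution(2.0, 1.5, 5)
print(round(sum(dist.values()), 6), round(all_channels_busy(dist, 5), 4))
```

Summing the full states by the binomial theorem gives (ρ1+ρ2)^m/m! divided by G, so the probability of finding all channels busy matches the Erlang B value for a combined load of ρ1+ρ2.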

PASTA: Poisson Arrivals See Time Averages

• The state of an M/M/1 queue is the number of customers in the system

• More general queueing systems have a more general state that may include how much service each customer has already received

• For Poisson arrivals, the arrivals in any future increment of time are independent of those in past increments and, for many systems of interest, independent of the present state S(t) (true for M/M/1, M/M/m, and M/G/1).

• For such systems, P{S(t)=s | A(t+δ)-A(t)=1} = P{S(t)=s}
– (where A(t) = # arrivals since t=0)

• In steady state, arrivals see steady state probabilities

Occupancy distribution upon arrival

• Arrivals may not always see the steady-state averages

• Example:
– Deterministic arrivals, 1 per second
– Deterministic service time of 3/4 seconds

λ = 1 packet/second, T = 3/4 seconds (no queueing)

N = λT = average occupancy = 3/4

• However, notice that an arrival always finds the system empty!

Occupancy upon arrival for a M/M/1 queue

$$a_n = \lim_{t\to\infty} P(N(t) = n \mid \text{an arrival occurred just after time } t), \qquad P_n = \lim_{t\to\infty} P(N(t) = n)$$

For M/M/1 systems $a_n = P_n$

Proof: Let A(t, t+δ) be the event that an arrival occurred between t and t+δ

$$a_n(t) = \lim_{\delta\to 0} P(N(t) = n \mid A(t, t+\delta)) = \lim_{\delta\to 0} \frac{P(N(t) = n,\ A(t, t+\delta))}{P(A(t, t+\delta))} = \lim_{\delta\to 0} \frac{P(A(t, t+\delta) \mid N(t) = n)\,P(N(t) = n)}{P(A(t, t+\delta))}$$

• Since future arrivals are independent of the state of the system,

$$P(A(t, t+\delta) \mid N(t) = n) = P(A(t, t+\delta))$$

• Hence, $a_n(t) = P(N(t) = n) = P_n(t)$

• Taking limits as t → ∞, we obtain $a_n = P_n$

• Result holds for M/G/1 systems as well

Lecture 7

Burke’s Theorem and Networks of Queues

Eytan Modiano, Massachusetts Institute of Technology

Burke’s Theorem

• An interesting property of an M/M/1 queue, which greatly simplifies combining these queues into a network, is the surprising fact that the output of an M/M/1 queue with arrival rate λ is a Poisson process of rate λ

– This is part of Burke's theorem, which follows from reversibility

• A Markov chain has the property that
– P[future | present, past] = P[future | present]
– Conditional on the present state, future states and past states are independent, so

P[past | present, future] = P[past | present]

=> P[Xn=j | Xn+1=i, Xn+2=i2, ...] = P[Xn=j | Xn+1=i] = P*ij

Burke’s Theorem (continued)

• The state sequence, run backward in time, in steady state, is a Markov chain again and it can be easily shown that

piP*ij = pjPji (e.g., M/M/1: (pn)λ = (pn+1)µ)

• A Markov chain is reversible if P*ij = Pij
– Forward transition probabilities are the same as the backward probabilities
– If reversible, a sequence of states run backwards in time is statistically indistinguishable from a sequence run forward

• A chain is reversible iff piPij = pjPji

• All birth/death processes are reversible
– Detailed balance equations must be satisfied

Implications of Burke’s Theorem

[Figure: arrivals and departures of a queue shown on a time axis]

• Since the arrivals in forward time form a Poisson process, the departures in backward time form a Poisson process

• Since the backward process is statistically the same as the forward process, the (forward) departure process is Poisson

• By the same type of argument, the state (packets in system) left by a (forward) departure is independent of the past departures

– In the backward process the state is independent of future arrivals

NETWORKS OF QUEUES

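[Figure: two queues in tandem, both with exponential service. Poisson arrivals enter the first queue (M/M/1); its Poisson output feeds the second queue (M/M/1 ?)]

• The output process from an M/M/1 queue is a Poisson process of the same rate λ as the input

• Is the second queue M/M/1?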

Independence Approximation (Kleinrock)

• Assume that service times are independent from queue to queue
– Not a realistic assumption: the service time of a packet is determined by its length, which doesn't change from queue to queue

[Figure: network with link (3,4) marked]

• Xp = arrival rate of packets along path p

• Let λij = arrival rate of packets to link (i,j)

$$\lambda_{ij} = \sum_{p \text{ traverses link } (i,j)} X_p$$

• µij = service rate on link (i,j)

Kleinrock approximation

• Assume all queues behave as independent M/M/1 queues

$$N_{ij} = \frac{\lambda_{ij}}{\mu_{ij} - \lambda_{ij}}$$

• N = Ave. packets in network, T = Ave. packet delay in network

$$N = \sum_{i,j} N_{ij} = \sum_{i,j} \frac{\lambda_{ij}}{\mu_{ij} - \lambda_{ij}}, \qquad T = \frac{N}{\lambda}, \qquad \lambda = \sum_{\text{all paths } p} X_p = \text{total external arrival rate}$$

• Approximation is not always good, but is useful when accuracy of prediction is not critical
– Relative performance but not actual performance matters
– E.g., topology design
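A small calculation shows how the approximation is applied. The sketch sums λij/(µij - λij) over links and divides by the total external rate; the function name, the dictionary layout, and the two-link example network are hypothetical.

```python
def kleinrock_delay(links, total_external_rate):
    """links maps (i, j) -> (lam_ij, mu_ij); returns (N, T) under the independence approximation."""
    n_total = 0.0
    for lam_ij, mu_ij in links.values():
        if lam_ij >= mu_ij:
            raise ValueError("every link needs lam_ij < mu_ij")
        n_total += lam_ij / (mu_ij - lam_ij)      # M/M/1 occupancy of one link
    return n_total, n_total / total_external_rate


# Hypothetical network: two links, total external arrival rate of 2 packets/sec
links = {(1, 2): (2.0, 4.0), (2, 3): (1.0, 3.0)}
print(kleinrock_delay(links, 2.0))
```

Because each link is treated as an independent M/M/1 queue, the result is best read as a relative figure for comparing designs rather than an exact delay.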

Slow truck effect

[Figure: a long packet followed by short packets moving through a sequence of queues]

• Example of bunching from slow truck effect
– Long packets require long service at each node
– Shorter packets catch up with the long packets

• Similar to phenomenon that we experience on the roads
– Slow car is followed by many faster cars because they catch up with it

Jackson Networks

• Independent external Poisson arrivals
• Independent exponential service times
– Same job has independent service time at different queues
• Independent routing of packets
– When a packet leaves node i it goes to node j with probability Pij
– Packet leaves system with probability 1 - ∑j Pij
– Packets can loop inside network

• Arrival rate at node i (external arrivals plus internal arrivals from other nodes),

$$\lambda_i = r_i + \sum_k \lambda_k P_{ki}$$

– Set of equations can be solved to obtain unique λi's
– Service rate at node i = µi
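The traffic equations can be solved numerically in several ways. The sketch below uses plain fixed-point iteration, which is an illustrative choice rather than the lecture's method; it assumes every packet eventually leaves the network so that the iteration converges. The function name and example networks are hypothetical.

```python
def solve_traffic_equations(r, p, iterations=1000):
    """Iterate lam_i = r_i + sum_k lam_k * P_ki, starting from the external rates.

    r: list of external arrival rates; p[k][i]: probability of going from node k to node i.
    """
    k_nodes = len(r)
    lam = list(r)
    for _ in range(iterations):
        lam = [r[i] + sum(lam[k] * p[k][i] for k in range(k_nodes))
               for i in range(k_nodes)]
    return lam


# Single queue with feedback probability 0.5: lam = r / (1 - P) = 2
print(solve_traffic_equations([1.0], [[0.5]]))
# Two queues in tandem: everything leaving node 0 enters node 1
print(solve_traffic_equations([1.0, 0.0], [[0.0, 1.0], [0.0, 0.0]]))
```

For larger networks a direct linear solve of (I - P^T)λ = r would be the more usual approach; the iteration is used here only to keep the example dependency-free.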

Jackson Network (continued)

[Figure: single queue with external input r and feedback. A departing customer leaves with probability 1-P and returns to the queue (internal input) with probability P; total arrival rate λ, service rate µ >> λ]

• Customers are processed fast (µ >> λ)
• Customers exit with probability (1-P)
– Customers return to queue with probability P
– λ = r + Pλ => λ = r/(1-P)

• When P is large, each external arrival is followed by a burst of internal arrivals
– Arrivals to queues are not Poisson

Jackson’s Theorem

• We define the state of the system to be $\vec{n} = (n_1, n_2, \dots, n_k)$, where $n_i$ is the number of customers at node i
• Jackson's theorem:

$$P(\vec{n}) = \prod_{i=1}^{k} P_i(n_i) = \prod_{i=1}^{k} \rho_i^{n_i}(1-\rho_i), \qquad \text{where } \rho_i = \frac{\lambda_i}{\mu_i}$$

• That is, in steady state the state of node i (ni) is independent of the states of all other nodes (at a given time)
– Independent M/M/1 queues
– Surprising result given that arrivals to each queue are neither Poisson nor independent
– Similar to Kleinrock’s independence approximation
– Reversibility: exogenous outputs are independent and Poisson; the state of the entire system is independent of past exogenous departures

Example

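[Figure: two-queue network with external input r, arrival rates λ1 and λ2, service rates µ1 and µ2, and routing probabilities 2/8 and 3/8]

λ1 = ? λ2 = ?

P(n1,n2) = ?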

Lectures 8 & 9

M/G/1 Queues

Eytan Modiano MIT

M/G/1 QUEUE

[Figure: M/G/1 notation. M = Poisson arrivals, G = general independent service times, 1 = single server]

• Poisson arrivals at rate λ

• Service time has arbitrary distribution with given E[X] and E[X²]
– Service times are independent and identically distributed (IID)
– Independent of arrival times
– E[service time] = 1/µ
– Single server queue

Pollaczek-Khinchin (P-K) Formula

$$W = \frac{\lambda E[X^2]}{2(1-\rho)}$$

where ρ = λ/µ = λE[X] = line utilization

From Little’s formula,

NQ = λW

T = E[X] + W

N = λT = NQ + ρ

M/G/1 EXAMPLES

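• Example 1: M/M/1

$$E[X] = 1/\mu, \quad E[X^2] = 2/\mu^2 \qquad\Rightarrow\qquad W = \frac{\lambda}{\mu^2(1-\rho)} = \frac{\rho}{\mu(1-\rho)}$$

• Example 2: M/D/1 (constant service time 1/µ)

$$E[X] = 1/\mu, \quad E[X^2] = 1/\mu^2 \qquad\Rightarrow\qquad W = \frac{\lambda}{2\mu^2(1-\rho)} = \frac{\rho}{2\mu(1-\rho)}$$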
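Both examples follow from one function of λ, E[X], and E[X²]. The sketch below evaluates the P-K formula; the function name and the sample rates are assumptions for illustration.

```python
def pk_waiting_time(lam, ex, ex2):
    """Pollaczek-Khinchin mean waiting time: W = lam*E[X^2] / (2*(1 - rho)), rho = lam*E[X]."""
    rho = lam * ex
    if rho >= 1:
        raise ValueError("the queue is stable only when lam*E[X] < 1")
    return lam * ex2 / (2.0 * (1.0 - rho))


lam, mu = 100.0, 120.0
w_mm1 = pk_waiting_time(lam, 1 / mu, 2 / mu ** 2)   # exponential service
w_md1 = pk_waiting_time(lam, 1 / mu, 1 / mu ** 2)   # deterministic service
print(w_mm1, w_md1)
```

With the same mean service time, deterministic service halves the waiting time relative to exponential service, since only E[X²] changes.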

Proof of Pollaczek-Khinchin

• Let
Wi = waiting time in queue of ith arrival
Ri = residual service time seen by i (i.e., amount of time for current customer receiving service to be done)
Ni = number of customers found in queue by i

[Figure: timeline; customer i arrives, waits Wi = Ri plus the service times Xi-3, Xi-2, Xi-1 of the Ni = 3 customers found in queue, and is then served for Xi]

$$W_i = R_i + \sum_{j=i-N_i}^{i-1} X_j$$

• E[Wi] = E[Ri] + E[X]E[Ni] = R + NQ/µ
– Here we have used PASTA property plus independent service time property

• W = R + λW/µ => W = R/(1-ρ)
– Using Little’s formula

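What is R? (Time Average Residual Service Time)

[Figure: residual service time R(t) versus time; a sequence of triangles of height and base X1, X2, X3, X4]

Let M(t) = number of customers served by time t

Time average of R(t) up to time t = (1/t)(sum of area in triangles)

$$R_t = \frac{1}{t}\int_0^t R(\tau)\,d\tau = \frac{1}{t}\sum_{i=1}^{M(t)} \frac{X_i^2}{2} = \frac{1}{2}\cdot\frac{M(t)}{t}\cdot\frac{\sum_{i=1}^{M(t)} X_i^2}{M(t)}$$

As t → ∞: M(t)/t = average departure rate = average arrival rate = λ, and

$$\frac{\sum_{i=1}^{M(t)} X_i^2}{M(t)} \to E[X^2] \;\Rightarrow\; R = \frac{\lambda E[X^2]}{2}$$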

M/G/1 Queue with Vacations

• Useful for polling and reservation systems (e.g., token rings)
• When the queue is empty, the server takes a vacation
• Vacation times are IID and independent of service times and arrival times
– If system is empty after a vacation, the server takes another vacation
– The only impact on the analysis is that a packet arriving to an empty system must wait for the end of the vacation

[Figure: timeline; customer i arrives during vacation Vj, finds Ni = 3 customers (Xi-3, Xi-2, Xi-1) in queue, and is then served for Xi]

$$W_i = R_i + \sum_{j=i-N_i}^{i-1} X_j$$

E[Wi] = E[Ri] + E[X]E[Ni] = R + NQ/µ => W = R/(1-ρ)

Average Residual Service Time (with vacations)

[Figure: residual service time R(t) versus time; triangles for X1, X2, V1, X3, X4]

$$R_t = \frac{1}{t}\int_0^t R(\tau)\,d\tau = \frac{1}{t}\left(\sum_{i=1}^{M(t)} \frac{X_i^2}{2} + \sum_{j=1}^{L(t)} \frac{V_j^2}{2}\right)$$

$$R = \lim_{t\to\infty}\left[\frac{M(t)}{t}\cdot\frac{E[X^2]}{2} + \frac{L(t)}{t}\cdot\frac{E[V^2]}{2}\right]$$

• Where L(t) is the number of vacations taken up to time t
• M(t) is the number of customers served by time t

Average Residual Service Time (with vacations)

• As t → ∞, M(t)/t → λ and L(t)/t → λv = vacation rate

• Now, let I = 1 if system is on vacation and I = 0 if system is busy
• By Little’s Theorem we have,
– E[I] = E[#vacations] = P(system idle) = 1-ρ = λv E[V]
– => λv = (1-ρ)/E[V]

• Hence,

$$R = \frac{\lambda E[X^2]}{2} + \frac{(1-\rho)E[V^2]}{2E[V]}$$

and, remembering that W = R/(1-ρ),

$$W = \frac{\lambda E[X^2]}{2(1-\rho)} + \frac{E[V^2]}{2E[V]}$$

Example: Slotted M/D/1 system

Each slot = one packet transmission time = 1/µ

• Transmission can begin only at start of a slot
• If system is empty at the start of a slot, server is not available for the duration of the slot (vacation)

• E[X] = E[V] = 1/µ
• E[X²] = E[V²] = 1/µ²

$$W = \frac{\lambda/\mu^2}{2(1-\lambda/\mu)} + \frac{1/\mu^2}{2/\mu} = \frac{\lambda/\mu}{2(\mu-\lambda)} + \frac{1}{2\mu} = W_{M/D/1} + \frac{E[X]}{2}$$

• Notice that an average of 1/2 slot is spent waiting for the start of a slot
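The vacation formula and the slotted M/D/1 special case can be checked numerically. The sketch evaluates W = λE[X²]/(2(1-ρ)) + E[V²]/(2E[V]); the function name and the sample rates are assumptions for illustration.

```python
def mg1_vacation_wait(lam, ex, ex2, ev, ev2):
    """Mean wait in an M/G/1 queue with vacations: P-K term plus E[V^2] / (2*E[V])."""
    rho = lam * ex
    if rho >= 1:
        raise ValueError("the queue is stable only when lam*E[X] < 1")
    return lam * ex2 / (2.0 * (1.0 - rho)) + ev2 / (2.0 * ev)


# Slotted M/D/1 with hypothetical rates: service and vacation both last one slot of 1/mu
lam, mu = 0.5, 1.0
slot = 1.0 / mu
w_slotted = mg1_vacation_wait(lam, slot, slot ** 2, slot, slot ** 2)
w_md1 = lam * slot ** 2 / (2.0 * (1.0 - lam * slot))
print(w_slotted, w_md1 + slot / 2)
```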

FDM EXAMPLE

• Assume m Poisson streams of fixed-length packets of arrival rate λ/m each, multiplexed by FDM on m subchannels. Total traffic = λ

Suppose it takes m time units to transmit a packet, so µ = 1/m.

The total system load: ρ = λ

[Figure: FDM subchannels for User 1, User 2, …, User m, with a slot per user; frames marked]

• We have an M/D/1 system { W = λE[X²]/(2(1-ρ)) }

$$W_{FDM} = \frac{(\lambda/m)\,m^2}{2(1-\rho)} = \frac{\rho m}{2(1-\rho)}$$

Slotted FDM

• Suppose now that the system is slotted and transmissions start only on m time unit boundaries.

[Figure: slotted FDM subchannels for User 1, User 2, …, User m, showing slots for each user, a vacation for User m, and frames]

• This is M/D/1 with vacations
– Server goes on vacation for m time units when there is nothing to transmit

E[V] = m; E[V²] = m².

WSFDM = WFDM + E[V²]/(2E[V])

= WFDM + m/2

TDM EXAMPLE

[Figure: TDM frame consisting of slot 1, slot 2, …, slot m]

• TDM with one-packet slots is the same (a session has to wait for its own slot boundary), so

$$W = \frac{R}{1-\rho}, \qquad R = \frac{\lambda E[X^2]}{2} + \frac{(1-\rho)E[V^2]}{2E[V]}$$

$$W = \frac{\lambda E[X^2]}{2(1-\rho)} + \frac{E[V^2]}{2E[V]}$$

TDM EXAMPLE

• Therefore, WTDM = WFDM + m/2

Adding the packet transmission time, TDM comes out best because transmission time = 1 instead of m.

TFDM = [WFDM] + m

TSFDM = [WFDM + m/2] + m

TTDM = [WFDM + m/2] + 1 = TFDM - [m/2 - 1]
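The three total delays can be tabulated for a given load. The sketch below uses WFDM = ρm/(2(1-ρ)) with ρ = λ, as in the FDM example; the function name and the sample values of λ and m are illustrative assumptions.

```python
def multiplexing_delays(lam, m):
    """Total delays (in packet time units of the full-rate channel) for FDM, slotted FDM, TDM."""
    rho = lam                                  # total load, since mu = 1/m per subchannel
    if rho >= 1:
        raise ValueError("requires lam < 1")
    w_fdm = rho * m / (2.0 * (1.0 - rho))
    return {
        "FDM": w_fdm + m,                      # transmission takes m time units
        "SFDM": w_fdm + m / 2.0 + m,           # extra m/2 waiting for a slot boundary
        "TDM": w_fdm + m / 2.0 + 1.0,          # same wait, but transmission takes 1 unit
    }


# Hypothetical system: load 0.5, m = 10 users
print(multiplexing_delays(0.5, 10))
```

The TDM advantage over FDM is m/2 - 1 time units regardless of load, which follows directly from the last line of the slide.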

Lectures 10 & 11

Reservation Systems

M/G/1 queues with Priority

Eytan Modiano MIT

RESERVATION SYSTEMS

• Single channel shared by multiple users
• Only one user can use the channel at a time
• Need to coordinate transmissions between users

• Polling systems
– Polling station polls the users in order to see if they have something to send
– A scheduler can be used to receive and schedule transmission requests

[Figure: polling station; timeline of alternating intervals R1 D1 R2 D2 R3 D3 R1 D1]

– Reservation interval (R) used for polling or making reservations
– Data interval (D) used for the actual data transmission

Reservations and polling systems

• Gated system - users can transmit only those packets that arrived prior to start of reservation interval
– E.g., explicit reservations
• Partially gated system - can transmit all packets that arrived before the start of the data interval
• Exhaustive system - can transmit all packets that arrive prior to the end of the data interval
– E.g., token ring networks
• Limited service system - only one (or K) packets can be transmitted in a data interval

[Figure: timeline R1 D1 R2 D2 R3 D3 R1 D1 marking which arrivals are served in a data interval for the gated, partially gated, and exhaustive systems]

Single user exhaustive systems

• Let Vj be the duration of the jth reservation interval
– Assume reservation intervals are IID

• Consider the ith data packet:

E[Wi] = E[Ri] + E[Ni]/µ
Ri = residual time for current packet or reservation interval
Ni = number of packets in queue

• Identical to M/G/1 with vacations

$$W = \frac{\lambda E[X^2]}{2(1-\rho)} + \frac{E[V^2]}{2E[V]}$$

When V = A (constant) ⇒

$$W = \frac{\lambda E[X^2]}{2(1-\rho)} + \frac{A}{2}$$

Single user gated system (e.g.,reservations)

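[Figure: timeline; packet i arrives with residual time Ri remaining and finds Ni = 4 packets (Xi-4, Xi-3, Xi-2, Xi-1) in queue; reservation intervals Vi-1 and Vi are shown before Xi]

$$W_i = R_i + \sum_{j=i-N_i}^{i-1} X_j + V$$

where V is the reservation interval that packet i must wait through.

E[Wi] = E[Ri] + E[Ni]E[X] + E[V]

W = R + NQ E[X] + E[V] (NQ = λW)

W = (R + E[V])/(1-ρ)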

SINGLE USER RESERVATION SYSTEM

• The residual service time is the same as in the vacation case,

$$R = \frac{\lambda E[X^2]}{2} + \frac{(1-\rho)E[V^2]}{2E[V]}$$

• Hence,

$$W = \frac{\lambda E[X^2]}{2(1-\rho)} + \frac{E[V^2]}{2E[V]} + \frac{E[V]}{1-\rho}$$

• If all reservation intervals are of constant duration A,

$$W = \frac{\lambda E[X^2]}{2(1-\rho)} + \frac{A}{2} + \frac{A}{1-\rho}$$

Multi-user exhaustive system

• Consider m incoming streams of packets, each of rate λ/m

• Service times {Xn} are IID and independent of arrivals with mean 1/µ, second moment E[X2].

• Server serves all packets from stream 0, then all from stream 1, ..., then all from m-1, then all from 0, etc.

• There is a reservation interval of fixed duration Vi = V (for all i)

[Figure: timeline for m = 3 showing an arrival from stream 0 and its wait W across the reservation intervals and the data intervals of streams 0, 1, 2, 0]

• Consider arbitrary packet i
• Let Yi = the duration of whole reservation intervals during which packet i must wait (E[Yi] = Y)

W = R + ρW + Y

• Packet i may arrive during the reservation or data interval of any of the m streams with equal probability (1/m)

– If it arrives during its own interval Yi = 0, etc., hence,

$Y_i = iV$ w.p. $1/m$, $0 \le i < m$

$Y = E[Y_i] = \frac{V}{m}\sum_{i=0}^{m-1} i = \frac{V(m-1)}{2}$

$W = \frac{R + Y}{1-\rho}, \quad R = \frac{(1-\rho)V^2}{2V} + \frac{\lambda E[X^2]}{2}$

$W = \frac{(1-\rho)V + \lambda E[X^2] + V(m-1)}{2(1-\rho)} = \frac{V}{2} + \frac{V(m-1)}{2(1-\rho)} + \frac{\lambda E[X^2]}{2(1-\rho)} = \frac{\lambda E[X^2]}{2(1-\rho)} + \frac{V(m-\rho)}{2(1-\rho)}$

• In text, V = A/m and hence,

$W = \frac{A}{2m} + \frac{A(m-1)}{2m(1-\rho)} + \frac{\lambda E[X^2]}{2(1-\rho)} = \frac{\lambda E[X^2]}{2(1-\rho)} + \frac{A(1-\rho/m)}{2(1-\rho)}$

Gated System

• When a packet arrives during its own reservation interval, it must wait m full reservation intervals

$Y_i = iV$ w.p. $1/m$, $1 \le i \le m$

$Y = E[Y_i] = \frac{V}{m}\sum_{i=1}^{m} i = \frac{V(m+1)}{2}$

$W = \frac{V}{2} + \frac{V(m+1)}{2(1-\rho)} + \frac{\lambda E[X^2]}{2(1-\rho)}$

With V = A/m,

$W = \frac{\lambda E[X^2]}{2(1-\rho)} + \frac{A}{2m} + \frac{A(1+1/m)}{2(1-\rho)} = \frac{\lambda E[X^2]}{2(1-\rho)} + \frac{A(1+(2-\rho)/m)}{2(1-\rho)}$
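As a rough numerical check of the multi-user results, the following sketch evaluates the exhaustive and gated waiting times with a fixed reservation interval V per stream. The function names and sample values are illustrative assumptions.

```python
def w_multi_exhaustive(lam, ex, ex2, v, m):
    """Mean wait, m-user exhaustive system with fixed reservation interval v."""
    rho = lam * ex
    if rho >= 1:
        raise ValueError("unstable: rho must be below 1")
    return lam * ex2 / (2 * (1 - rho)) + v * (m - rho) / (2 * (1 - rho))


def w_multi_gated(lam, ex, ex2, v, m):
    """Mean wait, m-user gated system: Y grows from V(m-1)/2 to V(m+1)/2."""
    rho = lam * ex
    if rho >= 1:
        raise ValueError("unstable: rho must be below 1")
    return lam * ex2 / (2 * (1 - rho)) + v * (m + 2 - rho) / (2 * (1 - rho))


# Assumed example: total rate 0.5, E[X] = 1, E[X^2] = 2, V = 0.1, m = 4 streams
print(w_multi_exhaustive(0.5, 1.0, 2.0, 0.1, 4))
print(w_multi_gated(0.5, 1.0, 2.0, 0.1, 4))
```

The gap between the two is V/(1-ρ) for any m, and setting m = 1 recovers the single-user formulas with a constant reservation interval.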

M/G/1 Priority Queueing

• Priority classes 1, …, n (class 1 highest and n lowest)
– λk = arrival rate for class k
– µk = service rate for class k
– $E[X_k^2]$ = second moment of service time (class k)

• Non-preemptive system: Customer receiving service is allowed to complete service without interruption

$W_k = \frac{\sum_{i=1}^{n}\lambda_i E[X_i^2]}{2(1-\rho_1-\dots-\rho_{k-1})(1-\rho_1-\dots-\rho_k)}, \quad \rho_i = \lambda_i/\mu_i$

• Notice that the waiting time of high priority traffic is affected by lower priority traffic
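A small sketch of the non-preemptive formula, evaluated class by class. The function name and the two-class example (exponential service with mean 1, so E[X^2] = 2) are assumptions made for illustration.

```python
def nonpreemptive_waits(lams, mus, ex2s):
    """Mean queueing delay W_k per class; index 0 is the highest priority."""
    rhos = [lam / mu for lam, mu in zip(lams, mus)]
    if sum(rhos) >= 1:
        raise ValueError("unstable: total utilization must be below 1")
    # Residual time seen by an arrival: every class contributes, including lower ones.
    residual = sum(lam * x2 for lam, x2 in zip(lams, ex2s)) / 2
    waits, higher = [], 0.0
    for rho in rhos:
        waits.append(residual / ((1 - higher) * (1 - higher - rho)))
        higher += rho
    return waits


print(nonpreemptive_waits([0.2, 0.3], [1.0, 1.0], [2.0, 2.0]))
```

Because the residual term sums over all n classes, raising the low priority arrival rate increases the wait of the high priority class as well.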

Preemptive-resume systems

• When a higher priority customer arrives, lower priority customer is interrupted
– Service is resumed when no higher priority customers remain
– Notice that the delay of high priority customers is no longer affected by that of lower priority customers
– Preemption is not always practical and usually involves some overhead

• Consider a class k arrival and let,

– Wk = waiting time due to customers of class k or higher priority (classes 1..k-1) already in the system
  Rk = residual time for class k or higher customers
  Notice that lower priority customers in service don’t affect Wk because they are preempted
– WI = waiting time for higher priority customers that arrive while the priority k customer is already in the system
– Tk = average system time for priority k customer

$T_k = W_k + W_I + 1/\mu_k$

Preemptive-resume, continued…

$W_k = \frac{R_k}{1-\rho_1-\dots-\rho_k}, \quad R_k = \frac{\sum_{i=1}^{k}\lambda_i E[X_i^2]}{2}$

$W_I = \sum_{i=1}^{k-1}(\lambda_i/\mu_i)T_k = \sum_{i=1}^{k-1}\rho_i T_k$

$T_k = \frac{1}{\mu_k} + \frac{R_k}{1-\rho_1-\dots-\rho_k} + T_k\sum_{i=1}^{k-1}\rho_i$

$T_k = \frac{(1/\mu_k)(1-\rho_1-\dots-\rho_k) + R_k}{(1-\rho_1-\dots-\rho_{k-1})(1-\rho_1-\dots-\rho_k)}$

• Notice independence of lower priority traffic
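The closed form for Tk can be evaluated directly. This is a sketch under assumed names and sample values (two classes, exponential service with mean 1); it is not taken from the lecture.

```python
def preemptive_resume_times(lams, mus, ex2s):
    """Mean system time T_k per class; index 0 is the highest priority."""
    times, sigma_prev, r_k = [], 0.0, 0.0
    for lam, mu, x2 in zip(lams, mus, ex2s):
        r_k += lam * x2 / 2              # residual time of classes 1..k only
        sigma_k = sigma_prev + lam / mu  # rho_1 + ... + rho_k
        if sigma_k >= 1:
            raise ValueError("unstable for this class")
        times.append(((1 / mu) * (1 - sigma_k) + r_k)
                     / ((1 - sigma_prev) * (1 - sigma_k)))
        sigma_prev = sigma_k
    return times


print(preemptive_resume_times([0.2, 0.3], [1.0, 1.0], [2.0, 2.0]))
```

For the top class the result reduces to the M/M/1 value 1/(µ - λ) in this example, and it does not change when the lower class rate changes.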

Stability of Queueing Systems

• Possible Definitions

– Average Delay is bounded

E(delay) < infinity

– Delay is finite with probability 1

P(delay < infinity) = 1

– Existence of a stationary occupancy distribution

Occupancy does not drift to infinity

E(delay) < Infinity

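• Example: M/M/1 queue

$T = \frac{1}{\mu-\lambda} < \infty \quad \forall\, \lambda < \mu \Rightarrow \rho < 1$

• Example: M/G/1 queue

$T = \frac{1}{\mu} + \frac{\lambda E[X^2]}{2(1-\rho)} < \infty$ if $(\rho < 1)$ and $(E[X^2] < \infty)$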

P(Delay< Infinity) = 1

• Slightly weaker definition than E[delay] < infinity

• P(delay < infinity) = 1 even if E(delay) = infinity

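• In general it can be shown that for any G/G/1 queue
– Arrival and service time distributions may even be correlated!

If λ < µ, P(delay < infinity) = 1 even if E(delay) is not finite

• Example: delay with density

$f(d) = \frac{2}{\pi(1+d^2)}, \quad d > 0$

$E[\text{Delay}] = \int_0^\infty \frac{2d}{\pi(1+d^2)}\,\mathrm{d}d = \left.\frac{\log(1+d^2)}{\pi}\right|_0^\infty = \infty$

$P[\text{Delay} < x] = \int_0^x \frac{2}{\pi(1+d^2)}\,\mathrm{d}d = \frac{2\arctan(x)}{\pi} \to 1$ as $x \to \infty$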
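The contrast between the two definitions can be seen numerically with the density above. The helper names below are assumptions; the closed forms are the two integrals just given, with the mean truncated at a finite upper limit x.

```python
import math


def delay_cdf(x):
    """P[Delay < x] = 2*arctan(x)/pi for the density 2/(pi*(1 + d^2)), d > 0."""
    return 2 * math.atan(x) / math.pi


def truncated_mean(x):
    """Integral of d*f(d) from 0 to x, equal to log(1 + x^2)/pi."""
    return math.log(1 + x * x) / math.pi


for x in (1e1, 1e3, 1e6):
    print(x, delay_cdf(x), truncated_mean(x))
```

The CDF approaches 1 while the truncated mean keeps growing like 2·log(x)/π, so the delay is finite with probability 1 even though its expectation is not.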

Existence of a stationary occupancy distribution

• Irreducible and Aperiodic Markov chain

– Pj > 0 for all states j => all states are visited infinitely often

• Drift: $D_i = E[X_{n+1} - X_n \mid X_n = i] = \sum_{k=-i}^{\infty} k\,P(i, i+k)$

• When in state i, Di > 0 => state tends to increase; Di < 0 => state tends to decrease

• Intuitively, we don’t want the state to drift to infinity, hence for large enough states the drift must become negative

• Lemma: If Di < infinity for all i, and for some δ > 0 and i’ > 0, Di < -δ for all i > i’, then the Markov chain has a stationary distribution

Irreducible: all states communicate (i.e., positive probability of getting from every state to every other state)
Periodic state: returns to the state are possible only after a number of transitions n that is a multiple of some constant d > 1 (e.g., n = 3, 6, 9, …). Aperiodic => no state is periodic

Examples

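• M/M/1

$D_i = E[X_{n+1} - X_n \mid X_n = i] = 1(\lambda\delta) - 1(\mu\delta) = (\lambda-\mu)\delta$

$D_i < 0 \Rightarrow \lambda < \mu$

• M/M/m

$D_i = E[X_{n+1} - X_n \mid X_n = i] = 1(\lambda\delta) - 1(m\mu\delta) \quad \forall i \ge m$

$D_i < 0 \Rightarrow \lambda < m\mu \quad \forall i \ge m$

• M/M/Inf

$D_i = E[X_{n+1} - X_n \mid X_n = i] = 1(\lambda\delta) - 1(i\mu\delta)$

$D_i < 0 \Rightarrow \lambda < i\mu$

For any $\lambda < \infty$ and $1/\mu < \infty$, $\exists\, i'$ s.t. $D_i < 0 \;\; \forall i > i'$

[Figure: transitions between states i and i+1, with upward rate λδ and downward rate (i+1)µδ in the M/M/Inf case]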
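The three drift expressions are easy to tabulate. In this sketch the function names are assumptions, and the M/M/m drift uses min(i, m) busy servers so that it also covers states below m, which the slide does not state explicitly.

```python
import math


def drift_mm1(lam, mu, delta):
    """Drift of the discretized M/M/1 chain in any state i >= 1."""
    return (lam - mu) * delta


def drift_mmm(i, lam, mu, m, delta):
    """Drift of M/M/m in state i; min(i, m) servers are busy (assumption for i < m)."""
    return (lam - min(i, m) * mu) * delta


def drift_mminf(i, lam, mu, delta):
    """Drift of M/M/Inf in state i: all i customers are in service."""
    return (lam - i * mu) * delta


def first_negative_state(lam, mu):
    """Smallest i with lam < i*mu, i.e. where the M/M/Inf drift turns negative."""
    return math.floor(lam / mu) + 1


print(drift_mmm(5, 3.0, 1.0, 4, 0.01), first_negative_state(10.0, 1.0))
```

For M/M/Inf a threshold state always exists for finite λ and 1/µ, which is why that queue is stable at any load.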

Lectures 13 & 14

Packet Multiple Access: The Aloha protocol

Eytan Modiano Massachusetts Institute of Technology

Multiple Access

• Shared Transmission Medium – a receiver can hear multiple transmitters – a transmitter can be heard by multiple receivers

• The major problem with multi-access is allocating the channel between the users; the nodes do not know when the other nodes have data to send

– Need to coordinate transmissions

Examples of Multiple Access Channels

• Local area networks (LANs) – Traditional Ethernet – Recent trend to non-multi-access LANs

• satellite channels

• Multi-drop telephone

• Wireless radio

• Medium Access Control (MAC) – Regulates access to channel

• Logical Link Control (LLC) – All other DLC functions

Approaches to Multiple Access

• Fixed Assignment (TDMA, FDMA, CDMA) – each node is allocated a fixed fraction of bandwidth – Equivalent to circuit switching – very inefficient for low duty factor traffic

• Contention systems – Polling

– Reservations and Scheduling

– Random Access

Single receiver, many transmitters

[Figure: several transmitters sharing a channel to one receiver; e.g., satellite system, wireless]

Slotted Aloha

• Time is divided into “slots” of one packet duration
– E.g., fixed size packets

• When a node has a packet to send, it waits until the start of the next slot to send it
– Requires synchronization

• If no other nodes attempt transmission during that slot, the transmission is successful
– Otherwise “collision”
– Collided packets are retransmitted after a random delay

[Figure: five consecutive slots labeled Success, Idle, Collision, Idle, Success]

Slotted Aloha Assumptions

• Poisson external arrivals

• No capture
– Packets involved in a collision are lost
– Capture models are also possible

• Immediate feedback
– Idle (0), Success (1), Collision (e)

• If a new packet arrives during a slot, transmit in next slot

• If a transmission has a collision, node becomes backlogged
– While backlogged, transmit in each slot with probability qr until successful

• Infinite nodes where each arriving packet arrives at a new node
– Equivalent to no buffering at a node (queue size = 1)
– Pessimistic assumption gives a lower bound on Aloha performance

Markov chain for slotted aloha

• state (n) of system is number of backlogged nodes.

pi,i-1 = prob. of one backlogged attempt and no new arrival

pi,i = prob. of one new arrival and no backlogged attempts, or no new arrival and no success

pi,i+1= prob of one new arrival and one or more backlogged attempts

pi,i+j = prob. of j new arrivals, for j ≥ 2 (two or more new arrivals always collide, whether or not backlogged nodes attempt)

• Steady state probabilities do not exist
– Backlog tends to infinity => system unstable
– More later

slotted aloha

• Let g(n) be the attempt rate (the expected number of packets transmitted in a slot) in state n

$g(n) = \lambda + nq_r$

• The number of attempted packets per slot in state n is approximately a Poisson random variable of mean g(n)

– $P(m \text{ attempts}) = g(n)^m e^{-g(n)}/m!$
– P(idle) = probability of no attempts in a slot = $e^{-g(n)}$
– P(success) = probability of one attempt in a slot = $g(n)e^{-g(n)}$
– P(collision) = P(two or more attempts) = 1 - P(idle) - P(success)

Throughput of Slotted Aloha

• The throughput is the fraction of slots that contain a successful transmission = P(success) = $g(n)e^{-g(n)}$
– When system is stable throughput must also equal the external arrival rate (λ)

[Figure: departure rate $g(n)e^{-g(n)}$ versus g(n), peaking at g(n) = 1]

– What value of g(n) maximizes throughput?

$\frac{d}{dg(n)}\left[g(n)e^{-g(n)}\right] = e^{-g(n)} - g(n)e^{-g(n)} = 0$

$\Rightarrow g(n) = 1 \Rightarrow P(\text{success}) = g(n)e^{-g(n)} = 1/e \approx 0.368$

– g(n) < 1 => too many idle slots
– g(n) > 1 => too many collisions
– If g(n) can be kept close to 1, an external arrival rate of 1/e packets per slot can be sustained
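Under the Poisson approximation, the per-slot outcome probabilities follow directly from g(n). A minimal sketch (the function name is an assumption):

```python
import math


def slot_probabilities(g):
    """Return (P(idle), P(success), P(collision)) for attempt rate g per slot."""
    idle = math.exp(-g)
    success = g * math.exp(-g)
    return idle, success, 1.0 - idle - success


for g in (0.5, 1.0, 1.5):
    print(g, slot_probabilities(g))
```

At g = 1 idle and success slots are equally likely (1/e each); moving g in either direction trades successes for idles or for collisions.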

Instability of slotted aloha

• If backlog increases beyond the unstable point (bad luck) then it tends to increase without limit and the departure rate drops to 0

• Drift in state n, D(n), is the expected change in backlog over one time slot

– $D(n) = \lambda - P(\text{success}) = \lambda - g(n)e^{-g(n)}$

[Figure: departure rate and arrival rate λ versus G = λ + nqr, starting at G = 0; the curves cross at a stable point and an unstable point, with negative drift where the departure rate exceeds λ and positive drift elsewhere]

Stabilizing slotted aloha

• Choosing qr small increases the backlog at which instability occurs (since g(n) = λ + nqr), but also increases delay (since mean retry time is 1/qr)

• Solution: estimate the backlog (n) from past feedback
– Given the backlog estimate, choose qr to keep g(n) = 1
  Assume all arrivals are immediately backlogged: g(n) = nqr, $P(\text{success}) = nq_r(1-q_r)^{n-1}$
  To maximize P(success) choose qr = min{1, 1/n}
– When the estimate of n is perfect: idles occur with probability 1/e, successes with 1/e, and collisions with 1-2/e
– When the estimate is too large, too many idle slots occur
– When the estimate is too small, too many collisions occur

• Nodes can use feedback information (0,1,e) to make estimates
– A good rule is to increase the estimate of n on each collision, and to decrease it on each idle slot or successful slot
  Note that the increase on a collision should be $(e-2)^{-1}$ times as large as the decrease on an idle slot

stabilized slotted aloha

• Assume all arrivals are immediately backlogged
– g(n) = nqr = attempt rate
– $p(\text{success}) = nq_r(1-q_r)^{n-1}$
  For max throughput set g(n) = 1 => qr = min{1, 1/n’}, where n’ is the estimate of n

– Let nk = estimate of backlog after kth slot

$n_{k+1} = \begin{cases} \max\{\lambda,\; n_k + \lambda - 1\} & \text{idle or success} \\ n_k + \lambda + (e-2)^{-1} & \text{collision} \end{cases}$

– Can be shown to be stable for λ < 1/e
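The estimator update above translates into a few lines of code. This sketch covers only the update rule and the resulting retransmission probability; the function names and the string feedback labels are assumptions.

```python
import math


def update_estimate(n_hat, lam, feedback):
    """One step of the backlog estimate; feedback is 'idle', 'success' or 'collision'."""
    if feedback == "collision":
        return n_hat + lam + 1.0 / (math.e - 2.0)
    # Idle or success: decrease by one, but never below the arrival rate.
    return max(lam, n_hat + lam - 1.0)


def retransmission_probability(n_hat):
    """q_r = min{1, 1/n'} keeps the attempt rate g(n) = n*q_r close to 1."""
    return 1.0 if n_hat <= 1.0 else 1.0 / n_hat


n_hat = 5.0
for fb in ("collision", "idle", "success"):
    n_hat = update_estimate(n_hat, 0.3, fb)
    print(fb, round(n_hat, 4), round(retransmission_probability(n_hat), 4))
```

Each node can run the same update from the common (0,1,e) feedback, so all nodes hold the same estimate without exchanging any extra information.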

TDM vs. slotted aloha

[Figure: delay versus arrival rate (0 to 0.8) for slotted Aloha and for TDM with m = 8 and m = 16]

• Aloha achieves lower delays when arrival rates are low
• TDM results in very large delays with large number of users, while Aloha is independent of the number of users

Pure (unslotted) Aloha

• New arrivals are transmitted immediately (no slots)
– No need for synchronization
– No need for fixed length packets

• A backlogged packet is retried after an exponentially distributed random delay with some mean 1/x

• The total arrival process is a time varying Poisson process of rate g(n) = λ + nx (n = backlog, 1/x = ave. time between retransmissions)

• Note that an attempt suffers a collision if the previous attempt is not yet finished ($t_i - t_{i-1} < 1$) or the next attempt starts too soon ($t_{i+1} - t_i < 1$)

[Figure: timeline of attempts at times t1 through t5, showing new arrivals, a collision, and retransmissions after random delays τ]

Throughput of Unslotted Aloha

• An attempt is successful if the inter-attempt intervals on both sides exceed 1 (for unit duration packets)

– $P(\text{success}) = e^{-g(n)}e^{-g(n)} = e^{-2g(n)}$

– Throughput (success rate) = $g(n)e^{-2g(n)}$

– For max throughput at g(n) = 1/2, Throughput = 1/(2e) ≈ 0.18

– stabilization issues are similar to slotted aloha

– advantages of unslotted aloha are simplicity and possibility of unequal length packets
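A quick numerical comparison of the pure and slotted throughput curves; the function names and the grid search are illustrative assumptions, not part of the lecture.

```python
import math


def pure_aloha_throughput(g):
    """Success rate g*exp(-2g): both neighbouring inter-attempt gaps must exceed 1."""
    return g * math.exp(-2.0 * g)


def slotted_aloha_throughput(g):
    """Success rate g*exp(-g) for the slotted system."""
    return g * math.exp(-g)


# Coarse grid search for the maximizing attempt rate of each curve.
grid = [k / 1000 for k in range(1, 3001)]
best_pure = max(grid, key=pure_aloha_throughput)
best_slotted = max(grid, key=slotted_aloha_throughput)
print(best_pure, pure_aloha_throughput(best_pure))
print(best_slotted, slotted_aloha_throughput(best_slotted))
```

Removing slot synchronization doubles the vulnerable period of each packet, which halves both the best attempt rate and the peak throughput.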

Splitting Algorithms

• More efficient approach to resolving collisions
– Simple feedback (0,1,e)
– Basic idea: assume only two packets are involved in a collision
  Suppose all other nodes remain quiet until collision is resolved, and nodes in the collision each transmit with probability 1/2 until one is successful
  On the next slot after this success, the other node transmits
  The expected number of slots for the first success is 2, so the expected number of slots to transmit 2 packets is 3 slots
  Throughput over the 3 slots = 2/3
– In practice above algorithm cannot really work
  Cannot assume only two users involved in collision
  Practical algorithm must allow for collisions involving unknown number of users

Tree algorithms

• After a collision, all new arrivals and all backlogged packets not in the collision wait

• Each colliding packet randomly joins either one of two groups (Left and Right groups)
– Toss of a fair coin
– Left group transmits during next slot while Right group waits
  If collision occurs Left group splits again (stack algorithm)
  Right group waits until Left collision is resolved
– When Left group is done, Right group transmits

[Figure: example splitting tree for packets (1,2,3,4): the root collision splits into (1,2,3) and 4; (1,2,3) splits further, with an idle slot followed by a (2,3) collision, until every packet succeeds]

Notice that after the idle slot, collision between (2,3) was sure to happen and could have been avoided

Many variations and improvements on the original tree splitting algorithm
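The basic tree algorithm can be sketched as a recursion on the set of colliding packets. This is a simplified illustration: it counts slots only, ignores new arrivals during the resolution period, and the function name and seeded generator are assumptions.

```python
import random


def resolve(packets, rng):
    """Slots used by basic tree splitting to resolve the given set of packets."""
    if len(packets) <= 1:
        return 1  # one slot: idle (empty group) or success (single packet)
    left, right = [], []
    for p in packets:
        # Each colliding packet tosses a fair coin to pick its group.
        (left if rng.random() < 0.5 else right).append(p)
    # One slot for this collision, then the Left subtree, then the Right subtree.
    return 1 + resolve(left, rng) + resolve(right, rng)


rng = random.Random(1)
samples = [resolve([1, 2, 3, 4], rng) for _ in range(10000)]
print(sum(samples) / len(samples))  # average slots to resolve a 4-packet collision
```

The recursion also re-transmits a Right group that is certain to collide after an idle Left slot, which is exactly the waste that the improved variants avoid.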

Throughput comparison

• stabilized pure aloha T = 0.184 = (1/(2e))

• stabilized slotted aloha T = 0.368 = (1/e)

• Basic tree algorithm T = 0.434

• Best known variation on tree algorithm T = 0.4878

• Upper bound on any collision resolution algorithm with (0,1,e) feedback T <= 0.568

• TDM achieves throughputs up to 1 packet per slot, but the delay increases linearly with the number of nodes

Lectures 15 & 16

Local Area Networks

Eytan Modiano

Carrier Sense Multiple Access (CSMA)

• In certain situations nodes can hear each other by listening to the channel - “Carrier Sensing”

• CSMA: Polite version of Aloha
– Nodes listen to the channel before they start transmission
  Channel idle => Transmit
  Channel busy => Wait (join backlog)
– When do backlogged nodes transmit?
  When channel becomes idle backlogged nodes attempt transmission with probability qr
  Persistent protocol, qr = 1
  Non-persistent protocol, qr < 1

• Let τ = the maximum propagation delay on the channel
– When a node starts/stops transmitting, it will take this long for all nodes to detect channel busy/idle

• For initial understanding, view the system as slotted with "mini-slots" of duration equal to the maximum propagation delay
– Normalize the mini-slot duration to β = τ/DTp (DTp = packet transmission time) and packet duration = 1

[Figure: time axis divided into mini-slots of duration β, with a packet occupying duration 1]

• Actual systems are not slotted, but this hypothetical system simplifies the analysis and understanding of CSMA

Rules for slotted CSMA

• When a new packet arrives
– If current mini-slot is idle, start transmitting in the next mini-slot
– If current mini-slot is busy, node joins backlog
– If a collision occurs, nodes involved in collision become backlogged

• Backlogged nodes attempt transmission after an idle mini-slot with probability qr < 1 (non-persistent)
– Transmission attempts only follow an idle mini-slot
– Each “busy-period” (success or collision) is followed by an idle slot before a new transmission can begin

• Time can be divided into epochs:
– A successful packet followed by an idle mini-slot (duration = β+1)
– A collision followed by an idle mini-slot (duration = β+1)
– An idle mini-slot (duration = β)

Analysis of CSMA

• Let the state of the system be the number of backlogged nodes

• Let the state transition times be the end of idle slots
– Let T(n) = average amount of time between state transitions when the system is in state n
  $T(n) = \beta + (1 - e^{-\lambda\beta}(1-q_r)^n)$
  When qr is small, $(1-q_r)^n \approx e^{-q_r n}$ => $T(n) = \beta + (1 - e^{-\lambda\beta - nq_r})$

• At the beginning of each epoch, each backlogged node transmits with probability qr

• New arrivals during the previous idle slot are also transmitted

• With backlog n, the number of packets that attempt transmission at the beginning of an epoch is approximately Poisson with rate

g(n) = λβ + nqr

Analysis of CSMA

• The probability of success (per epoch) is

$P_s = g(n)e^{-g(n)}$

• The expected duration of an epoch is approximately

$T(n) \approx \beta + (1 - e^{-g(n)})$

• Thus the success rate per unit time is

$\lambda < \text{departure rate} = \frac{g(n)e^{-g(n)}}{\beta + 1 - e^{-g(n)}}$

Maximum Throughput for CSMA

• The optimal value of g(n) can again be obtained:

$g(n) \approx \sqrt{2\beta} \;\Rightarrow\; \lambda < \frac{1}{1+\sqrt{2\beta}}$

• Tradeoff between idle slots and time wasted on collisions

• High throughput when β is small

• Stability issues similar to Aloha (less critical)

[Figure: departure rate and arrival rate versus g(n) = λβ + nqr; the departure rate peaks at about $1-\sqrt{2\beta}$ near $g(n) = \sqrt{2\beta}$]
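The approximation g(n) ≈ √(2β) can be checked against a direct numerical maximization of the departure rate. The grid search and the sample value β = 0.01 below are assumptions chosen for illustration.

```python
import math


def csma_rate(g, beta):
    """Departure rate per unit time: g*exp(-g) / (beta + 1 - exp(-g))."""
    return g * math.exp(-g) / (beta + 1.0 - math.exp(-g))


def best_attempt_rate(beta, steps=20000, g_max=2.0):
    """Grid search for the attempt rate g that maximizes the departure rate."""
    grid = (k * g_max / steps for k in range(1, steps + 1))
    return max(grid, key=lambda g: csma_rate(g, beta))


beta = 0.01
g_opt = best_attempt_rate(beta)
print(g_opt, csma_rate(g_opt, beta))                          # numerical optimum
print(math.sqrt(2 * beta), 1 / (1 + math.sqrt(2 * beta)))    # slide approximation
```

For small β the two agree closely; the closed form slightly overestimates the achievable rate because it keeps only the leading terms in g.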

Unslotted CSMA

• Slotted CSMA is not practical
– Difficult to maintain synchronization
– Mini-slots are useful for understanding but not critical to the performance of CSMA

• Unslotted CSMA will have slightly lower throughput due to increased probability of collision

• Unslotted CSMA has a smaller effective value of β than slotted CSMA
– Essentially β becomes average instead of maximum propagation delay

CSMA/CD and Ethernet

[Figure: workstations (WS) attached to a two-way cable]

• CSMA with Collision Detection (CD) capability
– Nodes able to detect collisions
– Upon detection of a collision nodes stop transmission
  Reduce the amount of time wasted on collisions

• Protocol:
– All nodes listen to transmissions on the channel
– When a node has a packet to send:
  Channel idle => Transmit
  Channel busy => wait a random delay (binary exponential backoff)
– If a transmitting node detects a collision it stops transmission
  Waits a random delay and tries again

Time to detect collisions

[Figure: two workstations (WS) separated by propagation delay τ]

• A collision can occur while the signal propagates between the two nodes

• It would take an additional propagation delay for both users to detect the collision and stop transmitting

• If τ is the maximum propagation delay on the cable then if a collision occurs, it can take up to 2τ seconds for all nodes involved in the collision to detect and stop transmission

Approximate model for CSMA/CD

• Simplified approximation for added insight

• Consider a slotted system with “mini-slots” of duration 2τ

[Figure: time axis divided into mini-slots of duration 2τ, with a packet occupying duration 1]

• If a node starts transmission at the beginning of a mini-slot, by the end of the mini-slot either
– No collision occurred and the rest of the transmission will be uninterrupted
– A collision occurred, but by the end of the mini-slot the channel would be idle again

• Hence a collision at most affects one mini-slot

Analysis of CSMA/CD

• Assume N users and that each attempts transmission during a free “mini-slot” with probability P
– P includes new arrivals and retransmissions

$P(i \text{ users attempt}) = \binom{N}{i}P^i(1-P)^{N-i}$

$P(\text{exactly 1 attempt}) = P(\text{success}) = NP(1-P)^{N-1}$

To maximize P(success),

$\frac{d}{dP}\left[NP(1-P)^{N-1}\right] = N(1-P)^{N-1} - N(N-1)P(1-P)^{N-2} = 0$

$\Rightarrow P_{opt} = \frac{1}{N}$

⇒ Average attempt rate of one per slot

⇒ Notice the similarity to slotted Aloha

Analysis of CSMA/CD, continued

With P = 1/N, $P(\text{success}) = NP(1-P)^{N-1} = \left(1-\frac{1}{N}\right)^{N-1}$

$P_s = \lim_{N\to\infty} P(\text{success}) = \frac{1}{e}$

Let X = number of mini-slots per successful transmission

$P(X = i) = (1-P_s)^{i-1}P_s \;\Rightarrow\; E[X] = \frac{1}{P_s} = e$

• Once a mini-slot has been successfully captured, transmission continues without interruption

• New transmission attempts will begin at the next mini-slot after the end of the current packet transmission

Analysis of CSMA/CD, continued

• Let S = average amount of time between successful packet transmissions

S = (e-1)2τ + DTp + τ
– (e-1)2τ: idle/collision mini-slots
– DTp: packet transmission time
– τ: average time until start of next mini-slot

• Efficiency = DTp/S = DTp / (DTp + τ + 2τ(e-1))

• Let β = τ/DTp => Efficiency ≈ 1/(1+4.4β) => λ < 1/(1+4.4β)

• Compare to CSMA without CD where $\lambda < \frac{1}{1+\sqrt{2\beta}}$

Notes on CSMA/CD

• Can be viewed as a reservation system where the mini-slots are used for making reservations for data slots

• In this case, Aloha is used for making reservations during the mini-slots

• Once a user captures a mini-slot it continues to transmit without interruptions

• In practice, of course, there are no mini-slots
– Minimal impact on performance but analysis is more complex

CSMA/CD examples

• Example (Ethernet)
– Transmission rate = 10 Mbps
– Packet length = 1000 bits, DTp = 10^-4 sec
– Cable distance = 1 mile, τ = 5x10^-6 sec
– β = 5x10^-2 and E = 80%

• Example (GEO Satellite) - propagation delay 1/4 second
– β = 2,500 and E ~ 0%

• CSMA/CD only suitable for short propagation scenarios!

• How is Ethernet extended to 100 Mbps?

• How is Ethernet extended to 1 Gbps?
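The Ethernet and GEO numbers above can be reproduced from the efficiency formula DTp / (DTp + τ + 2τ(e-1)). The function name is an assumption; the parameter values are the ones given in the two examples.

```python
import math


def csma_cd_efficiency(tau, packet_time):
    """Efficiency = DTp / (DTp + tau + 2*tau*(e - 1)) from the mini-slot model."""
    return packet_time / (packet_time + tau + 2.0 * tau * (math.e - 1.0))


packet_time = 1000 / 10e6  # 1000 bits at 10 Mbps = 1e-4 sec
ethernet = csma_cd_efficiency(5e-6, packet_time)
geo = csma_cd_efficiency(0.25, packet_time)
print(f"Ethernet: beta = {5e-6 / packet_time:.3g}, efficiency = {ethernet:.3f}")
print(f"GEO:      beta = {0.25 / packet_time:.3g}, efficiency = {geo:.2e}")
```

Raising the transmission rate shrinks DTp and therefore raises β for the same cable, so efficiency falls unless the cable is shortened or packets are lengthened.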

Token rings

• Token rings were developed by IBM in the early 1980s

• Token: a bit sequence
– Token circulates around the ring
  Busy token: 01111111
  Free token: 01111110

• When a node wants to transmit
– Wait for free token
– Remove token from ring (replace with busy token)
– Transmit message
– When done transmitting, replace free token on ring
– Nodes must buffer 1 bit of data so that a free token can be changed to a busy token

• Token ring is basically a polling system
  Token does the polling

Token Ring

Release of token

• Release after transmission
– Node replaces token on ring as soon as it is done transmitting the packet
– Next node can use token after short propagation delay

• Release after reception
– Node releases token only after its own packet has returned to it
  Serves as a simple acknowledgement mechanism

PACKET TRANSMISSION (release after transmission)

• When not transmitting their own packets nodes relay whatever they receive

• After receiving an idle token a node can start sending a new packet (discard incoming bits)

• After a node sends a packet and the idle token, it sends idle fill until:
– The packet followed by idle, or
– busy token, returns around the ring

[Figure: transmitted and received bit streams at a node over time, showing busy token (BT), packet, idle token (IT), idle fill, and the packet's return]

PACKET TRANSMISSION (release after reception)

• In many implementations (including IEEE 802.5, but not including FDDI), a node waits to check its packet return before sending the idle token.
  This increases packet transmission time by one round trip delay.

[Figure: transmitted and received bit streams showing the idle token (IT) sent only after the packet has returned]

Delay analysis

• System can be analyzed using multi-user reservation results

• Exhaustive system - nodes empty their queue before passing token on to the next node

• Assume m nodes and each with Poisson arrivals of rate λ/m

• Let v = average propagation and token transmission delay

• System can be viewed as a reservation system with m users and average reservation interval v (see reservation system results)

$W = \frac{\lambda E[X^2] + v(m-\rho)}{2(1-\rho)}, \quad \rho = m(\lambda/m)E[X] = \lambda E[X]$

• Notice that 100% throughput can be achieved for exhaustive system

Throughput analysis (non-exhaustive)

• Gated system with limited service - each node is limited to sending one packet at a time
– When system is heavily loaded nodes are always busy and have a packet to send

• Suppose each node transmits one packet and then releases the token to the next node
– Vi = propagation and transmission time for token between two nodes (transmission time is usually negligible)

• The amount of time to transmit N packets

TN = N*E[X] + V1 + V2 +…+ VN = N*E[X] + N*E[V]

λ < N*E[X]/(N*E[X] + N*E[V]) = 1/(1+E[V]/E[X])

• Compare to CSMA/CD, but notice that V is the delay between two nodes and not the maximum delay on the fiber

Throughput analysis (token release after reception)

• A node releases the token only after its own packet has returned to it
• Again assume each node sends one packet at a time

• Total time to send ONE packet (m nodes on the ring)

T = E[X] + V1 + V2 +…+ Vm + Vi, where the last term Vi is the time to send the token to the next node

• T = E[X] + (m+1)E[V] =>

λ < E[X]/T = 1/(1+(m+1)E[V]/E[X])
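The two throughput bounds can be compared side by side. The function names and the sample ring (50 nodes, E[V] equal to 1% of the packet time) are assumptions for illustration.

```python
def throughput_release_after_transmission(ex, ev):
    """Max throughput 1/(1 + E[V]/E[X]) with one packet per token visit."""
    return 1.0 / (1.0 + ev / ex)


def throughput_release_after_reception(ex, ev, m):
    """Max throughput 1/(1 + (m+1)E[V]/E[X]): each packet also waits a full ring trip."""
    return 1.0 / (1.0 + (m + 1) * ev / ex)


ex, ev, m = 1.0, 0.01, 50
print(throughput_release_after_transmission(ex, ev))
print(throughput_release_after_reception(ex, ev, m))
```

The penalty for release after reception grows with the number of nodes m, since the round trip around the ring is charged to every packet.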

Delay Analysis

• Release after transmission – Partially gated limited service system (sec. 3.5.2)

$W = \frac{\lambda E[X^2] + v(m + \lambda E[X])}{2(1 - \lambda E[X] - \lambda v)}$

• Release after reception
– Homework problem 4.27
– Additional round-trip time can be added to the packet transmission, i.e., replace X by X + mv in the result above

$W = \frac{\lambda(E[X^2] + 2mvE[X] + m^2v^2) + v(m + \lambda(E[X] + mv))}{2(1 - \lambda(E[X] + (m+1)v))}$

Token ring issues

• Fairness: Can a node hold the token for a long time – Solution: maximum token hold time

• Token failures: Tokens can be created or destroyed by noise

– Distributed solution: Nodes are allowed to recognize the loss of a token and create a new token

Collision occurs when two or more nodes create a new token at the same time => need collision resolution algorithms

• Node failures: Since each node must relay all incoming data, the failure of a single node will disrupt the operation of the ring

• Token ring standard: IEEE 802.5

• Fiber distributed data interface (FDDI) is a 100 Mbps Fiber Optic Token Ring local area network standard

• FDDI uses two counter-rotating rings
– Single faults can be isolated by switching from one ring to the other on each side of fault

• Token release after transmission

• Limit on token hold time

• Upper-bound on time between token visits at a node
– Support for guaranteed delays
– Imposes a limit on the size of a ring (distance between nodes, number of nodes)

• FDDI designed to be a metro or campus area network technology

TOKEN BUSES

[Figure: workstations (WS) attached to a shared bus]

• Special control packet serves as a token
• Nodes must have token to transmit
• Token is passed from node to node in some order
– Conceptually, a token bus is the same as a token ring
– When one node finishes transmission, it sends an idle token to the next node (by addressing the control packet properly)
– Similar to a polling system

• Issues
– Efficiency lower than token rings due to longer transmission delay for the packets and longer propagation delays
– Need protocol for joining and leaving the bus

IMPLICIT TOKENS

• The idle tokens on a token bus can be replaced with silence
• The next node starts to transmit a packet after hearing the bus become silent
• If the next node has no packet, successive nodes start with successively greater delay
• If the bus propagation delay is much smaller than the time to transmit a token, this can reduce delay

• This scheme is used for wireless LANs (IEEE 802.11) and it goes by the name of CSMA/CA (collision avoidance)

DISTRIBUTED QUEUE DUAL BUS (DQDB)

• Metropolitan area network using two oppositely directed unidirectional 150 Mbps buses

• All frames are the same length (53 bytes); empty frames are generated at the head ends of the buses and are filled by the nodes "on the fly"

• A node uses the right moving bus to send frames to nodes on the right and the left moving bus for nodes on its left

• DQDB was standardized as IEEE 802.6 and was intended to be compatible with ATM

DQDB Reservations

• Greedy algorithm: Each node uses a free slot when it has something to send

– Thus an efficiency of 100% is possible

• The trouble with this trivial approach is unfairness - nodes at the tail of the bus can be “starved”

• DQDB uses a reservation system whereby nodes send requests upstream so that empty slots can be reserved

– If a node has a frame to send on the right bus, it sets the request bit in a frame on the left bus

– Nodes maintain an “implicit” queue of requests that can be served on a FCFS basis (hence the name distributed queue)

Large propagation delay (satellite networks)

[Figure: alternating reservation and data intervals on a satellite channel, with A = mv; a packet arrives, waits for a reservation interval, incurs the propagation delay, waits for its assigned data slot, then transmits]

• Satellite reservation system
– Use mini-slots to make reservation for longer data slots
– Mini-slot access can be inefficient (Aloha, TDMA, etc.)

• To a crude approximation, delay is 3/2 times the propagation delay plus ideal queueing delay.

Satellite Reservations

• Frame length must exceed round-trip delay
– Reservation slots during frame j are used to reserve data slots in frame j+1
– Variable length: serve all requests from frame j in frame j+1
  Difficult to maintain synchronization
  Difficult to provide QoS (e.g., support voice traffic)
– Fixed length: Maintain a virtual queue of requests

• Reservation mechanism
– Scheduler on board satellite
– Scheduler on ground
– Distributed queue algorithm
  All nodes keep track of reservation requests and use the same algorithm to make reservation

• Control channel access
– TDMA: Simple but difficult to add more users
– Aloha: Can support large number of users but collision resolution can be difficult and add enormous delay

Aloha Reservations

• Use Aloha to capture a slot
• After capturing a slot user keeps the slot until done
– Other users observe the slot busy and don’t attempt
• When done other users can go after the slot
– Other users observe the slot idle and attempt using Aloha
• Method useful for long data transfers or for mixed voice and data

[Figure: occupancy of slots 1 to 6 over frames 1 to 5; a user that captures a slot keeps it in later frames, and idle slots are open to contention]

Packet multiple access summary

• Latency: Ratio of propagation delay to packet transmission time
– GEO example: Dp = 0.5 sec, packet length = 1000 bits, R = 1 Mbps
  Latency = 500 => very high
– LEO example: Dp = 0.1 sec
  Latency = 100 => still very high
– Over satellite channels data rate must be very low to be in a low latency environment

• Low latency protocols
– CSMA, Polling, Token Rings, etc.
– Throughput ~ 1/(1+aα), α = latency, a = constant

• High latency protocols
– Aloha is insensitive to latency, but generally low throughput
  Very small delays
– Reservation system can achieve high throughput
  Delays for making reservations
– Protocols can be designed to be a hybrid of Aloha and reservations
  Aloha at low loads, reservations at high loads

Migration to switched LANs

• Traditional Ethernet
– Nodes connected with coax
  Long “runs” of wire everywhere
– CSMA/CD protocol

[Figure: workstations (WS) attached to a shared coax bus]

• “Hub” Ethernet
– Nodes connected to hub
  Hub acts as a broadcast repeater
  Shorter cable “runs”, useful for 100 Mbps
– CSMA/CD protocol
– Easy to add/remove users
– Easy to localize faults
– Cheap cabling (twisted pair, 10baseT)

• Switched Ethernet
– No CSMA/CD
  Easy to increase data rate (e.g., Gbit Ethernet)
– Nodes transmit when they want
– Switch queues the packets and transmits to destination
– Typical switch capacity of 20-40 ports
– Each node can now transmit at the full rate of 10/100/Gbps
– Modularity: Switches can be connected to each other using high rate ports

[Figure: nodes attached to a packet switch, with a connection to other switches]

Lectures 17 & 18

Fast packet switching

Eytan Modiano Massachusetts Institute of Technology

Packet switches

[Figure: packet switch block diagram with routing engine, switch scheduler, and switch fabric; the routing engine maps the destination address in the packet header to a tag holding the output port number or VC number]

• A packet switch consists of a routing engine (table look-up), a switch scheduler, and a switch fabric.

• The routing engine looks up the packet address in a routing table and determines which output port to send the packet to.
– Packet is tagged with port number
– The switch uses the tag to send the packet to the proper output port

First Generation Switches

[Figure: line cards LC-1, LC-2, LC-3 with input and output buffers attached to a computer]

• Computer with multiple line cards
– CPU polls the line cards
– CPU processes the packets

• Simple, but performance is limited by processor speeds and bus speeds

• Examples: Ethernet bridges and low end routers

Second Generation switches

[Figure: computer and line cards (LC) on a shared bus]

• Most of the processing is now done in the line cards
– Route table look-up, etc.
– Line cards buffer the packets
– Line cards send packets to proper output port

• Advantages: CPU and main memory are no longer the bottleneck

• Disadvantage: Performance limited by bus speeds
– Bus BW must be N times LC speed (N ports)

• Example: CISCO 7500 series router

Third generation switches

[Figure: input line cards and output line cards connected through an N by N switch fabric, with a controller]

• Replace shared bus with a switch fabric
• Performance depends on the switch fabric, but potentially can alleviate the bus bottleneck

Switch Architectures

• Distributed buffer

• Output buffer

• Input buffer

Distributed buffer

• Modular Architecture

– Basic module is a 2x2 switch, which can be either in the through or crossed position
– Switch buffers: none, at input, or at output of each module

• Switch fabric consists of many 2x2 modules

[Figure: switch fabric with N inputs and N outputs]

Interconnection networks

• N inputs
• Log2(N) stages with N/2 modules per stage

Example: Omega (shuffle exchange) network

[Figure: 8 x 8 omega network with inputs 0 to 7 and outputs 0 (000), 1 (001), 2 (010), 3 (011), 4 (100), 5 (101), 6 (110), 7 (111)]

• Notice the order of inputs into a stage is a shuffle of the outputs from the previous stage: (0,4,1,5,2,6,3,7)

• Easily extended to more stages
• Any output can be reached from any input by proper switch settings
– Not all routes can be done simultaneously
– Exactly one route between each SD pair
– Self-routing network

Self Routing

• Use a tag: n-bit sequence with one bit per stage of the network
– E.g., Tag = b3b2b1

• Module at stage i looks at bit i of the tag (bi), and sends the packet up if bi=0 and down if bi=1

• In omega network, for destination port with binary address abc the tag is cba
– Example: output 100 => tag = 001
– Notice that regardless of input port, tag 001 will get you to output 100
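The self-routing rule can be checked with a small simulation. This is a minimal sketch, assuming the usual omega layout in which a perfect shuffle (a one-bit left rotation of the link number) precedes every stage of 2x2 modules; the function name `omega_route` is hypothetical and not part of the lecture.

```python
def omega_route(src: int, dst: int, n_bits: int) -> list[int]:
    """Return the link number occupied after each stage of an omega network.

    Assumes a perfect shuffle in front of every stage and 2x2 modules whose
    upper output is the even link (bit 0) and lower output the odd link (bit 1).
    """
    mask = (1 << n_bits) - 1
    pos = src
    trace = []
    for stage in range(1, n_bits + 1):
        # Perfect shuffle: rotate the link number left by one bit.
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & mask
        # Stage i reads tag bit b_i, which is the i-th most significant
        # bit of the destination address (the tag is the reversed address).
        bit = (dst >> (n_bits - stage)) & 1
        # 0 sends the packet to the upper output, 1 to the lower output.
        pos = (pos & ~1) | bit
        trace.append(pos)
    return trace


# Every input reaches output 100 (decimal 4) with the same tag.
for source in range(8):
    print(source, omega_route(source, 0b100, 3))
```

Each stage overwrites one bit of the link number with a destination bit, so after n stages the packet sits on the destination port regardless of where it entered.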

Baseline network

• Another example of a multi-stage interconnection network
• Built using the basic 2x2 switch module
• Recursive construction
– Construct an N by N switch using two N/2 by N/2 switches and a new stage of N/2 basic (2x2) modules
– N by N switch has Log2(N) stages, each with N/2 basic (2x2) modules

[Figure: recursive construction with N inputs, a stage of N/2 basic 2x2 modules, and two N/2 by N/2 switches; 4 x 4 switch example]

Contention

• Two packets may want to use the same link at the same time (same output port of a module)

• Hot spot effect

• Solution: Buffering

Throughput analysis of interconnection networks

• Assume no buffering at the switches

• If two packets want to use the same port one of them is dropped

• Suppose switch has m stages

• Packet transmit time = 1 slot (between stages)

• New packet arrival at the inputs, every slot – Saturation analysis (for maximum throughput) – Uniform destination distribution independent from packet to packet

Interconnection Throughput, continued

• Let P(m) be the probability that a packet is transmitted on a stage m link

[Figure: 2x2 module with input links A and B, each carrying a packet with probability P(m), and output link C, carrying a packet with probability P(m+1)]

• P(0) = 1
• P(m+1) = 1 – P(no packet on stage m+1 link (link C))
= 1 – P(neither input to stage m+1 chooses this output)

• Each input has a packet with probability P(m), and that packet will choose the link with probability 1/2. Hence,

$$P(m+1) = 1 - \left(1 - \tfrac{1}{2} P(m)\right)^2$$

• We can now solve for P(m) recursively
• For an m stage network, throughput (per output link) is P(m), which is the probability that there is a packet at the output
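The recursion is easy to evaluate numerically. The sketch below iterates it from P(0) = 1 under the same assumptions as the slide (no buffering, saturated inputs, uniform destinations); the function name `stage_throughput` is made up for illustration.

```python
def stage_throughput(stages: int) -> float:
    """Per-link throughput P(m) of an unbuffered m-stage network."""
    p = 1.0  # P(0) = 1: every input has a packet in every slot
    for _ in range(stages):
        # A link is busy unless neither module input chooses it.
        p = 1.0 - (1.0 - p / 2.0) ** 2
    return p


for m in (1, 2, 3, 5, 10, 20):
    print(m, round(stage_throughput(m), 4))
```

The values decrease with every added stage, which is the internal blocking penalty of the multi-stage design.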

Interconnection Throughput, continued

[Figure: throughput of the interconnection network versus number of stages (1 to 20)]

• Throughput can be significantly improved by adding buffers at the stages – Buffers increase delay – Tradeoff between delay and throughput

Advantages/Disadvantages of multi-stage architecture

• Advantages
– Modular
– Scalable
– Bus (links) only needs to be as fast as the line cards

• Disadvantages
– Delays for going through the stages (cut-through possible when buffers are empty)
– Decreased throughput due to internal blocking

• Alternatives: Buffers that are external to the switch fabric
– Output buffers
– Input buffers

Output buffer architecture

[Figure: N inputs connected through an interconnect fabric to the output buffers]

• As soon as a packet arrives, it is transferred to the appropriate output buffer

• Assume slotted system (cell switch)
• During each slot the switch fabric transfers one packet from each input (if available) to the appropriate output
– Must be able to transfer N packets per slot
– Bus speed must be N times the line rate
– No queueing at the inputs: buffer at most one packet at the input for one slot

Queueing Analysis

• If external arrivals to each input are Poisson (average rate λ), each output queue behaves as an M/D/1 queue
– Packet duration equals one slot: $\bar{X} = \overline{X^2} = 1$

• The average number of packets at each output is given by (M/G/1 formula):

$$N_Q = \frac{2\lambda - \lambda^2}{2(1 - \lambda)}$$

• Note that the only delay is due to the queueing at the outputs and none is due to the switch fabric
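To get a feel for how the output queue grows with load, the formula can be evaluated directly. This is a small sketch; `lam` stands for the per-slot arrival rate at one output, and the function name is hypothetical.

```python
def md1_mean_packets(lam: float) -> float:
    """Average number of packets at an output queue (M/D/1, one-slot service)."""
    if not 0.0 <= lam < 1.0:
        raise ValueError("arrival rate must satisfy 0 <= lam < 1")
    return (2.0 * lam - lam ** 2) / (2.0 * (1.0 - lam))


for lam in (0.1, 0.5, 0.9, 0.99):
    print(lam, round(md1_mean_packets(lam), 3))
```

The average stays below one packet at moderate loads and grows without bound as the rate approaches one packet per slot.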

Advantages/Disadvantages of Output buffer architecture

• Advantages: No delay or blocking inside switch
• Disadvantages:
– Bus speed must be N times line speed, which imposes a practical limit on the size and capacity of the switch

• Shared output buffers: output buffers are implemented in shared memory using a linked list
– Requires less memory (due to statistical multiplexing)
– Memory must be fast

Input buffer architecture

• Packets buffered at input rather than output – Switch fabric does not need to be as fast

Crossbar switch

[Figure: 4 x 4 crossbar with a scheduler; X marks a connected crosspoint]

• During each slot, the scheduler establishes the crossbar connections to transfer packets from the inputs to the outputs
– Maximum of one packet from each input
– Maximum of one packet to each output

• Head of line (HOL) blocking: when the packets at the head of two or more input queues are destined to the same output, only one can be transferred and the others are blocked

Throughput analysis of input queued switches

• HOL blocking limits throughput because some inputs (and consequently outputs) are kept idle during a slot even when they have other packets to send in their queue

• Consider an NxN switch and again assume that inputs are saturated (always have a packet to send)

• Uniform traffic => each packet is destined to each output with equal probability (1/N)

• Now, consider only those packets at the head of their queues (there are N of them!)

Throughput analysis, continued

• Let $Q^i_m$ be the number of HOL packets destined to node i at the end of the m-th slot

$$Q^i_m = \max\left(0,\; Q^i_{m-1} + A^i_m - 1\right)$$

• Where $A^i_m$ = number of new HOL messages addressed to node i that arrive to the HOL during slot m. Now,

$$P(A^i_m = l) = \binom{C_{m-1}}{l} (1/N)^l (1 - 1/N)^{C_{m-1} - l}$$

• Where $C_{m-1}$ = number of HOL messages that departed during slot m-1 = number of new HOL arrivals

• As N approaches infinity, $A^i_m$ becomes Poisson of rate C/N, where C is the average number of departures per slot

Throughput analysis, continued

• In steady-state, $Q^i$ behaves as an M/D/1 queue of rate λ and, as before,

$$\bar{Q}^i = \frac{2\lambda - \lambda^2}{2(1 - \lambda)}$$

• Notice however that the total number of packets addressed to the outputs is N (number of HOL packets). Hence,

$$\sum_{i=1}^{N} Q^i = N \;\Rightarrow\; \bar{Q}^i = \frac{2\lambda - \lambda^2}{2(1 - \lambda)} = 1$$

We can now solve, using the quadratic formula, to obtain:

$$\lambda = \text{utilization} = 2 - \sqrt{2} \approx 0.58$$
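The intermediate algebra is short enough to write out. The steps below are a sketch of the quadratic solution, writing the utilization as λ and keeping the root that is smaller than 1 (a utilization cannot exceed 1).

```latex
\begin{align*}
\frac{2\lambda - \lambda^2}{2(1 - \lambda)} = 1
  &\;\Rightarrow\; 2\lambda - \lambda^2 = 2 - 2\lambda \\
  &\;\Rightarrow\; \lambda^2 - 4\lambda + 2 = 0 \\
  &\;\Rightarrow\; \lambda = \frac{4 \pm \sqrt{16 - 8}}{2} = 2 \pm \sqrt{2} \\
  &\;\Rightarrow\; \lambda = 2 - \sqrt{2} \approx 0.586 \quad (\text{the root with } \lambda < 1)
\end{align*}
```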

Summary of input queued switches

• The maximum throughput of an input queued switch is limited by HOL blocking to 58% (for large N)
– Assuming uniform traffic and FCFS service

• Advantages of input queues:
– Simple
– Bus rate = line rate

• Disadvantages: Throughput limitation

Overcoming HOL blocking

• If inputs are allowed to transfer packets that are not at the head of their queues, throughput can be substantially improved (not FCFS)

Example: input 1 2 1

input 2 2 3

input 3 3 4

input 4 2 4

• How does the scheduler decide which input to transfer to which output?

Backlog matrix

[Figure: backlog matrix, with rows indexed by input and columns indexed by output]

• Each entry in the backlog matrix represents the number of packets in input i’s queue that are destined to output j

• During each slot the scheduler can transfer at most one packet from each input and at most one packet to each output
– The scheduler must choose one packet (at most) from each row and each column of the backlog matrix
– This can be done by solving a bi-partite graph matching problem
– The bi-partite graph consists of N nodes representing the inputs and N nodes representing the outputs

Bi-partite graph representation

• There is an edge in the graph from an input to an output if there is a packet in the backlog matrix to be transferred from that input to that output
– For previous backlog matrix, the bi-partite graph is:

[Figure: bi-partite graph corresponding to the backlog matrix]

• Definition: A matching is a set of edges such that no two edges share a node
– Finding a matching in the bi-partite graph is equivalent to finding a set of packets such that no two packets share a row or column in the backlog matrix

• Definition: A maximum matching is a matching with the maximum possible number of edges
– Finding a maximum matching is equivalent to finding the largest set of packets that can be transferred simultaneously

Maximum Matchings

• Algorithms for finding a maximum matching exist
• The best known algorithms take O(N^2.5) operations
– Too long for large N

• Alternatives
– Sub-optimal solutions
– Maximal matching: a matching that cannot be made any larger by adding edges, for a given backlog matrix
– For previous example:
(1-1,3-3) is maximal
(2-1,1-2,3-3) is maximum

• Fact: The number of edges in a maximal matching ≥ 1/2 the number of edges in a maximum matching
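The difference between a maximal and a maximum matching can be reproduced in a few lines. The sketch below is illustrative only: the backlog matrix is a hypothetical one chosen to contain the edges 1-1, 1-2, 2-1 and 3-3 from the example (the original matrix figure is not available), indices are 0-based, the maximal matching is built greedily, and the maximum matching uses simple augmenting paths rather than the fastest known algorithm.

```python
def greedy_maximal_matching(backlog):
    """Scan inputs in order and give each the first free output it has a packet for."""
    used_outputs = set()
    matching = []
    for i, row in enumerate(backlog):
        for j, queued in enumerate(row):
            if queued > 0 and j not in used_outputs:
                matching.append((i, j))
                used_outputs.add(j)
                break
    return matching


def maximum_matching(backlog):
    """Maximum bipartite matching found by repeated augmenting-path search."""
    input_of_output = [None] * len(backlog[0])

    def try_assign(i, visited):
        for j, queued in enumerate(backlog[i]):
            if queued > 0 and j not in visited:
                visited.add(j)
                # Take a free output, or re-route the input currently using it.
                if input_of_output[j] is None or try_assign(input_of_output[j], visited):
                    input_of_output[j] = i
                    return True
        return False

    for i in range(len(backlog)):
        try_assign(i, set())
    return sorted((i, j) for j, i in enumerate(input_of_output) if i is not None)


# Hypothetical backlog matrix (rows = inputs, columns = outputs).
backlog = [[1, 1, 0],
           [1, 0, 0],
           [0, 0, 1]]
print(greedy_maximal_matching(backlog))  # two edges, cannot be extended
print(maximum_matching(backlog))         # three edges
```

With 1-based labels, the greedy result corresponds to (1-1,3-3) and the augmenting-path result to (1-2,2-1,3-3).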

Achieving 100% throughput in an input queued switch

• Finding a maximum matching during each time slot does not eliminate the effects of HOL blocking
– Must look beyond one slot at a time in making scheduling decisions

• Definition: A weighted bi-partite graph is a bi-partite graph with costs (weights) associated with the edges

• Definition: A maximum weighted matching is a matching with the maximum total edge weight

• Theorem: A scheduler that chooses during each time slot the maximum weighted matching, where the weight of link (i,j) is equal to the length of queue (i,j), achieves full utilization (100% throughput)
– Proof: see “Achieving 100% throughput in an input queued switch” by N. McKeown, et al., IEEE Transactions on Communications, Aug. 1999.
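For a very small switch the maximum weighted matching can be found by brute force, which makes the definition concrete. The sketch below assumes a square matrix of queue lengths and simply tries every permutation of outputs; it is not the algorithm a real scheduler would use, and the function name and example matrix are hypothetical.

```python
from itertools import permutations


def max_weight_matching(queue_len):
    """Return the input/output pairs of a maximum weight matching.

    queue_len[i][j] is the length of the queue at input i for output j.
    Brute force over all N! assignments, so only suitable for tiny N.
    """
    n = len(queue_len)
    best = max(
        permutations(range(n)),
        key=lambda perm: sum(queue_len[i][perm[i]] for i in range(n)),
    )
    # Drop pairs with an empty queue: there is nothing to transfer on them.
    return [(i, best[i]) for i in range(n) if queue_len[i][best[i]] > 0]


# The long queue (0,0) wins, even though serving (0,1) and (1,0)
# would move two packets in this slot instead of one.
print(max_weight_matching([[9, 1], [1, 0]]))
```

The example shows that a maximum weighted matching can contain fewer edges than a maximum matching, because it favors the longest queues.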

Lecture 19

Broadcast routing

Eytan Modiano

Broadcast Routing

• Route a packet from a source to all nodes in the network

• Possible solutions:

– Flooding: Each node sends packet on all outgoing links; discard packets received a second time

– Spanning Tree Routing: Send packet along a tree that includes all of the nodes in the network

Graphs

• A graph G = (N,A) is a finite nonempty set of nodes N and a set of node pairs A called arcs (or links or edges)

[Figure: two example graphs]
Example 1: N = {1,2,3,4}, A = {(1,2),(2,3),(1,4),(2,4)}
Example 2: N = {1,2,3}, A = {(1,2)}

Walks and paths

• A walk is a sequence of nodes (n1, n2, ...,nk) in which each adjacent node pair is an arc.

• A path is a walk with no repeated nodes.

Walk (1,2,3,4,2) Path (1,2,3,4)

Cycles

• A cycle is a walk (n1, n2,...,nk) with n1 = nk, k>3, and with no repeated nodes except n1 = nk

Cycle (1,2,4,3,1)

Connected graph

• A graph is connected if a path exists between each pair of nodes.

Connected Unconnected

• An unconnected graph can be separated into two or more connected components.

Acyclic graphs and trees

• An acyclic graph is a graph with no cycles.

• A tree is an acyclic connected graph.

[Figure: two example graphs. Left: acyclic, unconnected (not a tree). Right: cyclic, connected (not a tree)]

• The number of arcs in a tree is always one less than the number of nodes

– Proof: start with arbitrary node and each time you add an arc you add a node => N nodes and N-1 links. If you add an arc without adding a node, the arc must go to a node already in the tree and hence form a cycle

Subgraphs

• G' = (N',A') is a subgraph of G = (N,A) if – 1) G' is a graph – 2) N' is a subset of N – 3) A' is a subset of A

• One obtains a subgraph by deleting nodes and arcs from a graph

– Note: arcs adjacent to a deleted node must also be deleted

[Figure: graph G and a subgraph G' of G]

Spanning trees

• T = (N',A') is a spanning tree of G = (N,A) if

– T is a subgraph of G with N' = N and T is a tree

[Figure: graph G and a spanning tree of G]

Spanning trees

• Spanning trees are useful for disseminating and collecting control information in networks; they are sometimes useful for routing

• To disseminate data from node n:
– Node n broadcasts data on all adjacent tree arcs
– Other nodes relay data on other adjacent tree arcs

• To collect data at node n:
– All leaves of tree (other than n) send data
– Other nodes (other than n) wait to receive data on all but one adjacent arc, and then send received plus local data on remaining arc

General construction of a spanning tree

• Algorithm to construct a spanning tree for a connected graph G = (N,A):

1) Select any node n in N; N' = {n}; A' = { }

2) If N' = N, then stop (T=(N',A') is a spanning tree)

3) Choose (i,j) ∈ A, i ∈ N', j ∉N'

N' := N'∪{j}; A' := A'∪{(i,j)}; go to step 2

• Connectedness of G assures that an arc can be chosen in step 3 as long as N’ ≠ N

• Is spanning tree unique?
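The three steps of the construction translate almost line for line into code. This is a minimal sketch, assuming an undirected graph given as a list of node pairs; the function name and the choice of starting node are arbitrary.

```python
def spanning_tree(nodes, arcs):
    """Grow a spanning tree of a connected undirected graph one arc at a time."""
    nodes = set(nodes)
    in_tree = {min(nodes)}      # step 1: select any node n; N' = {n}, A' = {}
    tree_arcs = []
    while in_tree != nodes:     # step 2: stop when N' = N
        for i, j in arcs:       # step 3: find an arc with exactly one end in N'
            if i in in_tree and j not in in_tree:
                in_tree.add(j)
                tree_arcs.append((i, j))
                break
            if j in in_tree and i not in in_tree:
                in_tree.add(i)
                tree_arcs.append((i, j))
                break
        else:
            raise ValueError("graph is not connected")
    return tree_arcs


print(spanning_tree({1, 2, 3, 4}, [(1, 2), (2, 3), (1, 4), (2, 4)]))
```

A different starting node or a different arc ordering can give a different tree, so the result is in general not unique, but every result has exactly one arc fewer than the number of nodes.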

Spanning tree algorithm

• The algorithm never forms a cycle, since each new arc goes to a new node.

• T = (N',A') is a tree at each step of the algorithm since T is always connected, and each time we add an arc we also add a node

• Theorem: If G is a connected graph of n nodes, then
1) G contains at least n-1 arcs
2) G contains a spanning tree
3) if G contains exactly n-1 arcs, G is a spanning tree

Distributed algorithms to find spanning trees

1) A fixed node sends a "start" message on each adjacent arc of the graph

2) Each other node marks the first arc on which a start message was received as a spanning tree arc and then sends a "start" message on each other arc

– This is a distributed implementation of the general spanning tree algorithm

– It has several problems shared by many such algorithms:
a) Who chooses the starting node?
b) When does the algorithm terminate?
c) The resulting tree is somewhat random

Min weight spanning tree

• Given a graph with weights assigned to each arc, find a spanning tree of minimum total weight (MST)

• Define a "fragment" to be a subtree of a MST

• Theorem:
– Given a fragment F of an MST, let a(i,j) be a minimum weight outgoing arc from F, where j is not in F.
– Then, F extended by arc a(i,j) and node j is a fragment.

• Proof:
– Let M be an MST that contains F but does not include a(i,j).
– Since a(i,j) is not part of M, adding a(i,j) to M must create a cycle. There must be some link b ≠ a in the cycle which is also outgoing from F.
– Deleting b and adding a creates a new spanning tree M'. Since the weight of b cannot be less than the weight of a, M' must be an MST.
– If weight of a = weight of b, then both M and M' are MSTs; otherwise M could not have been an MST.

MST algorithms

• Generic MST algorithm steps:
– Given a collection of subtrees of an MST (called fragments), add a minimum weight outgoing edge to some fragment

• Prim-Dijkstra: Start with an arbitrary single node as a fragment
– Add minimum weight outgoing edge

• Kruskal: Start with each node as a fragment
– Add the minimum weight outgoing edge, minimized over all fragments
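Both variants of the generic algorithm fit in a few lines each. The sketch below assumes a connected graph described by `(weight, i, j)` tuples; the example graph is hypothetical, and the implementations favor readability over speed (no heaps, no union by rank).

```python
def prim_mst(nodes, edges):
    """Prim-Dijkstra: grow one fragment from an arbitrary starting node."""
    in_fragment = {min(nodes)}
    tree = []
    while len(in_fragment) < len(nodes):
        # Minimum weight edge with exactly one endpoint inside the fragment.
        w, i, j = min(e for e in edges if (e[1] in in_fragment) != (e[2] in in_fragment))
        tree.append((w, i, j))
        in_fragment.update((i, j))
    return tree


def kruskal_mst(nodes, edges):
    """Kruskal: every node starts as a fragment; merge along the cheapest edges."""
    parent = {v: v for v in nodes}

    def find(v):
        # Follow parent pointers to the representative of v's fragment.
        while parent[v] != v:
            v = parent[v]
        return v

    tree = []
    for w, i, j in sorted(edges):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:        # the edge is outgoing from a fragment
            parent[root_i] = root_j
            tree.append((w, i, j))
    return tree


# Hypothetical example graph: (weight, node, node).
nodes = {1, 2, 3, 4}
edges = [(1, 1, 2), (3, 2, 3), (4, 1, 3), (2, 3, 4), (5, 2, 4)]
print(prim_mst(nodes, edges))
print(kruskal_mst(nodes, edges))
```

With distinct edge weights both functions return the same set of edges, only in a different order.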

Prim-Dijkstra Algorithm

[Figure: Prim-Dijkstra example showing the fragment after each step]

Kruskal Algorithm

[Figure: Kruskal example on a weighted graph]

MST from fragment

[Figure: weighted graph with the fragment and its min weight outgoing edge marked]

• Suppose the arcs of weight 1 and 3 are a fragment

– Consider any spanning tree using those arcs and the arc of weight 4, say, which is an outgoing arc from the fragment.

– Suppose that spanning tree does not use the arc of weight 2.

– Removing the arc of weight 4 and adding the arc of weight 2 yields another tree of smaller weight.

– Thus an outgoing arc of min weight from fragment must be in MST.

Lecture 20

Routing in Data Networks

Eytan Modiano

Routing

• Must choose routes for various origin destination pairs (O/D pairs) or for various sessions

– Datagram routing: route chosen on a packet by packet basis (using datagram routing is an easy way to split paths)

– Virtual circuit routing: route chosen on a session by session basis

– Static routing: route chosen in a prearranged way based on O/D pairs

Routing is a global problem

[Figure: network with two sessions, labeled 5 units and 5 units; all links have capacity 10 units]

Either session alone is best routed through the center path, but both cannot go through the center.

• Static routing is not desirable

[Figure: network with two sessions, labeled 10 units and 15 units; all links have capacity 10 units]

Both sessions must split their traffic between two paths.

• Datagram routing is a natural way to split the traffic – How?

Shortest Path routing

• Each link has a cost that reflects – The length of the link – Delay on the link – Congestion – $$ cost

• Cost may change with time

• The length of the route is the sum of the costs along the route

• The shortest path is the path with minimum length

• Shortest Path algorithms – Bellman-Ford: centralized and distributed versions – Dijkstra’s algorithm – Many others

Directed graphs (digraphs)

• A directed graph (digraph) G = (N,A) is a finite nonempty set of nodes N and a set of ordered node pairs A called directed arcs.

N = {1,2,3,4}

A = {(1,2), (2,1),(1,4),(4,2), (4,3),(3,2)}

• Directed walk: (4,2,1,4,3,2)

• Directed path: (4,2,1)

• Directed cycle: (4,2,1,4)

• Data networks are best represented with digraphs, although typically links tend to be bi-directional (cost may differ in each direction)

– For simplicity we will use bi-directional links of equal costs in our examples

Bellman Ford algorithm

• Finds the shortest paths from a given source node, say node 1, to all other nodes.

• General idea:
– First find the shortest single arc path,
– Then the shortest path of at most two arcs, etc.
– Let dij=∞ if (i,j) is not an arc.

• Let Di(h) be the shortest distance from 1 to i using at most h arcs.
– Di(1) = d1i for i≠1; D1(1) = 0
– Di(h+1) = min over j of [Dj(h) + dji] for i≠1; D1(h+1) = 0

• If all weights are positive, algorithm terminates in N-1 steps.
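The iteration can be written directly from the definition of Di(h). This is a sketch of the centralized version under the assumption that nodes are numbered 1..n and arc costs are stored in a dictionary keyed by `(j, i)`; missing keys play the role of dij = ∞. The example graph is made up.

```python
import math


def bellman_ford(n, d, source=1):
    """Shortest distances from `source` to every node 1..n.

    d maps (j, i) to the cost of the arc from j to i.
    """
    # D(1): single-arc distances from the source.
    D = {i: d.get((source, i), math.inf) for i in range(1, n + 1)}
    D[source] = 0
    for _ in range(n - 2):  # compute D(2), ..., D(n-1)
        D_next = {source: 0}
        for i in range(1, n + 1):
            if i != source:
                D_next[i] = min(
                    D[j] + d.get((j, i), math.inf)
                    for j in range(1, n + 1) if j != i
                )
        if D_next == D:     # no change: the distances have converged early
            break
        D = D_next
    return D


# Hypothetical example with bi-directional links of equal cost.
links = {(1, 2): 1, (2, 3): 1, (1, 3): 4, (3, 4): 1}
d = {**links, **{(j, i): c for (i, j), c in links.items()}}
print(bellman_ford(4, d))
```

Each pass extends the allowed path length by one arc, so at most n-1 passes are needed when all costs are positive.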

Bellman Ford - example

Distributed Bellman Ford

• Link costs may change over time
– Changes in traffic conditions
– Link failures
– Mobility

• Each node maintains its own routing table
– Need to update table regularly to reflect changes in network

• Let Di be the shortest distance from node i to the destination
– Di = min {j} [Dj + dij] : update equation

• Each node (i) regularly updates the values of Di using the update equation
– Each node maintains the values of dij to its neighbors, as well as values of Dj received from its neighbors
– Uses those to compute Di and sends the new value of Di to its neighbors
– If no changes occur in the network, algorithm will converge to shortest paths in no more than N steps

Slow reaction to link failures

• Example: the destination is node 1, with d31 = 1, d23 = d32 = 1, and d21 = 100

• Start with D3=1 and D2=100
– After one iteration node 2 receives D3=1 and D2 = min [1+1, 100] = 2

• In practice, link lengths occasionally change
– Suppose link between 3 and 1 fails (i.e., d31=infinity)
– Node 3 will update D3 = d32 + D2 = 3
– In the next step node 2 will update: D2 = d23+D3 = 4
– It will take nearly 100 iterations before node 2 converges on the correct route to node 1

• Possible solutions:
– Propagate route information as well
– Wait before rerouting along a path with increasing cost: the node next to the failed link should announce D=infinity for some time to prevent loops
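The slow convergence can be reproduced by replaying the updates. The sketch below assumes the three-node example implied by the numbers on the slide (destination node 1, d23 = d32 = 1, d21 = 100, failed link 3-1) and alternates the updates of node 3 and node 2; the function name is hypothetical.

```python
def reconvergence_rounds(d21=100, d23=1, D2=2, D3=1):
    """Count update rounds after link (3,1) fails, until nothing changes.

    One round = node 3 updates, then node 2 updates.
    Returns (rounds, final D2, final D3).
    """
    rounds = 0
    while True:
        new_D3 = d23 + D2                # node 3 can now only go through node 2
        new_D2 = min(d21, d23 + new_D3)  # node 2: direct link, or via node 3
        if (new_D2, new_D3) == (D2, D3):
            return rounds, D2, D3
        D2, D3 = new_D2, new_D3
        rounds += 1


print(reconvergence_rounds())
```

The two nodes keep raising each other's estimate by 2 per round until node 2 finally falls back to its direct link of cost 100, which takes 50 rounds, or about 100 individual updates.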

Instability

[Figure: ring network with a destination node; each node is labeled with its distance D]

Assume dij is equal to the flow on (i,j).

As routes change due to traffic conditions, they affect the loadings on the links; hence routes may oscillate.

Instability

• Having a bias independent of flow in the arc distances helps to prevent this problem.

• Asynchronous updates also help.

Assume dij is equal to the flow on (i,j). Note that D now has a counter clockwise shortest path.

[Figure: ring network with a destination node; each node is labeled with its distance D]

Dijkstra's algorithm

• Find the shortest path from a given source node to all other nodes
– Requires non-negative arc weights

• Algorithm works in stages:
– Stage k: the k closest nodes to the source have been found
– Stage k+1: Given k closest nodes to the source node, find k+1st.

• Key observation: the path to the k+1st closest node includes only nodes from among the k closest nodes

• Let M be the set of nodes already incorporated by the algorithm
– Start with M = {s} and Dn = dsn for all n (Dn = shortest path distance from node n to the source node s)
– Repeat until M=N:
Find node w∉M which has the next least cost distance to the source node
Add w to M
Update distances: Dn = min [Dn, Dw + dwn] (for all nodes n ∉M)

– Notice that the update of Dn need only be done for nodes not already in M and that the update only requires the computation of a new distance by going through the newly added node w.
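The stages map directly onto a loop over the set M. This is a minimal sketch that follows the description literally (a linear scan for the next closest node instead of a priority queue); the cost dictionary, function name, and example graph are assumptions made for illustration.

```python
import math


def dijkstra(nodes, d, source):
    """Shortest distances from `source`; d maps (i, j) to a non-negative arc cost."""
    D = {n: d.get((source, n), math.inf) for n in nodes}
    D[source] = 0
    M = {source}                      # nodes already incorporated
    while M != set(nodes):
        # Next closest node that is not yet in M.
        w = min((n for n in nodes if n not in M), key=lambda n: D[n])
        M.add(w)
        # Only paths through the newly added node w can improve a distance.
        for n in nodes:
            if n not in M:
                D[n] = min(D[n], D[w] + d.get((w, n), math.inf))
    return D


# Hypothetical example with bi-directional links of equal cost.
links = {(1, 2): 1, (2, 3): 1, (1, 3): 4, (3, 4): 1}
d = {**links, **{(j, i): c for (i, j), c in links.items()}}
print(dijkstra([1, 2, 3, 4], d, source=1))
```

A production implementation would normally keep the candidate nodes in a heap, but the linear scan keeps the correspondence with the stage description easy to see.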

Dijkstra example

Dijkstra’s algorithm implementation

• Centralized version: Single node gets topology information and computes the routes

– Routes can then be broadcast to the rest of the network

• Distributed version: each node i broadcasts {dij all j} to all nodes of the network; all nodes can then calculate shortest paths to each other node

– Open Shortest Path First (OSPF) protocol used in the internet

Routing in the Internet

• Autonomous systems (AS)
– Internet is divided into AS’s, each under the control of a single authority

• Routing protocols can be classified in two categories
– Interior protocols - operate within an AS
– Exterior protocols - operate between AS’s

• Interior protocols
– Typically use shortest path algorithms
Distance vector - based on distributed Bellman-Ford
Link state protocols - based on “distributed” Dijkstra’s

Distance vector protocols

• Based on distributed Bellman-Ford
– Nodes exchange routing table information with their neighbors

• Examples:
– Routing Information Protocol (RIP)
Metric used is hop-count (dij=1)
Routing information exchanged every 30 seconds

– Interior Gateway Routing Protocol (IGRP)
CISCO proprietary
Metric takes load into account: Dij ~ 1/(µ−λ) (estimate delay through link)
Update every 90 seconds
Multi-path routing capability

Link State Protocols

• Based on Dijkstra’s shortest path algorithm
– Avoids loops
– Routers monitor the state of their outgoing links
– Routers broadcast the state of their links within the AS
– Every node knows the status of all links and can calculate all routes using Dijkstra’s algorithm
Nonetheless, nodes only send the packet to the next node along the route, with the packet’s destination address. The next node will look up the address in the routing table

• Example: Open Shortest Path First (OSPF), commonly used in the Internet

• Link State protocols typically generate less “control” traffic than Distance-vector

Inter-Domain routing

• Used to route packets across different AS’s

• Options:

– Static routing - manually configured routes

– Distance-vector routing Exterior Gateway Protocol (EGP) Border Gateway Protocol (BGP)

• Issues
– What cost “metric” to use for Distance-Vector routing
Policy issues: Network provider A may not want B’s packets routed through its network, or two network providers may have an agreement
Cost issues: Network providers may charge each other for delivery of packets

Bridges, Routers and Gateways

• A Bridge is used to connect multiple LAN segments
– Layer 2 routing (Ethernet)
– Does not know IP address
– Varying levels of sophistication: simple bridges just forward packets; smart bridges start looking like routers

• A Router is used to route between different networks using network layer addresses
– Within or between Autonomous Systems
– Using same protocol (e.g., IP, ATM)

• A Gateway connects between networks using different protocols
– Protocol conversion
– Address resolution

• These definitions are often mixed and seem to evolve!

Bridges, routers and gateways

[Figure: a small company's Ethernet A and Ethernet B joined by a bridge; an IP router and a gateway connect to a service provider’s ATM backbone of ATM switches (routers); another gateway connects to another provider’s Frame Relay backbone]

Lecture 21

Optimal Routing

Eytan Modiano

Optimal Routing

• View routing as a “global” optimization problem

• Assumptions:
– The cost of using a link is a function of the flow on that link
– The total network cost is the sum of the link costs
– The required traffic rate between each source-destination pair is known in advance
– Traffic between source-destination pair can be split along multiple paths with infinite precision

• Find the paths (and associated traffic flows) along which to route all of the traffic such that the total cost is minimized

Formulation of optimal routing

• Let Dij(fij) be the cost function for using link (i,j) with flow fij
– fij is the total traffic flow along link (i,j)
– Dij() can represent delay or queue size along the link
– Assume Dij is a differentiable function

• Let D(F) be the total cost for the network with flow vector F

• Assume additive cost: D(F) = Sum(i,j) Dij(fij)

• For S-D pair w with total rate rw
– Pw is the set of paths between S and D
– xp is the rate sent along path p ∈ Pw

$$\sum_{p \in P_w} x_p = r_w \;\; \forall w \in W, \qquad f_{ij} = \sum_{\text{all } p \text{ containing } (i,j)} x_p$$

Formulation continued

• Optimal routing problem can now be written as:

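$$\min D(F) \quad \text{s.t.} \quad \sum_{p \in P_w} x_p = r_w, \;\; \forall w \in W$$

$$\Rightarrow \quad \min \sum_{(i,j)} D_{ij}\Big( \sum_{p \text{ contains } (i,j)} x_p \Big) \quad \text{s.t.} \quad \sum_{p \in P_w} x_p = r_w, \;\; \forall w \in W$$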

Optimal routing solution

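• Let $\partial D(X)/\partial x_p$ be the partial derivative of D with respect to $x_p$

• Then,

$$\frac{\partial D(X)}{\partial x_p} = \sum_{(i,j) \in p} D'_{ij}$$

– Where $D'_{ij}$ is the first derivative of Dij, evaluated at the total flow fij corresponding to X

• $\partial D(X)/\partial x_p$ is the sum of the first derivative lengths $D'_{ij}$ along path p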

Optimal routing solution continued

• Suppose now that X* = {x*p} is an optimal flow vector for some S-D pair w with paths Pw

• Any shift in traffic from any path p to some other path p’ cannot possibly decrease the total cost (since X* is assumed optimal)

• Define ∆ as the change in cost due to a shift of a small amount of traffic (δ) from some path p with x*p > 0 to another path p’

$$\Delta = \delta \frac{\partial D(X^*)}{\partial x_{p'}} - \delta \frac{\partial D(X^*)}{\partial x_p} \ge 0 \;\Rightarrow\; \frac{\partial D(X^*)}{\partial x_{p'}} \ge \frac{\partial D(X^*)}{\partial x_p}, \;\; \forall p' \in P_w$$

• Optimality conditions (necessary and sufficient):

– Optimal flows can only be positive on paths with minimum first derivative lengths

– All paths along which rw is split must have the same first derivative lengths
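The optimality conditions can be illustrated on the smallest possible case: one S-D pair and two parallel single-link paths. The sketch below is built on assumptions that are not in the lecture: each link has the M/M/1-style cost D(f) = f/(c − f), whose first derivative is c/(c − f)^2, the rate r is smaller than both capacities, and the split is found by bisection on the difference of first derivative lengths.

```python
def split_two_paths(r, c1, c2, tol=1e-9):
    """Rate x1 to send on path 1 (the rest, r - x1, goes on path 2).

    Assumed link cost: D(f) = f / (c - f), so D'(f) = c / (c - f)**2.
    Requires 0 < r < min(c1, c2).
    """
    if not 0 < r < min(c1, c2):
        raise ValueError("sketch assumes 0 < r < min(c1, c2)")

    def first_derivative_length(c, f):
        return c / (c - f) ** 2

    def gap(x1):
        # Positive when path 1 is "longer" than path 2 in first derivative length.
        return first_derivative_length(c1, x1) - first_derivative_length(c2, r - x1)

    if gap(0.0) >= 0.0:   # path 1 is never shorter: use path 2 only
        return 0.0
    if gap(r) <= 0.0:     # path 1 is shortest even when it carries everything
        return r
    lo, hi = 0.0, r       # otherwise equalize the two first derivative lengths
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


print(split_two_paths(5.0, 10.0, 10.0))  # equal links: even split
print(split_two_paths(1.0, 10.0, 2.0))   # light load: everything on the big link
```

In the second call the large link has the smaller first derivative length even when it carries all the traffic, so the optimal flow on the other path is zero, as the first optimality condition requires.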

Example

Example, continued



Lectures 22 & 23

Flow and congestion control

Eytan Modiano

Laboratory for Information and Decision Systems

FLOW CONTROL

• Flow control: end-to-end mechanism for regulating traffic between source and destination

• Congestion control: Mechanism used by the network to limit congestion

• The two are not really separable, and I will refer to both as flow control

• In either case, both amount to mechanisms for limiting the amount of traffic entering the network
– Sometimes the load is more than the network can handle

WITHOUT FLOW CONTROL

• When overload occurs
– Queues build up
– Packets are discarded
– Sources retransmit messages
– Congestion increases => instability

• Flow control prevents network instability by keeping packets waiting outside the network rather than in queues inside the network
– Avoids wasting network resources
– Prevents “disasters”

OBJECTIVES OF FLOW CONTROL

• Maximize network throughput

• Reduce network delays

• Maintain quality-of-service parameters – Fairness, delay, etc..

• Tradeoff between fairness, delay, throughput…

FAIRNESS

[Figure: four sessions; one long session crosses all three links, and three short sessions each use a single link]

• If link capacities are each 1 unit, then
– Maximum throughput is achieved by giving each short session one unit and zero units to the long session; total throughput of 3 units
– One concept of fairness would give each user 1/2 unit; total throughput of 2 units
– Alternatively, giving equal resources to each session would give single link users 3/4 each, and 1/4 unit to the long session

FAIRNESS

[Figure: session 1 and session 2 sharing the buffer at node B]

• Limited buffer at node B

• Clearly both sessions are limited to 1 unit of traffic

• Without flow control, session 1 can dominate the buffer at node B
– Since 10 session 1 packets arrive for each session 2 packet, 10/11 of the packets in the buffer will belong to session 1

DEADLOCKS FROM BUFFER OVERFLOWS

• If buffers at A fill up with traffic to B and vice versa, then neither node can accept traffic from the other, causing deadlock
– A cannot accept any traffic from B
– B cannot accept any traffic from A

• A can be full of B traffic, B of C traffic, and C of A traffic.

WINDOW FLOW CONTROL

[Figure: source S sends packets to destination D, which returns an ACK for each packet]

• Similar to window based ARQ
– End-to-end window for each session, Wsd
– Each packet is ACK’d by receiver
– Total number of un-ACK’d packets <= Wsd

⇒ Window size is an upper bound on the total number of packets and ACKs in the network

⇒ Limit on the amount of buffering needed inside network

END TO END WINDOWS

• Let x be expected packet transmission time, W be size of window, and d be the total round trip delay for a packet
– Ideally, flow control would only be active during times of congestion. Therefore, Wx should be large relative to the total round trip delay d in the absence of congestion
– If d <= Wx, flow control is inactive and session rate r = 1/x packets per second
– If d > Wx, flow control is active and session rate r = W/d packets per second

[Figure: timing diagrams for W=6. Left: flow control not active (d <= Wx). Right: flow control active (d > Wx)]

Behavior of end-end windows

[Figure: session rate r = min {1/x, W/d} packets/second versus round trip delay d; the rate is 1/x up to d = Wx and follows W/d beyond that]

• As d increases, flow control becomes active and limits the transmission rate

• As congestion is alleviated, d will decrease and r will go back up

• Flow control has the effect of stabilizing delays in the network
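The rate rule r = min {1/x, W/d} can be sketched in a few lines. This is only an illustration of the formula on the slide; the function name `window_rate` and the sample numbers are assumptions made for the example.

```python
def window_rate(x, W, d):
    """Session rate in packets/second under an end-to-end window.

    x: expected packet transmission time (seconds)
    W: window size (packets)
    d: total round trip delay (seconds)
    """
    if x <= 0 or W <= 0 or d <= 0:
        raise ValueError("x, W and d must be positive")
    # Flow control is inactive while d <= W*x (rate 1/x), active otherwise (rate W/d).
    return min(1.0 / x, W / d)


if __name__ == "__main__":
    x, W = 0.001, 6  # hypothetical values: 1 ms packets, window of 6
    for d in (0.004, 0.006, 0.012, 0.024):
        print(f"d = {d * 1000:.0f} ms -> r = {window_rate(x, W, d):.0f} packets/s")
```

With these sample values the rate stays at 1/x until d exceeds Wx = 6 ms, and then falls in proportion to 1/d.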

Choice of window size

• Without congestion, the window should be large enough to allow transmission at the full rate of 1/x packets per second

– Let d’ = the round-trip delay when there is no queueing
– Let N = number of nodes along the path
– Let Dp = the propagation delay along the path

⇒ d’ = 2Nx + 2Dp (delay for sending packet and ACK along N links)

⇒ Wx > d’ ⇒ W > 2N + 2Dp/x

• When Dp < x, W ~ 2N (window size is independent of propagation delay)

• When Dp >> Nx, W ~ 2Dp/x (window size is independent of path length)
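As a rough check of the two regimes, dividing d’ = 2Nx + 2Dp by x gives the number of packets the window must cover, 2N + 2Dp/x. The sketch below computes the smallest integer window that satisfies Wx >= d’; the function name and the sample values are assumptions for illustration.

```python
import math


def min_window(N, x, Dp):
    """Smallest integer W such that W*x >= d' = 2*N*x + 2*Dp.

    N: number of nodes (links) along the path
    x: packet transmission time
    Dp: one-way propagation delay along the path (same time unit as x)
    """
    if N < 0 or x <= 0 or Dp < 0:
        raise ValueError("need N >= 0, x > 0, Dp >= 0")
    # 2N packets cover the transmission times, 2*Dp/x cover the propagation delay.
    return 2 * N + math.ceil(2 * Dp / x)


if __name__ == "__main__":
    # Short propagation delay: the window is dominated by path length (about 2N).
    print(min_window(N=3, x=1.0, Dp=0.5))
    # Long propagation delay: the window is dominated by 2*Dp/x.
    print(min_window(N=3, x=1.0, Dp=100.0))
```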

Impact of congestion

• Without congestion d = d’ and flow control is inactive

• With congestion d > d’ and flow control becomes active

• Problem: When d’ is large (e.g., Dp is large) queueing delay is smaller than propagation delay and hence it becomes difficult to control congestion

– ⇒ Increased queueing delay has a small impact on d and hence a small impact on the rate r
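A small numerical sketch makes the problem concrete. It assumes the window is sized so that Wx = d’ (so any queueing delay activates flow control and r = W/d) and writes d = d’ + q, where q is the added queueing delay; both the decomposition and the function name are assumptions made for this illustration.

```python
def relative_rate_drop(d_prime, q):
    """Fraction by which r = W/d falls when queueing delay q is added to d'.

    With d = d' + q, the rate goes from W/d' to W/(d' + q),
    a relative reduction of q / (d' + q), independent of W.
    """
    if d_prime <= 0 or q < 0:
        raise ValueError("need d_prime > 0 and q >= 0")
    return q / (d_prime + q)


if __name__ == "__main__":
    q = 0.010  # hypothetical 10 ms of queueing delay
    for d_prime in (0.010, 0.100, 1.000):
        drop = relative_rate_drop(d_prime, q)
        print(f"d' = {d_prime * 1000:.0f} ms -> rate reduced by {drop:.1%}")
```

The same 10 ms of queueing halves the rate when d’ is 10 ms but barely changes it when d’ is 1 second, which is why the window reacts weakly to congestion on long-delay paths.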

PROBLEMS WITH WINDOWS

• Window size must change with congestion level

• Difficult to guarantee delays or data rate to a session

• For high speed sessions on high speed networks, windows must be very large
– E.g., for 1 Gbps cross country each window must exceed 60 Mb
– Window flow control becomes ineffective
– Large windows require a lot of buffering in the network

• Sessions on long paths with large windows are better treated than short path sessions. At a congestion point, a large window fills up the buffer and hogs service (unless round robin service is used)
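The 60 Mb figure is consistent with a delay-rate product. As a rough check, assuming a cross-country round trip delay of about 60 ms (an assumed value, not stated on the slide), the window measured in bits must cover the data in flight during one round trip:

```latex
W_{\text{bits}} \;\ge\; (\text{link rate}) \times d
  \;=\; 10^{9}\ \text{b/s} \times 0.06\ \text{s}
  \;=\; 6 \times 10^{7}\ \text{bits}
  \;=\; 60\ \text{Mb}
```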

NODE BY NODE WINDOWS

[Figure: nodes i-1, i, i+1, i+2 in tandem, with a window w on each link]

• Separate window (w) for each link along the session's path
– Buffer of size w at each node

• An ACK is returned on one link when a packet is released to the next link
– ⇒ Buffer will never overflow

• If one link becomes congested, packets remain in queue and ACKs don't go back on the previous link, which would in turn also become congested and stop sending ACKs (back pressure)

– Buffers will fill up at successive nodes
  Under congestion, packets are spread out evenly on the path rather than accumulated at the congestion point

• In high-speed networks this still requires large windows and hence large buffers at each node
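The back-pressure effect can be seen in a toy slotted simulation. This is a minimal sketch under simplifying assumptions that are not in the slides: each link moves at most one packet per slot, the source always has a packet to send, ACKs are instantaneous, and the link leaving the last node is completely blocked. The function name `simulate_backpressure` is hypothetical.

```python
def simulate_backpressure(num_nodes, w, steps):
    """Simulate a chain of nodes with a per-link window (buffer) of size w.

    Returns a list with the buffer occupancies after each slot.
    The link leaving the last node is blocked, modelling a congestion point.
    """
    buffers = [0] * num_nodes
    history = []
    for _ in range(steps):
        # Serve links from downstream to upstream. A packet moves from node k
        # to node k+1 only if node k+1 has a free slot (window not exhausted).
        for k in range(num_nodes - 2, -1, -1):
            if buffers[k] > 0 and buffers[k + 1] < w:
                buffers[k] -= 1
                buffers[k + 1] += 1
        # The source injects a packet only if the first node has room.
        if buffers[0] < w:
            buffers[0] += 1
        history.append(list(buffers))
    return history


if __name__ == "__main__":
    for slot, occupancy in enumerate(simulate_backpressure(3, 2, 8), start=1):
        print(slot, occupancy)
```

In this sketch the buffer next to the blocked link fills first, then the buffers upstream of it, and no buffer ever holds more than w packets.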

RATE BASED FLOW CONTROL

• Window flow control cannot guarantee rate or delay

• Requires large windows for high (delay * rate) links

• Rate control schemes provide a user a guaranteed rate and some limited ability to exceed that rate

– Strict implementation: for a rate of r packets per second allow exactly one packet every 1/r seconds

=> TDMA => inefficient for bursty traffic

– Less-strict implementation: allow W packets every W/r seconds
  Average rate remains the same but bursts of up to W packets are allowed

Typically implemented using a “leaky bucket” scheme

LEAKY BUCKET RATE CONTROL

[Figure: leaky bucket. Permits arrive at rate r (one each 1/r sec.), with storage for W permits; each incoming packet requires a permit to proceed]

• Session bucket holds W permits
– In order to enter the network, a packet must first get a permit
– Bucket gets new permits at a rate of one every 1/r seconds

• When the bucket is full, a burst of up to W packets can enter the network
– The parameter W specifies how bursty the source can be
  Small W ⇒ strict rate control
  Large W ⇒ allows for larger bursts

– r specifies the maximum long term rate

• An inactive session will earn permits so that it can burst later
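The permit mechanism can be sketched as a small slotted model. This is an illustrative sketch, not a reference implementation: time is divided into slots of length 1/r, one permit arrives per slot, and packets without a permit wait in a backlog. The class and function names are assumptions made for the example.

```python
class LeakyBucket:
    """Permit bucket of size W; one permit arrives per slot of length 1/r."""

    def __init__(self, W, permits=None):
        if W < 0:
            raise ValueError("W must be non-negative")
        self.W = W
        # By default the bucket starts full, as for a session that has been idle.
        self.permits = W if permits is None else permits

    def tick(self):
        """A permit arrives; it is discarded if the bucket is already full."""
        if self.permits < self.W:
            self.permits += 1

    def try_send(self):
        """Consume one permit if available; return True if the packet may enter."""
        if self.permits > 0:
            self.permits -= 1
            return True
        return False


def admitted_per_slot(bucket, arrivals):
    """arrivals[k] is the number of packets arriving in slot k.

    Returns the number of packets admitted to the network in each slot.
    """
    backlog = 0
    admitted = []
    for n in arrivals:
        bucket.tick()
        backlog += n
        sent = 0
        while backlog > 0 and bucket.try_send():
            backlog -= 1
            sent += 1
        admitted.append(sent)
    return admitted


if __name__ == "__main__":
    # A burst of 5 packets hits a full bucket with W = 3.
    print(admitted_per_slot(LeakyBucket(W=3), [5, 0, 0, 0, 0]))
```

In this run the first W packets of the burst enter immediately and the rest are released one per slot, which is the bursty-but-rate-limited behavior the slide describes.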

Leaky bucket flow control

• Leaky bucket is a traffic shaping mechanism

• Flow control schemes can adjust the values of W and r in response to congestion

– E.g., ATM networks use RM (resource management) cells to tell sources to adjust their rates based on congestion

QUEUEING ANALYSIS OF LEAKY BUCKET

• Slotted time system with a state change each 1/r seconds
– A permit arrives at the start of a slot and is discarded if the bucket is full
– Packets arrive according to a Poisson process of rate λ
– $a_i = \Pr(i \text{ arrivals}) = \dfrac{(\lambda/r)^i \, e^{-\lambda/r}}{i!}$

– P = number of packets waiting in the buffer for a permit
– B = number of permits in the buffer
– W = bucket size

• State of system: K = W + P - B
– State represents the “permit deficit” and is equal to the number of permits needed in order to refill the bucket
  State 0 ⇒ bucket full of permits
  State W ⇒ no permits in buffer
  State W + j ⇒ j packets waiting for a permit

QUEUEING ANALYSIS, continued

System Markov Chain:

[Figure: Markov chain on states 0, 1, 2, 3, 4, ...; state 0 has a self-loop with probability a0 + a1, and the other transitions are labeled with the arrival probabilities ai]

• Note that this is the same as M/D/1 with slotted service
– In steady-state the arrival rate of packets is equal to the arrival rate of permits (permits are discarded when the bucket is full, so permits don’t arrive in state “0” when no packets arrive)

⇒ λ = (1 − P(0) a0) r ⇒ P(0) = (r − λ)/(a0 r)

• Now from global balance equations:
– P(0) [1 − a0 − a1] = a0 P(1)
– P(1) = [(1 − a0 − a1)/a0] P(0) ⇒ can solve for P(1) in terms of P(0)
– P(1) [1 − a1] = a2 P(0) + a0 P(2) ⇒ obtain P(2) in terms of P(1)
– Recursively solve for all P(i)’s

• Average delay to obtain a permit:

$$T = \frac{1}{r} \sum_{j=W+1}^{\infty} (j - W)\, P(j)$$
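The recursion can be carried out numerically. The sketch below is one possible way to do it, under stated assumptions: the general balance equation P(i) = P(0) a_{i+1} + Σ_{j=1}^{i+1} P(j) a_{i−j+1} is my extension of the two balance equations shown, using the transition structure they imply, and the infinite chain is truncated at a hypothetical `n_states`. The function names are also assumptions.

```python
import math


def poisson_pmf(mean, n_terms):
    """a_i = mean**i * exp(-mean) / i! for i = 0 .. n_terms-1."""
    return [math.exp(-mean) * mean ** i / math.factorial(i) for i in range(n_terms)]


def leaky_bucket_distribution(lam, r, n_states=30):
    """Approximate steady-state probabilities P(0..n_states-1) of the permit deficit."""
    if not 0 < lam < r:
        raise ValueError("need 0 < lam < r for a steady state")
    a = poisson_pmf(lam / r, n_states + 1)
    P = [(r - lam) / (a[0] * r)]                 # P(0) from the rate-balance argument
    P.append(P[0] * (1 - a[0] - a[1]) / a[0])    # balance equation for state 0
    for i in range(1, n_states - 1):
        # Balance for state i, solved for P(i+1) (assumed general form, see lead-in).
        inflow = P[0] * a[i + 1] + sum(P[j] * a[i - j + 1] for j in range(1, i + 1))
        P.append((P[i] - inflow) / a[0])
    return P


def average_permit_delay(P, W, r):
    """T = (1/r) * sum over j > W of (j - W) * P(j), truncated to len(P) states."""
    return sum((j - W) * P[j] for j in range(W + 1, len(P))) / r


if __name__ == "__main__":
    P = leaky_bucket_distribution(lam=0.5, r=1.0)
    for W in (0, 1, 2, 4):
        print(f"W = {W}: T = {average_permit_delay(P, W, r=1.0):.5f}")
```

The forward recursion subtracts nearly equal quantities, so this sketch is only meant for moderate loads and short truncations; a larger bucket W should show a smaller average wait for a permit.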

Choosing a value for r

• How do we decide on the rate allocated to a session?

• Approaches

1. Optimal routing and flow control
   • Tradeoff between delay and throughput

2. Max-Min fairness
   • Fair allocation of resources

3. Contract based
   • Rate negotiated for a price (e.g., guaranteed rate, etc.)

Max-Min Fairness

• Treat all sessions as being equal

• Example:

[Figure: nodes A, B, C in tandem; links AB and BC each have capacity C = 1; sessions S0, S1, S2, S3]

• Sessions S0, S1, S2 share link AB and each gets a fair share of 1/3

• Sessions S3 and S0 share link BC, but since session S0 is limited to 1/3 by link AB, session S3 can be allocated a rate of 2/3

Max-min notion

• The basic idea behind max-min fairness is to allocate each session the maximum possible rate subject to the constraint that increasing one session’s rate should not come at the expense of another session whose allocated rate is not greater than that of the session whose rate is being increased

– I.e., if increasing a session’s rate comes at the expense of another session that already has a lower rate, don’t do it!

• Given a set of session requests P and an associated set of rates RP, RP is max-min fair if,

– For each session p, rp cannot be increased without decreasing rp’ for some session p’ for which rp’ <= rp

Max-Min fair definition

• Let rp be the allocated rate for session p, and consider a link a with capacity Ca

• The flow on link a is given by: $F_a = \sum_{p \text{ crossing link } a} r_p$

• A rate vector R is feasible if:
– rp >= 0 for all p in P (all session requests) and
– Fa <= Ca for all a in A (where A is the set of links)

• R is max-min fair if it is feasible and, for all p, if there exists a feasible $R^1$ such that $r_p < r^1_p$, then there exists a session p’ such that $r_{p'} > r^1_{p'}$ and $r_{p'} \le r_p$

• In other words, you can only increase the rate of a path by decreasing the rate of another path that has been allocated no more capacity

Bottleneck link

• Given a rate vector R, a link ‘a’ is a bottleneck link for session p if:
– Fa = Ca and rp >= rp’ for all sessions p’ crossing link ‘a’
– Notice that all other sessions must have some other bottleneck link, for otherwise their rate could be increased on link ‘a’

• Proposition: A feasible rate vector R is max-min fair if and only if each session has a bottleneck link with respect to R

• Example (C=1 for all links)

[Figure: example network with five sessions]
Session rates: S1 = 2/3, S2 = 1/3, S3 = 1/3, S4 = 1, S5 = 1/3
Bottleneck links: S1 <=> (3,5); S2, S3, S5 <=> (2,3); S4 <=> (4,5)
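The bottleneck condition can be checked mechanically for a given rate vector. The sketch below uses the two-link example from the Max-Min Fairness slide (S0 crossing AB and BC, S1 and S2 on AB, S3 on BC), since the topology of the five-session figure is not recoverable from the text. The data layout (dictionaries keyed by session and link name) and the function names are assumptions for illustration.

```python
from fractions import Fraction


def link_flows(routes, rates):
    """Fa = sum of rp over all sessions p crossing link a."""
    flows = {}
    for session, links in routes.items():
        for link in links:
            flows[link] = flows.get(link, 0) + rates[session]
    return flows


def bottleneck_links(routes, capacity, rates):
    """For each session, list the links that are bottlenecks for it."""
    flows = link_flows(routes, rates)
    result = {}
    for session, links in routes.items():
        result[session] = [
            link for link in links
            if flows[link] == capacity[link]            # link is saturated: Fa = Ca
            and all(rates[session] >= rates[other]      # no session on it has a higher rate
                    for other, other_links in routes.items() if link in other_links)
        ]
    return result


def is_max_min_fair(routes, capacity, rates):
    """Feasible, and every session has at least one bottleneck link."""
    flows = link_flows(routes, rates)
    feasible = (all(r >= 0 for r in rates.values())
                and all(flows[a] <= capacity[a] for a in flows))
    return feasible and all(bottleneck_links(routes, capacity, rates).values())


if __name__ == "__main__":
    routes = {"S0": ["AB", "BC"], "S1": ["AB"], "S2": ["AB"], "S3": ["BC"]}
    capacity = {"AB": 1, "BC": 1}
    third = Fraction(1, 3)
    rates = {"S0": third, "S1": third, "S2": third, "S3": 2 * third}
    print(bottleneck_links(routes, capacity, rates))
    print(is_max_min_fair(routes, capacity, rates))
```

Exact fractions are used so that the saturation test Fa = Ca is not affected by floating-point rounding.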

Max-Min fair algorithm

• Start all sessions with a zero rate

• Increment all session rates equally by some small amount δ
– Continue to increment until some link reaches capacity (Fa = Ca)
  All sessions sharing that link have equal rates
  That link is a bottleneck link with respect to those sessions
  Stop increasing rates for those sessions (that is their max-min allocation)
– Continue to increment the rate for all other sessions that have not yet arrived at a bottleneck link, until another bottleneck link is found
– Algorithm terminates when all sessions have a bottleneck link

• In practice sessions are not known in advance and computing rates in advance is not practical
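The filling procedure can be sketched as follows. This variant is not literally the δ-increment loop on the slide: instead of many small steps, it jumps directly to the point where the next link saturates, which gives the same allocation. The dictionary-based input format and the function name `max_min_fair` are assumptions for illustration, and every session is assumed to cross at least one link.

```python
from fractions import Fraction


def max_min_fair(routes, capacity):
    """Compute max-min fair rates.

    routes: dict mapping session name -> list of links it crosses
    capacity: dict mapping link name -> capacity Ca
    """
    if any(not links for links in routes.values()):
        raise ValueError("every session must cross at least one link")
    rates = {s: Fraction(0) for s in routes}
    active = set(routes)  # sessions that have not yet hit a bottleneck link

    def flow(link):
        return sum(rates[s] for s in routes if link in routes[s])

    while active:
        # Largest equal increment before some link carrying an active session saturates.
        steps = []
        for link, cap in capacity.items():
            n_active = sum(1 for s in active if link in routes[s])
            if n_active:
                steps.append((Fraction(cap) - flow(link)) / n_active)
        step = min(steps)
        for s in active:
            rates[s] += step
        # Sessions crossing a saturated link have reached their bottleneck.
        saturated = {link for link, cap in capacity.items() if flow(link) == cap}
        active = {s for s in active if not any(l in saturated for l in routes[s])}
    return rates


if __name__ == "__main__":
    routes = {"S0": ["AB", "BC"], "S1": ["AB"], "S2": ["AB"], "S3": ["BC"]}
    print(max_min_fair(routes, {"AB": 1, "BC": 1}))
```

On the two-link example from the Max-Min Fairness slide, link AB saturates first and freezes S0, S1, S2, after which S3 takes the remaining capacity of link BC.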

Generalized processor sharing (AKA fair queueing)

• Serve sessions in round-robin order
– If sessions always have a packet to send, they each get an equal share of the link
– If some sessions are idle, the remaining sessions share the capacity equally

• Processor sharing usually refers to a “fluid” model where session rates can be arbitrarily refined

• Generalized processor sharing is a packet-based approximation where packets are served from each session in round-robin order
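The round-robin service order can be sketched as below. This is a minimal illustration that assumes equal-length packets and a fixed set of per-session queues known up front; the function name and input format are hypothetical, and schedulers that handle variable packet lengths or session weights are more involved.

```python
from collections import deque


def round_robin(queues):
    """Serve one packet per non-empty session queue per round.

    queues: dict mapping session name -> list of packets (in arrival order)
    Returns the service order as a list of (session, packet) pairs.
    """
    pending = {session: deque(packets) for session, packets in queues.items()}
    order = []
    while any(pending.values()):
        for session, queue in pending.items():
            if queue:  # idle sessions are skipped, so the others share the link
                order.append((session, queue.popleft()))
    return order


if __name__ == "__main__":
    for session, packet in round_robin({"A": [1, 2, 3], "B": [1], "C": [1, 2]}):
        print(session, packet)
```

Once session B runs out of packets, sessions A and C alternate, which mirrors the statement that the remaining sessions share the capacity equally.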
