
Page 1: EE384x: Packet Switch Architectures

Winter 2006 EE384x 1

EE384x: Packet Switch Architectures

Handout 2: Queues and Arrival Processes, Output Queued Switches, and Output Link Scheduling

Nick McKeown
Professor of Electrical Engineering and Computer Science, Stanford University
[email protected]
http://www.stanford.edu/~nickm

Page 2: EE384x: Packet Switch Architectures

Outline

1. Output Queued Switches
2. Terminology: Queues and arrival processes
3. Output Link Scheduling

Page 3: EE384x: Packet Switch Architectures

Generic Router Architecture

[Figure: N line cards. Each performs header processing (IP address lookup against an address table, then a header update) followed by packet queueing into buffer memory; the buffer memory runs at N times the line rate.]

Page 4: EE384x: Packet Switch Architectures

Simple model of output queued switch

[Figure: A 4-port switch. Links 1–4 each have an ingress and an egress side, all operating at link rate R; each ingress can write arriving packets directly into any egress queue.]

Page 5: EE384x: Packet Switch Architectures


Characteristics of an output queued (OQ) switch

Arriving packets are immediately written into the output queue, without intermediate buffering.

The flow of packets to one output does not affect the flow to another output.

An OQ switch is work conserving: an output line is always busy when there is a packet in the switch for it.

An OQ switch has the highest throughput and the lowest average delay.

We will also see that the rates of individual flows, and the delays of packets, can be controlled.

Page 6: EE384x: Packet Switch Architectures

The shared memory switch

[Figure: N links, each with an ingress and an egress side at rate R, all writing to and reading from a single, physical memory device.]

Page 7: EE384x: Packet Switch Architectures

Characteristics of a shared memory switch

Assume a memory of size M bytes, and let Q_i(t) be the length of the queue for output i at time t.

Static queues: If Q_i(t) ≤ M/N for all i, then the switch operates the same as the basic output queued switch.

Dynamic queues: If queues can have any length, so long as Σ_i Q_i(t) ≤ M (summing over all N outputs), then the loss rate is lower.

Page 8: EE384x: Packet Switch Architectures

Memory bandwidth

Basic OQ switch: Consider an OQ switch with N different physical memories, and all links operating at rate R bits/s. In the worst case, packets may arrive continuously from all inputs, destined to just one output. The maximum memory bandwidth requirement for each memory is therefore (N+1)R bits/s: up to N writes plus one read in each packet time.

Shared memory switch: The maximum memory bandwidth requirement for the single memory is 2NR bits/s: up to N writes plus N reads in each packet time.
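The two worst-case requirements can be written out as a small sketch (Python used for illustration; the 16-port, 10 Gb/s example numbers are assumptions, not values from the slides):

```python
def oq_memory_bw_per_memory(n_ports: int, line_rate_bps: float) -> float:
    """Worst case for one output memory in a basic OQ switch:
    N simultaneous writes plus 1 read -> (N + 1) * R."""
    return (n_ports + 1) * line_rate_bps

def shared_memory_bw(n_ports: int, line_rate_bps: float) -> float:
    """A shared memory must absorb N writes and N reads -> 2 * N * R."""
    return 2 * n_ports * line_rate_bps

# Example: 16 ports at 10 Gb/s.
print(oq_memory_bw_per_memory(16, 10e9))  # 1.7e+11, i.e. 170 Gb/s per memory
print(shared_memory_bw(16, 10e9))         # 3.2e+11, i.e. 320 Gb/s total
```

Note the per-memory OQ requirement grows only linearly in N, but there are N such memories; the shared-memory design concentrates all of it in one device.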

Page 9: EE384x: Packet Switch Architectures

How fast can we make a centralized shared memory switch?

[Figure: N ports connected to a shared memory over a 200-byte-wide bus built from 5 ns SRAM.]

• 5 ns per memory operation
• Two memory operations per packet
• Therefore, up to 160 Gb/s
• In practice, closer to 80 Gb/s
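The 160 Gb/s figure follows directly from the bus width and the SRAM cycle time; a quick arithmetic check:

```python
bus_width_bits = 200 * 8   # 200-byte-wide memory bus
access_time_ns = 5         # 5 ns per SRAM operation
ops_per_packet = 2         # one write on arrival, one read on departure

raw_gbps = bus_width_bits / access_time_ns   # bits per ns = Gb/s, bus fully used
usable_gbps = raw_gbps / ops_per_packet      # each packet costs two accesses

print(raw_gbps)     # 320.0
print(usable_gbps)  # 160.0
```

The remaining gap down to ~80 Gb/s in practice comes from overheads the simple arithmetic ignores (e.g. partially filled bus words for short packets).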

Page 10: EE384x: Packet Switch Architectures

Outline

1. Output Queued Switches
2. Terminology: Queues and arrival processes
3. Output Link Scheduling

Page 11: EE384x: Packet Switch Architectures

Queue Terminology

Arrival process, A(t): In continuous time, usually the cumulative number of arrivals in [0,t]; in discrete time, usually an indicator of whether or not an arrival occurred at time t = nT. λ is the arrival rate: the expected number of arriving packets (or bits) per second.

Queue occupancy, Q(t): The number of packets (or bits) in the queue at time t.

Service discipline, S: Indicates the sequence of departures, e.g. FIFO/FCFS, LIFO, …

Service distribution: Indicates the time taken to process each packet, e.g. deterministic or exponentially distributed service time. μ is the service rate: the expected number of served packets (or bits) per second.

Departure process, D(t): In continuous time, usually the cumulative number of departures in [0,t]; in discrete time, usually an indicator of whether or not a departure occurred at time t = nT.

[Figure: A queue with arrival process A(t), occupancy Q(t), service discipline S, and departure process D(t).]

Page 12: EE384x: Packet Switch Architectures


More terminology

Customer: queueing theory usually refers to queued entities as “customers”. In class, customers will usually be packets or bits.

Work: each customer is assumed to bring some work which affects its service time. For example, packets may have different lengths, and their service time might be a function of their length.

Waiting time: time that a customer waits in the queue before beginning service.

Delay: time from when a customer arrives until it has departed.

Page 13: EE384x: Packet Switch Architectures

Arrival Processes

Examples of deterministic arrival processes:
• E.g. 1 arrival every second, or a burst of 4 packets every other second.
• A deterministic sequence may be designed to be adversarial, to expose some weakness of the system.

Examples of random arrival processes:
(Discrete time) Bernoulli i.i.d. arrival process:
• Let A(t) = 1 if an arrival occurs at time t, where t = nT, n = 0, 1, …
• A(t) = 1 w.p. p and 0 w.p. 1−p.
• A series of independent coin tosses with a p-coin.
(Continuous time) Poisson arrival process:
• Exponentially distributed interarrival times.
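The discrete-time Bernoulli process can be simulated directly (a minimal sketch; the function name and seed are illustrative):

```python
import random

def bernoulli_arrivals(p, n_slots, seed=1):
    """A(t) for t = 0 .. n_slots-1: 1 w.p. p, 0 w.p. 1-p, i.i.d. per slot."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n_slots)]

arrivals = bernoulli_arrivals(p=0.3, n_slots=100_000)
rate = sum(arrivals) / len(arrivals)
print(rate)  # close to p = 0.3: the long-run arrival rate equals p
```

This makes concrete the connection between the per-slot probability p and the arrival rate λ: for Bernoulli arrivals, λ = p packets per slot.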

Page 14: EE384x: Packet Switch Architectures

Adversarial Arrival Process: example for the "Knockout" switch

[Figure: An N-input switch whose memory write bandwidth is k·R < N·R.]

If our design goal was to not drop packets, then a simple discrete-time adversarial arrival process is one in which:
1. A1(t) = A2(t) = … = Ak+1(t) = 1, and
2. All packets are destined to output t mod N.

Page 15: EE384x: Packet Switch Architectures

Bernoulli arrival process

[Figure: An N-input switch with arrival processes A1(t), …, AN(t) and memory write bandwidth N·R.]

Assume Ai(t) = 1 w.p. p, else 0. Assume each arrival picks an output independently and uniformly at random.

Some simple results follow:
1. The probability that at time t a packet arrives to input i destined to output j is p/N.
2. The probability that two consecutive packets arrive to input i is the same as the probability that packets arrive to inputs i and j simultaneously: p².

Questions:
1. What is the probability that two arrivals occur at input i in any three time slots?
2. What is the probability that two arrivals occur for output j in any three time slots?
3. What is the probability that queue i holds k packets?
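The first result above (probability p/N that a slot carries a packet from input i to output j) can be checked by Monte Carlo simulation (an illustrative sketch; the function name, p = 0.5 and N = 4 are assumed example values):

```python
import random

def estimate_p_over_n(p, n_ports, trials, seed=0):
    """Estimate P(arrival at input i AND destined to output j) by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        arrived = rng.random() < p        # Ai(t) = 1 w.p. p
        output = rng.randrange(n_ports)   # uniform, independent output choice
        if arrived and output == 0:       # fix output j = 0
            hits += 1
    return hits / trials

est = estimate_p_over_n(p=0.5, n_ports=4, trials=200_000)
print(est)  # close to p/N = 0.5/4 = 0.125
```

The estimate converges to p/N because the arrival event and the output choice are independent, so their probabilities multiply.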

Page 16: EE384x: Packet Switch Architectures

Simple deterministic model

[Figure: Cumulative number of bits vs. time. A(t) is the cumulative number of bits that arrived up until time t; D(t) is the cumulative number of bits that departed up until time t; the queue is served at rate R, and Q(t) is the gap between the two curves.]

Properties of A(t), D(t):
• A(t) and D(t) are non-decreasing.
• A(t) ≥ D(t).

Page 17: EE384x: Packet Switch Architectures

Simple Deterministic Model

[Figure: Cumulative number of bits vs. time, with A(t) above D(t); Q(t) is the vertical distance between the curves and d(t) the horizontal distance.]

Queue occupancy: Q(t) = A(t) − D(t).

Queueing delay, d(t), is the time spent in the queue by a bit that arrived at time t (assuming that the queue is served FCFS/FIFO).
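Reading Q(t) and d(t) off the cumulative curves can be sketched in a few lines (function names and the toy trace are my own; the trace has 2 bits arriving per slot for 3 slots, drained at 1 bit per slot):

```python
def occupancy(A, D):
    """Q(t) = A(t) - D(t), the vertical gap between the cumulative curves."""
    return [a - d for a, d in zip(A, D)]

def fifo_delay(A, D, t):
    """Horizontal gap: under FCFS, a bit arriving at t departs at the first
    time u with D(u) >= A(t), since bits leave in arrival order."""
    target = A[t]
    for u in range(t, len(D)):
        if D[u] >= target:
            return u - t
    return None  # still queued at the end of the trace

A = [2, 4, 6, 6, 6, 6, 6]   # cumulative arrivals
D = [1, 2, 3, 4, 5, 6, 6]   # cumulative departures at 1 bit/slot
print(occupancy(A, D))      # [1, 2, 3, 2, 1, 0, 0]
print(fifo_delay(A, D, 2))  # 3: the bit arriving at t=2 waits 3 slots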

Page 18: EE384x: Packet Switch Architectures

Outline

1. Output Queued Switches
2. Terminology: Queues and arrival processes
3. Output Link Scheduling

Page 19: EE384x: Packet Switch Architectures

The problems caused by FIFO queues in routers

1. In order to maximize its chances of success, a source has an incentive to maximize the rate at which it transmits.
2. (Related to #1) When many flows pass through it, a FIFO queue is "unfair" – it favors the most greedy flow.
3. It is hard to control the delay of packets through a network of FIFO queues.

These are problems of fairness and delay guarantees.

Page 20: EE384x: Packet Switch Architectures

Fairness

[Figure: Flows from A and B (e.g. an http flow identified by (IP SA, IP DA, TCP SP, TCP DP)) enter router R1 over 10 Mb/s and 100 Mb/s links and share a 1.1 Mb/s output link to C; each flow is shown receiving 0.55 Mb/s.]

What is the "fair" allocation: (0.55 Mb/s, 0.55 Mb/s) or (0.1 Mb/s, 1 Mb/s)?

Page 21: EE384x: Packet Switch Architectures

Fairness

[Figure: As before, A and B enter router R1 over 10 Mb/s and 100 Mb/s links and share the 1.1 Mb/s output link, now alongside an additional 0.2 Mb/s flow from C; the output feeds D.]

What is the "fair" allocation?

Page 22: EE384x: Packet Switch Architectures

Max-Min Fairness: a common way to allocate flows

N flows share a link of rate C. Flow f wishes to send at rate W(f), and is allocated rate R(f).

1. Pick the flow, f, with the smallest requested rate.
2. If W(f) ≤ C/N, then set R(f) = W(f).
3. If W(f) > C/N, then set R(f) = C/N.
4. Set N = N − 1 and C = C − R(f).
5. If N > 0, go to step 1.
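The five steps above amount to progressive filling, which can be sketched directly (the function name and list-based interface are my own):

```python
def max_min_allocation(capacity, demands):
    """Max-min fair allocation per the slide's algorithm: repeatedly take
    the flow with the smallest request, give it min(W(f), C/N), then
    remove it from the pool and shrink C and N."""
    remaining = dict(enumerate(demands))
    alloc = [0.0] * len(demands)
    C, N = capacity, len(demands)
    while N > 0:
        f = min(remaining, key=remaining.get)  # smallest requested rate
        alloc[f] = min(remaining[f], C / N)    # steps 2 and 3 combined
        C -= alloc[f]                          # step 4
        N -= 1
        del remaining[f]
    return alloc

# C = 1 with demands 0.1, 0.5, 10, 5 (the example on the next slide).
print([round(r, 3) for r in max_min_allocation(1.0, [0.1, 0.5, 10, 5])])
# [0.1, 0.3, 0.3, 0.3]
```

Flows asking for less than their fair share keep what they asked for; everyone else splits the leftover capacity evenly.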

Page 23: EE384x: Packet Switch Architectures

Max-Min Fairness: an example

[Figure: Four flows with requested rates W(f1) = 0.1, W(f2) = 0.5, W(f3) = 10, and W(f4) = 5 share a link of capacity C = 1 through router R1.]

Round 1: Set R(f1) = 0.1
Round 2: Set R(f2) = 0.9/3 = 0.3
Round 3: Set R(f4) = 0.6/2 = 0.3
Round 4: Set R(f3) = 0.3/1 = 0.3

Page 24: EE384x: Packet Switch Architectures


Max-Min Fairness

How can an Internet router “allocate” different rates to different flows?

First, let’s see how a router can allocate the “same” rate to different flows…

Page 25: EE384x: Packet Switch Architectures

Fair Queueing

1. Packets belonging to a flow are placed in a FIFO. This is called "per-flow queueing".
2. FIFOs are scheduled one bit at a time, in a round-robin fashion.
3. This is called Bit-by-Bit Fair Queueing.

[Figure: Arriving packets are classified into per-flow FIFOs (Flow 1 … Flow N), which are served by a bit-by-bit round-robin scheduler.]

Page 26: EE384x: Packet Switch Architectures

Weighted Bit-by-Bit Fair Queueing

Likewise, flows can be allocated different rates by serving a different number of bits from each flow during each round.

[Figure: Four flows with allocated rates R(f1) = 0.1, R(f2) = 0.3, R(f3) = 0.3, and R(f4) = 0.3 share a link of capacity C through router R1.]

Order of service for the four queues: … f1, f2, f2, f2, f3, f3, f3, f4, f4, f4, f1, …

This is also called "Generalized Processor Sharing (GPS)".

Page 27: EE384x: Packet Switch Architectures

Packetized Weighted Fair Queueing (WFQ)

Problem: We need to serve a whole packet at a time.

Solution:
1. Determine the time at which a packet, p, would complete if we served the flows bit-by-bit. Call this the packet's finishing time, F.
2. Serve packets in order of increasing finishing time.

Theorem: Packet p will depart before F + TRANSP_max, where TRANSP_max is the maximum time needed to transmit one packet.

This is also called "Packetized Generalized Processor Sharing (PGPS)".

Page 28: EE384x: Packet Switch Architectures

Calculating F

Assume that at time t there are N(t) active (non-empty) queues.

Let R(t) be the number of rounds of a round-robin service discipline over the active queues in [0, t].

A P-bit packet entering service at time t0 will complete service in round R(t0) + P.

F_i = S_i + P_i, where S_i = max(F_{i−1}, R(t_i)) is the round in which the i-th packet starts service.

Page 29: EE384x: Packet Switch Architectures

An example of calculating F

[Figure: Flows 1 … N. For each arriving packet, calculate S_i and F_i and enqueue; the scheduler picks the packet with the smallest F_i and sends it.]

Case 1: If a packet arrives to a non-empty queue, then S_i = F_{i−1}.
Case 2: If a packet arrives at time t0 to an empty queue, then S_i = R(t0).
In both cases, F_i = S_i + P_i.

R(t) is monotonically increasing with t; therefore the departure order in R(t) is the same as in t.
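The two cases above can be sketched as a finish-tag computation (an illustrative simplification, not the slides' full algorithm: here the packet's arrival round is used directly as R(t0), rather than tracking the virtual round clock of the active queues):

```python
def wfq_departure_order(packets, weights=None):
    """packets: list of (arrival_round, flow, length_in_bits), FIFO per flow.
    Tags each packet with F_i = S_i + P_i / w, where
    S_i = max(F of the flow's previous packet, arrival round),
    then serves packets in order of increasing finish tag."""
    last_finish = {}
    tagged = []
    for arrival_round, flow, length in sorted(packets):  # FIFO within a flow
        w = (weights or {}).get(flow, 1)
        start = max(last_finish.get(flow, 0.0), arrival_round)
        finish = start + length / w
        last_finish[flow] = finish
        tagged.append((finish, flow, length))
    tagged.sort()
    return [flow for _, flow, _ in tagged]

# Equal-weight packets present at round 0, as on the worked-example slides:
# A1=4 bits, B1=3, C1=1, C2=1, D1=1, D2=2.
pkts = [(0, "A", 4), (0, "B", 3), (0, "C", 1), (0, "C", 1),
        (0, "D", 1), (0, "D", 2)]
print(wfq_departure_order(pkts))  # ['C', 'D', 'C', 'B', 'D', 'A']
```

The resulting order (C1, D1, C2, B1, D2, A1) matches the packet-by-packet WFQ departure order shown on the equal-weight example slides, with the R = 1 tie between C1 and D1 broken arbitrarily.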

Page 30: EE384x: Packet Switch Architectures

Understanding bit-by-bit WFQ: 4 queues sharing 4 bits/s of bandwidth, equal weights (1:1:1:1)

[Figure: Four queues A, B, C, D with packets A1 = 4 bits, B1 = 3, C1 = 1, C2 = 1, D1 = 1, D2 = 2 at time 0; A2 = 2 and C3 = 2 arrive later.]

Round 1: One bit each of A1, B1, C1, D1 is served. D1 and C1 depart at R = 1; A2 and C3 arrive.
Round 2: One bit each of A1, B1, C2, D2 is served. C2 departs at R = 2.

Page 31: EE384x: Packet Switch Architectures

Understanding bit-by-bit WFQ, continued (equal weights 1:1:1:1)

Round 3: One bit each of A1, B1, C3, D2 is served. D2 and B1 depart at R = 3.
Round 4: A1 departs at R = 4.
The last arrivals, C3 and A2, have all departed by R = 6.

Departure order for packet-by-packet WFQ: sort by the finish round of the packets: C1, D1, C2, B1, D2, A1, C3, A2 (ties served in either order).

Page 32: EE384x: Packet Switch Architectures

Understanding bit-by-bit WFQ: 4 queues sharing 4 bits/s of bandwidth, weights 3:2:2:1

[Figure: The same packets as before, with queues A, B, C, D given weights 3, 2, 2, 1: per round, A is served 3 bits, B and C 2 bits each, and D 1 bit.]

Round 1: 3 bits of A1, 2 bits of B1, C1 and C2 (1 bit each), and 1 bit of D1 are served. D1, C2 and C1 depart at R = 1.

Page 33: EE384x: Packet Switch Architectures

Understanding bit-by-bit WFQ, continued (weights 3:2:2:1)

Round 2: The rest of A1 and all of A2, the last bit of B1, both bits of C3, and one bit of D2 are served; B1, A2, A1 and C3 depart at R = 2.
Round 3: D2's last bit is served, and D2 departs.

Departure order for packet-by-packet WFQ: sort by the finish time of the packets: C1, C2, D1, A1, A2, B1, C3, D2.

Page 34: EE384x: Packet Switch Architectures

WFQ is complex

• There may be hundreds to millions of flows; the linecard needs to manage a FIFO per flow.
• The finishing time must be calculated for each arriving packet.
• Packets must be sorted by their departure time. Naively, with m packets queued, each insertion into the sorted order takes O(log m) time. In practice, this can be made O(log N) for N active flows.

[Figure: Egress linecard. Packets arriving to the egress linecard are placed into per-flow queues 1 … N; F_p is calculated for each packet, and the scheduler finds the smallest F_p to choose the departing packet.]

Page 35: EE384x: Packet Switch Architectures

Deficit Round Robin (DRR) [Shreedhar & Varghese, '95]: an O(1) approximation to WFQ

[Figure: Active packet queues served with a quantum size of 200. Step 1: every deficit counter starts at 0, and a queue's counter is incremented by the quantum when its turn comes. Steps 2, 3, 4: each queue sends head packets (e.g. of length 100 or 150) as long as they fit within its deficit counter; larger head packets (e.g. of length 400 or 600) wait, accumulating credit across rounds.]

• It appears that DRR emulates bit-by-bit FQ, with a larger "bit". So, if the quantum size is 1 bit, does it equal FQ? (No.)
• It is easy to implement Weighted DRR by using a different quantum size for each queue.
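The quantum-and-deficit mechanism can be sketched in a few lines (an illustrative sketch of the scheme, with my own function name and a quantum of 200 as in the slide's figure; not the paper's exact pseudocode):

```python
from collections import deque

def drr(queues, quantum):
    """Deficit Round Robin sketch: each non-empty queue's deficit grows by
    `quantum` on its turn; it sends head packets while the deficit covers
    them. queues: list of packet-length lists. Returns (queue_id, length)
    in transmission order."""
    qs = [deque(q) for q in queues]
    deficit = [0] * len(qs)
    sent = []
    while any(qs):
        for i, q in enumerate(qs):
            if not q:
                deficit[i] = 0          # idle queues keep no credit
                continue
            deficit[i] += quantum
            while q and q[0] <= deficit[i]:
                pkt = q.popleft()
                deficit[i] -= pkt       # spend credit on the sent packet
                sent.append((i, pkt))
    return sent

# With quantum 200, a 600-byte head packet needs three rounds of credit,
# so the second queue's 400-byte packet goes out first.
print(drr([[600, 100], [400]], quantum=200))
# [(1, 400), (0, 600), (0, 100)]
```

Each queue does O(1) work per round (provided the quantum is at least the maximum packet size, so a head packet is always sent within one visit once its turn comes), which is the source of DRR's O(1) claim versus WFQ's O(log N) sorting.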

Page 36: EE384x: Packet Switch Architectures

The problems caused by FIFO queues in routers (recap)

1. In order to maximize its chances of success, a source has an incentive to maximize the rate at which it transmits.
2. (Related to #1) When many flows pass through it, a FIFO queue is "unfair" – it favors the most greedy flow.
3. It is hard to control the delay of packets through a network of FIFO queues.

These are problems of fairness and delay guarantees.

Page 37: EE384x: Packet Switch Architectures

Deterministic analysis of a router queue

[Figure: Model of a router queue: arrivals A(t) enter a queue of occupancy Q(t) and leave as departures D(t). The cumulative-bytes plot shows A(t) and D(t) vs. time, with Q(t) the vertical gap between the curves and the FIFO delay d(t) the horizontal gap.]

Page 38: EE384x: Packet Switch Architectures


So how can we control the delay of packets?

Assume continuous time, bit-by-bit flows for a moment…

1. Let’s say we know the arrival process, Af(t), of flow f to a router.

2. Let’s say we know the rate, R(f) that is allocated to flow f.

3. Then, in the usual way, we can determine the delay of packets in f, and the buffer occupancy.

Page 39: EE384x: Packet Switch Architectures

[Figure: A WFQ scheduler. Arrivals A1(t) … AN(t) are classified into per-flow queues (Flow 1 … Flow N); flow fi is served at rate R(fi), producing departures Di(t). The cumulative-bytes plot shows A1(t) and D1(t), where D1(t) grows at slope R(f1).]

Key idea: In general, we don't know the arrival process. So let's constrain it.

Page 40: EE384x: Packet Switch Architectures

Let's say we can bound the arrival process

[Figure: Cumulative bytes vs. time: the arrival curve A1(t) stays below a line with intercept σ and slope ρ.]

The number of bytes that can arrive in any period of length t is bounded by σ + ρt. This is called "(σ,ρ) regulation".

Page 41: EE384x: Packet Switch Architectures

(σ,ρ) Constrained Arrivals and Minimum Service Rate

[Figure: Cumulative bytes vs. time: A1(t) is bounded by σ + ρt and D1(t) grows at slope R(f1); the maximum backlog is B_max and the maximum delay is d_max.]

For no packet loss, B ≥ σ. If R(f1) ≥ ρ, then d(t) ≤ σ / R(f1).

Theorem [Parekh, Gallager '93]: If flows are leaky-bucket constrained, and routers use WFQ, then end-to-end delay guarantees are possible.

Page 42: EE384x: Packet Switch Architectures

The leaky bucket "(σ,ρ)" regulator

[Figure: Tokens arrive at rate ρ into a token bucket of size σ. Packets pass through a packet buffer and are released at one byte (or packet) per token.]
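A token-bucket check can be sketched in a few lines (a minimal slot-based sketch used as a policer; the function name, the per-slot refill model, and the example burst are my own assumptions, not the slides'):

```python
def leaky_bucket_policer(arrivals, rho, sigma):
    """(sigma, rho) policing sketch: tokens accrue at rho per time slot,
    capped at bucket depth sigma; a packet conforms iff enough tokens
    are present when it arrives. arrivals: packet sizes per slot
    (0 = no packet). Returns True/False per packet."""
    tokens = sigma                          # bucket starts full
    verdicts = []
    for size in arrivals:
        tokens = min(sigma, tokens + rho)   # refill, capped at sigma
        if size and size <= tokens:
            tokens -= size                  # conforming: spend tokens
            verdicts.append(True)
        elif size:
            verdicts.append(False)          # non-conforming: drop or mark
    return verdicts

# sigma = 4, rho = 1: an initial burst of 4 conforms (the full bucket),
# an immediate second burst of 4 does not, and a later packet of 1 does.
print(leaky_bucket_policer([4, 4, 0, 1], rho=1, sigma=4))
# [True, False, True]
```

The same mechanism acts as a shaper rather than a policer if non-conforming packets are buffered until enough tokens accumulate, instead of being dropped or marked.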

Page 43: EE384x: Packet Switch Architectures

How the user/flow can conform to the (σ,ρ) regulation: the leaky bucket as a "shaper"

[Figure: A variable bit-rate source (e.g. compressed video) feeds a leaky-bucket shaper with token rate ρ and bucket size σ; the cumulative-bytes plots show the bursty input smoothed into a (σ,ρ)-conforming output at link rate C, into the network.]

Page 44: EE384x: Packet Switch Architectures

Checking up on the user/flow: the leaky bucket as a "policer"

[Figure: At a router, traffic arriving from the network is checked against a token bucket with token rate ρ and bucket size σ; the cumulative-bytes plots compare the arrivals with the (σ,ρ) envelope at link rate C.]

Page 45: EE384x: Packet Switch Architectures

QoS Router

[Figure: A QoS router. On each input, a classifier directs packets through per-flow policers into per-flow queues; a scheduler serves the per-flow queues on each output.]

Remember: these results assume that it is an OQ switch! Why? What happens if it is not?

Page 46: EE384x: Packet Switch Architectures

References

1. A. K. Parekh and R. G. Gallager, "A Generalized Processor Sharing Approach to Flow Control in Integrated Services Networks: The Single-Node Case," IEEE/ACM Transactions on Networking, June 1993.
2. M. Shreedhar and G. Varghese, "Efficient Fair Queueing using Deficit Round Robin," ACM SIGCOMM, 1995.