Scheduling Reserved Traffic in Input-Queued Switches: New Delay Bounds via Probabilistic Techniques. Milan Vojnović, EPFL. Joint work with Matthew Andrews, Bell Laboratories, Murray Hill, NJ. LCA Seminars Talk, EPFL, March 27, 2003

1

Scheduling Reserved Traffic

in Input-Queued Switches:

New Delay Bounds via Probabilistic Techniques

Milan Vojnović, EPFL

Joint work with Matthew Andrews Bell Laboratories, Murray Hill, NJ

LCA Seminars Talk, EPFL, March 27, 2003

2

Introduction: Input-Queued Switch

[Figure: I input ports (1, 2, 3, ..., I) connected to I output ports (1, 2, ..., I) through a crossbar]

At any point in time, connectivity restricted to permutation matrices

3

Some Existing Approaches for Crossbar Scheduling

• maximum-weight matching (McKeown ‘96, many others)

• decomposition-based scheduling (Chang et al, 2000)

• fluid-tracking (Tabatabaee et al, ToN ’01)

4

Decomposition-Based Scheduling

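Given: M = [m_ij], an I x I rate demand matrix

m_ij is the intensity of the service to be offered to the ij-th input/output port pair

Assume M is doubly sub-stochastic (every row sum and every column sum is at most 1)

Constraint: crossbar (in each slot the connectivity is a permutation matrix)

Find: a decomposition of M into permutation matrices, and a schedule of these matrices such that the intensity of the service offered to the ij-th input/output port pair is at least m_ij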

5

Decomposition-Based Sched. (cont’d)

Observation: a solution to the problem ensures that the long-run service rate is at least M

Desired property: the schedule should also be “smooth” (“non-bursty”), that is, transmission slots should be offered evenly to every input-output port pair

Observation: smoothness is a short-run property, whereas the rate guarantee is a long-run one

6

A Decomposition: Birkhoff/von Neumann

Birkhoff/von Neumann (e.g. Chvátal ‘84, p. 330): any doubly stochastic matrix M is a convex combination of permutation matrices, that is,

$M = \sum_{k=1}^{K} \phi_k M_k$

$M_k$ is a permutation matrix

$\phi_k$ is the intensity of the k-th permutation matrix ($\phi_k > 0$, $\sum_{k=1}^{K} \phi_k = 1$)

$K \le I^2 - 2I + 2$

Other decompositions can be used for a doubly sub-stochastic M

Birkhoff/von Neumann maximizes throughput

Birkhoff/von Neumann was applied to the switch problem by Chang et al (2000)
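
To make the decomposition concrete, here is a minimal sketch of a greedy Birkhoff/von Neumann decomposition. It is an illustration only, not the algorithm of Chang et al: the function name `bvn_decompose` is hypothetical, the matching step is a brute-force search over all permutations (usable only for very small I), and exact arithmetic with `Fraction` is an assumption made to avoid rounding issues.

```python
from fractions import Fraction
from itertools import permutations

def bvn_decompose(M):
    """Greedy decomposition of a doubly stochastic matrix M (hypothetical helper).

    Returns a list of (weight, perm) pairs, where perm[i] is the output
    port matched to input port i. Brute force: intended for tiny I only.
    """
    I = len(M)
    R = [list(row) for row in M]  # residual matrix, reduced step by step
    terms = []
    while any(x > 0 for row in R for x in row):
        # While the residual is nonzero (equal row and column sums), a
        # permutation supported on its positive entries exists.
        perm = next(p for p in permutations(range(I))
                    if all(R[i][p[i]] > 0 for i in range(I)))
        weight = min(R[i][perm[i]] for i in range(I))
        for i in range(I):
            R[i][perm[i]] -= weight  # zeroes at least one entry
        terms.append((weight, perm))
    return terms

half = Fraction(1, 2)
M = [[half, half, 0],
     [half, 0, half],
     [0, half, half]]
terms = bvn_decompose(M)
```

Each pass removes at least one positive entry of the residual, so the loop terminates; the weights play the role of the intensities of the permutation matrices.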

7

The Problem that We Study

Given: M1, M2, …, MK a sequence of permutation matrices

Find: schedules with a guarantee on their smoothness

“smooth” quantified through the concept of latency defined shortly

8

Why is the Problem Important

• Rate provision alone is not enough: delay-jitter guarantees are also needed, e.g. for diffserv classes such as EF (Expedited Forwarding), for MPLS guarantees, and for building a good Connection-Reservation-Table that offers guaranteed service to control traffic inside a switch

9

Related Work

When the load is not more than 1/4 (Giles and Hajek ‘97): a schedule exists such that each pair ij is scheduled at least once in every $1/\rho_{ij}$ slots, where $\rho_{ij}$ is the rate of pair ij

When the load is 1 (Chang et al ‘00): Birkhoff/von Neumann decomposition plus PGPS scheduling of the permutation matrices of the decomposition yields a latency bound (shown shortly)

10

Related Work (cont’d)

Mean-delay results:

• Leonardi et al (Infocom’01): a maximum-weight matching switch uniformly loaded with $\rho < 1$ has mean delay $E[W_{ij}]$ on the order of $I/(1-\rho)$

• Shah and Kopikare (Infocom’02): a switch with Bernoulli arrivals of load $\rho < 1$, whose scheduler picks in each slot a permutation matrix uniformly at random from the entire set of I! permutation matrices, has mean delay $E[W_{ij}]$ on the order of $(I-1)/(1-\rho)$

11

Content

• Method to Construct Schedules

• Latency definition used

• Latencies of 4 schedulers: Random-Permutation, Random-Phase, Random-Distortion, Poisson Competition

• Numerical Examples

• Tasting some of the Methods Used to Obtain Results

• Conclusion

12

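Method to Construct a Schedule: Superposition of Marked Point Processes

[Figure: K token point processes on a common time axis starting at 0: $N_1$ with intensity $\phi_1$, $N_2$ with intensity $\phi_2$, ..., $N_K$ with intensity $\phi_K$. Their superposition $N$ has intensity $\sum_{k=1}^{K} \phi_k$ and points $T_1, T_2, \ldots$ The schedule row numbers the superposed points 1, 2, ... in time order; the mark (type k) of the n-th point gives the permutation matrix used in slot n.]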
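
The superposition step can be sketched as follows. This is a minimal illustration under stated assumptions: the helper name `build_schedule` is hypothetical, token times are given as increasing lists per type, types are numbered from 0, and ties are broken by type index.

```python
import heapq
from itertools import islice

def build_schedule(token_times, num_slots):
    """Merge per-type token times into a slot schedule (hypothetical helper).

    token_times[k] is the increasing list of token times of type k.
    Entry n-1 of the result is the permutation-matrix type used in slot n.
    """
    streams = [[(t, k) for t in times] for k, times in enumerate(token_times)]
    merged = heapq.merge(*streams)  # superposition, ordered by time
    return [k for _, k in islice(merged, num_slots)]

# Three types with made-up token times; type 0 is the most intense.
schedule = build_schedule([[1.0, 3.0, 5.0], [2.0, 6.0], [4.0]], 6)
```

The schedulers in the rest of the talk differ only in how the token times are generated.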

13

Latency of a Schedule

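$N_{ij}[T_n, T_{n+m}) := \sum_{k \in S_{ij}} N_k[T_n, T_{n+m})$ is the number of slots offered to pair ij in $[T_n, T_{n+m})$; $\rho_{ij}$ denotes the rate of pair ij

Latency 1: for any n, m, there exists $E^1_{ij} \ge 0$ such that

$\{ N_{ij}[T_n, T_{n+m}) \ge \rho_{ij}(m - E^1_{ij}) \}$

Latency 2: for any n, there exists $E^2_{ij} \ge 0$ such that

$\{ \forall m \ge 0:\ N_{ij}[T_n, T_{n+m}) \ge \rho_{ij}(m - E^2_{ij}) \}$

Latency 3: there exists $E^3_{ij} \ge 0$ such that

$\{ \forall n \ge 0, m \ge 0:\ N_{ij}[T_n, T_{n+m}) \ge \rho_{ij}(m - E^3_{ij}) \}$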

14

Latency of a Schedule

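[Figure: the number of slots offered to the ij-th port pair in [0, m), $N_{ij}[T_0, T_m)$, plotted against m, together with the lower bound $\rho_{ij}(m - E^3_{ij})$, a line of slope $\rho_{ij}$ that leaves the m-axis at $m = E^3_{ij}$]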

15

It is Valuable to have an Input-Output Port Characterized with Rate-Latency

$b_{ij}(m) = \rho_{ij}(m - E^3_{ij})$

• It is a bound on the lateness of the slots offered to the ij-th port pair

• It is a strict (rate-latency) service curve

• Characterizing an input-output port pair with a service curve enables us to use known results from Network Calculus to bound backlog and delay for appropriately characterized arrival traffic
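
As an illustration of the kind of Network Calculus result meant here, the sketch below evaluates the standard bounds for a rate-latency service curve with rate rho and latency E. The arrival model is an assumption not stated on the slide: traffic constrained by a token bucket with burst sigma and rate r (arrival curve sigma + r t, with r not exceeding rho). Function and parameter names are hypothetical.

```python
def delay_bound(sigma, r, rho, E):
    """Delay bound E + sigma / rho for token-bucket arrivals (sigma, r)
    served with a rate-latency curve (rho, E); requires r <= rho."""
    if r > rho:
        raise ValueError("arrival rate exceeds the service rate")
    return E + sigma / rho

def backlog_bound(sigma, r, rho, E):
    """Backlog bound sigma + r * E under the same assumptions."""
    if r > rho:
        raise ValueError("arrival rate exceeds the service rate")
    return sigma + r * E

# Made-up numbers: burst of 2 cells, arrival rate 0.1, service rate 0.25,
# latency of 8 slots.
d = delay_bound(2.0, 0.1, 0.25, 8.0)
b = backlog_bound(2.0, 0.1, 0.25, 8.0)
```

Both bounds grow linearly in the latency, which is why a smaller latency per port pair translates directly into tighter delay and backlog guarantees.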

16

Scheduler by Chang et al

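[Figure: tokens arrive at a PGPS server; each served token is placed back as a new arrival]

Initialization: the token of type k arrives at $1/\phi_k$

$S_{ij}$: subset of permutation matrices with the ij element equal to 1

$\rho_{ij} = \sum_{k \in S_{ij}} \phi_k$

[Equation: upper bound on the latency $E^3_{ij}$ of this scheduler, a minimum of two terms in K, $|S_{ij}|$ and $\rho_{ij}$; the exact expression is not recoverable from the extracted text]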

17

Scheduler by Chang et al (cont’d)

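[Figure: token timelines starting at 0. Tokens of type 1 at $1/\phi_1, 2/\phi_1, 3/\phi_1, 4/\phi_1, 5/\phi_1$; tokens of type 2 at $1/\phi_2, 2/\phi_2, 3/\phi_2, 4/\phi_2$; tokens of type K at $1/\phi_K, 2/\phi_K$; the schedule row is obtained from these tokens]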

18

Scheduler by Chang et al (cont’d)

The bound of Chang et al is almost tight

One can construct an example that almost attains the bound, see the paper

19

Smooth per-permutation matrix may not mean smooth per input-output port

• An input-output port pair may be scheduled by more than one permutation matrix

• The aggregate of a subset of permutation matrices may not be smoothly scheduled, even though the schedule of the individual permutation matrices is smooth

• If each input-output port pair had a 1 in exactly one permutation matrix, the problem would be equivalent to classical polling

20

Random Permutation Scheduler

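[Figure: one schedule period of L slots. Permutation matrix k has intensity $\phi_k = l_k/L$ and receives $l_k$ tokens numbered $1, 2, \ldots, l_k$ (5 tokens for type 1 and 4 for type 2 in the drawing); the tokens are placed on [0, 1), and each later period is a copy from [0, 1)]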
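
A minimal sketch of such a schedule, assuming (as the scheduler's name suggests, but the extracted slide does not spell out) that the L tokens of one period are arranged in a uniformly random order and the period is then repeated. The function name and the zero-based type numbering are hypothetical.

```python
import random

def random_permutation_schedule(tokens_per_type, num_periods, rng):
    """Shuffle the tokens of one period and repeat it (hypothetical helper).

    tokens_per_type[k] is l_k, the number of tokens of type k in a
    period of L = sum(l_k) slots.
    """
    period = [k for k, l_k in enumerate(tokens_per_type) for _ in range(l_k)]
    rng.shuffle(period)          # uniformly random order within the period
    return period * num_periods  # periodic copy of the first period

l = [5, 4, 3]                    # made-up token counts, L = 12
schedule = random_permutation_schedule(l, 3, random.Random(0))
```

Every period contains exactly l_k slots of type k, so the long-run rate of each permutation matrix is exact; only the placement inside the period is random.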

21

Latency of Random Permutation Scheduler

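Result 1: Fix some $0 < \varepsilon < 1$. With probability $1 - \varepsilon$,

$E^3_{ij} \sim A \sqrt{\left(\frac{1}{\rho_{ij}} - 1\right) L}$, L large,

where A solves

$\varepsilon = 2 \sum_{k=1}^{\infty} (4k^2A^2 - 1)\, e^{-2k^2A^2}$

(for $E^2_{ij}$, the same estimate holds with $A = \sqrt{\tfrac{1}{2}\ln(1/\varepsilon)}$)

Worst case: latency $\sim (1 - \rho_{ij}) L$ [the accompanying expression in L! and $l_1, \ldots, l_K$ is not recoverable from the extracted text]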
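
The constant A in Result 1 has no closed form, but it is easy to compute numerically. The sketch below assumes the series has the form of the tail of the range of a Brownian bridge, $2\sum_{k \ge 1}(4k^2a^2 - 1)e^{-2k^2a^2}$, truncates it, and solves for A by bisection. The function names, the truncation at 50 terms and the bracket [0.5, 6] are assumptions made for illustration.

```python
import math

def range_tail(a, terms=50):
    """Truncated series 2 * sum_k (4 k^2 a^2 - 1) exp(-2 k^2 a^2)."""
    return 2.0 * sum((4 * k * k * a * a - 1.0) * math.exp(-2.0 * k * k * a * a)
                     for k in range(1, terms + 1))

def solve_A(eps, lo=0.5, hi=6.0, iters=100):
    """Bisection for range_tail(A) = eps; the tail decreases on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if range_tail(mid) > eps:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def latency3_estimate(eps, rho, L):
    """Large-L estimate A * sqrt((1/rho - 1) * L) of the latency."""
    return solve_A(eps) * math.sqrt((1.0 / rho - 1.0) * L)

A = solve_A(0.01)
```

The estimate grows like the square root of the period L, in contrast with the worst case, which grows linearly in L.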

22

Flavor of a Way to Obtain the Result

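$\{ Y \le \rho_{ij} E^3_{ij} \}$ (definition of the latency 3)

$Y := \max_{1 \le k \le 2L} X_k - \min_{1 \le k \le 2L} X_k$ (period-L)

$X_k := N_{ij}[T_0, T_k) - \rho_{ij} k$

$Y \approx_d \sqrt{L \rho_{ij}(1 - \rho_{ij})}\, W$ for large L

$W =_d \sup_{0 \le t \le 1} B^0(t) - \inf_{0 \le t \le 1} B^0(t)$, the range of a Brownian bridge $B^0$

$P(W > w) = 2 \sum_{k=1}^{\infty} (4k^2w^2 - 1)\, e^{-2k^2w^2}$ (known result)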

23

Variance of the offered slots with Random Permutation

$\mathrm{Var}\left[N_{ij}[T_n, T_{n+m})\right] = \frac{L^2}{L-1}\, \rho_{ij}(1 - \rho_{ij})\, \frac{m}{L}\left(1 - \frac{m}{L}\right)$
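
The variance expression can be checked against a direct computation. The sketch assumes that, under a uniformly random ordering of the L tokens of a period, the number of slots that a window of m consecutive slots (m at most L) offers to pair ij is hypergeometric: m draws without replacement from L tokens, of which l = rho L belong to pair ij. Function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def variance_formula(L, l, m):
    """L^2/(L-1) * rho (1 - rho) * (m/L) (1 - m/L), with rho = l / L."""
    rho = Fraction(l, L)
    frac = Fraction(m, L)
    return Fraction(L * L, L - 1) * rho * (1 - rho) * frac * (1 - frac)

def variance_exact(L, l, m):
    """Variance of a hypergeometric count computed from its pmf."""
    total = comb(L, m)
    support = range(max(0, m - (L - l)), min(l, m) + 1)
    pmf = {x: Fraction(comb(l, x) * comb(L - l, m - x), total) for x in support}
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** 2 * p for x, p in pmf.items())

v_formula = variance_formula(12, 3, 5)
v_exact = variance_exact(12, 3, 5)
```

The variance vanishes at m = 0 and m = L, as it must, since a full period always contains exactly l slots for the pair.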

24

Random-Phase Scheduler

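[Figure: token timelines starting at 0, with spacing $1/\phi_k$ between consecutive tokens of type k; the first token of type k is placed at $U_k/\phi_k$ (type 1 starts at $U_1/\phi_1$, type 2 at $U_2/\phi_2$, ..., type K at $U_K/\phi_K$); the schedule row is obtained from these tokens]

$U_1, U_2, \ldots, U_K$ i.i.d. uniform(0, 1)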
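
A minimal sketch of the random-phase token placement, assuming the tokens of type k sit at times $(n + U_k)/\phi_k$ for n = 0, 1, 2, ..., and that the schedule is read off the merged tokens in time order. The function name, the finite horizon and the zero-based type numbering are assumptions for illustration.

```python
import random

def random_phase_tokens(phi, horizon, rng):
    """Tokens of type k at (n + U_k) / phi[k], n = 0, 1, ..., before horizon.

    Returns a time-ordered list of (time, type) pairs (hypothetical helper).
    """
    tokens = []
    for k, phi_k in enumerate(phi):
        u = rng.random()  # one uniform(0, 1) phase per permutation matrix
        n = 0
        while (n + u) / phi_k < horizon:
            tokens.append(((n + u) / phi_k, k))
            n += 1
    tokens.sort()
    return tokens

phi = [0.5, 0.25, 0.25]  # made-up intensities summing to 1
tokens = random_phase_tokens(phi, 16, random.Random(1))
schedule = [k for _, k in tokens]  # slot n uses the type of the n-th token
```

Each matrix is scheduled perfectly periodically on its own; the randomness only shifts the K periodic streams relative to one another.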

25

Random-Phase Scheduler (cont’d)

Result 2: Assume the intensity of each permutation matrix is an integer multiple of 1/L. With probability 1,

$E^3_{ij} \le \frac{|S_{ij}|}{\rho_{ij}} + 2\sqrt{\frac{K}{2}\ln(2L+1)}$

26

Random-Distortion Scheduler

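[Figure: token timelines on a grid of spacing $1/\phi_k$ for type k, starting at 0, where every token is displaced independently: the tokens of type 1 sit at offsets $U_{1,1}/\phi_1, U_{1,2}/\phi_1, \ldots$ within successive grid intervals, and likewise $U_{2,1}/\phi_2, U_{2,2}/\phi_2, \ldots$ for type 2 and $U_{K,1}/\phi_K, U_{K,2}/\phi_K, \ldots$ for type K; the schedule row is obtained from these tokens]

$U_{1,i}, U_{2,i}, \ldots, U_{K,i}$, $i = 1, 2, \ldots$, i.i.d. uniform(0, 1)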

27

Random-Distortion Scheduler

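Result 3: Assume the intensity of each permutation matrix is an integer multiple of 1/L. With probability 1, $E^3_{ij}$ is bounded as follows:

[Equation: a bound in $|S_{ij}|$, K, $\rho_{ij}$ and $\ln D$, where D is a constant determined by I and $\min_k \phi_k$; the exact expressions for the bound and for D are not recoverable from the extracted text]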

28

Poisson-Competition Scheduler

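$N_k \sim \mathrm{Poisson}(\phi_k)$

Amounts to: at a slot, the permutation matrix is of type k with probability $\phi_k$ (Bernoulli($\phi_k$))

For latency 2: the maximum over $m \ge 1$ of the service deficit of pair ij on $[T_n, T_{n+m})$ is compared with $\rho_{ij} E^2_{ij}$; this maximum is the waiting time of a Geo/D/1 queue (known), handled with a Brownian approximation

[Equation: the resulting estimate of $E^2_{ij}$ in terms of $\rho_{ij}$ and a logarithmic factor; the exact expression is not recoverable from the extracted text]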

29

Numerical Evaluations

Goal: evaluate latencies over a large set of service rate matrices (matrix M defined earlier)

Algorithm to generate stochastic matrices

Begin (k=0): set the I x I matrix M such that m_ij = 1/L for all ij

Step (k), k=1,…,k0:

• draw i1, j1, i2, j2 uniformly at random on 1,2,…,I

• draw d uniformly at random on [0, min(m_i1j1, m_i2j2)]

• m_i1j1 <- m_i1j1 - d, m_i2j2 <- m_i2j2 - d, m_i1j2 <- m_i1j2 + d, m_i2j1 <- m_i2j1 + d

The evolution of M is a Markov chain

One may perhaps prefer to generate M uniformly at random over the space of doubly stochastic matrices
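
The generation steps can be sketched as below. One loudly flagged assumption: the sketch starts from the uniform matrix with entries 1/I, so that every row and column sums to 1, whereas the slide writes 1/L for the initial entries. The function name, zero-based indices and floating-point arithmetic are also choices made for this illustration.

```python
import random

def random_rate_matrix(I, steps, rng):
    """Random walk on rate matrices that preserves row and column sums.

    Assumption: starts from entries 1/I (the slide states 1/L).
    """
    M = [[1.0 / I] * I for _ in range(I)]
    for _ in range(steps):
        i1, j1, i2, j2 = (rng.randrange(I) for _ in range(4))
        d = rng.uniform(0.0, min(M[i1][j1], M[i2][j2]))
        # Move mass d off (i1, j1) and (i2, j2) onto (i1, j2) and (i2, j1):
        # each affected row and column loses d and gains d.
        M[i1][j1] -= d
        M[i2][j2] -= d
        M[i1][j2] += d
        M[i2][j1] += d
    return M

M = random_rate_matrix(8, 1000, random.Random(0))
```

Because d never exceeds the two entries it is taken from, the entries stay non-negative, so the matrix remains doubly stochastic at every step.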

30

Numerical Evaluations: varying switch size

[Figure: $\max_{ij} \rho_{ij} E^3_{ij}$ plotted against the switch size I]

Ob.: except for small switch sizes,

• the random-phase bound is tighter than PGPS

• the random-distortion bound is the tightest

31

Numerical Evaluations: per port-pair latencies for a 64x64 matrix

[Figure: fraction of pairs ij such that $\rho_{ij} E^3_{ij} \le x$, plotted against x; L = 4096, K = 2423]

Ob.:

• the fraction is larger for the random-phase than for PGPS

• for large enough x, the fraction is largest for the random-distortion

32

Numerical Evaluation for Random Permutation Scheduler

[Figure: $E^3_{ij}$ and $E^2_{ij}$ plotted against L, for $\varepsilon = 0.01$]

33

Excerpts from the Analysis

34

Preliminaries

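“Good” event: $G_{n,m} := \{ N_{ij}[T_n, T_{n+m}) \ge \rho_{ij}(m - E_{ij}) \}$

Assume: $\alpha_1, \alpha_2 \in \mathbb{N}$, $\alpha_3, \alpha_4 \in \mathbb{R}$

$t := n + m - \alpha_1$

$s := n + \alpha_2$

$E_{ij} := \alpha_1 + \alpha_2 + \frac{\alpha_3 + \alpha_4}{\rho_{ij}}$

Result 1: $N_{ij}[s, t) \ge \rho_{ij}(t - s) - (\alpha_3 + \alpha_4)$ & $[s, t) \subseteq [T_n, T_{n+m})$ $\Rightarrow$ $N_{ij}[T_n, T_{n+m}) \ge \rho_{ij}(m - E_{ij})$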

35

Preliminaries Cont’d

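Result 2: with $t = n + m - \alpha_1$ and $s = n + \alpha_2$, $N[0, t) \le t + \alpha_1$ & $N[0, s) \ge s - \alpha_2$ $\Rightarrow$ $[s, t) \subseteq [T_n, T_{n+m})$

Putting the Pieces Together:

$\{ N[0, t) \le t + \alpha_1 \} \cap \{ N[0, s) \ge s - \alpha_2 \} \cap \{ N_{ij}[s, t) \ge \rho_{ij}(t - s) - (\alpha_3 + \alpha_4) \} \subseteq G_{n,m}$

$G_{n,m}$ is implied by events that are easier to handle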

36

Random-phase Scheduler

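Scheduler def: $N_k[0, t) = \lfloor \phi_k t \rfloor + X_{t,k}$, with $X_{t,k} = 1_{\{U_k \le \phi_k t - \lfloor \phi_k t \rfloor\}}$ and $U_k \sim \mathrm{Unif}(0, 1)$

$N_k[s, t) \ge \phi_k(t - s) - 1$, all $s \le t$

$N_{ij}[s, t) \ge \rho_{ij}(t - s) - |S_{ij}|$, all $s \le t$

Assume $\alpha_3 = \alpha_4 = \frac{1}{2}|S_{ij}|$

Then

$\{ N[0, t) \le t + \alpha_1 \} \cap \{ N[0, s) \ge s - \alpha_2 \} \subseteq G_{n,m}$

It remains only to handle two events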

37

Random-phase Scheduler (cont’d)

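Note $N[0, t) - t = \sum_{k=1}^{K} (X_{t,k} - \mu_{t,k})$, where $\mu_{t,k} := E[X_{t,k}]$

Hoeffding: $P(N[0, t) - t > \alpha_1) \le e^{-2\alpha_1^2/K}$

Similarly: $P(N[0, s) - s < -\alpha_2) \le e^{-2\alpha_2^2/K}$

Finally,

$P\left(\bigcap_{ij} \bigcap_{n,m} G_{n,m}\right) \ge 1 - \sum_{t=1}^{L} P(N[0, t) - t > \alpha_1) - \sum_{s=1}^{L} P(N[0, s) - s < -\alpha_2) \ge 1 - \frac{2L}{2L+1} > 0$

(the sums run up to L, by periodicity)

with $\alpha_1 = \alpha_2 := \sqrt{\frac{K}{2}\ln(2L+1)}$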
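
A quick numerical check of this choice of constants, assuming the Hoeffding bound has the form $e^{-2\alpha^2/K}$ for each of the 2L events and that both constants equal $\sqrt{(K/2)\ln(2L+1)}$. The function names are hypothetical; the values K = 2423 and L = 4096 are the ones quoted for the 64x64 numerical example.

```python
import math

def alpha(K, L):
    """alpha = sqrt((K / 2) * ln(2L + 1))."""
    return math.sqrt(0.5 * K * math.log(2 * L + 1))

def union_bound(K, L, a):
    """Union bound over 2L events, each of probability at most exp(-2 a^2 / K)."""
    return 2 * L * math.exp(-2.0 * a * a / K)

K, L = 2423, 4096
a = alpha(K, L)
bound = union_bound(K, L, a)  # should equal 2L / (2L + 1), just below 1
```

Any larger constant would also work; this one is the smallest for which the union bound stays strictly below 1, which is all the existence argument needs.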

38

Random-phase Scheduler: DERANDOMIZATION

Method of conditional probabilities

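Assume

$A_1, A_2, \ldots, A_{n_1}$ a sequence of events

$Y_1, Y_2, \ldots, Y_{n_2}$ a sequence of r.v.-s

$X_{i,1}, X_{i,2}, \ldots, X_{i,n_2}$, $i = 1, 2, \ldots, n_1$, an array of r.v.-s

$P(A_i \mid Y_1 = z_1, \ldots, Y_m = z_m) \le E\left[\prod_{k=1}^{n_2} f_{ik}(X_{i,k}) \,\middle|\, Y_1 = z_1, \ldots, Y_m = z_m\right]$

for any $z_1, z_2, \ldots, z_m$ and some functions $x \mapsto f_{ik}(x)$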

39

Random-phase Scheduler: DERANDOMIZATION (cont’d)

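Result: there exist $y_1, y_2, \ldots, y_{n_2}$ such that

$\sum_{i=1}^{n_1} P(A_i \mid Y_1 = y_1, \ldots, Y_{n_2} = y_{n_2}) \le \sum_{i=1}^{n_1} E\left[\prod_{k=1}^{n_2} f_{ik}(X_{i,k})\right]$

In addition, if

• $\sum_{i=1}^{n_1} E\left[\prod_{k=1}^{n_2} f_{ik}(X_{i,k})\right] < 1$

• $A_i$ is completely determined by $Y_1, \ldots, Y_{n_2}$

• $Y_1, \ldots, Y_{n_2}$ are mutually independent

• $X_{i,k} = \xi_{ik}(Y_k)$, for some $x \mapsto \xi_{ik}(x)$

then $\sum_{i=1}^{n_1} P(A_i \mid Y_1 = y_1, \ldots, Y_{n_2} = y_{n_2}) < 1$; each of these conditional probabilities is 0 or 1, so none of the events $A_i$ occurs when $Y_k = y_k$ for all k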

40

Random-phase Scheduler: DERANDOMIZATION (cont’d)

Application to our problem

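$Y_k = U_k$

$X_{i,k} = \xi_{ik}(U_k)$, $\xi_{ik}(x) = 1_{\{x \le \phi_k i - \lfloor \phi_k i \rfloor\}}$

$x \mapsto f_{ik}(x)$ from Hoeffding

$A_k = \{ N[0, k) > k + \alpha_1 \}$

We showed

$\sum_{t=1}^{L} P(N[0, t) - t > \alpha_1) + \sum_{s=1}^{L} P(N[0, s) - s < -\alpha_2) \le \frac{2L}{2L+1} < 1$

By the method of conditional probabilities, it follows that the latency holds w.p. 1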

41

Conclusion

• We showed that one can obtain less pessimistic bounds on latency that hold in probability

• One can derandomize and obtain latencies that hold with probability 1

• In many cases the obtained latencies are better than the best-known latency

• The point-process approach may be used to construct other schedulers

• It is worth trying to obtain sharper results

• The question remains: what is the best possible latency for load larger than 1/4?