

www.elsevier.com/locate/comcom

Computer Communications 29 (2006) 3679–3690

A robust packet scheduling algorithm for proportional delay differentiation services

Jianbin Wei a,*, Cheng-Zhong Xu a, Xiaobo Zhou b, Qing Li a

a Department of Electrical and Computer Engineering, Wayne State University, Detroit, MI 48202, USA
b Department of Computer Science, University of Colorado at Colorado Springs, Colorado Springs, CO 80918, USA

Received 20 November 2004; received in revised form 15 June 2006; accepted 20 June 2006
Available online 12 July 2006

Abstract

The proportional delay differentiation (PDD) model is an important approach to relative differentiated services provisioning on the Internet. It aims to maintain pre-specified packet queueing-delay ratios between different classes of traffic at each hop. Existing PDD packet scheduling algorithms are able to achieve the goal in long time-scales when the system is highly utilized. This paper presents a new PDD scheduling algorithm, called Little’s average delay (LAD), based on a proof of Little’s Law. It monitors the arrival rate of the packets in each traffic class and the cumulative delays of the packets, and schedules packets according to their transient queueing properties in order to achieve the desired class delay ratios in both short and long time-scales. Simulation results show that LAD is able to provide predictable and controllable services in various system conditions and that such services, whenever feasible, can be guaranteed, independent of the distributions of packet arrivals and sizes. In comparison with other PDD scheduling algorithms, LAD can provide the same level of service quality in long time-scales and more accurate and robust control over the delay ratio in short time-scales. In particular, LAD outperforms its main competitors significantly when the desired delay ratio is large.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Quality of service; Packet scheduling; Proportional delay differentiation; Little’s law

1. Introduction

The past decade has seen an increasing demand for provisioning of different levels of quality of service (QoS) on the Internet to support different types of network applications and different user requirements. To meet this demand, two service architectures have been proposed: Integrated Services (IntServ) [4] and Differentiated Services (DiffServ) [2]. IntServ requires reserving routing resources along the service delivery paths using a protocol like Resource Reservation Protocol (RSVP) for QoS guarantee. Since all the routers need to maintain per-flow state information, this requirement hinders the IntServ architecture from widespread deployment.

0140-3664/$ - see front matter © 2006 Elsevier B.V. All rights reserved.

doi:10.1016/j.comcom.2006.06.009

* Corresponding author. Address: Department of Mathematics and Computer Science, South Dakota School of Mines & Technology, Rapid City, SD 57701, USA. Tel.: +1 13135775147; fax: +1 13135771101.

E-mail addresses: [email protected] (J. Wei), [email protected] (C.-Z. Xu), [email protected] (X. Zhou), [email protected] (Q. Li).

In contrast, DiffServ aims to provide differentiated services among classes of aggregated traffic flows, instead of offering absolute QoS measures to individual flows. It is implemented by stateless priority scheduling in the core routers, in collaboration with stateful resource management at the network edges. To receive different levels of QoS, application packets are assigned to different service types or traffic classes at the network edges [21]; DiffServ-compatible routers in the network core perform stateless prioritized packet forwarding, so-called “per-hop behaviors” (PHBs), on the classified packets. Due to its per-class stateless routing, the DiffServ architecture exhibits good scalability.

Early PHB proposals of DiffServ focused on the construction of versatile end-to-end services with guaranteed QoS. Two examples are “expedited forwarding” [10] and “assured forwarding” [9]. An alternative to absolute DiffServ is a relative differentiated services model to quantify the difference of QoS between classes of traffic. In this model, the network traffic is divided into a number of classes with ordered QoS requirements in such a way that the traffic of a higher ranked class receives better (or at least no worse) services than the traffic of lower ranked classes, in terms of local (per-hop) metrics like queueing delay and packet loss [5]. The Internet traffic is classified by applications and users at the network edges according to various service-level cost/performance agreements and policy constraints. Due to the lack of admission control or resource reservation in the network core, relative DiffServ provides no QoS guarantee to services. However, with the support of server-side QoS adaptation, DiffServ-capable routers assure end-to-end relative service differentiation. Although absolute DiffServ is desirable for Internet services like audio/video streaming applications that have hard time constraints, relative DiffServ with respect to delay is sufficient for soft real-time applications like e-Commerce transactions.

Recently, Dovrolis et al. defined a proportional delay differentiation (PDD) model in support of relative DiffServ [6,7]. It ensures that the quality spacing between classes of traffic is proportional to certain pre-specified class differentiation parameters. Since then, many packet scheduling algorithms have been developed to implement the PDD model. Representatives of the PDD algorithms include backlog-proportional rate (BPR) [6], joint buffer management and scheduling (JoBS) [14], waiting-time priority (WTP) [7], adaptive WTP [12], hybrid proportional delay (HPD) [7], and mean-delay proportional (MDP) [18]. These algorithms demonstrate various characteristics in support of the PDD model under different class load conditions and in different time-scales. Most of them are capable of achieving desired delay ratios, if the ratios are feasible, under heavy load conditions and in long time-scales. For example, HPD takes into account the delay of the head-of-line packet of a backlogged class and the delay of departed packets simultaneously, and achieves desired delay ratios on average in both short and long time-scales when the delay ratio is small. However, it yields large statistical variations of the ratio in short time-scales. For large desired delay ratios, large relative errors (on average) are observed as well. Details of the PDD algorithms are reviewed in Section 2.

In this paper, we present a new PDD algorithm, called Little’s average delay (LAD), based on a proof of Little’s Law. Little’s Law states the stationary, long-run relationship between the average queue length, the arrival rate, and the average queueing delay of a queueing system [15]. Its proof reveals a transient property of the queue length [22]: over any time window, the average queue length of a class equals the arrival rate multiplied by the average delay per arrived packet, where the delay of a packet is its experienced delay if it has departed and its waiting time so far if it is still backlogged. Accordingly, LAD monitors the average arrival rate of every traffic class and the queueing delay of arrived packets, including both the waiting packets in the queue and departed packets, for the purpose of controlling the delay ratio in both long and short time-scales.

Simulation results show that LAD overcomes the limitations of its main competitors: AWTP, HPD, and MDP. Specifically, whenever the PDD model of a desired class delay ratio is feasible, LAD is capable of providing more accurate and robust control over the delay ratio than its competitors in short time-scales. The improvement is significant when the desired delay ratio is large. In long time-scales, LAD performs no worse than its competitors under any load conditions. Moreover, the performance of LAD is independent of the distributions of packet arrivals and packet sizes because of the generality of Little’s Law.

The remainder of the paper is organized as follows. Section 2 gives an overview of the PDD model and a brief review of the existing PDD algorithms. Section 3 presents the LAD algorithm and discusses its design and implementation issues. Section 4 evaluates the algorithm via extensive simulation and compares it with other PDD algorithms. We conclude this paper in Section 5.

2. Background and related work

2.1. Proportional Delay Differentiation Model

We consider packet scheduling of a lossless, work-conserving, and non-preemptive link that services $M$ ($M \ge 2$) first-come-first-served (FCFS) queues, one for each traffic class (hereinafter the terms “queue” and “class” will be used interchangeably). The lossless property requires that the average arrival rate of the aggregate traffic be less than the link capacity and that there be enough queueing space to buffer backlogged packets. The work-conserving property is that the link is never left idle as long as there are backlogged packets waiting for service in the queues. The non-preemptive property requires that the transmission of a packet cannot be interrupted. It is assumed that the traffic in different classes has independent arrival and packet size processes. Therefore, the aggregate traffic of the queueing system is determined by the superposition of the $M$ traffic streams. Denote by $\lambda_i$ the arrival rate of class $i$, $1 \le i \le M$. It follows that the arrival rate of the aggregate traffic of the system is $\lambda = \sum_{i=1}^{M} \lambda_i$. Let $C$ represent the link capacity. The system utilization rate is $\rho = \left(\sum_{i=1}^{M} \lambda_i x_i\right)/C$, where $x_i$ represents the average packet size of class $i$.

The objective of the PDD model is to control the quality spacing between different classes so that their average delay ratios are proportional to certain class differentiation parameters pre-defined by network operators. Let $W_i$ denote the average delay of class $i$, and $\delta_i$ the pre-defined delay differentiation parameter. The PDD model requires that for any two classes $i$ and $j$, $1 \le i, j \le M$,

$$\frac{W_i}{W_j} = \frac{\delta_i}{\delta_j}. \tag{1}$$

Notice that the PDD model is not always feasible. Because of the additional constraint of the Conservation Law in priority queueing systems, there may not exist a work-conserving scheduler that can meet the constraint of (1). It is known that the average delay of a class has a minimum value due to its inherent class load, and that this minimum can be achieved by the use of a strict priority based scheduler over the classes. In the strict priority based scheduler, class $i$ cannot be serviced until the queues for classes $i+1, i+2, \ldots, M$ are all empty. Let $W_i^{sp}$ be the average delay of class $i$ under strict priority scheduling. The upper bound of the feasible delay ratio for a G/G/1 system with two classes is $W_1^{sp}/W_2^{sp}$ [7]. For an M/G/1 system with two classes, the upper bound is $1/(1-\rho)$ [3,12].

The PDD model requires the differentiated services to be predictable and controllable, in the sense that network operators should be able to adjust the service quality spacing between any two classes by setting delay differentiation parameters, and that the average delay ratios of different classes should be consistent with their delay differentiation parameters in both long and short time-scales. Such consistency should also be maintained for individual packets departing successively from different classes. In addition, the service differentiation should be independent of class load and traffic characteristics. Regardless of the distributions of packet arrivals and sizes, the consistency should be maintained whenever the PDD model is feasible.
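As a small illustration of the quantities above, the following sketch computes the utilization rate $\rho$ and tests a target delay ratio against the two-class M/G/1 bound $1/(1-\rho)$ quoted in the text. The function names and the numbers are hypothetical and chosen only for illustration; the check covers only the two-class M/G/1 case and says nothing about G/G/1 systems or more than two classes.

```python
def utilization(arrival_rates, mean_sizes, capacity):
    """System utilization rho = sum_i(lambda_i * x_i) / C."""
    if len(arrival_rates) != len(mean_sizes):
        raise ValueError("one mean packet size is needed per class")
    return sum(lam * x for lam, x in zip(arrival_rates, mean_sizes)) / capacity


def max_ratio_two_class_mg1(rho):
    """Upper bound 1 / (1 - rho) on the delay ratio W_1 / W_2 (two-class M/G/1 only)."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("a lossless link needs 0 <= rho < 1")
    return 1.0 / (1.0 - rho)


def ratio_is_feasible(delta_1, delta_2, rho):
    """True if the target ratio delta_1 / delta_2 does not exceed the bound."""
    return delta_1 / delta_2 <= max_ratio_two_class_mg1(rho)


# Illustrative (assumed) numbers: rates in packets/s, sizes in bytes, capacity in bytes/s.
rho = utilization([600.0, 1200.0], [500.0, 500.0], 1_000_000.0)
print(rho, max_ratio_two_class_mg1(rho), ratio_is_feasible(8.0, 1.0, rho))
```

The bound grows with the load, which matches the observation that large delay ratios are only achievable when the link is heavily utilized.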

2.2. PDD scheduling algorithms

Since Dovrolis et al. formulated the PDD model in 1999 [5], many packet scheduling algorithms have been proposed for this model. The existing PDD algorithms can be classified into three categories: rate-based, time-dependent priority based, and Little’s Law based. Rate-based algorithms, as exemplified by BPR [6] and JoBS [14], adjust service rate allocations of classes dynamically to meet the proportional delay differentiation constraints. BPR adjusts the service rate of a class according to its backlogged queue length, while JoBS allocates the service rate of a class based on delay predictions of its backlogged traffic. Other examples in this category include dynamic weighted fair queueing [13] and the proportional queue control mechanism [17]. Rate-based scheduling algorithms are able to provide different levels of QoS to different classes, but the accuracy of their control over the delay ratio depends on class load conditions [6]. Due to the dynamic nature of Internet traffic, the class load distribution on a router tends to change quickly with time. This limits the applicability of the rate-based PDD algorithms.

Time-dependent priority based algorithms adjust the priority of a backlogged class according to the experienced delay of its head-of-line packet. WTP [7] and adaptive WTP [12] fall into this category. In WTP, the priority of a backlogged class is adjusted to be proportional to its head-of-line packet’s delay normalized with respect to its delay differentiation parameter. It uses a set of control variables $b_i$, $1 \le i \le M$, as scaling factors of the adjustment. Let $\hat{W}_i$ denote the delay of the head-of-line packet of class $i$. According to WTP, the priority of backlogged class $i$, $P_i$, is set to $b_i \hat{W}_i/\delta_i$ dynamically on departure of each packet. A packet of the backlogged class with the highest priority will be forwarded next. Albeit simple, WTP implements the PDD model only when the system utilization rate $\rho$ approaches unity [19,20]. In Section 4, we will also show that its achieved class delay ratios in short time-scales exhibit high statistical variations.
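The WTP selection rule described above can be sketched as follows. This is a minimal illustration, not the implementation used in the cited work: the function name, the use of `None` for an empty queue, and the default of unit scaling factors are assumptions made here.

```python
def wtp_select(head_arrival_times, deltas, now, scaling=None):
    """Return the index of the backlogged class with the highest WTP priority.

    head_arrival_times[i] is the arrival time of the head-of-line packet of
    class i, or None if the class has no backlogged packet (an assumed encoding).
    """
    if scaling is None:
        scaling = [1.0] * len(deltas)  # assumed default for the control variables
    best, best_priority = None, float("-inf")
    for i, arrival in enumerate(head_arrival_times):
        if arrival is None:
            continue  # class i is not backlogged
        # Priority = scaling factor * head-of-line delay / differentiation parameter.
        priority = scaling[i] * (now - arrival) / deltas[i]
        if priority > best_priority:
            best, best_priority = i, priority
    return best


# Three classes with differentiation parameters (4, 2, 1), evaluated at time 10.
print(wtp_select([0.0, 4.0, 8.0], [4.0, 2.0, 1.0], now=10.0))
```

Because the priority depends only on the head-of-line delay at the instant of selection, the rule carries no memory of past departures, which is consistent with the high short-term variation noted in the text.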

Note that the WTP control parameters $b_i$ depend upon the class load distributions. In [12], Leung et al. derived a necessary condition, with respect to the class load distribution, for feasible WTP control parameters to achieve desired class delay ratios. The derivation is based on the assumption that the arrival process of each traffic class is Poisson. They developed an adaptive WTP (AWTP) algorithm to adjust the feasible set of control parameters $\{b_i\}$ according to the delay of the head-of-line packet in each class and the class load distribution. The authors demonstrated the accuracy and adaptivity of the algorithm, in comparison with WTP, under various system utilization rates and in both short and long time-scales. The authors argued that AWTP was applicable to traffic with more practical Pareto distributions. Their simulation assumed a Pareto distribution with the shape parameter $\alpha = 1.9$. We note that the shape parameter $\alpha$ characterizes the degree of self-similarity of network traffic: the larger $\alpha$ is, the less bursty and self-similar the behavior observed in trace studies [11]. In Section 4, we will show that AWTP fails to realize the PDD model for Pareto distributions with small $\alpha$.

The third class of PDD algorithms is based on Little’s Law, which relates the average queue length (in terms of the number of packets in queue) to the average arrival rate and the average waiting time of packets. For a given arrival rate of the packets, these algorithms control the actual delay ratio between different classes by equalizing their normalized queue lengths with respect to the pre-defined delay differentiation parameters. The equalization process is a feedback control process based on the average delay of the arrived packets in a time window. PAD, HPD, and MDP are three representatives in this category. The LAD algorithm proposed in this paper belongs to this category as well. These algorithms differ in how they calculate the average delay. At time $t$, a packet of a class that arrived in a time window $[t-s, t]$ can be in one of two states: departed or waiting in the queue. PAD considers the average delay of departed packets in the time window only. It is capable of achieving the PDD model constraints in various load conditions, provided that the desired class delay ratios are feasible. However, PAD exhibits a pathological behavior in short time-scales (occasionally higher classes experience larger delays than lower classes) because the algorithm ignores the waiting time of backlogged packets. To solve this issue, HPD was proposed to take into account the average delay of departed packets and the delay of the head-of-line packet at the same time. Let $\tilde{W}_i$ denote the average delay of the departed packets and, as before, $\hat{W}_i$ the delay of the head-of-line packet. HPD deploys a simple linear function, $g\tilde{W}_i + (1-g)\hat{W}_i$, $0 \le g \le 1$, to measure the queueing delay of class $i$. The weighting parameter $g$ can be adjusted according to network operators’ requirements. HPD reduces to PAD when $g = 1$ and to WTP when $g = 0$. HPD enhances the average control quality of PAD, and meanwhile avoids its pathological behavior. However, in Section 4, we will show that HPD achieves the class delay ratio with large statistical variations in short time-scales.
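The HPD delay measure is a convex combination of the two quantities named above, and its two limiting cases can be seen directly in a short sketch. The function name and the sample values are hypothetical.

```python
def hpd_delay_metric(avg_departed_delay, head_delay, g):
    """Linear HPD measure: g * (average departed delay) + (1 - g) * (head-of-line delay)."""
    if not 0.0 <= g <= 1.0:
        raise ValueError("the weighting parameter g must lie in [0, 1]")
    return g * avg_departed_delay + (1.0 - g) * head_delay


# Illustrative values: departed packets averaged 2.0 time units of delay,
# while the head-of-line packet has already waited 6.0 time units.
print(hpd_delay_metric(2.0, 6.0, g=1.0))  # only departed packets count (PAD-like)
print(hpd_delay_metric(2.0, 6.0, g=0.0))  # only the head-of-line delay counts (WTP-like)
print(hpd_delay_metric(2.0, 6.0, g=0.5))  # equal weighting of both
```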

MDP considers the delay of all arrived packets of each class in a time window $[t-s, t]$. In addition to the delay of the packets in the window, it also takes into account the estimated delay of backlogged packets in the future interval $[t, \infty)$. In Section 4, we will show that MDP delivers performance comparable to HPD in both short and long time-scales, and that MDP achieves the class delay ratio with smaller variations. However, its performance deteriorates as the target quality spacing between the classes is enlarged.

Note that PAD, HPD and MDP aim to equalize the normalized queue length of different classes based on heuristic delay information of arrived packets. LAD, presented in this paper, is based on a proof of Little’s Law [22]. It considers the delay of departed packets as well as the delay of the packets in the backlogged queue in the time window $[t-s, t]$.

3. Little’s average delay algorithm

3.1. Little’s Law

For a G/G/1 queueing system, Little’s Law states that the average number of packets in the system is equal to the product of the average arrival rate of packets and the average waiting time of the packets in the system. Define $L(T)$ as the average number of packets in the system during the time interval $[0, T]$, $W(T)$ as the waiting time per packet averaged over all packets, and $\lambda(T)$ as the average arrival rate. Suppose $W(T)$ and $\lambda(T)$ have limits as $T \to \infty$, that is,

$$W = \lim_{T \to \infty} W(T), \quad \text{and} \quad \lambda = \lim_{T \to \infty} \lambda(T).$$

Then, the limit of $L(T)$, denoted by $L$, exists and

$$L = \lambda W. \tag{2}$$

The beauty of Little’s Law (2) is that it does not depend upon any particular queueing discipline (packet scheduling algorithm); nor does it depend upon any specific assumptions regarding the packet arrival distribution or the packet size distribution. It is applicable to the queueing system of each traffic class in the PDD model.

The LAD algorithm controls the delay ratios between different classes based on Little’s Law. Substituting $L/\lambda$ for $W$, the objective of the PDD model in (1) leads to a new constraint:

$$\frac{L_i}{\lambda_i \delta_i} = \frac{L_j}{\lambda_j \delta_j}, \tag{3}$$

for any two classes $i$ and $j$. To ensure proportional delay differentiation between two classes, their queue lengths, normalized with respect to their respective arrival rates and delay differentiation parameters, should be kept equal. The LAD algorithm controls the delay ratio by adjusting the average queue lengths of the classes according to their arrival rates.
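For completeness, the substitution that leads from the PDD objective (1) to the normalized queue-length constraint (3) can be written out step by step. It uses only Little’s Law (2) applied to each class separately; no further assumption is introduced.

```latex
\begin{aligned}
W_i = \frac{L_i}{\lambda_i}, \quad W_j = \frac{L_j}{\lambda_j}
  &\;\Longrightarrow\; \frac{W_i}{W_j} = \frac{L_i/\lambda_i}{L_j/\lambda_j} = \frac{\delta_i}{\delta_j}
  && \text{(by (1) and (2))} \\
  &\;\Longrightarrow\; \frac{L_i}{\lambda_i\,\delta_i} = \frac{L_j}{\lambda_j\,\delta_j}
  && \text{(divide both sides by } \delta_i \text{ and multiply by } L_j/\lambda_j\text{)}
\end{aligned}
```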

Notice that (2) reveals an asymptotic (or stationary) relationship between the queue length, packet arrival rate, and packet waiting time in the system. It is not enough to guide PDD scheduling because the objective of proportional delay needs to be met in small time windows. Because most Web requests are small in size [1], provisioning of relative delay differentiation service in short time-scales is as important as in long time-scales. The LAD algorithm is based on a transient property of the queueing system, as revealed by a proof of Little’s Law [22]. Following is a sketch of the proof.

Suppose that packets $p_1, p_2, \ldots$ arrive at times $t_1, t_2, \ldots$ ($0 \le t_i < t_{i+1}$), and depart at $t_1^d, t_2^d, \ldots$. The packets are not necessarily forwarded in FCFS order. Denote by $N(T)$ the total number of arrived packets in the time interval $[0, T]$, and by $N^d(T)$ and $N^c(T)$ the number of departed packets and the number of waiting packets in queue, respectively. It follows that at time $T$,

$$N(T) = N^d(T) + N^c(T). \tag{4}$$

Define $I_i(t)$ as the presence function of packet $p_i$ at time $t$, that is,

$$I_i(t) = \begin{cases} 1, & \text{if packet } p_i \text{ is present at time } t; \\ 0, & \text{otherwise.} \end{cases}$$

Then, for $0 \le t \le T$, we have

$$N^c(t) = \sum_{i=1}^{N(T)} I_i(t). \tag{5}$$

Since packet $p_i$ stays in queue during the interval $[t_i, t_i^d]$ and its queueing delay is $w_i = t_i^d - t_i$, we have

$$\int_0^T I_i(t)\,dt = \begin{cases} w_i, & t_i^d \le T; \\ T - t_i, & t_i^d > T. \end{cases} \tag{6}$$

Therefore, the cumulative queue length in the interval $[0, T]$ is

$$\begin{aligned} \int_0^T N^c(t)\,dt &= \int_0^T \sum_{i=1}^{N(T)} I_i(t)\,dt \\ &= \sum_{i=1}^{N^d(T)+N^c(T)} \int_0^T I_i(t)\,dt \\ &= \sum_{\{i:\, t_i^d \le T\}} w_i + \sum_{\{i:\, t_i \le T,\; t_i^d > T\}} (T - t_i), \end{aligned} \tag{7}$$

and the average queue length in the interval $[0, T]$ is

$$L(T) = \frac{1}{T}\int_0^T N^c(t)\,dt = \lambda(T)\,W(T), \tag{8}$$

where

$$\lambda(T) = \frac{N(T)}{T}, \tag{9}$$

$$W(T) = \frac{\sum_{\{i:\, t_i^d \le T\}} w_i}{N(T)} + \frac{\sum_{\{i:\, t_i \le T,\; t_i^d > T\}} (T - t_i)}{N(T)}. \tag{10}$$

Assuming that the limits of $\lambda(T)$ and $W(T)$ exist as $T \to \infty$, (8) leads to

$$L = \lim_{T \to \infty} \lambda(T)\,W(T) = \lambda W. \tag{11}$$

This completes the proof.
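The transient identity (8) can be checked numerically on a finite window. The sketch below is only a sanity check under assumed inputs: arrival times are drawn uniformly and each packet is given an arbitrary exponential delay, rather than being produced by a scheduler, since the identity holds for any arrival and departure times. The function name and parameter values are hypothetical.

```python
import random


def check_transient_identity(num_packets=200, horizon=50.0, seed=1):
    """Return (time-averaged queue length, lambda(T) * W(T)) on the window [0, horizon]."""
    rng = random.Random(seed)
    arrivals = sorted(rng.uniform(0.0, horizon) for _ in range(num_packets))
    delays = [rng.expovariate(0.5) for _ in arrivals]  # arbitrary, so order is not FCFS
    departures = [t + w for t, w in zip(arrivals, delays)]

    # Right-hand side, following (9) and (10).
    departed = sum(w for w, td in zip(delays, departures) if td <= horizon)
    waiting = sum(horizon - t for t, td in zip(arrivals, departures) if td > horizon)
    n = len(arrivals)                    # N(T): every arrival falls inside [0, T]
    lam = n / horizon                    # equation (9)
    w_bar = departed / n + waiting / n   # equation (10)

    # Left-hand side: integrate the queue length exactly by sweeping over events.
    events = [(t, +1) for t in arrivals]
    events += [(td, -1) for td in departures if td <= horizon]
    events.sort()
    area, in_queue, last = 0.0, 0, 0.0
    for time, step in events:
        area += in_queue * (time - last)
        in_queue += step
        last = time
    area += in_queue * (horizon - last)  # packets still waiting at time T
    return area / horizon, lam * w_bar


lhs, rhs = check_transient_identity()
print(lhs, rhs)
```

The two values agree up to floating-point rounding for any seed, which is the property LAD relies on when it works with finite windows.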

3.2. The LAD algorithm

The basic idea of the LAD algorithm is to control the delay ratio of classes by monitoring their arrival rates and the queueing delays of their arrived packets, based on the transient relationship between the queue length, arrival rate and waiting time, as revealed by (8). In particular, (10) defines the average waiting time per packet in a window of size $T$. The numerator of the first term represents the accumulated delays of all departed packets, and the numerator of the second term represents the accumulated waiting time so far, at time $T$, of the packets in the backlogged queue. Accordingly, we define the LAD algorithm as follows.

For class $i$, the LAD scheduler maintains three control variables to monitor its traffic flow over a finite time window $T$: the cumulative delay of departed packets $W_i^d$; the number of arrived packets $N_i$; and the current queue length $N_i^c$. At the beginning of each time window, these variables are (re)initialized. Note that the size of $T$ is measured in the number of successively departed packets from the system. These control variables are updated according to the following rules:

• At the beginning of each time window, $N_i \leftarrow N_i^c$ and $W_i^d \leftarrow 0$.

• Upon the receipt of a packet of class $i$, the packet is timestamped, $N_i \leftarrow N_i + 1$, and $N_i^c \leftarrow N_i^c + 1$.

• After transmitting a packet of class $i$, $N_i^c \leftarrow N_i^c - 1$ and $W_i^d \leftarrow W_i^d + w$, where $w$ is the measured delay of the packet.

Let $W_i^c$ denote the current cumulative delay of backlogged packets in queue $i$. According to (10), we set the priority of class $i$ as

$$P_i = \frac{W_i^d + W_i^c}{N_i \delta_i}. \tag{12}$$

Whenever the queueing system is available for packet transmission, a backlogged packet of class $j^*$ with the highest priority is selected. That is,

$$j^* = \arg\max_{1 \le i \le M} P_i. \tag{13}$$

Ties for the highest priority are broken by serving the packet that has entered the queueing system earliest. Note that the validity of Little’s Law does not depend upon any particular queueing discipline. Therefore, the next packet can be any backlogged packet if a more complicated scheduling algorithm is needed.
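The update rules and the priority rule (12)–(13) can be put together in a compact sketch. This is an illustration under stated assumptions rather than the authors’ simulator: the class and method names are hypothetical, each class queue is served FCFS, the current queue length is taken as the length of the stored queue, and the cumulative backlog delay is recomputed by scanning the queue for brevity.

```python
from collections import deque


class LadScheduler:
    """Sketch of the LAD rules; each class queue holds arrival timestamps (FCFS)."""

    def __init__(self, deltas, window=100):
        self.deltas = list(deltas)        # delay differentiation parameters
        self.window = window              # window size T, in departed packets
        m = len(self.deltas)
        self.queues = [deque() for _ in range(m)]
        self.arrived = [0] * m            # N_i
        self.departed_delay = [0.0] * m   # W_i^d
        self.sent_in_window = 0

    def enqueue(self, cls, now):
        self.queues[cls].append(now)      # timestamp the packet
        self.arrived[cls] += 1            # N_i <- N_i + 1 (N_i^c is len(queue))

    def _priority(self, cls, now):
        backlog_delay = sum(now - t for t in self.queues[cls])  # W_i^c by scanning
        total = self.departed_delay[cls] + backlog_delay
        return total / (self.arrived[cls] * self.deltas[cls])   # equation (12)

    def dequeue(self, now):
        backlogged = [i for i, q in enumerate(self.queues) if q]
        if not backlogged:
            return None
        # Equation (13); ties go to the class whose head packet arrived earliest.
        best = max(backlogged, key=lambda i: (self._priority(i, now), -self.queues[i][0]))
        arrival = self.queues[best].popleft()       # N_i^c <- N_i^c - 1
        self.departed_delay[best] += now - arrival  # W_i^d <- W_i^d + w
        self.sent_in_window += 1
        if self.sent_in_window >= self.window:
            self._start_new_window()
        return best, now - arrival

    def _start_new_window(self):
        self.sent_in_window = 0
        for i, q in enumerate(self.queues):
            self.arrived[i] = len(q)       # N_i <- N_i^c
            self.departed_delay[i] = 0.0   # W_i^d <- 0


sched = LadScheduler([4.0, 1.0])
sched.enqueue(0, 0.0)
sched.enqueue(1, 0.0)
print(sched.dequeue(1.0), sched.dequeue(2.0), sched.dequeue(3.0))
```

A backlogged class always has at least one counted arrival, so the division in the priority computation is safe for the classes that are actually compared.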

There are some important issues in the implementation of the LAD algorithm. The foremost is the time window size $T$. It is known that Little’s Law is valid when the time window is sufficiently large. However, provisioning PDD services in short time-scales is as important as in long time-scales. A good choice of $T$ should strike a balance between system stability and responsiveness. On one side, a large $T$ would avoid abrupt changes of average queueing delay due to bursty traffic. In particular, when $T$ is sufficiently large, the average delay of the packets in the time window would hide the effect of the distributions of packet arrivals and packet sizes. On the other side, a small $T$ would lead to an agile scheduler that responds quickly to changes in traffic conditions. As we shall show in Section 4.4, LAD is able to provide PDD services in both long and short time-scales. Thus, we believe $T = 100$ packets is a good choice since it gives good responsiveness to LAD.

Another important implementation issue is the calculation of the cumulative delay of backlogged packets in each queue, $W_i^c$. It is too costly to scan each queue to re-calculate $W_i^c$ every time a packet is to be transmitted and the priority of each class needs to be adjusted. Instead, we calculate $W_i^c$ recursively in the following way. Suppose that at time $u$, when the last transmitted packet was selected from class $i$, the class has $m$ backlogged packets $p_1, p_2, \ldots, p_m$ and their arrival times are $t_1, t_2, \ldots, t_m$, respectively. Since the queueing system assumes no FCFS scheduling principle, the next packet to be selected for forwarding from class $i$ can be any packet in the queue. Without loss of generality, we assume packet $p_k$ is forwarded at time $u + s$, that is, the time interval between two successive packet departures is $s$. Suppose there are $n$ new packet arrivals during the interval and their arrival times are $t_{m+1}, t_{m+2}, \ldots, t_{m+n}$. It follows that

$$W_i^c(u+s) = W_i^c(u) - (u - t_k) + (m-1) \times s + \sum_{j=1}^{n} (u + s - t_{m+j}). \tag{14}$$

Recall that the traffic of class $i$ has an arrival rate of $\lambda_i$. During the interval $s$, the average number of packet arrivals is $\lambda_i s$. Note that $E[s]$ is the average service time of a packet. For a stable system, it should be less than or equal to the average inter-arrival time. Thus, for $E[n]$, the average number of packets entering the system during $s$, we have $E[n] = \sum_{i=1}^{M} E[n_i] \le 1$, where $E[n_i]$ is the average number of packets arriving from class $i$. Therefore, the main computation overhead of the update is the multiplication, which is acceptable in a real environment [7].
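The recursive update (14) can be compared against a direct rescan of the queue on a small example. The helper names and the sample arrival times below are hypothetical; the forwarded packet is deliberately not the head of the queue, to reflect that (14) does not assume FCFS order.

```python
def backlog_delay_direct(arrival_times, now):
    """Cumulative waiting time of the backlogged packets, recomputed by scanning."""
    return sum(now - t for t in arrival_times)


def backlog_delay_update(previous, u, s, m, t_k, new_arrivals):
    """Recursive update of equation (14) after packet p_k (arrival t_k) departs at u + s."""
    return previous - (u - t_k) + (m - 1) * s + sum(u + s - t for t in new_arrivals)


u, s = 10.0, 0.5
queue = [7.0, 8.5, 9.25]        # arrival times of the m backlogged packets at time u
new = [10.125, 10.375]          # arrivals during the interval (u, u + s]
k = 1                           # index of the forwarded packet (not the head)

before = backlog_delay_direct(queue, u)
updated = backlog_delay_update(before, u, s, len(queue), queue[k], new)
remaining = queue[:k] + queue[k + 1:] + new
print(updated, backlog_delay_direct(remaining, u + s))
```

The update touches only the departed packet, the new arrivals, and one multiplication, whereas the rescan grows with the queue length.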

For each packet transmission, LAD needs to calculate and compare the priorities of all backlogged classes, which requires at most $M$ calculations and $M - 1$ comparisons. The calculation overhead is mainly due to the update of control variables and timestamping operations. The cost of the update is small because it involves only a few addition operations. The timestamping operation is needed so that the delay of a packet can be measured. As pointed out in [7], we do not expect this requirement to be a significant difficulty in practice. Furthermore, the timestamping operation is required for all PDD algorithms.

The AWTP, WTP, PAD, HPD, and MDP algorithms also require $M - 1$ comparisons to select the backlogged class with the highest priority. In addition, AWTP needs to record the number of arrived packets to estimate the load of every class, which requires a few addition operations. PAD needs to record the delay of all departed packets with a few addition operations. HPD is a combination of PAD and WTP. Its number of addition operations is the same as that of PAD since WTP requires no such addition operations. MDP requires more operations than LAD since it needs to estimate the average delay of backlogged packets as well as calculate the delay of all arrived packets. In summary, the complexity of LAD is smaller than that of MDP, similar to that of PAD and HPD, and slightly larger than that of AWTP and WTP.

4. Simulation results

In this section, we present simulation results of LAD to demonstrate its performance and properties. We first investigate the predictability and controllability of LAD under different system conditions (i.e. class load distribution and system utilization) and in various time-scales. Then, we illustrate its generality by using arbitrary distributions of packet arrivals and sizes. Finally, we compare LAD with other PDD algorithms, including WTP, AWTP, PAD, HPD, and MDP. The primary performance metric is the error between the desired class delay ratio and the achieved ratio. The results are an average of 1000 runs.

The experiments are based on a model that consists of three main components:

• N packet generators, which generate packets with independent arrival and size distributions by using the GNU Scientific Library [8].

• N packet queues, which hold packets of the corresponding classes. Although LAD imposes no requirements on the scheduling of packets within each queue, the simulator assumes the FCFS principle to ensure the same service order across runs.

• A packet scheduler that performs LAD and other PDD algorithms under various settings of the proportional delay differentiation parameters.
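The three components above can be approximated by a small event loop. The sketch below is only an illustration under stated assumptions: it is written in Python rather than with the GNU Scientific Library, it assumes a single non-preemptive server with a fixed service time, and the names `simulate`, `strict_priority`, and `make_wtp` are hypothetical. The scheduling policy is pluggable; the two example policies are strict priority and a WTP-style rule (head-of-line waiting time divided by the class's delay differentiation parameter), not LAD itself.

```python
from collections import deque


def simulate(arrivals, policy, service_time=1.0):
    """Serve packets from per-class FCFS queues with one non-preemptive server.

    arrivals: one sorted list of arrival times per class.
    policy:   callable (now, queues, backlogged) -> index of the class to serve.
    Returns one list of queueing delays per class.
    """
    n = len(arrivals)
    next_idx = [0] * n                     # next packet of each class not yet queued
    queues = [deque() for _ in range(n)]   # arrival times of waiting packets
    delays = [[] for _ in range(n)]
    remaining = sum(len(a) for a in arrivals)
    now = 0.0
    while remaining:
        # Move every packet that has arrived by `now` into its class queue.
        for c in range(n):
            while next_idx[c] < len(arrivals[c]) and arrivals[c][next_idx[c]] <= now:
                queues[c].append(arrivals[c][next_idx[c]])
                next_idx[c] += 1
        backlogged = [c for c in range(n) if queues[c]]
        if not backlogged:
            # Server idle: jump to the earliest future arrival.
            now = min(arrivals[c][next_idx[c]] for c in range(n)
                      if next_idx[c] < len(arrivals[c]))
            continue
        c = policy(now, queues, backlogged)
        delays[c].append(now - queues[c].popleft())  # queueing delay only
        now += service_time
        remaining -= 1
    return delays


def strict_priority(now, queues, backlogged):
    """Always serve the backlogged class with the largest index."""
    return max(backlogged)


def make_wtp(deltas):
    """WTP-style policy: largest head-of-line waiting time normalized by delta."""
    def policy(now, queues, backlogged):
        return max(backlogged, key=lambda c: (now - queues[c][0]) / deltas[c])
    return policy


print(simulate([[0.0, 0.0], [0.0]], strict_priority))  # [[1.0, 2.0], [0.0]]
```

Feeding the same pre-generated `arrivals` lists to different policies mirrors the practice, described later in Section 4.4, of reusing one packet stream across algorithms.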

The experiments assumed distributions of packet arrivals and sizes similar to those in [7,12]. That is, the interarrival times between packets of a class are independent and identically distributed (i.i.d.) random variables that follow a Pareto or Poisson distribution. The packets are either uniform in size or variable with a small number of choices. The transmission time of a packet is proportional to its size.

The probability density function P(y) of the Pareto distribution and its mean μ are

$$P(y) = \frac{\alpha \beta^{\alpha}}{y^{\alpha+1}}, \tag{15}$$

$$\mu = \frac{\alpha \beta}{\alpha - 1}, \tag{16}$$

where α, 0 < α < 2, is the shape parameter (also called the “tail” index) and β is the scale parameter. It is known that α characterizes the degree of self-similarity of network traffic. The larger α is, the less bursty and self-similar the behavior observed in trace studies [11]. As in [7], we set the shape parameter α = 1.5 in the experiments. With the Pareto distribution, the system utilization is controlled by adjusting β. When α = 1.5, by (16), the system utilization rate is

$$\rho = \frac{1}{\mu} = \frac{\alpha - 1}{\alpha \beta} = \frac{1}{3\beta}. \tag{17}$$
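As a rough illustration of how Eq. (17) can drive a traffic generator, the following Python sketch inverts the relation to obtain the scale parameter for a target utilization and then draws Pareto interarrival times. It uses the standard library's `random.paretovariate` instead of the GNU Scientific Library used in the experiments, and the function names are hypothetical. The code requires a shape parameter greater than 1, since the mean in Eq. (16) is finite only in that case.

```python
import random


def pareto_scale_for_utilization(rho, alpha=1.5):
    """Scale beta such that 1/mean equals rho, i.e. beta = (alpha - 1) / (alpha * rho)."""
    if alpha <= 1.0:
        raise ValueError("the Pareto mean is finite only for alpha > 1")
    if not 0.0 < rho <= 1.0:
        raise ValueError("utilization must lie in (0, 1]")
    return (alpha - 1.0) / (alpha * rho)


def pareto_interarrivals(n, rho, alpha=1.5, rng=None):
    """Draw n i.i.d. Pareto interarrival times whose mean is 1/rho."""
    rng = rng or random.Random()
    beta = pareto_scale_for_utilization(rho, alpha)
    # random.paretovariate(alpha) has scale 1 (samples >= 1); multiply by beta to rescale.
    return [beta * rng.paretovariate(alpha) for _ in range(n)]


beta = pareto_scale_for_utilization(0.9)       # equals 1 / (3 * 0.9) for alpha = 1.5
samples = pareto_interarrivals(5, 0.9, rng=random.Random(42))
```

Because the variance of a Pareto distribution with shape 1.5 is infinite, the sample mean of such a stream converges slowly, which is consistent with the transient utilization effects discussed in Section 4.1.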

Finally, we note that a possible performance factor of the PDD algorithms is the class load distribution. The load of a traffic class is measured in terms of service time. When the packets in different classes have the same size, the load distribution between two classes i and j is equal to the ratio of λi to λj.
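For equal-size packets, a class load distribution such as (1, 2, 4) can be turned into per-class arrival rates by splitting the total utilization proportionally. The short Python sketch below shows this arithmetic; it assumes a unit service time by default, and the function name `class_arrival_rates` is a hypothetical helper rather than part of the simulator described here.

```python
def class_arrival_rates(rho, load_weights, service_time=1.0):
    """Split utilization rho across classes in proportion to load_weights.

    With equal-size packets, utilization = total arrival rate * service_time,
    so the class rates are in the same ratio as the loads.
    """
    total_weight = sum(load_weights)
    return [rho * w / (total_weight * service_time) for w in load_weights]


rates = class_arrival_rates(0.9, (1, 2, 4))
mean_interarrivals = [1.0 / r for r in rates]   # per-class mean interarrival times
```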

4.1. Predictability of LAD

We investigated the predictability of LAD in experiments over three classes of Pareto-distributed traffic (α = 1.5). Their delay differentiation parameters (δ1, δ2, δ3) were set to (4, 2, 1), and the class load distribution (λ1, λ2, λ3) varied among (1, 1, 1), (1, 2, 4), and (4, 2, 1).

We obtained the simulation results in short (T = 100), moderate (T = 1000), and long (T = 10,000) time-scales as the system utilization rate ρ varied. Due to space limitations, Fig. 1 shows only the results for the short and long time-scales. From Fig. 1, we can see that LAD achieves the desired delay ratios accurately in long time-scales, independent of class load distributions, in all the test cases.
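One way to evaluate delay ratios at a given time-scale is sketched below in Python. The exact windowing used in the experiments is not spelled out in this section, so the sketch assumes that windows are formed from T consecutive departures and that the achieved ratio in a window is the ratio of the per-class mean delays; the function names are hypothetical.

```python
def windowed_delay_ratios(departures, class_i, class_j, window):
    """Ratio of mean delays (class_i over class_j) in consecutive windows.

    departures: list of (class_id, delay) tuples in departure order.
    Windows that lack either class, or where class_j has zero mean delay, are skipped.
    """
    ratios = []
    for start in range(0, len(departures) - window + 1, window):
        chunk = departures[start:start + window]
        delays_i = [d for c, d in chunk if c == class_i]
        delays_j = [d for c, d in chunk if c == class_j]
        if not delays_i or not delays_j:
            continue
        mean_j = sum(delays_j) / len(delays_j)
        if mean_j == 0:
            continue
        ratios.append((sum(delays_i) / len(delays_i)) / mean_j)
    return ratios


def relative_error(achieved, desired):
    """Relative error between an achieved and a desired delay ratio."""
    return abs(achieved - desired) / desired


trace = [(1, 4.0), (2, 2.0), (1, 8.0), (2, 4.0)]
print(windowed_delay_ratios(trace, 1, 2, window=2))  # [2.0, 2.0]
```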

As the timescale decreases, the error between the achieved and desired delay ratios under moderate system utilization rates is no longer negligible, particularly for the desired delay ratio of 4. This is mainly because of the difference in system utilization between short and long time-scales. Although the experiment assumed stable system utilization rates in the long run, these rates were hardly maintained accurately in the short run. Due to the burstiness of Internet traffic, the transient utilization rates in short time windows were often lower than the stationary rates. That means that even when the desired delay ratio can be achieved in long time-scales, it may be infeasible in short time-scales. For example, when the long-run system utilization is 65%, the transient system utilization in short time-scales in most of the runs would be too low to ensure the feasibility of the desired delay ratio of 4 between class 1 and class 3. When the long-run utilization reaches 80%, the transient utilization rates in most of the runs become high enough to achieve the target delay ratios.

Fig. 1. Delay ratios of three classes using LAD in different system utilizations and time-scales. (a) T = 100 packets. (b) T = 10,000 packets.

Fig. 2. Individual packet delays of three classes using LAD in different system utilizations. (a) ρ = 90%; (b) ρ = 65%.


Note that the fact that the feasibility of a delay ratio under LAD is affected by the system utilization rate does not mean that LAD requires an accurate estimate of the system utilization rate. This can be seen from the results for the class delay ratio of 2 in Fig. 1. As long as the utilization is high enough for the PDD ratio to be feasible, LAD can achieve the ratio in both short and long time-scales, independent of class load distributions.

To investigate the behaviors of individual packets from different classes, we plot in Fig. 2 the individual delays of packets that departed between the 3000th and 4000th time units. Fig. 2(a) shows the results when the system utilization rate was set to 90% and all classes had the same load (i.e., λ1 = λ2 = λ3). From this figure, we can see that individual packets of a higher class exhibit smaller changes of delay than those of a lower class in both the time dimension (x axis) and the delay dimension (y axis). We refer to this as a “scale-difference”. For example, in the delay dimension, the delay of individual packets in class 3 ranges between 15 and 55 time units, while that of class 1 ranges between 110 and 210 time units. This scale-difference suggests that when each class has backlogged packets in its queue, the packets in class 3 (the higher-quality class) tend to be selected with a higher probability than those in class 1. In the time dimension, the periods of monotonically increasing or decreasing delay in class 3 last between 10 and 50 time units, while those of class 1 last between 50 and 150 time units. This time-wise scale-difference implies that the backlogged packets in class 3 tend to be forwarded at a faster rate than those in class 1.

When the system load is low, the scale-difference phenomenon is not obvious in the time dimension, but remains evident in the delay dimension. Fig. 2(b) depicts the individual packet delays when the system utilization was set to 65%. In general, the queue lengths in a lightly utilized system are short and their differences are often small. This leads to short periods of monotonic change in each class in the time dimension and minimal differences between their periods. In the delay dimension, the backlogged packets of a higher class still have a higher probability of being forwarded than those of a lower class and therefore experience smaller delays. In Fig. 2(a) and (b), there are no signs of pathological behavior in the packets of the different classes.

In summary, we conclude that LAD is capable of providing predictable delay differentiation services in both short and long time-scales. In heavy load conditions, packets in different classes receive different qualities of service, which are consistent with their per-class differentiation parameters.

4.2. Controllability of LAD

We studied the controllability of LAD through an experiment over two classes of traffic with an equal load distribution (i.e., λ1 = λ2). Fig. 3 plots the achieved class delay ratios in various time-scales, in comparison with the desired delay ratios, as the system utilization rate changes. From this group of figures, it can be observed that LAD is capable of achieving the target delay ratio of 2 in all the time-scales that we tested under medium or high system utilization rates. When the desired delay differentiation ratio δ1/δ2 increases to 4, a marginal error (less than 10%) occurs. Fig. 3(b) shows that when the desired ratio goes up to 8, LAD cannot meet the PDD constraints in the long timescale unless the system utilization rate is higher than 70%. As mentioned above, this is because the ratio is infeasible when ρ ≤ 70% in the long timescale. Whenever the ratio is feasible, the service difference between these two classes can be controlled accurately by network operators. As the desired class delay ratio δ1/δ2 increases, the minimum system utilization rate that makes the given delay ratio feasible also increases.

Fig. 3. Delay ratios of class 1 to class 2 using LAD in different system utilizations and time-scales. The desired delay ratios are 2, 4, 8, 16, and 32. (a) T = 100 packets; (b) T = 10,000 packets.

Comparing Fig. 3(b) with Fig. 3(a), it can be seen that the error of the achieved ratio decreases as the timescale increases. For example, the delay ratio δ1/δ2 = 16 cannot be achieved in the small timescale of 100 packets unless the system utilization rate ρ ≥ 85%. When the timescale T increases to 10,000 packets, the desired ratio can be approximated once ρ is equal to or larger than 75%. This is due to the difference between the transient system utilization in short time-scales and the stationary system utilization in long time-scales, as we explained in Section 4.1. Since the transient utilization rate is often smaller than the stationary utilization rate for bursty traffic, a delay ratio that is feasible in a short timescale tends to be feasible in long time-scales too under the same traffic conditions.

4.3. Generality of LAD

Recall that the generality of a PDD algorithm means that its performance should be independent of the distributions of packet arrivals and sizes. The results in the preceding experiments are for Pareto-distributed traffic with a uniform packet size. We studied the generality of LAD through experiments over two classes with equal class loads in the long timescale. In addition, we assumed that the packet arrivals of the same class followed a Pareto or a Poisson distribution and that the packet sizes of each class were variable. The variable packet sizes were set in the same pattern as in [6]. That is, 40% of the packets were 40 bytes, 50% were 550 bytes, and 10% were 1500 bytes. The transmission time of a packet of average size (441 bytes) is referred to as one time unit.
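The average size of 441 bytes follows directly from the size mix, and the same mix gives the transmission time of each packet size in time units. A small Python check, with hypothetical names `SIZE_MIX`, `mean_packet_size`, and `transmission_time`, is:

```python
# Packet size in bytes -> share of packets in percent (mix quoted in the text).
SIZE_MIX = {40: 40, 550: 50, 1500: 10}


def mean_packet_size(mix=SIZE_MIX):
    """Weighted average packet size in bytes."""
    return sum(size * share for size, share in mix.items()) / sum(mix.values())


def transmission_time(size_bytes, mix=SIZE_MIX):
    """Transmission time in time units, where the average-size packet takes one unit."""
    return size_bytes / mean_packet_size(mix)


print(mean_packet_size())  # 441.0
```

Under this normalization, a 1500-byte packet takes about 3.4 time units and a 40-byte packet about 0.09 time units.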

We carried out the experiments for traffic with different packet arrival distributions: Pareto, Poisson, and mixed. In the mixed arrival distribution, we assumed that packets of class 1 followed a Poisson distribution and packets of class 2 followed a Pareto distribution. Fig. 4 presents the simulation results of the experiments with variable packet sizes. To measure the actual feasible range for the same packet stream, the results of the strict priority-based algorithm are included as well.

From Fig. 4, we can observe that the feasible delay ratio increases with the system utilization rate and that whenever the desired ratio is feasible, LAD achieves it accurately; otherwise, the ratio achieved by LAD is very close to that measured with the strict priority-based algorithm. For example, Fig. 4(a) shows that the maximum achievable delay ratio is 22 when the system utilization rate ρ = 85%. Although the desired ratio of 32 is infeasible, LAD achieves the maximum feasible ratio.

Note that in Fig. 4(b), the feasible ranges are different from those calculated using 1/(1 − ρ). This is due to the difference between the transient system utilization rate in a short time window and the stationary utilization rate in the long run. This difference becomes smaller as the time window size increases. Hence, the maximum feasible delay ratio becomes closer to the theoretical upper bound. For example, with the upper bound of 10 at ρ = 90%, when the time window increases from 100,000 to 1,000,000 packets, the measured feasible ratio changes slightly from 9.65 to 9.84.

Fig. 4. Delay ratios of class 1 to class 2 using LAD and priority-based packet scheduling algorithms. The packets have various sizes. (a) Pareto arrival distribution; (b) Poisson arrival distribution; (c) Mixed arrival distribution.
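The numbers quoted for the Poisson case can be checked with a few lines of Python. The sketch simply evaluates the bound 1/(1 − ρ) stated in the text and the relative shortfall of the two measured ratios; the function names are hypothetical.

```python
def feasible_ratio_bound(rho):
    """Upper bound on the feasible delay ratio quoted in the text: 1 / (1 - rho)."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("utilization must lie in [0, 1)")
    return 1.0 / (1.0 - rho)


def shortfall(measured_ratio, rho):
    """Relative gap between a measured feasible ratio and the bound."""
    return 1.0 - measured_ratio / feasible_ratio_bound(rho)


for measured in (9.65, 9.84):
    print(f"{measured}: {shortfall(measured, 0.9):.1%} below the bound")
```

At a utilization of 90% the bound is 10, so the measured ratios of 9.65 and 9.84 fall about 3.5% and 1.6% short, respectively.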

In summary, Fig. 4 shows that the performance of LAD is not affected by system utilization rates and is independent of the packet arrival distribution whenever the desired delay ratio is feasible. In comparison with the results for uniform packet sizes in the preceding experiments, we can also observe its independence of packet sizes. This generality of LAD is attributed to the generality of Little’s Law.

4.4. Comparison with other PDD algorithms

We compared LAD with other PDD algorithms, including WTP, AWTP, PAD, HPD, and MDP. In the experiments, we assumed two classes of traffic with equal class loads. The packet arrivals of each class followed a Pareto distribution (α = 1.5) and all the packets had equal size. We generated a stream of packets beforehand and used the same packet stream for all the experiments with the different PDD algorithms. Recall that AWTP adjusts the feasible set of control parameters according to the delay of the head-of-line packet in each class and the system utilization. Our implementation of AWTP used the jumping window method, as suggested in [12], to estimate the arrival rate of traffic. HPD is a hybrid of WTP and PAD with a weighting parameter g. We set g to 0.875, as recommended in [7]. MDP takes into account the delay of departed packets and the estimated delay of all other waiting packets in the determination of class priorities. Although the MDP authors suggested a simplified method that approximates the average delay of all arrived packets to trade off quality against run-time overhead [18], we implemented the original version in this experiment.

4.4.1. Comparison in short timescale

We first compared the short-timescale performance of the algorithms under different system utilization rates. The time window was set to T = 100 packets. The simulation results for the cases of δ1/δ2 = 2 and 8 are plotted in Fig. 5(a) and (b), respectively.

Fig. 5(a) shows that all the PDD algorithms except AWTP can meet the PDD constraints to an acceptable extent for a small delay ratio under moderate and high system load conditions. In particular, LAD consistently achieves the desired delay ratio with minimum error. In contrast, HPD and MDP demonstrate good performance under moderate load conditions, but yield relatively large errors when the system utilization rate goes up to as high as 90%. Recall that HPD is a hybrid of WTP and PAD. Both WTP and PAD gain performance as the utilization rate increases, but their improvement rates are different. Hence, a linear combination of WTP and PAD with a constant weighting parameter g in HPD is expected to generate a convex performance plot with respect to the utilization rate. This impact of the linear combination can be seen more clearly in Fig. 5(b) for the case of a large desired delay ratio.
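To make the linear combination concrete, the following Python sketch computes an HPD-style priority from a WTP term and a PAD term. It rests on assumptions that this section does not state and that should be checked against [7]: that the WTP priority is the head-of-line waiting time normalized by the class's delay differentiation parameter, that the PAD priority is the average delay of departed packets normalized in the same way, and that g weights the PAD term. All function names are hypothetical.

```python
def wtp_priority(head_wait, delta):
    """Assumed WTP priority: head-of-line waiting time normalized by delta."""
    return head_wait / delta


def pad_priority(avg_departed_delay, delta):
    """Assumed PAD priority: average delay of departed packets normalized by delta."""
    return avg_departed_delay / delta


def hpd_priority(head_wait, avg_departed_delay, delta, g=0.875):
    """Linear combination of the two priorities with a constant weight g."""
    return (g * pad_priority(avg_departed_delay, delta)
            + (1.0 - g) * wtp_priority(head_wait, delta))


print(hpd_priority(head_wait=8.0, avg_departed_delay=16.0, delta=2.0))  # 7.5
```

Setting g to 1 or 0 in this sketch recovers the pure PAD or pure WTP term, which is why a fixed g cannot track the different rates at which the two components improve with utilization.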

Fig. 5. Delay ratios of class 1 to class 2 using different PDD algorithms in different system utilizations. The classes have the same load and the time window is 100 packets. (a) δ1/δ2 = 2; (b) δ1/δ2 = 8.

Fig. 6. Impact of shape parameter α on AWTP when applied to Pareto-distributed traffic. δ1/δ2 = 2.

The reason for the inaccuracy of MDP in highly utilized systems is the estimation error of the delays of backlogged packets in a time window of [t − τ, ∞) at any time t. Although MDP can measure the delay of packets in the time window [t − τ, t], it uses a lower bound to estimate the delay of the packets in the future window [t, ∞). As the system utilization increases, there are more packets in a backlogged queue during the interval τ, and consequently the estimation error increases. When the system utilization rate goes beyond a certain point, the impact of the estimation error becomes significant and the overall performance of MDP starts to deteriorate. Fig. 5(b) shows that the estimation error is amplified in the case of a large desired delay ratio and that the gap between LAD and MDP widens.

Fig. 5 shows that WTP yields relatively large errors when the system utilization rate is moderate. This is consistent with the findings of other researchers [7,12]. AWTP was proposed as a remedy for this problem [12]. It relies on a policy iteration algorithm to adjust the feasible set of control parameters according to the delay of the head-of-line packet in each class and the class load distributions. The algorithm is based on the assumption that the arrival process of each traffic class follows a Poisson distribution. The authors showed that AWTP is applicable to traffic with a Pareto distribution with the shape parameter α = 1.9.

We note that the shape parameter α characterizes the degree of self-similarity and burstiness of network traffic. The larger α is, the less bursty and self-similar the behavior observed in trace studies [11]. Given that 0 < α < 2, the Pareto distribution with α = 1.9 bears much resemblance to a Poisson distribution. Fig. 5 shows the results for a Pareto distribution with α = 1.5. We experimented with both AWTP and LAD on further Pareto distributions with various values of α and plotted the results in Fig. 6. From this figure, we observe that the control parameters of AWTP are unable to meet the PDD constraints over general Pareto-distributed traffic. By contrast, LAD is insensitive to the shape of the Pareto distribution.

From Fig. 5 we can also observe that the performance of AWTP is close to that of LAD when the system utilization is as high as 95%. This is because when the system is heavily loaded, it can be described using a fluid model, which is independent of the distribution of packet arrivals [16].

4.4.2. Comparison in long time-scales

We compared LAD with other PDD algorithms, focusing on their robustness in different time-scales. The experiment settings remain the same as in the previous experiment, except that the system utilization rate is fixed at 90%. Fig. 7(a) and (b) show three percentiles (the 5th, 50th, and 95th) of the achieved delay ratios for the target ratios of 2 and 8, respectively. For some of the large percentiles, we give the numbers directly in the figures.
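A percentile summary of this kind can be computed from the per-run achieved ratios as sketched below in Python. The percentile convention used in the experiments is not stated, so the sketch assumes the nearest-rank definition; the function names and the sample ratios are hypothetical.

```python
def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of the data at or below it.

    p is an integer between 1 and 100.
    """
    if not values:
        raise ValueError("no values given")
    ordered = sorted(values)
    rank = max(1, (p * len(ordered) + 99) // 100)   # integer ceil(p * n / 100)
    return ordered[rank - 1]


def summarize(ratios):
    """5th, 50th, and 95th percentiles of achieved delay ratios across runs."""
    return {p: percentile(ratios, p) for p in (5, 50, 95)}


print(summarize([2.4, 1.6, 2.0, 1.8, 2.2]))  # {5: 1.6, 50: 2.0, 95: 2.4}
```

A narrow gap between the 5th and 95th percentiles around the target ratio indicates robust control, while a large 95th percentile with a median near the target indicates occasional loss of control.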

Fig. 7 shows that LAD achieves the target ratios accurately in all of the time-scales that we tested and consistently outperforms its competitors in terms of the errors at the various percentiles. This implies that LAD is more robust in keeping the class delay ratio under control and delivers the desired ratio with small statistical variations. Although all the algorithms are able to meet the PDD constraints in terms of their medians with small deviations in long time-scales, LAD stands out in providing tight and robust control, in a statistical sense, over the class delay ratio in short time-scales.

Fig. 7. Percentiles of achieved delay ratios using different PDD algorithms in different time-scales. The two classes have the same load and the system utilization rate is 90%. (a) δ1/δ2 = 2; (b) δ1/δ2 = 8.

Fig. 7(a) shows that all the PDD algorithms except MDP are able to achieve the delay ratio of 2 with a high probability in the short timescale of 100 packets. LAD demonstrates excellent robustness because more than 90% of the runs produce ratios between 1.6 and 2.4. MDP is robust as well, but its achieved ratios center around 1.6. In contrast, AWTP, HPD, and PAD exhibit a “heavy tail” property: the majority of the runs under the control of these algorithms lead to delay ratios that are close to the target ratio of 2, but the algorithms can lose control on a few occasions.

Fig. 7(a) also shows that the success probability of the algorithms increases with the timescale. In the long timescale of 10,000 packets, all the algorithms are able to achieve the target delay ratio robustly.

Comparing Fig. 7(a) with Fig. 7(b), we observe that all the PDD algorithms lose a certain degree of robustness when the desired delay ratio δ1/δ2 is large. In the short timescale of 100 packets, LAD performs slightly better than WTP and AWTP, but outperforms PAD, HPD, and MDP significantly in terms of their medians. This also indicates that 100 packets is a good choice of T. The good performance of WTP and AWTP is mainly due to the high utilization rate (90%) that we assumed in this experiment. WTP and AWTP provide consistent levels of QoS, independent of the desired delay ratio, because they use extra control parameters to adjust the impact of the pre-defined delay ratio. However, they lack robustness because of the large statistical variations around their medians. PAD, HPD, and MDP operate in a similar way to LAD; they differ in how they estimate the delay of arrived packets. Fig. 7(b) shows that their performance gap in short time-scales gets larger as the delay ratio increases. As the timescale increases, all the PDD algorithms gain more control over the delay ratio. In the long timescale of 10,000 packets, LAD provides levels of QoS similar to those of HPD and MDP.

We conclude that in short time-scales, LAD consistently outperforms its competitors for large target delay ratios. Meanwhile, 100 packets is a good choice of time window size T. For small target ratios, most of the algorithms can provide an acceptable level of quality of service. Under heavy load conditions and in long time-scales, LAD performs similarly to HPD and MDP. WTP, AWTP, and PAD are not as robust as the others due to their large statistical variations.

5. Conclusions

We have proposed a new proportional delay differentiation algorithm, called LAD, to implement the PDD model. The algorithm is derived from a proof of Little’s Law. It monitors the arrival rate of the packets in each traffic class and their cumulative delays and achieves the desired class delay ratios in both short and long time-scales. Simulation results have shown that LAD is able to meet the PDD constraints, independent of the distributions of packet arrivals and packet sizes. In comparison with other PDD algorithms, LAD provides the same level of service quality in long time-scales and more accurate and robust control over the delay ratio in short time-scales.

Our future work will focus on the proportional loss differentiation model and its combination with the PDD model. The requirements for absolute QoS, such as the end-to-end delay, will also be investigated.

References

[1] M. Arlitt, C.L. Williamson, Internet web servers: workload characterization and performance implications, IEEE/ACM Transactions on Networking 5 (5) (1997) 631–645.

[2] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, W. Weiss, An Architecture for Differentiated Services, IETF, Request for Comments 2475, December 1998.

[3] G. Bolch, S. Greiner, H. de Meer, K.S. Trivedi, Queueing Networks and Markov Chains, John Wiley & Sons, 1999.

[4] R. Braden, D. Clark, S. Shenker, Integrated services in the internet architecture: an overview, Request for Comments 1633, June 1994.

[5] C. Dovrolis, P. Ramanathan, A case for relative differentiated services and the proportional differentiation model, IEEE Network 13 (5) (1999) 26–34.

[6] C. Dovrolis, D. Stiliadis, P. Ramanathan, Proportional differentiated services: delay differentiation and packet scheduling, in: Proceedings of SIGCOMM, 1999, pp. 109–120.

[7] C. Dovrolis, D. Stiliadis, P. Ramanathan, Proportional differentiated services: delay differentiation and packet scheduling, IEEE/ACM Transactions on Networking 10 (1) (2002) 12–26.

[8] Free Software Foundation, GSL – GNU Scientific Library. <http://www.gnu.org/software/gsl/>.

[9] J. Heinanen, F. Baker, W. Weiss, J. Wroclawski, Assured forwarding PHB group, Network Working Group, Request for Comments 2597, June 1999.

[10] V. Jacobson, K. Nichols, K. Poduri, An expedited forwarding PHB, Network Working Group, Request for Comments 2598 (1999).

[11] W.E. Leland, M.S. Taqqu, W. Willinger, D.V. Wilson, On the self-similar nature of Ethernet traffic (extended version), IEEE/ACM Transactions on Networking 2 (1) (1994) 1–15.

[12] M.K. Leung, J.C. Lui, D.K. Yau, Adaptive proportional delay differentiated services: characterization and performance evaluation, IEEE/ACM Transactions on Networking 9 (6) (2001) 801–817.

[13] C.-C. Li, S.-L. Tsao, M.C. Chen, Y. Sun, Y.-M. Huang, Proportional delay differentiation service based on weighted fair queueing, in: Proceedings of IEEE International Conference on Computer Communications and Network (ICCCN), October 2000, pp. 418–423.

[14] J. Liebeherr, N. Christin, JoBS: Joint buffer management and scheduling for differentiated services, in: Proceedings of IWQoS 2001, Karlsruhe, Germany, June 2001, pp. 404–418.

[15] J.D. Little, A proof of the queueing formula L = λW, Operations Research 9 (1961) 383–387.

[16] V. Misra, W.-B. Gong, D. Towsley, Fluid-based analysis of a network of AQM routers supporting TCP flows with an application to RED, in: Proceedings of SIGCOMM, 2000, pp. 151–160.

[17] Y. Moret, S. Fdida, A proportional queue control mechanism to provide differentiated services, in: Proceedings of International Symposium on Computer and Information System (ISCIS), Belek, Turkey, October 1998.

[18] T. Nandagopal, N. Venkitaraman, R. Sivakumar, V. Bharghavan, Delay differentiation and adaptation in core stateless networks, in: Proceedings of IEEE Infocom, Tel-Aviv, Israel, April 2000, pp. 421–430.

[19] R.D. Nelson, Heavy traffic response times for a priority queue with linear priorities, Operations Research 38 (3) (1990) 560–563.

[20] A. Netterman, I. Adiri, A dynamic priority queue with general concave priority functions, Operations Research 27 (6) (1979) 1088–1100.

[21] K. Nichols, S. Blake, F. Baker, D.L. Black, Definition of the differentiated services field (DS Field) in the IPv4 and IPv6 headers, Network Working Group, Request for Comments 2474, December 1998.

[22] S.J. Stidham, A last word on L = λW, Operations Research 22 (2) (1974) 417–421.

Jianbin Wei received the BS degree in computer science from Huazhong University of Science and Technology, China, in 1997, and the MS and PhD degrees in computer engineering from Wayne State University in 2003 and 2006, respectively. His research interests are in computer communications and networks, and in distributed and Internet computing systems.

Cheng-Zhong Xu received the BS and MS degrees in computer science from Nanjing University in 1986 and 1989, respectively, and the PhD in computer science from the University of Hong Kong in 1993. He is an Associate Professor in the Department of Electrical and Computer Engineering of Wayne State University. His research interests are in distributed and parallel systems, particularly in resource management for high performance cluster and grid computing and scalable and secure Internet services. He has published more than 100 peer-reviewed articles in journals and conference proceedings in these areas. He is the author of the book Scalable and Secure Internet Services and Architecture (CRC Press, 2005) and a co-author of the book Load Balancing in Parallel Computers: Theory and Practice (Kluwer Academic, 1997). He serves on the editorial boards of J. of Parallel and Distributed Computing, J. of Parallel, Emergent, and Distributed Systems, J. of High Performance Computing and Networking, and J. of Computers and Applications. He was the founding program co-chair of the International Workshop on Security in Systems and Networks (SSN), the general co-chair of the IFIP 2006 International Conference on Embedded and Ubiquitous Computing (EUC’06), and a member of the program committees of numerous international conferences. His research was supported in part by the US National Science Foundation, NASA, and Cray Research. He is a recipient of the Faculty Research Award of Wayne State University in 2000, the President’s Award for Excellence in Teaching in 2002, and the Career Development Chair Award in 2003. He is a senior member of the IEEE.

Xiaobo Zhou received the BS, MS, and PhD degrees in computer science from Nanjing University, in 1994, 1997, and 2000, respectively. He is an assistant professor in the Department of Computer Science at the University of Colorado at Colorado Springs. His research interests are in scalable distributed systems and Internet services, network communications and security. He has published about 40 articles in journals and peer-reviewed conference proceedings in these areas. He was a founding program co-chair of the IEEE International Workshop on Security in Systems and Networks (SSN), the workshops chair of the IFIP 2006 International Conference on Embedded and Ubiquitous Computing (EUC’06), and a program committee member of numerous IEEE conferences. He was a guest coeditor of the Journal of Parallel and Distributed Computing and the Journal of Network and Computer Applications. He serves on the editorial board of the Journal of Autonomic and Trusted Computing. Dr. Zhou’s research was supported in part by the US Air Force Research Laboratory. He was a visiting scientist in 1999 and a postdoctoral research associate in 2000 at the Paderborn Center for Parallel Computing, University of Paderborn, Germany. From January 2001 to August 2003, he was a visiting assistant professor in the Department of Computer Science at Wayne State University, Detroit. He was the recipient of the Outstanding Researcher of the Year of the EAS College and the CRCW award of the University of Colorado at Colorado Springs in 2005. He is a member of the IEEE Computer Society.

Qing Li received the BS and MS degrees in civil engineering from Huazhong University of Science and Technology, China, in 1997 and 2000, respectively. She received the MS degree in computer engineering from Wayne State University in 2003.