
arXiv:1103.2303v2 [cs.NI] 24 Mar 2011


FavourQueue: a Stateless Active Queue Management to Speed Up Short TCP Flows (and others too!)

Pascal Anelli (1), Emmanuel Lochin (2,3) and Remi Diana (2,3)

(1) Universite de la Réunion - EA2525 LIM, Sainte Clotilde, France
(2) Universite de Toulouse; ISAE; Toulouse, France
(3) TeSA/CNES/Thales; Toulouse, France

Index Terms: Active Queue Management; TCP; Performance Evaluation; Simulation; Flow interaction.

Abstract: This paper presents and analyses the implementation of a novel active queue management (AQM) named FavourQueue that aims to improve the transfer delay of short-lived TCP flows over a best-effort network. The idea is to dequeue first those packets that do not belong to a flow previously enqueued. The rationale is to mitigate the delay induced by long-lived TCP flows on the pace of short TCP data requests, and to prevent dropped packets at the beginning of a connection and during the recovery period. Although the main target of this AQM is to accelerate short TCP traffic, we show that FavourQueue does not only improve the performance of short TCP traffic but also improves the performance of all TCP traffic in terms of drop ratio and latency, whatever the flow size. In particular, we demonstrate that FavourQueue reduces the loss of retransmitted packets, decreases the RTO recovery ratio and improves the latency by up to 30% compared to DropTail.

I. INTRODUCTION

The Internet is still dominated by web traffic running on top of short-lived TCP connections [1]. Indeed, as shown in [2], 95% of the client TCP traffic and 70% of the server TCP traffic have a size lower than ten packets. This follows a common web design practice, which is to keep viewed pages lightweight to improve interactive browsing in terms of response time [3]. In other words, the access to a webpage often triggers several short web flows that keep the downloaded page small and speed up the display of the text content compared to the heavier components that might compose it (1) (e.g. pictures, multimedia content, design components). As a matter of fact, and following the growth of web content, we can still expect a large amount of short web traffic in the near future.

TCP performance suffers significantly in the presence of bursty, non-adaptive cross-traffic or when the congestion window is small (i.e. in the slow-start phase or when it operates in the small window regime). Indeed, bursty losses, or losses during the small window regime, may cause Retransmission Timeouts (RTO) which trigger a slow-start phase. In the context of short TCP flows, TCP fast retransmit cannot be triggered if not enough packets are in transit. As a result, the

(1) See for instance: Best Practices for Speeding Up Your Web Site, from the Yahoo developer network.

loss recovery is mainly done thanks to the TCP RTO, and this strongly impacts the delay. Following this, in this study we seek to improve the performance of this pervasive short TCP traffic without impacting long-lived TCP flows. We aim to exploit router capabilities to enhance the performance

of short TCP flows over a best-effort network, by giving a higher priority to a TCP packet if no other packet belonging to the same flow is already enqueued inside a router queue. The rationale is that isolated losses (for instance losses that occur at the early stage of the connection) have a stronger impact on TCP flow performance than losses inside a large window. We therefore propose an AQM, called FavourQueue, which better protects packet retransmissions and short TCP traffic when the network is severely congested.

In order to give the reader a clear view of the problem we tackle with our proposal, we lean on paper [2]. Figure 1 shows that the flow duration (or latency (2)) of short TCP traffic is strongly impacted by an initial lost packet which

is recovered later by an RTO. Indeed, at the early stage of the connection, the number of packets exchanged is too small to allow an accurate RTO estimation. Thus, an RTO is triggered with the default timer value, which is set to two seconds [4]. In this figure, the authors also give the cumulative distribution function of the TCP flow length and the probability density function of the completion time, from an experimental measurement dataset obtained during one day on an ISP BRAS link which aggregates more than 30,000 users. We have reproduced a similar experiment with ns-2 (i.e. with a similar flow length CDF according to a Pareto distribution) and obtained a similar probability density function of the TCP flow durations, as shown by the DropTail curve in Figure 2. Both figures (1 and 2) clearly highlight a latency peak at t = 3 seconds which corresponds to this default RTO value [4]. In this experimental scenario, the RTO recovery ratio is equal to 56% (versus 70% in the experiments of [2]). As a matter of fact, these experiments show that the success of the TCP slow-start is a key performance indicator. The second curve in Figure 2 shows the result we obtain by using our proposal, called FavourQueue. Clearly, the peak previously emphasized has disappeared. This means the initial losses that strongly

(2) The latency refers to the delay elapsed between the first packet sent and the last packet received.


[Figure: left panel, probability density function of flow duration (s); right panel, cumulative distribution function of flow length L (pkt); curves: server, client.]
Fig. 1. TCP flow length distribution and latency (by courtesy of the authors of [2]).

[Figure: left panel, cumulative distribution function of flow length (pkt); right panel, probability density function of latency (s); curves: DropTail, FavorQueue.]
Fig. 2. TCP flow latency distribution from our simulation model.

impacted the TCP traffic performance have decreased.

An important contribution of this work is the demonstration

that our scheme, by favouring isolated TCP packets, decreases the latency by decreasing the loss ratio of short TCP flows without impacting long TCP traffic. However, as FavourQueue does not discriminate short from long TCP flows, every flow takes advantage of this mechanism when entering either the slow-start or a recovery phase. Our evaluations show that 58% of short TCP flows improve their latency and that 80% of long-lived TCP flows also take advantage of this AQM. For all sizes of flows, on average, the expected gain in transfer delay is about 30%. This gain results from the decrease of the drop ratio of non-opportunistic flows, which are those that least

occupy the queue. Furthermore, the more the queue is loaded, the more FavourQueue has an impact. Indeed, when there is no congestion, FavourQueue does not have any effect on the traffic. In other words, this proposal is activated only when the network is severely congested.

Finally, FavourQueue does not require any transport protocol modification. Although we talk about giving a priority to certain packets, there is no per-flow state needed inside the FavourQueue router. This mechanism must be seen as an extension of DropTail that greatly enhances TCP source performance by favouring (more than prioritizing) certain TCP packets. The next Section II describes the design of

the proposed scheme. Then, we present in Section III the experimental methodology used in this paper. Sections IV and V dissect and analyse the performance of FavourQueue. Following these experiments and statistical analysis, we propose a stochastic model of the mechanism in Section VI. We also present related work in Section VII, where we position FavourQueue against other proposals and in particular discuss how this AQM complements recent proposals that aim to increase the TCP initial slow-start window. Finally, we discuss the implementation and some security issues in Section VIII and conclude this work in Section IX.

II. FAVOURQUEUE DESCRIPTION

Short TCP flows usually carry short TCP requests such as HTTP requests or interactive SSH or Telnet commands. As a result, their delay performance is mainly driven by:

1) the end-to-end transfer delay. This delay can be reduced if the queueing delay of each router is low;

2) the potential losses at the beginning of the connection. The first packets lost at the beginning of a TCP connection (i.e. in the slow-start phase) are mainly recovered by the RTO mechanism. Furthermore, as the RTO is initially set to a high value, this greatly decreases the performance of short TCP flows.

The two main metrics on which we can act to minimize the end-to-end delay and to protect the first packets of a TCP connection from loss are respectively the queuing delay and the drop ratio. Consequently, the idea we develop with FavourQueue is to favour certain packets in order to accelerate the transfer, by giving them preferential access to transmission and by protecting them from drops.

This corresponds to implementing a preferential access to transmission when a packet is enqueued and must be favoured (temporal priority), and a drop protection when the queue is full (drop precedence), with a push-out scheme that dequeues a standard packet in order to enqueue a favoured packet.

When a packet is enqueued, a check is done on the whole queue to seek another packet from the same flow. If no other packet is found, it becomes a favoured packet. The rationale is to decrease the loss of retransmitted packets in order to decrease the RTO recovery ratio. The proposed algorithm (given in Algorithm 1) extends the one presented in [5] by adding a drop precedence for non-favoured packets in order to decrease the loss ratio of favoured packets. The selection of a favoured packet is done on a per-flow basis. As a result, the complexity is a function of the size of the queue, which corresponds to the maximum number of states that the router must handle. This number of states is scalable considering today's routers' capability to manage millions of flows simultaneously [6]. However, the selection decision is local and temporary, as the state only exists while at least one packet is enqueued. This explains why we prefer the term favouring packets rather than prioritizing packets. Furthermore, FavourQueue does not introduce packet re-ordering inside a flow, which would obviously badly impact TCP performance [7]. Finally, in the specific case where all the traffic becomes


Algorithm 1 FavourQueue algorithm
 1: function enqueue(p)
 2:   # A new packet p of flow F is received
 3:   if no other packet of F is present in the queue then
 4:     # p is a favoured packet
 5:     if the queue is full then
 6:       if only favoured packets are in the queue then
 7:         p is dropped
 8:         return
 9:       else
10:         # Push out
11:         the last standard packet is dropped
12:       end if
13:     end if
14:     p is inserted at position pos
15:     pos <- pos + 1
16:   else
17:     # p is a standard packet
18:     if the queue is not full then
19:       p is put at the end of the queue
20:     else
21:       p is dropped
22:     end if
23:   end if
favoured, the behaviour of FavourQueue will be identical to that of DropTail.
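As a sanity check, the enqueue rule of Algorithm 1 can be sketched as a small queue model. The class, method and field names below are our own (hypothetical), the model tracks only flow identifiers rather than real packets, and favoured packets are simply kept ahead of standard ones:

```python
class FavourQueue:
    """Sketch of Algorithm 1: favoured packets sit at the front of the
    queue (indices < pos), standard packets behind them."""

    def __init__(self, capacity):
        self.capacity = capacity  # queue size in packets
        self.queue = []           # list of (flow_id, favoured) pairs
        self.pos = 0              # insertion index for favoured packets

    def enqueue(self, flow_id):
        """Return True if the packet is enqueued, False if dropped."""
        favoured = all(f != flow_id for f, _ in self.queue)
        if favoured:
            if len(self.queue) >= self.capacity:
                # Push-out: drop the last standard packet, if any.
                for i in range(len(self.queue) - 1, -1, -1):
                    if not self.queue[i][1]:
                        # Standard packets always sit behind the favoured
                        # ones, so pos is unchanged by this deletion.
                        del self.queue[i]
                        break
                else:
                    return False  # only favoured packets: drop the arrival
            self.queue.insert(self.pos, (flow_id, True))
            self.pos += 1
            return True
        # Standard packet: plain DropTail behaviour.
        if len(self.queue) < self.capacity:
            self.queue.append((flow_id, False))
            return True
        return False

    def dequeue(self):
        """Serve the head of the queue; favoured packets leave first."""
        if not self.queue:
            return None
        flow_id, favoured = self.queue.pop(0)
        if favoured:
            self.pos -= 1
        return flow_id
```

With a 3-packet queue, a second packet of an already-present flow is enqueued as standard, a new flow pushes it out once the queue is full, and a further new flow is dropped because only favoured packets remain.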

III. EXPERIMENTAL METHODOLOGY

We use ns-2 to evaluate the performance of FavourQueue. Our simulation model allows us to apply different levels of load to efficiently compare FavourQueue with DropTail. The evaluations are done over a simple dumbbell topology. The network traffic is modeled in terms of flows, where each flow corresponds to a TCP file transfer. We consider an isolated bottleneck link of capacity C in bits per second. The traffic demand, expressed as a bit rate, is the product of the flow arrival rate λ and the average flow size E[σ]. The load ρ offered to the link is then defined by the following ratio:

ρ = λ E[σ] / C.   (1)

The load is changed by varying the flow arrival rate [8]. Thus, the congestion level increases as a function of the load. As all flows are independent, the flow arrivals are modeled by a

Poisson process. A reasonable fit to the heavy-tailed distribution of the flow sizes observed in practice is provided by the Pareto distribution. The shape parameter is set to 1.3 and the mean size to 30 packets. The left side of Figure 2 gives the flow size distribution used in the simulation model.

At the TCP flow level, the ns-2 TCP connection establishment phase is enabled and the initial congestion window size is set to two packets. As a result, the TCP SYN packet is taken into account in all datasets. The load introduced in the network consists of several flows with different RTTs, according to the recommendations given in the common TCP evaluation suite paper [8]. The load ranges from 0.05 to 0.95 with a

step of 0.1. The simulation is bounded to 500 seconds for each given load. To remove both TCP feedback synchronization and phase effects, a traffic load of 10% is generated in the opposite direction. The flows in the transient phase are removed from the analysis. More precisely, only flows starting after the first forty seconds are used in the analysis. The bottleneck link capacity is set to 10 Mbps. All other links have a capacity of 100 Mbps. According to the small-buffers rule [9], buffers can be reduced by a factor of ten. The rule of thumb says the buffer size B can be set to T × C, with T the round-trip propagation delay and C the link capacity. We choose T = 100 ms as it corresponds to the average RTT of the flows in the experiment. The buffer size at the two routers is set to a bandwidth-delay product with a delay of 10 ms. The packet length is fixed to 1500 bytes and the buffer has a length of 8 packets.
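The buffer dimensioning above can be reproduced as a one-line computation; the values come from the paragraph, and the variable names are ours:

```python
C = 10e6             # bottleneck capacity (bit/s)
T = 0.010            # delay used for the bandwidth-delay product (s)
PKT_BITS = 1500 * 8  # fixed packet length (bits)

# Bandwidth-delay product expressed in whole packets: 10 ms * 10 Mbps
# is 100,000 bits, i.e. 8 full 1500-byte packets.
buffer_pkts = int(T * C // PKT_BITS)
```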

To improve the confidence of these statistical results, each experiment for a given load is done ten times using different sequences of pseudo-random numbers (in the following we talk about a ten replications experiment). Some figures also average the ten replications, meaning that we aggregate and average all flows from all ten replications and for all load conditions. In this case, we talk about ten averaged experiment results, which represent a dataset of nearly 17 million packets. The rationale is to consider these data as a real measurement capture where the load varies as a function of time (as in [2]), since each load condition has the same duration. In other words, this represents a global network behaviour.

The purpose of these experiments is to weigh up the benefits brought by our scheme in the context of TCP best-effort flows. To do this, we first run a given scenario with DropTail, then we compare with the results obtained with FavourQueue. We enable FavourQueue only on the uplink (data path), while DropTail always remains on the downlink (ACK path). We only compare identical terminating flows for both experiments (i.e. DropTail and FavourQueue), in order to assess the performance obtained in terms of service for the same set of flows.

We assume our model follows Internet short TCP flow characteristics, as the general latency distribution form of our simulated results in Figure 2 matches the measurements of Figure 1. This comparison provides a correct validation of the model in terms of latency. As explained above, Figure 2 corresponds to and illustrates a ten averaged experiment.

IV. PERFORMANCE EVALUATION OF TCP FLOWS WITH FAVOURQUEUE

We present in this section the global performance obtained by FavourQueue, then we analyse its performance more deeply and investigate the case of persistent flows. We compare the same set of flows to assess the performance obtained with DropTail and FavourQueue.

    A. Overall performance

We are interested in assessing the performance of each TCP flow in terms of latency and goodput. We recall from Section


I that we defined the latency as the time to complete a data download (i.e. the transmission time), and the goodput is the average pace of the download. In order to assess the overall performance of FavourQueue compared to DropTail, Figure 3 gives the mean and standard deviation of the latency as a function of the traffic load. We study FavourQueue both with and without the push-out mechanism, in order to distinguish the supplementary gain provided by the drop precedence.

The results are unequivocal. The FavourQueue version without push-out, as presented in [5], provides a gain over DropTail when the load increases (i.e. when the queue has a significant probability of having a non-zero length), while the drop precedence (with push-out) clearly brings a significant additional gain in terms of latency. Basically, Figure 4 shows that both queues (with and without push-out) globally drop the same amount of packets. However, the push-out version better protects short TCP flows (and more generally all flows entering a slow-start phase), as when the queue is congested it always enqueues a packet from a new flow. As a result, the loss of initial packets further decreases. Indeed, as already emphasized in Figure 2 in the introduction, losses no longer occur at the beginning of a connection and, as a result, the flow is no longer impacted by the retransmission overhead resulting from an RTO. Thus, our favouring scheme prevents lost packets during the startup phase. As a matter of fact, this is explained by a different distribution of these losses.

[Figure: latency (s) versus load, for DropTail, FavorQueue without PO, and FavorQueue.]
Fig. 3. Overall latency according to traffic load.

[Figure: drop ratio versus load, for DropTail, FavorQueue without PO, and FavorQueue.]
Fig. 4. Overall drop ratio according to traffic load.

Following this, we have computed the resulting normalized goodput over all flow sizes and all experiments: it is 2.4% with DropTail and 3.5% with FavourQueue (i.e. around 1% of difference). This difference is not negligible, as it corresponds to an increase of 45%.

Figure 5 gives the average latency as a function of the flow length. The cumulative distribution function of the flow length is also represented. On average, we observe that FavourQueue obtains a lower latency than DropTail whatever the flow length. This difference is larger for the short TCP flows, which are also numerous (we recall that the distribution of the flow sizes follows a Pareto distribution and, as a result, the number of short TCP flows is higher). This demonstrates that FavourQueue particularly favours the slow-start of every flow and, as a matter of fact, short TCP flows. The cloud pattern obtained for flow sizes higher than one hundred is due to the decrease of the statistical sample (following the Pareto distribution used for the experiment), which results in a greater dispersion of the results obtained. As a consequence, we cannot drive a consistent latency analysis for sizes higher than one hundred.

To complete these results, Figure 6 gives the latency obtained when we increase the size of both queues. We observe that whatever the queue size, FavourQueue always obtains a lower latency. Beyond a given queue size (x = 60 in Figure 6), increasing the queue does not have an impact on the latency. This reinforces the consistency of the solution, as Internet routers prevent the use of large queue sizes.

[Figure: mean latency (s) and flow length CDF versus flow length (pkt); curves: Flow length CDF, DropTail, FavorQueue.]
Fig. 5. Flow length CDF and mean latency as a function of the flow length.

[Figure: latency (s) versus queue size (pkt), for DropTail and FavorQueue.]
Fig. 6. Overall latency according to queue size.


    B. Performance analysis

To refine our analysis of the latency, we propose to evaluate the difference of latencies per flow for both queues. We denote Δ_i = T_d,i − T_f,i, with T_d and T_f the latency observed respectively with DropTail and FavourQueue for a given flow i. Figure 7 gives the distribution of the latency difference. This figure illustrates that the latency decreases for more flows than it increases. Furthermore, for 16% of flows there is no impact on the latency, i.e. Δ = 0. In other words, 84% of flows observe a change of latency; 55% of flows observe a decrease (Δ > 0) and 10% of flows observe a significant change (Δ > 1 second). However, 30% of the flows observe an increase of their latency (Δ < 0). In summary, FavourQueue has a positive impact on certain flows that are penalised with DropTail.
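The per-flow bookkeeping behind these percentages is straightforward; the helper below (our own naming, not from the paper) classifies a set of Δ_i values:

```python
def latency_change_shares(deltas):
    """Return the fractions of flows whose latency decreases (delta > 0),
    stays identical (delta == 0) or increases (delta < 0) under
    FavourQueue, where delta_i = Td_i - Tf_i."""
    n = len(deltas)
    decrease = sum(d > 0 for d in deltas) / n
    identical = sum(d == 0 for d in deltas) / n
    increase = sum(d < 0 for d in deltas) / n
    return decrease, identical, increase

# Toy example with four flows: two improved, one unchanged, one degraded.
shares = latency_change_shares([1.2, 0.0, 0.5, -0.3])
```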

[Figure: cumulative probability versus Δ (s).]
Fig. 7. Cumulative distribution function of the latency difference Δ.

In order to assess which flows gain in terms of latency, Figure 8 gives the probability of latency improvement. For the whole set of short TCP flows (i.e. with a size lower than 10 packets), the probability of improving the latency reaches 58%, while the probability of degrading it is 25%. For long TCP flows (i.e. above 100 packets), the probabilities of improving and degrading the latency are respectively 80% and 20%. The flows with a size around 30 packets are the ones with the highest probability of being penalised. For long TCP flows, the large variation of the probability indicates an uncertainty which mainly depends on the experimental conditions of the flows. We have to remark that long TCP flows are less present in this experimental model (approximately 2% of the flows have a size higher than or equal to 100 packets). As this curve corresponds to a ten averaged experiment, each long TCP flow has experienced various load conditions, and this explains these large oscillations.

Medium-sized flows are characterized by a predominance of the slow-start phase. During this phase, each flow opportunistically occupies the queue and, as a result, fewer packets are favoured due to the growth of the TCP window. The increase of the latency observed for medium-sized flows (ranging from 10 to 100 packets) is investigated later in subsection V-A. We will also see in subsection IV-C that FavourQueue acts like a shaper for these particular flows.

[Figure: probability versus flow length (pkt); curves: Decrease, Increase, Identical.]
Fig. 8. Probability to change the latency.

To estimate the latency variation, we define G(x), the latency gain for the flows of length x, as follows:

G(x) = ( Σ_{i=1..N} Δ_i ) / ( Σ_{i=1..N} T_d,i ).   (2)

with N the number of flows of length x. A positive gain indicates a decrease of the latency with FavourQueue. Figure 9 provides the positive, negative and total gains as a function of the flow size. We observe an important total gain for the short TCP flows. The flows with an average size obtain the highest negative gain, and this gain also decreases when the size of the flows increases. Although some short flows observe an increase of their latency, in a general manner the positive gain is always higher. This preliminary analysis illustrates that FavourQueue improves the best-effort service in terms of latency by 30% on average. The flows that take the biggest advantage of this scheme are the short flows, with a gain up to 55%.
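Equation (2) amounts to the relative latency saving aggregated over the N flows of a given length; a direct transcription (the function name is ours) is:

```python
def latency_gain(deltas, droptail_latencies):
    """G(x) = sum(delta_i) / sum(Td_i) over the N flows of length x.
    A positive value means FavourQueue lowered the latency."""
    return sum(deltas) / sum(droptail_latencies)

# Toy example: three flows of the same length whose DropTail latencies
# sum to 2.0 s and whose differences delta_i = Td_i - Tf_i sum to 0.3 s,
# giving a gain of 15%.
g = latency_gain([0.3, 0.1, -0.1], [1.0, 0.5, 0.5])
```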

[Figure: gain |G(x)| versus flow length (pkt); curves: Positive, Total, Negative.]
Fig. 9. Average latency gain per flow length.

Finally, to conclude this section, we plot in Figure 10 the number of flows in the system under both AQMs as a function of time, to assess the change in the stability of the network. We observe that FavourQueue considerably reduces


both the average number of flows in the network and their variability.

[Figure: number of flows versus time (s); two panels: DropTail, FavorQueue.]
Fig. 10. Number of simultaneous flows in the network.

C. The case of persistent flows

[Figure: number of flows versus time (s); two panels: DropTail, FavorQueue.]
Fig. 11. Number of short flows in the network when persistent flows are active.

Following [10], we evaluate how the proposed scheme affects persistent flows mixed with randomly arriving short TCP flows. We now change the network conditions to 20% of short TCP flows with exponentially distributed flow sizes with a mean of 6 packets. Forty seconds later, 50 persistent flows are started. Figure 11 gives the number of simultaneous short flows in the network. When the 50 persistent flows start, the number of short flows increases and oscillates around 100 with DropTail. With FavourQueue, the number only increases to 30 short flows. The short flows still take advantage of the favouring scheme, and Figure 12 confirms this point. However, we observe in Figure 13 that the persistent flows are not penalized. The mean throughput is nearly the same (1.81% for DropTail versus 1.86% for FavourQueue) and the variance is smaller with FavourQueue. Basically, FavourQueue acts as a shaper by slowing down opportunistic flows while decreasing the drop ratio of non-opportunistic flows (those which occupy the queue the least).

[Figure: mean latency (s) versus flow length (pkt), for DropTail and FavorQueue.]
Fig. 12. Mean latency as a function of flow size for short flows in presence of persistent flows.

[Figure: throughput (%) versus flow length (pkt), for DropTail and FavorQueue.]
Fig. 13. Mean throughput as a function of flow length for 50 persistent flows.

V. UNDERSTANDING FAVOURQUEUE

The previous section has shown the benefits obtained with FavourQueue in terms of service. In this section, we analyse the reasons for the improvements brought by FavourQueue by looking at the AQM performance. We study the drop ratio and the queueing delay obtained by both queues in order to assess the reasons for the gain obtained by FavourQueue. We recall that for all experiments, FavourQueue is only set on the upstream path; the reverse path uses a DropTail queue. In a first part, we look at the impact of the AQM on the network, then on the end host.

A. Impact on the network

Figure 14 shows the evolution of the average queueing delay depending on the size of the flow. This figure corresponds to the ten averaged replications experiment (as defined in Section III). Basically, the results obtained by FavourQueue and DropTail are similar. Indeed, the average queueing delay is 2.8 ms for FavourQueue versus 2.9 ms for DropTail, and both curves behave similarly. We can notice that the queueing delay for the medium-sized flows slightly increases with FavourQueue. These flows are characterized by a predominance of the slow-start phase, as most of the packets that belong to these flows are emitted during the slow-start. Since during this phase


[Figure: RTO ratio versus load, for DropTail and FavorQueue.]
Fig. 16. RTO ratio as a function of the network load.

[Figure: RTO recovery ratio versus flow length (pkt), for DropTail and FavorQueue.]
Fig. 17. RTO recovery ratio according to flow length.

a noticeable decrease of the RTO ratio due to the decrease of the packet loss rate on the first packets of the flow. Thus, the number of duplicate acknowledgements is higher, allowing a Fast Retransmit recovery phase to be triggered. The trend shows a global decrease of the RTO ratio when the flow length increases. Overall, the RTO recovery ratio reaches 56% for DropTail and 38% for FavourQueue. The gain obtained decreases as the flow size increases. This means that FavourQueue mainly helps the connection establishment phase.

VI. STOCHASTIC MODEL OF FAVOURQUEUE

We analyse in this part the impact of the temporal and drop priorities previously defined in Section II and propose a stochastic model of the mechanism.

    A. Preliminary statistical analysis

We first estimate the probability to favour a flow as a function of its length by a statistical analysis. We define P(Favor | S = s), the probability to favour a flow of size s, as follows:

P(Favor | S = s) = ( Σ_{i=1..N} fa_i ) / ( Σ_{i=1..N} (s + R_i) ).   (5)

with fa_i the number of packets which have been favoured and R_i the number of retransmitted packets of a given flow i.
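Equation (5) is an empirical ratio over the N flows of size s: favoured packets divided by packets actually sent (the flow size plus its retransmissions). A direct transcription (function and argument names are ours) is:

```python
def p_favor(favoured_counts, size, retransmissions):
    """Empirical P(Favor | S = s): total favoured packets over total
    packets sent, the latter being the flow size s plus the
    retransmissions R_i, summed over the N flows of that size."""
    return sum(favoured_counts) / sum(size + r for r in retransmissions)

# Toy example: two flows of size 10, one retransmission each,
# with 7 and 8 favoured packets respectively -> 15/22.
p = p_favor([7, 8], 10, [1, 1])
```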

The number of favoured packets corresponds to the number of packets selected to be favoured at the router queue. Figure 18 gives the results obtained and shows that:

- the flows with a size of two packets are always favoured;
- the middle-sized flows that mainly remain in a slow-start phase are less favoured compared to short flows. The ratio reaches 50%, meaning that one packet out of two is favoured;
- long TCP flows get a favouring ratio around 70%.

[Figure: P(Favor) versus flow length (pkt).]
Fig. 18. Probability of packet favouring according to flow length.

[Figure: push-out proportion of drops (%) versus flow length (pkt).]
Fig. 19. Push-out proportion of drops as a function of flow length.

We also investigate the ratio of packets dropped resulting from the push-out algorithm as a function of the flow length, in order to assess whether some flows are more penalised by push-out. As shown in Figure 19, the mean is about 30% for all flows, meaning that the push-out algorithm does not impact short TCP flows more than long ones. We now propose to build a stochastic model of Figure 18 in the following.

    B. Stochastic model

We denote by S the random variable of the size of the flow and by Z the Bernoulli random variable which is equal to 0 if no favoured packets are present in the queue and 1 otherwise. We then distinguish three different phases:

• phase #1: flows with a size lower than s1. In this phase, the flows are in slow-start mode. This size is a parameter of the model which depends on the load;


• phase #2: flows with a size higher than s1 and lower than s2. In this phase, flows progressively leave the slow-start mode (corresponding to the bowl between [10:100] in Figure 18). This is the most complex phase to model, as all flows are either in the congestion avoidance phase or at the end of their slow start. s2 is also a parameter of the model which depends on the load;

• phase #3: flows with a size higher than s2. All flows are in congestion avoidance phase. Note that the statistical sample which represents this cloud is not large enough to correctly model this part (as already pointed out in Section IV-A). However, another important result given by Figure 18 is that 70% of packets of flows in congestion avoidance mode are favoured. We will use this information to infer the model. This also confirms that the spacing between packets in the congestion avoidance phase increases the probability of an arriving packet to be favoured.

First phase: We consider a bursty arrival and assume that all packets belonging to the previous RTT have left the queue.

Then, the burst size (BS) can take the following values: BS = 1, 2, 4, 8, 16, 32, ... If Z = 0, we assume that a maximum of 3 packets can be favoured in a row³; the packet numbers that are favoured are then 1, 2, 3, 4, 5, 6, 8, 9, 10, 16, 17, 18, ... and 1, 2, 4, 8, 16, 32, ... if Z = 1. Thus, if Z = 0, the probability to favour a packet of a flow of size s is:

P(Favor|(Z = 0, S = s)) =
    1,           s ≤ 6
    (s − 1)/s,   7 ≤ s ≤ 10
    9/s,         11 ≤ s ≤ 15
    (s − 6)/s,   16 ≤ s ≤ 18
    12/s,        19 ≤ s ≤ 31
    (...)                          (6)

and with Z = 1:

P(Favor|(Z = 1, S = s)) =
    1,      s = 1
    2/s,    2 ≤ s ≤ 3
    3/s,    4 ≤ s ≤ 7
    4/s,    8 ≤ s ≤ 15
    5/s,    16 ≤ s ≤ 31
    (...)                          (7)

The probability to favour a packet of a flow of size s is thus:

P(Favor|S = s) = P(Z = 0)·P(Favor|(Z = 0, S = s)) + P(Z = 1)·P(Favor|(Z = 1, S = s))   (8)

Once again, P(Z = 0) and P(Z = 1) depend on the load of the experiment and must be given.
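A minimal sketch of this first-phase model, assuming the doubling burst pattern and the three-packet limit described above (the helper names are ours, not the paper's):

```python
def favoured_count(s, z):
    """Favoured packets among the first s packets of a slow-start flow.

    Bursts double each RTT (sizes 1, 2, 4, 8, ...). If z == 0, up to
    three packets of each burst are favoured; if z == 1, only the
    first packet of each burst is.
    """
    count, start, size = 0, 1, 1
    while start <= s:
        in_burst = min(start + size - 1, s) - start + 1  # packets of this burst within s
        count += min(in_burst, 3 if z == 0 else 1)
        start, size = start + size, size * 2
    return count

def p_favor(s, p_z1):
    """Mixture (8): P(Favor | S = s) for a slow-start flow of size s,
    where p_z1 = P(Z = 1) must be given (it depends on the load)."""
    return ((1 - p_z1) * favoured_count(s, 0) / s
            + p_z1 * favoured_count(s, 1) / s)
```

The piecewise cases of (6) and (7) can be checked against this helper: favoured_count(12, 0) returns 9 (the 9/s case for 11 ≤ s ≤ 15) and favoured_count(10, 1) returns 4 (the 4/s case for 8 ≤ s ≤ 15).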

Second phase: In this phase, each flow progressively leaves the slow-start phase. First, when a flow finishes its slow-start phase, each following packet has a probability to be favoured of 70% (as shown in Figure 18). So, we now need to

³The rationale is the following: if Z = 0, a single packet (such as the SYN packet) is favoured and, one RTT later, the burst of two packets (or larger) will be favoured if we consider that the first packet of this burst is directly served.

compute an average value of the probability to favour a packet for a given flow. We also have to take into account that, for a given flow size s, only a proportion of these flows have effectively left the slow-start phase. The other ones remain in slow start, and the analysis of their probability to favour a packet follows the first phase. To correctly describe this phase, we need to assess which part of the flows of size s, s1 ≤ s ≤ s2, has left the slow-start phase at packet s1, s1 + 1, ..., s. As a first approximation, we use a uniform distribution between s1 and s2. This means that, for flows of size s, the proportion of flows which have left the slow-start phase at s1, s1 + 1, ..., s − 1 is 1/(s2 − s1), and the proportion of flows of size s which have not yet left the slow-start phase is thus (s2 − s)/(s2 − s1).

If we denote p_k the proportion of flows of size s ≥ s1 that have left the slow-start phase at k, we have:

P(Favor|(S = s, Z = 0)) = Σ_{i=0}^{s−s1−1} p_k · P(Favor | k = s1 + i, Z = 0, S = s)

    and

P(Favor|(S = s, Z = 1)) = Σ_{i=0}^{s−s1−1} p_k · P(Favor | k = s1 + i, Z = 1, S = s)

and as in (8) we obtain:

P(Favor|S = s) = P(Z = 0) · Σ_{i=0}^{s−s1−1} p_k · P(Favor | k = s1 + i, Z = 0, S = s)
               + P(Z = 1) · Σ_{i=0}^{s−s1−1} p_k · P(Favor | k = s1 + i, Z = 1, S = s)

Third phase: The model of this phase is quite simple. Each packet of a flow which has left the slow-start phase has a probability to be favoured of 70%. We compute the probability for a packet to be favoured by taking into account the time at which a flow has left the slow-start phase and the proportion of such flows, as in the second phase.

Model fitting: To verify our model, among the ten loads that are averaged in Figure 18, we select two loads: ρ = 0.25 and ρ = 0.85. For the first one we have estimated P(Z = 1) = 0.25, and P(Z = 1) = 0.7 for the second. Figures 20 and 21 show that our model correctly fits both experiments. This model also allows us to understand the peaks in Figure 20 when the flow size is lower than one hundred packets. These peaks are explained by the modelling of the first phase. Indeed, the traffic during the slow start is bursty; each burst then has a number of favoured packets that depends on Z (i.e. up to three packets are favoured when Z = 0 and only one when Z = 1, as given by (6) and (7)).

VII. RELATED WORK

Several improvements have been proposed in the literature and at the IETF to attempt to solve the problem of short TCP flow performance.


Fig. 20. Model fitting for ρ = 0.25 with P(Z = 1) = 0.25.

Fig. 21. Model fitting for ρ = 0.85 with P(Z = 1) = 0.7.

Existing solutions can be classified into three different action types: (1) enabling a scheduling algorithm at the router queue level; (2) giving a priority to certain TCP packets; or (3) acting at the TCP level in order to decrease the number of RTOs or the loss probability. Concerning the first two items, the solution involves the core network, while the third one involves modifications at the end host. In this related work, we first situate FaQ among several core network solutions and then explain how FaQ might complement end-host solutions.

A. Enhancing short TCP flow performance inside the core network

1) The case of short and long TCP flow differentiation:

Several studies [13][10][14] have proposed to serve short TCP traffic first to improve the overall system performance. These studies follow a queueing theory result which states that the overall mean latency is reduced when the shortest job is served first [15]. One of the precursors in the area is [14], where the authors proposed to adapt the Least Attained Service (LAS) policy [15], a scheduling mechanism that favours short jobs without prior knowledge of job sizes, to packet networks. As for FavourQueue, LAS is not only a scheduling discipline but a buffer management mechanism. This mechanism follows the FavourQueue principle, since the priority given to the packet is set without knowledge of the size of the flow and the classification is closely related to the buffer management scheme. However, the next packet serviced under LAS is the one that belongs to the flow that has received the least amount of service. By this definition, LAS will serve packets from a newly arriving flow until that flow has received an amount of service equal to the least amount of service received by a flow in the system before its arrival. Compared to LAS, FavourQueue has no notion of amount of service, as we seek to favour short jobs by accelerating their connection establishment. Thus, there is no configuration and no complex settings.

In [13] and [10], the authors push the same idea further and attempt to differentiate short from long TCP flows with a scheduling algorithm. The differences between these solutions lie in the number of queues used, which are either flow stateless or stateful. These solutions use an AQM which enables a push-out algorithm to protect short TCP flow packets from loss. Short TCP flow identification is done inside the router by looking at the TCP sequence number [10]. However, in order to correctly distinguish short from long TCP flows, the authors modify the standard TCP sequence numbering, which involves a major modification of the TCP/IP stack. In [13], the authors propose another solution with a per-flow state and deficit round robin (DRR) scheduling to provide a fairness guarantee. The main drawback of [14][13] is the need for a per-flow state, while [10] requires TCP sender modifications.

2) The case of giving a priority to certain TCP packets: Giving a priority to certain TCP packets is not a novel idea. Several studies have tackled the benefit of this concept to improve the performance of TCP connections. This approach was popular during the QoS networks research epoch, as many queueing disciplines were enabled over IntServ and DiffServ testbeds, allowing researchers to investigate such priority effects. Basically, the priority can be set intra-flow or inter-flow. Mellia et al. [16] have proposed to use intra-flow priority in order to protect from loss some key identified packets of a TCP connection, so as to increase the TCP throughput of a flow over an AF DiffServ class. In this study, the authors observe that TCP performance suffers significantly in the presence of bursty, non-adaptive cross-traffic or when it operates in the small window regime, i.e., when the congestion window is small. The main argument is that bursty losses, or losses during the small window regime, may cause retransmission timeouts (RTOs) which result in TCP entering the slow-start phase. As a possible solution, the authors propose qualitative enhancements to protect against loss: the first several packets of the flow, in order to allow TCP to safely exit the initial small window regime; several packets after an RTO occurs, to make sure that the retransmitted packet is delivered with high probability and that the TCP sender exits the small window regime; and several packets after receiving three duplicate acknowledgement packets, in order to protect the retransmission. This protects against loss the packets that most strongly impact the average TCP throughput. In [3][17], the authors propose a solution based on inter-flow priority. Short TCP flows are marked IN; thus, packets from these flows carry a low drop priority. The differentiation in core routers is applied by an active queue management scheme. When a sender has sent a number of packets that exceeds the flow identification threshold, the packets are marked OUT and the drop probability increases. However, these approaches need the support of a DiffServ architecture to perform [18].

    B. Acting at the TCP level

The last solution is to act at the TCP level. The first possibility is to improve the behaviour of TCP when a packet is dropped during this start-up phase (i.e. initial window size, limited transmit). The second one is to prevent this drop by decreasing the probability of segment loss. For instance, in [19], the authors propose to apply an ECN mark to SYN/ACK segments in order to avoid dropping them. The main drawback of these solutions is that they require important TCP sender modifications that might involve a heavy standardisation process.

We wish to point out that one of the hot topics currently discussed within the Internet Congestion Control Research Group (ICCRG) deals with the TCP initial window size. In a recent survey, the authors of [20] highlight that the problem of short-lived flows is still not fully investigated and that the congestion control schemes developed so far do not really work if the connection lifetime is only one or two RTTs. Clearly, they argue for further investigation of the impact of the initial value of the congestion window on the performance of short-lived flows. Some recent studies have also demonstrated that a larger initial TCP window helps faster recovery of packet losses and, as a result, improves the latency in spite of increased packet losses [21], [22]. Several proposals have also been made to mitigate the impact of the slow start [23], [24], [25].

Although we do not act at the end-host side, we share the common goal of reducing latency during the slow-start phase of a short TCP connection. However, we do not target the same objective. Indeed, end-host solutions that propose to increase the number of packets of the initial window seek to mitigate the impact of the RTT loop, while we seek to favour short TCP traffic when the network is congested. At the early stage of the connection, the number of packets exchanged is low, and a short TCP request is constrained both by the RTT loop and by the small amount of data exchanged. Thus, some studies propose to increase this initial window value [21], [22]; to change the pace at which the slow start sends data packets by shrinking the timescale at which TCP operates [26]; or even to completely suppress the slow start [24]. Basically, all these proposals attempt to mitigate the impact of the slow-start loop, which might be counterproductive over large bandwidth-delay product networks. On the contrary, FavourQueue does not act on the amount of data exchanged but prevents losses at the beginning of the connection. As a result, we believe that FavourQueue must not be seen as a competitor of these end-host proposals but as a complementary mechanism. We propose to illustrate this complementarity by looking at the performance obtained with an initial congestion window set to ten packets. Figure 22 gives the complementary cumulative distribution function of the latency for DropTail and FavourQueue with an initial window set to two or ten packets. We have not changed the experimental conditions (i.e. the router buffer is still set to eight packets), and this experiment corresponds to an average over ten experiments (see Section III). As explained in [21], if we focus on the results obtained with DropTail for both initial window sizes, increasing the initial window improves the latency (at the price of an increased loss rate, as also noted in [21]). However, the use of FavourQueue reinforces the performance obtained and complements the action of such end-host modifications, making FavourQueue a generic solution to improve short TCP traffic whatever the slow-start variant used.

Fig. 22. Comparison of the benefit obtained in terms of latency with an initial TCP window size of ten packets.

VIII. DISCUSSION

A. Security consideration

In the related work presented in Section VII, we presented a solution similar to our proposal that gives priority to TCP packets with a SYN flag set. One of the main criticisms usually raised against such proposals deals with the TCP SYN flood attack, where TCP SYN packets may be used by malicious clients to amplify this kind of threat [27]. However, this is a false problem, as accelerating these packets does not introduce any novel security or stability side effects, as explained in [28]. Today, current kernels enable protection to mitigate such well-known denial-of-service attacks⁴, and current Intrusion Detection Systems (IDS) such as SNORT⁵, combined with firewall rules, allow network providers and companies to stop such attacks. Indeed, the core network should not be involved in such end-host security issues, which should remain under the responsibility of edge networks and end hosts. Concerning the reverse path and as raised in [28], provoking web servers or hosts to send SYN/ACK packets to third parties in order to perform a SYN/ACK flood attack would be largely ineffective. This is because the third parties would immediately drop such packets, since they would know that they did not generate the TCP SYN packets in the first place.

⁴See for instance http://www.symantec.com/connect/articles/hardening-tcpip-stack-syn-attacks
⁵http://www.snort.org/