[IEEE 2010 IEEE 18th International Workshop on Quality of Service (IWQoS) - Beijing, China (2010.06.16-2010.06.18)] 2010 IEEE 18th International Workshop on Quality of Service (IWQoS)

Macro-Scheduling of Base Stations for Video-on-Demand Flows in WiMAX Networks

Shubhadip Mitra, UmaMaheswari Devi, Parul Gupta, Malolan Chetlur, Shivkumar KalyanaramanIBM Research – India, Bangalore, KA, India

Abstract—We consider lifetime quality-of-experience (QoE)management for video-on-demand (VoD) users in WiMAXnetworks. For efficient resource utilization while enhancinguser experience, we propose run-time load balancing throughjoint scheduling among multiple base stations (BSs), referredto as macro scheduling. Macro scheduling employs a utility-maximization approach. To achieve long-term proportional fair-ness (PF) and manage lifetime QoE, user and flow utilities aremodeled as functions of their past service rates, in addition tocurrent channel conditions and bandwidth needs.

We show that scheduling flows across multiple BSs jointly toachieve PF is NP-hard in the strong sense. Since approximationalgorithms proposed in prior work are computationally expensivefor online use, we design efficient heuristics that perform as wellas the approximation algorithms. A simulation-based evaluationshows that our overall macro scheduling scheme can improve thenumber of satisfied users by up to 35% in comparison to otherapproaches, while only minimally sacrificing on throughput.

I. INTRODUCTION

The IEEE 802.16e WiMAX technology promises highbroadband data rates over large coverage areas in mobile,wireless contexts [1]. WiMAX is also envisioned for enterprisedeployments for supporting wide range of high data-rate appli-cations such as mobile workforce support, video-based train-ing, etc.. Despite the promise of high data rates, the inherentlyvarying and unreliable nature of the wireless channel posessignificant challenges to providing guaranteed services andensuring quality-of-experience (QoE) over WiMAX. In thispaper, we propose a solution towards managing the lifetimeQoE of video-on-demand (VoD) users.

The state-of-the-art resource allocation techniques inWiMAX consider single base stations (BSs) only. This ap-proach suffers from two limitations: (1) subscriber station(SS) assignments are based merely on local knowledge ofBSs and SSs, mainly, local signal strengths, and (2) SSreassignments are attempted only when there is a change in therelative qualities of the channels of an SS to its neighboringBSs or hand-offs become inevitable. Current approaches arehence unlikely to efficiently utilize the network’s aggregatedresources and can lead to local congestions.

We address the above two limitations for alleviating con-gestions, improving fairness, and enhancing QoE for VoDusers. This is achieved by taking a global view of the networkand jointly considering clusters of BSs for SS assignmentsand allocation of BS resources to SSs. To improve fairness,joint BS reassignment and allocation is performed periodicallyover scheduling epochs, regardless of changes in channelquality. We refer to this periodic, joint scheduling technique asmacro-scheduling of base stations (MSBS). VoD flows can useplayout buffers at the SSs to store future video frames and canbe assumed to have backlogged queues. Therefore, to ensurelong-term fairness and QoE and to improve resource utilization(even) in low-mobility scenarios, we consider assignmentsbased on past rates of flows, in addition to their (long-term)bandwidth needs and channel conditions.

Motivating example: Let us consider the simple network inFig. 1. If assignments here are based on signal strengths only,then both A and B would be assigned to the first BS and Cto the second. With an achievable data rate of 9 units/sec forA and B each, this assignment can meet the needs of neitherA nor B, and under-utilizes BS2. (Assume that at each BS,transmission time is equally allocated to SSs because theirlong term needs are equal.) On the other hand, if the twoBSs are jointly considered and either A or B is assigned toBS2 (instead of BS1), then the needs of at least one of themwould be met (in addition to that of C). Further improvementis possible by noting that serving one of the flows, say A,exclusively from BS1 would lead to A’s buffer building up.

Fig. 1: An example illustrating a

need for MSBS. Nos. on the links

indicate achievable data rates (based

on signal strengths). All 3 SSs need

a service of 6 units/sec each.

For instance, if A is served byBS1 in the first epoch (say,one sec) and B and C by BS2,then at the end of the epoch,A would have a surplus ofthree units, while B, a deficitof the same number of units.A can hence tolerate degradedservice for a limited time inthe future, and in the nextepoch, may be assigned to

BS2 while B is switched over to BS1 (if past service providedto A and B are considered). Thus, B’s deficit can be offset bythe increased service it receives from BS1, and at the end of thesecond epoch, both A and B would have been served 12 unitseach, their stipulated amounts over 2 epochs. By periodicallyswitching A and B between BSs 1 and 2, the needs of boththe SSs could be met over reasonable time scales.

Contributions

Our main contribution is to develop a joint BS schedul-ing framework for WiMAX networks, referred to as macro-scheduling of BSs (MSBS), as described above, to improveQoE for VoD users. MSBS uses a utility-maximization ap-proach to determine BS-SS assignments and allocations ineach epoch. To ensure proportional fairness (PF) [2], eachflow’s utility is a weighted logarithmic function of its alloca-tion. In addition, to aid with lifetime QoE management andimprove long-term fairness, each flow’s utility is modeled todepend on its past service rate apart from current channel con-ditions and allocation. Further, we consider tuning the MSBSscheme to trade throughput for higher fairness and QoE, andadjusting the epoch duration at run-time to lower overhead.An experimental evaluation shows that MSBS can improvethe number of satisfied VoD users by 35% in comparisonto other common approaches while only minimally loweringaggregate throughput.

In the context of joint BS scheduling, we make the followingsecondary contributions. Jointly scheduling BSs to achieve PFis known to be NP-hard if each SS or flow can be assigned

2

to at most one BS [3]. In this paper, we show that theproblem is in fact NP-hard in the strong sense, precludingthe possibility of even a pseudo-polynomial-time algorithm orfully polynomial time approximation scheme (FPTAS). Sinceapproximation algorithms proposed in the literature [3] are notcomputationally efficient for online use, we propose efficientheuristics for the joint scheduling problem and evaluate theirperformance and computational efficiency.

The rest of the paper is organized as follows. Sec. II surveysrelated work. Our system model and notation are introducedin Sec. III. Sec. IV proves joint BS scheduling is NP-hard inthe strong sense, and proposes heuristics. Sec. V presents anempirical evaluation, Sec. VI concludes.

II. RELATED WORK

Li et al.’s PF scheduling for multi-rate WLANs is closelyrelated work to this paper [3]. Li et al. consider maximizingthe sum of logarithms of rates allocated to users in a networkof access points (APs). They claim the problem can be shownto be NP-hard by adapting the reduction in [4], and proposetwo approximation algorithms with approximation ratios 5.828and 2 + ε. The work of Li et al. differs from ours in thefollowing: Li et al. are mainly concerned with maximizingaggregate throughput while serving users in a fair manner, andnot with QoE management for VoD flows. As such, they do notconsider a flow’s past service rate while modeling its utilityfunction, and hence, their approach may not optimally utilizethe resources of a network in the context of VoD. Also, theirapproximation algorithms are computationally expensive andnot suitable for online use. On the other hand, the heuristicswe propose perform as well as the approximation algorithmsin practice while requiring significantly less time.

Improving resource utilization in a WCDMA-like systemof multiple cells through a distributed and cross-layer coor-dination framework among BSs, SSs, and a central server, isconsidered in [5]. The approach therein does not take fairnessinto account and also requires specialized SSs.

In [6], dynamic load balancing and mitigating loss due tointerference by powering down BSs in a coordinated mannerin CDMA networks is considered.

The suitability of PF in HSDPA systems for video streamingwhen the traffic is a mix of elastic and non-elastic applica-tions is empirically studied in [7]. [8] proposes a schedulingalgorithm for streaming services in OFDMA systems. How-ever, allocations are considered for single cells only. Theperformance of weighted PF in terms of spectral efficiency,throughput, resource utilization, and fairness in WiMAX net-works is studied in [9]. Several other works consider thesubcarrier allocation problem in generic OFDMA systems formaximizing throughput and improving QoS subject to a powerbudget. Refer to [10], [11], and references therein. These donot consider WiMAX specific issues and cannot be easilyextended to be used by WiMAX MAC schedulers.

III. SYSTEM MODEL AND DEFINITIONS

Model and Notation: We consider joint allocation of re-sources to VoD flows of SSs in a cluster of BSs that can bethought of as constituting a WiMAX sub-network. B denotesthe number of BSs in the sub-network, and N , the numberof SSs. Each SS is assumed to be associated with a singleVoD flow, so, when unambiguous, SS and VoD flow are used

interchangeably. (Refer to [12] for extension that considersmultiple VoD flows associated with a single SS.)

The number of OFDM slots in each frame at each BS inthe downlink is denoted S. mij(t) denotes the average rate, interms of the average number of data bytes that can be packedin a slot, for SS i at BS j in a macro-scheduling epoch startingat time t. mij(t) depends on the quality of the channel fromSS i to BS j. xij(t) denotes the number of slots allotted toflow i at BS j over all frames in a macro-scheduling epochthat begins at time t. wi > 0 is the weight assigned to flowi based on the bandwidth it requires (which is also the flow’splayout rate).

R̄i(t) denotes the average rate at which flow i is serveduntil t. T > 1 is the length of the averaging window that isused to update the average rate. Average rate is updated everyepoch. di = (T − 1) · R̄i(t) > 0. In all of the above notation,subscript j is omitted when only a single BS is considered, andfor conciseness, time t is omitted. An ordered list of numbers〈y1, y2, . . .〉 is denoted vector �y.

We now define the utility function to associate with flows.Flow utility function: Our goal is to determine an assignmentof flows to BSs and an allocation of slots to flows at theirassigned BSs that is in keeping with the bandwidth needsof flows such that playout stalls are minimized. VoD flowsare less sensitive to latency than live-streaming or interactiveapplications, and can buffer future frames. A flow that hasadequate buffer built up over the past in its lifecycle canafford being served at a lower rate in the current epoch despiteits higher playout rate. Similarly, opportunistic scheduling,which favors flows with good channel conditions and canhence improve overall system throughput, can be used toprepare for future contingency provided doing so does notadversely impact other flows. We hence use a utility functionthat can guarantee proportional fairness (PF), which is knownto balance maximizing aggregate throughput and user fairness.

PF scheduling is generally used for scheduling best-effort(BE) traffic. Scope for buffering and higher tolerance tolatency place VoD in an intermediary class between BE andstringent real-time traffic. Also, latency-based scheduling, suchas, modified largest weighted delay first [13], proposed forreal-time traffic is not designed to encourage streaming futureframes, and as such, is less suited for VoD. We hence designa scheme based on PF utility function for VoD. This schemecan be tuned to improve QoE at the expense of throughput byrestricting allocations to flows with large buffers.

In single-carrier transmission systems, in which only asingle flow is scheduled at each decision time, PF can beachieved by scheduling that flow for which wi·mij(t)

R̄i(t)is maxi-

mized [14]. The problem of extending PF scheduling to multi-carrier systems, in which multiple channels are scheduled ateach decision time, is considered in [15]. The authors considersingle-BS scheduling and derive an objective function thatshould be maximized for allocations to guarantee long-termPF. Using an approach similar to theirs, it can be shownthat long-term PF can be achieved in single-BS OFDMAsystems by allocating slots in each epoch to flows such that∑N

i=1 Ui(xi) is maximized, where Ui(xi), which is the utilityfunction for flow i, is given by

Ui(xi) = wi log(1 +mixi

di). (1)

Recall that di = (T − 1) · R̄i and T > 1 holds. di is theamount of data served to flow i in the past T − 1 epochs and

3

Ui(xi) is the marginal utility obtained by allocating xi slotsto flow i in the current epoch. Note that as T increases, morehistory, that is, a longer past, of flows is taken into accountto arrive at allocations for the current epoch, and a longer Tcan be thought of as improving fairness over a longer term.

When the number of BSs, B, exceeds one, it is easy to seethat the utility function for flow i expands to

Ui(�xi = 〈xij〉Bj=1) = wi log(1 +1

di

B∑j=1

mijxij). (2)

IV. MACRO-SCHEDULING OF MULTIPLE BSS (MSBS)Under MSBS, BS assignments and allocations are deter-

mined for SSs in each scheduling epoch. Each flow’s utilityis modeled using (2). Since each SS (and hence flow) can beassigned to at most one BS in a scheduling epoch, the goal isto determine a many-to-one mapping from the set of flows tothe set of BSs such that the sum of flow utilities is maximized.The problem solved in each epoch is as follows.

MBSRA: Maximize

N∑i=1

wi log

(ai +

B∑j=1

bijxij

),

ai = 1, bij =mij

(T − 1) · R̄i

subject to

N∑i=1

xij = S, for all 1 ≤ j ≤ B (3)

xij ≥ 0, xij ∈ N, for all 1 ≤ i ≤ N, 1 ≤ j ≤ B (4)

(∀j, �, j �= � :: xij > 0 ⇒ xi� = 0), for all i = 1, . . . , N. (5)

The constraint in (3) restricts the total number of slots allo-cated to all the flows in any BS to exactly S, while (4) requiresthe number of slots assigned to any flow to be a non-negativeinteger. (5) is the exclusivity constraint, confining each flowto a single BS. (3) and (5) implicitly restrict the number ofslots assigned to any flow to at most S.

If the number of BSs is one, the problem to be solved is

SBSRA: Maximize U(�x) =

N∑i=1

wi log(ai + bi · xi),

ai = 1, bi =mi

di=

mi

(T − 1) · R̄i

subject to

N∑i=1

xi = S (6)

xi ≥ 0, xi ∈ N, for all 1 ≤ i ≤ N. (7)

For simplicity, we assume that mi > 0, and hence bi > 0,holds for all i.

Both MBSRA and SBSRA are convex programs with non-negativity and integrality constraints. In [12], we presenta quadratic-time exact algorithm for solving SBSRA. Themultiple BS version, however, turns out to be NP-hard in thestrong sense. Let SBSRA-rel denote the unconstrained versionof SBSRA, in which both the non-negativity and integralityconstraints in (7) are relaxed. Our hardness proof for MBSRAuses the solution to SBSRA-rel, so we present it first.A. Solution to SBSRA-rel

Since SBSRA-rel contains only an equality constraint andits objective function is concave, it can be solved usingthe technique of Lagrange multipliers [16]. Applying thetechnique, we have the following system of linear equations,where λ is the Lagrange multiplier.

wibi

ai + bixi= λ,

N∑i=1

xi = S

Solving the above equations, we have

xi =wi∑N

j=1wj

(S +

N∑j=1

aj

bj) − ai

bi. (8)

Since wi > 0, xi > −ai

biholds for all i. The optimal solution

is unique and can take negative values for some unknowns.We next show that MBSRA is NP-hard in the strong sense.

B. Hardness Result for MBSRAWe prove that MBSRA is NP-hard in the strong sense

by providing a reduction from the 3-PARTITION (3-part)problem, which is NP-complete in the strong sense. In [3],Li et al. claim that a variant of MBSRA (that differs insome coefficients) can be shown to be NP-hard by adapting areduction in [4] from the 3-dimensional matching problem.

3-part is a number problem [17] defined as follows.Definition 1 (3-part): Given set E of 3m elements,e1, e2, . . . , e3m, a bound K ∈ Z+, and a size s(ei) = si ∈Z+ for each ei ∈ E such that K/4 < si < K/2 and∑3m

i=1 si = mK . The problem is to determine whether E canbe partitioned into m disjoint sets E1, E2, . . . , Em such that∑

e∈Eis(e) = K , for 1 ≤ i ≤ m.

Theorem 1. The multiple-base station resource allocationproblem in WiMAX, MBSRA, is NP-hard in the strong sense.

Proof: To prove the theorem, we show that the decisionversion of MBSRA is NP-complete in the strong sense. It iseasy to see that the decision version is in NP. We provide apseudo-polynomial reduction [17] from 3-part to it.

In 3-part, without loss of generality, assume that s1 ≥ si,for all 1 < i ≤ 3m.SBS-3part: Consider an instance, denoted SBS-3part, of theunconstrained version of the single-BS allocation problemSBSRA-rel, with the following parameters: N = 3m, S =mK , ai = wi = 1, and bi = 1

1+s1−si> 0 (because s1 ≥ si),

for 1 ≤ i ≤ N . As mentioned in Sec. IV-A, SBSRA-rel hasa unique optimal solution when bi > 0. By (8), the uniqueoptimal solution to SBS-3part is given by �x = �s, which isnon-negative. The unique optimal objective value is given by

OPT =

3m∑i=1

log(1 + bi · xi) =

3m∑i=1

log(1 +si

1 + s1 − si). (9)

We now reduce an instance of 3-part to MBSRA.Reduction, MBS-3part: Construct an instance of MBSRA,denoted MBS-3part, from an arbitrary instance of 3-part asfollows. Let B = m, N = 3m, S = K , and wi = 1, for1 ≤ i ≤ N . For each i, let bij = bi = 1

1+s1−si, for all

1 ≤ j ≤ B. We claim that a solution to 3-part exists if andonly if there is a solution to MBS-3part with objective valueat least OPT as defined in (9).⇐: Assume MBS-3part has a solution with objective value atleast OPT . We first show that this solution is unique with anobjective value of OPT exactly and also identify the value ofeach xij . For this, note that in the solution mentioned, for eachi, at most one of xij , for 1 ≤ j ≤ B, that corresponds to theBS that i is assigned to, is positive. Let βi denote this uniqueBS, if such a BS exists, and be undefined, otherwise (that is,when xij = 0 for all j). The number of slots allotted to i isxiβi if βi exists, and 0, otherwise. Now, a feasible solution toSBS-3part can be constructed from the solution to MBS-3partas follows:

xi =

{xiβi

, βi is defined

0, otherwise

This is because (i) bij of MBS-3part equals bi of SBS-3partfor all i, j, (ii) for each i, at most one of xij is positive in any

4

solution to MBS-3part, (iii) the total capacity of the K BSsin MBS-3part, K · m, is equal to the capacity of the singleBS in SBS-3part, hence

∑Bj=1

∑3mk=1 xkj = mK =

∑3mi=1 xi

and (iv) the remaining parameters, N = 3m and the ai’s andwi’s, are identical in both the problems. Hence, unless βi iswell defined and xiβi = si > 0, for each i, the objective valueof SBS-3part (given by

∑3mi=1 log(1+ xi

1+s1−si)) can be either

increased or achieved with a solution different from �s. Thiswould contradict either the optimality of OPT for SBS-3partor the fact that its optimal solution is unique. Hence, MBS-3part has a unique solution given by xij = 0, for all j �= βi

and xiβi = si, for all i, and it is easy to see that a solution to3-part is obtained by simply assigning ei to set Eβi .⇒: Assume 3-part has a solution. Let element ei be assignedto set Ek. Then, a solution to MBS-3part that achieves anobjective value of exactly OPT can be obtained by assigningSS i to BS k.

Since the reduction can be performed in polynomial timeand all numbers in MBS-3part (M , B, S, wi, bi, and OPT ) arepolynomially bounded by the numbers in 3-part, the decisionversion of MBSRA is NP-complete in the strong sense. Hence,MBSRA is NP-hard in the strong sense. �

In [3], Li et al. proposed two approximation algorithms forsolving a convex optimization problem with exclusivity andnon-negativity constraints and a logarithmic utility functionfor the objective. (Their utility function is different from oursas they do not take past service into consideration. Also, theydo not have integrality constraints as we have in (4).) Theirapproximation algorithms can be used for solving MBSRA aswell. In the first algorithm, denoted cvapPF, the exclusivityconstraint in (5), which restricts each SS to a single BS, andthe integrality constraint in (4) are relaxed and a solution withnon-exclusive, that is, fractional, assignment and fractionalallocation is obtained. The non-exclusive assignment is thenrounded to respect exclusivity using the technique based on bi-partite graph construction proposed in [18] for the generalizedassignment problem. (They do not consider rounding thefractional allocation.) Li et al. show that this method providesan approximation ratio of 5.828. Their second algorithm,nlapPF, discretizes an equivalent exact non-linear formulationof MBSRA to construct a relaxed linear program with non-exclusive assignments. The solution to the discretized programis then rounded to obtain a solution with approximation ratio2 + ε.

cvapPF involves solving a convex program, and nlapPF,a linear program, in which the number of variables neededdepends on the desired accuracy. Hence, the running timesof both these algorithms are quite exorbitant, and are notsuited for online deployment to be invoked periodically at run-time. We therefore propose efficient heuristics, and comparetheir performance with cvapPF and an upper bound to thetheoretical optimal.C. Heuristics

Solving MBSRA optimally requires simultaneously deter-mining SS-BS assignments and allocation of slots to SSs,making the problem combinatorial. To lower complexity, ourheuristics decouple the assignment part from the allocation.Highest Bandwidth First (hbf): The first heuristic, calledhighest bandwidth first (hbf), assigns an SS to the BS fromwhich it is likely to receive the highest bandwidth (bw). When

the load distribution is asymmetric, the highest-bw BS foran SS need not be the one with the strongest signal. Todetermine SS assignments, the single-BS allocation algorithmASBS described in [12, Sec. V-C], is run at each BS withall SSs. Each SS is assigned to the BS at which the bwassigned by ASBS is the largest. In the second step, at eachBS, allocations to just the SS’s assigned to it are performedusing ASBS again. In our simulations, hbf was more than threeorders of magnitude faster than cvapPF.Approximating the fractional solution (ffrac): The convexand linear solvers of cvapPF and nlapPF are used for de-termining near-optimal solutions with fractional assignments(wherein an SS can be assigned to multiple BSs and allottednon-integral slots) that are then rounded. To lower runningtime, we consider heuristics for this step, and use ASBS todetermine a fractional assignment. The idea is to iterativelydetermine allocations for SSs in successive BSs in B passes.In the first pass, in each BS, only those flows for which thatBS provides the highest bw are allocated. In general, in theith pass, in BS k, SSs for which BS k provides among thehighest i bws are allocated. One aspect to note is that due tonon-linearity, the utility function associated with an SS cannotbe identical at all BSs. Let yi denote the number of bytesallocated to SS j in its top i BSs. Then, the utility functionto be used at the BS with the next highest bw is obtainedby adding yi/di to the log term in (1). Also, in pass i, SSallocations in the BSs are revised using i rounds for betteraccuracy. More details are omitted for want of space and canbe found in [12]. The complexity of this heuristic, termedffrac (for fast fractional), is O(B3N2), and in our simulationexperiments was found to be more than an order of magnitudefaster than the convex solver. The fractional solution providedby ffrac is then rounded as with cvapPF to ensure that eachSS is assigned to at most one BS.

V. EMPIRICAL EVALUATION

A. Experimental SetupWe use the macro-cell model and system parameters rec-

ommended by the WiMAX Forum [19, Sec. 2.1.1, Tables2.1.1 – 2.1.6] for our experiments. We simulate 19 cells,with a frequency reuse of three, placed in a regular, three-tier hexagonal tessellation. Performance evaluation is overthe serving set of the inner seven cells; the outermost tierof twelve cells model realistic interference. For the mobilityexperiments, we assumed pedestrian mobility at 3km/h [19].Since the epoch durations over which the channel quality isestimated are long in comparison to channel coherence times,we model only large-scale effects, namely, path loss per themodified COST231 Hata urban propagation model and 8dBLog-Normal Shadowing. Interference only due to other BSsis modeled assuming edge users in neighboring cells can beallocated in non-overlapping subchannels.

SS locations are randomly generated in two modes, uniformand hotspot. SS count is set at 150, each SS is associatedwith a single VoD flow, and the playout rate of each flowis set at 640 kbps. Further increase in load led to a sharpdrop in performance for all the algorithms. For the uniformcase, each SS is placed randomly in a disc of radius 2R(from the centre of the tessellation), where R is the cellradius. For the hotspot case, N/2 users are placed in a disc ofradius R and the remaining half are placed in the annulus

5

of width R. Each scheduling epoch is set to 10 s in theinitial experiments. Higher epoch durations are tested in a laterexperiment. Experiments are conducted over 250 epochs.

SSs are assumed to be provided with sufficiently largeplayout buffers. A simple player strategy that can play contentas it is received without having to pre-buffer is assumed.Playout buffers are assumed to be empty at the start ofeach simulation run. Our measures of performance are meanstatlling fraction (MSF), which is the percentage of time a userstalls in an epoch, and percentage of satisfied users (PSU) perepoch. Stalling duration in an epoch is computed based onthe content delivered in the epoch, SS buffer contents at thebeginning of the epoch, and playout rate. A user is satisfied inan epoch, if their VoD flow does not stall. A flow stalls whenits buffer is empty and it is served at less than its playout rate.

We compare Li et al.’s cvapPF [3], our heuristics hbf andffrac, and the common strongest-signal first (ssf) heuristic, thatassigns an SS to that BS to which its signal is the strongest.We also consider a variant of our MSBS scheme that can betuned to improve QoE at the expense of aggregate throughput.In this variant, called bufballoc, flows that have sufficientlylarge buffers accumulated at the receivers are not allocatedany bandwidth until their buffers fall below an upper threshold,which is a tunable parameter. We evaluate this variant with hbfas the underlying scheduling algorithm. Further, we evaluatethe impact of pre-buffering, by letting each flow pre-buffer fora configurable amount of time before beginning to play. Weshow the results when pre-buffering is enabled for bufballoc(the scheme with pre-buffering is called pbufballoc).

We compare results for T = 50 epochs (500 s) for all thealgorithms and also T = 1 epoch (10 s) for cvapPF (denotedcvapPF1), which is the equivalent of Li et al.’s algorithm in[3]. (cvapPF with T = 50 is denoted cvapPF50.) Compar-ison with cvapPF1 is to study the impact of lifetime QoEmanagement that takes into account past history of servicerates. Recall that T is the length of the averaging window forupdating average service rate and can be increased to improveQoE. In the plots, opt denotes the fractional solution returnedby the convex solver used as part of cvapPF50, and providesan upper bound on the optimal performance.

B. Discussion of ResultsSimulation results for the hotspot scenario for stationary

users are plotted in Fig. 2. PSU and aggregate throughput areplotted by epoch in insets (a) and (b), respectively. cvapPF50,ffrac, and hbf have very similar PSU, and at steady state, with94% PSU, lag opt by less than 2%. It should be noted that therunning times of ffrac and hbf are lower than that of cvapPF50by more than one order and three orders of magnitude,respectively. Hence, these heuristics are highly preferable tocvapPF in online settings. bufballoc, which restricts allocationsto flows based on the playout buffer level, increases PSU by4%, when the buffer threshold is set to 2 mins content. We alsoran experiments for higher threshold values, and found thatPSU increases with increasing threshold up to a limit and thenfalls and settles to the hbf level. pbufballoc, with pre-bufferingfor 2 mins before beginning to play out, significantly improvesPSU (by 20%) during the initial epochs, and at steady state,settles to bufballoc levels. (PSU for bufballoc is higher thanthat of opt since it is modified to yield a solution that deviatesfrom a PF allocation and is skewed in favor of improving

0 50 100 1500

0.2

0.4

0.6

0.8

1

Subscriber Stations (SS)

Mea

n S

tall

Fra

ctio

n (M

SF

−U

A) cvapPF1

ssfffrachbfcvapPF50optbufballocpbufballoc

0 50 100 150 200 250

20

40

60

80

100

120

Epoch Number

Thr

ough

put (

Mbp

s)

ssfffrachbfoptcvapPF50bufballocpbufballoccvapPF1

(a) (b)

0 50 100 150 200 2500

0.2

0.4

0.6

0.8

1

Epoch Number

Mea

n st

all F

ract

ion

(MS

F−

EA

) cvapPF1ssfcvapPF50ffrachbfoptbufballocpbufballoc

0 50 100 1500

0.2

0.4

0.6

0.8

1

Subscriber Stations (SS)

Mea

n S

tall

Fra

ctio

n (M

SF

−U

A) cvapPF1

ssfffrachbfcvapPF50optbufballocpbufballoc

(c) (d)

0 50 100 150 2000

20

40

60

80

100

T

Avg

. % s

atis

fied

user

s (A

vg P

SU

)

pbufballocbufballocffrachbfssf

0 500 1000 1500 20000

50

100

Epoch Number

% s

atis

fied

user

s (P

SU

)

0 500 1000 1500 20000

20

40

60

80

100

epoc

h du

ratio

n (s

econ

ds)

epoch duration

uniform epochadaptive epoch

(e) (f)Fig. 2: (a)-(d) Comparison of algorithms for stationary users in a hotspot

scenario. (e) Impact of T to PSU. (f) Steady-state performance with dynamic

user population and dynamically changing epoch durations. (In all the graphs,

the order in legend matches the order of the curves.)

fairness.) All of these algorithms, with T = 50, outperformcvapPF1, which ignores past history, by more than 30%, andssf by 35%. These results show that for streaming flows withreal-time constraints (albeit somewhat weak), an assignmentbased merely on signal strength or merely maximizing short-term fairness are not efficient strategies.

Referring to inset (b), we see that the higher PSU ofbufballoc and pbufballoc cost 10% in terms of throughput. ssf’sthroughput is the highest, as expected, since it always favorsSSs with strong channels at all BSs. opt and our heuristics arequite close, while cvapPF1’s throughput is the lowest. This isbecause ignoring past history may require allocation to SSswith weak channels but adequate buffers to guarantee short-term fairness in every epoch, which can be wasteful.

MSF averaged over all users per epoch (MSF-EA) andaveraged over the entire session per user (MSF-UA) are plottedin insets (c) and (d), respectively. Although PSU’s for cvapPF1and ssf are close, the higher aggregate throughput for ssftranslates to a lower MSF-EA (of 0.06) than for cvapPF1(0.2). MSF-EA for pbufballoc (the best) is 0.01, that is, auser stalls for 100 ms per epoch on average, and for the otheralgorithms considered ranges between 0.02 and 0.05. Referringto inset (d), the 75 of the 150 SSs that almost never stall arethose who are away from the hotspot. For the remaining SSsin the hotspot, stalling fractions are as shown.

The above experiments were repeated with uniformly dis-tributed users and mobile users moving at pedestrian speeds aswell. In the uniform case, more users could be sustained with

6

similar PSU and throughput as in the hotspot case (except withssf) due to a better distribution of users. PSU and throughputwere lower in the case of mobile users, as the system capacityis lowered due to mobility.

In inset (e), we study the performance for varying T , theextent to which past service is taken into account. We findthat in our setup, the gains are minimal after T = 50, soconsidering history over the past 5-10 mins suffices.

The experiments described so far assumed that the users arestatic, all arriving at time 0. A more realistic scenario is onein which arrivals are staggered and users dynamically join andleave. Further, the focus has been more on evolutionary start-up conditions and less on steady-state conditions. To evaluatethe steady-state performance of our scheme under dynamicconditions and a shorter pre-buffering duration of 40 s, wesimulated pbufballoc with the following parameters: 75 usersjoin at time 0, the remaining 75 users arrive at the rate of1 user per 5 seconds (0.2 arrivals/sec); after 150 users havearrived, existing users depart and new users join, each at arate of 0.2/sec. The simulation was for 2000 epochs with anepoch duration of 20 s. PSU results in inset (f) show that thesteady-state performance under dynamic conditions is close tothat under static scenario. Also, performance during start-upis significantly improved (even with a shorter pre-bufferingduration) and is close to the steady-state performance due tomore realistic user arrivals.Lowering overheads: We measured the time taken by eachof the algorithms and found that hbf and its variants require13 ms on average per epoch. (ffrac required 1 s while cvapPF,30 s). For an epoch duration of 10 or 20 s, the computationaloverhead due to the heuristics is thus minimal. This overheadcan in fact be masked by performing the allocations towardsthe end of the previous epoch. Another overhead of MSBS isthe time taken to reassign SSs to BSs at epoch boundaries.Efficient techniques that have been developed for minimizinghandoff latencies for real-time sessions such as VoIP can beused to accomplish this task within 50 ms. This presents anoverhead of 0.5% for a 10 s epoch (and 0.25% for a 20 sepoch). Further, not all SSs will be switched in every epochand thus this overhead is also reasonable.

One way of lowering the overheads is by increasing theepoch duration. Since statically increasing the epoch durationbeyond 20 s somewhat degrades the performance, we dynam-ically adjust the epoch duration at run-time using a simplecontrol law to maintain PSU at a desired set-point. We startwith an epoch duration of 20 s, decrease it when the PSU isstable above the set-point (set to 95%) and increase when thePSU falls and stays below the threshold. Results in inset (f)show that the simple adaptive approach degrades PSU by only2% while increasing the average epoch duration to close to 50sec, which is a 150% improvement. We believe that there isscope for increasing the epoch duration further using moresophisticated techniques.

VI. CONCLUSION AND FUTURE WORK

In this paper, we have considered coordinated scheduling ina network of WiMAX BSs, called macro-scheduling of BSs(MSBS), for managing the QoE of VoD flows. Our utility-maximization approach for guaranteeing long-term propor-tional fairness and managing lifetime QoE shows that MSBS

can improve fairness for VoD flows significantly, and, specif-ically, accounting for past service history during schedulingcan increase the percentage of satisfied users, by up to 35%.

As part of the QoE management problem, we showedthat the network-wide PF scheduling problem is NP-hard inthe strong sense. Since approximation algorithms previouslyproposed can require long running times, we have proposedefficient heuristics that can solve the problem more than threeorders of magnitude faster, while yielding comparable results.

The work reported herein provides some guidelines forscheduling VoD flows. Using these guidelines to design ef-ficient algorithms that can provide guaranteed services toreal-time applications with more stringent requirements, suchas video-conferencing, under WiMAX, will be considered infuture work.

REFERENCES

[1] IEEE, “802.16e-2005 IEEE standard for local and metropolitan areanetworks,” http://standards.ieee.org/getieee802/802.16.html, 2005.

[2] F. P. Kelly, A. K. Maulloo, and D. K. H. Tan, “Rate control in commu-nication networks: shadow prices, proportional fairness and stability,”The J. of the Operational Research Society, vol. 49, no. 3, pp. 237–252,March 1998.

[3] L. Li, M. Pal, and Y. Yang, “Proportional fairness in multi-rate wirelessLANs,” in IEEE INFOCOM, April 2008, pp. 1678–1686.

[4] T. Bu, L. Li, and R. Ramjee, “Generalized proportional fair schedulingin third generation wireless data networks,” in IEEE INFOCOM, April2006, pp. 1–12.

[5] A. Sang, X. Wang, M. Madihian, and R. Gitlin, “Coordinated loadbalancing, handoff/cell-site selection, and scheduling in multi-cell packetdata systems,” Wireless Networks, vol. 14, no. 1, pp. 103–120, 2008.

[6] S. Das, H. Viswanathan, and G. Rittenhouse, “Dynamic load balanc-ing through coordinated scheduling in packet data systems,” in IEEEINFOCOM, April 2003, pp. 786–796.

[7] V. Vukadinovic and G. Karlsson, “Video streaming in 3.5g: Onthroughput-delay performance of proportional fair scheduling,” in 14thIEEE International Symposium on Modeling, Analysis, and Simulationof Computer and Telecommunication Systems, Sep 2006, pp. 393–400.

[8] X. Z. H. Lei, C. Fan and D. Yang, “Qos aware packet schedulingalgorithm for ofdma systems,” in Vehicular Technology Conference, Sep2007, pp. 1877–1881.

[9] F. Hou, J. She, P. Ho, and X. Shen, “Performance analysis of weightedproportional fairness scheduling in ieee 802.16 networks,” in Interna-tional Conference on Communications, May 2008, pp. 3452–3456.

[10] M. Ergen, S. Coleri, and P. Varaiya, “Qos aware adaptive resourceallocation techniques for fair scheduling in ofdma based broadbandwireless access systems,” IEEE Transactions on Broadcasting, vol. 49,no. 4, pp. 362–370, Dec 2003.

[11] J. Gross, J. Klaue, H. Karl, and A. Wolisz, “Subcarrier allocation forvariable bit rate video streams in wireless ofdm systems,” in VehicularTechnology Conference, Oct 2003, pp. 2481–2485.

[12] S. Mitra, U. Devi, P. Gupta, P. Dutta, M. Chetlur, andS. Kalyanaraman, “Macro-scheduling of base stationsfor video-on-demand flows in WiMAX networks,” IBMResearch, India, Tech. Rep. RI09011, July 2009, available athttp://domino.research.ibm.com/library/cyberdig.nsf/index.html.

[13] M. Andrews and et al., “Providing quality of service over a sharedwireless link,” IEEE Communications Magazine, pp. 150–154, February2001.

[14] A. Jalali, R. Padovani, and R. Pankaj, “Data throughput of CDMA-HDR a high efficiency-high data rate personal communication wirelesssystem,” in IEEE Vehicular Technology Conference, 2000, pp. 1854–1858.

[15] H. Kim and Y. Han, “A proportional fair scheduling for multicarriertransmission systems,” IEEE Communications Letters, vol. 9, no. 3, pp.210–212, March 2005.

[16] S. Boyd and L. Vandenberghe, Convex Optimization. CambridgeUniversity Press, March 2004.

[17] M. Garey and D. Johnson, Computers and Intractability : a Guide tothe Theory of NP-Completeness. W. H. Freeman and company, NY,ch. 4.

[18] D. Shmoys and E. Tardos, “An approximation algorithm for the gener-alized assignment problem,” Mathematical Programming, vol. 62, no. 3,pp. 461–474, 1993.

[19] WiMAX Forum, “WiMAX system evaluation methodology, ver. 2.1,”July 2008.

Documents

[IEEE 2010 IEEE 18th International Workshop on Quality of Service (IWQoS) - Beijing, China (2010.06.16-2010.06.18)] 2010 IEEE 18th International Workshop on Quality of Service (IWQoS)