
Submitted to Operations Research (manuscript)

Call center staffing: Service-level constraints and index priorities

Seung Bum Soh, College of Business Administration, Sejong University, [email protected]

Itai Gurvich, Kellogg School of Management, Northwestern University, [email protected]

May 2, 2016

Call centers attribute different values to different customer segments. These values are reflected in quality-

of-service targets. The prevalent Target Service Factor (TSF) formulation requires, for example, that 80% of

VIP customers wait less than 20 seconds while setting the target to 30 seconds for non-VIP customers. The

call center must determine the staffing level together with a prioritization rule that meets these targets at

minimal cost. In practice, due to the underlying complexity of these systems, the prioritization rule is often

selected in a heuristic manner rather than being systematically optimized. When considering the universe

of prioritization policies, index rules provide a customizable and easy to define heuristic and for this reason

are implemented in various call-center software packages.

We use the TSF formulation as a stepping stone towards a better understanding of index rules. We first

construct an asymptotically optimal solution for the TSF problem. The prioritization component of our

solution is a tracking policy rather than an index rule. We prove that despite index rules’ significant flexibility,

no instance of these prioritization rules is optimal for the TSF problem. The sub-optimality of index rules

follows from an essential characteristic of these: restricting attention to index rules (as is heuristically done

in practice) is asymptotically equivalent to requiring that a VIP customer always waits less than a regular

(non-VIP) customer who arrives at the same time. This, in particular, implies that the use of index rules in

practice can be rationalized if (and only if) the manager requires such strong differentiation.

1. Introduction

Waiting times are regarded as central to customers’ service experience and are used as a key

performance indicator for call centers. In the spirit of measuring what is managed, call centers use

metrics that aggregate waiting-time related information. They are collectively referred to as QoS

(Quality-of-Service) metrics. A notable example is the Target-Service-Factor (TSF, the fraction of

customers who exceed a target waiting time); see Chapter 2 of Koole [2013] for further discussion

of call-center performance metrics. TSF is one of the key performance indicators that are tracked

and targeted in call centers.

When the call center serves multiple customer classes, their relative importance is typically

reflected in the TSF targets requiring, for example, that 80% of class-1 customers wait less than 20

seconds but 80% of class-2 customers wait less than 30 seconds. Solving the call-center’s staffing

problem in a multi-class environment requires determining the capacity level and the prioritization

rule that, combined, guarantee that the service-level targets are met at a minimal cost.


In the two-class model that we consider here, the staffing problem is given by

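(TSF)
\[
\begin{aligned}
\min\ & N \\
\text{s.t.}\ & P\{W_1^{N,\pi} > w_1\} \le \alpha_1, \\
& P\{W_2^{N,\pi} > w_2\} \le \alpha_2, \\
& N \in \mathbb{Z}_+,\ \pi \in \Pi,
\end{aligned}
\]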

where, informally at this stage, $W_i^{N,\pi}$ represents the steady-state waiting time of class i when the

number of servers is N and the prioritization policy is π. We seek to find the minimal number

of servers N together with a supporting prioritization policy π so that the TSF constraints are

met. Here, Π denotes the family of “admissible” prioritization policies; see Definition 1. To the

extent that the TSF constraint reflects relative importance, having α1 ≤ α2 together with w1 ≤ w2 corresponds to class 1 being the VIP class.

Truly solving this problem to optimality by jointly determining the staffing level and the priority

rule is difficult. This paper is the first to offer an asymptotically optimal solution for a relatively

simple network. In practice, call centers typically restrict attention to parametric families of pri-

oritization rules and subsequently adjust the parameters and the staffing level to meet the QoS

targets. This heuristic approach is understandable given the complexity of the problem. Degrees

of freedom in the prioritization policy are built into the software that manages the real time call

selection and routing of calls. The software provides sufficiently many adjustable parameters while

maintaining simplicity. A family of prioritization heuristics is that of index priorities; see e.g. Chan

et al. [2014] or Koole and Pot [2006] where these are referred to as Time Function Scheduling

(TFS). This family of policies is also well understood analytically in the context of convex holding

cost minimization; see Van Mieghem [1995] and Mandelbaum and Stolyar [2004].

A customer is assigned an index upon arrival. As the customer waits, his index increases period-

ically. The system manager chooses the initial indices for each customer type as well as the update

intervals and update-to levels (these are easily defined in a spreadsheet). This leads to an index

function such as the one in Figure 1, which is used in an Israeli bank that serves three categories of

retail customers. In real time, a server who becomes available serves the customer with the highest

index.

Somewhat more formally, customer type i is associated with a piecewise-constant index gi(·)

so that a server that becomes available chooses for service the customer at the head of queue

$i^* \in \arg\max_i g_i(w_i(t))$, where $w_i(t)$ is the accumulated waiting time of the customer at the head

of the class-i queue. This family of policies is intuitive to understand and easy to implement.

Moreover, as both the update periods and levels can be defined by the user, it is expected to

be rather flexible. Analytically understanding the role of index policies in constrained staffing


Figure 1 Index functions in practice

Figure 2 The non-index structure of optimal prioritization

optimization is practically important, due to their use in practice, and intellectually important,

due to the attention they have received in the literature on the optimization of queues.
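To make the selection rule concrete, the following is a minimal Python sketch of an index policy with piecewise-constant index functions of the head-of-line waiting time. The class names, breakpoints, index levels and the tie-breaking convention (first class listed wins) are hypothetical and serve only as an illustration; they are not the configuration of the bank in Figure 1.

```python
import bisect


def make_step_index(breakpoints, levels):
    """Build a piecewise-constant index function g(w).

    levels[k] applies while breakpoints[k-1] <= w < breakpoints[k];
    levels must have exactly one more entry than breakpoints.
    """
    if len(levels) != len(breakpoints) + 1:
        raise ValueError("need len(levels) == len(breakpoints) + 1")

    def g(wait):
        return levels[bisect.bisect_right(breakpoints, wait)]

    return g


def pick_queue(head_waits, index_functions):
    """Return the class whose head-of-line customer has the highest index.

    head_waits maps class name -> waiting time of the head-of-line customer,
    or None when that queue is empty. Ties go to the class listed first
    (an assumption of this sketch).
    """
    scores = {
        cls: index_functions[cls](wait)
        for cls, wait in head_waits.items()
        if wait is not None
    }
    if not scores:
        return None
    return max(scores, key=scores.get)


# Hypothetical two-class configuration (waiting times in seconds).
index_functions = {
    "vip": make_step_index([10, 20], [5, 8, 12]),
    "regular": make_step_index([15, 30], [1, 6, 10]),
}
```

With these hypothetical values, a regular customer who has waited 20 seconds outranks a VIP customer who has waited 5 seconds, while a VIP customer who has waited 12 seconds is served first.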

We first asymptotically solve the TSF formulation which, despite its prevalence in practice, has

not been solved even in the simplest form of multi-class queues. We provide a simple staffing

expression based on the M/M/N queue that explicitly captures the parameters of the problem. We

prove that this staffing solution, together with a corresponding priority policy, is asymptotically

optimal when the volume of arrivals is high.

Since we are interested in better understanding the role of index rules, we proceed to show

that all asymptotically optimal solutions to the TSF formulation are, in particular, non-index: all

optimal prioritization policies share some non-conventional properties:

(i) Non-monotonicity of waiting time in congestion: For at least one customer class (be it VIP or

Regular), the waiting time has the structure depicted in Figure 2 – it is greater when the congestion

(specifically the total queue) in the system is moderate relative to when the congestion is high.

(ii) Discontinuity of waiting time in congestion: Customers arriving within a short time interval


may experience starkly different waiting times (this corresponds to the discontinuity point ⋆ in

Figure 2).

Since, as we also prove, index rules generate waiting time profiles that are continuous and

monotone in the total congestion, we conclude that if one heuristically restricts attention to index

rules – even if one chooses the “best” index functions – staffing costs are non-negligibly greater than

the minimum necessary.

Thus, while the family of index rules is seemingly rich, there is no instance of an index rule that is

asymptotically optimal (together with an appropriately chosen staffing level) for the TSF problem.

To better understand this gap, it is useful to point to an interesting phenomenon one finds when

considering the use of index rules in practice. Figure 3 displays the waiting-time processes¹ on a

single day (May 10th, 2007) in a three-class bank call center that uses the index rule in Figure 1.²

Figure 3 Waiting-time processes in a large Israeli bank

A striking fact in this graph is that, sample path by sample path (we see a similar pattern

on other days), a perfect real-time ranking between the classes is preserved, that is, “Private”

customers wait less than their “Star” counterparts, whereas “Star” customers wait less than their

“Rainbow” counterparts. This reflects a rather strong notion of what it means to be a VIP customer

¹ Specifically, for each 2.5-minute interval the graph depicts the waiting time averaged over customers that arrived over that specific time interval.

² We are thankful to the Technion Service Enterprise Engineering Laboratory (SEE Lab) for the data.


– VIP customers wait less at all times (rather than, say, on average over a day or an hour). This

requirement can be mathematically formulated as

\[
P\left\{ \frac{W_1^{N,\pi}}{w_1} \le \frac{W_2^{N,\pi}}{w_2} \right\} \approx 1.
\]

This is a perfect service-level differentiation. One can view service-level differentiation (SLD) on a

continuum parameterized by a constant β ∈ [0,1] and captured by the constraint

\[
P\left\{ \frac{W_1^{N,\pi}}{w_1} \le \frac{W_2^{N,\pi}}{w_2} \right\} \ge \beta. \qquad (\mathrm{SLD}(\beta))
\]

The case β = 0 corresponds to requiring that differentiation hold only in aggregate, through the TSF

formulation (for example, on average at least 80% of VIP customers wait less than 20 seconds and at

least 80% of Regular customers wait less than 30 seconds), whereas β = 1 reflects perfect real-

time differentiation as observed in Figure 3. We label by TSF+SLD(β) a formulation that has

both the TSF constraints and the SLD(β) constraint.

The observation that perfect SLD is preserved in this data is generalizable. Perfect service-level

differentiation is a characterizing property of index policies. That is, requiring perfect service-level

differentiation but giving full freedom in the choice of the prioritization policy is equivalent to

restricting attention to index rules in the TSF problem. More formally, the following two formula-

tions are asymptotically equivalent:

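(TSF + SLD(1))
\[
\begin{aligned}
\min\ & N \\
\text{s.t.}\ & P\{W_1^{N,\pi} > w_1\} \le \alpha_1, \\
& P\{W_2^{N,\pi} > w_2\} \le \alpha_2, \\
& P\left\{ \frac{W_1^{N,\pi}}{w_1} \le \frac{W_2^{N,\pi}}{w_2} \right\} = 1, \\
& N \in \mathbb{Z}_+,\ \pi \in \Pi.
\end{aligned}
\]

⇔

(TSF + Index)
\[
\begin{aligned}
\min\ & N \\
\text{s.t.}\ & P\{W_1^{N,\pi} > w_1\} \le \alpha_1, \\
& P\{W_2^{N,\pi} > w_2\} \le \alpha_2, \\
& N \in \mathbb{Z}_+,\ \pi \in \Pi(\mathrm{Index}).
\end{aligned}
\]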

In words, restricting attention to index rules (as is often heuristically done) is asymptotically

equivalent to requiring perfect SLD (i.e., β = 1). Thus, whereas the a priori restriction of some

call centers to index rules may be driven by the software packages that they use, their choice is

consistent with optimality, provided that perfect SLD is desirable to them.

We close this introduction with a few important pointers to index rules in the literature. The use

of index-rule-based heuristics in the context of staffing subject to QoS targets is not grounded in

theory. The opposite is true for holding-cost (or waiting-time-cost)

minimization for given staffing levels. The best-known results in this context pertain

to the optimality of the Generalized cµ (Gcµ) rule for the minimization of convex holding costs

as pioneered by Van Mieghem [1995]. Waiting-cost minimization is also the subject of subsequent


extensions to Van Mieghem’s original work; see e.g. Mandelbaum and Stolyar [2004] and many oth-

ers that followed. Such rules were also shown to be optimal for some constrained staffing problems.

Average Speed of Answer (ASA) constraints give rise (as one possible solution) to a Fixed-Queue-

Ratio rule – a special case of an index rule; see Gurvich and Whitt [2007].

Thus, in certain contexts, these rules are very well understood. We expand this body of work

with some new insights about the role of index rules in constrained staffing problems. Constrained

staffing problems in multi-class queues have also been studied and we refer the reader to Aksin et al.

[2007] for some references. The typical paper in this literature seeks to identify one optimal (or

nearly optimal) solution. To that literature we add our solution to the TSF formulation. However,

questions such as those we raise here, which pertain to structural properties of the family of all asymptotically

optimal solutions, do not arise naturally in that line of work. They arise naturally here when we

seek to understand the role of index rules in staffing problems.

2. Model and analysis framework

We consider the two-class V model. Arrivals of class-i customers follow a Poisson process Ai =

(Ai(t), t≥ 0) with rate λi > 0; i= 1,2. Let λ= λ1 +λ2. The two Poisson processes are independent.

Service times are exponential with rate µ and are independent across customers and independent

of the arrival processes. Various metrics of interest will be superscripted by the staffing level N

and the prioritization policy π to denote their dependence on these decision variables.

The TSF and SLD constraints were defined in the introduction using a virtual waiting time

$W_i^{N,\pi}$. The Poisson Arrivals See Time Averages (PASTA) property guarantees that the fraction of

customers who wait more than a target w equals the fraction of time that the virtual waiting time

exceeds this value.

We substitute the virtual waiting time with a proxy, the local average waiting time, which is

based on the actual waiting times. Let $w_{i,k}^{N,\pi}$ be the waiting time of the kth class-i customer to

arrive. The local average waiting time of customers arriving over an interval (t, t+ δ] is defined to

be,

\[
W_{\delta,i}^{N,\pi}(t) := \frac{1}{A_i(t+\delta) - A_i(t)} \sum_{k=A_i(t)+1}^{A_i(t+\delta)} w_{i,k}^{N,\pi}, \qquad (1)
\]

where 0/0 is defined to be 0.
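As a small illustration of definition (1), the sketch below computes the local average from a list of arrival epochs and the corresponding realized waiting times of one class. The function name and the list-based data layout are assumptions made for this example.

```python
def local_average_wait(arrival_times, waits, t, delta):
    """Local average waiting time over the arrival window (t, t + delta].

    arrival_times[k] and waits[k] are the arrival epoch and the realized
    waiting time of the k-th customer of one class. An empty window
    returns 0, matching the convention 0/0 = 0.
    """
    window = [w for a, w in zip(arrival_times, waits) if t < a <= t + delta]
    return sum(window) / len(window) if window else 0.0
```

For example, with arrival epochs `[0.5, 1.2, 1.9, 3.0]` and waits `[10, 20, 30, 40]`, the window (1, 2] contains the second and third customers, so the local average is 25.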

The local average is the measure used in Figure 3, where δ is 2.5 minutes. If δ is kept sufficiently small,

the local average captures the dynamics and variation of waiting time. The use of local average

(rather than virtual waiting time) helps us overcome a technical difficulty (see Lemma 5.1).

We assume throughout that

α1, α2 > 0, α1 ≤ α2 and α1 +α2 < 1. (2)


If α1 +α2 ≥ 1, the staffing problem trivializes and, even for small values of w1,w2 > 0, any staffing

level N > λ/µ is feasible; see the discussion on page 285 of Gurvich et al. [2008].

We allow the controller to use information about the length of the queues, the elapsed service

times of customers that are in service and the accumulated waiting times of all the customers in the

queues and we let X(t) be the state descriptor that captures all this information; see the appendix

for the formal definition.

Definition 1 (admissible policies): A prioritization policy π is admissible if:

1. It does not use admission control – all customers are served.

2. It serves customers in a first-come first-served (FCFS) fashion within each class.

3. It is work conserving and stationary with respect to X.

Let Π be the family of admissible policies.

Non-preemption and FCFS within each class are both natural assumptions for our application.

The literature does have cases in which non-work-conserving policies are shown to be optimal or

asymptotically optimal (see Gurvich et al. [2008]). The results in Gurvich et al. [2008] are driven,

however, by order-of-magnitude differences between the probability targets of the various classes

(say α1 = 0.01 but α2 = 0.2) that lead inherently to solutions that induce perfect SLD rendering our

main research questions irrelevant. Work conservation has the following immediate consequence

which we state formally and repeatedly refer to.

Lemma 2.1 Fix the number of servers N , the service rate µ and the arrival rates λ1, λ2. Then,

under any admissible policy π, the total number of customers in the system XΣ(t) (respectively the

total queue QΣ(t)) follows the distribution of the total number of customers (respectively the queue

length) in an M/M/N queue with arrival rate λ= λ1 +λ2, service rate µ and N servers.

Many-server analysis: The problem TSF+SLD(β) is difficult to solve exactly. Instead, we resort

to identifying solutions that are asymptotically optimal as the system size grows. A series of solu-

tions is asymptotically optimal if the induced optimality gap is negligible in a central-limit-theorem

(CLT) scaling. This is consistent with many studies in the context of call-center optimization; see

e.g. Armony [2005], Armony and Mandelbaum [2011], Borst et al. [2004]. We consider a sequence

of V models with the total arrival rate λ= λ1 + λ2 increasing along the sequence but keeping the

service rate µ fixed. We assume that there exist ai > 0, i = 1,2, so that a1 + a2 = 1 and λi = aiλ

for i= 1,2 and all λ.

All relevant quantities are superscripted by λ to make the dependence on λ explicit. As is

standard in the literature, the targets w1 and w2 scale with λ: $w_i^\lambda = \bar{w}_i/\sqrt{\lambda}$, i = 1,2, where the w̄i are

fixed strictly positive constants. This guarantees that the system is in the Halfin-Whitt regime (see


Lemma 5.5 below). The Halfin-Whitt regime facilitates a CLT type of analysis. In our numerical

experiments we illustrate how our proposed solutions perform for given parameters (rather than

asymptotically); see §4.

To simplify the notation, we denote a staffing-policy pair (N,π) by ξ and let N(ξ) and π(ξ) be,

respectively, the staffing and prioritization components of ξ. The problem TSF+SLD(β) is now

re-written as:

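\[
\begin{aligned}
\min_{\xi^\lambda}\ & N(\xi^\lambda) \\
\text{s.t.}\ & P\{W_{\delta,1}^{\xi^\lambda,\lambda}(\infty) > w_1^\lambda\} \le \alpha_1, \\
& P\{W_{\delta,2}^{\xi^\lambda,\lambda}(\infty) > w_2^\lambda\} \le \alpha_2, \qquad (3) \\
& P\left\{ \frac{W_{\delta,1}^{\xi^\lambda,\lambda}(\infty)}{w_1^\lambda} \le \frac{W_{\delta,2}^{\xi^\lambda,\lambda}(\infty)}{w_2^\lambda} \right\} \ge \beta, \\
& \xi^\lambda \in \mathbb{Z}_+ \times \Pi.
\end{aligned}
\]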

$W_{\delta,i}^{\xi^\lambda,\lambda}(\infty)$ should be read as the local average (as in (1)) when the system is initialized (at

t = 0) with its steady-state distribution, the customer arrival rate is λ and the staffing-policy

pair $\xi^\lambda$ is used.

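Definition 2 (asymptotic feasibility): A sequence of staffing-policy pairs $\{\xi^\lambda\}$ is asymptotically feasible if $\xi^\lambda \in \mathbb{Z}_+ \times \Pi$ for all λ and, for each $\epsilon > 0$,

\[
\limsup_{\delta \to 0}\ \limsup_{\lambda \to \infty}\ P\{W_{\delta,i}^{\xi^\lambda,\lambda}(\infty) > w_i^\lambda (1-\epsilon)\} \le \alpha_i (1+\epsilon), \quad i = 1,2,
\]

and

\[
\limsup_{\delta \to 0}\ \limsup_{\lambda \to \infty}\ P\left\{ \frac{W_{\delta,1}^{\xi^\lambda,\lambda}(\infty)}{w_1^\lambda} \le \frac{W_{\delta,2}^{\xi^\lambda,\lambda}(\infty)}{w_2^\lambda} + \epsilon \right\} \ge \beta (1-\epsilon).
\]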

We say that a sequence {Nλ} is an asymptotically feasible sequence of staffing levels for (3) if

Nλ =N (ξλ) for a sequence {ξλ} of asymptotically feasible staffing-policy pairs.

Definition 3 (asymptotic optimality): A sequence of staffing-policy pairs $\{\xi^\lambda\}$ is asymptotically optimal if it is asymptotically feasible and

\[
\left[ N(\xi^\lambda) - N(\tilde{\xi}^\lambda) \right]^+ = o(\sqrt{\lambda}) \quad \text{as } \lambda \to \infty,
\]

for any other sequence $\{\tilde{\xi}^\lambda\}$ of asymptotically feasible staffing-policy pairs.

We say that a sequence {Nλ} is an asymptotically optimal sequence of staffing levels for (3) if

Nλ =N (ξλ) for an asymptotically optimal sequence {ξλ} of staffing policy pairs.


3. The proposed solution to TSF+SLD and its implications

This section contains our main results. We identify an asymptotically optimal staffing and priori-

tization solution. The two components of our solution are inter-dependent. The proposed staffing

level is feasible if one uses it together with our proposed prioritization policy. For simplicity of

exposition, we discuss them separately (each in a dedicated subsection), starting with the staffing

component. Theorems stated in this section are proved in §5 and in the technical supplement Soh

and Gurvich [2015].

3.1. Optimal staffing and the cost of SLD

To specify our staffing recommendation, let QN,λ(∞) be a random variable whose distribution is

that of the steady-state queue length in an M/M/N queue with arrival rate λ, service rate µ and

N servers (as µ is fixed throughout, we omit it from the superscript).

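For each λ, let $N_\star^\lambda$ be the solution of the following M/M/N staffing problem:

\[
N_\star^\lambda = \min\left\{ N \in \mathbb{Z}_+ : P\{Q^{N,\lambda}(\infty) \ge \lambda_1 w_1^\lambda + \lambda_2 w_2^\lambda\} \le \min\{\alpha_1, 1-\beta\} + \alpha_2 \right\}. \qquad (4)
\]

Notice that, by (2), min{α1, 1−β} + α2 < 1 holds.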

Theorem 1 (asymptotically optimal staffing) The sequence $\{N_\star^\lambda\}$ is an asymptotically optimal sequence of staffing levels for (3).

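The prioritization component $\pi(\xi^\lambda)$ of $\xi^\lambda$, constructed in the next subsection, guarantees that all

the inequalities $W_{\delta,1}^{\xi^\lambda,\lambda}(\infty) \le w_1^\lambda$, $W_{\delta,2}^{\xi^\lambda,\lambda}(\infty) \le w_2^\lambda$ and $W_{\delta,1}^{\xi^\lambda,\lambda}(\infty)/w_1^\lambda \le W_{\delta,2}^{\xi^\lambda,\lambda}(\infty)/w_2^\lambda$ are asymptotically

satisfied when the total queue length is smaller than $\lambda_1 w_1^\lambda + \lambda_2 w_2^\lambda$. Thus, the choice of

the staffing level (4) guarantees that the total violation is bounded by min{α1, 1−β} + α2. The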

challenge in designing the priorities is to make sure that this “violation budget” is distributed

correctly between the three constraints; for example, the TSF constraint for class 2 absorbs at

most α2 of the total violation budget.
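The staffing rule (4) can be evaluated numerically. The sketch below is one possible implementation, assuming the standard Erlang-C formula and the geometric steady-state tail of the M/M/N queue, P{Q ≥ k} = C(N, λ/µ) ρ^k for integers k ≥ 1. The function names are ours, and the waiting-time targets are passed in the same time unit as the rates (that is, they play the role of the $w_i^\lambda$ in (4)).

```python
import math


def erlang_c(n_servers, offered_load):
    """Probability of delay in a stable M/M/N queue (requires N > offered load)."""
    blocking = 1.0  # Erlang-B recursion, numerically stable for large N
    for n in range(1, n_servers + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    rho = offered_load / n_servers
    return blocking / (1.0 - rho * (1.0 - blocking))


def queue_tail(n_servers, lam, mu, q):
    """Steady-state P{Q >= q} in an M/M/N queue; returns 1 if the queue is unstable."""
    if q <= 0:
        return 1.0
    offered_load = lam / mu
    if n_servers <= offered_load:
        return 1.0
    rho = offered_load / n_servers
    return erlang_c(n_servers, offered_load) * rho ** math.ceil(q)


def staffing_level(lam1, lam2, mu, w1, w2, alpha1, alpha2, beta):
    """Smallest N satisfying the tail constraint in (4)."""
    lam = lam1 + lam2
    threshold = lam1 * w1 + lam2 * w2            # queue-length threshold
    budget = min(alpha1, 1.0 - beta) + alpha2    # total violation budget
    n = math.floor(lam / mu) + 1
    while queue_tail(n, lam, mu, threshold) > budget:
        n += 1
    return n
```

Because the tail probability decreases in N, the linear search stops at the minimal feasible staffing level. A larger β shrinks the violation budget and therefore can only increase the returned value.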

Remark 1 (aggregate-based staffing) Theorem 1 maps TSF+SLD(β) into a staffing problem for

a single class queue. The fact that this solution works for the multiclass queue will be supported

by our careful choice of the prioritization rule. Such a reduction to a one dimensional model is a

recurring theme in recent literature on staffing; see e.g. Armony and Mandelbaum [2011], Gurvich

et al. [2008], Gurvich and Whitt [2007]. In the latter two papers, a global Average Speed of Answer

(ASA) constraint allows for a relatively simple mapping of the parameters into the single-class

staffing problem. This mapping is more subtle in our setting as is reflected in the non-trivial way

in which the constants α1, α2 and β appear on the right-hand side of (4).


Figure 4 η∗(β) as a function of β for α1 = 0.3, α2 = 0.2, w̄1 = 2 and w̄2 = 4

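Theorem 2 The sequence $\{N_\star^\lambda\}$ satisfies

\[
N_\star^\lambda = \frac{\lambda}{\mu} + \eta^\star \sqrt{\frac{\lambda}{\mu}} + o(\sqrt{\lambda}),
\]

where $\eta^\star$ is convex increasing in the SLD degree β. It is strictly increasing for β ≥ 1−α1.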

Increasing the degree of SLD beyond a certain level (in particular, setting β = 1) is thus costly: the

optimal staffing level for the TSF formulation is too small for TSF+SLD(1). Conversely, relaxing

the SLD requirement reduces the staffing costs.

3.2. Prioritization and the optimality of index policies

The asymptotically optimal staffing $\{N_\star^\lambda\}$ is coupled with a carefully chosen sequence of policies

$\{\pi_\star^\lambda\}$ such that the sequence of staffing-policy pairs $\{\xi^\lambda = (N_\star^\lambda, \pi_\star^\lambda)\}$ is asymptotically optimal in

the sense of Definition 3. The policy that we construct in this section is an instance of so-called

tracking policies.

Tracking policies defined below are also admissible policies (customers are served in a FCFS

manner within the same class). The queue length of class i at time t for i = 1,2 is denoted by $Q_i^\lambda(t)$

and $Q_\Sigma^\lambda(t)$ is the total queue length, i.e., $Q_\Sigma^\lambda(t) = Q_1^\lambda(t) + Q_2^\lambda(t)$. From these, normalized queue

lengths are defined as follows:

\[
\hat{Q}_i(t) := \frac{Q_i^\lambda(t)}{\sqrt{\lambda}}, \qquad \hat{Q}_\Sigma(t) := \hat{Q}_1(t) + \hat{Q}_2(t).
\]

Definition 4 (queue tracking policies) Let $f = (f_1, f_2) : \mathbb{R}_+ \to \mathbb{R}_+^2$ be a non-negative function

such that f1(x) + f2(x) = x for all x ≥ 0. A server that becomes available at time t chooses the

queue to serve according to the following criterion:


Figure 5 Tracking: (Left) General description (Right) Proposed tracking function under sub-perfect SLD (β < 1)

1. If one queue is empty but the other is not, admit to service the customer at the head of the

non-empty queue.

2. If both queues are non-empty and Q̂1(t) − f1(Q̂Σ(t)) ≠ Q̂2(t) − f2(Q̂Σ(t)), choose the class i

with the positive value of Q̂i(t)− fi(Q̂Σ(t)).

3. If both queues are non-empty and Q̂1(t)− f1(Q̂Σ(t)) = Q̂2(t)− f2(Q̂Σ(t)), choose class 1.

An arriving customer is immediately assigned to an available server if there are such servers. If all

servers are busy upon this customer’s arrival, the customer is placed in the queue.

Figure 5(Left) illustrates the tracking function mechanism. It plots the targeted value for Q̂1

(f1(Q̂Σ)) vs. the value of Q̂Σ. If at time t, Q̂1(t)> f1(Q̂Σ(t)) (as in point A in the figure) the rule

prioritizes class 1 so as to pull it back toward f1(Q̂Σ(t)). If Q̂1(t) < f1(Q̂Σ(t)) as in point B in

the graph, the rule prioritizes class 2 over class 1 so as to push queue 1 towards f1(Q̂Σ(t)). The

tracking policy targets Q̂i(t)≈ fi(Q̂Σ(t)).
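The selection rule of Definition 4 can be sketched in a few lines of Python. This is an illustration only: the function name and signature are hypothetical, the tracking function is supplied through its first component f1 (so that f2(x) = x − f1(x)), and raw queue lengths are normalized by √λ as in the definition of Q̂.

```python
import math


def choose_class(q1, q2, f1, lam):
    """Class (1 or 2) served next under a queue tracking policy.

    q1, q2 are the current (unscaled) queue lengths and f1 is the first
    component of the tracking function. Returns None when both queues
    are empty.
    """
    if q1 == 0 and q2 == 0:
        return None
    if q2 == 0:          # rule 1: only one queue is non-empty
        return 1
    if q1 == 0:
        return 2
    scale = math.sqrt(lam)
    q1_hat, q2_hat = q1 / scale, q2 / scale
    total = q1_hat + q2_hat
    target1 = f1(total)
    dev1 = q1_hat - target1
    dev2 = q2_hat - (total - target1)   # equals -dev1 up to rounding
    if dev1 == dev2:     # rule 3: tie (both deviations are zero)
        return 1
    return 1 if dev1 > 0 else 2         # rule 2: serve the class above its target
```

Since the two deviations sum to zero, exactly one of them is positive whenever they differ, so rule 2 always identifies a unique class.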

The family of tracking functions is immense. The challenge is to carefully choose the tracking

function f so that together with our proposed staffing the solution is asymptotically optimal (in

particular, asymptotically feasible).

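Optimal tracking functions: We use the tracking function $f_1^\star$ defined below, as plotted also in

Figure 5(Right):

\[
f_1^\star(x) =
\begin{cases}
0, & 0 \le x < \dfrac{a_1\bar{w}_1 + a_2\bar{w}_2}{a_1\bar{w}_1}\,\kappa, \\[2ex]
\dfrac{a_1\bar{w}_1}{a_1\bar{w}_1 + a_2\bar{w}_2}\,x - \kappa, & \dfrac{a_1\bar{w}_1 + a_2\bar{w}_2}{a_1\bar{w}_1}\,\kappa \le x < a_1\bar{w}_1 + a_2\bar{w}_2 - 2\left(1 + \dfrac{a_1\bar{w}_1}{a_2\bar{w}_2}\right)\kappa, \\[2ex]
\dfrac{2a_1\bar{w}_1 + a_2\bar{w}_2}{2(a_1\bar{w}_1 + a_2\bar{w}_2)}\,x - \dfrac{a_2\bar{w}_2}{2}, & a_1\bar{w}_1 + a_2\bar{w}_2 - 2\left(1 + \dfrac{a_1\bar{w}_1}{a_2\bar{w}_2}\right)\kappa \le x < a_1\bar{w}_1 + a_2\bar{w}_2, \\[2ex]
x - a_2\bar{w}_2 + \kappa, & a_1\bar{w}_1 + a_2\bar{w}_2 \le x < \bar{q}, \\[1ex]
a_1\bar{w}_1 - \kappa, & \bar{q} \le x,
\end{cases}
\qquad (5)
\]

and $f_2^\star(x) = x - f_1^\star(x)$.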
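For readers who want to experiment with (5), the following Python sketch builds the first component of the tracking function as a closure. The factory name is ours; the threshold q̄ and the constant κ are taken as inputs, and κ is assumed to be a small positive number so that the break points are ordered.

```python
def make_optimal_tracking_f1(a1, a2, w1_bar, w2_bar, kappa, q_bar):
    """First component of the tracking function in (5); f2(x) = x - f1(x)."""
    c1, c2 = a1 * w1_bar, a2 * w2_bar
    q0 = c1 + c2                                  # a1*w1_bar + a2*w2_bar
    x0 = q0 / c1 * kappa                          # end of the flat zero piece
    x1 = q0 - 2.0 * (1.0 + c1 / c2) * kappa       # start of the steeper piece
    if not (0.0 < x0 < x1 and q0 <= q_bar):
        raise ValueError("kappa too large or q_bar below a1*w1_bar + a2*w2_bar")

    def f1(x):
        if x < x0:
            return 0.0
        if x < x1:
            return c1 / q0 * x - kappa
        if x < q0:
            return (2.0 * c1 + c2) / (2.0 * q0) * x - c2 / 2.0
        if x < q_bar:
            return x - c2 + kappa                 # class-1 target above a1*w1_bar
        return c1 - kappa                         # class-1 target below a1*w1_bar

    return f1
```

Evaluating the closure on a grid makes the two features discussed in the text visible: an upward jump of size κ at a1w̄1 + a2w̄2 and a downward jump at q̄, so the function is neither continuous nor monotone when q̄ > a1w̄1 + a2w̄2.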


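The threshold q̄ is given by

\[
\bar{q} = a_1\bar{w}_1 + a_2\bar{w}_2 + \frac{1}{\eta^\star(\beta)} \log\left\{ \frac{\alpha_2 + \min\{\alpha_1, 1-\beta\}}{\alpha_2} \right\}, \qquad (6)
\]

where $\eta^\star(\beta)$ is the unique solution (keeping other parameters constant) to

\[
\left(1 + \frac{\eta\,\Phi(\eta)}{\phi(\eta)}\right)^{-1} \exp\left(-\eta (a_1\bar{w}_1 + a_2\bar{w}_2)\right) = \min\{\alpha_1, 1-\beta\} + \alpha_2. \qquad (7)
\]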

Here, φ(·), Φ(·) are, respectively, the standard normal density and distribution functions. The

constant κ can be any

0


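The choice of the parameter q̄, combined with the optimal staffing in Theorem 1, guarantees that

\[
P\{\hat{Q}_\Sigma^{\xi^\lambda,\lambda}(t) \in [a_1\bar{w}_1 + a_2\bar{w}_2, \bar{q})\} \approx \min\{\alpha_1, 1-\beta\}
\quad \text{and} \quad
P\{\hat{Q}_\Sigma^{\xi^\lambda,\lambda}(t) \in [\bar{q}, \infty)\} \approx \alpha_2.
\]

Then,

\[
P\{W_{\delta,1}^{\xi^\lambda,\lambda}(t) > w_1^\lambda\} \approx P\left\{ \frac{W_{\delta,1}^{\xi^\lambda,\lambda}(t)}{w_1^\lambda} > \frac{W_{\delta,2}^{\xi^\lambda,\lambda}(t)}{w_2^\lambda} \right\} \approx \min\{\alpha_1, 1-\beta\}
\quad \text{and} \quad
P\{W_{\delta,2}^{\xi^\lambda,\lambda}(t) > w_2^\lambda\} \approx \alpha_2
\]

hold and the solution is feasible. In the following theorem, $\pi_{f^\star}^\lambda$ denotes

the tracking policy which applies the tracking function $f^\star$.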

Theorem 3 (asymptotic optimality) Let $N_\star^\lambda$ be as in (4). Then, the sequence of staffing-policy

pairs $\{\xi^\lambda = (N_\star^\lambda, \pi_{f^\star}^\lambda)\}$ is asymptotically optimal.
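The constants entering this solution can be computed numerically. The sketch below solves (7) for η by bisection, using the fact that the left-hand side of (7) decreases from 1 toward 0 as η grows, and then evaluates the threshold q̄ in (6). Function names and the tolerance are assumptions of this example.

```python
import math


def _pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def tail_lhs(eta, q0):
    """Left-hand side of (7) for eta >= 0, with q0 = a1*w1_bar + a2*w2_bar."""
    density = _pdf(eta)
    if density == 0.0:      # numerical underflow for very large eta
        return 0.0
    return math.exp(-eta * q0) / (1.0 + eta * _cdf(eta) / density)


def solve_eta(a1, a2, w1_bar, w2_bar, alpha1, alpha2, beta, tol=1e-10):
    """Solve (7) for eta by bisection on [0, hi]."""
    q0 = a1 * w1_bar + a2 * w2_bar
    budget = min(alpha1, 1.0 - beta) + alpha2
    lo, hi = 0.0, 1.0
    while tail_lhs(hi, q0) > budget:    # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tail_lhs(mid, q0) > budget:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def threshold_q_bar(a1, a2, w1_bar, w2_bar, alpha1, alpha2, beta):
    """Threshold in (6)."""
    eta = solve_eta(a1, a2, w1_bar, w2_bar, alpha1, alpha2, beta)
    budget = min(alpha1, 1.0 - beta) + alpha2
    return a1 * w1_bar + a2 * w2_bar + math.log(budget / alpha2) / eta
```

For β = 1 the logarithm in (6) vanishes and q̄ coincides with a1w̄1 + a2w̄2, so the interval on which the class-1 target is exceeded is empty.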

Remark 2 (on the interplay of cost reduction and prioritization) There are degrees of freedom in

choosing the tracking function – there are various choices that generate the same asymptotic optimality

result. We chose $f^\star$ so that SLD is violated only when the total queue is small (specifically,

less than q̄). With this choice, $f^\star$ has the appealing property that class-1 (the VIP) customers get

superior service when the total congestion in the system is large. Thus, reducing β (from β < 1)

allows for a cost reduction in terms of staffing (recall Theorem 2) while making sure that the VIP

customers get superior service when it really matters.

One thing that may strike the reader as non-standard in Figure 5(Right) is the discontinuity and

non-monotonicity of the tracking function $f^\star$. This is not a consequence of the specific way in which

we choose the functions – as the next theorem shows it cannot be avoided without compromise to

optimality.

For the formal statement, we say that a tracking policy is non-monotone if at least one of the

functions f1(·) and f2(·) is non-monotone. A function f is defined to be piecewise Lipschitz if it has

a finite number of discontinuity points {x1, . . . , xm} and is Lipschitz continuous on each interval

(including [0, x1) and [xm,∞)).

Theorem 4 (non-monotonicity and discontinuity) Assume β < 1 (sub-perfect SLD) and let

f be a piecewise Lipschitz tracking function such that the sequence $\{(N_\star^\lambda, \pi_f^\lambda)\}$ is an asymptotically

optimal sequence of staffing-policy pairs. Then, f is discontinuous and non-monotone.

The optimal staffing allows for a limited “budget” that must be carefully allocated to the three

different constraints (the two TSF constraints and the SLD constraints). For different values of Q̂Σ

one must choose which constraints are “sacrificed” in favor of meeting the others. What we prove

is that the only way to do that while minimizing the total budget is to alternate the priorities and

this, in particular, introduces the non-monotonicity and the discontinuity that are stated in the

theorem.


On the equivalence of index policies and perfect SLD: Theorem 4 proves that, asymptotically,

optimal tracking functions must be discontinuous and non-monotone for β < 1. That is, imposing

monotonicity (which is a property of index policies; see below) is related to requiring perfect SLD

(β = 1). We next show that requiring perfect SLD and restricting the tracking functions to be

monotone is indeed equivalent in the asymptotic sense. More importantly, proving that monotone

tracking rules are equivalent to index policies, allows us to conclude the equivalence between

imposing a perfect SLD constraint and restricting the optimization to use only index policies.

To formalize this result, recall that an index policy is one that, upon service completion at time

t, admits to service a customer from class

\[
i^* = \arg\max_i\ g_i(Q_i(t)),
\]

where g1 and g2 are non-negative increasing functions. We denote by Π(index) ⊂ Π the family of index policies.

Theorem 5 (equivalence of index policies and SLD(1) ) There exists a sequence of staffing-

policy pairs {ξλ = (Nλ, πλ)} that is asymptotically optimal for both formulations:

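(TSF + SLD(1))
\[
\begin{aligned}
\min_{\xi^\lambda}\ & N(\xi^\lambda) \\
\text{s.t.}\ & P\{W_{\delta,1}^{\xi^\lambda,\lambda}(\infty) > w_1^\lambda\} \le \alpha_1, \\
& P\{W_{\delta,2}^{\xi^\lambda,\lambda}(\infty) > w_2^\lambda\} \le \alpha_2, \\
& P\left\{ \frac{W_{\delta,1}^{\xi^\lambda,\lambda}(\infty)}{w_1^\lambda} \le \frac{W_{\delta,2}^{\xi^\lambda,\lambda}(\infty)}{w_2^\lambda} \right\} = 1, \\
& \xi^\lambda \in \mathbb{Z}_+ \times \Pi.
\end{aligned}
\]

(TSF + Index)
\[
\begin{aligned}
\min_{\xi^\lambda}\ & N(\xi^\lambda) \\
\text{s.t.}\ & P\{W_{\delta,1}^{\xi^\lambda,\lambda}(\infty) > w_1^\lambda\} \le \alpha_1, \\
& P\{W_{\delta,2}^{\xi^\lambda,\lambda}(\infty) > w_2^\lambda\} \le \alpha_2, \\
& \xi^\lambda \in \mathbb{Z}_+ \times \Pi(\mathrm{index}).
\end{aligned}
\]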

In particular, letting $N_1^{*,\lambda}$ be the objective function value of TSF+SLD(1) and $N_2^{*,\lambda}$ be that of

TSF+Index, we have that $|N_1^{*,\lambda} - N_2^{*,\lambda}| = o(\sqrt{\lambda})$.

Theorem 5 is argued in three steps stated in Propositions 6-8. The first, Proposition 6, shows that

index policies and monotone tracking policies are separate mathematical representations of the

same policy; the proof is in §5.2.1.

Proposition 6 (equivalence of index policies and monotone tracking policies ) Given a

monotone tracking function f , there exists an index function g such that the monotone tracking

policy with f is equivalent to the index policy with index g. Similarly, given an index function g,

there exists a monotone tracking function f such that the index policy with g is equivalent to the

tracking policy with f . The equivalence is in the sense that at any given state of (Q1 (t) ,Q2 (t)),

both policies take the same action.


Given this equivalence, to prove Theorem 5, it remains to show that TSF+SLD(1) is equivalent

to the following:

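(TSF + MT)
\[
\begin{aligned}
\min_{\xi^\lambda}\ & N(\xi^\lambda) \\
\text{s.t.}\ & P\{W_{\delta,1}^{\xi^\lambda,\lambda}(\infty) > w_1^\lambda\} \le \alpha_1, \\
& P\{W_{\delta,2}^{\xi^\lambda,\lambda}(\infty) > w_2^\lambda\} \le \alpha_2, \\
& \xi^\lambda \in \mathbb{Z}_+ \times \Pi(\mathrm{MT}),
\end{aligned}
\]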

where Π(MT) ⊂ Π is the family of monotone tracking policies. Note that a monotone tracking

function must also be continuous, because x = f1(x) + f2(x). To see this, fix an arbitrary point

x1 > 0 and ε > 0. Then f1(x1) ≤ f1(x1 + ε) ≤ f1(x1) + ε, where the last inequality follows from

x1 + ε − f1(x1 + ε) = f2(x1 + ε) ≥ f2(x1) = x1 − f1(x1). Hence f1(x1 + ε) → f1(x1) as ε → 0. The cases ε < 0 and

f2 are verified similarly.

The equivalence is proved in the next two propositions. The first, Proposition 7, proves that

there exists a monotone tracking policy which, with the staffing level from Theorem 1, satisfies the

constraints in TSF+SLD(1).

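Proposition 7 (nearly optimal monotone solutions for TSF+SLD(1)) For any ϑ > 0,

there exists a monotone tracking function $f^\vartheta$ and a sequence $\{N_\vartheta^\lambda\}$ such that $\{(N_\vartheta^\lambda, \pi_{f^\vartheta}^\lambda)\}$ is asymptotically

feasible for TSF+SLD(1) and $|N_\vartheta^\lambda - N_\star^\lambda| \le \vartheta\sqrt{\lambda}$.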

The nearly optimal tracking function, whose existence is established in Proposition 7, is depicted

in Figure 6 and explicitly specified as follows:

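\[
f_1^\vartheta(x) =
\begin{cases}
0, & 0 \le x < \dfrac{a_1\bar{w}_1 + a_2\bar{w}_2}{a_1\bar{w}_1}\,\kappa_\vartheta, \\[2ex]
\dfrac{a_1\bar{w}_1}{a_1\bar{w}_1 + a_2\bar{w}_2}\,x - \kappa_\vartheta, & \dfrac{a_1\bar{w}_1 + a_2\bar{w}_2}{a_1\bar{w}_1}\,\kappa_\vartheta \le x < a_1\bar{w}_1 + a_2\bar{w}_2, \\[2ex]
a_1\bar{w}_1 - \kappa_\vartheta, & a_1\bar{w}_1 + a_2\bar{w}_2 \le x.
\end{cases}
\]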

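In the figure the constant $q^{\vartheta}_b$ is the point of intersection between the tracking function $f^{\vartheta}_1(x)$ and the line $x - a_2\bar w_2$. By (8), $W^{\xi^{\lambda},\lambda}_{\delta,1}(t) \le w^{\lambda}_1$, $W^{\xi^{\lambda},\lambda}_{\delta,2}(t) \le w^{\lambda}_2$ and $W^{\xi^{\lambda},\lambda}_{\delta,1}(t)/w^{\lambda}_1 \le W^{\xi^{\lambda},\lambda}_{\delta,2}(t)/w^{\lambda}_2$ hold at times in which $\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(t) \le q^{\vartheta}_b$, and only the constraint $W^{\xi^{\lambda},\lambda}_{\delta,2}(t) \le w^{\lambda}_2$ is violated when $\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(t) > q^{\vartheta}_b$.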

Notice in Figure 6 that fϑ2 (x) = x− fϑ1 (x)>a2w̄2 when x> qϑb .
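The shape of the monotone tracking function can be checked numerically. The following is a minimal sketch, assuming the three-piece definition of the class-1 component given above; the argument names `a1w1`, `a2w2` and `kappa` are illustrative stand-ins for a1w̄1, a2w̄2 and κϑ, and the closed form for the intersection point is obtained here by solving fϑ1(x) = x − a2w̄2 on the middle piece (it is not stated in the text and presumes κϑ is small enough for the intersection to lie on that piece).

```python
def f_theta_1(x, a1w1, a2w2, kappa):
    """Class-1 component of the monotone tracking function (three linear pieces)."""
    total = a1w1 + a2w2
    lower_break = total * kappa / a1w1
    if x < lower_break:
        return 0.0
    if x < total:
        return a1w1 * x / total - kappa
    return a1w1 - kappa


def f_theta_2(x, a1w1, a2w2, kappa):
    """Class-2 component, defined through x = f1(x) + f2(x)."""
    return x - f_theta_1(x, a1w1, a2w2, kappa)


def q_b(a1w1, a2w2, kappa):
    """Point where f_theta_1 meets the line x - a2w2.

    Assumption: kappa * (1/a1w1 + 1/a2w2) <= 1, so that the intersection
    falls on the middle (strictly increasing) piece.
    """
    return (a1w1 + a2w2) * (1.0 - kappa / a2w2)
```

With illustrative values such as `a1w1 = 1.0`, `a2w2 = 2.0` and `kappa = 0.1`, both components are nondecreasing, the class-1 component never reaches `a1w1`, and the class-2 component exceeds `a2w2` only to the right of `q_b`.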

Proposition 8 closes the equivalence argument toward Theorem 5 in showing that, in terms

of staffing, requiring monotonicity is as demanding as requiring SLD(1). Since we identified in

Proposition 7 that a monotone tracking solution is asymptotically optimal for TSF+SLD(1), we

conclude that TSF+MT and TSF+SLD(1) are asymptotically equivalent and, by Proposition 6,

so are TSF+index and TSF+SLD(1).


Figure 6 Proposed tracking function under perfect SLD (β = 1)

Proposition 8 (equivalence of monotone tracking and SLD(1) under TSF constraints)

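Let $\{\xi^{\lambda}_1\}$ be a sequence of asymptotic solutions for TSF+SLD(1) and $\{\xi^{\lambda}_2\}$ be one for TSF+MT. Then,
\[
\liminf_{\lambda\to\infty} \frac{N(\xi^{\lambda}_2) - N(\xi^{\lambda}_1)}{\sqrt{\lambda}} \ge 0.
\]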

Figure 7 Optimal staffing levels (vertical axis: staffing; horizontal axis: β; series: TSF+SLD(β) and TSF+Index)


Remark 3 (the cost of index policies) Figure 7 displays the minimal staffing required to satisfy the TSF constraints P{W1 > 10 sec.} ≤ 0.2 and P{W2 > 10 sec.} ≤ 0.2, when the arrival rate of each class is 50 customers per minute and the average service time is 5 minutes. The asymptotically optimal staffing is derived using Theorem 1. When the SLD constraint is added, the cost changes with the value of β as β varies between 0.65 and 1. This is captured by the solid line. When, instead of imposing the SLD constraint, one restricts attention to index policies, the staffing level is always 517 (captured by the dashed line). The gap between the two lines is the cost of the restriction to index policies. For β = 1, consistent with our equivalence result, there is no gap. For small values of β the gap is 7 servers, which is non-negligible relative to the square root of the system size.

4. Numerical experiments

The purpose of this section is to illustrate the performance of the solutions we derived via the

asymptotic analysis. We use this opportunity to underscore some important aspects of the proposed

(nearly optimal) staffing rule.

Altogether, we consider 40 parameter combinations (4 sets, each consisting of 10 scenarios). Across scenarios in each set, we vary the arrival rates λ1, λ2 and the target parameters w1, w2 as in Table 1. These parameter combinations are the same across sets, i.e., the parameters of Scenario k of Set i are the same as those of Scenario k of Set j. We also vary α1, α2 for the TSF constraints and the SLD target β for the SLD constraint as in Table 2. Set 1 and Set 2 differ in that Set 1 has no SLD constraint (β = 0). Set 1 and Set 3 differ in that Set 3 has smaller values of α1 (a tighter TSF for class 1). Lastly, Set 3 and Set 4 differ in that Set 4 requires perfect SLD.

Scenario   1      2      3      4      5      6      7      8      9      10
λ1         125    100    250    200    500    400    500    500    500    500
λ2         125    150    250    300    500    600    500    500    500    500
w1         0.2    0.2    0.1    0.1    0.05   0.05   0.025  0.025  0.1    0.1
w2         0.3    0.3    0.15   0.15   0.07   0.07   0.03   0.03   0.2    0.2

Table 1 Common parameters across all sets

The mean service time is set to 1 time unit throughout (all parameters are specified in the same

time units, so that λ= 125 for example corresponds to a rate of 125 per time unit, be it an hour

or a minute). The number of servers (staffing) is determined via our formula (4)

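\[
N^{\lambda}_{\star}(\lambda_1, \lambda_2, w_1, w_2, \alpha_1, \alpha_2, \beta) = \min\left\{N \in \mathbb{Z}_+ : P\left\{Q^{N,\lambda}(\infty) \ge \lambda_1 w^{\lambda}_1 + \lambda_2 w^{\lambda}_2\right\} \le \min\{\alpha_1, 1-\beta\} + \alpha_2\right\}.
\]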
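The staffing rule above can be evaluated directly. The sketch below is one possible way to do so and is not the authors' code: it assumes the standard M/M/N steady-state result that the number waiting satisfies P{Q ≥ k} = C(N, a) ρ^k for integer k ≥ 1, where a = λ/µ is the offered load, ρ = a/N and C is the Erlang-C delay probability (computed through the Erlang-B recursion). The function names and the default `mu=1.0` are illustrative choices, and the threshold λ1w1 + λ2w2 is assumed to be strictly positive.

```python
import math


def erlang_c(n_servers, offered_load):
    """Delay probability in an M/M/N queue; meaningful for n_servers > offered_load."""
    b = 1.0  # Erlang-B blocking probability with zero servers
    for n in range(1, n_servers + 1):
        b = offered_load * b / (n + offered_load * b)
    rho = offered_load / n_servers
    return b / (1.0 - rho * (1.0 - b))


def queue_tail(n_servers, lam, mu, x):
    """P{Q >= x} for the stationary number of waiting customers, x > 0."""
    a = lam / mu
    rho = a / n_servers
    return erlang_c(n_servers, a) * rho ** math.ceil(x)


def staffing_level(lam1, lam2, w1, w2, alpha1, alpha2, beta, mu=1.0):
    """Smallest N with P{Q >= lam1*w1 + lam2*w2} <= min(alpha1, 1-beta) + alpha2."""
    lam = lam1 + lam2
    threshold = lam1 * w1 + lam2 * w2
    target = min(alpha1, 1.0 - beta) + alpha2
    n = math.floor(lam / mu) + 1  # smallest staffing with a stable queue
    while queue_tail(n, lam, mu, threshold) > target:
        n += 1
    return n
```

For instance, `staffing_level(125, 125, 0.2, 0.3, 0.3, 0.2, 0.0)` evaluates the rule with the Scenario 1 parameters of Set 1; raising `beta` above 1 − α1 tightens the right-hand side and can only increase the returned staffing.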


Scenario      1      2      3      4      5      6      7      8      9      10
Set 1   α1    0.3    0.3    0.3    0.3    0.3    0.3    0.3    0.2    0.3    0.2
        α2    0.2    0.2    0.2    0.2    0.2    0.2    0.2    0.15   0.2    0.15
        β     0 (No SLD constraint)
Set 2   α1    0.3    0.3    0.3    0.3    0.3    0.3    0.3    0.2    0.3    0.2
        α2    0.2    0.2    0.2    0.2    0.2    0.2    0.2    0.15   0.2    0.15
        β     0.925  0.925  0.925  0.925  0.925  0.925  0.925  0.95   0.925  0.95
Set 3   α1    0.15   0.15   0.15   0.15   0.15   0.15   0.15   0.1    0.15   0.1
        α2    0.2    0.2    0.2    0.2    0.2    0.2    0.2    0.15   0.2    0.15
        β     0 (No SLD constraint)
Set 4   α1    0.15   0.15   0.15   0.15   0.15   0.15   0.15   0.1    0.15   0.1
        α2    0.2    0.2    0.2    0.2    0.2    0.2    0.2    0.15   0.2    0.15
        β     1 (Perfect SLD constraint)

Table 2 Probability targets for each set

Given $N^{\lambda}_{\star}$ (different for each parameter combination), we compute $\eta = (N^{\lambda}_{\star} - (\lambda_1 + \lambda_2)/\mu)/\sqrt{\lambda}$ and use it to design the proposed tracking functions as in (5). This gives us the tracking policy which, as before, we denote by π.

We built a simulation model in ARENA. For each parameter combination, we simulate a sample path of the two-class queue with the proposed solution $(N^{\lambda}_{\star}, \pi)(\lambda_1, \lambda_2, w_1, w_2, \alpha_1, \alpha_2, \beta)$ for T = 40000 time units, corresponding to 8-40 million arrivals depending on the parameter combination. At the end of each simulation run, we record
\[
\frac{1}{A_i(T)} \sum_{k=1}^{A_i(T)} 1\{w_{i,k} > w_i(1-\epsilon)\},
\]
where $A_i(T)$ is the number of class-i customers that arrived over the simulation horizon and $w_{i,k}$ is the actual waiting time of the kth class-i customer processed. Since
\[
\frac{1}{A_i(T)} \sum_{k=1}^{A_i(T)} 1\{w_{i,k} > w_i(1-\epsilon)\} \to P\{W_i(\infty) > w_i(1-\epsilon)\} \quad \text{as } T \to \infty,
\]
this statistic provides a proxy for the performance of the policy with respect to the TSF constraints. Our asymptotic theory says that under the proposed solution $\limsup_{\lambda\to\infty} P\{W^{\lambda}_i(\infty) > w^{\lambda}_i(1-\epsilon)\} \le \alpha_i(1+\epsilon)$ for i = 1, 2, and hence we check, for each i, whether
\[
\frac{1}{A_i(T)} \sum_{k=1}^{A_i(T)} 1\{w_{i,k} > w_i(1-\epsilon)\} \le \alpha_i(1+\epsilon).
\]

For the SLD constraint, we fix δ = 0.05 and compute $W_{\delta,i}(t)$ at intervals of size δ as in equation (1). Specifically, for each m = 1, . . . , M = 40000/0.05 = 800000, we compute
\[
W_{\delta,i}(m\delta) := \frac{1}{A_i(m\delta+\delta) - A_i(m\delta)} \sum_{k=A_i(m\delta)+1}^{A_i(m\delta+\delta)} w_{i,k}.
\]
With δ fixed, the long-run average
\[
\frac{1}{M} \sum_{m=1}^{M} 1\left\{\frac{W_{\delta,1}(m\delta)}{w_1} \le \frac{W_{\delta,2}(m\delta)}{w_2}(1+\epsilon)\right\} \to P\left\{\frac{W_{\delta,1}(\infty)}{w_1} \le \frac{W_{\delta,2}(\infty)}{w_2}(1+\epsilon)\right\} \quad \text{as } M \to \infty
\]
serves as a proxy for the SLD, and we check, in the simulation, whether it exceeds β(1 − ε). We set ε = 0.1, which is a reasonably ambitious criterion. The theory says that, as λ grows large with appropriate scaling, we could take ε to be increasingly smaller.

Figure 8 Set 1 (TSF formulation (β = 0)): (Left) Performance with recommended staffing; (Right) With one server added for each violating scenario
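The two simulation statistics described above can be computed from recorded arrival times and waiting times. The following is a minimal sketch and not the ARENA model itself: the function names are illustrative, windows are taken as [mδ, (m+1)δ) rather than (mδ, (m+1)δ], and windows in which either class has no arrivals are skipped (an assumption made here, since the local average is undefined on such windows).

```python
def realized_tsf(waits, w_target, eps=0.1):
    """Fraction of customers whose wait exceeds w_target * (1 - eps)."""
    return sum(1 for w in waits if w > w_target * (1.0 - eps)) / len(waits)


def window_means(arrival_times, waits, delta, n_windows):
    """Mean wait of the customers arriving in each window of length delta."""
    sums = [0.0] * n_windows
    counts = [0] * n_windows
    for t, w in zip(arrival_times, waits):
        m = int(t // delta)
        if 0 <= m < n_windows:
            sums[m] += w
            counts[m] += 1
    # None marks a window without arrivals, where the average is undefined.
    return [s / c if c > 0 else None for s, c in zip(sums, counts)]


def realized_sld(arr1, waits1, arr2, waits2, w1, w2, delta, n_windows, eps=0.1):
    """Fraction of windows with W_{delta,1}/w1 <= (W_{delta,2}/w2) * (1 + eps)."""
    m1 = window_means(arr1, waits1, delta, n_windows)
    m2 = window_means(arr2, waits2, delta, n_windows)
    pairs = [(u, v) for u, v in zip(m1, m2) if u is not None and v is not None]
    hits = sum(1 for u, v in pairs if u / w1 <= (v / w2) * (1.0 + eps))
    return hits / len(pairs)
```

A scenario would then be flagged as meeting the criteria when `realized_tsf` is at most αi(1 + ε) for each class and `realized_sld` is at least β(1 − ε).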

Figure 8 depicts the results for Set 1. For each scenario, we plot two bar series corresponding to the realized TSFs, $\frac{1}{A_i(T)} \sum_{k=1}^{A_i(T)} 1\{w_{i,k} > w_i(1-\epsilon)\}$ for i = 1, 2, while the realized SLD, $\frac{1}{M} \sum_{m=1}^{M} 1\{W_{\delta,1}(m\delta)/w_1 \le (W_{\delta,2}(m\delta)/w_2)(1+\epsilon)\}$, is captured by the black-circle series. Notice that there is no target on the SLD but, still, we can compute the outcome.

The labels report the values (the TSF series correspond to the left y-axis and the SLD to the right y-axis). Colored in white are the metrics that do not meet the feasibility criterion αi(1 + ε) of the TSF constraints. The performance is rather impressive. There are 5 violations (out of 20 constraints) and, with the exception of the violations of class 1 in scenario 6 and of class 2 in scenario 7 (0.345 instead of 0.33 and 0.229 instead of 0.22), they are truly small (as in 0.332 instead of the targeted 0.33 in scenario 5 or 0.221 instead of 0.22 in scenario 10). Scenarios 5-8 have the smallest wi targets (e.g. w1 = 0.025 and w2 = 0.03) and are hence the most sensitive. The violations are resolved with the addition of a single server, as plotted in the graph on the right of the figure.

Notably, the SLD is very high (around 0.7 or higher) even though we imposed no SLD constraint. This is an opportunity to recall Figure 7, where the required number of servers is invariant to β as long as 1 − β ≥ α1. This means one gets “for free” an SLD of up to β = 0.7 when α1 = 0.3 and of up to β = 0.8 when α1 = 0.2 (as in Scenarios 8 and 10 of Table 2). Our policy uses this allowance.

We run 10 simulations with newly computed staffing levels and policies for Set 2 to consider TSF+SLD(β) with positive β’s. The results are reported in Figure 9. In this case, the TSF criteria are met in all 10 scenarios with the exception of minor violations in the first two scenarios (which disappear with the addition of one server). The SLD is mostly around 0.9, which is close to the target and always above the criterion of β(1 − ε). We notice again that α1 matters only through min{α1, 1 − β}: for a given β target the solution can afford to have α1 as small as 1 − β, and the policy uses some of this allowance. Thus, even though α1 > α2, the performance obtained is such that P{W1 > w1} < P{W2 > w2} in all scenarios.

Figure 9 (Left) Performance with proposed staffing for Set 2 (TSF+SLD(β)); (Right) The cost of SLD (normalized safety capacity by scenario, Set 2 versus Set 1)

Finally, on the right of Figure 9 we visualize the cost difference of TSF vs. TSF+SLD(β) by comparing the staffing level for TSF that underlies the graph on the right of Figure 8 and the staffing level of TSF+SLD(β) that underlies the left graph of Figure 9. We plot the normalized safety capacity (i.e., the servers required above the offered load λ/µ). These numbers are proxies for the square-root coefficient η∗ in Theorem 2 and capture the cost of SLD.

The parameter combinations in Set 3 are obtained from Set 1 by introducing more demanding values of α1 (e.g. 0.15 instead of 0.3). This further underscores the interaction between α1 and β. The results are reported in Figure 10. There are negligible violations (0.166 instead of 0.165) for class 1 in scenarios 5 and 9. The more substantial violations are in scenarios 3 and 7. The violations are concentrated in scenarios with the smallest wi (w1 = 0.025 and w2 = 0.03). An extra server corrects for some of this but leaves slight violations for class 1. This should not be surprising: with very small w and α values the system is pushed away from heavy traffic and hence the approximations are somewhat less precise. Finally, even though β = 0, our policy again extracts as much SLD as it can get “for free”, which is 1 − α1 (0.85 or 0.9 depending on the scenario).

Figure 10 Set 3 (TSF (β = 0), tight α): (Left) Performance with recommended staffing; (Right) With one server added for each violating scenario

In Set 4, we consider the case of perfect SLD (β = 1): we use the same data as in Set 3 but impose β = 1 as a requirement in all scenarios. The policy is a monotone policy as in Figure 6. The performance is reported in Figure 11. The TSF targets are met with excess under the recommended staffing and the SLD is very close to 1 in all scenarios. The graph on the right is a visualization of SLD and the workings of the asymptotic arguments. One informally expects (and our proofs are built on formally establishing) that $W_{\delta,i}(t)/w_i \approx W_i(t)/w_i \approx Q_i(t)/(\lambda_i w_i)$ because of the sample-path Little’s law and since δ is small. With SLD close to 1 we expect
\[
\frac{Q_1(t)}{\lambda_1 w_1} \le \frac{Q_2(t)}{\lambda_2 w_2}, \quad t \ge 0.
\]
The graph on the right visualizes this asymptotic argument. We simulate Scenario 1 (λ1 = λ2 = 125) of Set 4 and zoom in on the time interval [460, 480] to obtain a clear view.

Figure 11 Set 4 (TSF + Perfect SLD (β = 1)): (Left) Performance with recommended staffing; (Right) Sample-path SLD (horizontal axis: simulation time)


5. Proofs

This section is dedicated to the proofs of our main results. §5.1 proves Theorem 3 which is on the

asymptotic optimality of our proposed staffing and policy. §5.2 contains proofs of the theorems

that deal with the equivalence of index policies and perfect SLD. The proofs of other theorems and

auxiliary lemmas are in Soh and Gurvich [2015].

5.1. Proof of Theorem 3:

We divide the lengthy proof into sub-modules as follows: we first prove an equivalence between

the waiting time formulation (3) and a queue-based formulation. This may be somewhat expected

given (8). The challenge, however, is to establish such a “Little’s law” result at the formulation

level (rather than for a given policy as is typically the case). It is in this step where considering

local averages is instrumental. Building on this asymptotic equivalence, a lower bound for the

local-average-queue-based formulation provides an asymptotic lower bound for the waiting-time

formulation. We identify the staffing recommendation Nλ? in (4) as such a bound; see §5.1.1.

In §5.1.2 we proceed to show that the tracking policy with our proposed function $f^{\star}$ indeed has the desired asymptotic properties in that, when used, it results in $\hat Q^{\xi^{\lambda},\lambda}_i(t) \approx f^{\star}_i(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(t))$. Whereas this type of so-called state-space collapse is, by now, a standard result, a challenge arises here from the possible discontinuity of the tracking function $f^{\star}$.

We combine the pieces in §5.1.3 to have the proof of Theorem 3. We prove there that our proposed

solution is asymptotically feasible. Since its staffing component (and, in particular, its cost) is a

lower bound on any such solutions, we conclude that our solution is asymptotically optimal.

5.1.1. From waiting times to queues Before directly showing the relationship in (8), we introduce local average queue lengths, which connect local average waiting times and queue lengths:
\[
Q^{\xi^{\lambda},\lambda}_{\delta,i}(t) = \frac{1}{\delta} \int_t^{t+\delta} Q^{\xi^{\lambda},\lambda}_i(s)\,ds. \tag{9}
\]
The following lemma shows that if a sequence $\{N^{\lambda}_{\star}\}$ is asymptotically feasible for (3) it must (asymptotically) satisfy the appropriate constraints in terms of queues and vice versa. We let $Q^{\xi^{\lambda},\lambda}_{\delta,i}(\infty)$ be the δ-average when the queue length is initialized (at time t = 0) with its steady-state distribution.
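Because a queue-length sample path is a step function, the local average in (9) reduces to a weighted sum of the values visited on [t, t + δ]. The sketch below illustrates this under the assumption that the path is right-continuous and recorded as a list of jump times with the value held from each jump onward; the function and argument names are illustrative.

```python
def local_average(jump_times, values, t, delta):
    """(1/delta) * integral over [t, t + delta] of a right-continuous step path.

    The path equals values[k] on [jump_times[k], jump_times[k + 1]) and
    values[-1] from jump_times[-1] onward; t must be at least jump_times[0].
    """
    end = t + delta
    area = 0.0
    for k, v in enumerate(values):
        left = jump_times[k]
        right = jump_times[k + 1] if k + 1 < len(jump_times) else float("inf")
        overlap = min(right, end) - max(left, t)
        if overlap > 0:
            area += v * overlap
    return area / delta
```

For a path that equals 2 on [0, 1), 4 on [1, 3) and 1 afterwards, the average over [0.5, 2.5] weights the value 2 by 0.5 and the value 4 by 1.5.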

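Lemma 5.1 Let $\{\xi^{\lambda}\}$ be a sequence of admissible staffing-policy pairs. Then the following are equivalent.

1. For each $\epsilon_1 > 0$,
\[
\begin{aligned}
\limsup_{\delta\to 0}\limsup_{\lambda\to\infty} P\left\{W^{\xi^{\lambda},\lambda}_{\delta,1}(\infty) > w^{\lambda}_1(1-\epsilon_1)\right\} &\le \alpha_1 + \epsilon_1,\\
\limsup_{\delta\to 0}\limsup_{\lambda\to\infty} P\left\{W^{\xi^{\lambda},\lambda}_{\delta,2}(\infty) > w^{\lambda}_2(1-\epsilon_1)\right\} &\le \alpha_2 + \epsilon_1,\\
\limsup_{\delta\to 0}\limsup_{\lambda\to\infty} P\left\{\frac{W^{\xi^{\lambda},\lambda}_{\delta,1}(\infty)}{w^{\lambda}_1} \le \frac{W^{\xi^{\lambda},\lambda}_{\delta,2}(\infty)}{w^{\lambda}_2} - \epsilon_1\right\} &\ge \beta - \epsilon_1.
\end{aligned}
\]

2. For each $\epsilon_2 > 0$,
\[
\begin{aligned}
\limsup_{\delta\to 0}\limsup_{\lambda\to\infty} P\left\{Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty) > \lambda_1 w^{\lambda}_1(1-\epsilon_2)\right\} &\le \alpha_1 + \epsilon_2,\\
\limsup_{\delta\to 0}\limsup_{\lambda\to\infty} P\left\{Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty) > \lambda_2 w^{\lambda}_2(1-\epsilon_2)\right\} &\le \alpha_2 + \epsilon_2,\\
\limsup_{\delta\to 0}\limsup_{\lambda\to\infty} P\left\{\frac{Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty)}{\lambda_1 w^{\lambda}_1} \le \frac{Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty)}{\lambda_2 w^{\lambda}_2} - \epsilon_2\right\} &\ge \beta - \epsilon_2.
\end{aligned}
\]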

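Lemma 5.1 proves that replacing the waiting-time constraints with queue-based constraints should yield (asymptotically) identical results in terms of optimality. Thus, we turn to study a queue-based formulation, starting with identifying a lower bound. To that end, given ε, consider the problem
\[
\begin{aligned}
\min_{\xi^{\lambda}}\quad & N(\xi^{\lambda}) \\
\text{s.t.}\quad & P\left\{Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty) > \lambda_1 w^{\lambda}_1(1-\epsilon)\right\} \le \alpha_1 + \epsilon,\\
& P\left\{Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty) > \lambda_2 w^{\lambda}_2(1-\epsilon)\right\} \le \alpha_2 + \epsilon,\\
& P\left\{\frac{Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty)}{\lambda_1 w^{\lambda}_1} \le \frac{Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty)}{\lambda_2 w^{\lambda}_2} - \epsilon\right\} \ge \beta - \epsilon.
\end{aligned}
\tag{10}
\]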

We next state our lower bound result. Recall that QN,λ(∞) is a random variable with the

stationary distribution of the queue length in an M/M/N queue with arrival rate λ, service rate

µ and N(> λ/µ) servers. As before, the variable QN,λδ (∞) is the local average on [0, δ] when the

M/M/N queue is initialized with its steady-state distribution.

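Proposition 9 Fix λ, δ, ε > 0 and let $\xi^{\lambda}_{\delta,\epsilon}$ be a feasible solution for (10). Define
\[
N^{\lambda}_{\star}(\delta, \epsilon) = \min\left\{N \in \mathbb{Z}_+ : P\left\{Q^{N,\lambda}_{\delta}(\infty) \ge \lambda_1 w^{\lambda}_1 + \lambda_2 w^{\lambda}_2\right\} \le \min\{\alpha_1, 1-\beta\} + \alpha_2 + 2\epsilon\right\}.
\]
Then, $N^{\lambda}_{\star}(\delta, \epsilon) \le N(\xi^{\lambda}_{\delta,\epsilon})$, where $N(\xi^{\lambda}_{\delta,\epsilon})$ is the staffing component of $\xi^{\lambda}_{\delta,\epsilon}$.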

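Proof: Define the following events on the underlying probability space:
\[
\begin{aligned}
A &:= \left\{Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty) + Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty) \le \lambda_1 w^{\lambda}_1 + \lambda_2 w^{\lambda}_2\right\},\\
B &:= \left\{Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty) + Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty) > \lambda_1 w^{\lambda}_1 + \lambda_2 w^{\lambda}_2,\ Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty) > \lambda_2 w^{\lambda}_2\right\},\\
C &:= \left\{Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty) + Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty) > \lambda_1 w^{\lambda}_1 + \lambda_2 w^{\lambda}_2,\ Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty) \le \lambda_2 w^{\lambda}_2\right\}.
\end{aligned}
\]
These events form a partition of the space. Further, on C,
\[
\begin{aligned}
Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty) &> \lambda_1 w^{\lambda}_1 + \lambda_2 w^{\lambda}_2 - Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty)\\
&= \left(\frac{\lambda_1 w^{\lambda}_1 + \lambda_2 w^{\lambda}_2}{\lambda_2 w^{\lambda}_2}\right)\left(\lambda_2 w^{\lambda}_2 - Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty)\right) + \frac{\lambda_1 w^{\lambda}_1}{\lambda_2 w^{\lambda}_2}\, Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty)\\
&\ge \frac{\lambda_1 w^{\lambda}_1}{\lambda_2 w^{\lambda}_2}\, Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty).
\end{aligned}
\]
Also, since $Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty) \le \lambda_2 w^{\lambda}_2$ on C, it is evident that $Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty) > \lambda_1 w^{\lambda}_1$ on C. In turn,
\[
C \subseteq \left\{\frac{Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty)}{\lambda_1 w^{\lambda}_1} > \frac{Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty)}{\lambda_2 w^{\lambda}_2}\right\} \cap \left\{Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty) > \lambda_1 w^{\lambda}_1\right\}. \tag{11}
\]
Using these definitions and the assumed feasibility of the staffing-policy pair,
\[
\begin{aligned}
P\{B\} &= P\left\{Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty) + Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty) > \lambda_1 w^{\lambda}_1 + \lambda_2 w^{\lambda}_2,\ Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty) > \lambda_2 w^{\lambda}_2\right\}\\
&\le P\left\{Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty) > \lambda_2 w^{\lambda}_2\right\}
\le P\left\{Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty) > \lambda_2 w^{\lambda}_2(1-\epsilon)\right\} \le \alpha_2 + \epsilon,
\end{aligned}
\]
and
\[
\begin{aligned}
P\{C\} &\le \min\left\{P\left\{\frac{Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty)}{\lambda_1 w^{\lambda}_1} > \frac{Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty)}{\lambda_2 w^{\lambda}_2}\right\},\ P\left\{Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty) > \lambda_1 w^{\lambda}_1\right\}\right\}\\
&\le \min\left\{P\left\{\frac{Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty)}{\lambda_1 w^{\lambda}_1} > \frac{Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty)}{\lambda_2 w^{\lambda}_2} - \epsilon\right\},\ P\left\{Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty) > \lambda_1 w^{\lambda}_1(1-\epsilon)\right\}\right\}\\
&\le \min\{\alpha_1 + \epsilon,\ 1 - \beta + \epsilon\},
\end{aligned}
\]
where the first inequality follows from (11). Hence
\[
P\left\{Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty) + Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty) > \lambda_1 w^{\lambda}_1 + \lambda_2 w^{\lambda}_2\right\} = P\{B \cup C\} = P\{B\} + P\{C\} \le \min\{\alpha_1, 1-\beta\} + \alpha_2 + 2\epsilon.
\]
Since the policy is work conserving, $Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty) + Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty)$ has the law of $Q^{N,\lambda}_{\delta}(\infty)$ with $N = N(\xi^{\lambda})$. $N^{\lambda}_{\star}(\delta, \epsilon)$ is the minimum staffing among those that satisfy the above and hence $N^{\lambda}_{\star}(\delta, \epsilon) \le N(\xi^{\lambda}_{\delta,\epsilon})$. □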

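Note that $N^{\lambda}_{\star}(\delta, \epsilon)$ is closely related to our proposed staffing component $N^{\lambda}_{\star}$ in (4). The following lemma guarantees that the staffing level $N^{\lambda}_{\star}$ in (4), combined with a feasible policy, is asymptotically optimal.

Lemma 5.2
\[
\limsup_{\epsilon\to 0}\limsup_{\delta\to 0}\limsup_{\lambda\to\infty} \frac{N^{\lambda}_{\star}(\delta, \epsilon) - N^{\lambda}_{\star}}{\sqrt{\lambda}} \ge 0.
\]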


5.1.2. Asymptotic tracking We show that the local average queue lengths actually “track” the tracking function given by (5). The following lemma is a corollary of a result in Gurvich and Whitt [2009a].

Lemma 5.3 Let f be a Lipschitz continuous function and let $\{\xi^{\lambda}_f\}$ be a sequence of staffing-policy pairs where the staffing $N(\xi^{\lambda}_f)$ follows square-root scaling, i.e., $N(\xi^{\lambda}_f) = \lambda/\mu + \eta\sqrt{\lambda/\mu} + o(\sqrt{\lambda})$ for some η > 0, and a tracking policy constructed from f is used. For each ε > 0, there exists δ′ > 0 such that for all δ with 0 < δ ≤ δ′ the following holds:
\[
\limsup_{\lambda\to\infty} P\left\{\left|\hat Q^{\xi^{\lambda}_f,\lambda}_{\delta,i}(\infty) - \hat Q^{\xi^{\lambda}_f,\lambda}_i(\infty)\right| > \epsilon\right\} < \epsilon.
\]

Notably, Lemma 5.3 requires the smoothness of the tracking functions which is clearly violated

by our tracking function f?. Instead, given a piecewise Lipschitz function we will construct two

other tracking policies that provide lower and upper bounds for our tracking policy and do satisfy

the smoothness requirements.

For the following, fix a piecewise Lipschitz tracking function f . Let Df := {d1, . . . , dn} be the set

of discontinuity points of f (recall that this is a finite set). Also, let

D↑f = {di ∈Df : f1 (di+)− f1 (di−)≥ 0} and D↓f = {di ∈Df : f1 (di+)− f1 (di−)< 0}

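Given ς < min{d1, d2 − d1, ..., dn − dn−1}/2 (ς will be adjusted within our proofs), define the following tracking functions $\bar f^{\varsigma}$ and $\underline f^{\varsigma}$ so that $\bar f^{\varsigma}_1(x) \ge \underline f^{\varsigma}_1(x)$ and $\bar f^{\varsigma}_2(x) \le \underline f^{\varsigma}_2(x)$ for all x ≥ 0:
\[
\bar f^{\varsigma}_1(x) =
\begin{cases}
f(d_i - \varsigma)\left(\dfrac{x - d_i + \varsigma}{\varsigma}\right) + f(d_i+)\left(\dfrac{d_i - x}{\varsigma}\right), & d_i - \varsigma \le x \le d_i,\ d_i \in D^{\uparrow}_f,\\[6pt]
f(d_i-)\left(\dfrac{x - d_i}{\varsigma}\right) + f(d_i + \varsigma)\left(\dfrac{d_i + \varsigma - x}{\varsigma}\right), & d_i \le x \le d_i + \varsigma,\ d_i \in D^{\downarrow}_f,\\[6pt]
f(x), & \text{otherwise},
\end{cases}
\]
\[
\underline f^{\varsigma}_1(x) =
\begin{cases}
f(d_i-)\left(\dfrac{x - d_i}{\varsigma}\right) + f(d_i + \varsigma)\left(\dfrac{d_i + \varsigma - x}{\varsigma}\right), & d_i - \varsigma \le x \le d_i,\ d_i \in D^{\uparrow}_f,\\[6pt]
f(d_i - \varsigma)\left(\dfrac{x - d_i + \varsigma}{\varsigma}\right) + f(d_i+)\left(\dfrac{d_i - x}{\varsigma}\right), & d_i \le x \le d_i + \varsigma,\ d_i \in D^{\downarrow}_f,\\[6pt]
f(x), & \text{otherwise}.
\end{cases}
\]
Note that f(di−) and f(di+) are the left and right limits of f at di, which are well defined because there are finitely many discontinuity points. The functions $\bar f^{\varsigma}_2(x)$ and $\underline f^{\varsigma}_2(x)$ are defined by $x - \bar f^{\varsigma}_1(x)$ and $x - \underline f^{\varsigma}_1(x)$. Having defined these functions, we can now prove the following result. Below, as before, $\pi^{\lambda}_{f^{\star}}$ is the tracking policy with respect to the function $f^{\star}$ in the λth system.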

Proposition 10 uses these smoothed functions to show that the tracking function works. Lemma 5.4, along with Lemma 5.3, is used in the proof of Proposition 10. Given a tracking function f, we denote by $\pi_f$ the tracking policy defined with respect to f. Let $Q^f(t) = (Q^f_1(t), Q^f_2(t))$ be the queue lengths at time t if the policy $\pi_f$ is used.


Lemma 5.4 Fix λ,µ, and N . Fix tracking functions g= (g1, g2) and h= (h1, h2) such that g1 (x)≤

h1 (x) and (consequently) g2 (x) ≥ h2 (x) for all x ≥ 0. Suppose that Qg (0) = Qh (0). There then

exists a construction of the sample paths so that the following inequalities hold almost surely,

Qg1 (t)≤Qh1 (t) and Qg2 (t)≥Qh2 (t) , t≥ 0.

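Proposition 10 For each λ, let $\xi^{\lambda} = (N^{\lambda}, \pi^{\lambda}_{f^{\star}})$ where the sequence $\{N^{\lambda}\}$ satisfies $N^{\lambda} = \lambda/\mu + \eta\sqrt{\lambda/\mu} + o(\sqrt{\lambda})$. Then, for any ε > 0, there exists δ′ > 0 such that for all δ ∈ (0, δ′]
\[
\limsup_{\lambda\to\infty} P\left\{\left|\hat Q^{\xi^{\lambda},\lambda}_{\delta,i}(\infty) - f^{\star}_i\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right)\right| > \epsilon\right\} < \epsilon.
\]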

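Proof: From the staffing $N^{\lambda}$ and the tracking functions $\bar f^{\varsigma}(x)$ and $\underline f^{\varsigma}(x)$, define the staffing and policy pairs $\bar\xi^{\lambda}_{\varsigma} = (N^{\lambda}, \pi^{\lambda}_{\bar f^{\varsigma}})$ and $\underline\xi^{\lambda}_{\varsigma} = (N^{\lambda}, \pi^{\lambda}_{\underline f^{\varsigma}})$. By construction, the tracking function $\bar f^{\varsigma}$ is Lipschitz continuous. Hence, we can apply Theorem 3.1 of Gurvich and Whitt [2009a] to obtain
\[
\left(\hat Q^{\bar\xi^{\lambda}_{\varsigma},\lambda}_1(\infty) - \bar f^{\varsigma}_1\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right),\ \hat Q^{\bar\xi^{\lambda}_{\varsigma},\lambda}_2(\infty) - \bar f^{\varsigma}_2\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right)\right) \Rightarrow (0, 0).
\]
Note that the staffing components of $\xi^{\lambda}$, $\bar\xi^{\lambda}_{\varsigma}$ and $\underline\xi^{\lambda}_{\varsigma}$ are the same and that the total queue length process does not depend on the prioritization policy once the staffing is the same and the policies are all admissible. We then have for each ε > 0 that
\[
\limsup_{\lambda\to\infty} P\left\{\left|\hat Q^{\bar\xi^{\lambda}_{\varsigma},\lambda}_i(\infty) - \bar f^{\varsigma}_i\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right)\right| > \frac{\epsilon}{3}\right\} < \frac{\epsilon}{6}, \quad \text{for } i = 1, 2. \tag{12}
\]
By Lemma 5.3, there exists δ′ > 0 such that for all δ ∈ (0, δ′],
\[
\limsup_{\lambda\to\infty} P\left\{\left|\hat Q^{\bar\xi^{\lambda}_{\varsigma},\lambda}_{\delta,i}(\infty) - \hat Q^{\bar\xi^{\lambda}_{\varsigma},\lambda}_i(\infty)\right| > \frac{\epsilon}{3}\right\} < \frac{\epsilon}{6}. \tag{13}
\]
Define the set
\[
F^{\varsigma} = \left\{x \ge 0 : \left|\bar f^{\varsigma}_i(x) - f^{\star}_i(x)\right| > \frac{\epsilon}{3}\right\}.
\]
By the construction of $\bar f^{\varsigma}$, for each ε > 0, there exists ς > 0 such that
\[
P\left\{\hat Q^{\xi}_{\Sigma}(\infty) \in F^{\varsigma}\right\} < \frac{\epsilon}{6}.
\]
Moreover, since $\hat Q^{\xi}_{\Sigma}(\infty)$, the limit of $\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)$, has a density (see Halfin and Whitt [1981]), $P\{\hat Q^{\xi}_{\Sigma}(\infty) \in \partial F^{\varsigma}\} = 0$, so that the following holds by the Portmanteau Theorem (see Billingsley [1999]):
\[
\lim_{\lambda\to\infty} P\left\{\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty) \in F^{\varsigma}\right\} = P\left\{\hat Q^{\xi}_{\Sigma}(\infty) \in F^{\varsigma}\right\}.
\]
Then, for each ε > 0, there exists ς > 0 that satisfies
\[
\limsup_{\lambda\to\infty} P\left\{\left|\bar f^{\varsigma}_i\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right) - f^{\star}_i\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right)\right| > \frac{\epsilon}{3}\right\} < \frac{\epsilon}{6}. \tag{14}
\]
Combining (12)-(14), we conclude that for each ε > 0, there exist ς > 0 and δ′ > 0 so that for all δ ∈ (0, δ′]
\[
\begin{aligned}
\limsup_{\lambda\to\infty} P\left\{\left|\hat Q^{\bar\xi^{\lambda}_{\varsigma},\lambda}_{\delta,i}(\infty) - f^{\star}_i\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right)\right| > \epsilon\right\}
&\le \limsup_{\lambda\to\infty} P\left\{\left|\hat Q^{\bar\xi^{\lambda}_{\varsigma},\lambda}_{\delta,i}(\infty) - \hat Q^{\bar\xi^{\lambda}_{\varsigma},\lambda}_i(\infty)\right| > \frac{\epsilon}{3}\right\}\\
&\quad + \limsup_{\lambda\to\infty} P\left\{\left|\hat Q^{\bar\xi^{\lambda}_{\varsigma},\lambda}_i(\infty) - \bar f^{\varsigma}_i\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right)\right| > \frac{\epsilon}{3}\right\}\\
&\quad + \limsup_{\lambda\to\infty} P\left\{\left|\bar f^{\varsigma}_i\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right) - f^{\star}_i\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right)\right| > \frac{\epsilon}{3}\right\}\\
&< \frac{\epsilon}{6} + \frac{\epsilon}{6} + \frac{\epsilon}{6} = \frac{\epsilon}{2}.
\end{aligned}
\]
Similarly for $\underline\xi^{\lambda}_{\varsigma}$,
\[
\limsup_{\lambda\to\infty} P\left\{\left|\hat Q^{\underline\xi^{\lambda}_{\varsigma},\lambda}_{\delta,i}(\infty) - f^{\star}_i\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right)\right| > \epsilon\right\} < \frac{\epsilon}{2}.
\]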

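Finally, note that by construction $\underline f^{\varsigma}_1(x) \le f^{\star}_1(x) \le \bar f^{\varsigma}_1(x)$ for all x ≥ 0 so that, by Lemma 5.4, $\hat Q^{\underline\xi^{\lambda}_{\varsigma},\lambda}_{\delta,1}(\infty) \le \hat Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty) \le \hat Q^{\bar\xi^{\lambda}_{\varsigma},\lambda}_{\delta,1}(\infty)$ and, in turn, there exists δ′ such that for all δ ∈ (0, δ′]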

limsupλ→∞

P{∣∣∣Q̂ξλ,λδ,1 (∞)− f?1 (Q̂ξλ,λΣ (∞))∣∣∣> �}

≤ limsupλ→∞

P{Q̂ξ

λ,λδ,1 (∞)− f?1

(Q̂ξ

λ,λΣ (∞)

)> �}

+ limsupλ→∞

P{Q̂ξ

λ,λδ,1 (∞)− f?1

(Q̂ξ

λ,λΣ (∞)

) �}

+ limsupλ→∞

P{Q̂ξςλ,λ

δ,1 (∞)− f?1(Q̂ξ

λ,λΣ (∞)

) 0, (15)

and a steady-state exists for X. Furthermore,

limsupλ→∞

1√λE[Qξ

λ,λ1 (∞) +Q

ξλ,λ2 (∞)]


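5.1.3. Asymptotic feasibility and optimality: Proof of Theorem 3 We start with the asymptotic feasibility. Specifically, we argue that with $\xi^{\lambda}$ being our proposed staffing-policy pair, it holds that for all ε > 0, there exists δ′ > 0 such that for all δ ∈ [0, δ′), the following holds:
\[
\begin{aligned}
\limsup_{\lambda\to\infty} P\left\{Q^{\xi^{\lambda},\lambda}_{\delta,i}(\infty) > \lambda_i w^{\lambda}_i(1-\epsilon)\right\} &\le \alpha_i + \epsilon, \quad i = 1, 2,\\
\limsup_{\lambda\to\infty} P\left\{\frac{Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty)}{\lambda_1 w^{\lambda}_1} \le \frac{Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty)}{\lambda_2 w^{\lambda}_2} - \epsilon\right\} &\ge \beta - \epsilon.
\end{aligned}
\]
Asymptotic feasibility for the waiting-time formulation then follows from Lemma 5.1. The above formulation is equivalent to the following:
\[
\begin{aligned}
\limsup_{\lambda\to\infty} P\left\{\hat Q^{\xi^{\lambda},\lambda}_{\delta,i}(\infty) > a_i\bar w_i(1-\epsilon)\right\} &\le \alpha_i + \epsilon, \quad i = 1, 2,\\
\limsup_{\lambda\to\infty} P\left\{\frac{\hat Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty)}{a_1\bar w_1} \le \frac{\hat Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty)}{a_2\bar w_2} - \epsilon\right\} &\ge \beta - \epsilon.
\end{aligned}
\]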

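The following inclusions follow directly from the structure of $f^{\star}$. Recall that $\hat Q^{\xi}_{\Sigma}$ denotes the limit process of $\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}$.
\[
\begin{aligned}
\left\{f^{\star}_1\left(\hat Q^{\xi}_{\Sigma}(\infty)\right) - a_1\bar w_1 \ge 0\right\} &\subseteq \left\{\hat Q^{\xi}_{\Sigma}(\infty) \in [a_1\bar w_1 + a_2\bar w_2, \bar q)\right\},\\
\left\{f^{\star}_2\left(\hat Q^{\xi}_{\Sigma}(\infty)\right) - a_2\bar w_2 \ge 0\right\} &\subseteq \left\{\hat Q^{\xi}_{\Sigma}(\infty) \in [\bar q, \infty)\right\},\\
\text{and}\quad \left\{\frac{f^{\star}_1\left(\hat Q^{\xi}_{\Sigma}(\infty)\right)}{a_1\bar w_1} - \frac{f^{\star}_2\left(\hat Q^{\xi}_{\Sigma}(\infty)\right)}{a_2\bar w_2} \ge 0\right\} &\subseteq \left\{\hat Q^{\xi}_{\Sigma}(\infty) \in [a_1\bar w_1 + a_2\bar w_2, \bar q)\right\}.
\end{aligned}
\]
Consequently,
\[
\begin{aligned}
P\left\{f^{\star}_1\left(\hat Q^{\xi}_{\Sigma}(\infty)\right) - a_1\bar w_1 \ge 0\right\} &\le P\left\{\hat Q^{\xi}_{\Sigma}(\infty) \in [a_1\bar w_1 + a_2\bar w_2, \bar q)\right\},\\
P\left\{f^{\star}_2\left(\hat Q^{\xi}_{\Sigma}(\infty)\right) - a_2\bar w_2 \ge 0\right\} &\le P\left\{\hat Q^{\xi}_{\Sigma}(\infty) \in [\bar q, \infty)\right\},\\
P\left\{\frac{f^{\star}_1\left(\hat Q^{\xi}_{\Sigma}(\infty)\right)}{a_1\bar w_1} - \frac{f^{\star}_2\left(\hat Q^{\xi}_{\Sigma}(\infty)\right)}{a_2\bar w_2} \ge 0\right\} &\le P\left\{\hat Q^{\xi}_{\Sigma}(\infty) \in [a_1\bar w_1 + a_2\bar w_2, \bar q)\right\}.
\end{aligned}
\]
Applying Proposition 1 and Theorem 1 of Halfin and Whitt [1981] and using the definitions of $\bar q$ and η in (6) and (7),
\[
\begin{aligned}
P\left\{\hat Q^{\xi}_{\Sigma}(\infty) \in [a_1\bar w_1 + a_2\bar w_2, \bar q)\right\}
&= P\left\{\hat Q^{\xi}_{\Sigma}(\infty) > 0\right\} P\left\{\hat Q^{\xi}_{\Sigma}(\infty) \in [a_1\bar w_1 + a_2\bar w_2, \bar q) \,\middle|\, \hat Q^{\xi}_{\Sigma}(\infty) > 0\right\}\\
&= \left(1 + \frac{\eta\Phi(\eta)}{\phi(\eta)}\right)^{-1} \cdot \left(\exp(-\eta(a_1\bar w_1 + a_2\bar w_2)) - \exp(-\eta\bar q)\right)\\
&= \left(1 + \frac{\eta\Phi(\eta)}{\phi(\eta)}\right)^{-1} \cdot \left(\exp(-\eta(a_1\bar w_1 + a_2\bar w_2)) - \exp(-\eta(a_1\bar w_1 + a_2\bar w_2)) \cdot \exp\left(-\log\left\{\frac{\alpha_2 + \min\{\alpha_1, 1-\beta\}}{\alpha_2}\right\}\right)\right)\\
&= \left(1 + \frac{\eta\Phi(\eta)}{\phi(\eta)}\right)^{-1} \left(1 + \frac{\eta\Phi(\eta)}{\phi(\eta)}\right) (\min\{\alpha_1, 1-\beta\} + \alpha_2) \left(1 - \frac{\alpha_2}{\alpha_2 + \min\{\alpha_1, 1-\beta\}}\right)\\
&= \min\{\alpha_1, 1-\beta\},
\end{aligned}
\]
\[
\begin{aligned}
P\left\{\hat Q^{\xi}_{\Sigma}(\infty) \in [\bar q, \infty)\right\}
&= P\left\{\hat Q^{\xi}_{\Sigma}(\infty) > 0\right\} P\left\{\hat Q^{\xi}_{\Sigma}(\infty) \ge \bar q \,\middle|\, \hat Q^{\xi}_{\Sigma}(\infty) > 0\right\}\\
&= \left(1 + \frac{\eta\Phi(\eta)}{\phi(\eta)}\right)^{-1} \cdot \exp(-\eta\bar q)\\
&= \left(1 + \frac{\eta\Phi(\eta)}{\phi(\eta)}\right)^{-1} \exp(-\eta(a_1\bar w_1 + a_2\bar w_2)) \cdot \frac{\alpha_2}{\alpha_2 + \min\{\alpha_1, 1-\beta\}}\\
&= \alpha_2.
\end{aligned}
\tag{16}
\]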
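The two probabilities computed above can be checked numerically. The sketch below is an illustration under stated assumptions rather than a restatement of (6) and (7), which are not reproduced in this section: it reads off from the substitutions in the display that η satisfies (1 + ηΦ(η)/φ(η))⁻¹ exp(−η(a1w̄1 + a2w̄2)) = min{α1, 1 − β} + α2 and that q̄ = a1w̄1 + a2w̄2 + (1/η) log((α2 + min{α1, 1 − β})/α2). The function names, the bisection bracket and the numeric inputs are all illustrative.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def delay_probability(eta):
    """Halfin-Whitt limit of the probability that the total queue is positive."""
    return 1.0 / (1.0 + eta * Phi(eta) / phi(eta))


def interval_probability(eta, lo, hi):
    """P{lo <= Q < hi} for 0 < lo <= hi, using the exponential conditional tail."""
    upper = 0.0 if math.isinf(hi) else math.exp(-eta * hi)
    return delay_probability(eta) * (math.exp(-eta * lo) - upper)


def solve_eta(total_alpha, s, lo=1e-9, hi=10.0, iters=200):
    """Bisection for eta with delay_probability(eta) * exp(-eta * s) = total_alpha."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if delay_probability(mid) * math.exp(-mid * s) > total_alpha:
            lo = mid  # left side is decreasing in eta, so eta must grow
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With, say, `s = 0.5`, a class-1 allowance of `0.075` and `alpha2 = 0.2`, one can solve for `eta` with `solve_eta(0.275, 0.5)`, set `q_bar = s + math.log(0.275 / 0.2) / eta`, and confirm that the mass on `[s, q_bar)` and on `[q_bar, inf)` matches the two allowances.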

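Let $D_{f^{\star}}$ be the set of discontinuity points of $f^{\star}$. Since $\hat Q^{\xi}_{\Sigma}(\infty)$ has a density, $P\{\hat Q^{\xi}_{\Sigma}(\infty) \in D_{f^{\star}}\} = 0$, so that, by the continuous mapping theorem (see Theorem 3.4.3 of Whitt [2002]),
\[
\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty),\ f^{\star}_1\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right),\ f^{\star}_2\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right)\right) \Rightarrow \left(\hat Q^{\xi}_{\Sigma}(\infty),\ f^{\star}_1\left(\hat Q^{\xi}_{\Sigma}(\infty)\right),\ f^{\star}_2\left(\hat Q^{\xi}_{\Sigma}(\infty)\right)\right).
\]
By the Portmanteau Theorem (and using the zero measure of the discontinuity points), we have
\[
\begin{aligned}
\limsup_{\lambda\to\infty} P\left\{f^{\star}_i\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right) - a_i\bar w_i \ge 0\right\} &= P\left\{f^{\star}_i\left(\hat Q^{\xi}_{\Sigma}(\infty)\right) - a_i\bar w_i \ge 0\right\} \le \alpha_i \quad \text{for } i = 1, 2,\\
\limsup_{\lambda\to\infty} P\left\{\frac{f^{\star}_1\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right)}{a_1\bar w_1} - \frac{f^{\star}_2\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right)}{a_2\bar w_2} \ge 0\right\} &= P\left\{\frac{f^{\star}_1\left(\hat Q^{\xi}_{\Sigma}(\infty)\right)}{a_1\bar w_1} - \frac{f^{\star}_2\left(\hat Q^{\xi}_{\Sigma}(\infty)\right)}{a_2\bar w_2} \ge 0\right\} \le 1 - \beta.
\end{aligned}
\]
By Proposition 10, for each ε > 0 there exists δ′ > 0 such that for all δ ∈ (0, δ′] and i = 1, 2,
\[
\limsup_{\lambda\to\infty} P\left\{\left|\hat Q^{\xi^{\lambda},\lambda}_{\delta,i}(\infty) - f^{\star}_i\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right)\right| > \epsilon \min\left\{\frac{a_1\bar w_1}{2}, \frac{a_2\bar w_2}{2}\right\}\right\} < \frac{\epsilon}{2},
\]
\[
\begin{aligned}
\limsup_{\lambda\to\infty} P\left\{\hat Q^{\xi^{\lambda},\lambda}_{\delta,i}(\infty) \ge a_i\bar w_i(1-\epsilon)\right\}
&\le \limsup_{\lambda\to\infty} P\left\{\left|\hat Q^{\xi^{\lambda},\lambda}_{\delta,i}(\infty) - f^{\star}_i\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right)\right| > \epsilon \min\left\{\frac{a_1\bar w_1}{2}, \frac{a_2\bar w_2}{2}\right\}\right\}\\
&\quad + \limsup_{\lambda\to\infty} P\left\{f^{\star}_i\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right) \ge a_i\bar w_i\right\}\\
&\le \frac{\epsilon}{2} + \alpha_i \le \alpha_i + \epsilon,
\end{aligned}
\]
and
\[
\begin{aligned}
\limsup_{\lambda\to\infty} P\left\{\frac{\hat Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty)}{a_1\bar w_1} > \frac{\hat Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty)}{a_2\bar w_2} - \epsilon\right\}
&\le \limsup_{\lambda\to\infty} P\left\{\left|\frac{\hat Q^{\xi^{\lambda},\lambda}_{\delta,1}(\infty)}{a_1\bar w_1} - \frac{f^{\star}_1\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right)}{a_1\bar w_1}\right| > \frac{\epsilon}{a_1\bar w_1} \min\left\{\frac{a_1\bar w_1}{2}, \frac{a_2\bar w_2}{2}\right\}\right\}\\
&\quad + \limsup_{\lambda\to\infty} P\left\{\frac{f^{\star}_1\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right)}{a_1\bar w_1} - \frac{f^{\star}_2\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right)}{a_2\bar w_2} \ge 0\right\}\\
&\quad + \limsup_{\lambda\to\infty} P\left\{\left|\frac{\hat Q^{\xi^{\lambda},\lambda}_{\delta,2}(\infty)}{a_2\bar w_2} - \frac{f^{\star}_2\left(\hat Q^{\xi^{\lambda},\lambda}_{\Sigma}(\infty)\right)}{a_2\bar w_2}\right| > \frac{\epsilon}{a_2\bar w_2} \min\left\{\frac{a_1\bar w_1}{2}, \frac{a_2\bar w_2}{2}\right\}\right\}\\
&\le \frac{\epsilon}{2} + 1 - \beta + \frac{\epsilon}{2} \le 1 - \beta + \epsilon.
\end{aligned}
\]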

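This concludes the feasibility argument and we turn to optimality. Let $\{\tilde\xi^{\lambda}\}$ be another asymptotically feasible sequence. By the definition of asymptotic feasibility, for any ε > 0 there exists a δ1 > 0 such that for all δ ∈ (0, δ1] the following holds:
\[
\begin{aligned}
\limsup_{\lambda\to\infty} P\left\{\hat Q^{\tilde\xi^{\lambda},\lambda}_{\delta,i}(\infty) > a_i\bar w_i(1-\epsilon)\right\} &\le \alpha_i + \epsilon, \quad i = 1, 2,\\
\limsup_{\lambda\to\infty} P\left\{\frac{\hat Q^{\tilde\xi^{\lambda},\lambda}_{\delta,1}(\infty)}{a_1\bar w_1} \le \frac{\hat Q^{\tilde\xi^{\lambda},\lambda}_{\delta,2}(\infty)}{a_2\bar w_2} - \epsilon\right\} &\ge \beta - \epsilon.
\end{aligned}
\]
Thus, given δ and ε we have by Proposition 9 that $N(\tilde\xi^{\lambda}) \ge N^{\lambda}_{\star}(\delta, 2\epsilon)$. In particular, given δ, ε > 0,
\[
\begin{aligned}
\limsup_{\lambda\to\infty} \frac{N^{\lambda}_{\star} - N(\tilde\xi^{\lambda})}{\sqrt{\lambda}}
&\le \limsup_{\lambda\to\infty} \frac{N^{\lambda}_{\star} - N^{\lambda}_{\star}(\delta, 2\epsilon)}{\sqrt{\lambda}} + \limsup_{\lambda\to\infty} \frac{N^{\lambda}_{\star}(\delta, 2\epsilon) - N(\tilde\xi^{\lambda})}{\sqrt{\lambda}}\\
&\le \limsup_{\lambda\to\infty} \frac{N^{\lambda}_{\star} - N^{\lambda}_{\star}(\delta, 2\epsilon)}{\sqrt{\lambda}}.
\end{aligned}
\]
Finally, since ε, δ are arbitrary, we have by Lemma 5.2 that
\[
\limsup_{\epsilon\to 0}\limsup_{\delta\to 0}\limsup_{\lambda\to\infty} \left[N^{\lambda}_{\star} - N(\tilde\xi^{\lambda})\right]^{+} \le 0,
\]
which, together with the formerly established asymptotic feasibility, establishes the asymptotic optimality of $\xi^{\lambda} = (N^{\lambda}_{\star}, \pi^{\lambda}_{f^{\star}})$. □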

5.2. Proofs on the equivalence between perfect SLD and index policies

5.2.1. Index rule and tracking policy: Proof of Proposition 6 For a given monotone

tracking function f = (f1, f2), define the generalized inverses of f1 and f2 as follows.

g1 (y) := sup{x≥ 0 : f1 (x)≤ y} , and g2 (y) := inf {x≥ 0 : f2 (x)≥ y} .

We will show that the tracking policy with f = (f1, f2) and the index policy with g1 and g2 are equivalent. The index policy assigns class 1 to the server if g1(Q1(t)) ≥ g2(Q2(t)) and chooses class 2 if g1(Q1(t)) < g2(Q2(t)).

Suppose class 1 is to be served at time t under the tracking policy f. Then

Q2(t) ≤ f2(QΣ(t)) and f1(QΣ(t)) ≤ Q1(t).

The functions g1 and g2 are increasing and gi(fi(x)) = x for i = 1, 2. Hence applying g2 to the first inequality and g1 to the second one results in

g2(Q2(t)) ≤ QΣ(t) and QΣ(t) ≤ g1(Q1(t)).

Combining the two, we see that class 1 is also served under the index policy with g1 and g2. The case where class 2 is to be served can be proved similarly and hence the first statement of the proposition is proved.
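The first direction of the equivalence can be illustrated on a toy example. The sketch below assumes a hypothetical linear monotone tracking function f1(x) = x/3, f2(x) = 2x/3 (chosen only for illustration), for which the generalized inverses reduce to ordinary inverses, and checks on a grid of integer queue lengths that the tracking rule "serve class 1 when Q1 ≥ f1(QΣ)" and the index rule "serve class 1 when g1(Q1) ≥ g2(Q2)" agree. Exact rational arithmetic avoids rounding at ties.

```python
from fractions import Fraction


def f1(x):
    """Hypothetical tracking target for class 1: one third of the total queue."""
    return Fraction(x) / 3


def f2(x):
    """Tracking target for class 2, so that f1(x) + f2(x) = x."""
    return Fraction(x) - f1(x)


def g1(y):
    """Inverse of f1 (the generalized inverse coincides with it here)."""
    return 3 * Fraction(y)


def g2(y):
    """Inverse of f2."""
    return Fraction(3, 2) * Fraction(y)


def tracking_serves_class_1(q1, q2):
    """Tracking rule: class 1 is served when its queue is at or above its target."""
    return q1 >= f1(q1 + q2)


def index_serves_class_1(q1, q2):
    """Index rule: class 1 is served when its index is at least that of class 2."""
    return g1(q1) >= g2(q2)
```

In this example both rules reduce to the comparison 2·Q1 ≥ Q2, which is why they select the same class in every state.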


Now we prove the second statement of the proposition. For simplicity, we assume that g1 is right continuous with left limits and g2 is left continuous with right limits. For given index functions g1 and g2, define f1, a function of the total queue length QΣ, as follows:
\[
f_1(Q_{\Sigma}) :=
\begin{cases}
0, & g_2(Q_{\Sigma}) < g_1(0),\\
Q_{\Sigma}, & g_1(Q_{\Sigma}) < g_2(0),\\
\inf\left\{x : 0 \le x \le Q_{\Sigma},\ [g_1(x-), g_1(x)] \cap [g_2(Q_{\Sigma} - x), g_2((Q_{\Sigma} - x)+)] \ne \emptyset\right\}, & \text{otherwise}.
\end{cases}
\]
Note that with continuous gi’s, gi(x−) = gi(x), i = 1, 2, so that f1(QΣ) := inf{x : 0 ≤ x ≤ QΣ, g1(x) = g2(QΣ − x)} in the third case. The more complicated formula above is devised to accommodate discontinuous index functions. We will show that f1(QΣ) is a well-defined function and that the tracking policy with f = (f1, f2) is equivalent to the index policy with g1 and g2. We complete the proof by showing that f = (f1, f2) is an increasing function.

If g2(QΣ) < g1(0) or g1(QΣ) < g2(0), f1(QΣ) is evidently well defined. It remains to show that there exists x such that 0 ≤ x ≤ QΣ and [g1(x−), g1(x)] ∩ [g2(QΣ − x), g2(QΣ − x+)] ≠ ∅ when g2(QΣ) ≥ g1(0) and g1(QΣ) ≥ g2(0).

We show this under the realistic assumption that g1 and g2 have a finite number of discontinuity points on [0, QΣ]. The countably infinite case is proved in Soh and Gurvich [2015]. First, define g(x) := g1(x) − g2(QΣ − x) and let {x1, x2, ..., xn} be the discontinuity points of g(x) on [0, QΣ]. Note that g(x) is increasing and g(x−) ≤ g(x) always holds.

We first prove that ∪x∈[0,QΣ] [g(x−), g (x)] , is a connected set. Since g is continuous at [0, x1),

A := ∪x∈[0,x1) [g(x−), g (x)] = ∪x∈[0,x1)g (x) is a connected set, i.e., an interval. We will show that

A ∪B is also an interval where B := [g(x1−), g (x1)] . Let g′ be a modified function from g such

that g′ (x1) = g(x1−) and g′ (x) = g (x) for all x ∈ [0, x1). Then g′ (x) is continuous at x ∈ [0, x1].

Evidently, g′ (x)∈A for all x∈ [0, x1) and g′ (x1)∈B. Hence a continuous function g′ (x) on [0, x1]

is a path from g′ (0) = g (0) ∈A to g′ (x1) ∈B. A and B are both path-connected sets and A∪B

is a path-connected set, i.e., an interval. Repeating this argument until x=QΣ, we conclude that

∪x∈[0,QΣ] [g(x−), g (x)] is a connected set.

By the conditions g2(QΣ) ≥ g1(0) and g1(QΣ) ≥ g2(0), we have g(0) = g1(0) − g2(QΣ) ≤ 0 and g(QΣ) = g1(QΣ) − g2(0) ≥ 0. Therefore, the above interval contains 0 and there exists x′ such that 0 ∈ [g(x′−), g(x′)]. Then g1(x′) ≥ g2(QΣ − x′) and g1(x′) ≤ g2(QΣ − x′+) hold and hence [g1(x′−), g1(x′)] ∩ [g2(QΣ − x′), g2(QΣ − x′+)] ≠ ∅. Since f1(QΣ) is defined to be the infimum of such x′, it is well defined.

Now we show that the tracking policy with f = (f1, f2) is equivalent to the index policy with g1 and g2. If g2(Q1(t) + Q2(t)) < g1(0), f1(Q1(t) + Q2(t)) is defined to be 0. Since g1 is increasing, g2(Q1(t) + Q2(t)) < g1(0) < g1(Q1(t) + Q2(t)). Then, class 2 is always served first under both the tracking policy and the index policy. If g1(Q1(t) + Q2(t)) < g2(0), f1(Q1(t) + Q2(t)) is defined to be QΣ(t) = Q1(t) + Q2(t) and class 1 is to be served under both policies.

Now we check the remaining case, in which g2(Q1(t) + Q2(t)) ≥ g1(0) and g1(Q1(t) + Q2(t)) ≥ g2(0). Suppose class 1 is to be served at time t under the index policy. Then g1(Q1(t)) ≥ g2(Q2(t)). If [g1(Q1(t)−), g1(Q1(t))] ∩ [g2(Q2(t)), g2(Q2(t)+)] ≠ ∅, then f1(Q1(t) + Q2(t)) is less than or equal to Q1(t), since it is defined to be the infimum of x that satisfies [g1(x−), g1(x)] ∩ [g2(QΣ − x), g2(QΣ − x+)] ≠ ∅. If [g1(Q1(t)−), g1(Q1(t))] ∩ [g2(Q2(t)), g2(Q2(t)+)] = ∅, then max [g2(Q2(t)), g2(Q2(t)+)] < min [g1(Q1(t)−), g1(Q1(t))] and hence f1(Q1(t) + Q2(t)) must be less than Q1(t), since g1 and g2 are increasing and g1(Q1(t)) ≥ g2(Q2(t)). The inequality Q1(t) ≥ f1(Q1(t) + Q2(t)) commands class 1 to be served at time t under the tracking policy.

Suppose class 2 is to be served at time t under the index policy. Then g1(Q1(t)) < g2(Q2(t)). Evidently, [g1(Q1(t)−), g1(Q1(t))] ∩ [g2(Q2(t)), g2(Q2(t)+)] = ∅ and, since g1 and g2 are increasing functions, f1(Q1(t) + Q2(t)) is larger than Q1(t), i.e., class 2 is to be served at time t under the tracking policy.

It remains to show that f = (f1, f2) is a monotone function. Suppose, on the contrary, that f is non-monotone, i.e., there exist q′ and q′′ such that q′ < q′′ and fi(q′) > fi(q′′) for some i. Assume without loss of generality that i = 1, so that f1(q′) > f1(q′′). Choose q′′′ such that f1(q′′) < q′′′ < f1(q′). If Q1(t) = q′′′ and Q2(t) = q′′ − q′′′, then f1(Q1(t) + Q2(t)) = f1(q′′) < q′′′ = Q1(t) and hence a class-1 customer must be served. If Q1(t) = q′′′ and Q2(t) = q′ − q′′′, then f1(Q1(t) + Q2(t)) = f1(q′) > q′′′ = Q1(t) and hence a class-2 customer must be served.

Under the equivalent index policy with g1 and g2 (the existence of which is verified above), g1(q′′′) ≥ g2(q′′ − q′′′) and g1(q′′′) < g2(q′ − q′′′) must hold. But this contradicts the fact that g2 is a monotone function since q′′ − q′′′ > q′ − q′′′. Therefore, f = (f1, f2) must be a monotone function.

5.2.2. Near optimality of monotone solutions: Proof of Proposition 7 Having defined

fϑ, the proof of this proposition is a straightforward adaptation of the proof of Theorem 3 and, in particular, of the arguments in §5.1.2 and §5.1.3. Fix ϑ > 0 and let Q̂ϑΣ(∞) be defined as the limit of the properly scaled M/M/N steady-state queues when the staffing in the λth queue is Nλ = ⌊Nλ⋆ + ϑ√λ⌋. Asymptotic tracking follows as in Proposition 10 with f⋆ replaced by fϑ there.

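fϑ is structured to satisfy

\[
\left\{ f^{\vartheta}_1\big(\hat{Q}^{\vartheta}_{\Sigma}(\infty)\big) - a_1\bar{w}_1 \ge 0 \right\} = \emptyset,
\qquad
\left\{ f^{\vartheta}_2\big(\hat{Q}^{\vartheta}_{\Sigma}(\infty)\big) - a_2\bar{w}_2 \ge 0 \right\} \subseteq \left\{ \hat{Q}^{\vartheta}_{\Sigma}(\infty) \in [q^{\vartheta}_b, \infty) \right\},
\]

and

\[
\left\{ \frac{f^{\vartheta}_1\big(\hat{Q}^{\vartheta}_{\Sigma}(\infty)\big)}{a_1\bar{w}_1} - \frac{f^{\vartheta}_2\big(\hat{Q}^{\vartheta}_{\Sigma}(\infty)\big)}{a_2\bar{w}_2} \ge 0 \right\} = \emptyset.
\]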


Proceeding as in §5.1.3, it follows directly that the SLD constraint (with β = 1 here) and the constraint for class 1 are asymptotically satisfied, and we turn to the class-2 constraint. Recall that with the sequence {Nλ⋆}, it holds that P{Q̂ξΣ(∞) ∈ [q̄,∞)} ≤ α2 (see (16)). Since ϑ is strictly positive, it then follows (by the explicit expressions for Q̂ξΣ(∞) and Q̂ϑΣ(∞)) that there exists ϖ > 0 such that P{Q̂ϑΣ(∞) ∈ [q̄,∞)} ≤ α2 − ϖ. We can now choose κϑ sufficiently small (and in turn qϑb close to q̄) such that P{Q̂ϑΣ(∞) ∈ [qϑb, q̄]} ≤ ϖ/2, in which case we have that P{Q̂ϑΣ(∞) ∈ [qϑb,∞)} ≤ α2. From here the proof of asymptotic feasibility is concluded as in §5.1.3. □
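For completeness, the last inequality can be written out as a short chain. This is only a restatement of the step above, with ϖ denoting the positive slack constant introduced there and the union bound applied to the two overlapping intervals:

```latex
\begin{align*}
P\{\hat{Q}^{\vartheta}_{\Sigma}(\infty) \in [q^{\vartheta}_b, \infty)\}
  &\le P\{\hat{Q}^{\vartheta}_{\Sigma}(\infty) \in [q^{\vartheta}_b, \bar{q}]\}
     + P\{\hat{Q}^{\vartheta}_{\Sigma}(\infty) \in [\bar{q}, \infty)\} \\
  &\le \frac{\varpi}{2} + (\alpha_2 - \varpi)
   = \alpha_2 - \frac{\varpi}{2}
   \le \alpha_2 .
\end{align*}
```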

5.2.3. Perfect SLD and monotone tracking policy: Proof of Proposition 8 Suppose, on the contrary, that N (ξ1) > N (ξ2). We will show that the TSF constraints cannot be satisfied with the staffing N (ξ2) and hence N (ξ2) ≥ N (ξ1) must hold.

If N (ξ1) > N (ξ2), the queue length distribution under the staffing N (ξ1) is first-order stochastically dominated by the one under N (ξ2), i.e., for any q > 0,

\[
P\{\hat{Q}^{\xi_1}_{\Sigma}(\infty) \in [q, \infty)\} < P\{\hat{Q}^{\xi_2}_{\Sigma}(\infty) \in [q, \infty)\}.
\]

Also, by the proof of Theorem 3, $P\{\hat{Q}^{\xi_1}_{\Sigma}(\infty) \ge a_1\bar{w}_1 + a_2\bar{w}_2\} = \alpha_2$. Then there exists ε > 0 such that $P\{\hat{Q}^{\xi_2}_{\Sigma}(\infty) \ge a_1\bar{w}_1 + a_2\bar{w}_2 + \varepsilon\} > \alpha_2$.

Let f be the tracking function for ξ2. Then either $f_1(a_1\bar{w}_1 + a_2\bar{w}_2 + \varepsilon) > a_1\bar{w}_1$ or $f_2(a_1\bar{w}_1 + a_2\bar{w}_2 + \varepsilon) > a_2\bar{w}_2$, or both. Since f is monotone, either

\[
P\{\hat{Q}^{\xi_2}_{2}(\infty) > a_2\bar{w}_2\} \ge P\{\hat{Q}^{\xi_2}_{\Sigma}(\infty) > a_1\bar{w}_1 + a_2\bar{w}_2 + \varepsilon\} > \alpha_2
\]

or

\[
P\{\hat{Q}^{\xi_2}_{1}(\infty) > a_1\bar{w}_1\} \ge P\{\hat{Q}^{\xi_2}_{\Sigma}(\infty) > a_1\bar{w}_1 + a_2\bar{w}_2 + \varepsilon\} > \alpha_2 \ge \alpha_1 .
\]

Hence at least one of the TSF constraints cannot be met when the staffing level is less than N (ξ1).
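The stochastic-dominance step can be illustrated on a finite M/M/N queue using the classical Erlang-C formula. This is a minimal sketch under assumptions not taken from the paper: it works with an unscaled queue rather than the diffusion limit Q̂Σ(∞), and the offered load and staffing levels in the usage note are hypothetical.

```python
def erlang_c(n: int, a: float) -> float:
    """Steady-state probability of waiting in an M/M/n queue.

    n is the number of servers and a = lambda / mu is the offered load;
    a < n is required for stability.
    """
    if not 0.0 < a < n:
        raise ValueError("need 0 < a < n for a stable queue")
    b = 1.0  # Erlang-B blocking probability with 0 servers
    for k in range(1, n + 1):
        b = a * b / (k + a * b)  # standard Erlang-B recursion
    return n * b / (n - a * (1.0 - b))

def queue_tail(n: int, a: float, q: int) -> float:
    """P{Q >= q} for an integer q >= 1, where Q is the steady-state queue length."""
    if q < 1:
        raise ValueError("q must be a positive integer")
    return erlang_c(n, a) * (a / n) ** q
```

With a hypothetical offered load of 100, comparing `queue_tail(105, 100.0, 5)` with `queue_tail(110, 100.0, 5)` shows the tail probability growing when servers are removed, which is the direction of the inequality used in the proof.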

6. Concluding remarks

A starting point in process management is to identify a sufficient set of Key Performance Indicators

(KPIs). Call centers must choose QoS metrics that reliably reflect the call center’s objectives.

This is a design decision: a question of formulation choice that should be influenced by financial

and marketing considerations. To make an informed formulation choice, one must understand

how different formulations translate into outcomes – these would be the optimal staffing and

prioritization decisions.

Our paper seeks to contribute to the understanding of this relationship between formulations and outcomes/decisions. We characterize the relationship between index rules (on the outcomes/decisions end) and real-time service-level differentiation (on the formulation end) in showing that index rules can be justified if and only if one requires, in addition to the marginal class-level targets, perfect service-level differentiation.

Implicitly, this work brings together two paradigms of staffing optimization. In the first, costs are assigned to waiting and the optimizer seeks to minimize the combined cost of waiting and staffing. In the second, constraints are placed on waiting-related metrics and the optimizer seeks to minimize staffing costs while meeting the imposed constraints. The role of index rules is relatively well understood under the first paradigm, yet we study them (and call centers use them) for constrained staffing problems. A better understanding of index rules, we hope, can facilitate the understanding of the duality (or lack thereof) between waiting costs and waiting-time targets.

Acknowledgements

We thank Yong Pin Zhou and Retsef Levi for helpful comments in early stages of this work. We are

grateful to three anonymous reviewers and the associate editor for their comments and suggestions

that led to significant improvements to this work.



7. Technical supplement to the paper: Call center staffing: Service-level constraints and index priorities

This supplement has three parts. In the first part, we prove the auxiliary lemmas of §5.1. In the second part, we prove the remaining results that were stated without proof.

7.1. Proofs of auxiliary lemmas in §5.1

We first provide an explicit construction of the process X. Recall that Qi(t) is the number of customers in the class-i queue at time t and Z(t) is the number of customers in service at time t. We number the servers and let As(t) be an N-dimensional process whose lth element captures the elapsed service time of the customer in service with server l. This coordinate is set to 0 if server l is idle at time t. We let Wi(t) be a vector of dimension Qi(t) whose jth entry is the accumulated waiting time of the jth customer in the class-i queue. Customers are listed in their order of arrival, so that the first element of Wi(t) is the accumulated waiting time of the class-i customer that arrived first amongst those that are still waiting. An arriving customer is added at the end of the vector with an initial value of 0. We remove the item at the head of the vector when the first customer in the queue is admitted to service.

Admissible policies are assumed to be stationary with respect to the process

X(t) = (Z(t),Q1(t),Q2(t),As(t),W1(t),W2(t)),

i.e., an assignment decision at time t may use only the information captured by X(t). This guarantees that X(t) is a Markov process.
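The bookkeeping behind X(t) can be sketched as a small data structure. This is an illustrative sketch only: the class and method names are hypothetical, and the explicit `busy` flags are an added assumption that distinguishes an idle server from one whose customer has just started service (both have elapsed time 0).

```python
from collections import deque
from typing import Deque, List, Tuple

class SystemState:
    """State X(t) = (Z, Q1, Q2, A^s, W1, W2) for a two-class queue."""

    def __init__(self, n_servers: int) -> None:
        self.busy: List[bool] = [False] * n_servers      # assumption: explicit flags
        self.elapsed: List[float] = [0.0] * n_servers    # A^s(t), 0 when idle
        self.waits: Tuple[Deque[float], Deque[float]] = (deque(), deque())  # W1, W2

    @property
    def z(self) -> int:
        """Z(t): number of customers in service."""
        return sum(self.busy)

    def q(self, cls: int) -> int:
        """Q_i(t): queue length of class cls (1 or 2)."""
        return len(self.waits[cls - 1])

    def advance(self, dt: float) -> None:
        """Let dt time units pass with no arrivals or service completions."""
        for l, is_busy in enumerate(self.busy):
            if is_busy:
                self.elapsed[l] += dt
        for w in self.waits:
            for j in range(len(w)):
                w[j] += dt

    def arrive(self, cls: int) -> None:
        """A new customer joins the end of its class vector with waiting time 0."""
        self.waits[cls - 1].append(0.0)

    def admit(self, cls: int, server: int) -> float:
        """Move the head-of-line customer of cls into service; return its wait."""
        waited = self.waits[cls - 1].popleft()
        self.busy[server] = True
        self.elapsed[server] = 0.0
        return waited

    def depart(self, server: int) -> None:
        """Service completion: the server becomes idle and its clock resets."""
        self.busy[server] = False
        self.elapsed[server] = 0.0
```

A stationary admissible policy, in this sketch, would be any function that maps a `SystemState` to an assignment decision without using information outside it.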

Proof of Lemma 5.1: We first show that if part (1.) holds, part (2.) also holds. This will be proved by showing that for a given ε2, we can specify δ̄ such that for all δ ∈ (0, δ̄], the inequalities of (2.) hold.
