Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
Submitted to Operations Researchmanuscript
Call center staffing:Service-level constraints and index priorities
Seung Bum SohCollege of Business Administration, Sejong University, [email protected]
Itai GurvichKellogg School of Management, Northwestern University, [email protected]
May 2, 2016
Call centers attribute different values to different customer segments. These values are reflected in quality-
of-service targets. The prevelant Target Service Factor (TSF) formulation requires, for example, that 80% of
VIP customers wait less than 20 seconds while setting the target to 30 seconds for non-VIP customers. The
call center must determine the staffing level together with a prioritization rule that meets these targets at
minimal cost. In practice, due to the underlying complexity of these systems, the prioritization rule is often
selected in a heuristic manner rather than being systematically optimized. When considering the universe
of prioritization policies, index rules provide a customizable and easy to define heuristic and for this reason
are implemented in various call-center software packages.
We use the TSF formulation as a stepping stone towards a better understanding of index rules. We first
construct an asymptotically optimal solution for the TSF problem. The prioritization component of our
solution is a tracking policy rather than an index rule. We prove that despite index rules’ significant flexibility,
no instance of these prioritization rules is optimal for the TSF problem. The sub-optimality of index rules
follows from an essential characteristic of these: restricting attention to index rules (as is heuristically done
in practice) is asymptotically equivalent to requiring that a VIP customer always waits less than a regular
(non-VIP) customer who arrives at the same time. This, in particular, implies that the use of index rules in
practice can be rationalized if (and only if) the manager requires such strong differentiation.
1. Introduction
Waiting times are regarded as central to customers’ service experience and are used as a key
performance indicator for call centers. In the spirit of measuring what is managed, call centers use
metrics that aggregate waiting-time related information. They are collectively referred to as QoS
(Quality-of-Service) metrics. A notable example is the Target-Service-Factor (TSF, the fraction of
customers who exceed a target waiting time) ; see Chapter 2 of Koole [2013] for further discussion
of call-center performance metrics. TSF is one of the key performance indicators that are traced
and targetted in call centers.
When the call center serves multiple customer classes, their relative importance is typically
reflected in the TSF targets requiring, for example, that 80% of class-1 customers wait less than 20
seconds but 80% of class 2 customers wait less than 30 seconds. Solving the call-center’s staffing
problem in a multi-class environment requires determining the capacity level and the prioritization
rule that, combined, guarantee that the service-level targets are met at a minimal cost.
1
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities2
In the two-class model that we consider here, the staffing problem is given by
(TSF)
min N
s.t. P{WN,π1 >w1
}≤ α1,
s.t. P{WN,π2 >w2
}≤ α2,
N ∈Z+, π ∈Π,
where, informally at this stage, WN,πi represents the steady-state waiting time of class i when the
number of servers is N and the prioritization policy is π. We seek to find the minimal number
of servers N together with a supporting prioritization policy π so that the TSF constraints are
met. Here, Π denotes the family of “admissible” prioritization policies; see Definition 1. To the
extent that the TSF constraint reflects relative importance, having α1 ≤ α2 together with w1 ≤w2corresponds to class 1 being the VIP class.
Truly solving this problem to optimality by jointly determining the staffing level and the priority
rule is difficult. This paper is the first to offer an asymptotically optimal solution for a relatively
simple network. In practice, call centers typically restrict attention to parametric families of pri-
oritization rules and subsequently adjust the parameters and the staffing level to meet the QoS
targets. This heuristic approach is understandable given the complexity of the problem. Degrees
of freedom in the prioritization policy are built into the software that manages the real time call
selection and routing of calls. The software provides sufficiently many adjustable parameters while
maintaining simplicity. A family of prioritization heuristics is that of index priorities; see e.g. Chan
et al. [2014] or Koole and Pot [2006] where these are referred to as Time Function Scheduling
(TFS). This family of policies is also well understood analytically in the context of convex holding
cost minimization; see Van Mieghem [1995] and Mandelbaum and Stolyar [2004]
A customer is assigned an index upon arrival. As the customer waits, his index increases period-
ically. The system manager chooses the initial indices for each customer type as well as the update
intervals and update-to levels (these are easily defined in a spreadsheet). This leads to an index
function as the one in Figure 1 which is used in an Israeli Bank that serves three categories of
retail customers. In real time, a server who becomes available serves the customer with the highest
index.
Somewhat more formally, customer type i is associated with a piecewise-constant index gi(·)
so that a server that becomes available chooses for service the customer at the head of queue
i∗ ∈ arg maxi gi(wi(t)) where wi(t) is the accumulated waiting time of the customers at the head
of the class-i queue. This family of policies is intuitive to understand and easy to implement.
Moreover, as both the updated periods and levels can be defined by the user it is expected to
be rather flexible. Analytically understanding the role of index policies in constrained staffing
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities3
Figure 1 Index functions in practice
Figure 2 The non-index structure of optimal prioritization
optimization is practically important, due to their use in practice, and intellectually important,
due to the attention they have received in the literature on the optimization of queues.
We first asymptotically solve the TSF formulation which, despite its prevalence in practice, has
not been solved even in the simplest form of multi-class queues. We provide a simple staffing
expression based on the M/M/N queue that explicitly captures the parameters of the problem. We
prove that this staffing solution, together with a corresponding priority policy, is asymptotically
optimal when the volume of arrivals is high.
Since we are interested in better understanding the role of index rules, we proceed to show
that all asymptotically optimal solutions to the TSF formulation are, in particular, non-index: all
optimal prioritization policies share some non-conventional properties:
(i) Non-monotonicity of waiting time in congestion: For at least one customer class (be it VIP or
Regular), the waiting time has the structure depicted in Figure 2 – it is greater when the congestion
(specifically the total queue) in the system is moderate relative to when the congestion is high.
(ii) Discontinuity of waiting time in congestion: Customers arriving within a short time interval
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities4
may experience starkly different waiting times (this corresponds to the discontinuity point ? in
Figure 2).
Since, as we also prove, index rules generate waiting time profiles that are continuous and
monotone in the total congestion, we conclude that if one heuristically restrict attention to index
rules – even if one choose the “best” index functions – staffing costs are non-negligibly greater than
the minimum necessary.
Thus, while the family of index rules is seemingly rich, there is no instance of an index rule that is
asymptotically optimal (together with an appropriately chosen staffing level) for the TSF problem.
To better understand this gap, it is useful to point to an interesting phenomenon one finds when
considering the use of index rules in practice. Figure 3 displays the waiting-time processes1 on a
single day (May 10th, 2007) in a three-class Bank call center that uses the index rule in Figure
12.
Figure 3 Waiting-time processes in a large Israeli bank
A striking fact in this graph is that, sample path by sample path (we see a similar pattern
on other days), a perfect real-time ranking between the classes is preserved, that is, “Private”
customers wait less than their “Star” counterparts, whereas “Star” customers wait less than their
“Rainbow” counterparts. This reflects a rather strong notion of what it means to be a VIP customer
1 Specifically, for each 2.5 minutes interval the graph depicts the waiting time averaged over customers that arrivedover that specific time interval
2 We are thankful to the Technion Service Enterprise Engineering Laboratory (SEE Lab) for the data
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities5
– VIP customers wait less at all times (rather than, say, on average over a day or an hour). This
requirement can be mathematically formulated as
P{WN,π1w1
≤ WN,π2
w2
}≈ 1.
This is a perfect service-level differentiation. One can view service-level differentiation (SLD) on a
continuum parameterized by a constant β ∈ [0,1] and captured by the constraint
P{WN,π1w1
≤ WN,π2
w2
}≥ β (SLD(β))
The case of β = 0 corresponds to requiring that differentiation holds only in aggregate by TSF
formulation as in: on average at least 80% of VIP customers wait less than 20 seconds and at
least 80% of Regular customers wait less than 30 seconds – whereas β = 1 reflects a perfect real
time differentiation as observed in Figure 3. We label by TSF+SLD(β) a formulation that has
both the TSF constraint and the SLD(β) constraints.
The observation that perfect SLD is preserved in this data is generalizable. Perfect service-level
differentiation is a characterizing property of index policies. That is, requiring perfect service-level
differentiation but giving full freedom in the choice of the prioriziation policy is equivalent to
restricting attention to index rules in the TSF problem. More formally, the following two formula-
tions are asymptotically equivalent:
min N
s.t.P{WN,π1 >w1
}≤ α1,
(TSF + SLD(1)) P{WN,π2 >w2
}≤ α2,
P{WN,π1w1≤ W
N,π2w2
}= 1,
N ∈Z+, π ∈Π.
min N
s.t.P{WN,π1 >w1
}≤ α1,
⇔ (TSF + Index) P{WN,π2 >w2}≤ α2,N ∈Z+, π ∈Π(Index).
In words, restricting attention to index rules (as is often heuristically done) is asymptotically
equivalent to requiring perfect SLD (i.e, β = 1). Thus, whereas the a priori restriction of some
call centers to index rules may be driven by the software packages that they use, their choice is
consistent with optimality, provided that perfect SLD is desirable to them.
We close this introduction with a few important pointers to index rules in the literature. The use
of index-rule based heuristics in the context of staffing subject to QoS targets is not grounded in
theoretical foundation. However, it is the opposite case when considering holding cost (or waiting-
time cost) minimization for given staffing levels. The best known results in this context pertain
to the optimality of the Generalized cµ (Gcµ) rule for the minimization of convex holding costs
as pioneered by Van Mieghem [1995]. Waiting-cost minimization is also the subject of subsequent
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities6
extensions to Van Mieghem’s original work; see e.g. Mandelbaum and Stolyar [2004] and many oth-
ers that followed. Such rules were also shown to be optimal for some constrained staffing problems.
Average Speed of Answer (ASA) constraints give rise (as one possible solution) to a Fixed-Queue-
Ratio rule – a special case of an index rule; see Gurvich and Whitt [2007].
Thus, in certain contexts, these rules are very well understood. We expand this body of work
with some new insights about the role of index rules in constrained staffing problems. Constrained
staffing problems in multi-class queues have been also studied and we refer the reader to Aksin et al.
[2007] for some references. The typical paper in this literature seeks to identify one optimal (or
nearly optimal) solution. To that literature we add our solution to the TSF formulation. However,
questions as we raise here, that pertain to structural properties of the family of all asymptotically
optimal solutions, do not arise naturally in that line of work. They arise naturally here when we
seek to understand the role of index rules in staffing problems.
2. Model and analysis framework
We consider the two-class V model. Arrivals of class-i customers follow a Poisson process Ai =
(Ai(t), t≥ 0) with rate λi > 0; i= 1,2. Let λ= λ1 +λ2. The two Poisson processes are independent.
Service times are exponential with rate µ and are independent across customers and independent
of the arrival processes. Various metrics of interest will be superscripted by the staffing level N
and the prioritization policy π to denote their dependence on these decision variables.
The TSF and SLD constraints were defined in the introduction using a virtual waiting time
WN,πi . The Poisson Arrivals See Time Averages (PASTA) property guarantees that the fraction of
customers who wait more than a target w equals the fraction of time that the virtual waiting time
exceeds this value.
We substitute the virtual waiting time with a proxy - the local average waiting time, which is
based on the actual waiting times. Let wN,πi,k be the waiting time of the kth class-i customer to
arrive. The local average waiting time of customers arriving over an interval (t, t+ δ] is defined to
be,
WN,πδ,i (t) :=1
Ai(t+ δ)−Ai(t)
Ai(t+δ)∑k=Ai(t)+1
wN,πi,k , (1)
where 0/0 is defined to be 0.
Local average is the measure used in Figure 3 where δ is 2.5 minute. If δ is kept sufficiently small,
the local average captures the dynamics and variation of waiting time. The use of local average
(rather than virtual waiting time) helps us overcome a technical difficulty (see Lemma 5.1).
We assume throughout that
α1, α2 > 0, α1 ≤ α2 and α1 +α2 < 1. (2)
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities7
If α1 +α2 ≥ 1, the staffing problem trivializes and, even for small values of w1,w2 > 0, any staffing
level N >λ/µ is feasible; see the discussion on page page 285 of Gurvich et al. [2008],
We allow the controller to use information about the length of the queues, the elapsed service
times of customers that are in service and the accumulated waiting times of the all customers in the
queues and we let X(t) be the state descriptor that captures all this information; see the appendix
for the formal definition.
Definition 1 (admissible policies): A prioritization policy π is admissible if:
1. It does not use admission control – all customers are served.
2. It serves customers in a first-come first-served (FCFS) fashion within each class.
3. It is work conserving and stationary with respect to X.
Let Π be the family of admissible policies.
Non-preemption and FCFS within each class are both natural assumptions for our application.
The literature does have cases in which non-work-conserving policies are shown to be optimal or
asymptotically optimal (see Gurvich et al. [2008]). The results in Gurvich et al. [2008] are driven,
however, by order-of-magnitude differences between the probability targets of the various classes
(say α1 = 0.01 but α2 = 0.2) that lead inherently to solutions that induce perfect SLD rendering our
main research questions irrelevant. Work conservation has the following immediate consequence
which we state formally and repeatedly refer to.
Lemma 2.1 Fix the number of servers N , the service rate µ and the arrival rates λ1, λ2. Then,
under any admissible policy π, the total number of customers in the system XΣ(t) (respectively the
total queue QΣ(t)) follows the distribution of the total number of customers (respectively the queue
length) in an M/M/N queue with arrival rate λ= λ1 +λ2, service rate µ and N servers.
Many-server analysis: The problem TSF+SLD(β) is difficult to solve exactly. Instead, we resort
to identifying solutions that are asymptotically optimal as the system size grows. A series of solu-
tions is asymptotically optimal if the induced optimality gap is negligible in a central-limit-theorem
(CLT) scaling. This is consistent with many studies in the context of call-center optimization; see
e.g. Armony [2005], Armony and Mandelbaum [2011], Borst et al. [2004]. We consider a sequence
of V models with the total arrival rate λ= λ1 + λ2 increasing along the sequence but keeping the
service rate µ fixed. We assume that there exists ai > 0, i= 1,2 so that a1 + a2 = 1 and λi = aiλ
for i= 1,2 and all λ.
All relevant quantities are superscripted by λ to make the dependence on λ explicit. As is
standard in the literature, the targets w1 and w2 scale with λ: wλi =wi/
√λ , i= 1,2 where w̄i’s are
fixed strictly positive constants. This guarantees that the system is in the Halfin-Whitt regime (see
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities8
Lemma 5.5 below). The Halfin-Whitt regime facilitates a CLT type of analysis. In our numerical
experiments we illustrate how our proposed solutions perform for given parameters (rather than
asymptotically); see §4.
To simplify the notation, we denote a staffing-policy pair (N,π) by ξ and let N(ξ) and π(ξ) be,
respectively, the staffing and prioritization components of ξ. The problem TSF+SLD(β) is now
re-written as:
minξλ
N(ξλ)
s.t. P{W ξ
λ,λδ,1 (∞)>wλ1
}≤ α1,
P{W ξ
λ,λδ,2 (∞)>wλ2
}≤ α2, (3)
P
{W ξ
λ,λδ,1 (∞)wλ1
≤W ξ
λ,λδ,2 (∞)wλ2
}≥ β,
ξλ ∈Z+×Π.
W ξλ,λ
δ,i (∞) should be read as the the local average (as in (1)) when the system is initialized (at
t = 0) with its steady-state distribution, the customer incoming rate is λ and the staffing-policy
pair ξλ is used.
Definition 2 (asymptotic feasibility): A sequence of staffing-policy pairs{ξλ}
is asymptotically
feasible, if ξλ ∈Z+×Π for all λ and for each � > 0,
limsupδ→0
limsupλ→∞
P{W ξ
λ,λδ,i (∞)>wλi (1− �)
}≤ αi(1 + �), i= 1,2,
and
limsupδ→0
limsupλ→∞
P
{W ξ
λ,λδ,1 (∞)wλ1
≤W ξ
λ,λδ,2 (∞)wλ2
+ �
}≥ β(1− �).
We say that a sequence {Nλ} is an asymptotically feasible sequence of staffing levels for (3) if
Nλ =N (ξλ) for a sequence {ξλ} of asymptotically feasible staffing-policy pairs.
Definition 3 (asymptotic optimality): A sequence of staffing-policy pairs {ξλ} is asymptoti-
cally optimal if it is asymptotically feasible and[N(ξλ)−N(ξ̃λ)
]+= o
(√λ)
as λ→∞,
for any other sequence {ξ̃λ} of asymptotically feasible staffing-policy pairs.
We say that a sequence {Nλ} is an asymptotically optimal sequence of staffing levels for (3) if
Nλ =N (ξλ) for an asymptotically optimal sequence {ξλ} of staffing policy pairs.
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities9
3. The proposed solution to TSF+SLD and its implications
This section contains our main results. We identify an asymptotically optimal staffing and priori-
tization solution. The two components of our solution are inter-dependent. The proposed staffing
level is feasible if one uses it together with our proposed prioritization policy. For simplicity of
exposition, we discuss them separately (each in a dedicated subsection), starting with the staffing
component. Theorems stated in this section are proved in §5 and in the technical supplement Soh
and Gurvich [2015].
3.1. Optimal staffing and the cost of SLD
To specify our staffing recommendation, let QN,λ(∞) be a random variable whose distribution is
that of the steady-state queue length in an M/M/N queue with arrival rate λ, service rate µ and
N servers (as µ is fixed throughout, we omit it from the superscript).
For each λ, let Nλ? be the solution of the following M/M/N staffing problem
Nλ? = min{N ∈Z+ : P
{QN,λ(∞)≥ λ1wλ1 +λ2wλ2
}≤min{α1,1−β}+α2
}. (4)
Notice that, by (2), min{α1,1−β}+α2 < 1 holds
Theorem 1 (asymptotically optimal staffing) The sequence {Nλ? } is an asymptotically opti-
mal sequence of staffing levels for (3).
The prioritization component π (ξλ) of ξλ, constructed in the next subsection, guarantees that all
the inequalities W ξλ,λ
δ,1 (∞)≤wλ1 , Wξλ,λδ,2 (∞)≤wλ2 and W
ξλ,λδ,1 (∞)/w1 ≤W
ξλ,λδ,2 (∞)/w2 are asymp-
totically satisfied when the total queue length is smaller than λ1wλ1 + λ2w
λ2 . Thus, the choice of
the staffing level (4) guarantees that the total violation is bounded by min{α1,1− β}+ α2. The
challenge in designing the priorities is to make sure that this “violation budget” is distributed
correctly between the three constraints; for example, the TSF constraint for class 2 absorbs at
most α2 of the total violation budget.
Remark 1 (aggregate-based staffing) Theorem 1 maps TSF+SLD(β) into a staffing problem for
a single class queue. The fact that this solution works for the multiclass queue will be supported
by our careful choice of the prioritization rule. Such a reduction to a one dimensional model is a
recurring theme in recent literature on staffing; see e.g. Armony and Mandelbaum [2011], Gurvich
et al. [2008], Gurvich and Whitt [2007]. In the latter two papers, a global Average Speed of Answer
(ASA) constraint allows for a relatively simple mapping of the parameters into the single-class
staffing problem. This mapping is more subtle in our setting as is reflected in the non-trivial way
in which the constants α1, α2 and β appear on the right-hand side of (4).
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities10
Figure 4 η∗(β) as a function of β for α1 = 0.3, α2 = 0.2, w̄1 = 2 and w̄2 = 4
Theorem 2 The sequence {Nλ? } satisfies
Nλ? =λ
µ+ η?
√λ
µ+ o(√λ),
where η? is convex increasing in the SLD degree β. It is strictly increasing for β ≥ 1−α1.
Increasing the degree of SLD beyond a certain level (in particular, setting β = 1) is thus costly: the
optimal staffing level for the TSF formulation is too small for TSF+SLD(1). Conversely, relaxing
the SLD requirement reduces the staffing costs.
3.2. Prioritization and the optimality of index policies
The asymptotically optimal staffing {Nλ? } is coupled with a carefully chosen sequence of policies
{πλ?} such that the sequence of staffing-policy pairs {ξλ = (Nλ? , πλ? )} is asymptotically optimal in
the sense of Definition 3. The policy that we construct in this section is an instance of so-called
tracking policies.
Tracking policies defined below are also admissible policies (customers are served in a FCFS
manner within the same class). The queue length of class i at time t for i= 1,2 is denoted by Qλi (t)
and QλΣ (t) is the total queue length, i.e., QΣ (t) = Q1 (t) +Q2 (t). From these, normalized queue
lengths are defined as follows:
Q̂i(t) :=Qi(t)√λ, Q̂Σ(t) := Q̂1(t) + Q̂2(t).
Definition 4 (queue tracking policies) Let f = (f1, f2) : R+→ R2+ be a non-negative function
such that f1(x) + f2(x) = x for all x ≥ 0. A server that becomes available at time t chooses the
queue to serve according to the following criterion:
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities11
Figure 5 Tracking: (Left) General description (Right) Proposed tracking function under sub-perfect SLD (β < 1)
1. If one queue is empty but the other is not, admit to service the customer at the head of the
non-empty queue.
2. If both queues are non-empty and Q̂1(t)− f1(Q̂Σ(t)) 6= Q̂2(t)− f2(Q̂Σ(t)), choose the class i
with the positive value of Q̂i(t)− fi(Q̂Σ(t)).
3. If both queues are non-empty and Q̂1(t)− f1(Q̂Σ(t)) = Q̂2(t)− f2(Q̂Σ(t)), choose class 1.
An arriving customer is immediately assigned to an available server if there are such servers. If all
servers are busy upon this customer’s arrival, the customer is placed in the queue.
Figure 5(Left) illustrates the tracking function mechanism. It plots the targeted value for Q̂1
(f1(Q̂Σ)) vs. the value of Q̂Σ. If at time t, Q̂1(t)> f1(Q̂Σ(t)) (as in point A in the figure) the rule
prioritizes class 1 so as to pull it back toward f1(Q̂Σ(t)). If Q̂1(t) < f1(Q̂Σ(t)) as in point B in
the graph, the rule prioritizes class 2 over class 1 so as to push queue 1 towards f1(Q̂Σ(t)). The
tracking policy targets Q̂i(t)≈ fi(Q̂Σ(t)).
The family of tracking functions is immense. The challenge is to carefully choose the tracking
function f so that together with our proposed staffing the solution is asymptotically optimal (in
particular, asymptotically feasible).
Optimal tracking functions: We use the tracking function f?1 defined below, as plotted also in
Figure 5(Right):
f?1 (x) =
0, 0≤ x< a1w̄1+a2w̄2a1w̄1
κ,a1w̄1
a1w̄1+a2w̄2x−κ, a1w̄1+a2w̄2
a1w̄1κ≤ x< a1w̄1 + a2w̄2− 2
(1 + a1w̄1
a2w̄2
)κ,
2a1w̄1+a2w̄22(a1w̄1+a2w̄2)
x− a2w̄22, a1w̄1 + a2w̄2− 2
(1 + a1w̄1
a2w̄2
)κ≤ x< a1w̄1 + a2w̄2,
x− a2w̄2 +κ, a1w̄1 + a2w̄2 ≤ x< q̄,a1w̄1−κ, q̄≤ x.
(5)
and f?2 (x) = x− f?1 (x).
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities12
The threshold q̄ is given by
q̄= a1w̄1 + a2w̄2 +log{α2+min{α1,1−β}
α2
}η? (β)
(6)
where η?(β) is the unique solution (keeping other parameters constant) to(1 +
ηΦ(η)
φ(η)
)−1exp(−η(a1w̄1 + a2w̄2)) = min{α1,1−β}+α2, (7)
Here, φ(·), Φ(·) are, respectively, the standard normal density and distribution functions. The
constant κ can be any
0
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities13
The choice of the parameter q̄, combined with the optimal staffing in Theorem 1, guar-
antees that P{Q̂ξ
λ,λΣ (t)∈ [a1w̄1 + a2w̄2, q̄)
}≈ min{α1,1−β} and P
{Q̂ξ
λ,λΣ (t)∈ [q̄,∞)
}≈
α2. Then, P{W ξ
λ,λδ,1 (t)>w1
}≈ P
{W ξ
λ,λδ,1 (t)/w
λ1 >W
ξλ,λδ,2 (t)/w
λ2
}≈ min{α1,1−β} and
P{W ξ
λ,λδ,2 (t)>w2
}≈ α2 hold and the solution is feasible. In the following theorem, πλf? denotes
the tracking policy which applies the tracking function f?.
Theorem 3 (asymptotic optimality) Let Nλ? be as in (4). Then, the sequence of staffing-policy
pairs {ξλ = (Nλ? , πλf?)} is asymptotically optimal.
Remark 2 (on the interplay of cost reduction and prioritization) There are degrees of freedom in
choosing the tracking function – there are various choices that generate the same asymptotic opti-
mality result. We chose f? so that SLD is violated only when the total queue is small (specifically,
less than q̄). With this choice, f? has the appealing property that class-1 (the VIP) customers get
superior service when the total congestion in the system is large. Thus, reducing β (from β < 1)
allows for a cost reduction in terms of staffing (recall Theorem 2) while making sure that the VIP
customers get superior service when it really matters.
One thing that may strike the reader as non-standard in Figure 5(Right) is the discontinuity and
non-monotonicity of the tracking function f?. This is not a consequence of the specific way in which
we choose the functions – as the next theorem shows it cannot be avoided without compromise to
optimality.
For the formal statement, we say that a tracking policy is non-monotone if at least one of the
functions f1(·) and f2(·) is non-monotone. A function f is defined to be piecewise Lipschitz if it has
a finite number of discontinuity points {x1, . . . , xm,} and is Lipschitz continuous on each interval
(including [0, x1) and [xm,∞)).
Theorem 4 (non-monotonicity and discontinuity) Assume β < 1 (sub-perfect SLD) and let
f be a piecewise Lipschitz tracking function such that the sequence {(Nλ? , πλf )} is an asymptotically
optimal sequence of staffing-policy pairs. Then, f is discontinuous and non-monotone.
The optimal staffing allows for a limited “budget” that must be carefully allocated to the three
different constraints (the two TSF constraints and the SLD constraints). For different values of Q̂Σ
one must choose which constraints are “sacrificed” in favor of meeting the others. What we prove
is that the only way to do that while minimizing the total budget is to alternate the priorities and
this, in particular, introduces the non-monotonicity and the discontinuity that are stated in the
theorem.
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities14
On the equivalence of index policies and perfect SLD: Theorem 4 proves that, asymptotically,
optimal tracking functions must be discontinuous and non-monotone for β < 1. That is, imposing
monotonicity (which is a property of index policies; see below) is related to requiring perfect SLD
(β = 1). We next show that requiring perfect SLD and restricting the tracking functions to be
monotone is indeed equivalent in the asymptotic sense. More importantly, proving that monotone
tracking rules are equivalent to index policies, allows us to conclude the equivalence between
imposing a perfect SLD constraint and restricting the optimization to use only index policies.
To formalize this result, recall that an index policy is one that, upon service completion at time
t, admits to a customer from class
i∗ = arg maxi
gi(Qi(t)),
where g1 and g2 are non-negative increasing functions. We denote by Π(index)⊂Π the family ofindex policies.
Theorem 5 (equivalence of index policies and SLD(1) ) There exists a sequence of staffing-
policy pairs {ξλ = (Nλ, πλ)} that is asymptotically optimal for both formulations:
minξλN(ξλ)
s.t.P{W ξ
λ,λδ,1 (∞)>wλ1
}≤ α1,
(TSF + SLD(1)) P{W ξ
λ,λδ,2 (∞)>wλ2
}≤ α2,
P
{Wξλ,λδ,1
(∞)
wλ1≤
Wξλ,λδ,2
(∞)
wλ2
}= 1,
ξλ ∈Z+×Π.
minξN(ξλ)
s.t.P{W ξ
λ,λδ,1 (∞)>wλ1
}≤ α1,
(TSF + Index) P{W ξ
λ,λδ,2 (∞)>wλ2
}≤ α2,
ξλ ∈Z+×Π(index).
In particular, letting N∗,λ1 be the objective function value of TSF+SLD(1) and N∗,λ2 be that of
TSF+Index, we have that |N∗,λ1 −N∗,λ2 |= o(
√λ).
Theorem 5 is argued in three steps stated in Propositions 6-8. The first, Proposition 6, shows that
index policies and monotone tracking policies are separate mathematical representations of the
same policy and the proof is in 5.2.1.
Proposition 6 (equivalence of index policies and monotone tracking policies ) Given a
monotone tracking function f , there exists an index function g such that the monotone tracking
policy with f is equivalent to the index policy with index g. Similarly, given an index function g,
there exists a monotone tracking function f such that the index policy with g is equivalent to the
tracking policy with f . The equivalence is in the sense that at any given state of (Q1 (t) ,Q2 (t)),
both policies take the same action.
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities15
Given this equivalence, to prove Theorem 5, it remains to shows that TSF+SLD(1) is equivalent
to the following:
minξλ N(ξλ)
(TSF + MT) s.t. P{W ξ
λ,λδ,1 (∞)>wλ1
}≤ α1,
P{W ξ
λ,λδ,2 (∞)>wλ2
}≤ α2,
ξλ ∈Z+×Π(MT) ,
where Π(MT) ⊂ Π is the family of monotone tracking policies. Note that a monotone tracking
function must also be continuous by x= f1(x) + f2(x). It is easy to prove. For an arbitrary point
x1 > 0 and � > 0, f1(x1)≤ f1(x1 + �)≤ f1(x1) + �. The last inequality is from x1 + �− f1(x1 + �) =
f2(x1 + �)≥ f2(x1) = x1− f1(x1). We see as �→ 0, f1(x1 + �)→ f1(x1). The cases for � < 0 and for
f2 are verified similarly.
The equivalence is proved in the next two propositions. The first, Proposition 7, proves that
there exists a monotone tracking policy which, with the staffing level from Theorem 1, satisfies the
constraints in TSF+SLD(1).
Proposition 7 (nearly optimal monotone solutions for TSF+SLD(1) ) For any ϑ > 0,
there exists a monotone tracking function fϑ and a sequence {Nλϑ} such that {(Nλϑ , πλfϑ)} is asymp-
totically feasible for TSF+SLD(1) and |Nλϑ −Nλ? | ≤ ϑ√λ.
The nearly optimal tracking function, whose existence is established in Proposition 7, is depicted
in Figure 6 and explicitly specified as follows:
fϑ1 (x) =
0, 0≤ x< a1w̄1+a2w̄2
a1w̄1κϑ,
a1w̄1a1w̄1+a2w̄2
x−κϑ, a1w̄1+a2w̄2a1w̄1
κϑ ≤ x< a1w̄1 + a2w̄2,a1w̄1−κϑ, a1w̄1 + a2w̄2 ≤ x.
In the figure the constant qϑb is the point of intersection between the tracking function fϑ1 (x) and
the line x− a2w̄2. By (8), W ξλ,λ
δ,1 (t)≤wλ1 , Wξλ,λδ,2 (t)≤wλ2 and W
ξλ,λδ,1 (t)/w
λ1 ≤W
ξλ,λδ,2 (t)/w
λ2 hold at
times in which Q̂ξλ,λ
Σ (t)≤ qϑb and only the constraint Wξλ,λδ,2 (t)≤w2 is violated when Q̂
ξλ,λΣ (t)> q
ϑb .
Notice in Figure 6 that fϑ2 (x) = x− fϑ1 (x)>a2w̄2 when x> qϑb .
Proposition 8 closes the equivalence argument toward Theorem 5 in showing that, in terms
of staffing, requiring monotonicity is as demanding as requiring SLD(1). Since we identified in
Proposition 7 that a monotone tracking solution is asymptotically optimal for TSF+SLD(1), we
conclude that TSF+MT and TSF+SLD(1) are asymptotically equivalent and, by Proposition 6,
so are TSF+index and TSF+SLD(1).
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities16
Figure 6 Proposed tracking function under perfect SLD (β = 1)
Proposition 8 (equivalence of monotone tracking and SLD(1) under TSF constraints)
Let{ξλ1}
be a series of asymptotic solutions for TSF+SLD(1) and{ξλ2}
be one for TSF+MT.
Then,
lim infλ→∞
N (ξλ2 )−N (ξλ1 )√λ
≥ 0.
Staffing
TSF+Gcµ TSF+SLD(𝛽)
𝛽
TSF+Index TSF+SLD(β)
Figure 7 Optimal staffing levels
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities17
Remark 3 (the cost of index policies) Figure 7 displays the minimal staffing requirement to satisfy
the TSF constraints P{W1 ≤ 10 sec.} ≤ 0.2 and P{W2 ≤ 10 sec.} ≤ 0.2, when the arrival rate of
each class is 50 customers per minute and the average service time is 5 minutes. The asymptotic
optimal staffing is derived using Theorem 1. When adding the SLD constraint, the cost changes
with the value of β in the SLD constraint as β varies between 0.65 and 1. This is is captured by
the solid line. When, instead of having the SLD constraint, one restricts attention to index policies
the staffing level is always 517 (captured by the dashed line). The gap between the two lines is the
cost of the restriction to index policies. For β = 1, consistent with our equivalence result, there is
no gap. For small value of β the gap is 7 servers, corresponding to a non-negligible gap relative to
the square root of the system size.
4. Numerical experiments
The purpose of this section is to illustrate the performance of the solutions we derived via the
asymptotic analysis. We use this opportunity to underscore some important aspects of the proposed
(nearly optimal) staffing rule.
Altogether, we consider 40 parameter combinations (4 sets, each consisting of 10 cases). Across
cases in each set, we vary the arrival rates λ1, λ2 and the target parameters w1,w2 as in Table 1.
Note that these parameter combinations are the same across sets, i.e., the parameter combinations
for Scenario k of Set i is the same with that of Scenario k of Set j. We also vary α1, α2 for the TSF
constraints and the SLD target β for the SLD constraint as in Table 2. Set 1 and Set 2 differ in
that Set 1 has no SLD constraint (or β = 0). Set 1 and Set 3 differ in that Set 3 has smaller α1’s
(tighter TSF for class 1). Lastly, Set 3 and Set 4 differ in that Set 4 requires perfect SLD).
Scenario 1 2 3 4 5 6 7 8 9 10λ1 125 100 250 200 500 400 500 500 500 500λ2 125 150 250 300 500 600 500 500 500 500w1 0.2 0.2 0.1 0.1 0.05 0.05 0.025 0.025 0.1 0.1w2 0.3 0.3 0.15 0.15 0.07 0.07 0.03 0.03 0.2 0.2
Table 1 Common parameters for the whole sets
The mean service time is set to 1 time unit throughout (all parameters are specified in the same
time units, so that λ= 125 for example corresponds to a rate of 125 per time unit, be it an hour
or a minute). The number of servers (staffing) is determined via our formula (4)
Nλ? (λ1, λ2,w1,w2, α1, α2, β) = min{N ∈Z+ : P
{QN,λ(∞)≥ λ1wλ1 +λ2wλ2
}≤min{α1,1−β}+α2
}.
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities18
Scenario 1 2 3 4 5 6 7 8 9 10α1 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.2 0.3 0.2α2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.15 0.2 0.15Set 1β 0 (No SLD constraint)α1 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.2 0.3 0.2α2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.15 0.2 0.15Set 2β 0.925 0.925 0.925 0.925 0.925 0.925 0.925 0.95 0.925 0.95α1 0.15 0.15 0.15 0.15 0.15 0.15 0.15 0.1 0.15 0.1α2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.15 0.2 0.15Set 3β 0 (No SLD constraint)α1 0.15 0.15 0.15 0.15 0.15 0.15 0.15 0.1 0.15 0.1α2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.15 0.2 0.15Set 4β 1 (Perfect SLD constraint)
Table 2 Probability targets for each set
Given Nλ? (different for each parameter combination), we compute η= (Nλ? − (λ1 +λ2)/µ)/
√λ and
use it to design the propopsed tracking functions as in (5). This gives us the tracking policy which,
as before, we denote by π.
We built a simulation model in ARENA. For each parameter combination, we simulate a sample
path of the two-class queue with the proposed solution (Nλ? , π)(λ1, λ2,w1,w2, α1, α2, β) for T =
40000 time units–corresponding to 8-40 million arrivals depending on the parameter combination.
At the end of each simulation run, we record 1Ai(T )
∑Ai(T )k=1 1{wi,k >wi(1− �)} with Ai(T ) being the
number of class i customers that arrived over the simulation horizon and wi,k is the actual waiting
time of the kth class i customer processed. Since
1
Ai(T )
Ai(T )∑k=1
1{wi,k >wi(1− �)}→ P{Wi(∞)>wi(1− �)} as T →∞,
this statistic provides a proxy for the performance of the policy with respect to the TSF constraints.
Our asymptotic theory says that under the proposed solution it holds that limsupλ→∞ P{W λi (∞)>
wλi (1 − �)} ≤ αi(1 + �)(i = 1,2) and hence we check whether 1Ai(T )∑Ai(T )
k=1 1{wi,k > wi(1 − �)} ≥
αi(1 + �) holds or not for each i.
For the SLD constraint, we fix δ= 0.05 and compute at intervals of size δ, Wδ,i(t) as in equation
(1). Specifically, for each m= 1, . . . ,M = 40000/0.05 = 800000, we compute
Wδ,i (mδ) :=1
Ai(mδ+ δ)−Ai(mδ)
Ai(mδ+δ)∑k=Ai(mδ)+1
wi,k.
With the fixed δ, the long run average
1
M
M∑m=1
1
{Wδ,1(mδ)
w1≤ Wδ,2(mδ)
w2(1 + �)
}→ P
{Wδ,1(∞)w1
≤ Wδ,2(∞)w2
(1 + �)
}as M →∞.
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities19
Figure 8 Set 1(TSF Formulation (β = 0)): (Left) Performance with recommend staffing; (Right) With one server
added for each violating scenario
serves as a proxy for the SLD and we check, in the simulation, whether it exceeds β(1− �) or not.
We set �= 0.1 which is a reasonably ambitious criterion. The math says that as λ grows large with
appropriate scaling we could take � to be increasingly smaller.
Figure 8 depicts the results for Set 1. We plot two bar series corresponding to
the realized TSFs ( 1Ai(T )
∑Ai(T )k=1 1{wi,k > wi(1 − �)}(i = 1,2)) and the realized SLD
( 1M
∑Mm=1 1
{Wδ,1(mδ)
w1≤ Wδ,2(mδ)
w2(1 + �)
}) is captured by the black-circle series for each of the sce-
narios. Notice that there is no target on the SLD but, still, we can compute the outcome.
The labels report the values (the TSF series correspond to the left y-axis and the SLD to the
right y-axis). Colored in white are the metrics that do not meet the feasibility criteria αi(1 + �)
of the TSF constraints. The performance is rather impressive. There are 5 violations (out of 20
constraints) and with the exception of the violations of class 1 in scenario 6 and of class 2 in
scenario 7 (0.345 instead of 0.33 and 0.229 instead of 0.22), the others are truly small (as in 0.332
instead of the targetted 0.33 in scenario 5 or 0.221 instead of 0.22 in scenario 10). Scenarios 5-8
have the smallest wi targets (e.g. w1 = 0.025 and w2 = 0.03) and hence the most sensitive. The
violations are resolved with the addition of a single server as plotted in the graph on the right of
the figure.
Notably, the SLD is very high (indeed, around 0.7 or higher) even though we imposed no SLD
constraint. This is an opportunity to recall Figure 7 where the required number of servers is
invariant to β as long as 1− β ≥ α1. With a target of α1 = 0.3 this means one gets “for free” an
SLD of up to β = 0.7 (when α1 = 0.3) and up to β = 0.8 (when α1 = 0.2 as in Scenario 8 and 10 of
Table 2). Our policy uses this allowance.
We run 10 simulations with newly computed staffing levels and the policies for Set 2 to consider
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities20
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8
1
2
3
4
5
6
7
8
9
10
b0 bg0
Normalized safety capacity =
Set 2 ( ) Set 1 ( )
Figure 9 (Left) Performance with proposed staffing for Set 2 (TSF+SLD(β)); (Right) The cost of SLD
TSF+SLD(β) with positive β’s. The results are reported in Figure 9. In this case, the TSF criteria
are met in all 10 scenarios with the exception of minor violations in the first two scenarios (that
disappear with the addition of one server). The SLD is mostly around 0.9 which is close to the
target and always above the criterion of β(1− �). We notice again that α1 matters only throughmin{α1,1− β}: For a given β target the solution can afford to have α1 as small as 1− β and thepolicy uses some of this allowance. Thus, even though α1 >α2, the performance obtained is such
that P{W1 >w1}< P{W2 >w2} in all scenarios.Finally, on the right of Figure 9 we visualize the cost difference of TSF vs. TSF+SLD(β) by
comparing the staffing level for TSF that underlies the graph on the right of Figure 8 and the
staffing level of TSF+SLD(β) that underlies the left graph of Figure 9. We plot the normalized
safety capacity (i.e., the servers required above the offered load λ/µ.) These numbers are proxies
for the square-root coefficient η∗ in Theorem 2 and capture the cost of SLD.
The parameter combinations in Set 3 are obtained from Set 1 by introducing more demanding
values of α1 (e.g. 0.15 instead of 0.3). It underscores further the interaction between α1 and β. The
results are reported in Figure 10. There are negligible violations (0.166 instead of 0.165) for class
1 in scenarios 5 and 9. The more substantial violations are in scenarios 3,7. The violations are
concentrated in scenarios with the smallest wi (w1 = 0.025 and w2 = 0.03). An extra staffing corrects
for some of this but leaves slight violations for class 1. This should not be surprising: with very
small w and α values the system is pushed away from heavy-traffic and hence the approximations
are somewhat less precise. Finally, even though β = 0, our policy again extracts as much SLD as
it can get “for free” which is 1−α1 (0.85 or 0.9 depending on the scenario).In Set 4, we consider the case of perfect SLD (β = 1): we use the same data from Set 3 but
put β = 1 as a requirement in all scenarios. The policy is a monotone policy as in Figure 6. The
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities21
Figure 10 Set 3 (TSF (β = 0), tight α): (Left) Performance with recommended staffing; (Right) With one server
added for each violating scenario
performance is reported in Figure 11. The TSF targets are met with excess under the recommended
staffing and the SLD is very close to 1 for all values. The graph on the right is a visualization of
SLD and the workings of the asymptotic arguments. One informally expects (and our proofs are
built on formally establishing that) Wδ,i(t)/wi ≈Wi(t)/wi ≈Qi(t)/λiwi because of the sample pathLittle’s law and since δ is small. With SLD close to 1 we expect
Q1(t)
λ1w1≤ Q2(t)λ1w2
, t≥ 0.
The graph on the right nicely visualizes this asymptotic argument. We simulate Scenario 1 (λ1 =
λ2 = 125) of Set 4 and zoom-in on the time interval [460,480] to obtain a clear view.
460 470 480
1.5
1
0
0.5
Simulation time
Figure 11 Set 4 (TSF + Perfect SLD (β = 1)): (L) Performance with recommended staffing (R) samplepath SLD
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities22
5. Proofs
This section is dedicated to the proofs of our main results. §5.1 proves Theorem 3 which is on the
asymptotic optimality of our proposed staffing and policy. §5.2 contains proofs of the theorems
that deal with the equivalence of index policies and perfect SLD. The proofs of other theorems and
auxiliary lemmas are in Soh and Gurvich [2015].
5.1. Proof of Theorem 3:
We divide the lengthy proof into sub-modules as follows: we first prove an equivalence between
the waiting time formulation (3) and a queue-based formulation. This may be somewhat expected
given (8). The challenge, however, is to establish such a “Little’s law” result at the formulation
level (rather than for a given policy as is typically the case). It is in this step where considering
local averages is instrumental. Building on this asymptotic equivalence, a lower bound for the
local-average-queue-based formulation provides an asymptotic lower bound for the waiting-time
formulation. We identify the staffing recommendation Nλ? in (4) as such a bound; see §5.1.1.
In §5.1.2 we proceed to show that the tracking policy with our proposed function f? indeed has
the desired asymptotic properties in that, when used, it results in Q̂ξλ,λi (t)≈ f?i (Q̂
ξλ,λΣ (t)). Whereas
this type of so-called state-space collapse is, by now, a standard result, a challenge arises here from
the possible discontinuity of the tracking function f?.
We combine the pieces in §5.1.3 to have the proof of Theorem 3. We prove there that our proposed
solution is asymptotically feasible. Since its staffing component (and, in particular, its cost) is a
lower bound on any such solutions, we conclude that our solution is asymptotically optimal.
5.1.1. From waiting times to queues Before directly showing the relationship of (8), we
introduce local average queue lengths which connect local average waiting times and queue lengths:
Qξλ,λδ,i (t) =
1
δ
∫ t+δt
Qξλ,λi (s)ds. (9)
The following lemma shows that if a sequence {Nλ? } is asymptotically feasible for (3) it must
(asymptotically) satisfy the appropriate constraints in terms of queues and vice versa. We let
Qξλ,λδ,i (∞) be the δ average when the queue length is initialized (at time t= 0) with its stedy-state
distribution.
Lemma 5.1 Let {ξλ} be a sequence of admissible staffing-policy pairs. Then the followings are
equivalent.
1. For each �1 > 0, the following holds.
limsupδ→0
limsupλ→∞
P{W ξ
λ,λδ,1 (∞)>wλ1 (1− �1)
}≤ α1 + �1,
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities23
limsupδ→0
limsupλ→∞
P{W ξ
λ,λδ,2 (∞)>wλ2 (1− �1)
}≤ α2 + �1,
limsupδ→0
limsupλ→∞
P
{W ξ
λ,λδ,1 (∞)wλ1
≤W ξ
λ,λδ,2 (∞)wλ2
− �1
}≥ β− �1.
2. For each �2 > 0, the following holds.
limsupδ→0
limsupλ→∞
P{Qξ
λ,λδ,1 (∞)>λ1wλ1 (1− �2)
}≤ α1 + �2,
limsupδ→0
limsupλ→∞
P{Qξ
λ,λδ,2 (∞)>λ2wλ2 (1− �2)
}≤ α2 + �2,
limsupδ→0
limsupλ→∞
P
{Qξ
λ,λδ,1 (∞)λ1wλ1
≤Qξ
λ,λδ,2 (∞)λ2wλ2
− �2
}≥ β− �2.
Lemma 5.1 proves that replacing the waiting-time constraints with queue-based constraints
should yield (asymptotically) identical results in terms of optimality. Thus, we turn to study a
queue-based formulation starting with identifying a lower bound. To that end, given �, consider
the problem
minξλ
N(ξλ)
s.t. P{Qξ
λ,λδ,1 (∞)>λ1wλ1 (1− �)
}≤ α1 + �,
P{Qξ
λ,λδ,2 (∞)>λ2wλ2 (1− �)
}≤ α2 + �, (10)
P
{Qξ
λ,λδ,1 (∞)λ1wλ1
≤Qξ
λ,λδ,2 (∞)λ2wλ2
− �
}≥ β− �.
We next state our lower bound result. Recall that QN,λ(∞) is a random variable with the
stationary distribution of the queue length in an M/M/N queue with arrival rate λ, service rate
µ and N(> λ/µ) servers. As before, the variable QN,λδ (∞) is the local average on [0, δ] when the
M/M/N queue is initialized with its steady-state distribution.
Proposition 9 Fix λ, δ, � > 0 and let ξλδ,� be a feasible solution for (10).
Define
Nλ? (δ, �) = min{N ∈Z+ : P
{QN,λδ (∞)≥ λ1wλ1 +λ2wλ2
}≤min{α1,1−β}+α2 + 2�
}.
Then, Nλ? (δ, �)≤N(ξλδ,�), where N(ξλδ,�)
is the staffing component of ξλδ,�.
Proof: Define the following events on the underlying probability space:
A :={Qξ
λ,λδ,1 (∞) +Q
ξλ,λδ,2 (∞)≤ λ1wλ1 +λ2wλ2
},
B :={Qξ
λ,λδ,1 (∞) +Q
ξλ,λδ,2 (∞)>λ1wλ1 +λ2wλ2 ,Q
ξλ,λδ,2 (∞)>λ2wλ2
},
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities24
C :={Qξ
λ,λδ,1 (∞) +Q
ξλ,λδ,2 (∞)>λ1wλ1 +λ2wλ2 ,Q
ξλ,λδ,2 (∞)≤ λ2wλ2
}.
These events form partitions of the space. Further, on C,
Qξλ,λδ,1 (∞) > λ1wλ1 +λ2wλ2 −Q
ξλ,λδ,2 (∞)
=
(λ1w
λ1 +λ2w
λ2
λ2wλ2
)(λ2w
λ2 −Q
ξλ,λδ,2 (∞)) +
λ1wλ1
λ2wλ2Qξ
λ,λδ,2 (∞)
≥ λ1wλ1
λ2wλ2Qξ
λ,λδ,2 (∞) .
Also it is evident that Qξλ,λδ,1 (∞)>λ1wλ2 on C. In turn,
C ⊆
{Qξ
λ,λδ,1 (∞)λ1wλ1
>Qξ
λ,λδ,2 (∞)λ2wλ2
}∩{Qξ
λ,λδ,1 (∞)>λ1wλ2
}. (11)
Using these definitions and the assumed feasibility of the staffing-policy pair,
P{B} = P{Qξ
λ,λδ,1 (∞) +Q
ξλ,λδ,2 (∞)>λ1wλ1 +λwλ2 ,Q
ξλ,λδ,2 (∞)>λ2wλ2
}≤ P
{Qξ
λ,λδ,2 (∞)>λ2wλ2
}≤ P
{Qξ
λ,λδ,2 (∞)>λ2wλ2 (1− �)
}≤ α2 + �, .
and
P{C} ≤ min
{P
{Qξ
λ,λδ,1 (∞)λ1wλ1
>Qξ
λ,λδ,2 (∞)λ2wλ2
},P{Qξ
λ,λδ,1 (∞)>λ1wλ2
}}
≤ min
{P
{Qξ
λ,λδ,1 (∞)λ1wλ1
>Qξ
λ,λδ,2 (∞)λ2wλ2
− �
},P{Qξ
λ,λδ,1 (∞)>λ1wλ2 − �
}}.
≤ min{α1 + �,1−β+ �}..
where the first inequality follows from (11). Hence
P{Qξ
λ,λδ,1 (∞) +Q
ξλ,λδ,2 (∞)>λ1wλ1 +λ2wλ2
}= P{B ∪C}= P{B}+P{C} ≤min{α1,1−β}+α2 + 2�.
Since the policy is work conserving, Qξλ,λδ,1 (∞)+Q
ξλ,λδ,2 (∞) has the law of Q
N,λδ (∞) with N =N(ξλ).
Nλ? (δ, �) is the minimum staffing among the ones that satisfy above and hence Nλ? (δ, �)≤N(ξλδ,�).
�
Note that Nλ? (δ, �) is closely related to our proposed staffing component Nλ? in (4). The following
lemma guarantees that the staffing levelNλ? in (4), combined with a feasible policy, is asymptotically
optimal.
Lemma 5.2
limsup�→0
limsupδ→0
limsupλ→∞
Nλ? (δ, �)−Nλ?√λ
≥ 0.
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities25
5.1.2. Asymptotic tracking We show that the local average queue lengths actually “tracks”
the tracking function given by (5). The following lemma is a corollary of a result in Gurvich and
Whitt [2009a].
Lemma 5.3 Let f be a Lipschitz continuous function and let {ξλf } be a sequence of staffing-policy
pairs where the staffing N(ξλf ) follows square-root scaling, i.e., N(ξλf ) = λ/µ+ η
√λ/µ+ o(
√λ) for
some η > 0 and a tracking policy constructed by f is used. For each � > 0, there exists δ′ > 0 such
that for all δ with 0< δ≤ δ′ the following holds.
limsupλ→∞
P{∣∣∣∣Q̂ξλf ,λδ,i (∞)− Q̂ξλf ,λi (∞)∣∣∣∣> �}< �.
Notably, Lemma 5.3 requires the smoothness of the tracking functions which is clearly violated
by our tracking function f?. Instead, given a piecewise Lipschitz function we will construct two
other tracking policies that provide lower and upper bounds for our tracking policy and do satisfy
the smoothness requirements.
For the following, fix a piecewise Lipschitz tracking function f . Let Df := {d1, . . . , dn} be the set
of discontinuity points of f (recall that this is a finite set). Also, let
D↑f = {di ∈Df : f1 (di+)− f1 (di−)≥ 0} and D↓f = {di ∈Df : f1 (di+)− f1 (di−)< 0}
Given ς < min{d1, d2− d1, ..., dn− dn−1}/2 (ς will be adjusted within our proofs), define the
following tracking functions fς
and f ς so that fς
1 (x)≥ fς
1(x) and f
ς
2 (x)≤ fς
2(x) for all x≥ 0:
fς
1 (x) =
f (di− ς)
(x−di+ς
ς
)+ f (di+)
(di−xς
), di− ς ≤ x≤ di, di ∈D↑f ,
f (di−)(x−diς
)+ f (di + ς)
(di+ς−x
ς
), di ≤ x≤ di + ς, di ∈D↓f ,
f (x) , otherwise,
f ς1(x) =
f (di−)
(x−diς
)+ f (di + ς)
(di+ς−x
ς
), di− ς ≤ x≤ di, di ∈D↑f ,
f (di− ς)(x−di+ς
ς
)+ f (di+)
(di−xς
), di ≤ x≤ di + ς, di ∈D↓f ,
f (x) , otherwise.
Note that f (di−) and f (di+) are the left and right limit of f at di which are well-defined by the
fact that there are finite number of discontinuous points. fς
2 (x) and fς
2(x) are defined by x−f ς1 (x)
and x− f ς1(x). Having defined these functions, we can now prove the following result. Below, as
before, πλf? is the tracking policy with respect to the function f? in the λth system with f?.
Proposition 10 uses these smoothed functions to show the tracking function works. Lemma 5.4,
along with Lemma 5.3, is used in the proof of Proposition 10. Given a tracking function f , we
denote by πf the tracking policy defined with respect to f . Let Qf (t) = (Qf1(t),Q
f2(t)) be the queue
lengths at time t if the policy πf is used.
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities26
Lemma 5.4 Fix λ,µ, and N . Fix tracking functions g= (g1, g2) and h= (h1, h2) such that g1 (x)≤
h1 (x) and (consequently) g2 (x) ≥ h2 (x) for all x ≥ 0. Suppose that Qg (0) = Qh (0). There then
exists a construction of the sample paths so that the following inequalities hold almost surely,
Qg1 (t)≤Qh1 (t) and Qg2 (t)≥Qh2 (t) , t≥ 0.
Proposition 10 For each λ, let ξλ = (Nλ, πλf?) where the sequence {Nλ} satisfies Nλ = λ/µ +
η√λ/µ+ o(
√λ). Then, for any � > 0, there exists δ
′> 0 such that for all δ ∈ (0, δ′ ]
limsupλ→∞
P{∣∣∣Q̂ξλ,λδ,i (∞)− f?i (Q̂ξλ,λΣ (∞))∣∣∣> �}< �.
Proof: From the staffing Nλ and the tracking functions fς(x) and f ς (x), define the staffing and
policy pairs ξς
λ = (Nλ, πλ
fς ) and ξ
ς
λ= (Nλ, πλfς ). By construction, the tracking function f
ςis Lipschitz
continuous. Hence, we can apply Theorem 3.1 of Gurvich and Whitt [2009a] to obtain(Q̂ξςλ,λ
1 (∞)− fς
1
(Q̂ξ
λ,λΣ (∞)
), Q̂
ξςλ,λ
2 (∞)− fς
2
(Q̂ξ
λ,λΣ (∞)
))⇒ (0,0) .
Note that the staffing component of ξλ, ξλ
and ξλ are the same and the total queue length process is
irrelevant of prioritization policies once the staffing is the same and the policies are all admissible.
We then have for each � > 0 that
limsupλ→∞
P{∣∣∣Q̂ξςλ,λi (∞)− f ςi (Q̂ξλ,λΣ (∞))∣∣∣> �3}< �6 , for i= 1,2. (12)
By Lemma 5.3, there exists δ′> 0 such that for all δ ∈ (0, δ′ ],
limsupλ→∞
P{∣∣∣Q̂ξςλ,λδ,i (∞)− Q̂ξςλ,λi (∞)∣∣∣> �3}< �6 . (13)
Define the set
F ς ={x≥ 0 :
∣∣∣f ςi (x)− f?i (x)∣∣∣> �3} .By the construction of f
ς, for each � > 0, there exists ς > 0 such that
P{Q̂ξΣ(∞)∈ F ς
}<�
6.
Moreover, since Q̂ξΣ(∞), the limiting process of Q̂ξλ,λΣ (∞), has a density (see Halfin and Whitt
[1981]), P{Q̂ξΣ(∞)∈ ∂F ς}= 0 so that the following holds by Portmanteau Theorem (see Billingsley
[1999]),
limλ→∞
P{Q̂ξ
λ,λΣ (∞)∈ F ς
}= P
{Q̂ξΣ (∞)∈ F ς
}.
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities27
Then, for each � > 0, there exists ς > 0 that satisfies
limsupλ→∞
P{∣∣∣f ςi (Q̂ξλ,λΣ (∞))− f?i (Q̂ξλ,λΣ (∞))∣∣∣> �3}< �6 . (14)
Combining (12)-(14), we conclude that for each � > 0, there exists ς > 0 and δ′ > 0 so that for
all δ ∈ (0, δ′ ]
limsupλ→∞
P{∣∣∣Q̂ξςλ,λδ,i (∞)− f?i (Q̂ξλ,λΣ (∞))∣∣∣> �} < limsup
λ→∞P{∣∣∣Q̂ξςλ,λδ,i (∞)− Q̂ξςλ,λi (∞)∣∣∣> �3}
+limsupλ→∞
P{∣∣∣Q̂ξςλ,λi (∞)− f ςi (Q̂ξλ,λΣ (∞))∣∣∣> �3}
+limsupλ→∞
P{∣∣∣f ςi (Q̂ξλ,λΣ (∞))− f?i (Q̂ξλ,λΣ (∞))∣∣∣> �3}
<�
6+�
6+�
6=�
2.
Similarly for ξςλ,
limsupλ→∞
P{∣∣∣Q̂ξςλ,λδ,i (∞)− f?i (Q̂ξλ,λΣ (∞))∣∣∣> �}< �2 .
Finally, note that by construction f ς1(x) ≤ f?(x) ≤ f ς1(x) for all x ≥ 0 so that, by Lemma 5.4
Q̂ξςλ,λ
δ,1 (∞)≤ Q̂ξλ,λδ,1 (∞)≤ Q̂
ξςλ,λδ,1 (∞) and, in turn, there exists δ
′such that for all δ ∈ (0, δ′ ]
limsupλ→∞
P{∣∣∣Q̂ξλ,λδ,1 (∞)− f?1 (Q̂ξλ,λΣ (∞))∣∣∣> �}
≤ limsupλ→∞
P{Q̂ξ
λ,λδ,1 (∞)− f?1
(Q̂ξ
λ,λΣ (∞)
)> �}
+ limsupλ→∞
P{Q̂ξ
λ,λδ,1 (∞)− f?1
(Q̂ξ
λ,λΣ (∞)
) �}
+ limsupλ→∞
P{Q̂ξςλ,λ
δ,1 (∞)− f?1(Q̂ξ
λ,λΣ (∞)
) 0, (15)
and a steady-state exists for X. Furthermore,
limsupλ→∞
1√λE[Qξ
λ,λ1 (∞) +Q
ξλ,λ2 (∞)]
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities28
5.1.3. Asymptotic feasibility and optimality: Proof of Theorem 3 We start with the
asymptotic feasibility. Specifically, we argue that with ξλ being our proposed staffing-policy pair,
it holds that for all � > 0, there exists δ′> 0 such that for all δ ∈ [0, δ′), the following holds.
limsupλ→∞
P{Qξ
λ,λδ,i (∞)>λiwλi (1− �)
}≤ αi + �, i= 1,2,
limsupλ→∞
P
{Qξ
λ,λδ,1 (∞)λ1wλ1
≤Qξ
λ,λδ,2 (∞)λ2wλ2
− �
}≥ β− �.
Asymptotic feasibility for the waiting time formulation then follows from Lemma 5.1. The above
formulation is equivalent to the following.
limsupλ→∞
P{Q̂ξ
λ,λδ,i (∞)>aiw̄i(1− �)
}≤ αi + �, i= 1,2,
limsupλ→∞
P
{Q̂ξ
λ,λδ,1 (∞)a1w̄1
≤Q̂ξ
λ,λδ,2 (∞)a2w̄2
− �
}≥ β− �.
The following inclusions follow directly from the structure of f?. Recall that Q̂ξΣ denote the limit
process of Q̂ξλ,λ
Σ .{f?1
(Q̂ξΣ (∞)
)− a1w̄1 ≥ 0
}⊆{Q̂ξΣ (∞)∈ [a1w̄1 + a2w̄2, q̄)
},{f?2
(Q̂ξΣ (∞)
)− a2w̄2 ≥ 0
}⊆{Q̂ξΣ (∞)∈ [q̄,∞)
},
and f?1
(Q̂ξΣ (∞)
)a1w̄1
−f?2
(Q̂ξΣ (∞)
)a2w̄2
≥ 0
⊆{Q̂ξΣ (∞)∈ [a1w̄1 + a2w̄2, q̄)} .Consequently,
P{f?1
(Q̂ξΣ (∞)
)− a1w̄1 ≥ 0
}≤ P
{Q̂ξΣ (∞)∈ [a1w̄1 + a2w̄2, q̄)
},
P{f?2
(Q̂ξΣ (∞)
)− a2w̄2 ≥ 0
}≤ P
{Q̂ξΣ (∞)∈ [q̄,∞)
},
P
f?1
(Q̂ξΣ (∞)
)a1w̄1
−f?2
(Q̂ξΣ (∞)
)a2w̄2
≥ 0
≤ P{Q̂ξΣ (∞)∈ [a1w̄1 + a2w̄2, q̄)} .Applying Proposition 1 and Theorem 1 of Halfin and Whitt [1981] and using the definition of q̄
and η in (6) and (7),
P{Q̂ξΣ (∞)∈ [a1w̄1 + a2w̄2, q̄)
}= P
{Q̂ξΣ (∞)> 0
}P{Q̂ξΣ (∞)∈ [a1w̄1 + a2w̄2, q̄) | Q̂
ξΣ (∞)> 0
}=
(1 +
ηΦ(η)
φ(η)
)−1· (exp(−η(a1w̄1 + a2w̄2))− exp(−ηq̄))
=
(1 +
ηΦ(η)
φ(η)
)−1·(
exp(−η(a1w̄1 + a2w̄2))
− exp(−η(a1w̄1 + a2w̄2)) · exp(− log
{α2 + min{α1,1−β}
α2
}))
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities29
=
(1 +
ηΦ(η)
φ(η)
)−1(1 +
ηΦ(η)
φ(η)
)(min{α1,1−β}+α2)(
1− α2α2 + min{α1,1−β}
)= min{α1,1−β},
P{Q̂ξΣ (∞)∈ [q̄,∞)
}= P
{Q̂ξΣ (∞)> 0
}P{Q̂ξΣ (∞)≥ q̄ | Q̂
ξΣ (∞)> 0
}=
(1 +
ηΦ(η)
φ(η)
)−1· exp(−ηq̄)
=
(1 +
ηΦ(η)
φ(η)
)−1exp(−η(a1w̄1 + a2w̄2)) ·
α2α2 + min{α1,1−β}
= α2, (16)
Let Df? be the set of discontinuity points of f?. Since Q̂ξΣ(∞) has a density, P{Q̂ξΣ (∞)∈Df?
}=
0, so that, by continuous mapping theorem (See Theorem 3.4.3. of Whitt [2002]),(Q̂ξ
λ,λΣ (∞) , f?1
(Q̂ξ
λ,λΣ (∞)
), f?2
(Q̂ξ
λ,λΣ (∞)
))⇒(Q̂ξΣ (∞) , f?1
(Q̂ξΣ (∞)
), f?2
(Q̂ξΣ (∞)
)),
By Portmanteau Theorem (and using the 0 measure of the discontinuity points), we have
limsupλ→∞
P{f?i
(Q̂ξ
λ,λΣ (∞)
)− aiw̄i ≥ 0
}= P
{f?i
(Q̂ξΣ (∞)
)− aiw̄i ≥ 0
}≤ αi for i= 1,2,
limsupλ→∞
P
f?1
(Q̂ξ
λ,λΣ (∞)
)a1w̄1
−f?2
(Q̂ξ
λ,λΣ (∞)
)a2w̄2
≥ 0
= Pf
?1
(Q̂ξΣ (∞)
)a1w̄1
−f?2
(Q̂ξΣ (∞)
)a2w̄2
≥ 0
≤ 1−β.By Proposition 10, for each � > 0 there exist δ
′> 0 such that for all δ ∈ (0, δ′ ] and i= 1,2,
limsupλ→∞
P{∣∣∣Q̂ξλ,λδ,i (∞)− f?i (Q̂ξλ,λΣ (∞))∣∣∣> �min{a1w̄12 , a2w̄22 }}< �2 ,
limsupλ→∞
P{Q̂ξ
λ,λδ,i (∞)≥ aiw̄i(1− �)
}≤ limsup
λ→∞P{∣∣∣Q̂ξλ,λδ,i (∞)− f?i (Q̂ξλ,λΣ (∞))∣∣∣> �min{a1w̄12 , a2w̄22 }}
+limsupλ→∞
P{f?i
(Q̂ξ
λ,λΣ (∞)
)≥ aiw̄i
}≤ �
2+αi ≤ αi + �,
and
limsupλ→∞
P
{Q̂ξ
λ,λδ,1 (∞)a1w̄1
>Q̂ξ
λ,λδ,2 (∞)a2w̄2
− �
}≤ limsup
λ→∞P
∣∣∣∣∣∣Q̂
ξλ,λδ,1 (∞)a1w̄1
−f?1
(Q̂ξ
λ,λΣ (∞)
)a1w̄1
∣∣∣∣∣∣> �a1w̄1 min{a1w̄1
2,a2w̄2
2
}+limsup
λ→∞P
f?1
(Q̂ξ
λ,λΣ (∞)
)a1w̄1
−f?2
(Q̂ξ
λ,λΣ (∞)
)a2w̄2
≥ 0
+limsup
λ→∞P
∣∣∣∣∣∣Q̂
ξλ,λδ,2 (∞)a2w̄2
−f?2
(Q̂ξ
λ,λΣ (∞)
)a2w̄2
∣∣∣∣∣∣> �a2w̄2 min{a1w̄1
2,a2w̄2
2
}
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities30
≤ �2
+ 1−β+ �2≤ 1−β+ �.
This concludes the feasibility argument and we turn to optimality. Let {ξ̃λ} be another asymp-totically feasible sequence. By the definition of asymptotic feasibility, for any � > 0 there exists a
δ1 > 0 such that for all δ ∈ (0, δ1] the following holds
limsupλ→∞
P{Q̂ξ̃
λ,λδ,i (∞)>aiw̄i(1− �)
}≤ αi + �, i= 1,2,
limsupλ→∞
P
{Q̂ξ̃
λ,λδ,1 (∞)a1w̄1
≤Q̂ξ̃
λ,λδ,2 (∞)a2w̄2
− �
}≥ β− �.
Thus, given δ and � we have by Proposition 9 that N(ξ̃λ)≥Nλ? (δ,2�). In particular, given δ, � > 0
limsupλ→∞
Nλ? −N(ξ̃λ)√λ
≤ limsupλ→∞
Nλ? −Nλ? (δ,2�)√λ
+limsupλ→∞
Nλ? (δ,2�)−N(ξ̃λ)√λ
≤ limsupλ→∞
Nλ? −Nλ? (δ,2�)√λ
.
Finally, since �, δ are arbitrary, we have by Lemma 5.2 that
limsup�→0
limsupδ→0
limsupλ→∞
[Nλ? −N(ξ̃λ)]+ ≤ 0,
which, together with the formerly established asymptotic feasibility, establishes the asymptotic
optimality of ξλ = (Nλ? , πλf?). �
5.2. Proofs on the equivalence between perfect SLD and index policies
5.2.1. Index rule and tracking policy: Proof of Proposition 6 For a given monotone
tracking function f = (f1, f2), define the generalized inverses of f1 and f2 as follows.
g1 (y) := sup{x≥ 0 : f1 (x)≤ y} , and g2 (y) := inf {x≥ 0 : f2 (x)≥ y} .
We will show that the tracking policy by f = (f1, f2) and the index policy by g1 and g2 are
equivalent. For the index policy, guideline 1 to server if g1 (Q1 (t))≥ g2 (Q2 (t)) and it chooses class2, if g1 (Q1 (t))< g2 (Q2 (t)).
Suppose class 1 is to be served at time t under the tracking policy f . Then
Q2(t)≤ f2(QΣ(t)) and f1(QΣ(t))≤Q1(t).
g1 and g2 are increasing. Also gi (fi (x)) = x for i= 1,2. Hence applying g2 to the first inequality
and g1 to the second one result in
g2(Q2(t))≤QΣ(t) and QΣ(t)≤ g1(Q1(t)).
Combining those two, we can see class is also served under the index policy by g1 and g2. The
case where class 2 is to be served can be proved similarly and hence the first statement of the
theorem is proved.
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities31
Now we will prove the second statement of the theorem. For simplicity, we assume that g1 is right
continuous with left limits and g2 is left continuous with right limits. For a given index functions
g1 and g2, define f1, which is a function of total queue length QΣ, as follows.
f1 (QΣ) :=
0, g2 (QΣ)< g1 (0) ,
QΣ, g1 (QΣ)< g2 (0) ,
inf {x : 0≤ x≤QΣ, [g1(x−), g1 (x)]∩ [g2 (QΣ−x) , g2(QΣ−x+)] 6= ∅} , otherwise.
Note that with continuous gi’s, gi(x−) = gi(x), i = 1,2 hold. Then f1(QΣ) := inf{x : 0 ≤ x ≤
QΣ, g1(x) = g2(QΣ − x)} in the third case. More complicated formula as the above is devised to
incorporate discontinuous index functions. We will show that f1 (QΣ) is a well-defined function and
the tracking policy by f = (f1, f2) is equivalent to index policy with g1 and g2. We complete the
proof by showing that f = (f1, f2) is an increasing function.
If g2 (QΣ)< g1 (0) or g1 (QΣ)< g2 (0), f1 (QΣ) is evidently well defined. It remains to show there
exists x such that 0≤ x≤QΣ and [g1(x−), g1 (x)]∩ [g2 (QΣ−x) , g2(QΣ−x+)] 6= ∅ when g2 (QΣ)≥
g1 (0) and g1 (QΣ)≥ g2 (0).
We show this with a realistic assumption that g1 and g2 have finite number of discontinuous
points on [0,QΣ]. The case for countably infinite case is proved in Soh and Gurvich [2015]. First,
define g (x) := g1 (x)− g2 (QΣ−x) and let {x1, x2, ..., xn} be the discontinuous points of g (x) on
[0,QΣ]. Note that g (x) is increasing and g(x−)≤ g (x) always holds.
We first prove that ∪x∈[0,QΣ] [g(x−), g (x)] , is a connected set. Since g is continuous at [0, x1),
A := ∪x∈[0,x1) [g(x−), g (x)] = ∪x∈[0,x1)g (x) is a connected set, i.e., an interval. We will show that
A ∪B is also an interval where B := [g(x1−), g (x1)] . Let g′ be a modified function from g such
that g′ (x1) = g(x1−) and g′ (x) = g (x) for all x ∈ [0, x1). Then g′ (x) is continuous at x ∈ [0, x1].
Evidently, g′ (x)∈A for all x∈ [0, x1) and g′ (x1)∈B. Hence a continuous function g′ (x) on [0, x1]
is a path from g′ (0) = g (0) ∈A to g′ (x1) ∈B. A and B are both path-connected sets and A∪B
is a path-connected set, i.e., an interval. Repeating this argument until x=QΣ, we conclude that
∪x∈[0,QΣ] [g(x−), g (x)] is a connected set.
By the conditions g2 (QΣ)≥ g1 (0) and g1 (QΣ)≥ g2 (0), g (0) = g1 (0)− g2 (QΣ)≤ 0 and g (QΣ) =
g1 (QΣ) − g2 (0) ≥ 0 hold. Therefore, the above interval contains 0 and there exists x′ such
that 0 ∈ [g(x′−), g (x′)] . Then g1 (x′) ≥ g2 (QΣ−x′) and g1(x′) ≤ g2(QΣ − x′+) hold and hence
[g1 (x′−) , g1 (x′)]∩ [g2 (QΣ−x′) , g2 (QΣ−x′+)] 6= ∅. f1 (QΣ) is defined to be the infimum of such x′
and therefore is well-defined.
Now we show that the tracking policy by f = (f1, f2) is equivalent to the index policy by g1
and g2. If g2 (Q1 (t) +Q2 (t))< g1 (0), f1 (Q1 (t) +Q2 (t)) is defined to be 0. Since g1 is increasing,
g2 (Q1 (t) +Q2 (t))< g1 (0)< g1 (Q1 (t) +Q2 (t)). Then, class 2 is always served first under both the
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities32
tracking policy and the index policy. If g1 (Q1 (t) +Q2 (t))< g2 (0), f1 (Q1 (t) +Q2 (t)) is defined to
be Q and class 1 is to be served under the both policies.
Now we check the remaining case when g2 (Q1 (t) +Q2 (t)) ≥ g1 (0) and g1 (Q1 (t) +Q2 (t)) ≥
g2 (0). Suppose class 1 is to be served at time t with the index policy. Then g1 (Q1 (t))≥ g2 (Q2 (t)) . If
[g1 (Q1 (t)−) , g1 (Q1 (t))]∩ [g2 (Q2 (t)) , g2 (Q2 (t)+)] 6= ∅, f1 (Q1 (t) +Q2 (t)) is less or equal to Q1 (t),
since it is defined to be the infimum of x that satisfies [g1 (x−) , g1 (x)]∩ [g2 (Q−x) , g2 (Q−x+)] 6=
∅. If [g1 (Q1 (t)−) , g1 (Q1 (t))] ∩ [g2 (Q2 (t)) , g2 (Q2 (t)+)] = ∅, max [g2 (Q2 (t)) , g2 (Q2 (t)+)] <
min [g1 (Q1 (t)−) , g1 (Q1 (t))] and hence f1 (Q1 (t) +Q2 (t)) must be less than Q1 (t), since g1 and
g2 are increasing and g1(Q1(t)) ≥ g2(Q2(t)). Q1 (t) ≥ f1 (Q1 (t) +Q2 (t)) commands class 1 to be
served at time t with the tracking policy.
Suppose class 2 is to be served at time t with the index policy. Then g1 (Q1 (t)) < g2 (Q2 (t)) .
Evidently, [g1 (Q1 (t)−) , g1 (Q1 (t))]∩ [g2 (Q2 (t)) , g2 (Q2 (t)+)] = ∅ and since g1 and g2 are increasing
functions, f1 (Q1 (t) +Q2 (t)) is larger than Q1 (t), i.e., class 2 is to be served at time t with the
tracking policy.
It remains to show that f = (f1, f2) is a monotone function. Suppose, on the contrary, that f
is a non-monotone function, i.e., there exist q′ and q′′ such that q′ < q′′ and fi (q′) > fi (q
′′) for
some i. Assume without loss of generality that i = 1 (f1 (q′) > f1 (q
′′)). Choose a q′′′ such that
f1 (q′′)< q′′′ < f1 (q
′). If Q1 (t) = q′′′ and Q2 (t) = q
′′− q′′′, f1 (Q1 (t) +Q2 (t)) = f1 (q′′)< q′′′ =Q1 (t)
and hence class 1 customer must be served. If Q1 (t) = q′′′ and Q2 (t) = q
′−q′′′, f1 (Q1 (t) +Q2 (t)) =
f1 (q′)> q′′′ =Q1 (t) and hence class 2 customer must be served.
Under the equivalent index policy by g1 and g2 (the existence of which is verified above), g1 (q′′′)≥
g2 (q′′− q′′′) and g1 (q′′′)< g2 (q′− q′′′) must hold. But this contradicts the fact that g2 is a monotone
function since q′′− q′′′ > q′− q′′′. Therefore, f = (f1, f2) must be a monotone function.
5.2.2. Near optimality of monotone solutions: Proof of Proposition 7 Having defined
fϑ, the proof of this theorem is a straightforward adaptation of the proof of Theorem 3 and, in
particular, of the arguments in sections 5.1.2 and 5.1.3. Fix ϑ > 0 and let Q̂ϑΣ(∞) be defined as
the limit of the properly scaled M/M/N steady-state queues when the staffing in the λth queues is
Nλ = bNλ? +ϑ√λc. Asymptotic tracking follows as in Proposition 10 with f? replaced by fϑ there.
fϑ is structured to satisfy{fϑ1
(Q̂ϑΣ (∞)
)− a1w̄1 ≥ 0
}= ∅,
{fϑ2
(Q̂ϑΣ (∞)
)− a2w̄2 ≥ 0
}⊆{Q̂ϑΣ (∞)∈ [qϑb ,∞)
},
and fϑ1
(Q̂ϑΣ (∞)
)a1w̄1
−fϑ2
(Q̂ϑΣ (∞)
)a2w̄2
≥ 0
= ∅.
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities33
Proceeding as in §5.1.3, it follows trivially that the SLD constraint (with β = 1 here) and the
constraint for class 1 are asymptotically satisfied and we turn to the class-2 constraint. Recall that
with the sequence {Nλ? }, it holds that P{Q̂ξΣ(∞)∈ [q̄,∞)} ≤ α2 (see (16)). Since ϑ is strictly positive,
it then follows (by the explicit expressions for Q̂ξΣ(∞) and Q̂ϑΣ(∞)) that there exists $ > 0 such
that P{Q̂ϑΣ(∞) ∈ [q̄,∞)} ≤ α2−$. We can now choose κϑ sufficiently small (and in turn qϑb close
to q̄) such that P{Q̂ϑΣ(∞) ∈ [qϑb , q̄]} ≤$/2, in which case, we have that P{Q̂ϑΣ(∞) ∈ [q̄ϑb ,∞)} ≤ α2.
From here the proof of asymptotic feasibility is concluded as in §5.1.3. �
5.2.3. Perfect SLD and monotone tracking policy: Proof of Proposition 8 Suppose,
on the contrary, that N (ξ1)>N (ξ2). We will show that the TSF constraints cannot be satisfied
with the staffing N (ξ2) and hence N (ξ2)≥N (ξ1) must hold.
If N (ξ1) > N (ξ2), the queue length distribution under the staffing N (ξ1) is first order
stochastically dominated by the one under N (ξ2), i.e., for any q > 0, P{Q̂ξ1Σ (∞)∈ [q,∞)
}<
P{Q̂ξ2Σ (∞)∈ [q,∞)
}. Also by the proof of Theorem 3, P
{Q̂ξ1Σ ≥ a1w̄1 + a2w̄2
}= α2. Then there
exists � > 0 such that P{Q̂ξ2Σ ≥ a1w̄1 + a2w̄2 + �
}>α2.
Let f be the tracking function for ξ2. Then either f1(a1w̄1 + a2w̄2 + �) > a1w̄1 or
f2(a1w̄1 + a2w̄2 + �) > a2w̄2 or both. Since f is monotone, either P{Q̂ξ22 (∞)>a2w̄2
}≥
P{Q̂ξ2Σ (∞)>a1w̄1 + a2w̄2 + �
}> α2 or P
{Q̂ξ21 (∞)>a1w̄1
}≥ P
{Q̂ξ2Σ (∞)>a1w̄1 + a2w̄2 + �
}>
α2 ≥ α1. Hence at least one of the TSF constraints cannot be met when the staff is less than N(ξ1).
6. Concluding remarks
A starting point in process management is to identify a sufficient set of Key Performance Indicators
(KPIs). Call centers must choose QoS metrics that reliably reflect the call center’s objectives.
This is a design decision: a question of formulation choice that should be influenced by financial
and marketing considerations. To make an informed formulation choice, one must understand
how different formulations translate into outcomes – these would be the optimal staffing and
prioritization decisions.
Our paper seeks to contribute to the understanding of this relationship between formula-
tions and outcomes/decisions. We characterize the relationship between index rules (on the out-
comes/decisions end) and real time service-level differentiation (on the formulation end) in showing
that index rules can be justified if and only if one requires, in addition to the marginal class-level
targets, perfect service level differentiation.
Implicitly, this work brings together two paradigms of staffing optimization. In the first, costs
are assigned to waiting and the optimizer seeks to minimize the combined cost of waiting staffing.
In the second, constraints are placed on waiting-related metrics and the optimizer seeks to min-
imize staffing costs while meeting the imposed constraints. The role of index rules is relatively
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities34
well understood under the first paradigm, yet we study them (and call centers use them) for con-
strained staffing problems. The better understanding of index rules, we hope, can facilitate the
understanding of the duality (or lack therof) between waiting-cost to waiting-time targets.
Acknowledgements
We thank Yong Pin Zhou and Retsef Levi for helpful comments in early stages of this work. We are
grateful to three anonymous reviewers and the associate editor for their comments and suggestions
that lead to significant improvements to this work.
References
Z. Aksin, M. Armony, and V. Mehrotra. The modern call center: A multi-disciplinary perspective
on operations management research. Production and Operations Management, 16(6):665–688,
2007.
T. Apostol and I. Makai. Mathematical analysis, volume 2. Addison-Wesley Reading, MA, 1974.
M. Armony. Dynamic routing in large-scale service systems with heterogeneous servers. Queueing
Systems, 51(3-4):287–329, 2005.
M. Armony and A. Mandelbaum. Routing and staffing in large-scale service systems: the case of
homogeneous impatient customers and heterogeneous servers. Operations Research, 59:50–65,
2011.
S. Asmussen. Applied probability and queues. Springer Verlag, 2003.
P. Billingsley. Convergence of Probability Measures. 2nd ed., Wiley, New York, 1999.
S. Borst, A. Mandelbaum, and M. Reiman. Dimensioning large call centers. Operations Research,
52(1):17–34, 2004.
W. Chan, G. Koole, and P. L’Ecuyer. Dynamic call center routing policies using call waiting and
agent idle times. Manufacturing & Service Operations Management, 16(4):544–560, 2014.
H. Chen and D. Yao. Fundamentals of queueing networks. Springer, 2001.
I. Gurvich and W. Whitt. Service-level differentiation in many-server service systems: A solution
based on fixed-queue-ratio routing. Operations Research, 29:567–588, 2007.
I. Gurvich and W. Whitt. Queue-and-idleness-ratio controls in many-server service systems. Math-
ematics of Operations Research, 34(2):363–396, 2009a.
I. Gurvich and W. Whitt. Scheduling flexible servers with convex delay costs in many-server service
systems. Manufacturing & Service Operations Management, 11(2):237–253, 2009b.
I. Gurvich, M. Armony, and A. Mandelbaum. Service level differentiation in call centers with fully
flexible servers. Management Science, 54(2):279–294, 2008.
S. Halfin and W. Whitt. Heavy-traffic limits for queues with many exponential servers. Operations
Research, 29(3):567–588, 1981.
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities35
G. Koole. Call center optimization. Lulu. com, 2013.
G. Koole and A. Pot. An overview of routing and staffing algorithms in multi-skill customer contact
centers. working. Technical report, 2006.
A. Mandelbaum and S. Stolyar. Scheduling flexible servers with convex delay costs: Heavy-traffic
optimality of the generalized cµ-rule. Operations Research, 52:836–855, 2004.
S. Soh and I. Gurvich. Technical supplement to the paper: “call cecenter staffing: Service-level
constraints and index priorities”. 2015. Available from the authors’ websites.
J. A. Van Mieghem. Dynamic scheduling with convex delay costs: The generalized cµ rule. Ann.
Appl. Prob., 5:809–833, 1995.
W. Whitt. Stochastic-Process Limits: An Introduction to Stochastic-Process Limits and Their
Application to Queues. Springer-Verlag, New York., 2002.
W. Whitt. Proofs of the martingale FCLT. Probab. Surv, 4:268–302, 2007.
Soh and Gurvich: Call center staffing: Service-level constraints and index priorities1
7. Technical supplement to the paper:Call center staffing: Service-level constraints and index priorities
This supplement has three parts. In the first part, we prove auxiliary lemmas in §5.1. In the second
part, the remaining results that were stated without proofs are given.
7.1. Proofs of auxiliary lemmas in §5.1
We first provide an explicit construction of the process X. Recall that Qi(t) is the number of
customers in the class-i queue at time t and Z(t) is the number of customers in service at time
t. We number the servers and let As(t) be an N -dimensional process whose lth element captures
the elapsed service time of the customer in service with server l. This coordinate is set to 0 if
server l is idle at time t. We let Wi(t) be a vector of dimension of Qi(t) whose jth entry is the
accumulated waiting time of the jth customer in the class-i queue. Customers are listed in their
order of arrival so that the first element of Wi(t) is the accumulated waiting time of the class-i
customer that arrived first amongst those that are still waiting. An arriving customer is added at
the end of the vector with an initial value of 0. We remove the item in the head of the vector when
the first customer in the queue is admitted to service.
Admissible policies are assumed to be stationary with respect to the process
X(t) = (Z(t),Q1(t),Q2(t),As(t),W1(t),W2(t)),
i.e., an assignment decision at time t may use only the information captured by X(t). This guar-
antees that X(t) is a Markov process.
Proof of Lemma 5.1: We first show that if part (1.) holds, part (2.) also holds. This will be
proved by showing that for a given �2, we can specify δ̄ such that for all δ ∈ (0, δ̄], the inequalities
of (2.) hold.