54
Submitted to Operations Research manuscript Call center staffing: Service-level constraints and index priorities Seung Bum Soh College of Business Administration, Sejong University, [email protected] Itai Gurvich Kellogg School of Management, Northwestern University, [email protected] May 2, 2016 Call centers attribute different values to different customer segments. These values are reflected in quality- of-service targets. The prevelant Target Service Factor (TSF) formulation requires, for example, that 80% of VIP customers wait less than 20 seconds while setting the target to 30 seconds for non-VIP customers. The call center must determine the staffing level together with a prioritization rule that meets these targets at minimal cost. In practice, due to the underlying complexity of these systems, the prioritization rule is often selected in a heuristic manner rather than being systematically optimized. When considering the universe of prioritization policies, index rules provide a customizable and easy to define heuristic and for this reason are implemented in various call-center software packages. We use the TSF formulation as a stepping stone towards a better understanding of index rules. We first construct an asymptotically optimal solution for the TSF problem. The prioritization component of our solution is a tracking policy rather than an index rule. We prove that despite index rules’ significant flexibility, no instance of these prioritization rules is optimal for the TSF problem. The sub-optimality of index rules follows from an essential characteristic of these: restricting attention to index rules (as is heuristically done in practice) is asymptotically equivalent to requiring that a VIP customer always waits less than a regular (non-VIP) customer who arrives at the same time. This, in particular, implies that the use of index rules in practice can be rationalized if (and only if) the manager requires such strong differentiation. 1. Introduction Waiting times are regarded as central to customers’ service experience and are used as a key performance indicator for call centers. In the spirit of measuring what is managed, call centers use metrics that aggregate waiting-time related information. They are collectively referred to as QoS (Quality-of-Service) metrics. A notable example is the Target-Service-Factor (TSF, the fraction of customers who exceed a target waiting time) ; see Chapter 2 of Koole [2013] for further discussion of call-center performance metrics. TSF is one of the key performance indicators that are traced and targetted in call centers. When the call center serves multiple customer classes, their relative importance is typically reflected in the TSF targets requiring, for example, that 80% of class-1 customers wait less than 20 seconds but 80% of class 2 customers wait less than 30 seconds. Solving the call-center’s staffing problem in a multi-class environment requires determining the capacity level and the prioritization rule that, combined, guarantee that the service-level targets are met at a minimal cost. 1

Call center sta ng: Service-level constraints and index prioritiesSoh and Gurvich: Call center sta ng: Service-level constraints and index priorities 3 Figure 1 Index functions in

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

  • Submitted to Operations Researchmanuscript

    Call center staffing:Service-level constraints and index priorities

    Seung Bum SohCollege of Business Administration, Sejong University, [email protected]

    Itai GurvichKellogg School of Management, Northwestern University, [email protected]

    May 2, 2016

    Call centers attribute different values to different customer segments. These values are reflected in quality-

    of-service targets. The prevelant Target Service Factor (TSF) formulation requires, for example, that 80% of

    VIP customers wait less than 20 seconds while setting the target to 30 seconds for non-VIP customers. The

    call center must determine the staffing level together with a prioritization rule that meets these targets at

    minimal cost. In practice, due to the underlying complexity of these systems, the prioritization rule is often

    selected in a heuristic manner rather than being systematically optimized. When considering the universe

    of prioritization policies, index rules provide a customizable and easy to define heuristic and for this reason

    are implemented in various call-center software packages.

    We use the TSF formulation as a stepping stone towards a better understanding of index rules. We first

    construct an asymptotically optimal solution for the TSF problem. The prioritization component of our

    solution is a tracking policy rather than an index rule. We prove that despite index rules’ significant flexibility,

    no instance of these prioritization rules is optimal for the TSF problem. The sub-optimality of index rules

    follows from an essential characteristic of these: restricting attention to index rules (as is heuristically done

    in practice) is asymptotically equivalent to requiring that a VIP customer always waits less than a regular

    (non-VIP) customer who arrives at the same time. This, in particular, implies that the use of index rules in

    practice can be rationalized if (and only if) the manager requires such strong differentiation.

    1. Introduction

    Waiting times are regarded as central to customers’ service experience and are used as a key

    performance indicator for call centers. In the spirit of measuring what is managed, call centers use

    metrics that aggregate waiting-time related information. They are collectively referred to as QoS

    (Quality-of-Service) metrics. A notable example is the Target-Service-Factor (TSF, the fraction of

    customers who exceed a target waiting time) ; see Chapter 2 of Koole [2013] for further discussion

    of call-center performance metrics. TSF is one of the key performance indicators that are traced

    and targetted in call centers.

    When the call center serves multiple customer classes, their relative importance is typically

    reflected in the TSF targets requiring, for example, that 80% of class-1 customers wait less than 20

    seconds but 80% of class 2 customers wait less than 30 seconds. Solving the call-center’s staffing

    problem in a multi-class environment requires determining the capacity level and the prioritization

    rule that, combined, guarantee that the service-level targets are met at a minimal cost.

    1

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities2

    In the two-class model that we consider here, the staffing problem is given by

    (TSF)

    min N

    s.t. P{WN,π1 >w1

    }≤ α1,

    s.t. P{WN,π2 >w2

    }≤ α2,

    N ∈Z+, π ∈Π,

    where, informally at this stage, WN,πi represents the steady-state waiting time of class i when the

    number of servers is N and the prioritization policy is π. We seek to find the minimal number

    of servers N together with a supporting prioritization policy π so that the TSF constraints are

    met. Here, Π denotes the family of “admissible” prioritization policies; see Definition 1. To the

    extent that the TSF constraint reflects relative importance, having α1 ≤ α2 together with w1 ≤w2corresponds to class 1 being the VIP class.

    Truly solving this problem to optimality by jointly determining the staffing level and the priority

    rule is difficult. This paper is the first to offer an asymptotically optimal solution for a relatively

    simple network. In practice, call centers typically restrict attention to parametric families of pri-

    oritization rules and subsequently adjust the parameters and the staffing level to meet the QoS

    targets. This heuristic approach is understandable given the complexity of the problem. Degrees

    of freedom in the prioritization policy are built into the software that manages the real time call

    selection and routing of calls. The software provides sufficiently many adjustable parameters while

    maintaining simplicity. A family of prioritization heuristics is that of index priorities; see e.g. Chan

    et al. [2014] or Koole and Pot [2006] where these are referred to as Time Function Scheduling

    (TFS). This family of policies is also well understood analytically in the context of convex holding

    cost minimization; see Van Mieghem [1995] and Mandelbaum and Stolyar [2004]

    A customer is assigned an index upon arrival. As the customer waits, his index increases period-

    ically. The system manager chooses the initial indices for each customer type as well as the update

    intervals and update-to levels (these are easily defined in a spreadsheet). This leads to an index

    function as the one in Figure 1 which is used in an Israeli Bank that serves three categories of

    retail customers. In real time, a server who becomes available serves the customer with the highest

    index.

    Somewhat more formally, customer type i is associated with a piecewise-constant index gi(·)

    so that a server that becomes available chooses for service the customer at the head of queue

    i∗ ∈ arg maxi gi(wi(t)) where wi(t) is the accumulated waiting time of the customers at the head

    of the class-i queue. This family of policies is intuitive to understand and easy to implement.

    Moreover, as both the updated periods and levels can be defined by the user it is expected to

    be rather flexible. Analytically understanding the role of index policies in constrained staffing

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities3

    Figure 1 Index functions in practice

    Figure 2 The non-index structure of optimal prioritization

    optimization is practically important, due to their use in practice, and intellectually important,

    due to the attention they have received in the literature on the optimization of queues.

    We first asymptotically solve the TSF formulation which, despite its prevalence in practice, has

    not been solved even in the simplest form of multi-class queues. We provide a simple staffing

    expression based on the M/M/N queue that explicitly captures the parameters of the problem. We

    prove that this staffing solution, together with a corresponding priority policy, is asymptotically

    optimal when the volume of arrivals is high.

    Since we are interested in better understanding the role of index rules, we proceed to show

    that all asymptotically optimal solutions to the TSF formulation are, in particular, non-index: all

    optimal prioritization policies share some non-conventional properties:

    (i) Non-monotonicity of waiting time in congestion: For at least one customer class (be it VIP or

    Regular), the waiting time has the structure depicted in Figure 2 – it is greater when the congestion

    (specifically the total queue) in the system is moderate relative to when the congestion is high.

    (ii) Discontinuity of waiting time in congestion: Customers arriving within a short time interval

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities4

    may experience starkly different waiting times (this corresponds to the discontinuity point ? in

    Figure 2).

    Since, as we also prove, index rules generate waiting time profiles that are continuous and

    monotone in the total congestion, we conclude that if one heuristically restrict attention to index

    rules – even if one choose the “best” index functions – staffing costs are non-negligibly greater than

    the minimum necessary.

    Thus, while the family of index rules is seemingly rich, there is no instance of an index rule that is

    asymptotically optimal (together with an appropriately chosen staffing level) for the TSF problem.

    To better understand this gap, it is useful to point to an interesting phenomenon one finds when

    considering the use of index rules in practice. Figure 3 displays the waiting-time processes1 on a

    single day (May 10th, 2007) in a three-class Bank call center that uses the index rule in Figure

    12.

    Figure 3 Waiting-time processes in a large Israeli bank

    A striking fact in this graph is that, sample path by sample path (we see a similar pattern

    on other days), a perfect real-time ranking between the classes is preserved, that is, “Private”

    customers wait less than their “Star” counterparts, whereas “Star” customers wait less than their

    “Rainbow” counterparts. This reflects a rather strong notion of what it means to be a VIP customer

    1 Specifically, for each 2.5 minutes interval the graph depicts the waiting time averaged over customers that arrivedover that specific time interval

    2 We are thankful to the Technion Service Enterprise Engineering Laboratory (SEE Lab) for the data

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities5

    – VIP customers wait less at all times (rather than, say, on average over a day or an hour). This

    requirement can be mathematically formulated as

    P{WN,π1w1

    ≤ WN,π2

    w2

    }≈ 1.

    This is a perfect service-level differentiation. One can view service-level differentiation (SLD) on a

    continuum parameterized by a constant β ∈ [0,1] and captured by the constraint

    P{WN,π1w1

    ≤ WN,π2

    w2

    }≥ β (SLD(β))

    The case of β = 0 corresponds to requiring that differentiation holds only in aggregate by TSF

    formulation as in: on average at least 80% of VIP customers wait less than 20 seconds and at

    least 80% of Regular customers wait less than 30 seconds – whereas β = 1 reflects a perfect real

    time differentiation as observed in Figure 3. We label by TSF+SLD(β) a formulation that has

    both the TSF constraint and the SLD(β) constraints.

    The observation that perfect SLD is preserved in this data is generalizable. Perfect service-level

    differentiation is a characterizing property of index policies. That is, requiring perfect service-level

    differentiation but giving full freedom in the choice of the prioriziation policy is equivalent to

    restricting attention to index rules in the TSF problem. More formally, the following two formula-

    tions are asymptotically equivalent:

    min N

    s.t.P{WN,π1 >w1

    }≤ α1,

    (TSF + SLD(1)) P{WN,π2 >w2

    }≤ α2,

    P{WN,π1w1≤ W

    N,π2w2

    }= 1,

    N ∈Z+, π ∈Π.

    min N

    s.t.P{WN,π1 >w1

    }≤ α1,

    ⇔ (TSF + Index) P{WN,π2 >w2}≤ α2,N ∈Z+, π ∈Π(Index).

    In words, restricting attention to index rules (as is often heuristically done) is asymptotically

    equivalent to requiring perfect SLD (i.e, β = 1). Thus, whereas the a priori restriction of some

    call centers to index rules may be driven by the software packages that they use, their choice is

    consistent with optimality, provided that perfect SLD is desirable to them.

    We close this introduction with a few important pointers to index rules in the literature. The use

    of index-rule based heuristics in the context of staffing subject to QoS targets is not grounded in

    theoretical foundation. However, it is the opposite case when considering holding cost (or waiting-

    time cost) minimization for given staffing levels. The best known results in this context pertain

    to the optimality of the Generalized cµ (Gcµ) rule for the minimization of convex holding costs

    as pioneered by Van Mieghem [1995]. Waiting-cost minimization is also the subject of subsequent

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities6

    extensions to Van Mieghem’s original work; see e.g. Mandelbaum and Stolyar [2004] and many oth-

    ers that followed. Such rules were also shown to be optimal for some constrained staffing problems.

    Average Speed of Answer (ASA) constraints give rise (as one possible solution) to a Fixed-Queue-

    Ratio rule – a special case of an index rule; see Gurvich and Whitt [2007].

    Thus, in certain contexts, these rules are very well understood. We expand this body of work

    with some new insights about the role of index rules in constrained staffing problems. Constrained

    staffing problems in multi-class queues have been also studied and we refer the reader to Aksin et al.

    [2007] for some references. The typical paper in this literature seeks to identify one optimal (or

    nearly optimal) solution. To that literature we add our solution to the TSF formulation. However,

    questions as we raise here, that pertain to structural properties of the family of all asymptotically

    optimal solutions, do not arise naturally in that line of work. They arise naturally here when we

    seek to understand the role of index rules in staffing problems.

    2. Model and analysis framework

    We consider the two-class V model. Arrivals of class-i customers follow a Poisson process Ai =

    (Ai(t), t≥ 0) with rate λi > 0; i= 1,2. Let λ= λ1 +λ2. The two Poisson processes are independent.

    Service times are exponential with rate µ and are independent across customers and independent

    of the arrival processes. Various metrics of interest will be superscripted by the staffing level N

    and the prioritization policy π to denote their dependence on these decision variables.

    The TSF and SLD constraints were defined in the introduction using a virtual waiting time

    WN,πi . The Poisson Arrivals See Time Averages (PASTA) property guarantees that the fraction of

    customers who wait more than a target w equals the fraction of time that the virtual waiting time

    exceeds this value.

    We substitute the virtual waiting time with a proxy - the local average waiting time, which is

    based on the actual waiting times. Let wN,πi,k be the waiting time of the kth class-i customer to

    arrive. The local average waiting time of customers arriving over an interval (t, t+ δ] is defined to

    be,

    WN,πδ,i (t) :=1

    Ai(t+ δ)−Ai(t)

    Ai(t+δ)∑k=Ai(t)+1

    wN,πi,k , (1)

    where 0/0 is defined to be 0.

    Local average is the measure used in Figure 3 where δ is 2.5 minute. If δ is kept sufficiently small,

    the local average captures the dynamics and variation of waiting time. The use of local average

    (rather than virtual waiting time) helps us overcome a technical difficulty (see Lemma 5.1).

    We assume throughout that

    α1, α2 > 0, α1 ≤ α2 and α1 +α2 < 1. (2)

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities7

    If α1 +α2 ≥ 1, the staffing problem trivializes and, even for small values of w1,w2 > 0, any staffing

    level N >λ/µ is feasible; see the discussion on page page 285 of Gurvich et al. [2008],

    We allow the controller to use information about the length of the queues, the elapsed service

    times of customers that are in service and the accumulated waiting times of the all customers in the

    queues and we let X(t) be the state descriptor that captures all this information; see the appendix

    for the formal definition.

    Definition 1 (admissible policies): A prioritization policy π is admissible if:

    1. It does not use admission control – all customers are served.

    2. It serves customers in a first-come first-served (FCFS) fashion within each class.

    3. It is work conserving and stationary with respect to X.

    Let Π be the family of admissible policies.

    Non-preemption and FCFS within each class are both natural assumptions for our application.

    The literature does have cases in which non-work-conserving policies are shown to be optimal or

    asymptotically optimal (see Gurvich et al. [2008]). The results in Gurvich et al. [2008] are driven,

    however, by order-of-magnitude differences between the probability targets of the various classes

    (say α1 = 0.01 but α2 = 0.2) that lead inherently to solutions that induce perfect SLD rendering our

    main research questions irrelevant. Work conservation has the following immediate consequence

    which we state formally and repeatedly refer to.

    Lemma 2.1 Fix the number of servers N , the service rate µ and the arrival rates λ1, λ2. Then,

    under any admissible policy π, the total number of customers in the system XΣ(t) (respectively the

    total queue QΣ(t)) follows the distribution of the total number of customers (respectively the queue

    length) in an M/M/N queue with arrival rate λ= λ1 +λ2, service rate µ and N servers.

    Many-server analysis: The problem TSF+SLD(β) is difficult to solve exactly. Instead, we resort

    to identifying solutions that are asymptotically optimal as the system size grows. A series of solu-

    tions is asymptotically optimal if the induced optimality gap is negligible in a central-limit-theorem

    (CLT) scaling. This is consistent with many studies in the context of call-center optimization; see

    e.g. Armony [2005], Armony and Mandelbaum [2011], Borst et al. [2004]. We consider a sequence

    of V models with the total arrival rate λ= λ1 + λ2 increasing along the sequence but keeping the

    service rate µ fixed. We assume that there exists ai > 0, i= 1,2 so that a1 + a2 = 1 and λi = aiλ

    for i= 1,2 and all λ.

    All relevant quantities are superscripted by λ to make the dependence on λ explicit. As is

    standard in the literature, the targets w1 and w2 scale with λ: wλi =wi/

    √λ , i= 1,2 where w̄i’s are

    fixed strictly positive constants. This guarantees that the system is in the Halfin-Whitt regime (see

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities8

    Lemma 5.5 below). The Halfin-Whitt regime facilitates a CLT type of analysis. In our numerical

    experiments we illustrate how our proposed solutions perform for given parameters (rather than

    asymptotically); see §4.

    To simplify the notation, we denote a staffing-policy pair (N,π) by ξ and let N(ξ) and π(ξ) be,

    respectively, the staffing and prioritization components of ξ. The problem TSF+SLD(β) is now

    re-written as:

    minξλ

    N(ξλ)

    s.t. P{W ξ

    λ,λδ,1 (∞)>wλ1

    }≤ α1,

    P{W ξ

    λ,λδ,2 (∞)>wλ2

    }≤ α2, (3)

    P

    {W ξ

    λ,λδ,1 (∞)wλ1

    ≤W ξ

    λ,λδ,2 (∞)wλ2

    }≥ β,

    ξλ ∈Z+×Π.

    W ξλ,λ

    δ,i (∞) should be read as the the local average (as in (1)) when the system is initialized (at

    t = 0) with its steady-state distribution, the customer incoming rate is λ and the staffing-policy

    pair ξλ is used.

    Definition 2 (asymptotic feasibility): A sequence of staffing-policy pairs{ξλ}

    is asymptotically

    feasible, if ξλ ∈Z+×Π for all λ and for each � > 0,

    limsupδ→0

    limsupλ→∞

    P{W ξ

    λ,λδ,i (∞)>wλi (1− �)

    }≤ αi(1 + �), i= 1,2,

    and

    limsupδ→0

    limsupλ→∞

    P

    {W ξ

    λ,λδ,1 (∞)wλ1

    ≤W ξ

    λ,λδ,2 (∞)wλ2

    + �

    }≥ β(1− �).

    We say that a sequence {Nλ} is an asymptotically feasible sequence of staffing levels for (3) if

    Nλ =N (ξλ) for a sequence {ξλ} of asymptotically feasible staffing-policy pairs.

    Definition 3 (asymptotic optimality): A sequence of staffing-policy pairs {ξλ} is asymptoti-

    cally optimal if it is asymptotically feasible and[N(ξλ)−N(ξ̃λ)

    ]+= o

    (√λ)

    as λ→∞,

    for any other sequence {ξ̃λ} of asymptotically feasible staffing-policy pairs.

    We say that a sequence {Nλ} is an asymptotically optimal sequence of staffing levels for (3) if

    Nλ =N (ξλ) for an asymptotically optimal sequence {ξλ} of staffing policy pairs.

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities9

    3. The proposed solution to TSF+SLD and its implications

    This section contains our main results. We identify an asymptotically optimal staffing and priori-

    tization solution. The two components of our solution are inter-dependent. The proposed staffing

    level is feasible if one uses it together with our proposed prioritization policy. For simplicity of

    exposition, we discuss them separately (each in a dedicated subsection), starting with the staffing

    component. Theorems stated in this section are proved in §5 and in the technical supplement Soh

    and Gurvich [2015].

    3.1. Optimal staffing and the cost of SLD

    To specify our staffing recommendation, let QN,λ(∞) be a random variable whose distribution is

    that of the steady-state queue length in an M/M/N queue with arrival rate λ, service rate µ and

    N servers (as µ is fixed throughout, we omit it from the superscript).

    For each λ, let Nλ? be the solution of the following M/M/N staffing problem

    Nλ? = min{N ∈Z+ : P

    {QN,λ(∞)≥ λ1wλ1 +λ2wλ2

    }≤min{α1,1−β}+α2

    }. (4)

    Notice that, by (2), min{α1,1−β}+α2 < 1 holds

    Theorem 1 (asymptotically optimal staffing) The sequence {Nλ? } is an asymptotically opti-

    mal sequence of staffing levels for (3).

    The prioritization component π (ξλ) of ξλ, constructed in the next subsection, guarantees that all

    the inequalities W ξλ,λ

    δ,1 (∞)≤wλ1 , Wξλ,λδ,2 (∞)≤wλ2 and W

    ξλ,λδ,1 (∞)/w1 ≤W

    ξλ,λδ,2 (∞)/w2 are asymp-

    totically satisfied when the total queue length is smaller than λ1wλ1 + λ2w

    λ2 . Thus, the choice of

    the staffing level (4) guarantees that the total violation is bounded by min{α1,1− β}+ α2. The

    challenge in designing the priorities is to make sure that this “violation budget” is distributed

    correctly between the three constraints; for example, the TSF constraint for class 2 absorbs at

    most α2 of the total violation budget.

    Remark 1 (aggregate-based staffing) Theorem 1 maps TSF+SLD(β) into a staffing problem for

    a single class queue. The fact that this solution works for the multiclass queue will be supported

    by our careful choice of the prioritization rule. Such a reduction to a one dimensional model is a

    recurring theme in recent literature on staffing; see e.g. Armony and Mandelbaum [2011], Gurvich

    et al. [2008], Gurvich and Whitt [2007]. In the latter two papers, a global Average Speed of Answer

    (ASA) constraint allows for a relatively simple mapping of the parameters into the single-class

    staffing problem. This mapping is more subtle in our setting as is reflected in the non-trivial way

    in which the constants α1, α2 and β appear on the right-hand side of (4).

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities10

    Figure 4 η∗(β) as a function of β for α1 = 0.3, α2 = 0.2, w̄1 = 2 and w̄2 = 4

    Theorem 2 The sequence {Nλ? } satisfies

    Nλ? =λ

    µ+ η?

    √λ

    µ+ o(√λ),

    where η? is convex increasing in the SLD degree β. It is strictly increasing for β ≥ 1−α1.

    Increasing the degree of SLD beyond a certain level (in particular, setting β = 1) is thus costly: the

    optimal staffing level for the TSF formulation is too small for TSF+SLD(1). Conversely, relaxing

    the SLD requirement reduces the staffing costs.

    3.2. Prioritization and the optimality of index policies

    The asymptotically optimal staffing {Nλ? } is coupled with a carefully chosen sequence of policies

    {πλ?} such that the sequence of staffing-policy pairs {ξλ = (Nλ? , πλ? )} is asymptotically optimal in

    the sense of Definition 3. The policy that we construct in this section is an instance of so-called

    tracking policies.

    Tracking policies defined below are also admissible policies (customers are served in a FCFS

    manner within the same class). The queue length of class i at time t for i= 1,2 is denoted by Qλi (t)

    and QλΣ (t) is the total queue length, i.e., QΣ (t) = Q1 (t) +Q2 (t). From these, normalized queue

    lengths are defined as follows:

    Q̂i(t) :=Qi(t)√λ, Q̂Σ(t) := Q̂1(t) + Q̂2(t).

    Definition 4 (queue tracking policies) Let f = (f1, f2) : R+→ R2+ be a non-negative function

    such that f1(x) + f2(x) = x for all x ≥ 0. A server that becomes available at time t chooses the

    queue to serve according to the following criterion:

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities11

    Figure 5 Tracking: (Left) General description (Right) Proposed tracking function under sub-perfect SLD (β < 1)

    1. If one queue is empty but the other is not, admit to service the customer at the head of the

    non-empty queue.

    2. If both queues are non-empty and Q̂1(t)− f1(Q̂Σ(t)) 6= Q̂2(t)− f2(Q̂Σ(t)), choose the class i

    with the positive value of Q̂i(t)− fi(Q̂Σ(t)).

    3. If both queues are non-empty and Q̂1(t)− f1(Q̂Σ(t)) = Q̂2(t)− f2(Q̂Σ(t)), choose class 1.

    An arriving customer is immediately assigned to an available server if there are such servers. If all

    servers are busy upon this customer’s arrival, the customer is placed in the queue.

    Figure 5(Left) illustrates the tracking function mechanism. It plots the targeted value for Q̂1

    (f1(Q̂Σ)) vs. the value of Q̂Σ. If at time t, Q̂1(t)> f1(Q̂Σ(t)) (as in point A in the figure) the rule

    prioritizes class 1 so as to pull it back toward f1(Q̂Σ(t)). If Q̂1(t) < f1(Q̂Σ(t)) as in point B in

    the graph, the rule prioritizes class 2 over class 1 so as to push queue 1 towards f1(Q̂Σ(t)). The

    tracking policy targets Q̂i(t)≈ fi(Q̂Σ(t)).

    The family of tracking functions is immense. The challenge is to carefully choose the tracking

    function f so that together with our proposed staffing the solution is asymptotically optimal (in

    particular, asymptotically feasible).

    Optimal tracking functions: We use the tracking function f?1 defined below, as plotted also in

    Figure 5(Right):

    f?1 (x) =

    0, 0≤ x< a1w̄1+a2w̄2a1w̄1

    κ,a1w̄1

    a1w̄1+a2w̄2x−κ, a1w̄1+a2w̄2

    a1w̄1κ≤ x< a1w̄1 + a2w̄2− 2

    (1 + a1w̄1

    a2w̄2

    )κ,

    2a1w̄1+a2w̄22(a1w̄1+a2w̄2)

    x− a2w̄22, a1w̄1 + a2w̄2− 2

    (1 + a1w̄1

    a2w̄2

    )κ≤ x< a1w̄1 + a2w̄2,

    x− a2w̄2 +κ, a1w̄1 + a2w̄2 ≤ x< q̄,a1w̄1−κ, q̄≤ x.

    (5)

    and f?2 (x) = x− f?1 (x).

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities12

    The threshold q̄ is given by

    q̄= a1w̄1 + a2w̄2 +log{α2+min{α1,1−β}

    α2

    }η? (β)

    (6)

    where η?(β) is the unique solution (keeping other parameters constant) to(1 +

    ηΦ(η)

    φ(η)

    )−1exp(−η(a1w̄1 + a2w̄2)) = min{α1,1−β}+α2, (7)

    Here, φ(·), Φ(·) are, respectively, the standard normal density and distribution functions. The

    constant κ can be any

    0

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities13

    The choice of the parameter q̄, combined with the optimal staffing in Theorem 1, guar-

    antees that P{Q̂ξ

    λ,λΣ (t)∈ [a1w̄1 + a2w̄2, q̄)

    }≈ min{α1,1−β} and P

    {Q̂ξ

    λ,λΣ (t)∈ [q̄,∞)

    }≈

    α2. Then, P{W ξ

    λ,λδ,1 (t)>w1

    }≈ P

    {W ξ

    λ,λδ,1 (t)/w

    λ1 >W

    ξλ,λδ,2 (t)/w

    λ2

    }≈ min{α1,1−β} and

    P{W ξ

    λ,λδ,2 (t)>w2

    }≈ α2 hold and the solution is feasible. In the following theorem, πλf? denotes

    the tracking policy which applies the tracking function f?.

    Theorem 3 (asymptotic optimality) Let Nλ? be as in (4). Then, the sequence of staffing-policy

    pairs {ξλ = (Nλ? , πλf?)} is asymptotically optimal.

    Remark 2 (on the interplay of cost reduction and prioritization) There are degrees of freedom in

    choosing the tracking function – there are various choices that generate the same asymptotic opti-

    mality result. We chose f? so that SLD is violated only when the total queue is small (specifically,

    less than q̄). With this choice, f? has the appealing property that class-1 (the VIP) customers get

    superior service when the total congestion in the system is large. Thus, reducing β (from β < 1)

    allows for a cost reduction in terms of staffing (recall Theorem 2) while making sure that the VIP

    customers get superior service when it really matters.

    One thing that may strike the reader as non-standard in Figure 5(Right) is the discontinuity and

    non-monotonicity of the tracking function f?. This is not a consequence of the specific way in which

    we choose the functions – as the next theorem shows it cannot be avoided without compromise to

    optimality.

    For the formal statement, we say that a tracking policy is non-monotone if at least one of the

    functions f1(·) and f2(·) is non-monotone. A function f is defined to be piecewise Lipschitz if it has

    a finite number of discontinuity points {x1, . . . , xm,} and is Lipschitz continuous on each interval

    (including [0, x1) and [xm,∞)).

    Theorem 4 (non-monotonicity and discontinuity) Assume β < 1 (sub-perfect SLD) and let

    f be a piecewise Lipschitz tracking function such that the sequence {(Nλ? , πλf )} is an asymptotically

    optimal sequence of staffing-policy pairs. Then, f is discontinuous and non-monotone.

    The optimal staffing allows for a limited “budget” that must be carefully allocated to the three

    different constraints (the two TSF constraints and the SLD constraints). For different values of Q̂Σ

    one must choose which constraints are “sacrificed” in favor of meeting the others. What we prove

    is that the only way to do that while minimizing the total budget is to alternate the priorities and

    this, in particular, introduces the non-monotonicity and the discontinuity that are stated in the

    theorem.

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities14

    On the equivalence of index policies and perfect SLD: Theorem 4 proves that, asymptotically,

    optimal tracking functions must be discontinuous and non-monotone for β < 1. That is, imposing

    monotonicity (which is a property of index policies; see below) is related to requiring perfect SLD

    (β = 1). We next show that requiring perfect SLD and restricting the tracking functions to be

    monotone is indeed equivalent in the asymptotic sense. More importantly, proving that monotone

    tracking rules are equivalent to index policies, allows us to conclude the equivalence between

    imposing a perfect SLD constraint and restricting the optimization to use only index policies.

    To formalize this result, recall that an index policy is one that, upon service completion at time

    t, admits to a customer from class

    i∗ = arg maxi

    gi(Qi(t)),

    where g1 and g2 are non-negative increasing functions. We denote by Π(index)⊂Π the family ofindex policies.

    Theorem 5 (equivalence of index policies and SLD(1) ) There exists a sequence of staffing-

    policy pairs {ξλ = (Nλ, πλ)} that is asymptotically optimal for both formulations:

    minξλN(ξλ)

    s.t.P{W ξ

    λ,λδ,1 (∞)>wλ1

    }≤ α1,

    (TSF + SLD(1)) P{W ξ

    λ,λδ,2 (∞)>wλ2

    }≤ α2,

    P

    {Wξλ,λδ,1

    (∞)

    wλ1≤

    Wξλ,λδ,2

    (∞)

    wλ2

    }= 1,

    ξλ ∈Z+×Π.

    minξN(ξλ)

    s.t.P{W ξ

    λ,λδ,1 (∞)>wλ1

    }≤ α1,

    (TSF + Index) P{W ξ

    λ,λδ,2 (∞)>wλ2

    }≤ α2,

    ξλ ∈Z+×Π(index).

    In particular, letting N∗,λ1 be the objective function value of TSF+SLD(1) and N∗,λ2 be that of

    TSF+Index, we have that |N∗,λ1 −N∗,λ2 |= o(

    √λ).

    Theorem 5 is argued in three steps stated in Propositions 6-8. The first, Proposition 6, shows that

    index policies and monotone tracking policies are separate mathematical representations of the

    same policy and the proof is in 5.2.1.

    Proposition 6 (equivalence of index policies and monotone tracking policies ) Given a

    monotone tracking function f , there exists an index function g such that the monotone tracking

    policy with f is equivalent to the index policy with index g. Similarly, given an index function g,

    there exists a monotone tracking function f such that the index policy with g is equivalent to the

    tracking policy with f . The equivalence is in the sense that at any given state of (Q1 (t) ,Q2 (t)),

    both policies take the same action.

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities15

    Given this equivalence, to prove Theorem 5, it remains to shows that TSF+SLD(1) is equivalent

    to the following:

    minξλ N(ξλ)

    (TSF + MT) s.t. P{W ξ

    λ,λδ,1 (∞)>wλ1

    }≤ α1,

    P{W ξ

    λ,λδ,2 (∞)>wλ2

    }≤ α2,

    ξλ ∈Z+×Π(MT) ,

    where Π(MT) ⊂ Π is the family of monotone tracking policies. Note that a monotone tracking

    function must also be continuous by x= f1(x) + f2(x). It is easy to prove. For an arbitrary point

    x1 > 0 and � > 0, f1(x1)≤ f1(x1 + �)≤ f1(x1) + �. The last inequality is from x1 + �− f1(x1 + �) =

    f2(x1 + �)≥ f2(x1) = x1− f1(x1). We see as �→ 0, f1(x1 + �)→ f1(x1). The cases for � < 0 and for

    f2 are verified similarly.

    The equivalence is proved in the next two propositions. The first, Proposition 7, proves that

    there exists a monotone tracking policy which, with the staffing level from Theorem 1, satisfies the

    constraints in TSF+SLD(1).

    Proposition 7 (nearly optimal monotone solutions for TSF+SLD(1) ) For any ϑ > 0,

    there exists a monotone tracking function fϑ and a sequence {Nλϑ} such that {(Nλϑ , πλfϑ)} is asymp-

    totically feasible for TSF+SLD(1) and |Nλϑ −Nλ? | ≤ ϑ√λ.

    The nearly optimal tracking function, whose existence is established in Proposition 7, is depicted

    in Figure 6 and explicitly specified as follows:

    fϑ1 (x) =

    0, 0≤ x< a1w̄1+a2w̄2

    a1w̄1κϑ,

    a1w̄1a1w̄1+a2w̄2

    x−κϑ, a1w̄1+a2w̄2a1w̄1

    κϑ ≤ x< a1w̄1 + a2w̄2,a1w̄1−κϑ, a1w̄1 + a2w̄2 ≤ x.

    In the figure the constant qϑb is the point of intersection between the tracking function fϑ1 (x) and

    the line x− a2w̄2. By (8), W ξλ,λ

    δ,1 (t)≤wλ1 , Wξλ,λδ,2 (t)≤wλ2 and W

    ξλ,λδ,1 (t)/w

    λ1 ≤W

    ξλ,λδ,2 (t)/w

    λ2 hold at

    times in which Q̂ξλ,λ

    Σ (t)≤ qϑb and only the constraint Wξλ,λδ,2 (t)≤w2 is violated when Q̂

    ξλ,λΣ (t)> q

    ϑb .

    Notice in Figure 6 that fϑ2 (x) = x− fϑ1 (x)>a2w̄2 when x> qϑb .

    Proposition 8 closes the equivalence argument toward Theorem 5 in showing that, in terms

    of staffing, requiring monotonicity is as demanding as requiring SLD(1). Since we identified in

    Proposition 7 that a monotone tracking solution is asymptotically optimal for TSF+SLD(1), we

    conclude that TSF+MT and TSF+SLD(1) are asymptotically equivalent and, by Proposition 6,

    so are TSF+index and TSF+SLD(1).

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities16

    Figure 6 Proposed tracking function under perfect SLD (β = 1)

    Proposition 8 (equivalence of monotone tracking and SLD(1) under TSF constraints)

    Let{ξλ1}

    be a series of asymptotic solutions for TSF+SLD(1) and{ξλ2}

    be one for TSF+MT.

    Then,

    lim infλ→∞

    N (ξλ2 )−N (ξλ1 )√λ

    ≥ 0.

    Staffing

    TSF+Gcµ TSF+SLD(𝛽)

    𝛽

    TSF+Index TSF+SLD(β)

    Figure 7 Optimal staffing levels

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities17

    Remark 3 (the cost of index policies) Figure 7 displays the minimal staffing requirement to satisfy

    the TSF constraints P{W1 ≤ 10 sec.} ≤ 0.2 and P{W2 ≤ 10 sec.} ≤ 0.2, when the arrival rate of

    each class is 50 customers per minute and the average service time is 5 minutes. The asymptotic

    optimal staffing is derived using Theorem 1. When adding the SLD constraint, the cost changes

    with the value of β in the SLD constraint as β varies between 0.65 and 1. This is is captured by

    the solid line. When, instead of having the SLD constraint, one restricts attention to index policies

    the staffing level is always 517 (captured by the dashed line). The gap between the two lines is the

    cost of the restriction to index policies. For β = 1, consistent with our equivalence result, there is

    no gap. For small value of β the gap is 7 servers, corresponding to a non-negligible gap relative to

    the square root of the system size.

    4. Numerical experiments

    The purpose of this section is to illustrate the performance of the solutions we derived via the

    asymptotic analysis. We use this opportunity to underscore some important aspects of the proposed

    (nearly optimal) staffing rule.

    Altogether, we consider 40 parameter combinations (4 sets, each consisting of 10 cases). Across

    cases in each set, we vary the arrival rates λ1, λ2 and the target parameters w1,w2 as in Table 1.

    Note that these parameter combinations are the same across sets, i.e., the parameter combinations

    for Scenario k of Set i is the same with that of Scenario k of Set j. We also vary α1, α2 for the TSF

    constraints and the SLD target β for the SLD constraint as in Table 2. Set 1 and Set 2 differ in

    that Set 1 has no SLD constraint (or β = 0). Set 1 and Set 3 differ in that Set 3 has smaller α1’s

    (tighter TSF for class 1). Lastly, Set 3 and Set 4 differ in that Set 4 requires perfect SLD).

    Scenario 1 2 3 4 5 6 7 8 9 10λ1 125 100 250 200 500 400 500 500 500 500λ2 125 150 250 300 500 600 500 500 500 500w1 0.2 0.2 0.1 0.1 0.05 0.05 0.025 0.025 0.1 0.1w2 0.3 0.3 0.15 0.15 0.07 0.07 0.03 0.03 0.2 0.2

    Table 1 Common parameters for the whole sets

    The mean service time is set to 1 time unit throughout (all parameters are specified in the same

    time units, so that λ= 125 for example corresponds to a rate of 125 per time unit, be it an hour

    or a minute). The number of servers (staffing) is determined via our formula (4)

    Nλ? (λ1, λ2,w1,w2, α1, α2, β) = min{N ∈Z+ : P

    {QN,λ(∞)≥ λ1wλ1 +λ2wλ2

    }≤min{α1,1−β}+α2

    }.

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities18

    Scenario 1 2 3 4 5 6 7 8 9 10α1 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.2 0.3 0.2α2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.15 0.2 0.15Set 1β 0 (No SLD constraint)α1 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.2 0.3 0.2α2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.15 0.2 0.15Set 2β 0.925 0.925 0.925 0.925 0.925 0.925 0.925 0.95 0.925 0.95α1 0.15 0.15 0.15 0.15 0.15 0.15 0.15 0.1 0.15 0.1α2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.15 0.2 0.15Set 3β 0 (No SLD constraint)α1 0.15 0.15 0.15 0.15 0.15 0.15 0.15 0.1 0.15 0.1α2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.15 0.2 0.15Set 4β 1 (Perfect SLD constraint)

    Table 2 Probability targets for each set

    Given Nλ? (different for each parameter combination), we compute η= (Nλ? − (λ1 +λ2)/µ)/

    √λ and

    use it to design the propopsed tracking functions as in (5). This gives us the tracking policy which,

    as before, we denote by π.

    We built a simulation model in ARENA. For each parameter combination, we simulate a sample

    path of the two-class queue with the proposed solution (Nλ? , π)(λ1, λ2,w1,w2, α1, α2, β) for T =

    40000 time units–corresponding to 8-40 million arrivals depending on the parameter combination.

    At the end of each simulation run, we record 1Ai(T )

    ∑Ai(T )k=1 1{wi,k >wi(1− �)} with Ai(T ) being the

    number of class i customers that arrived over the simulation horizon and wi,k is the actual waiting

    time of the kth class i customer processed. Since

    1

    Ai(T )

    Ai(T )∑k=1

    1{wi,k >wi(1− �)}→ P{Wi(∞)>wi(1− �)} as T →∞,

    this statistic provides a proxy for the performance of the policy with respect to the TSF constraints.

    Our asymptotic theory says that under the proposed solution it holds that limsupλ→∞ P{W λi (∞)>

    wλi (1 − �)} ≤ αi(1 + �)(i = 1,2) and hence we check whether 1Ai(T )∑Ai(T )

    k=1 1{wi,k > wi(1 − �)} ≥

    αi(1 + �) holds or not for each i.

    For the SLD constraint, we fix δ= 0.05 and compute at intervals of size δ, Wδ,i(t) as in equation

    (1). Specifically, for each m= 1, . . . ,M = 40000/0.05 = 800000, we compute

    Wδ,i (mδ) :=1

    Ai(mδ+ δ)−Ai(mδ)

    Ai(mδ+δ)∑k=Ai(mδ)+1

    wi,k.

    With the fixed δ, the long run average

    1

    M

    M∑m=1

    1

    {Wδ,1(mδ)

    w1≤ Wδ,2(mδ)

    w2(1 + �)

    }→ P

    {Wδ,1(∞)w1

    ≤ Wδ,2(∞)w2

    (1 + �)

    }as M →∞.

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities19

    Figure 8 Set 1(TSF Formulation (β = 0)): (Left) Performance with recommend staffing; (Right) With one server

    added for each violating scenario

    serves as a proxy for the SLD and we check, in the simulation, whether it exceeds β(1− �) or not.

    We set �= 0.1 which is a reasonably ambitious criterion. The math says that as λ grows large with

    appropriate scaling we could take � to be increasingly smaller.

    Figure 8 depicts the results for Set 1. We plot two bar series corresponding to

    the realized TSFs ( 1Ai(T )

    ∑Ai(T )k=1 1{wi,k > wi(1 − �)}(i = 1,2)) and the realized SLD

    ( 1M

    ∑Mm=1 1

    {Wδ,1(mδ)

    w1≤ Wδ,2(mδ)

    w2(1 + �)

    }) is captured by the black-circle series for each of the sce-

    narios. Notice that there is no target on the SLD but, still, we can compute the outcome.

    The labels report the values (the TSF series correspond to the left y-axis and the SLD to the

    right y-axis). Colored in white are the metrics that do not meet the feasibility criteria αi(1 + �)

    of the TSF constraints. The performance is rather impressive. There are 5 violations (out of 20

    constraints) and with the exception of the violations of class 1 in scenario 6 and of class 2 in

    scenario 7 (0.345 instead of 0.33 and 0.229 instead of 0.22), the others are truly small (as in 0.332

    instead of the targetted 0.33 in scenario 5 or 0.221 instead of 0.22 in scenario 10). Scenarios 5-8

    have the smallest wi targets (e.g. w1 = 0.025 and w2 = 0.03) and hence the most sensitive. The

    violations are resolved with the addition of a single server as plotted in the graph on the right of

    the figure.

    Notably, the SLD is very high (indeed, around 0.7 or higher) even though we imposed no SLD

    constraint. This is an opportunity to recall Figure 7 where the required number of servers is

    invariant to β as long as 1− β ≥ α1. With a target of α1 = 0.3 this means one gets “for free” an

    SLD of up to β = 0.7 (when α1 = 0.3) and up to β = 0.8 (when α1 = 0.2 as in Scenario 8 and 10 of

    Table 2). Our policy uses this allowance.

    We run 10 simulations with newly computed staffing levels and the policies for Set 2 to consider

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities20

    0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8

    1

    2

    3

    4

    5

    6

    7

    8

    9

    10

    b0 bg0

    Normalized safety capacity = 

    Set 2 (           ) Set 1 (           ) 

    Figure 9 (Left) Performance with proposed staffing for Set 2 (TSF+SLD(β)); (Right) The cost of SLD

    TSF+SLD(β) with positive β’s. The results are reported in Figure 9. In this case, the TSF criteria

    are met in all 10 scenarios with the exception of minor violations in the first two scenarios (that

    disappear with the addition of one server). The SLD is mostly around 0.9 which is close to the

    target and always above the criterion of β(1− �). We notice again that α1 matters only throughmin{α1,1− β}: For a given β target the solution can afford to have α1 as small as 1− β and thepolicy uses some of this allowance. Thus, even though α1 >α2, the performance obtained is such

    that P{W1 >w1}< P{W2 >w2} in all scenarios.Finally, on the right of Figure 9 we visualize the cost difference of TSF vs. TSF+SLD(β) by

    comparing the staffing level for TSF that underlies the graph on the right of Figure 8 and the

    staffing level of TSF+SLD(β) that underlies the left graph of Figure 9. We plot the normalized

    safety capacity (i.e., the servers required above the offered load λ/µ.) These numbers are proxies

    for the square-root coefficient η∗ in Theorem 2 and capture the cost of SLD.

    The parameter combinations in Set 3 are obtained from Set 1 by introducing more demanding

    values of α1 (e.g. 0.15 instead of 0.3). It underscores further the interaction between α1 and β. The

    results are reported in Figure 10. There are negligible violations (0.166 instead of 0.165) for class

    1 in scenarios 5 and 9. The more substantial violations are in scenarios 3,7. The violations are

    concentrated in scenarios with the smallest wi (w1 = 0.025 and w2 = 0.03). An extra staffing corrects

    for some of this but leaves slight violations for class 1. This should not be surprising: with very

    small w and α values the system is pushed away from heavy-traffic and hence the approximations

    are somewhat less precise. Finally, even though β = 0, our policy again extracts as much SLD as

    it can get “for free” which is 1−α1 (0.85 or 0.9 depending on the scenario).In Set 4, we consider the case of perfect SLD (β = 1): we use the same data from Set 3 but

    put β = 1 as a requirement in all scenarios. The policy is a monotone policy as in Figure 6. The

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities21

    Figure 10 Set 3 (TSF (β = 0), tight α): (Left) Performance with recommended staffing; (Right) With one server

    added for each violating scenario

    performance is reported in Figure 11. The TSF targets are met with excess under the recommended

    staffing and the SLD is very close to 1 for all values. The graph on the right is a visualization of

    SLD and the workings of the asymptotic arguments. One informally expects (and our proofs are

    built on formally establishing that) Wδ,i(t)/wi ≈Wi(t)/wi ≈Qi(t)/λiwi because of the sample pathLittle’s law and since δ is small. With SLD close to 1 we expect

    Q1(t)

    λ1w1≤ Q2(t)λ1w2

    , t≥ 0.

    The graph on the right nicely visualizes this asymptotic argument. We simulate Scenario 1 (λ1 =

    λ2 = 125) of Set 4 and zoom-in on the time interval [460,480] to obtain a clear view.

    460 470 480

    1.5

    1

    0

    0.5

    Simulation time

    Figure 11 Set 4 (TSF + Perfect SLD (β = 1)): (L) Performance with recommended staffing (R) samplepath SLD

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities22

    5. Proofs

    This section is dedicated to the proofs of our main results. §5.1 proves Theorem 3 which is on the

    asymptotic optimality of our proposed staffing and policy. §5.2 contains proofs of the theorems

    that deal with the equivalence of index policies and perfect SLD. The proofs of other theorems and

    auxiliary lemmas are in Soh and Gurvich [2015].

    5.1. Proof of Theorem 3:

    We divide the lengthy proof into sub-modules as follows: we first prove an equivalence between

    the waiting time formulation (3) and a queue-based formulation. This may be somewhat expected

    given (8). The challenge, however, is to establish such a “Little’s law” result at the formulation

    level (rather than for a given policy as is typically the case). It is in this step where considering

    local averages is instrumental. Building on this asymptotic equivalence, a lower bound for the

    local-average-queue-based formulation provides an asymptotic lower bound for the waiting-time

    formulation. We identify the staffing recommendation Nλ? in (4) as such a bound; see §5.1.1.

    In §5.1.2 we proceed to show that the tracking policy with our proposed function f? indeed has

    the desired asymptotic properties in that, when used, it results in Q̂ξλ,λi (t)≈ f?i (Q̂

    ξλ,λΣ (t)). Whereas

    this type of so-called state-space collapse is, by now, a standard result, a challenge arises here from

    the possible discontinuity of the tracking function f?.

    We combine the pieces in §5.1.3 to have the proof of Theorem 3. We prove there that our proposed

    solution is asymptotically feasible. Since its staffing component (and, in particular, its cost) is a

    lower bound on any such solutions, we conclude that our solution is asymptotically optimal.

    5.1.1. From waiting times to queues Before directly showing the relationship of (8), we

    introduce local average queue lengths which connect local average waiting times and queue lengths:

    Qξλ,λδ,i (t) =

    1

    δ

    ∫ t+δt

    Qξλ,λi (s)ds. (9)

    The following lemma shows that if a sequence {Nλ? } is asymptotically feasible for (3) it must

    (asymptotically) satisfy the appropriate constraints in terms of queues and vice versa. We let

    Qξλ,λδ,i (∞) be the δ average when the queue length is initialized (at time t= 0) with its stedy-state

    distribution.

    Lemma 5.1 Let {ξλ} be a sequence of admissible staffing-policy pairs. Then the followings are

    equivalent.

    1. For each �1 > 0, the following holds.

    limsupδ→0

    limsupλ→∞

    P{W ξ

    λ,λδ,1 (∞)>wλ1 (1− �1)

    }≤ α1 + �1,

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities23

    limsupδ→0

    limsupλ→∞

    P{W ξ

    λ,λδ,2 (∞)>wλ2 (1− �1)

    }≤ α2 + �1,

    limsupδ→0

    limsupλ→∞

    P

    {W ξ

    λ,λδ,1 (∞)wλ1

    ≤W ξ

    λ,λδ,2 (∞)wλ2

    − �1

    }≥ β− �1.

    2. For each �2 > 0, the following holds.

    limsupδ→0

    limsupλ→∞

    P{Qξ

    λ,λδ,1 (∞)>λ1wλ1 (1− �2)

    }≤ α1 + �2,

    limsupδ→0

    limsupλ→∞

    P{Qξ

    λ,λδ,2 (∞)>λ2wλ2 (1− �2)

    }≤ α2 + �2,

    limsupδ→0

    limsupλ→∞

    P

    {Qξ

    λ,λδ,1 (∞)λ1wλ1

    ≤Qξ

    λ,λδ,2 (∞)λ2wλ2

    − �2

    }≥ β− �2.

    Lemma 5.1 proves that replacing the waiting-time constraints with queue-based constraints

    should yield (asymptotically) identical results in terms of optimality. Thus, we turn to study a

    queue-based formulation starting with identifying a lower bound. To that end, given �, consider

    the problem

    minξλ

    N(ξλ)

    s.t. P{Qξ

    λ,λδ,1 (∞)>λ1wλ1 (1− �)

    }≤ α1 + �,

    P{Qξ

    λ,λδ,2 (∞)>λ2wλ2 (1− �)

    }≤ α2 + �, (10)

    P

    {Qξ

    λ,λδ,1 (∞)λ1wλ1

    ≤Qξ

    λ,λδ,2 (∞)λ2wλ2

    − �

    }≥ β− �.

    We next state our lower bound result. Recall that QN,λ(∞) is a random variable with the

    stationary distribution of the queue length in an M/M/N queue with arrival rate λ, service rate

    µ and N(> λ/µ) servers. As before, the variable QN,λδ (∞) is the local average on [0, δ] when the

    M/M/N queue is initialized with its steady-state distribution.

    Proposition 9 Fix λ, δ, � > 0 and let ξλδ,� be a feasible solution for (10).

    Define

    Nλ? (δ, �) = min{N ∈Z+ : P

    {QN,λδ (∞)≥ λ1wλ1 +λ2wλ2

    }≤min{α1,1−β}+α2 + 2�

    }.

    Then, Nλ? (δ, �)≤N(ξλδ,�), where N(ξλδ,�)

    is the staffing component of ξλδ,�.

    Proof: Define the following events on the underlying probability space:

    A :={Qξ

    λ,λδ,1 (∞) +Q

    ξλ,λδ,2 (∞)≤ λ1wλ1 +λ2wλ2

    },

    B :={Qξ

    λ,λδ,1 (∞) +Q

    ξλ,λδ,2 (∞)>λ1wλ1 +λ2wλ2 ,Q

    ξλ,λδ,2 (∞)>λ2wλ2

    },

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities24

    C :={Qξ

    λ,λδ,1 (∞) +Q

    ξλ,λδ,2 (∞)>λ1wλ1 +λ2wλ2 ,Q

    ξλ,λδ,2 (∞)≤ λ2wλ2

    }.

    These events form partitions of the space. Further, on C,

    Qξλ,λδ,1 (∞) > λ1wλ1 +λ2wλ2 −Q

    ξλ,λδ,2 (∞)

    =

    (λ1w

    λ1 +λ2w

    λ2

    λ2wλ2

    )(λ2w

    λ2 −Q

    ξλ,λδ,2 (∞)) +

    λ1wλ1

    λ2wλ2Qξ

    λ,λδ,2 (∞)

    ≥ λ1wλ1

    λ2wλ2Qξ

    λ,λδ,2 (∞) .

    Also it is evident that Qξλ,λδ,1 (∞)>λ1wλ2 on C. In turn,

    C ⊆

    {Qξ

    λ,λδ,1 (∞)λ1wλ1

    >Qξ

    λ,λδ,2 (∞)λ2wλ2

    }∩{Qξ

    λ,λδ,1 (∞)>λ1wλ2

    }. (11)

    Using these definitions and the assumed feasibility of the staffing-policy pair,

    P{B} = P{Qξ

    λ,λδ,1 (∞) +Q

    ξλ,λδ,2 (∞)>λ1wλ1 +λwλ2 ,Q

    ξλ,λδ,2 (∞)>λ2wλ2

    }≤ P

    {Qξ

    λ,λδ,2 (∞)>λ2wλ2

    }≤ P

    {Qξ

    λ,λδ,2 (∞)>λ2wλ2 (1− �)

    }≤ α2 + �, .

    and

    P{C} ≤ min

    {P

    {Qξ

    λ,λδ,1 (∞)λ1wλ1

    >Qξ

    λ,λδ,2 (∞)λ2wλ2

    },P{Qξ

    λ,λδ,1 (∞)>λ1wλ2

    }}

    ≤ min

    {P

    {Qξ

    λ,λδ,1 (∞)λ1wλ1

    >Qξ

    λ,λδ,2 (∞)λ2wλ2

    − �

    },P{Qξ

    λ,λδ,1 (∞)>λ1wλ2 − �

    }}.

    ≤ min{α1 + �,1−β+ �}..

    where the first inequality follows from (11). Hence

    P{Qξ

    λ,λδ,1 (∞) +Q

    ξλ,λδ,2 (∞)>λ1wλ1 +λ2wλ2

    }= P{B ∪C}= P{B}+P{C} ≤min{α1,1−β}+α2 + 2�.

    Since the policy is work conserving, Qξλ,λδ,1 (∞)+Q

    ξλ,λδ,2 (∞) has the law of Q

    N,λδ (∞) with N =N(ξλ).

    Nλ? (δ, �) is the minimum staffing among the ones that satisfy above and hence Nλ? (δ, �)≤N(ξλδ,�).

    Note that Nλ? (δ, �) is closely related to our proposed staffing component Nλ? in (4). The following

    lemma guarantees that the staffing levelNλ? in (4), combined with a feasible policy, is asymptotically

    optimal.

    Lemma 5.2

    limsup�→0

    limsupδ→0

    limsupλ→∞

    Nλ? (δ, �)−Nλ?√λ

    ≥ 0.

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities25

    5.1.2. Asymptotic tracking We show that the local average queue lengths actually “tracks”

    the tracking function given by (5). The following lemma is a corollary of a result in Gurvich and

    Whitt [2009a].

    Lemma 5.3 Let f be a Lipschitz continuous function and let {ξλf } be a sequence of staffing-policy

    pairs where the staffing N(ξλf ) follows square-root scaling, i.e., N(ξλf ) = λ/µ+ η

    √λ/µ+ o(

    √λ) for

    some η > 0 and a tracking policy constructed by f is used. For each � > 0, there exists δ′ > 0 such

    that for all δ with 0< δ≤ δ′ the following holds.

    limsupλ→∞

    P{∣∣∣∣Q̂ξλf ,λδ,i (∞)− Q̂ξλf ,λi (∞)∣∣∣∣> �}< �.

    Notably, Lemma 5.3 requires the smoothness of the tracking functions which is clearly violated

    by our tracking function f?. Instead, given a piecewise Lipschitz function we will construct two

    other tracking policies that provide lower and upper bounds for our tracking policy and do satisfy

    the smoothness requirements.

    For the following, fix a piecewise Lipschitz tracking function f . Let Df := {d1, . . . , dn} be the set

    of discontinuity points of f (recall that this is a finite set). Also, let

    D↑f = {di ∈Df : f1 (di+)− f1 (di−)≥ 0} and D↓f = {di ∈Df : f1 (di+)− f1 (di−)< 0}

    Given ς < min{d1, d2− d1, ..., dn− dn−1}/2 (ς will be adjusted within our proofs), define the

    following tracking functions fς

    and f ς so that fς

    1 (x)≥ fς

    1(x) and f

    ς

    2 (x)≤ fς

    2(x) for all x≥ 0:

    1 (x) =

    f (di− ς)

    (x−di+ς

    ς

    )+ f (di+)

    (di−xς

    ), di− ς ≤ x≤ di, di ∈D↑f ,

    f (di−)(x−diς

    )+ f (di + ς)

    (di+ς−x

    ς

    ), di ≤ x≤ di + ς, di ∈D↓f ,

    f (x) , otherwise,

    f ς1(x) =

    f (di−)

    (x−diς

    )+ f (di + ς)

    (di+ς−x

    ς

    ), di− ς ≤ x≤ di, di ∈D↑f ,

    f (di− ς)(x−di+ς

    ς

    )+ f (di+)

    (di−xς

    ), di ≤ x≤ di + ς, di ∈D↓f ,

    f (x) , otherwise.

    Note that f (di−) and f (di+) are the left and right limit of f at di which are well-defined by the

    fact that there are finite number of discontinuous points. fς

    2 (x) and fς

    2(x) are defined by x−f ς1 (x)

    and x− f ς1(x). Having defined these functions, we can now prove the following result. Below, as

    before, πλf? is the tracking policy with respect to the function f? in the λth system with f?.

    Proposition 10 uses these smoothed functions to show the tracking function works. Lemma 5.4,

    along with Lemma 5.3, is used in the proof of Proposition 10. Given a tracking function f , we

    denote by πf the tracking policy defined with respect to f . Let Qf (t) = (Qf1(t),Q

    f2(t)) be the queue

    lengths at time t if the policy πf is used.

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities26

    Lemma 5.4 Fix λ,µ, and N . Fix tracking functions g= (g1, g2) and h= (h1, h2) such that g1 (x)≤

    h1 (x) and (consequently) g2 (x) ≥ h2 (x) for all x ≥ 0. Suppose that Qg (0) = Qh (0). There then

    exists a construction of the sample paths so that the following inequalities hold almost surely,

    Qg1 (t)≤Qh1 (t) and Qg2 (t)≥Qh2 (t) , t≥ 0.

    Proposition 10 For each λ, let ξλ = (Nλ, πλf?) where the sequence {Nλ} satisfies Nλ = λ/µ +

    η√λ/µ+ o(

    √λ). Then, for any � > 0, there exists δ

    ′> 0 such that for all δ ∈ (0, δ′ ]

    limsupλ→∞

    P{∣∣∣Q̂ξλ,λδ,i (∞)− f?i (Q̂ξλ,λΣ (∞))∣∣∣> �}< �.

    Proof: From the staffing Nλ and the tracking functions fς(x) and f ς (x), define the staffing and

    policy pairs ξς

    λ = (Nλ, πλ

    fς ) and ξ

    ς

    λ= (Nλ, πλfς ). By construction, the tracking function f

    ςis Lipschitz

    continuous. Hence, we can apply Theorem 3.1 of Gurvich and Whitt [2009a] to obtain(Q̂ξςλ,λ

    1 (∞)− fς

    1

    (Q̂ξ

    λ,λΣ (∞)

    ), Q̂

    ξςλ,λ

    2 (∞)− fς

    2

    (Q̂ξ

    λ,λΣ (∞)

    ))⇒ (0,0) .

    Note that the staffing component of ξλ, ξλ

    and ξλ are the same and the total queue length process is

    irrelevant of prioritization policies once the staffing is the same and the policies are all admissible.

    We then have for each � > 0 that

    limsupλ→∞

    P{∣∣∣Q̂ξςλ,λi (∞)− f ςi (Q̂ξλ,λΣ (∞))∣∣∣> �3}< �6 , for i= 1,2. (12)

    By Lemma 5.3, there exists δ′> 0 such that for all δ ∈ (0, δ′ ],

    limsupλ→∞

    P{∣∣∣Q̂ξςλ,λδ,i (∞)− Q̂ξςλ,λi (∞)∣∣∣> �3}< �6 . (13)

    Define the set

    F ς ={x≥ 0 :

    ∣∣∣f ςi (x)− f?i (x)∣∣∣> �3} .By the construction of f

    ς, for each � > 0, there exists ς > 0 such that

    P{Q̂ξΣ(∞)∈ F ς

    }<�

    6.

    Moreover, since Q̂ξΣ(∞), the limiting process of Q̂ξλ,λΣ (∞), has a density (see Halfin and Whitt

    [1981]), P{Q̂ξΣ(∞)∈ ∂F ς}= 0 so that the following holds by Portmanteau Theorem (see Billingsley

    [1999]),

    limλ→∞

    P{Q̂ξ

    λ,λΣ (∞)∈ F ς

    }= P

    {Q̂ξΣ (∞)∈ F ς

    }.

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities27

    Then, for each � > 0, there exists ς > 0 that satisfies

    limsupλ→∞

    P{∣∣∣f ςi (Q̂ξλ,λΣ (∞))− f?i (Q̂ξλ,λΣ (∞))∣∣∣> �3}< �6 . (14)

    Combining (12)-(14), we conclude that for each � > 0, there exists ς > 0 and δ′ > 0 so that for

    all δ ∈ (0, δ′ ]

    limsupλ→∞

    P{∣∣∣Q̂ξςλ,λδ,i (∞)− f?i (Q̂ξλ,λΣ (∞))∣∣∣> �} < limsup

    λ→∞P{∣∣∣Q̂ξςλ,λδ,i (∞)− Q̂ξςλ,λi (∞)∣∣∣> �3}

    +limsupλ→∞

    P{∣∣∣Q̂ξςλ,λi (∞)− f ςi (Q̂ξλ,λΣ (∞))∣∣∣> �3}

    +limsupλ→∞

    P{∣∣∣f ςi (Q̂ξλ,λΣ (∞))− f?i (Q̂ξλ,λΣ (∞))∣∣∣> �3}

    <�

    6+�

    6+�

    6=�

    2.

    Similarly for ξςλ,

    limsupλ→∞

    P{∣∣∣Q̂ξςλ,λδ,i (∞)− f?i (Q̂ξλ,λΣ (∞))∣∣∣> �}< �2 .

    Finally, note that by construction f ς1(x) ≤ f?(x) ≤ f ς1(x) for all x ≥ 0 so that, by Lemma 5.4

    Q̂ξςλ,λ

    δ,1 (∞)≤ Q̂ξλ,λδ,1 (∞)≤ Q̂

    ξςλ,λδ,1 (∞) and, in turn, there exists δ

    ′such that for all δ ∈ (0, δ′ ]

    limsupλ→∞

    P{∣∣∣Q̂ξλ,λδ,1 (∞)− f?1 (Q̂ξλ,λΣ (∞))∣∣∣> �}

    ≤ limsupλ→∞

    P{Q̂ξ

    λ,λδ,1 (∞)− f?1

    (Q̂ξ

    λ,λΣ (∞)

    )> �}

    + limsupλ→∞

    P{Q̂ξ

    λ,λδ,1 (∞)− f?1

    (Q̂ξ

    λ,λΣ (∞)

    ) �}

    + limsupλ→∞

    P{Q̂ξςλ,λ

    δ,1 (∞)− f?1(Q̂ξ

    λ,λΣ (∞)

    ) 0, (15)

    and a steady-state exists for X. Furthermore,

    limsupλ→∞

    1√λE[Qξ

    λ,λ1 (∞) +Q

    ξλ,λ2 (∞)]

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities28

    5.1.3. Asymptotic feasibility and optimality: Proof of Theorem 3 We start with the

    asymptotic feasibility. Specifically, we argue that with ξλ being our proposed staffing-policy pair,

    it holds that for all � > 0, there exists δ′> 0 such that for all δ ∈ [0, δ′), the following holds.

    limsupλ→∞

    P{Qξ

    λ,λδ,i (∞)>λiwλi (1− �)

    }≤ αi + �, i= 1,2,

    limsupλ→∞

    P

    {Qξ

    λ,λδ,1 (∞)λ1wλ1

    ≤Qξ

    λ,λδ,2 (∞)λ2wλ2

    − �

    }≥ β− �.

    Asymptotic feasibility for the waiting time formulation then follows from Lemma 5.1. The above

    formulation is equivalent to the following.

    limsupλ→∞

    P{Q̂ξ

    λ,λδ,i (∞)>aiw̄i(1− �)

    }≤ αi + �, i= 1,2,

    limsupλ→∞

    P

    {Q̂ξ

    λ,λδ,1 (∞)a1w̄1

    ≤Q̂ξ

    λ,λδ,2 (∞)a2w̄2

    − �

    }≥ β− �.

    The following inclusions follow directly from the structure of f?. Recall that Q̂ξΣ denote the limit

    process of Q̂ξλ,λ

    Σ .{f?1

    (Q̂ξΣ (∞)

    )− a1w̄1 ≥ 0

    }⊆{Q̂ξΣ (∞)∈ [a1w̄1 + a2w̄2, q̄)

    },{f?2

    (Q̂ξΣ (∞)

    )− a2w̄2 ≥ 0

    }⊆{Q̂ξΣ (∞)∈ [q̄,∞)

    },

    and f?1

    (Q̂ξΣ (∞)

    )a1w̄1

    −f?2

    (Q̂ξΣ (∞)

    )a2w̄2

    ≥ 0

    ⊆{Q̂ξΣ (∞)∈ [a1w̄1 + a2w̄2, q̄)} .Consequently,

    P{f?1

    (Q̂ξΣ (∞)

    )− a1w̄1 ≥ 0

    }≤ P

    {Q̂ξΣ (∞)∈ [a1w̄1 + a2w̄2, q̄)

    },

    P{f?2

    (Q̂ξΣ (∞)

    )− a2w̄2 ≥ 0

    }≤ P

    {Q̂ξΣ (∞)∈ [q̄,∞)

    },

    P

    f?1

    (Q̂ξΣ (∞)

    )a1w̄1

    −f?2

    (Q̂ξΣ (∞)

    )a2w̄2

    ≥ 0

    ≤ P{Q̂ξΣ (∞)∈ [a1w̄1 + a2w̄2, q̄)} .Applying Proposition 1 and Theorem 1 of Halfin and Whitt [1981] and using the definition of q̄

    and η in (6) and (7),

    P{Q̂ξΣ (∞)∈ [a1w̄1 + a2w̄2, q̄)

    }= P

    {Q̂ξΣ (∞)> 0

    }P{Q̂ξΣ (∞)∈ [a1w̄1 + a2w̄2, q̄) | Q̂

    ξΣ (∞)> 0

    }=

    (1 +

    ηΦ(η)

    φ(η)

    )−1· (exp(−η(a1w̄1 + a2w̄2))− exp(−ηq̄))

    =

    (1 +

    ηΦ(η)

    φ(η)

    )−1·(

    exp(−η(a1w̄1 + a2w̄2))

    − exp(−η(a1w̄1 + a2w̄2)) · exp(− log

    {α2 + min{α1,1−β}

    α2

    }))

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities29

    =

    (1 +

    ηΦ(η)

    φ(η)

    )−1(1 +

    ηΦ(η)

    φ(η)

    )(min{α1,1−β}+α2)(

    1− α2α2 + min{α1,1−β}

    )= min{α1,1−β},

    P{Q̂ξΣ (∞)∈ [q̄,∞)

    }= P

    {Q̂ξΣ (∞)> 0

    }P{Q̂ξΣ (∞)≥ q̄ | Q̂

    ξΣ (∞)> 0

    }=

    (1 +

    ηΦ(η)

    φ(η)

    )−1· exp(−ηq̄)

    =

    (1 +

    ηΦ(η)

    φ(η)

    )−1exp(−η(a1w̄1 + a2w̄2)) ·

    α2α2 + min{α1,1−β}

    = α2, (16)

    Let Df? be the set of discontinuity points of f?. Since Q̂ξΣ(∞) has a density, P{Q̂ξΣ (∞)∈Df?

    }=

    0, so that, by continuous mapping theorem (See Theorem 3.4.3. of Whitt [2002]),(Q̂ξ

    λ,λΣ (∞) , f?1

    (Q̂ξ

    λ,λΣ (∞)

    ), f?2

    (Q̂ξ

    λ,λΣ (∞)

    ))⇒(Q̂ξΣ (∞) , f?1

    (Q̂ξΣ (∞)

    ), f?2

    (Q̂ξΣ (∞)

    )),

    By Portmanteau Theorem (and using the 0 measure of the discontinuity points), we have

    limsupλ→∞

    P{f?i

    (Q̂ξ

    λ,λΣ (∞)

    )− aiw̄i ≥ 0

    }= P

    {f?i

    (Q̂ξΣ (∞)

    )− aiw̄i ≥ 0

    }≤ αi for i= 1,2,

    limsupλ→∞

    P

    f?1

    (Q̂ξ

    λ,λΣ (∞)

    )a1w̄1

    −f?2

    (Q̂ξ

    λ,λΣ (∞)

    )a2w̄2

    ≥ 0

    = Pf

    ?1

    (Q̂ξΣ (∞)

    )a1w̄1

    −f?2

    (Q̂ξΣ (∞)

    )a2w̄2

    ≥ 0

    ≤ 1−β.By Proposition 10, for each � > 0 there exist δ

    ′> 0 such that for all δ ∈ (0, δ′ ] and i= 1,2,

    limsupλ→∞

    P{∣∣∣Q̂ξλ,λδ,i (∞)− f?i (Q̂ξλ,λΣ (∞))∣∣∣> �min{a1w̄12 , a2w̄22 }}< �2 ,

    limsupλ→∞

    P{Q̂ξ

    λ,λδ,i (∞)≥ aiw̄i(1− �)

    }≤ limsup

    λ→∞P{∣∣∣Q̂ξλ,λδ,i (∞)− f?i (Q̂ξλ,λΣ (∞))∣∣∣> �min{a1w̄12 , a2w̄22 }}

    +limsupλ→∞

    P{f?i

    (Q̂ξ

    λ,λΣ (∞)

    )≥ aiw̄i

    }≤ �

    2+αi ≤ αi + �,

    and

    limsupλ→∞

    P

    {Q̂ξ

    λ,λδ,1 (∞)a1w̄1

    >Q̂ξ

    λ,λδ,2 (∞)a2w̄2

    − �

    }≤ limsup

    λ→∞P

    ∣∣∣∣∣∣Q̂

    ξλ,λδ,1 (∞)a1w̄1

    −f?1

    (Q̂ξ

    λ,λΣ (∞)

    )a1w̄1

    ∣∣∣∣∣∣> �a1w̄1 min{a1w̄1

    2,a2w̄2

    2

    }+limsup

    λ→∞P

    f?1

    (Q̂ξ

    λ,λΣ (∞)

    )a1w̄1

    −f?2

    (Q̂ξ

    λ,λΣ (∞)

    )a2w̄2

    ≥ 0

    +limsup

    λ→∞P

    ∣∣∣∣∣∣Q̂

    ξλ,λδ,2 (∞)a2w̄2

    −f?2

    (Q̂ξ

    λ,λΣ (∞)

    )a2w̄2

    ∣∣∣∣∣∣> �a2w̄2 min{a1w̄1

    2,a2w̄2

    2

    }

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities30

    ≤ �2

    + 1−β+ �2≤ 1−β+ �.

    This concludes the feasibility argument and we turn to optimality. Let {ξ̃λ} be another asymp-totically feasible sequence. By the definition of asymptotic feasibility, for any � > 0 there exists a

    δ1 > 0 such that for all δ ∈ (0, δ1] the following holds

    limsupλ→∞

    P{Q̂ξ̃

    λ,λδ,i (∞)>aiw̄i(1− �)

    }≤ αi + �, i= 1,2,

    limsupλ→∞

    P

    {Q̂ξ̃

    λ,λδ,1 (∞)a1w̄1

    ≤Q̂ξ̃

    λ,λδ,2 (∞)a2w̄2

    − �

    }≥ β− �.

    Thus, given δ and � we have by Proposition 9 that N(ξ̃λ)≥Nλ? (δ,2�). In particular, given δ, � > 0

    limsupλ→∞

    Nλ? −N(ξ̃λ)√λ

    ≤ limsupλ→∞

    Nλ? −Nλ? (δ,2�)√λ

    +limsupλ→∞

    Nλ? (δ,2�)−N(ξ̃λ)√λ

    ≤ limsupλ→∞

    Nλ? −Nλ? (δ,2�)√λ

    .

    Finally, since �, δ are arbitrary, we have by Lemma 5.2 that

    limsup�→0

    limsupδ→0

    limsupλ→∞

    [Nλ? −N(ξ̃λ)]+ ≤ 0,

    which, together with the formerly established asymptotic feasibility, establishes the asymptotic

    optimality of ξλ = (Nλ? , πλf?). �

    5.2. Proofs on the equivalence between perfect SLD and index policies

    5.2.1. Index rule and tracking policy: Proof of Proposition 6 For a given monotone

    tracking function f = (f1, f2), define the generalized inverses of f1 and f2 as follows.

    g1 (y) := sup{x≥ 0 : f1 (x)≤ y} , and g2 (y) := inf {x≥ 0 : f2 (x)≥ y} .

    We will show that the tracking policy by f = (f1, f2) and the index policy by g1 and g2 are

    equivalent. For the index policy, guideline 1 to server if g1 (Q1 (t))≥ g2 (Q2 (t)) and it chooses class2, if g1 (Q1 (t))< g2 (Q2 (t)).

    Suppose class 1 is to be served at time t under the tracking policy f . Then

    Q2(t)≤ f2(QΣ(t)) and f1(QΣ(t))≤Q1(t).

    g1 and g2 are increasing. Also gi (fi (x)) = x for i= 1,2. Hence applying g2 to the first inequality

    and g1 to the second one result in

    g2(Q2(t))≤QΣ(t) and QΣ(t)≤ g1(Q1(t)).

    Combining those two, we can see class is also served under the index policy by g1 and g2. The

    case where class 2 is to be served can be proved similarly and hence the first statement of the

    theorem is proved.

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities31

    Now we will prove the second statement of the theorem. For simplicity, we assume that g1 is right

    continuous with left limits and g2 is left continuous with right limits. For a given index functions

    g1 and g2, define f1, which is a function of total queue length QΣ, as follows.

    f1 (QΣ) :=

    0, g2 (QΣ)< g1 (0) ,

    QΣ, g1 (QΣ)< g2 (0) ,

    inf {x : 0≤ x≤QΣ, [g1(x−), g1 (x)]∩ [g2 (QΣ−x) , g2(QΣ−x+)] 6= ∅} , otherwise.

    Note that with continuous gi’s, gi(x−) = gi(x), i = 1,2 hold. Then f1(QΣ) := inf{x : 0 ≤ x ≤

    QΣ, g1(x) = g2(QΣ − x)} in the third case. More complicated formula as the above is devised to

    incorporate discontinuous index functions. We will show that f1 (QΣ) is a well-defined function and

    the tracking policy by f = (f1, f2) is equivalent to index policy with g1 and g2. We complete the

    proof by showing that f = (f1, f2) is an increasing function.

    If g2 (QΣ)< g1 (0) or g1 (QΣ)< g2 (0), f1 (QΣ) is evidently well defined. It remains to show there

    exists x such that 0≤ x≤QΣ and [g1(x−), g1 (x)]∩ [g2 (QΣ−x) , g2(QΣ−x+)] 6= ∅ when g2 (QΣ)≥

    g1 (0) and g1 (QΣ)≥ g2 (0).

    We show this with a realistic assumption that g1 and g2 have finite number of discontinuous

    points on [0,QΣ]. The case for countably infinite case is proved in Soh and Gurvich [2015]. First,

    define g (x) := g1 (x)− g2 (QΣ−x) and let {x1, x2, ..., xn} be the discontinuous points of g (x) on

    [0,QΣ]. Note that g (x) is increasing and g(x−)≤ g (x) always holds.

    We first prove that ∪x∈[0,QΣ] [g(x−), g (x)] , is a connected set. Since g is continuous at [0, x1),

    A := ∪x∈[0,x1) [g(x−), g (x)] = ∪x∈[0,x1)g (x) is a connected set, i.e., an interval. We will show that

    A ∪B is also an interval where B := [g(x1−), g (x1)] . Let g′ be a modified function from g such

    that g′ (x1) = g(x1−) and g′ (x) = g (x) for all x ∈ [0, x1). Then g′ (x) is continuous at x ∈ [0, x1].

    Evidently, g′ (x)∈A for all x∈ [0, x1) and g′ (x1)∈B. Hence a continuous function g′ (x) on [0, x1]

    is a path from g′ (0) = g (0) ∈A to g′ (x1) ∈B. A and B are both path-connected sets and A∪B

    is a path-connected set, i.e., an interval. Repeating this argument until x=QΣ, we conclude that

    ∪x∈[0,QΣ] [g(x−), g (x)] is a connected set.

    By the conditions g2 (QΣ)≥ g1 (0) and g1 (QΣ)≥ g2 (0), g (0) = g1 (0)− g2 (QΣ)≤ 0 and g (QΣ) =

    g1 (QΣ) − g2 (0) ≥ 0 hold. Therefore, the above interval contains 0 and there exists x′ such

    that 0 ∈ [g(x′−), g (x′)] . Then g1 (x′) ≥ g2 (QΣ−x′) and g1(x′) ≤ g2(QΣ − x′+) hold and hence

    [g1 (x′−) , g1 (x′)]∩ [g2 (QΣ−x′) , g2 (QΣ−x′+)] 6= ∅. f1 (QΣ) is defined to be the infimum of such x′

    and therefore is well-defined.

    Now we show that the tracking policy by f = (f1, f2) is equivalent to the index policy by g1

    and g2. If g2 (Q1 (t) +Q2 (t))< g1 (0), f1 (Q1 (t) +Q2 (t)) is defined to be 0. Since g1 is increasing,

    g2 (Q1 (t) +Q2 (t))< g1 (0)< g1 (Q1 (t) +Q2 (t)). Then, class 2 is always served first under both the

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities32

    tracking policy and the index policy. If g1 (Q1 (t) +Q2 (t))< g2 (0), f1 (Q1 (t) +Q2 (t)) is defined to

    be Q and class 1 is to be served under the both policies.

    Now we check the remaining case when g2 (Q1 (t) +Q2 (t)) ≥ g1 (0) and g1 (Q1 (t) +Q2 (t)) ≥

    g2 (0). Suppose class 1 is to be served at time t with the index policy. Then g1 (Q1 (t))≥ g2 (Q2 (t)) . If

    [g1 (Q1 (t)−) , g1 (Q1 (t))]∩ [g2 (Q2 (t)) , g2 (Q2 (t)+)] 6= ∅, f1 (Q1 (t) +Q2 (t)) is less or equal to Q1 (t),

    since it is defined to be the infimum of x that satisfies [g1 (x−) , g1 (x)]∩ [g2 (Q−x) , g2 (Q−x+)] 6=

    ∅. If [g1 (Q1 (t)−) , g1 (Q1 (t))] ∩ [g2 (Q2 (t)) , g2 (Q2 (t)+)] = ∅, max [g2 (Q2 (t)) , g2 (Q2 (t)+)] <

    min [g1 (Q1 (t)−) , g1 (Q1 (t))] and hence f1 (Q1 (t) +Q2 (t)) must be less than Q1 (t), since g1 and

    g2 are increasing and g1(Q1(t)) ≥ g2(Q2(t)). Q1 (t) ≥ f1 (Q1 (t) +Q2 (t)) commands class 1 to be

    served at time t with the tracking policy.

    Suppose class 2 is to be served at time t with the index policy. Then g1 (Q1 (t)) < g2 (Q2 (t)) .

    Evidently, [g1 (Q1 (t)−) , g1 (Q1 (t))]∩ [g2 (Q2 (t)) , g2 (Q2 (t)+)] = ∅ and since g1 and g2 are increasing

    functions, f1 (Q1 (t) +Q2 (t)) is larger than Q1 (t), i.e., class 2 is to be served at time t with the

    tracking policy.

    It remains to show that f = (f1, f2) is a monotone function. Suppose, on the contrary, that f

    is a non-monotone function, i.e., there exist q′ and q′′ such that q′ < q′′ and fi (q′) > fi (q

    ′′) for

    some i. Assume without loss of generality that i = 1 (f1 (q′) > f1 (q

    ′′)). Choose a q′′′ such that

    f1 (q′′)< q′′′ < f1 (q

    ′). If Q1 (t) = q′′′ and Q2 (t) = q

    ′′− q′′′, f1 (Q1 (t) +Q2 (t)) = f1 (q′′)< q′′′ =Q1 (t)

    and hence class 1 customer must be served. If Q1 (t) = q′′′ and Q2 (t) = q

    ′−q′′′, f1 (Q1 (t) +Q2 (t)) =

    f1 (q′)> q′′′ =Q1 (t) and hence class 2 customer must be served.

    Under the equivalent index policy by g1 and g2 (the existence of which is verified above), g1 (q′′′)≥

    g2 (q′′− q′′′) and g1 (q′′′)< g2 (q′− q′′′) must hold. But this contradicts the fact that g2 is a monotone

    function since q′′− q′′′ > q′− q′′′. Therefore, f = (f1, f2) must be a monotone function.

    5.2.2. Near optimality of monotone solutions: Proof of Proposition 7 Having defined

    fϑ, the proof of this theorem is a straightforward adaptation of the proof of Theorem 3 and, in

    particular, of the arguments in sections 5.1.2 and 5.1.3. Fix ϑ > 0 and let Q̂ϑΣ(∞) be defined as

    the limit of the properly scaled M/M/N steady-state queues when the staffing in the λth queues is

    Nλ = bNλ? +ϑ√λc. Asymptotic tracking follows as in Proposition 10 with f? replaced by fϑ there.

    fϑ is structured to satisfy{fϑ1

    (Q̂ϑΣ (∞)

    )− a1w̄1 ≥ 0

    }= ∅,

    {fϑ2

    (Q̂ϑΣ (∞)

    )− a2w̄2 ≥ 0

    }⊆{Q̂ϑΣ (∞)∈ [qϑb ,∞)

    },

    and fϑ1

    (Q̂ϑΣ (∞)

    )a1w̄1

    −fϑ2

    (Q̂ϑΣ (∞)

    )a2w̄2

    ≥ 0

    = ∅.

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities33

    Proceeding as in §5.1.3, it follows trivially that the SLD constraint (with β = 1 here) and the

    constraint for class 1 are asymptotically satisfied and we turn to the class-2 constraint. Recall that

    with the sequence {Nλ? }, it holds that P{Q̂ξΣ(∞)∈ [q̄,∞)} ≤ α2 (see (16)). Since ϑ is strictly positive,

    it then follows (by the explicit expressions for Q̂ξΣ(∞) and Q̂ϑΣ(∞)) that there exists $ > 0 such

    that P{Q̂ϑΣ(∞) ∈ [q̄,∞)} ≤ α2−$. We can now choose κϑ sufficiently small (and in turn qϑb close

    to q̄) such that P{Q̂ϑΣ(∞) ∈ [qϑb , q̄]} ≤$/2, in which case, we have that P{Q̂ϑΣ(∞) ∈ [q̄ϑb ,∞)} ≤ α2.

    From here the proof of asymptotic feasibility is concluded as in §5.1.3. �

    5.2.3. Perfect SLD and monotone tracking policy: Proof of Proposition 8 Suppose,

    on the contrary, that N (ξ1)>N (ξ2). We will show that the TSF constraints cannot be satisfied

    with the staffing N (ξ2) and hence N (ξ2)≥N (ξ1) must hold.

    If N (ξ1) > N (ξ2), the queue length distribution under the staffing N (ξ1) is first order

    stochastically dominated by the one under N (ξ2), i.e., for any q > 0, P{Q̂ξ1Σ (∞)∈ [q,∞)

    }<

    P{Q̂ξ2Σ (∞)∈ [q,∞)

    }. Also by the proof of Theorem 3, P

    {Q̂ξ1Σ ≥ a1w̄1 + a2w̄2

    }= α2. Then there

    exists � > 0 such that P{Q̂ξ2Σ ≥ a1w̄1 + a2w̄2 + �

    }>α2.

    Let f be the tracking function for ξ2. Then either f1(a1w̄1 + a2w̄2 + �) > a1w̄1 or

    f2(a1w̄1 + a2w̄2 + �) > a2w̄2 or both. Since f is monotone, either P{Q̂ξ22 (∞)>a2w̄2

    }≥

    P{Q̂ξ2Σ (∞)>a1w̄1 + a2w̄2 + �

    }> α2 or P

    {Q̂ξ21 (∞)>a1w̄1

    }≥ P

    {Q̂ξ2Σ (∞)>a1w̄1 + a2w̄2 + �

    }>

    α2 ≥ α1. Hence at least one of the TSF constraints cannot be met when the staff is less than N(ξ1).

    6. Concluding remarks

    A starting point in process management is to identify a sufficient set of Key Performance Indicators

    (KPIs). Call centers must choose QoS metrics that reliably reflect the call center’s objectives.

    This is a design decision: a question of formulation choice that should be influenced by financial

    and marketing considerations. To make an informed formulation choice, one must understand

    how different formulations translate into outcomes – these would be the optimal staffing and

    prioritization decisions.

    Our paper seeks to contribute to the understanding of this relationship between formula-

    tions and outcomes/decisions. We characterize the relationship between index rules (on the out-

    comes/decisions end) and real time service-level differentiation (on the formulation end) in showing

    that index rules can be justified if and only if one requires, in addition to the marginal class-level

    targets, perfect service level differentiation.

    Implicitly, this work brings together two paradigms of staffing optimization. In the first, costs

    are assigned to waiting and the optimizer seeks to minimize the combined cost of waiting staffing.

    In the second, constraints are placed on waiting-related metrics and the optimizer seeks to min-

    imize staffing costs while meeting the imposed constraints. The role of index rules is relatively

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities34

    well understood under the first paradigm, yet we study them (and call centers use them) for con-

    strained staffing problems. The better understanding of index rules, we hope, can facilitate the

    understanding of the duality (or lack therof) between waiting-cost to waiting-time targets.

    Acknowledgements

    We thank Yong Pin Zhou and Retsef Levi for helpful comments in early stages of this work. We are

    grateful to three anonymous reviewers and the associate editor for their comments and suggestions

    that lead to significant improvements to this work.

    References

    Z. Aksin, M. Armony, and V. Mehrotra. The modern call center: A multi-disciplinary perspective

    on operations management research. Production and Operations Management, 16(6):665–688,

    2007.

    T. Apostol and I. Makai. Mathematical analysis, volume 2. Addison-Wesley Reading, MA, 1974.

    M. Armony. Dynamic routing in large-scale service systems with heterogeneous servers. Queueing

    Systems, 51(3-4):287–329, 2005.

    M. Armony and A. Mandelbaum. Routing and staffing in large-scale service systems: the case of

    homogeneous impatient customers and heterogeneous servers. Operations Research, 59:50–65,

    2011.

    S. Asmussen. Applied probability and queues. Springer Verlag, 2003.

    P. Billingsley. Convergence of Probability Measures. 2nd ed., Wiley, New York, 1999.

    S. Borst, A. Mandelbaum, and M. Reiman. Dimensioning large call centers. Operations Research,

    52(1):17–34, 2004.

    W. Chan, G. Koole, and P. L’Ecuyer. Dynamic call center routing policies using call waiting and

    agent idle times. Manufacturing & Service Operations Management, 16(4):544–560, 2014.

    H. Chen and D. Yao. Fundamentals of queueing networks. Springer, 2001.

    I. Gurvich and W. Whitt. Service-level differentiation in many-server service systems: A solution

    based on fixed-queue-ratio routing. Operations Research, 29:567–588, 2007.

    I. Gurvich and W. Whitt. Queue-and-idleness-ratio controls in many-server service systems. Math-

    ematics of Operations Research, 34(2):363–396, 2009a.

    I. Gurvich and W. Whitt. Scheduling flexible servers with convex delay costs in many-server service

    systems. Manufacturing & Service Operations Management, 11(2):237–253, 2009b.

    I. Gurvich, M. Armony, and A. Mandelbaum. Service level differentiation in call centers with fully

    flexible servers. Management Science, 54(2):279–294, 2008.

    S. Halfin and W. Whitt. Heavy-traffic limits for queues with many exponential servers. Operations

    Research, 29(3):567–588, 1981.

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities35

    G. Koole. Call center optimization. Lulu. com, 2013.

    G. Koole and A. Pot. An overview of routing and staffing algorithms in multi-skill customer contact

    centers. working. Technical report, 2006.

    A. Mandelbaum and S. Stolyar. Scheduling flexible servers with convex delay costs: Heavy-traffic

    optimality of the generalized cµ-rule. Operations Research, 52:836–855, 2004.

    S. Soh and I. Gurvich. Technical supplement to the paper: “call cecenter staffing: Service-level

    constraints and index priorities”. 2015. Available from the authors’ websites.

    J. A. Van Mieghem. Dynamic scheduling with convex delay costs: The generalized cµ rule. Ann.

    Appl. Prob., 5:809–833, 1995.

    W. Whitt. Stochastic-Process Limits: An Introduction to Stochastic-Process Limits and Their

    Application to Queues. Springer-Verlag, New York., 2002.

    W. Whitt. Proofs of the martingale FCLT. Probab. Surv, 4:268–302, 2007.

  • Soh and Gurvich: Call center staffing: Service-level constraints and index priorities1

    7. Technical supplement to the paper:Call center staffing: Service-level constraints and index priorities

    This supplement has three parts. In the first part, we prove auxiliary lemmas in §5.1. In the second

    part, the remaining results that were stated without proofs are given.

    7.1. Proofs of auxiliary lemmas in §5.1

    We first provide an explicit construction of the process X. Recall that Qi(t) is the number of

    customers in the class-i queue at time t and Z(t) is the number of customers in service at time

    t. We number the servers and let As(t) be an N -dimensional process whose lth element captures

    the elapsed service time of the customer in service with server l. This coordinate is set to 0 if

    server l is idle at time t. We let Wi(t) be a vector of dimension of Qi(t) whose jth entry is the

    accumulated waiting time of the jth customer in the class-i queue. Customers are listed in their

    order of arrival so that the first element of Wi(t) is the accumulated waiting time of the class-i

    customer that arrived first amongst those that are still waiting. An arriving customer is added at

    the end of the vector with an initial value of 0. We remove the item in the head of the vector when

    the first customer in the queue is admitted to service.

    Admissible policies are assumed to be stationary with respect to the process

    X(t) = (Z(t),Q1(t),Q2(t),As(t),W1(t),W2(t)),

    i.e., an assignment decision at time t may use only the information captured by X(t). This guar-

    antees that X(t) is a Markov process.

    Proof of Lemma 5.1: We first show that if part (1.) holds, part (2.) also holds. This will be

    proved by showing that for a given �2, we can specify δ̄ such that for all δ ∈ (0, δ̄], the inequalities

    of (2.) hold.