Capacity Setting and Queuing Theory BAHC 510 Lecture 6 US Election Nov 6, 2012


Capacity and Resources
• A key lever for improving patient flow.
• How do we measure capacity?
  – What is the capacity of a 20 seat restaurant?
  – A 16 bed ward?
• Capacity is a RATE
  – Customers/hour
  – Patients/day
• We can view a 16 bed ward as a queuing system with 16 servers
  – What is the capacity of a bed?
  – Does this analogy apply to the restaurant?
• A system is composed of resources with capacities.
  – Often we use the expressions “resource” and “capacity” interchangeably (hopefully without confusion)

How Much Capacity is Needed? or How Many Resources are Needed?

[Figure: Ward Occupancy. Midnight census (vertical axis, 15 to 30) plotted by day (horizontal axis, 0 to 400), with the base capacity and surge capacity levels marked.]

Capacity tradeoffs when demand is variable
• Too much capacity or too many resources = idleness
• Not enough capacity = waits
• The resource manager must trade these off, taking into account system objectives and available resources
• Should we set capacity equal to demand?
  – What does this mean?
  – This is called a balanced system
  – It works perfectly when there is no variation in the system
  – It works terribly when there is variation! Why?
    • Once behind, you never can catch up.
  – Queuing theory quantifies these tradeoffs in terms of performance measures.
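The "once behind, you never can catch up" point can be illustrated with a small simulation. This is only a sketch under stated assumptions: a single FCFS server, capacity set equal to demand (one arrival and one service per time unit on average), and exponential times in the variable case. The function name `waits` and the run length are illustrative choices, not part of the lecture.

```python
import random

def waits(interarrivals, services):
    """Wait in queue of each customer at a single FCFS server (Lindley recursion)."""
    w = 0.0
    out = []
    for gap, service in zip(interarrivals, services):
        out.append(w)
        # Work left over for the next customer after the gap until they arrive.
        w = max(0.0, w + service - gap)
    return out

n = 10_000

# Balanced system, no variation: exactly one arrival and one service per time unit.
det = waits([1.0] * n, [1.0] * n)

# Balanced system with variation: exponential times, both with mean 1 (assumption).
rng = random.Random(0)  # fixed seed so the run is repeatable
rand = waits([rng.expovariate(1.0) for _ in range(n)],
             [rng.expovariate(1.0) for _ in range(n)])

print("max wait, no variation:   ", max(det))
print("mean wait, with variation:", sum(rand) / n)
```

Without variation nobody waits. With variation, idle time lost during quiet spells cannot be recovered later, so waits build up over the run.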

Queuing Models
• (Mathematical) queuing models help us set capacity (or determine the number of resources needed) to meet:
  – Service level targets
  – Average wait time targets
  – Average queue length targets
• Queuing models provide a more precise alternative to simulation
• They provide insights into how to plan, operate and manage a system
• Where are there queues in the health care system?

A single server queuing system

[Diagram: Buffer → Server]

• A queue forms in a buffer
• Servers may be people or physical space
• The buffer may have a finite or unlimited capacity
• The most basic models assume “customers” are of one type and have common arrival and service rates

A multiple (N) server queuing system

[Diagram: one Buffer feeding N parallel Servers]

Several parallel single server queues

[Diagram: three separate Buffer → Server pairs, side by side]

Parallel Queues vs. Multiple server Queues

• Provide examples of multiple server queues (MSQs)
• Provide examples of parallel queues (PQs)
• In what situations would each of these queuing systems be most appropriate? Why?

Networks of queues
• Most health care systems are interconnected networks of queues and servers with multiple waiting points and heterogeneous customers.
  – Provide some examples.
  – Often we model these complex systems with simulation.
• But in some cases we can use formulae to get results

Queuing Theory background
• Developed by Erlang in the early 1900s to analyze telephone systems.
  – How many lines are needed to ensure that a caller who tries to dial obtains a “line”?
  – Depending on the system configuration, an arriving customer may either be blocked or enter a queue and wait for service.
• Queuing models are now applied to analyze internet traffic, telecommunications systems, call centers, airport security lines, banks and restaurants, rail networks, etc.

Queues and Variability
• There are two components of a queuing system subject to variability
  – The inter-arrival times of “jobs”
  – The service times or LOS
  – Why are these variable?
• We describe this variability by
  – Mean
  – Standard deviation
  – Probability distribution
• Usually the normal distribution doesn’t fit well
• Often an exponential distribution fits well
  – If we know its rate or mean we know everything about it.

The exponential distribution
• $P(T \le t) = 1 - e^{-\lambda t}$
  – The quantity λ is the rate.
  – The mean and standard deviation of the exponential distribution are both 1/λ.
  – The median is ln(2)/λ = .693/λ
• Exponential distributions don’t allow negative times and have a small probability of long service times.
• Example: Patients arrive at rate 4 per hour.
  – The mean inter-arrival time is 15 minutes.
  – The median inter-arrival time is about 10.4 minutes.
  – What is the probability that the time between two arrivals is less than 10 minutes (1/6 of an hour)?
    • $P(T \le 1/6) = 1 - e^{-4 \cdot (1/6)} = 1 - e^{-2/3} = .487$
• The exponential distribution underlies queuing theory.
• A queue with exponential service times, exponential inter-arrival times, N (FCFS) servers and an infinite waiting room is called an M/M/N queue.

http://en.wikipedia.org/wiki/Exponential_distribution
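The arrival example (rate 4 per hour) can be reproduced in a few lines. This is a minimal sketch using only the standard library; the helper name `exp_cdf` is an illustrative choice.

```python
import math

def exp_cdf(t, rate):
    """P(T <= t) for an exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * t)

rate = 4.0                                   # arrivals per hour
mean_minutes = 60.0 / rate                   # mean inter-arrival time
median_minutes = 60.0 * math.log(2) / rate   # median inter-arrival time
p_under_10_min = exp_cdf(10 / 60, rate)      # gap shorter than 10 minutes

print(f"mean   = {mean_minutes:.2f} min")
print(f"median = {median_minutes:.2f} min")
print(f"P(T <= 10 min) = {p_under_10_min:.3f}")
```

The median is well below the mean because the distribution is skewed: most gaps are short, and a few long ones pull the mean up.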

Capacity management and queuing systems

• Capacity management involves determining the number of servers to use and the size of the waiting rooms.
• Examples
  – How many long term care beds are needed?
  – How many porters are needed?
  – How many nurses are needed?
  – How many cubicles are needed in an ED?
• Some healthcare systems have no buffers; all the waiting is done outside of the system or in upstream resources.
  – ALC cases waiting for LTC beds

Analyzing a queuing system

[Diagram: Inputs → Queue Analyzer → Outputs]

Inputs
• Arrival Rate
• Service Rate
• Number of Servers
• Buffer Size
• Population Size

Outputs
• Capacity Utilization
• Wait Time in Queue
• Queue Length
• Blocking Probability
• Service Levels

Armann Ingolfsson’s Queuing Calculator

http://www.puterman.chcm.ubc.ca/bahc510/22_template.xlsm



Some Fundamental Quantities
• Inputs
  – The arrival rate per hour: λ
  – The service rate per hour: µ
  – The number of servers: s
    • Can be 1 or more
  – The buffer size: K
    • Can be finite or infinite
• Derived Quantities
  – The offered load: λ/µ = R
  – Example: λ = 100 calls/hr and µ = 5 calls/hr per server
    • Then the offered load is 20 (this quantity is unitless)
    • This means the system needs at least 20 servers to meet its workload
• Another important quantity is the traffic intensity ρ = R/s
  – It gives the offered load per server
  – In the example, if there are 25 servers (s = 25), ρ = 20/25 = .8
  – So servers should be busy 80% of the time.
• If the traffic intensity exceeds 1, the system will be unstable.
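The call example can be checked numerically. This is a minimal sketch. The function names are illustrative, and `min_stable_servers` assumes an unlimited buffer, in which case the traffic intensity must be strictly below 1 for the queue to settle down.

```python
import math

def offered_load(arrival_rate, service_rate):
    """R = lambda / mu (unitless)."""
    return arrival_rate / service_rate

def traffic_intensity(arrival_rate, service_rate, servers):
    """rho = R / s, the offered load per server."""
    return offered_load(arrival_rate, service_rate) / servers

def min_stable_servers(arrival_rate, service_rate):
    """Smallest s with rho < 1 (assumes an unlimited buffer)."""
    return math.floor(offered_load(arrival_rate, service_rate)) + 1

R = offered_load(100, 5)              # 100 calls/hr, 5 calls/hr per server
rho = traffic_intensity(100, 5, 25)   # with 25 servers
print("offered load R =", R)
print("traffic intensity with 25 servers =", rho)
print("smallest stable number of servers =", min_stable_servers(100, 5))
```

With exactly 20 servers the traffic intensity would equal 1, which is the balanced case, so the sketch reports 21 as the smallest stable staffing level.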

Examples of Queuing Systems
• Walk-in clinic with 6 seats and 2 physicians
  – s = 2
  – K = 6
• Long term care facility with 100 beds
  – s = 100
  – K = ?
• A Finite Capacity Loss System
  – Model for an (old-fashioned) phone system
  – s servers
  – K = 0
  – When all servers are busy, system is blocked and customers are lost
• A Congestion System
  – s servers
  – K = ∞
  – When all servers are busy, customers wait

Performance Measures

• Capacity Utilization
• Probability the system is empty
• Average waiting time (in queue) – Wq
• Flow Time - Average Total Time in System – W
• Average queue length – Lq
• Average number of jobs in the system – L
• Probability that a customer waits for service
• Probability that there are k customers in the system
• Service Level – Probability that a customer waits less than T time units for service.

An Example - M/M/1 Queue
• Assume exponential inter-arrival time and service time distributions, infinite capacity and 1 server (s=1)
• Calculations below are based on analytical expressions available in most operations research (OR) texts.
• Customers arrive at rate 4 per hour, mean service time is 10 minutes.
  – Service rate is 6 per hour
  – System utilization = Probability the server is occupied = $\rho = \lambda/\mu = 4/6 = 2/3$.
  – Safety capacity = service rate – arrival rate = $\mu - \lambda = 2$
  – P(System is empty) = $1 - \rho = 1/3$.
  – P(k in the system) = $\rho^k (1 - \rho) = (1/3)(2/3)^k$
  – Average Time in system = 1/safety capacity = ½ hour
  – Average Time in queue = Average time in system – average service time = ½ - 1/6 = 1/3 hour
  – Average Queue Length = $\rho^2/(1 - \rho) = 4/3$
• Suppose arrival rate increases to 5.9 customers per hour.
  – Then $\rho = 5.9/6 = .9833$
  – So P(System is empty) = .0167; Average time in system = 10 hours and Average number of customers in the system = $\rho/(1 - \rho)$ = 59!
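The M/M/1 quantities in this example follow from the arrival rate and service rate alone. This is a minimal sketch. The function name `mm1` and the dictionary keys are illustrative, and the formulas are the standard M/M/1 steady-state expressions with traffic intensity equal to arrival rate divided by service rate.

```python
def mm1(arrival_rate, service_rate):
    """Steady-state measures for an M/M/1 queue with an unlimited buffer."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable: arrival rate must be below service rate")
    rho = arrival_rate / service_rate            # traffic intensity
    safety = service_rate - arrival_rate         # safety capacity
    return {
        "utilization": rho,
        "p_empty": 1 - rho,
        "time_in_system": 1 / safety,
        "time_in_queue": 1 / safety - 1 / service_rate,
        "number_in_system": rho / (1 - rho),
        "queue_length": rho**2 / (1 - rho),
    }

for lam in (4.0, 5.9):
    m = mm1(lam, 6.0)
    print(f"arrival rate {lam}: " +
          ", ".join(f"{k}={v:.4f}" for k, v in m.items()))
```

Moving the arrival rate from 4 to 5.9 per hour shrinks the safety capacity from 2 to 0.1, and the time in system grows by the same factor of 20.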

About the Waiting Line Analyzer
• An M/M/s queue is the same as an M/M/1 queue except that there may be more than one server.
  – In this model, there is a single buffer and s servers in the resource pool.
  – Jobs are processed on a FIFO basis.
  – When there are more than s jobs in the system, the buffer is occupied and waiting for service occurs. The Erlang-C formula gives the probability an arriving job has to wait.
• An M/M/s/K queue is an M/M/s queue with a finite buffer of size K.
  – There are at most K + s customers in the system.
  – When the buffer is filled, the system is blocked and customers are lost.
• QUEUECALC computes performance measures for
  – M/M/s queues
  – M/M/s queues with a finite buffer size
  – M/M/s queues with a finite population size
  – M/G/1 queues
• In addition, for a fixed T
  – For specified s it computes the percentage of jobs waiting less than T time units
  – It computes the number of servers needed to achieve a specified service level
    • How many servers are needed so that 90% of jobs wait no more than 10 minutes for service?

Problem 1
• Patients arrive at rate 5/hr. They require on average 1 hour of treatment.
  – What is the offered load?
• How many service providers do we need to ensure that the average wait time is 20 minutes or less?
  – Assume a large waiting room.
• Observe that we require more than 5 servers to ensure a stable system.
• Run “The Waiting Line Analyzer” to find
  – For 6 service providers: average number in queue is 2.94 and average wait time in queue is .5875 hours or 35.25 minutes
    • Note that with 6 service providers, the probability a customer waits (which equals the probability all 6 are occupied) is 58.75%.
    • The capacity utilization is 83%
  – For 7 service providers: average number in queue is 0.81 and average wait is .1621 hours or 9.72 minutes, which meets the 20 minute target.
    • Note that with 7 service providers, the probability a customer waits (which equals the probability all 7 are occupied) is 32.41%.
    • The capacity utilization is 71%
• Observe the trade-off between capacity utilization and service!
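The figures for 6 and 7 providers can be reproduced without the spreadsheet. This is a minimal sketch of the standard M/M/s (Erlang-C) calculation, assuming an unlimited waiting room. The function names and dictionary keys are illustrative and are not taken from the Waiting Line Analyzer.

```python
import math

def mms_metrics(lam, mu, s):
    """Steady-state measures for an M/M/s queue with an unlimited buffer."""
    offered = lam / mu              # offered load R
    rho = offered / s               # traffic intensity
    if rho >= 1:
        raise ValueError("unstable: need more than R servers")
    # Erlang-C: probability that an arriving job finds all s servers busy.
    tail = offered**s / math.factorial(s) / (1 - rho)
    head = sum(offered**k / math.factorial(k) for k in range(s))
    p_wait = tail / (head + tail)
    lq = p_wait * rho / (1 - rho)   # average number in queue
    wq = lq / lam                   # average wait in queue (Little's Law)
    return {"utilization": rho, "p_wait": p_wait, "Lq": lq, "Wq": wq}

def servers_for_mean_wait(lam, mu, target_wq):
    """Smallest number of servers whose average wait in queue meets the target."""
    s = math.floor(lam / mu) + 1    # start at the smallest stable value
    while mms_metrics(lam, mu, s)["Wq"] > target_wq:
        s += 1
    return s

for s in (6, 7):
    m = mms_metrics(5, 1, s)
    print(s, {k: round(v, 4) for k, v in m.items()},
          "wait (min):", round(60 * m["Wq"], 2))
print("servers needed for a 20 minute average wait:",
      servers_for_mean_wait(5, 1, 20 / 60))
```

Times are in hours because the rates are per hour, so the 20 minute target is passed as 20/60.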

More on Problem 1

• Service Levels
  – Suppose our target service times are 6 and 10 minutes – fill in the following table

Servers   P(Wq ≤ 6)   P(Wq ≤ 10)   Capacity Utilization
6
7
8
9
10

More on Problem 1

Servers   P(Wq ≤ 6)   P(Wq ≤ 10)   Capacity Utilization
6         .47         .50          83%
7         .73         .77          71%
8         .88         .90          63%
9         .95         .96          56%
10        .98         .98          50%
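The service levels in this table can be sketched with the standard M/M/s waiting-time result: P(Wq ≤ t) = 1 − C·exp(−(sµ − λ)t), where C is the Erlang-C probability of waiting. The sketch below obtains C from the Erlang-B recursion, which avoids large factorials. The function names are illustrative, and the model assumes FCFS service with an unlimited buffer.

```python
import math

def erlang_b(s, offered):
    """Blocking probability for s servers and no buffer (recursive form)."""
    b = 1.0
    for n in range(1, s + 1):
        b = offered * b / (n + offered * b)
    return b

def erlang_c(s, offered):
    """Probability an arrival has to wait in an M/M/s queue (requires offered < s)."""
    rho = offered / s
    b = erlang_b(s, offered)
    return b / (1 - rho * (1 - b))

def service_level(lam, mu, s, t):
    """P(Wq <= t): fraction of jobs that wait at most t time units."""
    c = erlang_c(s, lam / mu)
    return 1 - c * math.exp(-(s * mu - lam) * t)

# Problem 1 inputs: 5 arrivals/hr, 1 hour of treatment, targets of 6 and 10 minutes.
for s in range(6, 11):
    print(s,
          round(service_level(5, 1, s, 6 / 60), 3),
          round(service_level(5, 1, s, 10 / 60), 3),
          round(5 / s, 3))
```

The same function answers the staffing question in reverse: increase s until `service_level` reaches the required percentage.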

Still more on Problem 1

• Let’s explore the relationship between utilization (traffic intensity), queue lengths and wait times
  – Assume 5 servers and increase the arrival rate toward 5.
• Conclusion: as traffic intensity increases to 1, queue lengths and wait times increase rapidly

Arrival Rate   Utilization   Wait time in queue (hrs)   Queue Length
4              80%           0.55                       2.22
4.5            90%           1.52                       6.86
4.9            98%           9.50                       46.56
4.99           99.8%         99.50                      496.5

Problem 2 – A small walk in clinic

• A walk-in clinic has 3 doctors
• Average time spent with a patient is 12 minutes (5/hr)
• Patients arrive at rate of 12 per hour
• How many chairs should we have in the waiting room so only 5% of patients are turned away?
• Solution
  – Assume first an infinite waiting room
    • This shows average queue length is 2.59
  – Now try a model with a finite waiting room.
    • With 3 chairs 9% balk and 52% wait
    • With 4 chairs 7% balk and 55% wait
    • With 5 chairs 5% balk and 58% wait
  – In this last case average waiting time is .038 hours
    • This seems too fast.
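The turned-away percentages can be sketched with a finite-buffer M/M/s model in which the number of chairs is the buffer size K. The sketch makes two assumptions: a patient who finds every chair taken is lost, and the "wait" percentage is read as the probability that all doctors are busy. The function name `mmsk` is illustrative.

```python
def mmsk(lam, mu, s, K):
    """Blocking and all-servers-busy probabilities for M/M/s with K waiting spaces."""
    offered = lam / mu
    # Unnormalised steady-state weights for 0..s+K customers in the system.
    weights = [1.0]
    for n in range(1, s + K + 1):
        weights.append(weights[-1] * offered / min(n, s))
    total = sum(weights)
    probs = [w / total for w in weights]
    p_block = probs[-1]          # system full: the arrival is turned away
    p_busy = sum(probs[s:])      # all s servers occupied
    return p_block, p_busy

# 3 doctors, 5 patients/hr each, 12 arrivals/hr.
for chairs in (3, 4, 5):
    p_block, p_busy = mmsk(12, 5, 3, chairs)
    print(f"{chairs} chairs: {p_block:.1%} turned away, "
          f"{p_busy:.1%} find all doctors busy")
```

Each extra chair lowers the share of patients turned away but raises the share who find all doctors busy, since patients who would have left now join the queue.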

Problem 3 – Blocking in a Hospital Ward
• Bed requests arrive at the rate of 3 per day.
• Patients remain in beds for about 5 days
• How many beds are required so that the probability a patient is not admitted on arrival is less than 10%?
  – This is a finite capacity queuing system with no waiting room
  – Service rate = 1/5 = 0.2 patients per day per bed
  – Offered load = 3/.2 = 15, so we need at least 15 beds.
• Model this as a finite capacity queuing system with no waiting room; we want the blocking probability to be less than 0.1.
  – With 15 beds 18% are blocked
  – With 16 beds 14% are blocked
  – With 17 beds 11% are blocked
  – With 18 beds 9% are blocked
  – In this case (s=18) the capacity utilization is 76%
• Graph gives occupancy distribution or census.
• This probability is computed using the Erlang-B formula
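The blocking percentages can be reproduced with the Erlang-B formula in its usual recursive form. This is a minimal sketch. The function and variable names are illustrative, and utilization is taken as carried load per bed, meaning the offered load times the admitted fraction, divided by the number of beds.

```python
def erlang_b(s, offered):
    """Probability an arrival is blocked: s servers, no waiting room."""
    b = 1.0
    for n in range(1, s + 1):
        b = offered * b / (n + offered * b)
    return b

arrivals_per_day = 3.0
mean_stay_days = 5.0
offered = arrivals_per_day * mean_stay_days   # offered load R = 15

# Smallest number of beds with blocking probability below 10%.
beds = 1
while erlang_b(beds, offered) >= 0.10:
    beds += 1

for s in range(15, beds + 1):
    print(f"{s} beds: {erlang_b(s, offered):.1%} blocked")

utilization = offered * (1 - erlang_b(beds, offered)) / beds
print(f"beds needed: {beds}, utilization: {utilization:.0%}")
```

The recursion builds the answer for s beds from the answer for s − 1, so the search over bed counts is cheap and avoids large factorials.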


How can queuing theory improve porter scheduling?

[Figure: Tuesday (Centralized Operation), Oct 2003 - Feb 2004. Number of porters (left axis, 0 to 25) by hourly time slot (0:00 to 23:00), showing Current Capacity and Average Demand, with the Difference plotted as number of excess porters per hour (right axis, -5 to 5).]

Assumption: Porters handle 3.3 trips/hour

Implications of queuing formulas
• As the safety capacity vanishes, or equivalently, the traffic intensity increases to 1:
  – waiting time increases without bound!
  – queue lengths become arbitrarily long!
• In the presence of variability in inter-arrival times and service times, a balanced system will be highly unstable.
• These formulas enable the manager to derive performance measures on the basis of a few basic descriptors of the queuing system
  – The arrival rate
  – The service rate
  – The number of servers
• When the system has a finite buffer, the percentage of jobs that are blocked can also be computed

Summary
• When the manager knows the arrival rate and service rate, he/she can compute:
  – The average number of jobs in the queue.
  – The average time spent in the queue.
  – The probability an arriving patient has to wait.
  – The system utilization.
• This can be done without simulation!
• This information can be used to set capacity or explore the sensitivity of recommendations to assumptions or changes.
• Thus queuing theory provides a powerful tool to manage capacity.

Don’t Match Capacity with Demand

• If service rate is close to arrival rate then there will be long wait times.
  – Recall average queue length = $\rho^2/(1 - \rho)$
• For traffic intensity near 1, queue length will be very large.

[Figure: Queue Length vs. Traffic Intensity. Queue length (vertical axis, 0 to 120) against traffic intensity (horizontal axis, 0.5 to 1).]

Performance measure formulas (M/M/1 queue – no limit on queue size)

• System Utilization = P(Server is occupied) = $\rho = \lambda/\mu$
  – If traffic intensity increases, the likelihood the server is occupied increases
  – This occurs if the arrival rate increases or the service rate decreases
• P(System is empty) = $1 - \rho$
• P(k in system) = $\rho^k (1 - \rho)$
• Average Time in System = 1/Safety capacity = $1/(\mu - \lambda)$
• Average Time in Queue = Average time in system – average service time = $1/(\mu - \lambda) - 1/\mu$
  – If safety capacity decreases, time in queue increases!
• Average Number of jobs in the system (including being served) = $\rho/(1 - \rho)$
• Average Queue Length = $\rho^2/(1 - \rho)$
• If we know safety capacity, service time and traffic intensity, we can compute all system properties
• Little’s Law holds too: number in queue = arrival rate × waiting time in queue
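As a check on the Little's Law bullet, the identity can be verified from the M/M/1 expressions. This is a short worked derivation, writing $\rho = \lambda/\mu$ for the traffic intensity and using $W$, $W_q$, $L$, $L_q$ for the time in system, time in queue, number in system and queue length.

```latex
\begin{align*}
W   &= \frac{1}{\mu - \lambda}, \\
W_q &= W - \frac{1}{\mu}
     = \frac{\mu - (\mu - \lambda)}{\mu(\mu - \lambda)}
     = \frac{\lambda}{\mu(\mu - \lambda)}, \\
\lambda W_q &= \frac{\lambda^2}{\mu(\mu - \lambda)}
     = \frac{(\lambda/\mu)^2}{1 - \lambda/\mu}
     = \frac{\rho^2}{1 - \rho} = L_q, \\
\lambda W &= \frac{\lambda}{\mu - \lambda}
     = \frac{\rho}{1 - \rho} = L .
\end{align*}
```

With λ = 4 and µ = 6 this gives $L_q = 4 \times 1/3 = 4/3$ and $L = 4 \times 1/2 = 2$, matching the M/M/1 example.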

Idle Capacity And Wait Time Targets

[Figure: Relationship between Wait Times and Idle Capacity. Proportion of patients exceeding the wait time target (vertical axis, 0 to 100) against the percentage of time there is idle capacity (horizontal axis, 0 to 40).]

To ensure only 5% of patients exceed wait time target, there will be idle capacity 23% of the time.
