A stochastic dynamic pricing model for the multiclass problems in the airline industry


European Journal of Operational Research 000 (2014) 1–13


European Journal of Operational Research

journal homepage: www.elsevier.com/locate/ejor

Decision Support


Daniel F. Otero, Raha Akhavan-Tabatabaei∗

Centro para la Optimización y la Probabilidad Aplicada (COPA), Departamento de Ingeniería Industrial, Universidad de los Andes, Bogotá, DC, Colombia

Article info

Article history:

Received 20 July 2013

Accepted 23 September 2014

Available online xxx

Keywords:

Revenue management

Phase-type distributions

Stochastic dynamic programming

Dynamic pricing

OR in airlines

Abstract

In the airline industry, the ticket price set for each flight directly affects the number of people who will later try to buy a ticket. Depending on the willingness-to-pay of the customers, the flight might take off with empty seats or with seats sold at a lower price. Therefore, based on the behavior of the customers, a price must be fixed for each type of product in each period. We propose a stochastic dynamic pricing model to solve this problem, applying phase-type distributions and renewal processes to model the inter-arrival time between two customers that book a ticket and the probability that a customer buys a ticket. We test this model in a real-world case, where the revenue is increased on average by 31 percent.

© 2014 Published by Elsevier B.V.

1. Introduction

In the airline industry, deciding the price for a product (i.e., the ticket) is one of the major problems to be solved on a daily basis (Talluri & Van Ryzin, 2004). On the other hand, the price that is decided for a product directly affects its future demand (Phillips, 2005). In a single flight there are different products with different demand behaviors, and the price decision is to be made for each one of these products. Also, the product offered by the airlines cannot be replenished and is perishable, indicating that there is a finite number of products and a limited time period to which the decision is confined. If at the beginning of a period a low price is offered for a particular product then it is more likely that many potential customers buy the tickets at a lower price than what they would be willing to pay. On the other hand if a higher price is fixed only a few potential customers are likely to buy tickets and even lowering the price at the end of the sales period might still leave some unsold tickets. Therefore, the challenge in the airline industry is to fix a price in each period of time that minimizes this loss of revenue and takes advantage of the willingness-to-pay of the customers. Because the decision taken in a particular moment of time affects what could happen in the following periods and it should also be renewed dynamically to maximize the revenue, this problem is commonly known as dynamic pricing (Talluri & Van Ryzin, 2004).

∗ Corresponding author at: Departamento de Ingeniería Industrial, Universidad de los Andes, Carrera 1a Este No. 19A-40, Bogotá, DC, Colombia. Tel.: +57 3394949 Ex 3870.

E-mail address: [email protected] (R. Akhavan-Tabatabaei).

http://dx.doi.org/10.1016/j.ejor.2014.09.038

0377-2217/© 2014 Published by Elsevier B.V.


Revenue Management (RM) techniques have been applied to the dynamic pricing problems in the airline industry for approximately 40 years. The objective of applying these techniques is to dynamically change the prices during the selling period, such that each seat is sold at the maximum price possible. The prices fixed by the airline for a product during the selling periods depend on the type of product, the type of customer, and the distribution channel. The airlines not only decide the new prices by these factors but also study the competitors' decisions closely. Application of RM to airline ticket pricing has been shown to increase the revenue by between 2 percent and 8 percent (Li & Ji-hua, 2007). This in turn has increased the number of scientific studies on how to efficiently change the prices to maximize the total revenue of a flight (Talluri & Van Ryzin, 2004).

To maximize the total revenue, the price has to be changed in each period, based on the behavior of the demand. Also, in each period of time the decision to change the price for the next period should be made for each product. In the literature, to solve this problem, various dynamic pricing models are proposed. Maglaras and Meissner (2006) provide a thorough survey of the different approaches that use linear and nonlinear programming, stochastic and deterministic dynamic programming and statistical models, among others. In most of these approaches, similar assumptions are taken into account such as: Non-Homogeneous Poisson Process (NHPP) for the inter-arrival time distribution of bookings, absence of competition, known probability of buying a product, and the absence of effects such as no-shows, costs of overbooking and cancellations (Talluri & Van Ryzin, 2004). There are also some models that relax one or more of these assumptions. For example Li and Ji-hua (2007) propose a model with one competitor, Li and Chen (2009) present time-dependent arrival time distributions with multiple products, and Chen (2012) presents censored observation approaches where the demand is unknown.

Most of the models consulted in this paper share the assumption of NHPP for the inter-arrival time distribution of bookings. To accurately estimate the rates for the NHPP, the inter-arrival times need to follow an exponential distribution in each period of time. However, in reality this behavior could follow any distribution besides the exponential (Talluri & Van Ryzin, 2004). Therefore, a major challenge in this problem is how to accurately and efficiently estimate the behavior of the demand. This behavior can be divided into two parts, i.e., the demand inter-arrival time and the probability that a ticket is bought at a certain price.

There are different approaches on how to estimate the behavior of the customers in terms of their demand inter-arrival time distribution and their buying probability of a booked ticket. For example Chen (2012) applies Markov Decision Process (MDP) models with censored data, where censored data represent the non-buying customers due to product unavailability. Haensel and Koole (2011) apply a customer choice sets approach where the customers are grouped by their preferences. Also Fiig, Isler, Hopperstad, and Belobaba (2010) and Phillips (2005) propose multinomial approaches where they estimate the utility that the customer perceives over the available products. Although these methodologies try to estimate the probability distributions mentioned, most of them use the utility term, which defines the customer's appeal toward a product. This value is difficult to estimate unless information on the different choices that a customer can make is available. One of the possible ways to get this information is by conducting interviews, which is time consuming, expensive, and requires a large sample size to create useful data.

In a typical flight different types of products can be sold (Phillips, 2005), where each product has a different behavior of demand. Most of the airlines use RM software to help decide the price at which to sell a ticket. But the models behind the software suggest the optimal price using different methods of estimating the buying probability and the demand inter-arrival distribution. Ernst and Kamrad (2006) argue that these models rely heavily on historical data and can be inaccurate. On the other hand there are airlines that do not have RM software and their decisions are based on experience and comparison with the performance of the competition, which does not necessarily guarantee an optimal solution. Therefore, in this paper, we propose a stochastic dynamic programming model to solve the problem of maximizing the revenue over a finite horizon, divided into various decision periods and with different products. In each decision period a price is decided, from a pool of available prices, for each product depending on the number of seats available for each of these products. We work under the assumption that the demand inter-arrival time does not necessarily follow a NHPP, and the probability that a ticket is bought at a certain price is not known and has to be estimated as well. In this model we approximate the inter-arrival time distribution of demand and the buying probability using Phase Type (PH) distributions. The PH distributions are combinations of exponential distributions (phases) that can fit different types of behavior (Latouche & Ramaswami, 1999). We use various algorithms proposed in the literature to fit the distributions of demand inter-arrival time and the buying probability, depending on the data characteristics. Finally, we use renewal processes to implement the PH distributions in counting processes.

Given the PH estimate of the inter-arrival time distribution, the renewal process built on it, and the buying probability, we solve the stochastic dynamic program by backward induction for different scenarios, each with a different combination of parameters, to test the performance of our proposed method. We vary the number of seats and the type of distributions, and finally test our proposed model on a case study with industry data.

The rest of the paper is organized as follows. In Section 2 we present the proposed model with a detailed explanation on how to estimate each of the distributions and introduce the dynamic pricing model. In Section 3 we introduce the PH family of distributions and discuss data fitting techniques to a PH distribution using different algorithms. In Section 4 we present some structural properties of the dynamic pricing model and prove the existence of an optimal solution. In Section 5 we present the results for the scenarios and the case study, dividing them into three parts: the fitting of the purchasing probability distribution, the fitting of demand inter-arrival time distribution, and the results of the dynamic pricing model. Finally in Section 6 we conclude the paper and state future steps for this research.

2. The proposed model

In this section we first introduce some airline specific jargon and definitions, then we define the essential components of our proposed dynamic pricing model. The first component consists of estimating the probability that a customer buys a previously booked ticket at a certain price. We refer to this component as the ticketing probability. The second component is the probability that j customers buy tickets for a given product on a particular flight in a period of time t, which we will call the buying probability. The buying probability depends on the ticketing probability and the inter-arrival time distribution of the demand. We estimate the buying probability by the number of bookings (reservations) that the airline receives. To estimate the behavior of the probability distributions closer to reality and with more flexibility, we propose to fit PH distributions to them. This fitting allows us to find a relatively accurate approximation to the behavior of the underlying distributions when they do not necessarily follow the form of any standard distribution (e.g., the multi-modality behavior) (Latouche & Ramaswami, 1999). Then we apply renewal processes to approximate the arrival process of bookings by a counting process. After estimating the probability distributions of ticketing and buying, we present the dynamic pricing model that attempts to find optimal pricing policies, given these distributions.

2.1. Definitions and notation

In this section we present the terms and definitions used in the context of airline revenue management. We define booking as the reservation of a seat where no money is involved. On the other hand, ticketing occurs when the customer pays for this booking. We estimate the demand for a particular ticket by the number of bookings received for that product. Using these two definitions, the probability distribution of ticketing is defined as the probability that a booked ticket is paid for and bought eventually.

A product is defined by the benefits that it offers. The most common benefits of a product include the number of miles received, the costs of changing the ticket, the type of cabin, and promotions. Each product can be offered in various flying classes, which we denote as c. Each of these flying classes has a price assigned, depending on the type of product that they represent. In general, the number of cabins, the number of products, and the number of classes vary depending on the airline. Some airlines offer only two classes while others can have up to 26 different classes in a single flight (Phillips, 2005). For each class c, only one type of product is offered, while many classes can offer the same product.

Tickets go on sale almost a year before the departure date of the flight, and this period is called the selling period. Because different classes can have the same product but with different prices, only one class is available per product in each period of time. Therefore, during the selling period the class offered for a particular product changes, hence changing the prices of the product according to the available class. So the price that a customer sees during the selling period is the price of the available class for that product. The objective of our proposed dynamic pricing model is to set the prices in each time epoch throughout the selling period, such that the total revenue is maximized. All of the classes that offer the same product are grouped in fare families denoted by k. So each fare family represents a particular product.

Fig. 1. Booking behavior during the selling period.

Any average to large-scale commercial airline can have between 2000 and over 6000 flights during the day, which create plenty of data including the number of bookings per day, the number of tickets bought, and any related information regarding the flight. The selling period can be divided into different epochs where the prices are decided for each product. These epochs are commonly called Data Collection Points (DCP). The DCPs are moments in time when the airlines collect data, and are chosen such that the data can represent the behavior of the customers. The information saved on each DCP for all the flights that are available for purchase includes the data of tickets and bookings. Normally airlines only save the data for 18–25 DCPs out of the whole selling period (Talluri & Van Ryzin, 2004).

The Authorized Capacity (AU) refers to the number of seats available for sale for each class. The sum of the AU of classes in a fare family is the AU of the fare family, and the sum of these is the AU of the flight. The total AU of a flight is usually bigger than or equal to the number of seats on the plane that is assigned to the flight. During the selling period the AU remains constant, and the number of bookings cannot exceed this value. Fig. 1 shows a common curve of bookings for a given flight. Normally there is a decreasing effect at the end of the curve. This effect is mostly produced by cancellations of bookings during the whole period. Also at the departure date we may observe the no-show effect, which is defined as the number of customers that do not show up to board the flight for various reasons.

The price to be decided for a product varies depending on the behavior of demand, and this behavior changes for each product depending on the flight leg, flight number, or seasonality (Talluri & Van Ryzin, 2004). A leg is the route that a plane follows between two destinations. For example, consider the flight between the airports JFK and LAX. In each day there are different frequencies or flights between these two airports. Each of these flights takes off at a different hour, so each of them has a different flight number. Each of these flight numbers has a different demand behavior, depending on its departure time. A family is more likely to fly during the day, even more so if they are traveling with small children. A business traveler usually flies in the early morning and late night hours. Those who are looking for cheaper prices tend to fly in the first or the last flights of the day. The seasonality also affects the demand behavior. Vacations in the middle and at the end of the year help increase the price of the tickets, because most of the customers are willing to pay more. Therefore, in the estimation of the ticketing and buying probability distributions, the flight leg, flight number, and seasonality should be taken into account.



2.2. Ticketing probability

We define the ticketing probability as the probability that a booking turns into a ticket. As shown in the booking curve of Fig. 1, there is a point where the maximum number of bookings is reached. This number of bookings cannot exceed the AU capacity of the flight. Assuming that the number of tickets is the number of people that actually take the flight, we can estimate the probability that a booking turns into a ticket. Let $Q$ be the total number of tickets sold and $\mathrm{maxB} = \max\{b_t\}\ \forall t$, where $b_t$ is the number of bookings during any period of time with length $t$, along the selling period. So the ticketing probability $\mathrm{Pr}(Q)$ is determined as follows:

$$\mathrm{Pr}(Q) = \frac{Q}{\mathrm{maxB}}. \tag{1}$$

For each flying class a booking curve (similar to Fig. 1) is generated. Assuming that $Q_c$ is the number of tickets sold for class $c$ and $\mathrm{maxB}_c$ the maximum number of bookings for class $c$, then the probability $\mathrm{Pr}_c$ that any booking turns into a ticket of a particular class $c$ can be obtained as:

$$\mathrm{Pr}_c = \frac{Q_c}{\sum_c \mathrm{maxB}_c}, \tag{2}$$

and the probability $\mathrm{Pr}_0$ of a ticket not being bought is

$$\mathrm{Pr}_0 = 1 - \sum_c \mathrm{Pr}_c. \tag{3}$$
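To make Eqs. (2) and (3) concrete, the following minimal Python sketch computes the class-level ticketing probabilities from booking-curve summaries. The function name, the class labels, and the counts are hypothetical and only illustrate the arithmetic; they are not data from the case study.

```python
def ticketing_probabilities(tickets_sold, max_bookings):
    """Eq. (2): Pr_c = Q_c / sum_c maxB_c, and Eq. (3): Pr_0 = 1 - sum_c Pr_c.

    tickets_sold[c] is Q_c and max_bookings[c] is maxB_c for class c.
    """
    total_bookings = sum(max_bookings.values())
    if total_bookings <= 0:
        raise ValueError("at least one booking is needed")
    pr = {c: tickets_sold[c] / total_bookings for c in max_bookings}
    pr_none = 1.0 - sum(pr.values())
    return pr, pr_none


# Hypothetical counts for three classes (assumed, for illustration only).
tickets_sold = {"Y": 30, "B": 20, "M": 10}
max_bookings = {"Y": 40, "B": 35, "M": 25}
pr, pr_none = ticketing_probabilities(tickets_sold, max_bookings)
print(pr, pr_none)
```

Because the denominator in Eq. (2) is shared by all classes, the class probabilities and $\mathrm{Pr}_0$ sum to one whenever the tickets sold do not exceed the bookings.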

As mentioned in Section 2.1, each class within a fare family varies only in the value of the ticket. Given the ticketing probability for a class $c$ (i.e., $\mathrm{Pr}_c$) and the price assigned to that class, there is a probability that a booking turns into a ticket for the assigned price $p$. Let $p$ be the price for a ticket in any period of time. If a customer arrives and is willing to pay more than this price then he is going to buy the ticket at price $p$. Setting a price for each class, and using Eq. (2), let $\mathrm{Pr}(P_c) = \mathrm{Pr}_c$ be the probability that at price $P_c$ a booking turns into a ticket, where the price $P_c$ is the price set for class $c$. Then the probability $g(p)$ that a customer buys a ticket at price $p$ is:

$$g(p) = \sum_{P_c \ge p} \mathrm{Pr}(P_c). \tag{4}$$

Now, assuming that there exists a continuous probability distribution $f(x)$ that describes the probability of a booking turning to a ticket at price $x$, then Eq. (4) can be written as:

$$g(p) = \int_p^{\infty} f(x)\,dx, \tag{5}$$

and the probability that a customer that arrives does not buy a ticket at price $p$ is:

$$1 - g(p). \tag{6}$$
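The discrete form in Eq. (4) can be sketched in a few lines of Python. The class prices and probabilities below are hypothetical, and the function name is an assumption introduced for illustration; the sketch only shows that $g(p)$ accumulates the ticketing probabilities of all classes priced at or above $p$.

```python
def buy_probability(p, class_prices, class_probs):
    """Eq. (4): g(p) = sum of Pr(P_c) over all classes c with price P_c >= p."""
    return sum(class_probs[c] for c, price in class_prices.items() if price >= p)


# Hypothetical class prices P_c and ticketing probabilities Pr(P_c).
class_prices = {"Y": 300.0, "B": 200.0, "M": 100.0}
class_probs = {"Y": 0.3, "B": 0.2, "M": 0.1}

for price in (100.0, 150.0, 301.0):
    g = buy_probability(price, class_prices, class_probs)
    print(price, g, 1.0 - g)  # price, g(p) from Eq. (4), and 1 - g(p) from Eq. (6)
```

As expected from the definition, $g(p)$ is non-increasing in $p$: raising the price can only shrink the set of classes that contribute to the sum.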

2.3. Buying probability

We estimate the probability that $j$ customers arrive in a period $\Delta t$ by a counting process. A counting process represents the probability that $l$ events happen during a period of time $\Delta t$, and is defined by $\mathrm{Pr}(N(\Delta t) = l)$. To estimate the probability that $j$ customers buy tickets during a period of time $\Delta t$, we need to estimate the conditional probability that if $l$ customers arrive, $j$ of them buy tickets and $l - j$ do not. Using Eqs. (5) and (6), the probability of this event occurring is:

$$\mathrm{Pr}(j \mid l) = \begin{cases} \binom{l}{j}\, g(p)^{j} (1 - g(p))^{l-j} & j \le l \\ 0 & j > l, \end{cases} \tag{7}$$

so the unconditional probability that $j$ customers buy their ticket during a period of time $\Delta t$ at price $p$ will be:

$$\mathrm{Pr}(j, \Delta t, p) = \sum_{l=j}^{\infty} \mathrm{Pr}(N(\Delta t) = l) \binom{l}{j}\, g(p)^{j} (1 - g(p))^{l-j} \quad \forall j \le AU. \tag{8}$$

This probability is to be estimated for each price $p$ for each fare family $k$ and over the set of feasible prices to be assigned to $k$. Let $P_{ik}$ denote the price $i$ from the set of feasible prices of fare family $k$, where each fare family has a different number of feasible prices to choose from, then Eq. (8) can be written as follows for each fare family $k$:

$$\mathrm{Pr}_k(j, \Delta t, P_{ik}) = \sum_{l=j}^{\infty} \mathrm{Pr}_k(N(\Delta t) = l) \binom{l}{j}\, g_k(P_{ik})^{j} (1 - g_k(P_{ik}))^{l-j} \quad \forall j \le AU. \tag{9}$$

If the inter-arrival time distribution of the arriving $l$ customers is exponential, then the counting process $\{N(\Delta t), t \ge 0\}$ is a Poisson process and hence $\mathrm{Pr}_k(N(\Delta t) = l)$ follows a Poisson distribution. However, the assumption of exponential distribution for the inter-arrival time of demand for airline products is not necessarily valid and in many cases non-exponential behavior has been observed (Talluri & Van Ryzin, 2004). For such cases, we model the arrival process using renewal processes that relax the assumption of exponentiality.
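As a numerical check of Eq. (8), the following minimal Python sketch truncates the infinite sum and evaluates it for the Poisson special case mentioned above. The function names, the truncation level `l_max`, and the numbers (a mean of 6 arrivals in $\Delta t$ and $g(p) = 0.4$) are illustrative assumptions, not values from the paper. For Poisson arrivals the result should coincide with a Poisson probability with mean $6 \times 0.4$ (the standard thinning property), which gives a simple way to validate an implementation before a non-exponential counting distribution is plugged in.

```python
import math


def poisson_pmf(l, mean):
    """Pr(N = l) for a Poisson count with the given mean (mean > 0)."""
    return math.exp(-mean + l * math.log(mean) - math.lgamma(l + 1))


def buying_probability(j, arrival_pmf, g, l_max=200):
    """Eq. (8), truncated at l_max: sum over l >= j of
    Pr(N = l) * C(l, j) * g^j * (1 - g)^(l - j).

    arrival_pmf(l) returns Pr(N(dt) = l); g is the buy probability g(p).
    """
    total = 0.0
    for l in range(j, l_max + 1):
        total += arrival_pmf(l) * math.comb(l, j) * g**j * (1.0 - g) ** (l - j)
    return total


# Assumed example: on average 6 arrivals in the period, g(p) = 0.4.
mean_arrivals, g = 6.0, 0.4
for j in range(4):
    from_eq8 = buying_probability(j, lambda l: poisson_pmf(l, mean_arrivals), g)
    thinned = poisson_pmf(j, mean_arrivals * g)
    print(j, from_eq8, thinned)
```

The `arrival_pmf` argument is deliberately a plain callable, so the same function can be reused with any other counting distribution, for example one derived from a renewal process.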

2.4. Dynamic pricing model

After estimating the ticketing and buying probability distributions, we can proceed to define the dynamic pricing model that attempts to maximize the total revenue received for a particular flight. The maximum revenue is obtained by choosing the optimal price for each fare family in each period of time through a stochastic dynamic programming model.

We define the state of the system as the number of available seats in each fare family. In this model, the probability of going from one state to the other is composed of the combination of the inter-arrival time distribution of bookings and the probability that a booking turns into a ticket, i.e., $\mathrm{Pr}_k(N(\Delta t) = l)$ and $g(p)$, respectively. Given these probabilities in each period of time we can evaluate the total expected discounted revenue for making a decision, which is composed of the expected immediate revenue received in the current period of decision making and the expected revenue received in the following periods of time (till the end of the selling period) discounted to their present value. This total expected revenue is described by the Bellman equation and can be maximized by a variety of techniques including backward induction, which we apply in this paper (Puterman, 2005).



2.4.1. Model assumptions

The airline industry has many variables that could affect the revenue of a single flight. However, some of these variables are difficult to estimate through data, and therefore some assumptions need to be made on such variables. The first and one of the strongest assumptions is the lack of any kind of competition. Under the influence of competition the price is regulated by the market. The market stands for all the available airlines that operate a particular leg. If the market increases the prices then the airline can increase its prices in accordance with the market. On the other hand when the market reduces the prices then the airline needs to reduce its price too, in order not to lose its market share. In the absence of competition an airline can fix the optimum price that only depends on the demand behavior. The survey made by Talluri and Van Ryzin (2004) indicates that most of the models for airline RM do not consider the market. For example, Li and Chen (2009) and Chen (2012) propose dynamic pricing models where there is no competition. They make this assumption because obtaining the data of the competitors is a difficult task, if not impossible. Li and Ji-hua (2007) propose a model with one competitor but assuming that the data are known, which is a strong and in some cases unrealistic assumption. In our model we assume no competition as well.

As for the behavior of the customers, we also make some assumptions. First, we assume that each fare family has a different type of product. If a customer is looking for a specific product, then he is not going to buy any other one, even if the product he is looking for is out of stock. Also we assume that there is no time lag between when the customer books a ticket and buys it.

We also assume that the price fixed for each fare family must be greater than or equal to the price of a fare family with fewer benefits. So the price decision for a fare family depends on the decision taken for the other fare families. Although the demand behavior of each fare family is independent of the others, the problem is then solved altogether in order to observe this restriction. Although this assumption seems reasonable in some cases, when discounts apply this behavior does not necessarily happen. Offering discounts on a particular product is a decision made in order to increase the number of bookings, which inflicts loss on the revenue but causes an increase in customer loyalty. Although increase in loyalty is always good, indiscriminate use of discounts can generate high costs (Talluri & Van Ryzin, 2004). Therefore, we also assume that discounts are not possible in the decision for prices. Finally we do not consider the effects of cancellation, oversale, and no-shows like most of the models surveyed by Talluri and Van Ryzin (2004), because they need large quantities of industry data to estimate these effects adequately.
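The ordering restriction on fare-family prices stated above couples the otherwise independent price choices. A minimal Python sketch of how the feasible joint decisions could be enumerated is given below; the function name and the price lists are hypothetical, and the fare families are assumed to be listed from the one with the fewest benefits to the one with the most.

```python
from itertools import product


def feasible_price_vectors(price_sets):
    """Return every joint price choice whose prices are non-decreasing across
    fare families, with price_sets ordered from fewest to most benefits."""
    return [
        prices
        for prices in product(*price_sets)
        if all(low <= high for low, high in zip(prices, prices[1:]))
    ]


# Hypothetical feasible prices for two fare families (assumed values).
price_sets = [[100, 150], [120, 200]]
decisions = feasible_price_vectors(price_sets)
print(decisions)  # the vector (150, 120) is excluded by the restriction
```

Filtering the Cartesian product is only practical for small price sets; it is shown here because it mirrors the assumption directly, not because it is how the restriction must be implemented.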

2.4.2. The stochastic dynamic programming model for pricing

Let $V_t(X_1, X_2, \ldots, X_n)$ be the total expected revenue at time $t$ where $X_k$ seats are remaining to be sold for fare family $k$, $k = 1, \ldots, n$. Let the set of decision epochs be $E = \{1, 2, 3, \ldots, m\}$ where each decision epoch is a DCP defined by the airline and $m$ is the departure date, the last epoch where a decision can be made. In each period of time $t \in E$, we are going to decide on the set of prices that maximizes $V_t$. In each of these periods of time the airline has new information, mostly of new bookings received. Having the new bookings received, the airline knows how many seats are still available. The set of states $S$ is defined as the number of available seats per fare family where any combination of remaining seats is possible, and that on the first epoch the only state is the total number of seats of the flight, i.e., the AU of each fare family. This set can be written as:

$$S = \begin{cases} AU_1, AU_2, \ldots, AU_n & t = 1 \\ X_1, X_2, \ldots, X_n \quad 0 \le X_i \le AU_i & \text{for } 1 < t \le m. \end{cases} \tag{10}$$

Each fare family has a possibly distinct set of prices to be decided on by the airline. Therefore, the set of decisions is defined as $D = \{P_{i1}, P_{i2}, \ldots, P_{in}\}$, where $P_{ik}$ is the $i$th possible price to choose for the fare family $k$. Note that choosing the $i$th price for a particular fare family does not mean that the $i$th price is also chosen for the other fare families, and that this decision is independent for each fare family. Also the decision is only taken when the number of available seats is bigger than 0 for a specific fare family. If a fare family has 0 seats available then no decision is taken. Finally if all the seats are sold for all the fare families, then the expected revenue for the following periods is 0 and no more decisions are made. Another terminal condition occurs when the $m$th decision epoch, i.e., the departure date, is reached, since this is the last day the prices can be fixed. The set of probabilities of going from one state to another depending on the decision taken and the time epoch is defined by Eq. (9). Let $S_t$ be the set of states in epoch $t$ and $X_{kt}$ the number of available seats in fare family $k$ in epoch $t$, then the probability of going from any state in $S_1 = \{X_{11}, X_{21}, \ldots, X_{n1}\}$ in time epoch $t_1$ to any state in the set $S_2 = \{X_{12}, X_{22}, \ldots, X_{n2}\}$ in time epoch $t_2$, where $X_{i1} \ge X_{i2}\ \forall X_{i1} \in S_1$ and $X_{i2} \in S_2$, is calculated as:

$$\mathrm{Pr}(S_1 \to S_2 \mid P_{i1}, P_{i2}, \ldots, P_{in}) = \mathrm{Pr}_1(X_{11} - X_{12}, \Delta t, P_{i1}) \times \mathrm{Pr}_2(X_{21} - X_{22}, \Delta t, P_{i2}) \cdots \mathrm{Pr}_n(X_{n1} - X_{n2}, \Delta t, P_{in}), \tag{11}$$

where $\Delta t = t_1 - t_2$ and the number of tickets sold during this period of time for the fare family $k$ is defined by $X_{k1} - X_{k2}$. On the other hand, if at least one pair of available seats follows the inequality $X_{i1} < X_{i2}$ then the probability shown in Eq. (11) is 0.
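The product form of Eq. (11) can be sketched as follows. The per-family sale distributions are passed in as plain callables, and the toy distributions used here are hypothetical stand-ins for $\mathrm{Pr}_k(j, \Delta t, P_{ik})$ at already chosen prices; the sketch follows Eq. (11) as written and does not add any boundary treatment for a fare family that sells out.

```python
def transition_probability(state_from, state_to, sale_pmfs):
    """Eq. (11): product over fare families of Pr_k(X_k1 - X_k2, dt, P_ik).

    sale_pmfs[k](j) returns the probability of selling j tickets of fare
    family k during the period at the chosen price. The result is 0 if any
    fare family would gain seats.
    """
    prob = 1.0
    for seats_before, seats_after, pmf in zip(state_from, state_to, sale_pmfs):
        if seats_after > seats_before:
            return 0.0
        prob *= pmf(seats_before - seats_after)
    return prob


# Hypothetical sale distributions for two fare families (assumed values).
def pmf_family_1(j):
    return {0: 0.5, 1: 0.3, 2: 0.2}.get(j, 0.0)


def pmf_family_2(j):
    return {0: 0.6, 1: 0.4}.get(j, 0.0)


pmfs = [pmf_family_1, pmf_family_2]
print(transition_probability((2, 1), (1, 1), pmfs))  # one sale in family 1, none in family 2
print(transition_probability((1, 1), (2, 1), pmfs))  # seats cannot increase
```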

The set of rewards is also calculated using the probability term derived in Eq. (9). The reward function $R(t, P_{i1}, P_{i2}, \ldots, P_{in})$ is defined as the revenue received from the seats sold at price $i$, from each fare family $k$, in a period of time $t$. This reward function is calculated as:

$$R(t, P_{i1}, P_{i2}, \ldots, P_{in}) = \sum_{k=1}^{n} P_{ik} \sum_{j=0}^{X_k} j\, \mathrm{Pr}_k(j, \Delta t, P_{ik}) \quad \forall t \in E. \tag{12}$$

Eq. (12) shows that the expected revenue received for the fare family $k$ is the price $P_{ik}$ chosen, times the expected value of the number of tickets sold at this price, i.e., $\sum_{j=0}^{X_k} j\, \mathrm{Pr}_k(j, \Delta t, P_{ik})$, during the period of time $\Delta t$, where $\Delta t$ is the difference in time between the last and the current decision epochs. Then the total expected reward is calculated as the sum of the expected revenue received per fare family. Using Eq. (11), the expected revenue received in the next period, depending on the decision made in the current period, is defined as:

$$E(V_{t+1} \mid P_{i1}, P_{i2}, \ldots, P_{in}) = \sum_{j_1=0}^{X_1} \mathrm{Pr}_1(j_1, \Delta t, P_{i1}) \cdots \sum_{j_n=0}^{X_n} \mathrm{Pr}_n(j_n, \Delta t, P_{in})\, V_{t+1}(X_1 - j_1, \ldots, X_n - j_n) \quad \text{for } t = 1, \ldots, m. \tag{13}$$

Using Eqs. (12) and (13) we can determine the Bellman equation for this dynamic pricing model, which maximizes the total expected revenue in each period of time given the immediate reward $R(t, P_{i1}, P_{i2}, \ldots, P_{in})$ for choosing the price $P_{ik}$ for fare family $k$ and the expected revenue received in the following epochs, till the end of the selling period. The equation is defined as:

$$V_t(X_1, X_2, \ldots, X_n) = \max_{\{P_{i1}, P_{i2}, \ldots, P_{in}\}} \left\{ R(t, P_{i1}, P_{i2}, \ldots, P_{in}) + E(V_{t+1} \mid P_{i1}, P_{i2}, \ldots, P_{in}) \right\} \quad \text{for } t = 1, \ldots, m. \tag{14}$$
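The recursion in Eq. (14) can be illustrated with a deliberately small Python sketch for a single fare family. Everything specific in it is an assumption made for illustration: the number of tickets sold per epoch is taken to be Poisson with a price-dependent mean `mean_sales[p]`, demand beyond the remaining seats is lumped onto the sell-out outcome so that the sale probabilities sum to one, and the value after the last epoch is set to zero. It is a sketch of backward induction on this kind of Bellman equation, not the authors' implementation.

```python
import math


def poisson_pmf(j, mean):
    """Pr(N = j) for a Poisson count with the given mean (mean > 0)."""
    return math.exp(-mean + j * math.log(mean) - math.lgamma(j + 1))


def sales_pmf(seats, mean):
    """Pr(j tickets sold | seats left), j = 0..seats. Assumption: demand in
    excess of the remaining seats is lumped onto j = seats."""
    pmf = [poisson_pmf(j, mean) for j in range(seats)]
    pmf.append(max(0.0, 1.0 - sum(pmf)))
    return pmf


def backward_induction(seats, epochs, prices, mean_sales):
    """Solve V_t(x) = max_p { R(t, p) + E[V_{t+1}(x - j)] } for one fare family."""
    value = [[0.0] * (seats + 1) for _ in range(epochs + 2)]  # V_{m+1} = 0
    policy = [[None] * (seats + 1) for _ in range(epochs + 1)]
    for t in range(epochs, 0, -1):
        for x in range(1, seats + 1):  # no decision is taken with 0 seats
            best = None
            for p in prices:
                pmf = sales_pmf(x, mean_sales[p])
                reward = p * sum(j * q for j, q in enumerate(pmf))  # as in Eq. (12)
                future = sum(q * value[t + 1][x - j] for j, q in enumerate(pmf))  # as in Eq. (13)
                if best is None or reward + future > best[0]:
                    best = (reward + future, p)
            value[t][x], policy[t][x] = best
    return value, policy


# Hypothetical instance: 5 seats, 3 decision epochs, two candidate prices.
value, policy = backward_induction(5, 3, [100, 200], {100: 1.0, 200: 0.3})
print(value[1][5], policy[1][5])
```

With several fare families the state becomes the vector $(X_1, \ldots, X_n)$ and the inner maximization runs over joint price vectors, so the loops above grow multiplicatively; the structure of the recursion is unchanged.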

We solve this problem using backward induction (Puterman, 2005), taking into account that the total number of decisions in each time epoch depends on the number of fare families and the number of initial available seats to sell. Recalling that $S$ is the set of states, let $|S|$ be the size of $S$, and $|E|$ the size of the set of decision epochs $E$. Also, let $|S_t|$ be the number of states in the epoch $t$, then the number of total decisions can be calculated with the following equation:

$$\sum_{t \in E} |S_t| = \left( \prod_{i \in \mathrm{Classes}} (AU_i - 1) - 1 \right) (|E| - 1) + 1. \tag{15}$$

Eq. (15) uses the following reasoning. In all the decision epochs

xcept the first one, any combination of available seats is possible ex-

ept when the remaining number of seats is 0 for all the fare families,

n which case there is no decision to be made. In the first decision

poch there is only one combination of seats, i.e., all the capacity is

vailable.
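The count described above can be checked with a few lines of Python. This is a sketch that counts 0 to AU_i remaining seats per fare family; the function name is hypothetical. The first call assumes the 10/20 seat split described for Scenario 4 together with 7 decision epochs, and the second uses the case-study capacities (25, 38, 75 seats, 10 decision periods) reported later in the paper.

```python
from math import prod

def total_decision_states(seats_per_family, num_epochs):
    """States in which a decision is taken, summed over all decision epochs."""
    # Each fare family can have 0..AU_i seats left.
    combos = prod(au + 1 for au in seats_per_family)
    # All epochs but the first exclude the all-sold-out combination;
    # the first epoch has a single state (full capacity).
    return (combos - 1) * (num_epochs - 1) + 1

print(total_decision_states([10, 20], 7))       # 1381
print(total_decision_states([25, 38, 75], 10))  # 693568
```

Both values agree with the state counts the paper reports (1381 states in Table 3 and approximately 690,000 states in the case study).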

The proposed dynamic pricing model depends heavily on the estimation of ticketing and buying probabilities, which in reality can possess any shape of density function. In the next section we introduce a methodology to estimate these distributions by a family of continuous distributions called continuous PH distributions.

3. Estimation of ticketing and buying probabilities by PH distributions

Each of the fare families has a ticketing and a buying probability distribution, which can possess any shape, so a distribution has to be fitted for each of these behaviors. To do this, we present the PH distributions as a possible method to fit the ticketing and buying probabilities because of their useful properties for fitting unconventional forms and their flexibility in describing rugged and multimodal data distributions. In this section we define the PH family of distributions and discuss their useful properties in fitting data from the airline industry.

The PH family of distributions is the combination of exponential distributions that can fit a variety of behaviors, including those that are not conventional (Ross, 2007). A PH distribution is defined as the time until absorption in a Continuous Time Markov Chain (CTMC) with state space S = {1, 2, 3, . . . , a, 0}, where the first a phases are transient and 0 is the absorbing state. Being a CTMC, the time spent in each of the transient states follows an exponential distribution (Latouche & Ramaswami, 1999). The most well-known PH distribution besides the exponential is the Erlang distribution, which is a combination of a exponential distributions in series, all with the same rate.

PH distributions are commonly defined by two characteristic matrices (Latouche & Ramaswami, 1999). The first is the vector τ, which represents the initial vector of probabilities. The value in position j of τ represents the probability that the movement in the absorbing CTMC begins in state j, or phase j, of the distribution. The other is the matrix of transition rates, denoted T. This matrix contains the rates of going from one phase to another before the chain is absorbed. With these matrices the cumulative distribution function of a PH random variable is defined as:

\[
F(x) = 1 - \tau' e^{Tx} \mathbf{1}, \tag{16}
\]

where $e^{Tx}$ is defined as:

\[
e^{Tx} = \sum_{k=0}^{\infty} \frac{1}{k!} (Tx)^k. \tag{17}
\]

As mentioned before, the most common PH distribution is the exponential distribution, which can be represented as a PH distribution with 1 phase. Consider an exponential distribution with parameter λ; then τ = [1] and T = [−λ], so Eq. (16) reduces to the known cumulative distribution function of the exponential distribution, i.e., $F(x) = 1 - e^{-\lambda x}$. Now consider an Erlang distribution with parameter


λ and a phases with the transition rate diagram shown in Fig. 2; then the characteristic matrices for this distribution will be:

\[
\tau = \begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \tag{18}
\]

\[
T = \begin{bmatrix}
-\lambda & \lambda & 0 & 0 & \cdots & 0 \\
0 & -\lambda & \lambda & 0 & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & 0 & -\lambda & \lambda \\
0 & \cdots & 0 & 0 & 0 & -\lambda
\end{bmatrix}. \tag{19}
\]

Fig. 2. Erlang distribution (a phases in series, each Expo(λ), ending in the absorbing state 0).

Fig. 3. HE distribution.
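Eqs. (16) to (19) can be checked numerically. The sketch below builds τ and T for an Erlang distribution, evaluates the PH cumulative distribution function with a truncated version of the series in Eq. (17), and compares it with the closed-form Erlang CDF. All function names and the example parameters are hypothetical, and the plain series is only reasonable for small values of the rate times x; a production implementation would use a more robust matrix exponential.

```python
import math

def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_exp(M, terms=80):
    """Truncated series of Eq. (17): sum over k of M^k / k!."""
    n = len(M)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # M^0 / 0!
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, M)]  # M^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def erlang_ph(rate, phases):
    """tau and T of Eqs. (18) and (19)."""
    tau = [1.0] + [0.0] * (phases - 1)
    T = [[0.0] * phases for _ in range(phases)]
    for i in range(phases):
        T[i][i] = -rate
        if i + 1 < phases:
            T[i][i + 1] = rate
    return tau, T

def ph_cdf(x, tau, T):
    """Eq. (16): F(x) = 1 - tau' exp(Tx) 1."""
    E = mat_exp([[v * x for v in row] for row in T])
    survival = sum(tau[i] * sum(E[i]) for i in range(len(tau)))
    return 1.0 - survival

def erlang_cdf(x, rate, phases):
    """Closed-form Erlang CDF, used only as a reference value."""
    return 1.0 - sum(math.exp(-rate * x) * (rate * x) ** i / math.factorial(i)
                     for i in range(phases))

tau, T = erlang_ph(rate=1.5, phases=3)
print(ph_cdf(2.0, tau, T), erlang_cdf(2.0, 1.5, 3))  # the two values should agree
```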

According to the type of PH distribution used, different behaviors

such as high variability, long tails, and sharp tails can be modeled

(Perez & Riaño, 2007). Different algorithms are proposed to fit the

most appropriate PH distribution to empirical data, where the re-

sults of the algorithm are the vector τ and the matrix T. These algo-

rithms can be divided into four groups: moment matching, maximum

likelihood estimate (Perez & Riaño, 2007), nonlinear programming

(Johnson & Taaffe, 1990), and minimum distance algorithms (Parr &

Shucanny, 1979). Most of the algorithms in the literature are focused on the first two approaches, which are easier to implement and give satisfactory results (Perez & Riaño, 2007); we discuss them in the next section.



3.1. PH fitting algorithms

Perez and Riaño (2007) study a wide range of PH fitting algorithms. They test different algorithms with different types of data and evaluate the performance of each one. For the maximum likelihood algorithms they test an Expectation-Maximization (EM) Phase algorithm (Asmussen, Nerman, & Olsson, 1996), an EM Hyper-Exponential algorithm (Khayari, 2003), and an EM Hyper-Erlang (HE) algorithm (Thummler, Buchholz & Telek, 2005). For the moment matching algorithms they test an Acyclic continuous PH Distribution with two phases (ACPH2) (Telek & Heindl, 2003), a general ACPH (Bobbio, Horvath, Scarpa & Telek, 2003; Bobbio, Horvath, & Telek, 2005), and an Erlang–Coxian distribution (Osogami & Harchol-Balter, 2003). The algorithms that give the best results in terms of goodness-of-fit are the EM Hyper-Erlang, the ACPH, and the Coxian distribution.

The HE distribution combines m Erlang distributions, each of which is called a root. Each root i has a probability $Pr_i$ of being the one in which the movement in the CTMC takes place. Each of these Erlang distributions can have its own number of phases $a_i$ and rate $\lambda_i$. Fig. 3 shows a graphic representation of this type of distribution.
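Because each root is an Erlang distribution selected with its own probability, the HE cumulative distribution function is a probability-weighted mixture of Erlang CDFs. The following sketch illustrates this; the function names and the two-root parameter set are hypothetical and are not fitted to any data from the paper.

```python
import math

def erlang_cdf(x, rate, phases):
    """Closed-form CDF of an Erlang distribution."""
    return 1.0 - sum(math.exp(-rate * x) * (rate * x) ** i / math.factorial(i)
                     for i in range(phases))

def hyper_erlang_cdf(x, roots):
    """roots: list of (probability, phases, rate); probabilities sum to 1."""
    return sum(p * erlang_cdf(x, rate, phases) for p, phases, rate in roots)

def hyper_erlang_mean(roots):
    """Mean of the mixture: each Erlang root has mean phases / rate."""
    return sum(p * phases / rate for p, phases, rate in roots)

# Hypothetical two-root example: a short-lived and a long-lived branch.
roots = [(0.3, 2, 0.5), (0.7, 5, 0.1)]
print(hyper_erlang_mean(roots))  # 0.3 * 4 + 0.7 * 50, about 36.2
```

Mixing roots with very different means is what lets the HE family reproduce multimodal shapes that a single Erlang cannot.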

The ACPH presented by Bobbio, Horvath, and Telek (2005) is a distribution that combines an exponential phase with rate λ and an Erlang distribution with rate μ and a phases. Depending on the value of the moments of the data that are fitted, the ACPH is represented differently. Let Pr be the probability of going to the first phase of the ACPH distribution; in Fig. 4 we show the two graphic representations of this type of distribution.

The Coxian PH distribution presented by Osogami and Harchol-Balter (2003) also combines the exponential and Erlang distributions.



Fig. 4. (a) Erlang first ACPH distribution; (b) exponential first ACPH distribution.


Fig. 6. PH fitting using the HE algorithm for ticketing probability in the month of October.

Table 1
Error of fitting for the ticketing distributions.

Month       Error    Number of data points
July        0.0042   3109
August      0.005    1162
September   0.005    1032
October     0.0046   2857
November    0.0062   3621
December    0.0056   4448


In this case, there is a probability Pr to go first to the Erlang phase with rate λ and a phases, then it directly goes to an exponential phase with rate λ1, and finally a probability Pr1 to go to an exponential phase with rate λ2. In Fig. 5 we show the graphic representation of this type of distribution.

To show the performance of these fitting algorithms, we fit industry data on ticketing and buying probabilities using these algorithms. This dataset consists of bookings for flights with the same leg and flight number over a period of six months. For each booking, the dataset records when the booking was made, the actual flight date, the fare family and class in which the ticket was bought, and the price at which it was bought. In total, there are around 16,000 bookings in the data set. First we apply the HE algorithm (Thummler, Buchholz & Telek, 2005) to the ticketing distribution, which we divided by month and fare family. The HE distribution fitted to the dataset for the month of October is shown in Fig. 6.

To measure the goodness-of-fit of the resulting HE distribution we estimate the error between this distribution and the data as the mean value of the absolute difference of the two distributions at various chosen data points along the density function curve:

\[
\text{Error} = \frac{\sum_{i=1}^{n} \left| X_{R_i} - X_{PH_i} \right|}{n}, \tag{20}
\]

where $X_{R_i}$ is the value at price i of the real data and $X_{PH_i}$ is the value at price i of the fitted PH distribution.
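Eq. (20) is a mean absolute difference and can be computed directly. In the sketch below the function name and the sample values are hypothetical and serve only to show the arithmetic.

```python
def fitting_error(real_values, fitted_values):
    """Eq. (20): mean absolute difference between data and fitted distribution."""
    if len(real_values) != len(fitted_values) or not real_values:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(r - f) for r, f in zip(real_values, fitted_values))
    return total / len(real_values)

# Hypothetical values at four evaluation points.
real = [0.10, 0.25, 0.40, 0.25]
fitted = [0.12, 0.22, 0.41, 0.25]
print(fitting_error(real, fitted))  # (0.02 + 0.03 + 0.01 + 0.00) / 4, about 0.015
```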

In Table 1 we show the average error for each of the six months, using Eq. (20). All the errors in this table are relatively small, indicating adequate fitting. This result is somewhat expected since all of these datasets contain over 1000 data points and the HE algorithm is shown to work better with larger sets of data (Perez & Riaño, 2007). When fewer data are available other algorithms must be applied.

Fig. 5. Coxian distribution.

After fitting the ticketing distribution, we now focus on the inter-arrival time distribution of bookings, using the same dataset but with data divided into classes for each month. Fig. 7 shows the fitting for class Y, a high price class for the economy cabin, using the HE algorithm. The errors for different classes are shown in Table 2.

Comparing Tables 1 and 2 shows that with fewer data, the HE algorithm does not work satisfactorily in terms of the fitting error. To overcome this problem, one possible solution is to group the classes by fare families, as a fare family is a combination of classes, all of which offer the same product but at different prices. But even after grouping the classes into fare families, the amount of data is considerably small for some of these fare families. Therefore we also test the grouped data with the Coxian (Osogami & Harchol-Balter, 2003) and the ACPH (Bobbio, Horvath, & Telek, 2005) algorithms, which tend to find a feasible fit with few data. To test the ACPH algorithm we use ACPH distributions with different numbers of states. In Fig. 8 we show the results of testing the data divided by fare families. These data represent the month of October for a certain domestic flight.

Fig. 7. PH fitting using the HE algorithm for the inter-arrival time distribution.

Table 2
Error per class for the inter-arrival time distribution.

Class   Error    Number of data points
J       0.0897   50
D       0.0663   174
Y       0.0259   204
M       0.0631   74
P       0.0333   184
T       0.0339   179

As Fig. 8 shows, the errors of the Coxian and the ACPH fittings are lower than that of the HE algorithm for Fare Family 1 (FF1). This example shows that each of the fitting algorithms performs differently depending on the data. Therefore, it is critical to test various algorithms to find an appropriate fit for the particular data set under consideration.

In conclusion, based on the experimentation shown in this section, the PH family of distributions is a good alternative to approximate the empirical data on inter-arrival time and ticketing distributions when the data do not closely fit any of the conventional distributions.

Given the fitted distributions of ticketing and buying probabilities, we can now model the counting process $\{N(\Delta t), t \geq 0\}$ to estimate $Pr_k(N(\Delta t) = l)$. To this end we apply renewal processes as described in the next section.

Fig. 8. PH fitting of inter-arrival distribution.

3.2. Renewal process for PH distributions

The renewal process {N(t), t ≥ 0} is a counting process, of which the Poisson process is a specific case (Ross, 2007). The Poisson process is used to calculate the probability that n events happen during a period of time t, where the inter-arrival times of these events follow an exponential distribution. The renewal process allows any general distribution for the inter-arrival times and calculates the same probability of n events happening during time period t. The renewal process can be defined in the following way: Let F(t) be the cumulative distribution function of the inter-arrival times of events and $F_n(t)$ the cumulative distribution of the sum of n such inter-arrival times, also called the n-fold convolution of the distribution with itself (Ross, 2007). Then the probability that n events take place in a period t is:

\[
Pr(N(t) = n) = F_n(t) - F_{n+1}(t). \tag{21}
\]

To estimate Eq. (21) Latouche and Ramaswami (1999) propose an exact solution by applying a uniformization to the PH distributions. Using the matrices T and τ of the PH distribution they define Pr(n, t) as the matrix of probabilities where $Pr_{ij}(n, t)$ is the probability that n events occur in time t and the chain is in state j, given that at time 0 the movement of the chain begins in state i. Therefore, the probability of n arrivals in a period of time t is:

\[
Pr(N(t) = n) = \tau \, Pr(n, t) \, \mathbf{1}. \tag{22}
\]
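For intuition, Eq. (21) can be evaluated in closed form in the special case of Erlang inter-arrival times, because the sum of n independent Erlang variables with a phases and rate λ is again Erlang with n·a phases and the same rate. The sketch below uses this special case; it is not the uniformization method of Latouche and Ramaswami (1999), and the function names and parameters are hypothetical.

```python
import math

def erlang_cdf(x, rate, phases):
    """Closed-form CDF of an Erlang distribution."""
    return 1.0 - sum(math.exp(-rate * x) * (rate * x) ** i / math.factorial(i)
                     for i in range(phases))

def renewal_count_pmf(n, t, rate, phases):
    """Pr(N(t) = n) = F_n(t) - F_{n+1}(t) for Erlang inter-arrival times."""
    # F_0(t) = 1: zero arrivals have "arrived" by any time t >= 0.
    f_n = 1.0 if n == 0 else erlang_cdf(t, rate, n * phases)
    f_next = erlang_cdf(t, rate, (n + 1) * phases)
    return f_n - f_next

# With one phase the inter-arrival times are exponential, so the
# counts follow a Poisson distribution with mean rate * t.
poisson_3 = math.exp(-2.0) * 2.0 ** 3 / math.factorial(3)
print(renewal_count_pmf(3, t=2.0, rate=1.0, phases=1), poisson_3)
```

With two or more phases the counts are less dispersed than Poisson counts with the same mean inter-arrival time, which is the kind of departure from the exponential assumption that PH fitting is meant to capture.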

With this method, we can estimate $Pr_k(N(\Delta t) = l)$, which is the main component needed to estimate the buying probability. Having explained the estimation of these probabilities, in the next section we present different structural properties that our proposed dynamic model possesses. This helps us understand how the results behave depending on the number of available seats and the decision taken.

4. Structural properties and optimality of the dynamic pricing model

The dynamic pricing model discussed in this paper recommends the optimal choice of price depending on the behavior of the customers, for each fare family. To understand the dynamics of this decision process, we first study the behavior of the value function based on the decision taken. To this end we examine the shape of the value function after discussing the effect of changes in price, for which we first need to understand how the buying probability shown in Eq. (9) changes when the price varies. Then we show that an optimal policy exists to maximize the total revenue over a finite horizon. The following propositions show the behavior of this probability and the existence of an optimal policy:

Fig. 9. Example of the behavior of the total expected revenue changing the price.

Proposition 1. Let $P_1$ and $P_2$ be two available prices for a particular fare family where $P_1 < P_2$; then $g(P_1) > g(P_2)$.

Proof. Using Eq. (5) we know that $g(P_1) = \int_{P_1}^{\infty} f(x)\,dx$; then $g(P_1) - g(P_2) = \int_{P_1}^{\infty} f(x)\,dx - \int_{P_2}^{\infty} f(x)\,dx$. Therefore $g(P_1) - g(P_2) = \int_{P_1}^{P_2} f(x)\,dx > 0$, hence $g(P_1) > g(P_2)$.

Proposition 2. Let $P_1$ and $P_2$ be two available prices for a particular fare family where $P_1 < P_2$; then $Pr(j, \Delta t, P_1) > Pr(j, \Delta t, P_2)$.

Proof. Consider the following difference, $Pr(j, \Delta t, P_1) - Pr(j, \Delta t, P_2)$. Using Eq. (8) we know that

\[
Pr(j, \Delta t, P_1) - Pr(j, \Delta t, P_2) = \sum_{l=j}^{\infty} Pr_k(N(\Delta t) = l) \binom{l}{j} \left( g(P_1)^j (1 - g(P_1))^{l-j} - g(P_2)^j (1 - g(P_2))^{l-j} \right).
\]

Since $Pr_k(N(\Delta t) = l)$ and $\binom{l}{j}$ are always positive, we only need to consider $\left( g(P_1)^j (1 - g(P_1))^{l-j} - g(P_2)^j (1 - g(P_2))^{l-j} \right)$. Looking at the limits of the sum, we have that $\sum_{l=j}^{\infty} \left( g(P_1)^j (1 - g(P_1))^{l-j} - g(P_2)^j (1 - g(P_2))^{l-j} \right)$ is equal to $g(P_1)^j \sum_{l=j}^{\infty} (1 - g(P_1))^{l-j} - g(P_2)^j \sum_{l=j}^{\infty} (1 - g(P_2))^{l-j}$. Now, since $1 - g(P_i) < 1$ for any price $P_i$, these sums are geometric series, hence

\[
g(P_1)^j \sum_{l=j}^{\infty} (1 - g(P_1))^{l-j} - g(P_2)^j \sum_{l=j}^{\infty} (1 - g(P_2))^{l-j} = g(P_1)^{j-1} - g(P_2)^{j-1},
\]

which is positive using Proposition 1. Then $Pr(j, \Delta t, P_1) - Pr(j, \Delta t, P_2) > 0$ and $Pr(j, \Delta t, P_1) > Pr(j, \Delta t, P_2)$.

These two propositions show that when the price increases the probability of buying decreases, implying that there should be a price $P_i$ where the maximum revenue is reached. But to prove that this price exists we need to show that $\lim_{P \to \infty} R(t, P) = 0$, where for just one fare family $R(t, P) = P \sum_{j=0}^{X} j \, Pr(j, \Delta t, P)$. Expanding this equation the limit becomes

\[
\sum_{j=0}^{X} j \sum_{l=j}^{\infty} Pr_k(N(\Delta t) = l) \binom{l}{j} \left( \lim_{P \to \infty} P \, g(P)^j (1 - g(P))^{l-j} \right).
\]

The following two propositions are shown to prove this:

Proposition 3. Let P be an available price for a particular fare family; then $\lim_{P \to \infty} P \, g(P) = 0$.

Proof. First consider $g(P) = \int_{P}^{\infty} f(x)\,dx$, for which f(x) is estimated with a PH distribution. Then, using Eq. (16), $g(P) = 1 - F(P) = \tau' e^{TP} \mathbf{1}$, and because F(x) is the cumulative distribution function, $\lim_{P \to \infty} F(P) = 1$ and $\lim_{P \to \infty} g(P) = 0$. Now, $\lim_{P \to \infty} P \, g(P) = \lim_{P \to \infty} P \, \tau' e^{TP} \mathbf{1}$, where it is clearly seen that g(P) decreases faster than P increases, because g(P) has an exponential component in its expression. Then $\lim_{P \to \infty} P \, g(P) = 0$.

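Proposition 4. Let P be an available price for a particular fare family; then $\lim_{P \to \infty} P \, Pr(j, \Delta t, P) = 0$.

Proof. $\lim_{P \to \infty} P \, Pr(j, \Delta t, P)$ is equal to $\sum_{l=j}^{\infty} Pr_k(N(\Delta t) = l) \binom{l}{j} \left( \lim_{P \to \infty} P \, g(P)^j (1 - g(P))^{l-j} \right)$. Knowing that $\lim_{P \to \infty} g(P) = \lim_{P \to \infty} \int_{P}^{\infty} f(x)\,dx = 0$, hence $\lim_{P \to \infty} (1 - g(P))^{l-j} = 1$, and using Proposition 3, $\lim_{P \to \infty} P \, g(P)^j = 0$; then $\lim_{P \to \infty} P \, Pr(j, \Delta t, P) = 0$.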

Then

\[
\lim_{P \to \infty} R(t, P) = \sum_{j=0}^{X} j \sum_{l=j}^{\infty} Pr_k(N(\Delta t) = l) \binom{l}{j} \left( \lim_{P \to \infty} P \, g(P)^j (1 - g(P))^{l-j} \right) = \sum_{j=0}^{X} j \sum_{l=j}^{\infty} Pr_k(N(\Delta t) = l) \binom{l}{j} \cdot 0 = 0,
\]

meaning that when the price tends to infinity the revenue received tends to 0. Finally, having this result, we can study the shape of the value function. To explore this, we create an example of the value function using one fare family, 10 available seats, a $\Delta t$ of 2, a Poisson distribution with parameter equal to 1 as the arrival process, and as the distribution g(P) we use $g(P) = e^{-P/100}$. The results of the total expected revenue are shown in Fig. 9.
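The single-period reward of this example can be reproduced with a short script. The sketch below is an approximation of the setting behind Fig. 9 under stated assumptions: the Poisson parameter is read as an arrival rate of 1 per unit of time (so the mean number of arrivals in Δt = 2 is 2), only the immediate reward of Eq. (12) is evaluated, and the price grid is hypothetical. Under Poisson arrivals, the number of buyers is again Poisson with mean equal to the mean number of arrivals times g(P).

```python
import math

def buying_pmf(j, mean_arrivals, g):
    """Pr(j buyers): Poisson arrivals thinned by the purchase probability g."""
    mu = mean_arrivals * g
    return math.exp(-mu) * mu ** j / math.factorial(j)

def expected_revenue(price, seats=10, rate=1.0, dt=2.0):
    """Immediate reward P * sum_{j=0}^{X} j * Pr(j, dt, P) for one fare family."""
    g = math.exp(-price / 100.0)  # purchase probability from the example
    expected_sales = sum(j * buying_pmf(j, rate * dt, g)
                         for j in range(seats + 1))
    return price * expected_sales

prices = range(0, 2001, 5)               # hypothetical price grid
best_price = max(prices, key=expected_revenue)
print(best_price, expected_revenue(best_price))  # best price on this grid: 100
```

The revenue is zero at a price of 0, rises to a maximum, and then decays toward zero as the price grows, which is the behavior that Propositions 3 and 4 establish in the limit.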

As seen in Fig. 9 the total expected revenue as a function of price is convex in a range of prices and concave in another range. However, as shown in Proposition 4, increasing the price to infinity makes the value of the total expected revenue tend to zero. On the other hand there is a minimum value for the price of any ticket, which can be 0, so at least one price exists that maximizes this problem.

We use backward induction as the proposed algorithm to solve this problem. Because there exists a price that maximizes the problem for each fare family, Proposition 5 shows that there exists at least one optimal policy that maximizes the dynamic pricing problem. Bertsekas (2000) presents a proposition for the deterministic dynamic programming model; we use this methodology to prove Proposition 5:

Proposition 5. Consider a stochastic dynamic pricing problem with $V_t(X_1, X_2, \ldots, X_n)$ as the total expected revenue and $1 \leq t \leq m$. Assume that for each fare family there is a pool of prices to choose from. Let $\pi^*_t$ be the optimal policy that maximizes the revenue for period t; then this $\pi^*_t$ exists for $1 \leq t \leq m$.

Proof. The proof is done by induction. Let t = m; then, as concluded through Propositions 1–4, there exists a price that maximizes the revenue for fare family k. Then for state $(X_1, X_2, \ldots, X_n)$, which represents the available seats for each fare family, there is an optimal policy $\pi^*_m(X_1, X_2, \ldots, X_n) = (P_{i1}, P_{i2}, \ldots, P_{in})$ where, if $V_m(X_1, X_2, \ldots, X_n)^{\pi^*_m(X_1, X_2, \ldots, X_n)}$ is the value of the total expected revenue evaluated with the optimal policy, then $V_m(X_1, X_2, \ldots, X_n)^{\pi^*_m(X_1, X_2, \ldots, X_n)} \geq V_m(X_1, X_2, \ldots, X_n)^{\pi_m(X_1, X_2, \ldots, X_n)}$ for any other policy $\pi_m(X_1, X_2, \ldots, X_n)$. Then $V_m(X_1, X_2, \ldots, X_n)^{\pi^*_m(X_1, X_2, \ldots, X_n)} = V^*_m(X_1, X_2, \ldots, X_n)$, where $V^*_m(X_1, X_2, \ldots, X_n)$ is the optimal value of the expected revenue for state $(X_1, X_2, \ldots, X_n)$. Let $\pi^*_m$ be the pool of optimal policies for each combination of possible available prices in t = m.

Now let t = m − 1 and $V_{m-1}(X_1, \ldots, X_n) = \max_{\{P_{i1}, \ldots, P_{in}\}} \{ R(m-1, P_{i1}, \ldots, P_{in}) + E(V_m \mid P_{i1}, \ldots, P_{in}) \}$. Then, as concluded through Propositions 1–4, there exists at least a set of prices $(P_{i1}, \ldots, P_{in})$ that maximizes $R(m-1, P_{i1}, \ldots, P_{in})$. Then we define $\pi^*_{m-1}(X_1, \ldots, X_n) = (P_{i1}, \ldots, P_{in}, \pi^*_m)$ as the optimal policy and, knowing that $\pi^*_m$ is the set of optimal policies in t = m, then $V_{m-1}(X_1, X_2, \ldots, X_n)^{\pi^*_{m-1}(X_1, X_2, \ldots, X_n)} \geq V_{m-1}(X_1, X_2, \ldots, X_n)^{\pi_{m-1}(X_1, X_2, \ldots, X_n)}$ for any other policy $\pi_{m-1}(X_1, X_2, \ldots, X_n)$.

Finally, assuming that this holds for t = j + 1 and that $\pi^*_{j+1}(X_1, \ldots, X_n) = (P_{i1}, \ldots, P_{in}, \pi^*_{j+2})$ is the optimal policy for $V_{j+1}(X_1, X_2, \ldots, X_n)$, then in t = j, as concluded through Propositions 1–4 and following the same methodology as for t = m − 1, we define $\pi^*_j(X_1, \ldots, X_n) = (P_{i1}, \ldots, P_{in}, \pi^*_{j+1})$ as the optimal policy where $V_j(X_1, X_2, \ldots, X_n)^{\pi^*_j(X_1, X_2, \ldots, X_n)} \geq V_j(X_1, X_2, \ldots, X_n)^{\pi_j(X_1, X_2, \ldots, X_n)}$ for any other policy $\pi_j(X_1, X_2, \ldots, X_n)$. This proves by induction that the optimal policy exists.






Table 3
Parameters for each scenario.

Scenario  Number of FFs  Decision epochs  Available seats  Buying data    Arrival data  Number of states
1         2              2                8                Exponential    Exponential   24
3         2              7                8                Industry data  Exponential   139
5         2              7                30               Lognormal      Lognormal     1381
6         2              7                30               Lognormal      Lognormal     1381

Table 4
Lognormal fitting.

Lognormal distribution   PH distribution       Error
Lognormal (300, 10000)   Erlang (0.0339, 10)   0.0049





Proposition 5 shows that the optimal policy exists for our proposed

dynamic pricing model. However, this policy may not be unique and

for a given problem various policies may result in the same optimal

value of the total expected revenue. In the next section we proceed to

present numerical results on a variety of scenarios and a case with real

data to illustrate the performance of our proposed model, along with

further structural properties of these results, involving the behavior

of the price depending on the changes in available seats and days

prior to the departure date.

5. Numerical results

In this section we examine the performance of our proposed ap-

proach on a series of created scenarios and a real-world case study.

For each scenario and the case study we first present the results for

fitting the distributions of inter-arrival time of bookings and the tick-

eting probability with appropriate PH distributions. Then we present

the results of the dynamic pricing model for some proposed scenarios

and analyze how the parameters of each scenario affect the compu-

tational time of the algorithm. We will show that the recommended

decisions of our model follow a specific behavior in the scenarios

and in the case study. Finally we will show that our proposed policies

considerably improve the revenue compared to the real world pricing

policies adopted for this case study.

5.1. Scenarios

To test the dynamic pricing model we create six scenarios with

data generated from exponential and lognormal families of distribu-

tions for both the buying probability and inter-arrival times, except

Scenario 3 where we use real industry data for the buying distribu-

tion. The parameters for each scenario are shown in Table 3, where the

time interval between each epoch, �t, is equal to 2 days. The main dif-

ference between each of the scenarios is the number of states, which

depends on the number of decision periods (epochs) and the number

of seats initially available for sale (AU). Scenarios 1, 2, and 4 use exponential distributions for both the ticketing and buying probability distributions. This is done to test only the dynamic pricing algorithm and exclude the PH fitting algorithm, so the other parameters of the model, decision epochs and available seats, are varied to test the difference in computational time of the dynamic pricing algorithm. Scenario 3 is used to test the PH fitting algorithm for only one of the probability distributions. On the other hand, Scenarios 5 and 6 are used to test the performance of the PH fitting algorithms for both the arrival and buying distributions and observe how other variables, such as the parameters of the distribution, could affect the computational time.

We first fit the data on buying and inter-arrival time between

bookings in each scenario by an appropriate PH distribution as de-

scribed in Section 3, with the exception of the cases where data fol-

low an exponential distribution, which is the most basic form of a PH

distribution with only one phase.

Again, Scenarios 5 and 6 are used to test the PH fitting algorithms for the inter-arrival time probability and the buying probability. In particular, as shown in Table 4, in Scenario 5 the inter-arrival time distribution is lognormal with a mean value of 300 and a variance of 10,000. The error, calculated with Eq. (20), is 0.0049, which is relatively small. For all the other lognormal distributions used in Scenarios 5 and 6, the errors were of the same magnitude, as shown in Table 4.
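As a rough sanity check on the fit reported in Table 4, the first two moments of the fitted Erlang can be compared with those of the lognormal target. The sketch below assumes that "Erlang (0.0339, 10)" denotes a rate of 0.0339 with 10 phases; this reading of the parameter order is an assumption, and the function name is hypothetical.

```python
def erlang_moments(rate, phases):
    """Mean and variance of an Erlang distribution with the given rate and phases."""
    mean = phases / rate
    variance = phases / rate ** 2
    return mean, variance

mean, variance = erlang_moments(rate=0.0339, phases=10)
print(round(mean, 1), round(variance, 1))  # compare with the target mean 300 and variance 10,000
```

Under this reading the fitted mean is about 295 and the variance about 8,700, which is close to the target mean of 300 and somewhat below the target variance of 10,000.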

The results in Table 4 show that for each of the lognormal distributions used, the fitting algorithm finds a feasible PH distribution with a small error, where the Erlang distribution fitted best in all the cases. After fitting the distributions of ticketing and the inter-arrival time of bookings with the appropriate PH equivalents, we test the proposed dynamic pricing model in every scenario, as discussed in the next section.

5.2. Dynamic pricing model scenario results

We run the dynamic pricing model for each of the scenarios in Table 3. The results show the prices to be fixed for each fare family in each period of time. For example, the results of Scenario 4 for a particular decision period are presented in Table 5. This scenario assumes 2 fare families, 10 decision epochs, 10 available seats for the higher fare family, where each seat can be sold at a price of 150, 200, or 300, and 20 available seats for the lowest fare family, where each seat can be sold at a price of 50, 100, or 175. In the columns of Table 5 we have the remaining seats of the high cost fare family and on the rows we have the remaining seats of the low cost fare family. In the body of the table, the first price shown in parentheses is that of the higher fare family and the second is that of the lower fare family. When either of the two fare families has zero seats available (or zero capacity) then the price shown is that of the fare family with available seats. Table 6 shows a different behavior, which represents what happens when the seats are not sold throughout the selling period. In this table we denote fare family as FF. When the seats are not sold the price tends to decrease, to attract more customers in the final epochs and at least receive some additional revenue. For example for FF 1, from 12 days to 10 days before departure, the price is reduced from 300 to 200.

Through Tables 5 and 6 we can observe some interesting behavior. If a low number of seats remain available, then the price is increased so the remaining seats can be sold at a higher price to take advantage of the maximum willingness-to-pay of the customer. If the number of seats remains the same, then the price decreases so at least some seats are sold and fewer empty seats remain at the departure date. In particular, Table 5 shows in bold the threshold line where the price changes for either one of the fare families. When the number of available seats decreases in either direction, the price tends to






Table 5

Results of Scenario 4, for a particular period of time.

Remaining seats for lower fare family Remaining seats for higher fare family

0 1 2 3 4 5 6 7 8 9 10

0 0 300 300 200 200 200 200 200 200 200 200

1 175 (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

2 175 (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

3 175 (300, 175) (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

4 175 (300, 175) (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

5 175 (300, 175) (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

6 175 (300, 175) (300, 175) (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

7 175 (300, 175) (300, 175) (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

8 175 (300, 175) (300, 175) (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

9 175 (300, 175) (300, 175) (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

10 175 (300, 175) (300, 175) (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

11 175 (300, 175) (300, 175) (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

12 175 (300, 175) (300, 175) (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

13 175 (300, 175) (300, 175) (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

14 175 (300, 175) (300, 175) (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

15 175 (300, 175) (300, 175) (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

16 175 (300, 175) (300, 175) (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

17 100 (300, 175) (300, 175) (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

18 100 (300, 100) (300, 100) (300, 175) (300, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175) (200, 175)

19 100 (300, 100) (300, 100) (300, 100) (300, 100) (200, 100) (200, 100) (200, 100) (200, 100) (200, 100) (200, 100)

20 100 (300, 100) (300, 100) (300, 100) (300, 100) (200, 100) (200, 100) (200, 100) (200, 100) (200, 100) (200, 100)

Table 6
Results of Scenario 4 for fixed capacities.

Days before departure  Available seats FF1  Available seats FF2  Price FF1  Price FF2
2                      5                    20                   200        175
3                      5                    20                   200        175
4                      5                    20                   200        175
8                      5                    20                   200        175
10                     5                    20                   200        175
12                     5                    20                   300        175

Table 7
Computation time results for each scenario.

Scenario  Time (s)  Number of states
1         7.64      24
2         45.23     139
3         113.5     139
4         779.4     1381
5         5497.04   1381
6         3461.88   1381


increase, increasing in this way the expected revenue received. On the other hand, if there are many seats available the price tends to be lower, increasing the probability of selling a seat, hence increasing the expected revenue. Table 6 shows that when the number of seats remains the same and there are fewer days prior to the departure date, the price decreases. Similar behavior is observed in all of the scenarios of Table 3.

To test the computational performance of our proposed model, we run each of the scenarios and in Table 7 we show the total computational time. The computational time covers the whole algorithm (PH fitting when applicable, and solving the dynamic pricing model by backward induction), coded in MatLab and run on an Intel Core i3 at 2.10 gigahertz with 3 gigabytes of RAM.

Table 7 shows that the number of states affects the computation time, which is presented in seconds. The time taken to estimate the PH distribution is negligible compared to the time that it takes to run the whole algorithm. Estimating the transition probabilities by Eq. (11) takes the majority of the time in this algorithm, which is affected by the behavior of the input data and the number of classes. Looking at Scenarios 5 and 6 there is a relatively large difference in their computational times, although the number of states in both scenarios is the same. The main difference between them is the mean value and the standard deviation of the data. In this case, Scenario 5 has a mean value higher than Scenario 6. This affects the estimation of the probabilities because with a higher mean value, the HE fitting algorithm (Thummler, Buchholz & Telek, 2005) takes longer to reach its stopping criteria.

5.3. Case study with industry data

In this case study we use industry data for the month of October, covering 31 domestic flights with the same characteristics, leg and flight number, divided into 3 fare families. This division is done such that the data are grouped by the type of product, suggesting the same behavior within a group. To test the performance of our model, we compare our optimal price policy to that of the airline for these flights. The airline normally changes the price depending on the seats available at that moment, but it does not take into account the effect of this change on the other periods of time. For example, the airline may reduce the price to increase the revenue immediately, but this can result in tickets being sold at a lower price than what a customer is willing to pay in the following epochs. We will show that our proposed policy outperforms the current airline decisions.

Two types of input data are considered in this study. The first is the number of bookings per class with their corresponding price during the whole selling period, which is approximately one year. This information is taken for each fare family, and depending on the type of fare family, there can be between 600 and 3000 records in one month. The other type of data is the information on the arrival of bookings.

The number of seats in these flights is 138 in the economy cabin, divided into three fare families: 25 for the fare family with the highest prices, 38 for the medium fare, and 75 for the fare family with the lowest prices. For this case we exclude the executive cabin because its prices do not change significantly during the selling period. The number of decision periods is 10, since the bookings for a domestic flight begin closer to the departure date compared to an international flight. The total number of states where a decision is taken for this case study is then approximately 690,000.


Table 8
Errors of fitting the buying probability.

| Fare family | Buying probability distribution: PH distribution | Error | Inter-arrival distribution: PH distribution | Error |
|---|---|---|---|---|
| 1 | HE, 2 roots, 20 phases | 0.012 | HE, 2 roots, 3 phases | 0.018 |
| 2 | HE, 2 roots, 20 phases | 0.014 | HE, 2 roots, 3 phases | 0.037 |
| 3 | ACPH, 5 phases | 0.005 | HE, 2 roots, 3 phases | 0.033 |

Fig. 10. HE distribution fitting example.

We fit the arrival and buying probability distributions of this case

using the algorithms explained in Section 3. The results are shown in

Table 8. It can be seen that the fitting errors are relatively small, so

we can assume that the chosen PH distributions fit the behavior of

the real data well.

Most of the PH distributions that were used for the fitting were

from the HE family of distributions. In particular, the fitting for the

buying probability of Fare Family 1 is an HE distribution of 2 roots with

20 phases where the first root is an Erlang (0.071,12) and the second

one is an Erlang (0.2387,8). Fig. 10 shows the graphical representation

for this particular example.
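As an illustration of what such an HE (hyper-Erlang) fit represents, the sketch below evaluates the density of a mixture of two Erlang branches. Two things are assumptions and not stated in the text: the notation Erlang (0.071,12) is read as (rate, number of phases), and the mixing weights of 0.5 each are purely hypothetical placeholders, since the fitted weights are not reported here. The function names are illustrative.

```python
import math
from typing import Sequence, Tuple

# One branch of a hyper-Erlang distribution: (weight, rate, shape).
Branch = Tuple[float, float, int]


def erlang_pdf(x: float, rate: float, shape: int) -> float:
    """Density of an Erlang distribution with the given rate and integer shape."""
    if x < 0.0:
        return 0.0
    if x == 0.0:
        return rate if shape == 1 else 0.0
    # Work in log space to avoid overflow for large shapes.
    log_pdf = (
        shape * math.log(rate)
        + (shape - 1) * math.log(x)
        - rate * x
        - math.lgamma(shape)
    )
    return math.exp(log_pdf)


def hyper_erlang_pdf(x: float, branches: Sequence[Branch]) -> float:
    """Density of a hyper-Erlang distribution: a weighted mixture of Erlangs."""
    total_weight = sum(w for w, _, _ in branches)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("branch weights must sum to 1")
    return sum(w * erlang_pdf(x, rate, shape) for w, rate, shape in branches)


def hyper_erlang_mean(branches: Sequence[Branch]) -> float:
    """Mean of the mixture: each Erlang branch has mean shape / rate."""
    return sum(w * shape / rate for w, rate, shape in branches)


# Fare Family 1 branches from the text; the 0.5 weights are hypothetical.
FARE_FAMILY_1 = [(0.5, 0.071, 12), (0.5, 0.2387, 8)]
```

Under this reading the two branches contribute 12 + 8 = 20 phases, which matches the "2 roots, 20 phases" entry of Table 8, and the two well-separated branch means are what allow the mixture to follow a bimodal empirical curve.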

Running the dynamic pricing model yields, for each fare family, the price at which to sell each seat depending on the number of available seats and the epoch. Table 9 shows a fragment of the complete set of results. In this table we fix the number of seats of the fare family with the highest price at a particular moment of time and change the number of seats available for the other two fare families. The complete result is a table of 75 × 38, but this fragment is enough to show the behavior of the results. The main observation of this case study is that the decisions follow the same behavior as shown in Table 5. When the number of available seats decreases, the prices increase and a threshold line can be established. For example, if we fix the period of time and the number of available seats of a particular fare family, we can see the behavior of the prices of the other fare families when the number of seats changes.

Table 9
Results of case study fixing the highest fare family and the period of time. Rows: remaining seats for lower fare family. Columns: remaining seats for medium fare family.

| Lower \ Medium | 4 | 5 | 6 | 7–13 | 14 | 15 | 16 | 17 | 18 |
|---|---|---|---|---|---|---|---|---|---|
| 49 | (120,40) | (120,40) | (120,40) | (100,40) | (100,40) | (100,40) | (100,40) | (80,40) | (80,40) |
| 50 | (120,20) | (120,20) | (120,20) | (100,40) | (100,40) | (100,40) | (100,40) | (80,40) | (80,40) |
| 51 | (120,20) | (120,20) | (120,20) | (100,40) | (100,40) | (100,40) | (100,40) | (80,40) | (80,40) |
| 52 | (120,20) | (120,20) | (120,20) | (100,20) | (100,40) | (100,40) | (100,40) | (80,40) | (80,40) |
| 53 | (120,20) | (120,20) | (120,20) | (100,20) | (100,20) | (100,20) | (100,20) | (80,40) | (80,40) |
| 54 | (120,20) | (120,20) | (120,20) | (100,20) | (100,20) | (100,20) | (100,20) | (80,20) | (80,20) |
| 55 | (120,20) | (120,20) | (120,20) | (100,20) | (100,20) | (100,20) | (100,20) | (80,20) | (80,20) |
| 56 | (120,20) | (120,20) | (120,20) | (100,20) | (100,20) | (100,20) | (100,20) | (80,20) | (80,20) |
| 57 | (120,20) | (120,20) | (120,20) | (100,20) | (100,20) | (100,20) | (100,20) | (80,20) | (80,20) |
| 58 | (120,20) | (120,20) | (120,20) | (100,20) | (100,20) | (100,20) | (100,20) | (80,20) | (80,20) |
| 59 | (120,20) | (120,20) | (120,20) | (100,20) | (100,20) | (100,20) | (100,20) | (80,20) | (80,20) |
| 60 | (120,20) | (120,20) | (120,20) | (100,20) | (100,20) | (100,20) | (80,20) | (80,20) | (80,20) |
| 61 | (120,20) | (120,20) | (120,20) | (100,20) | (100,20) | (100,20) | (80,20) | (80,20) | (80,20) |
| 62 | (120,20) | (120,20) | (120,20) | (100,20) | (100,20) | (100,20) | (80,20) | (80,20) | (80,20) |
| 63 | (120,20) | (120,20) | (120,20) | (100,20) | (100,20) | (80,20) | (80,20) | (80,20) | (80,20) |
| 64 | (120,20) | (120,20) | (120,20) | (100,20) | (100,20) | (80,20) | (80,20) | (80,20) | (80,20) |
| 65 | (120,20) | (120,20) | (120,20) | (100,20) | (100,20) | (80,20) | (80,20) | (80,20) | (80,20) |
| 66 | (120,20) | (120,20) | (120,20) | (100,20) | (100,20) | (80,20) | (80,20) | (80,20) | (80,20) |
| 67 | (120,20) | (120,20) | (120,20) | (100,20) | (100,20) | (80,20) | (80,20) | (80,20) | (80,20) |

As the table shows, the price increases as fewer seats are available. Blocks of prices are formed where the price does not change, separated by threshold lines where the price changes. For example, between 7 and 15 seats available for the medium fare family and from 53 to 59 seats available for the lower fare family there exists a block where the price is 100 for the medium fare family and 20 for the lower fare family. Surrounding this block, there is what we call a threshold line where the price changes for either or both of the fare families, highlighted in bold font. Note that we show the seats from 7 to 13 seats available in the same column for the higher fare family because the prices do not change between these two. For the airline, the threshold lines are the most important results, since they provide practical guidelines on when and how to change the price.

As for the results of the dynamic pricing model, the revenue is increased for 21 out of the 31 flights in the month. On average the revenue was increased by 31 percent for these flights over a month, yielding a higher profit than the current decisions of the airline. The revenue in our model was calculated with Eq. (14) applied in the first decision epoch to the state represented by the total capacity available for each fare family. This calculation of the revenue takes into account the policy that the model gives as the result, where this policy considers the effect of the price on the demand during the whole selling period. The difference between the policies, as we mentioned before, is that the airline changes the price without taking into account what a customer is willing to pay in the following epochs, whereas our model takes this effect into account.

For the other 10 flights where the revenue was not increased, we noticed that they were mostly on a weekend or a holiday, which suggests that a segmentation of the demand may be needed. With a better segmentation, i.e., dividing the data into weekends and weekdays, we could find a policy that increases the revenue on the weekends as well.

Finally, the total time taken to solve this problem was approximately 35 hours, taking into account that the specifications of the computer (as mentioned in Section 5.2) are lower than those of the servers of an airline. We should also take into account that the problem only needs to be run once to calculate the decisions that optimize the revenue for the whole selling period.
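The threshold lines discussed for Table 9 can be located mechanically once a policy table is available. The sketch below is only an illustration of that idea, not the procedure used in this paper: it scans a grid of price pairs and reports every pair of neighboring cells whose prices differ. The function name `threshold_lines` is hypothetical, and the sample grid copies rows 51 to 54 and columns 7–13 to 18 of Table 9.

```python
from typing import List, Sequence, Tuple

Price = Tuple[int, int]  # (medium fare price, lower fare price)
Cell = Tuple[int, str]   # (lower fare seats, medium fare seats label)


def threshold_lines(
    policy: Sequence[Sequence[Price]],
    row_labels: Sequence[int],
    col_labels: Sequence[str],
) -> List[Tuple[Cell, Cell]]:
    """Return pairs of adjacent cells between which the price pair changes."""
    changes: List[Tuple[Cell, Cell]] = []
    for r in range(len(policy)):
        for c in range(len(policy[r])):
            here = (row_labels[r], col_labels[c])
            # Neighbor with one more seat class step for the lower fare family.
            if r + 1 < len(policy) and policy[r][c] != policy[r + 1][c]:
                changes.append((here, (row_labels[r + 1], col_labels[c])))
            # Neighbor in the next column for the medium fare family.
            if c + 1 < len(policy[r]) and policy[r][c] != policy[r][c + 1]:
                changes.append((here, (row_labels[r], col_labels[c + 1])))
    return changes


# Fragment of Table 9: rows 51-54, columns 7-13, 14, 15, 16, 17, 18.
ROWS = [51, 52, 53, 54]
COLS = ["7-13", "14", "15", "16", "17", "18"]
FRAGMENT = [
    [(100, 40), (100, 40), (100, 40), (100, 40), (80, 40), (80, 40)],
    [(100, 20), (100, 40), (100, 40), (100, 40), (80, 40), (80, 40)],
    [(100, 20), (100, 20), (100, 20), (100, 20), (80, 40), (80, 40)],
    [(100, 20), (100, 20), (100, 20), (100, 20), (80, 20), (80, 20)],
]
```

Calling `threshold_lines(FRAGMENT, ROWS, COLS)` lists the cell boundaries that make up the threshold lines in this fragment, which is the kind of compact rule an analyst could hand to the pricing desk in place of the full 75 × 38 table.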


6. Conclusions and future work

In the airline industry, fixing the price of a ticket can affect the demand in the future. Fixing a low price will lead to an increase in demand but also a loss in the expected revenue received. If a passenger is willing to pay more for a ticket than its actual price, then that difference in revenue is considered a loss. On the other hand, if a higher price is fixed then there is a high chance that a seat flies empty, which is also a loss in revenue. Therefore, in this paper we propose a dynamic pricing model to find the pricing policy that maximizes the total revenue of the flight, taking into account the trade-off between selling a seat underpriced and flying it empty. Most of the work on dynamic pricing in the literature assumes that the probability of buying follows a known distribution. This assumption may be unrealistic in practical settings, as on many occasions empirical data do not follow a known distribution. In this paper we propose the application of phase type distributions to closely approximate empirical data.

The probability that a number of customers buy a previously booked ticket is affected by two behaviors. The first behavior is the arrival process of bookings, where we use renewal processes to estimate this probability. The second behavior is the probability distribution that, at a certain price, the customer buys a previously booked ticket. This probability is estimated with the number of bookings that become tickets at the end. The first prominent result of this work is that the phase type distributions fit empirical data with noise or multimodality better than the standard distributions.
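The renewal-process step can be illustrated in a special case where a closed form exists. The sketch below assumes, for illustration only, that inter-arrival times are i.i.d. Erlang (a single-branch PH distribution) and that the first inter-arrival time starts at time zero; the general PH case used in the paper needs matrix computations that are not shown here. The function names are hypothetical. The sketch uses the identity P(N(t) >= n) = P(S_n <= t), where S_n is the time of the n-th arrival and is itself Erlang distributed.

```python
import math


def erlang_cdf(t: float, rate: float, shape: int) -> float:
    """P(S <= t) for S following an Erlang distribution (integer shape)."""
    if t <= 0.0:
        return 0.0
    x = rate * t
    term = 1.0   # (x ** j) / j! for j = 0
    total = 1.0
    for j in range(1, shape):
        term *= x / j
        total += term
    return 1.0 - math.exp(-x) * total


def prob_n_arrivals(n: int, t: float, rate: float, shape: int) -> float:
    """P(N(t) = n) for a renewal process with Erlang(shape, rate) inter-arrivals.

    The sum of n such inter-arrival times is Erlang(n * shape, rate), so
    P(N(t) = n) = P(S_n <= t) - P(S_{n+1} <= t).
    """
    if n < 0:
        return 0.0
    at_least_n = 1.0 if n == 0 else erlang_cdf(t, rate, n * shape)
    at_least_n_plus_1 = erlang_cdf(t, rate, (n + 1) * shape)
    return at_least_n - at_least_n_plus_1
```

With `shape=1` the inter-arrival times are exponential and the result reduces to the Poisson probability, which gives a convenient sanity check for the formula.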

We test our proposed dynamic pricing model on different distributions and combinations of seats, and observe reasonable results. By running our model, we find the prices for each fare family in each period of time that maximize the total expected revenue over the selling period. With these results, it is known when to increase or decrease the prices depending on the number of seats and the time left to departure. We prove that there exists at least one price that maximizes the total expected revenue and that constantly increasing the price does not necessarily result in increasing the revenue. Furthermore, in the numerical results we show that the behavior of the price depending on the number of available seats agrees with intuition. If many seats are sold then the prices rise and, on the other hand, if the seats are not sold then the prices decrease. The computation for a common flight can take several hours on a personal computer, but running the model on an airline server should reduce the time. Also, since the model only runs once for each flight and the results hold for the whole selling period, we presume that the computational time is tolerable for this type of decision-making.

The case study presented in this paper shows an increase in revenue using our model for 21 of the 31 flights in the month, and on average the revenue is increased by 31 percent. The PH fitting algorithms captured the behavior of both the inter-arrival distribution of bookings and the ticketing probability. The fitting errors were not larger than 0.037, which is relatively low compared to fitting known distributions, such as the exponential distribution, to the same data.



To extend and improve this work, two main directions should be considered. The first is to reduce the computational time of the algorithm, which requires heuristics or methodologies such as approximate dynamic programming. Reducing the computational time will help find the optimum price policy for various flights at the same time. It also makes it possible to increase the number of decision epochs to have a more precise model. The second direction is to relax some of the assumptions of the dynamic pricing model, for example by allowing cancellations and overbooking, which will make the results closer to reality.

References

Asmussen, S., Nerman, O., & Olsson, M. (1996). Fitting phase-type distributions via the EM algorithm. Scandinavian Journal of Statistics, 23 (4), 419–441.

Bertsekas, D. (2000). Dynamic programming and optimal control. Athena Scientific.

Bobbio, A., Horváth, A., Scarpa, A., & Telek, M. (2003). Acyclic discrete phase type distributions: Properties and a parameter estimation algorithm. Performance Evaluation, 54 (1), 1–32.

Bobbio, A., Horváth, A., & Telek, M. (2005). Matching three moments with minimal acyclic phase type distributions. Stochastic Models, 21 (1), 303–326.

Chen, L. (2012). Dynamic pricing with active learning under two-sided censoring. Working paper (pp. 1–33). The Fuqua School of Business, Duke University.

Ernst, R., & Kamrad, B. (2006). Estimating demand by using sales information: Inaccuracies encountered. European Journal of Operational Research, 174 (2), 675–688.

Fiig, T., Isler, K., Hopperstad, C., & Belobaba, P. (2010). Optimization of mixed fare structures: Theory and applications. Journal of Revenue and Pricing Management, 9 (1), 152–170.

Haensel, A., & Koole, G. (2011). Estimating unconstrained demand rate functions using customer choice sets. Journal of Revenue and Pricing Management, 10 (5), 1–17.

Johnson, M., & Taaffe, M. (1990). Matching moments to phase distributions: Nonlinear programming approaches. Communications in Statistics - Stochastic Models, 6 (2), 259–281.

Khayari, R., Sadre, R., & Haverkort, B. (2003). Fitting world-wide web request traces with the EM-algorithm. Performance Evaluation, 52 (1), 175–191.

Latouche, G., & Ramaswami, V. (1999). Introduction to matrix analytic methods in stochastic modeling. ASA-SIAM.

Li, J., & Chen, S. (2009). Real-time dynamic pricing for multiproduct models with time-dependent customer arrival rates. In: 2009 American Control Conference (pp. 2197–2201).

Li, L., & Ji-hua, P. (2007). Dynamic pricing model for airline revenue management under competition. Systems Engineering - Theory and Practice, 27 (11), 15–25.

Maglaras, C., & Meissner, J. (2006). Dynamic pricing strategies for multi-product revenue management problems. Manufacturing & Service Operations Management (MSOM), 8 (2), 136–148.

Osogami, T., & Harchol-Balter, M. (2003). A closed-form solution for mapping general distributions to minimal PH distributions. Computer performance evaluation. Modelling techniques and tools (pp. 200–217). North-Holland.

Parr, W., & Schucany, W. (1979). Minimum distance and robust optimization. Working paper (pp. 1–39). Department of Statistics, Southern Methodist University.

Perez, J., & Riaño, G. (2007). Benchmarking of fitting algorithms for continuous phase-type distributions. Working paper (pp. 1–20). COPA, Universidad de los Andes.

Phillips, R. (2005). Pricing and revenue optimization. Stanford Business Books.

Puterman, M. (2005). Markov decision process: Discrete stochastic dynamic programming. Wiley Interscience.

Ross, S. (2007). Introduction to probability models. Elsevier.

Talluri, K., & Van Ryzin, G. (2004). The theory and practice of revenue management. Springer.

Telek, M., & Heindl, A. (2003). Matching moments for acyclic discrete and continuous phase type distributions of second order. International Journal of Simulation, 3 (3), 1–11.

Thummler, A., Buchholz, P., & Telek, M. (2005). A novel approach for phase-type fitting with the EM algorithm. IEEE Computer Society, 3 (3), 1–35.

